Exploratory Data Analysis Using Global Terrorism Dataset
- realcode4you
- 6 hours ago
Project Explanation
The Global Terrorism Database (GTD) is an open-source database that includes information on terrorist attacks around the world from 1970 through 2017. The GTD includes systematic data on domestic as well as international terrorist incidents that occurred during this period and now includes more than 180,000 attacks. The database is maintained by researchers at the National Consortium for the Study of Terrorism and Responses to Terrorism (START), headquartered at the University of Maryland. Explore and analyze the data to discover key findings pertaining to terrorist activities.
Main Libraries to be used:
Pandas for data manipulation and aggregation
Matplotlib and seaborn for visualization and for examining behaviour with respect to the target variable
NumPy for computationally efficient operations
Use at least 5 different visualizations.
Problem Solution
Exploring Global Terrorism Trends: A Geospatial Analysis
Introduction:
According to a recent survey, the world faces a dual challenge - natural and man-made calamities. Each year, an astonishing 218 million people are affected by these calamities, resulting in the tragic loss of approximately 68,000 lives. While the frequency of natural disasters such as earthquakes and volcanoes has remained relatively constant, a concerning trend emerges on the global stage - the steady growth in the number of terrorist activities over the years.
About the dataset
The dataset was extracted from the Global Terrorism Database (GTD), an open-source database that includes information on terrorist attacks around the world from 1970 through 2017. The GTD includes systematic data on domestic as well as international terrorist incidents that occurred during this period and now includes more than 180,000 attacks.
# [$] Importing the required libraries >>
# [$] Pandas for Dealing with DataFrame >>
import pandas as pd
# [$] Numpy for dealing with calculation >>
import numpy as np
# [$] Seaborn for data visualization >>
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
plt.style.use('fivethirtyeight')
import warnings
warnings.filterwarnings('ignore')
import plotly.offline as py
py.init_notebook_mode(connected=True)
import plotly.graph_objs as go
import plotly.tools as tls
# !pip install basemap
from mpl_toolkits.basemap import Basemap
import folium
import folium.plugins
from matplotlib import animation,rc
# [$] Importing modules for Displaying Map >>
import io
import base64
from IPython.display import HTML, display
import codecs
from subprocess import check_output
output:
Collecting basemap
Downloading basemap-1.3.8-cp310-cp310-manylinux1_x86_64.whl (860 kB)
Collecting basemap-data<1.4,>=1.3.2 (from basemap)
Downloading basemap_data-1.3.2-py2.py3-none-any.whl (30.5 MB)
Requirement already satisfied: pyshp<2.4,>=1.2 in /opt/conda/lib/python3.10/site-packages (from basemap) (2.3.1)
Requirement already satisfied: matplotlib<3.8,>=1.5 in /opt/conda/lib/python3.10/site-packages (from basemap) (3.7.2)
Requirement already satisfied: pyproj<3.7.0,>=1.9.3 in /opt/conda/lib/python3.10/site-packages (from basemap) (3.6.0)
Requirement already satisfied: numpy<1.26,>=1.21 in /opt/conda/lib/python3.10/site-packages (from basemap) (1.23.5)
Requirement already satisfied: contourpy>=1.0.1 in /opt/conda/lib/python3.10/site-packages (from matplotlib<3.8,>=1.5->basemap) (1.1.0)
Requirement already satisfied: cycler>=0.10 in /opt/conda/lib/python3.10/site-packages (from matplotlib<3.8,>=1.5->basemap) (0.11.0)
Requirement already satisfied: fonttools>=4.22.0 in /opt/conda/lib/python3.10/site-packages (from matplotlib<3.8,>=1.5->basemap) (4.40.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /opt/conda/lib/python3.10/site-packages (from matplotlib<3.8,>=1.5->basemap) (1.4.4)
Requirement already satisfied: packaging>=20.0 in /opt/conda/lib/python3.10/site-packages (from matplotlib<3.8,>=1.5->basemap) (21.3)
Requirement already satisfied: pillow>=6.2.0 in /opt/conda/lib/python3.10/site-packages (from matplotlib<3.8,>=1.5->basemap) (9.5.0)
Requirement already satisfied: pyparsing<3.1,>=2.3.1 in /opt/conda/lib/python3.10/site-packages (from matplotlib<3.8,>=1.5->basemap) (3.0.9)
Requirement already satisfied: python-dateutil>=2.7 in /opt/conda/lib/python3.10/site-packages (from matplotlib<3.8,>=1.5->basemap) (2.8.2)
Requirement already satisfied: certifi in /opt/conda/lib/python3.10/site-packages (from pyproj<3.7.0,>=1.9.3->basemap) (2023.7.22)
Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.10/site-packages (from python-dateutil>=2.7->matplotlib<3.8,>=1.5->basemap) (1.16.0)
Installing collected packages: basemap-data, basemap
Successfully installed basemap-1.3.8 basemap-data-1.3.2
Reading Data
# [$] Reading the CSV File >>
df = pd.read_csv('globalterrorism.csv', encoding='ISO-8859-1')
# [$] Let's explore the dataset >>
df.head()
output: [table not shown]

# [$] Print the columns of the dataset >>
df.columns
output:
Index(['eventid', 'iyear', 'imonth', 'iday', 'approxdate', 'extended',
       'resolution', 'country', 'country_txt', 'region',
       ...
       'addnotes', 'scite1', 'scite2', 'scite3', 'dbsource', 'INT_LOG',
       'INT_IDEO', 'INT_MISC', 'INT_ANY', 'related'],
      dtype='object', length=135)
# [$] Let's check the shape (number of rows & columns) of the dataset >>
df.shape
output:
(181691, 135)
# [$] Statistical description of the dataset >>
df.describe()
output: [table not shown]

Data Preprocessing
# [$] Renaming Columns >>
df.rename(columns={'iyear': 'Year', 'imonth': 'Month', 'iday': 'Day', 'country_txt': 'Country',
                   'provstate': 'state', 'region_txt': 'Region', 'attacktype1_txt': 'AttackType',
                   'target1': 'Target', 'nkill': 'Killed', 'nwound': 'Wounded', 'summary': 'Summary',
                   'gname': 'Group', 'targtype1_txt': 'Target_type', 'weaptype1_txt': 'Weapon_type',
                   'motive': 'Motive'}, inplace=True)
# [$] Remove unnecessary columns & keep those that are important for analysis >>
# .copy() gives an independent DataFrame, so columns added later are not written to a slice
df = df[['eventid','Year','Month','Day','Country','Region','state','city','latitude','longitude','AttackType','Killed','Wounded','Target','Summary','Group','Target_type','Weapon_type','Motive','success']].copy()
Explanation of some key columns (original GTD names):
success - Success of a terrorist strike
suicide - 1 = "Yes": the incident was a suicide attack. 0 = "No": there is no indication that the incident was a suicide attack
attacktype1 - The general method of attack
attacktype1_txt - The general method of attack and broad class of tactics used.
targtype1_txt - The general type of target/victim
targsubtype1_txt - The more specific target category
target1 - The specific person, building, installation that was targeted and/or victimized
natlty1_txt - The nationality of the target that was attacked
gname - The name of the group that carried out the attack
gsubname - Additional details about the group that carried out the attack, such as factions
nperps - The total number of terrorists participating in the incident
weaptype1_txt - General type of weapon used in the incident
weapsubtype1_txt - More specific value for most of the Weapon Types
nkill - The number of total confirmed fatalities for the incident
nkillus - The number of U.S. citizens who died as a result of the incident
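The renaming and column selection above could also be done at load time, so that only the needed columns are parsed. The following is a minimal sketch under stated assumptions, not the method used in this project: the CSV text is a tiny made-up stand-in for the GTD file, and the names `csv_text`, `column_map`, `wanted` and `small_df` are hypothetical.

```python
import io

import pandas as pd

# Tiny stand-in for the GTD CSV (made-up rows, for illustration only).
csv_text = """eventid,iyear,country_txt,nkill,nwound,dbsource
1,1970,Mexico,1,0,PGIS
2,1970,Italy,0,2,PGIS
"""

# Original GTD column name -> analysis-friendly name.
column_map = {
    'iyear': 'Year',
    'country_txt': 'Country',
    'nkill': 'Killed',
    'nwound': 'Wounded',
}

# usecols limits parsing to the listed columns; rename then applies the mapping.
wanted = ['eventid'] + list(column_map)
small_df = pd.read_csv(io.StringIO(csv_text), usecols=wanted).rename(columns=column_map)

print(small_df.columns.tolist())
```

With the real file, the same `usecols` list would be passed alongside the `encoding` argument already used in the read step.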
df['Killed'].sample(10)
output:
24480 0.0
145806 0.0
65993 2.0
68775 0.0
110176 0.0
177181 1.0
99190 0.0
35210 0.0
6386 NaN
133783 0.0
Name: Killed, dtype: float64
Create a new column 'casualities' (total casualties) by adding 'Killed' and 'Wounded'
# [$] Create a new feature by adding the column casualities >>
df['casualities'] = df['Killed'] + df['Wounded']
df.head(3)
output: [table not shown]
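One caveat with the sum above: the sample of 'Killed' contains NaN, and adding two columns with `+` returns NaN whenever either value is missing. Below is a small sketch on made-up data showing that behaviour next to one possible alternative, a row-wise sum with `min_count=1`. The frame `toy` and the column names `plain_sum` and `casualties_known` are hypothetical, and whether a missing count should be treated as zero is an analysis decision, not something the dataset settles.

```python
import numpy as np
import pandas as pd

# Made-up rows covering the four missing-value combinations.
toy = pd.DataFrame({
    'Killed':  [2.0, np.nan, 0.0, np.nan],
    'Wounded': [1.0, 3.0, np.nan, np.nan],
})

# Plain addition: any missing operand makes the result NaN.
toy['plain_sum'] = toy['Killed'] + toy['Wounded']

# Row-wise sum that skips NaN; min_count=1 leaves NaN only when both values are missing.
toy['casualties_known'] = toy[['Killed', 'Wounded']].sum(axis=1, min_count=1)

print(toy)
```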

# [$] Let's check the shape of the dataset again (new number of rows & columns) >>
df.shape
output:
(181691, 21)
# [$] Check null counts of the DataFrame >>
df.isna().sum()
output: [not shown]
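Raw null counts are easier to compare when expressed as a share of all rows. This is a short sketch on made-up data, assuming the goal is to rank columns by how incomplete they are; `toy` and `missing_pct` are hypothetical names.

```python
import numpy as np
import pandas as pd

# Made-up frame with different amounts of missing data per column.
toy = pd.DataFrame({
    'Killed': [1.0, np.nan, 0.0, 2.0],
    'Motive': [np.nan, np.nan, np.nan, 'Unknown'],
    'Year':   [1970, 1971, 1972, 1973],
})

# isna().mean() is the fraction of missing values per column; scale to percent and rank.
missing_pct = (toy.isna().mean() * 100).sort_values(ascending=False)

print(missing_pct)
```

On the real DataFrame the same expression would be applied to `df` instead of `toy`.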

# [$] Statistical description of the dataset >>
df.describe()
output: [table not shown]

Exploratory Data Analysis (EDA)
Number of Terrorist Activities Each Year
# [$] Number Of Terrorist Activity According to Year >>
year_attacks = df.groupby('Year').size().reset_index(name='count')
sns.lineplot(x='Year', y='count', data=year_attacks, color='red')
plt.xlabel('Year')
plt.ylabel('Number of Attacks')
plt.title("Number of Terrorist Activities")
plt.show()
output: [figure not shown]

# [$] Number Of Terrorist Activity Per Each Year >>
plt.subplots(figsize=(15,6))
sns.countplot(data=df, x='Year', palette='inferno')
plt.xticks(rotation=90)
plt.title('Number Of Terrorist Activities Each Year')
plt.show()
output: [figure not shown]

Global terrorist activity has grown over the years overall, and 2014 has the highest number of recorded incidents. Activity declined after 2014, which may reflect some improvement in global security efforts.
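The peak year and the decline after it can also be checked numerically instead of being read off the chart. The sketch below uses a handful of made-up rows, so its numbers say nothing about the real dataset; `toy`, `attacks_per_year`, `peak_year`, `peak_count` and `yoy_change` are hypothetical names.

```python
import pandas as pd

# Made-up incident rows: one row per attack, as in the GTD.
toy = pd.DataFrame({'Year': [2012, 2013, 2013, 2014, 2014, 2014, 2015, 2015]})

# Count attacks per year (a Series indexed by Year).
attacks_per_year = toy.groupby('Year').size()

# idxmax returns the index label (the year) with the largest count.
peak_year = attacks_per_year.idxmax()
peak_count = attacks_per_year.max()

# Year-over-year change; negative values mark a decline.
yoy_change = attacks_per_year.diff()

print(peak_year, peak_count)
print(yoy_change)
```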
Terrorist Attacks Trends in Regions
# [$] Group the Data By Year & Region >>
year_attacks_region = df.groupby(['Year','Region']).size().reset_index(name='count')
plt.subplots(figsize=(15,6))
sns.lineplot(x='Year',y='count',hue='Region',data=year_attacks_region)
plt.title('Terrorist Attacks Trends in Regions')
plt.xlabel('Year')
plt.ylabel('Number of Attacks')
plt.show()
output: [figure not shown]
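The long-format counts used for the line plot can also be reshaped into a Year-by-Region table, which makes individual values easy to look up. This is a minimal sketch on made-up rows, assuming a wide table is wanted; `toy` and `region_by_year` are hypothetical names.

```python
import pandas as pd

# Made-up incident rows with the renamed 'Year' and 'Region' columns.
toy = pd.DataFrame({
    'Year':   [2013, 2013, 2014, 2014, 2014],
    'Region': ['South Asia', 'Western Europe', 'South Asia', 'South Asia', 'Western Europe'],
})

# Count per (Year, Region), then move Region into columns; absent pairs become 0.
region_by_year = toy.groupby(['Year', 'Region']).size().unstack(fill_value=0)

print(region_by_year)
```

Filling absent Year/Region pairs with 0 treats "no recorded attacks" as a count of zero, which may or may not suit years with gaps in data collection.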
