Data Collection Methodology

The Global Terrorism Database (GTD) was developed to be a comprehensive, methodologically robust set of longitudinal data on incidents of domestic and international terrorism. Its primary purpose is to enable researchers and analysts to increase understanding of the phenomenon of terrorism. The GTD is specifically designed to be amenable to the latest quantitative analytic techniques used in the social and computational sciences.

Scope of Data

The GTD was designed to gather a wide variety of etiological and situational variables pertaining to each terrorist incident. Depending on availability of information, the database records up to 120 separate attributes of each incident, including approximately 75 coded variables that can be used for statistical analysis. These are collected under eight broad categories, as identified in the GTD Codebook, and include, whenever possible:

  • incident date
  • region
  • country
  • state/province
  • city
  • latitude and longitude (beta)
  • perpetrator group name
  • tactic used in attack
  • nature of the target (type and sub-type, up to three targets)
  • identity, corporation, and nationality of the target (up to three nationalities)
  • type of weapons used (type and sub-type, up to three weapons types)
  • whether the incident was considered a success
  • whether and how a claim of responsibility was made
  • amount of damage, and more narrowly, the amount of United States damage
  • total number of fatalities (persons, United States nationals, terrorists)
  • total number of injured (persons, United States nationals, terrorists)
  • indication of whether the attack is international or domestic

Other variables provide information unique to specific types of cases, including kidnappings, hostage incidents, and hijackings.

Sources

Information in the GTD is drawn entirely from publicly available, open-source materials. These include electronic news archives, existing data sets, secondary source materials such as books and journals, and legal documents. All information contained in the GTD reflects what is reported in those sources. While the database developers attempt, to the best of their abilities, to corroborate each piece of information among multiple independent open sources, they make no further claims as to the veracity of this information. Users should not infer any additional actions or results beyond what is presented in a GTD entry. Specifically, users should not infer that an individual associated with a particular incident was tried and convicted of terrorism or any other criminal offense. If new documentation about an event becomes available, an entry may be modified as necessary and appropriate.

As discussed in more detail below, the first phase of data for the GTD (GTD1: 1970-1997) was collected by the Pinkerton Global Intelligence Service (PGIS)—a private security agency. Cases that occurred between 1998 and March 2008 were identified and coded by the Center for Terrorism and Intelligence Studies (CETIS), in partnership with START. A third data collection phase was instituted for cases that occurred between April 2008 and October 2011, with efforts led by the Institute for the Study of Violent Groups (ISVG) at the University of New Haven. Beginning with cases that occurred in November 2011, GTD data collection is done by START staff at the University of Maryland. In addition, GTD researchers have worked to supplement information on additional cases throughout the full duration of the GTD.

In addition to data originally collected by PGIS, CETIS, and ISVG, cases identified in other archives of terrorism incidents have also been incorporated.

Data Collection and the Definition of Terrorism

Data for GTD1 (1970-1997) were collected by PGIS. The collectors of the PGIS database aimed to record every known terrorist event within and across countries and over time, as identified in multi-lingual news sources, for the purpose of performing risk analysis for U.S. businesses. Incidents were collected according to the following definition of terrorism:

"the threatened or actual use of illegal force and violence by a non-state actor to attain a political, economic, religious, or social goal through fear, coercion, or intimidation."

It is well-recognized that divergent definitions of terrorism abound and that the nature and causes of terrorism are hotly contested by both governments and scholars. While certain broad elements of terrorism are generally agreed upon (such as the intentional use of violence), many other factors (such as whether the victims of terrorism must be non-combatants or whether terrorism requires a political motive) continue to be debated. Indeed, even where there is some consensus at the broadest level, there is often disagreement on the details.

While the original GTD1 employed the definition of terrorism utilized by PGIS, the second phase of data collection for the GTD (GTD2: 1998-2007) parsed the PGIS definition into parts and coded each incident so that users can identify only those cases that meet their own definition of terrorism. Based on the original GTD1 definition, each incident included in GTD2 had to be an intentional act of violence or threat of violence by a non-state actor. In addition, at least two of the following three criteria had to be met for inclusion in GTD2:

  1. The violent act was aimed at attaining a political, economic, religious, or social goal;
  2. The violent act included evidence of an intention to coerce, intimidate, or convey some other message to a larger audience (or audiences) other than the immediate victims; and
  3. The violent act was outside the precepts of International Humanitarian Law.

These criteria, which continue to be employed by data collectors in post-2007 collection efforts, were constructed to allow analysts and scholars flexibility in applying various definitions of terrorism to meet different operational needs. Users of the database can therefore select the definitional criteria that most closely match the definition of terrorism they are using and then filter the data set accordingly when performing searches or other analyses. For more details about the criteria and how to apply them in practice, please see the GTD Codebook.
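The filtering by definitional criteria described above can be sketched in a few lines of Python. This is a minimal illustration, not the GTD's own tooling: it assumes each incident is a dictionary with 0/1 flags named crit1, crit2, and crit3 (field names chosen for this sketch; consult the GTD Codebook for the actual variable names and coding), and the helper filter_by_criteria is hypothetical.

```python
# Hypothetical sample records. The field names crit1..crit3 are assumptions
# made for this sketch; 1 means the criterion is met, 0 means it is not.
incidents = [
    {"id": "A", "crit1": 1, "crit2": 1, "crit3": 1},
    {"id": "B", "crit1": 1, "crit2": 1, "crit3": 0},
    {"id": "C", "crit1": 0, "crit2": 1, "crit3": 1},
]


def filter_by_criteria(records, required):
    """Keep only the records that meet every criterion named in `required`."""
    return [r for r in records if all(r.get(name) == 1 for name in required)]


# Example: a definition that requires a political, economic, religious, or
# social goal (criterion 1) and an act outside the precepts of International
# Humanitarian Law (criterion 3).
strict = filter_by_criteria(incidents, ["crit1", "crit3"])
print([r["id"] for r in strict])
```

Passing a different list of names applies a different definition to the same records, so one data set can serve several operational definitions.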

Users can also exclude cases in which there was some doubt as to whether the incident was truly a terrorist act. Some incidents simply do not have enough information to make a definitive distinction between, for example, terrorism and insurgency. For more details about this filtering function, please see the GTD Codebook.
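As a rough sketch of the exclusion described above, the Python snippet below drops incidents flagged as doubtful. The field name doubt, its 0/1 coding, and the helper exclude_doubtful are assumptions made for illustration; the GTD Codebook documents the actual variable and its values.

```python
def exclude_doubtful(records, doubt_field="doubt"):
    """Drop records flagged as possibly not terrorism.

    Records with no flag at all are kept here. Whether to keep or drop them
    is the analyst's choice, not something this sketch prescribes.
    """
    return [r for r in records if r.get(doubt_field) != 1]


# Hypothetical sample records: B is flagged as doubtful, C has no flag.
sample = [
    {"id": "A", "doubt": 0},
    {"id": "B", "doubt": 1},
    {"id": "C"},
]

unambiguous = exclude_doubtful(sample)
print([r["id"] for r in unambiguous])
```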

Synthesis of GTD1 and GTD2

Until 2008, the Global Terrorism Database remained divided into two separate data sets. Integrating the two data sets was challenging primarily because of the definitional differences between GTD1 and GTD2. We first had to determine which GTD1 incidents met the GTD2 criteria for inclusion and therefore belonged in the comprehensive GTD database. Some GTD1 incidents—for example, those better described as guerrilla warfare—did not meet GTD2 inclusion criteria and were excluded from the combined database. In addition, GTD1 data originally had 44 descriptive variables per incident, while GTD2 had an additional 84 variables per incident. Thus, the synthesis involved developing corresponding GTD1 information for the additional GTD2 data fields, where possible.

To synthesize the two data sets, START, in conjunction with CETIS, implemented a system in which every GTD1 incident was reviewed, codes for the three definitional criteria and all other GTD2 fields were added to it, and coders evaluated the incident for inclusion in a new, synthesized GTD. Seventeen coders were trained, and the review of incidents ran from April 2008 to December 2008. Incidents that failed to meet at least two of the three criteria developed for GTD2 were removed from the new synthesized GTD.

Caution about Data Consistency

Even though efforts have been made to ensure the continuity of the data from 1970 to the present, users should keep in mind that the collection was done in real time for cases between 1970 and 1997, was retrospective between 1998 and 2007, and is again in real time after 2007. This distinction is significant because some media sources have since become unavailable, undoubtedly impeding efforts to collect a complete census of terrorist attacks between 1998 and 2007. Moreover, since the ongoing collection of the GTD moved to the University of Maryland in the spring of 2012, START staff have made significant improvements to the methodology used to compile the database. These changes, which are described both in the GTD Codebook and in the START Discussion Point "The Benefits and Drawbacks of Methodological Advancements in Data Collection and Coding: Insights from the Global Terrorism Database (GTD)," have improved the comprehensiveness of the database. Thus, users should note that differences in levels of attacks and casualties before and after January 1, 1998, before and after April 1, 2008, and before and after January 1, 2012 are at least partially explained by differences in data collection; researchers should adjust for these differences when modeling the data.

Furthermore, cases from 1993 were lost before the data were received from PGIS, and efforts thus far have been unsuccessful in fully recovering them. Instead of providing a partial listing of cases for 1993, we refer the user to a table provided by PGIS of the total number of attacks in 1993 for each country (see the GTD Codebook's Appendix).
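One possible way to account for the collection breaks named above is to tag each incident with the collection phase its date falls into and use that tag as a categorical control in a model. The Python sketch below uses only the three breakpoint dates stated in the text; the phase labels and the helper collection_phase are hypothetical, and this is one option among several, not a method the GTD prescribes.

```python
from bisect import bisect_right
from datetime import date

# Breakpoints taken from the text above; the labels are hypothetical.
BREAKPOINTS = [date(1998, 1, 1), date(2008, 4, 1), date(2012, 1, 1)]
LABELS = ["phase_1", "phase_2", "phase_3", "phase_4"]


def collection_phase(incident_date):
    """Return the label of the collection phase that contains the date.

    A date equal to a breakpoint belongs to the phase that starts on it.
    """
    return LABELS[bisect_right(BREAKPOINTS, incident_date)]


for d in (date(1985, 6, 1), date(2008, 4, 1), date(2015, 9, 30)):
    print(d.isoformat(), collection_phase(d))
```

An indicator of this kind only absorbs level shifts between phases. It does not correct for differences in coverage within a phase.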

Codebook Development

The criteria for incident inclusion and the coding scheme used in GTD were developed by a START Advisory Board, which consisted of recognized experts in terrorism and data collection. A detailed description of the database criteria and coding scheme can be found in the GTD Codebook. For more details on the GTD, please see our Frequently Asked Questions page, or download the GTD Codebook.