Data Collection Methodology
The Big Allied and Dangerous Dataset Version 2 (BAAD2) was developed to be a comprehensive, methodologically robust set of longitudinal data on terrorist and insurgent organizations (which we group together under the umbrella term “violent non-state actors,” or VNSAs), the network connections that exist between VNSAs, and the links to countries that host, sponsor, or are targeted by a given VNSA. Its primary purpose is to enable researchers and analysts to improve their understanding of terrorism, insurgency, and the links between these two forms of violent political action.
Inclusion Criteria
An entity is included in the BAAD2 dataset if the entity:
- Committed at least one terrorist attack as defined by the Global Terrorism Database (GTD) criteria between 1998 and 2012, and/or
- Was recorded in the Profiles of Incidents involving CBRN by Non-state Actors (POICN) dataset as having used, attempted to use, or pursued a chemical, biological, radiological, or nuclear weapon at least once between 1998 and 2012, and/or
- Was recorded in the Uppsala Conflict Data Program (UCDP) Battle Deaths dataset as having been responsible for at least 25 battle deaths in an insurgency between 1998 and 2012
AND
- Was an organization. We excluded individuals, generic groups (“Chechens”, “Palestinians”, etc.), and ad hoc groups that lacked key characteristics of organizations: boundaries to clearly delineate members and non-members, persistence over time, at least minimal internal differentiation (hierarchy, functional specialization, etc.), and resources held and/or owned for a collectivity rather than for individuals.
- Garnered enough coverage in our various sources to allow us to characterize a minimal set of variables: name, “homebase” country, and ideology.
Thus, inclusion in the database was driven by the definitions of terrorism and insurgency used in the major event datasets that capture information on terrorist and insurgent acts. To learn more about these datasets, their inclusion criteria, and their definitions of terrorism and insurgency, see:
- Global Terrorism Database: http://www.start.umd.edu/gtd/using-gtd/
- Uppsala Conflict Data Program Battle Deaths Dataset: http://www.pcr.uu.se/research/ucdp/datasets/ucdp_battle-related_deaths_dataset/
- Profiles of Incidents involving CBRN by Non-state Actors: https://www.start.umd.edu/news/cbrn-terrorism-non-state-actors
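The inclusion rule above can be read as a boolean test: at least one of the three activity criteria, plus both of the criteria listed after AND. The following is a minimal sketch of that reading, not the project's actual procedure. The `Candidate` class, its field names, and the `is_included` function are hypothetical, and the sketch assumes the counts have already been tallied for 1998 to 2012.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical per-entity summary for 1998-2012 (not the BAAD2 schema)."""
    gtd_attacks: int            # attacks meeting the GTD criteria
    poicn_cbrn_events: int      # CBRN use, attempted use, or pursuit events in POICN
    ucdp_battle_deaths: int     # battle deaths recorded in the UCDP dataset
    is_organization: bool       # passes the organizational test (boundaries, persistence, etc.)
    has_minimal_coverage: bool  # name, homebase country, and ideology can be coded


def is_included(c: Candidate) -> bool:
    """Apply the inclusion rule: (GTD or POICN or UCDP) and organization and coverage."""
    meets_activity = (
        c.gtd_attacks >= 1
        or c.poicn_cbrn_events >= 1
        or c.ucdp_battle_deaths >= 25
    )
    return meets_activity and c.is_organization and c.has_minimal_coverage
```

Under this reading, an entity with many recorded attacks is still excluded if it fails the organizational test, and an entity with 24 recorded battle deaths and no other qualifying activity falls below the threshold.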
Scope of Data
The Big Allied and Dangerous Dataset Version 2 was designed to gather a wide variety of organizational and network variables pertaining to each VNSA. The data presented on this website is drawn from three components of BAAD2:
Terrorist Organization (TORG) identification system. The TORG system is designed to provide an authoritative list of primary terrorist and insurgent entities and their associated aliases across time. For each primary entity, TORG includes the primary “homebase” country code (in both the Correlates of War and ISO country coding systems), age of founding, and associated ID numbers from allied datasets, including the Global Terrorism Database (GTD), the Minorities at Risk-Organizational Behavior (MAROB) dataset, the Profiles of Incidents involving CBRN by Non-state Actors (POICN) dataset, and the Uppsala Conflict Data Program (UCDP) Battle Deaths dataset. Using country codes, it is also possible to link data in BAAD2 to information from a wide range of country characteristic datasets. Using the TORG “crosswalk,” yearly summaries of, for instance, incident and fatality counts may be extracted from the Global Terrorism Database, participation in CBRN activities may be included from POICN, and country-level characteristics for a given year may be drawn from variables in the Correlates of War. The TORG system is designed to help researchers broaden the range of factors that may be included in models of terrorist organizational behavior and to bring information on terrorism and terrorist organizations into studies of war, insurgency, and non-lethal forms of political violence. The TORG system currently contains entries for more than 2,400 primary entities and over 2,800 aliases.
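As an illustration of how a crosswalk of this kind might be used, the sketch below maps external incident records to primary entities and builds yearly incident and fatality counts. All field names (`torg_id`, `gtd_id`, and so on), the sample rows, and the `yearly_summary` function are hypothetical and do not reflect the actual TORG or GTD schemas.

```python
from collections import defaultdict

# Hypothetical crosswalk rows: one primary entity per row with its external ID.
crosswalk = [
    {"torg_id": 1, "name": "Group A", "gtd_id": "G-100"},
    {"torg_id": 2, "name": "Group B", "gtd_id": "G-200"},
]

# Hypothetical incident records keyed by the external dataset's own group ID.
incidents = [
    {"gtd_id": "G-100", "year": 1999, "fatalities": 3},
    {"gtd_id": "G-100", "year": 1999, "fatalities": 2},
    {"gtd_id": "G-200", "year": 2000, "fatalities": 0},
    {"gtd_id": "G-999", "year": 2000, "fatalities": 7},  # no crosswalk entry
]


def yearly_summary(crosswalk, incidents):
    """Count incidents and fatalities per (torg_id, year) via the crosswalk."""
    gtd_to_torg = {row["gtd_id"]: row["torg_id"] for row in crosswalk}
    summary = defaultdict(lambda: {"incidents": 0, "fatalities": 0})
    for inc in incidents:
        torg_id = gtd_to_torg.get(inc["gtd_id"])
        if torg_id is None:
            continue  # skip records that cannot be linked to a primary entity
        key = (torg_id, inc["year"])
        summary[key]["incidents"] += 1
        summary[key]["fatalities"] += inc["fatalities"]
    return dict(summary)


summary = yearly_summary(crosswalk, incidents)
```

The same pattern would apply to the other linked datasets by swapping in the corresponding ID column. Records with no crosswalk entry are dropped here, which is a design choice of the sketch rather than a documented BAAD2 rule.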
The Organizational Covariates (OC) Dataset. The OC dataset contains 62 variables that characterize the nature of each organization. These variables include: ideology, membership size, age, structure, financial and material support, electoral and political involvement, leadership loss, territorial control, provision of social services, and counterterrorism efforts directed at the organization (see the BAAD2 codebook for more detail). For this website we have also included, as an organizational variable, the count of fatalities attributed to each organization in the GTD.
The VNSA Network (VN) Dataset. The VN dataset characterizes relationships (1) between VNSAs and (2) between countries and VNSAs. Relationships are coded for categories such as: suspected ally, ally, faction, splinter group, rival, enemy, target, and state sponsor. This data is used to create the dynamic visualizations found on this website. See the BAAD2 codebook for more details.
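One common way to work with relationship data of this kind is as a yearly edge list. The sketch below assumes a hypothetical `Edge` record with `source`, `target`, `relation`, and `year` fields; these names and the sample rows are illustrative only and are not the VN dataset's actual layout.

```python
from collections import namedtuple

# Hypothetical edge record; field names are illustrative, not the BAAD2 schema.
Edge = namedtuple("Edge", ["source", "target", "relation", "year"])

edges = [
    Edge("Group A", "Group B", "ally", 2001),
    Edge("Group A", "Group C", "rival", 2001),
    Edge("Country X", "Group A", "state sponsor", 2001),
    Edge("Group A", "Group B", "ally", 2002),
]


def related(edges, actor, relation, year):
    """Return actors tied to `actor` by `relation` in `year`, in either direction."""
    found = set()
    for e in edges:
        if e.relation != relation or e.year != year:
            continue
        if e.source == actor:
            found.add(e.target)
        elif e.target == actor:
            found.add(e.source)
    return sorted(found)
```

For example, `related(edges, "Group A", "ally", 2001)` returns `["Group B"]`. The sketch treats every relationship as undirected for lookup purposes. Directed categories such as target or state sponsor would need the direction preserved in a fuller treatment.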
Sources
Information in BAAD2 is drawn entirely from publicly available, open-source materials. These include electronic news archives, existing datasets, secondary source materials such as books and journals, and legal documents. All information contained in BAAD2 reflects what is reported in those sources. While the developers attempt, to the best of their abilities, to corroborate each piece of information among multiple independent open sources, they make no further claims as to the veracity of this information.
Information is not added to BAAD2 unless and until we have determined that the sources are credible, though coding still requires ongoing research judgments. In some instances, we will choose to “carry forward” a specific coding even in the absence of clear supporting documentation. For instance, if Group A is coded as religiously inspired for 1999 based on reliable sources, we may choose to continue that coding for additional years even without a published source for, say, 2000 or 2001, given that research has shown that ideology is relatively consistent over time. These decisions are made on a variable-by-variable basis by the faculty and staff involved in the project. Data released for research purposes fully documents instances where values have been inferred from previously reported values.
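The carry-forward approach described above can be sketched as a forward fill that also flags which values were inferred rather than directly sourced. This is a minimal illustration under assumed names: the `carry_forward` function and its row layout are hypothetical and do not describe the project's actual tooling or its variable-by-variable decisions.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def carry_forward(
    observed: Dict[int, str], years: Iterable[int]
) -> List[Tuple[int, Optional[str], bool]]:
    """Return (year, value, inferred) rows, filling gaps with the last sourced value.

    `inferred` is True only when the value was carried forward from an earlier
    year. Years before the first sourced value stay empty (None).
    """
    rows = []
    last_value = None
    for year in years:
        if year in observed:
            last_value = observed[year]
            rows.append((year, last_value, False))
        else:
            rows.append((year, last_value, last_value is not None))
    return rows


# Group A is sourced as religiously inspired for 1999 only.
rows = carry_forward({1999: "religious"}, range(1998, 2002))
```

Here 2000 and 2001 receive the value `"religious"` with the inferred flag set, while 1998 stays empty because nothing precedes it. Keeping the flag alongside each value is one way to document which values were inferred from previously reported ones.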
Users should not infer any additional actions or results beyond what is presented in a BAAD2 entry. Specifically, users should not infer that an individual associated with a particular organization was tried and convicted of terrorism or any other criminal offense. If new documentation about an organization becomes available, an entry may be modified as necessary and appropriate.
Caution about Data Consistency
Because BAAD2 depends on reporting, users should be aware of the potential for “availability bias.” Availability bias can take several forms:
- Differences in global awareness of terrorism and insurgency. While political violence was clearly a major concern before the September 11, 2001, attacks, there is no doubt that media and academic attention to terrorism and insurgency rose immensely in the aftermath. Over time there have also been ebbs and flows in attention that can be traced in the level of media effort to report on violent non-state actors. This bias may manifest, for instance, in greater ease in finding and coding network relations between VNSAs after September 11th than before, more information on certain hard-to-code items (such as funding levels), and greater overall availability of data on small and relatively inactive organizations.
- Differences in Internet penetration. One reason why BAAD2 does not code back to the early 1990s is that the Internet was only reaching maturity in the late 1990s. Just as the September 11th attacks increased reporting, the Internet provides more outlets for those interested in terrorism to make their reporting known more broadly. As the Internet has grown and made it easier for even small media outlets to distribute internationally, it has become easier to find sources for difficult-to-code issues like funding sources and for difficult-to-code small groups.
- Language. While we have at various times used coders who speak languages other than English (including speakers of Arabic, Spanish, Urdu, and Chinese) and we have worked with automated coding vendors that can process multiple languages, the bulk of our sourcing is in English. Thus there are undoubtedly reliable sources in other languages that we have not consulted and that could enrich our current knowledge and possibly contradict our current codings. We continue to work on several fronts to improve our coverage in other languages.
Codebook Development
The criteria for incident inclusion and the coding scheme used in BAAD2 were developed by Victor Asal and Karl Rethemeyer with input from Ian Andersen, Corina Simonelli, and Ken Cousins. A detailed description of the database criteria and coding scheme can be found in the BAAD2 Codebook, which is available for download. For more details on BAAD2, please see our Frequently Asked Questions page.