Sourcing

Reliability, quality control, and accounting for bias

Published on: 20 March 2023 | Last updated: 20 September 2024

What types of sources does ACLED use?

ACLED uses four types of sources, all of which are reviewed each week. ACLED researchers assess thousands of sources in dozens of languages to provide the most comprehensive database on political violence and demonstrations. The four types are:

Traditional Media: This includes all subnational, national, regional, and international media outlets that are governed by journalistic principles of verification.
Reports: International institutions and non-governmental organizations – such as aid groups, human rights organizations, and investigative journalism groups – regularly publish reports on political violence. Where applicable, ACLED incorporates events from these reports. Under certain conditions, reports from groups involved in conflict themselves are also included (Ministries of Defense, armed groups, NATO, etc.).
Local Partner Data: The past decades have seen an increase in conflict observatories established at the local level as both social activism and the ability to report political violence have increased. These organizations leverage their local knowledge as they collect and obtain information through primary and/or secondary means. ACLED develops relationships with local partners to enhance the depth and quality of its data.
New Media (targeted and verified): ‘New media’ (e.g. Twitter, Telegram, WhatsApp) can be a powerful supplemental source, but varies widely in terms of quality. Therefore, ACLED does not crowdsource or scrape large amounts of social media. Rather, a targeted approach to the inclusion of new media is preferred through either the establishment of relationships with the source directly, or the verification of the quality of each source.

Do more sources mean that data are more reliable?

Not necessarily. Neither the number of sources nor the specific types of sources will guarantee that the data are more reliable. It is important to remember that countries have unique conflict and media landscapes,¹ and that all sources contain some biases² and specific focuses. The inclusion of more sources, or sources not tailored to a country’s specific context, means simply reproducing these variables on a larger scale. For example, international media will generally report on different types of violence, actors, and locations relative to local media.³ New media will generally report more violence in urban and heavily populated areas, as a result of where its users are based.⁴ Well-researched reports will have a focus on specific types of violence (e.g. human rights violations) and may discard those events that cannot be corroborated. The content from local partners relying on primary data collection will depend on the networks that these partners have developed, typically leaving them confined to particular areas and social groups. Financial constraints, the patterns of war, or donor funding may also impact local partner coverage. Simply increasing the number and types of sources will not account for these variables or produce more reliable data.

Moreover, every country has unique variables that need to be considered, such as geography, freedom of the press, and types of violence. Some countries experience violence in hard-to-access or remote areas, while others may experience violence primarily in well-defined urban areas. In some contexts, state or other armed actors may temporarily lock down the Internet to perpetrate specific types of violence (e.g. violence against civilians), while others will repress media year-round.⁵ On the other hand, many countries have a generally free and well-funded press; these contexts could pose the alternative obstacle of forcing researchers to wade through numerous reports to find unique or relevant ones. Types of violence also differ from country to country.⁶ Some experience high-intensity violence (e.g. suicide bombings), which tends to be well-reported; others may experience a type of disorder that is less reported, such as small protests that fall under the radar or sexual violence that societal norms may push towards under-reporting. Again, increasing the number and types of sources will only perpetuate these patterns of bias.

Hence, while it may seem intuitive that more reports lead to increased reliability, ACLED does not seek to simply increase the number and types of sources as a means to improve reliability. The quantity of information does not ensure quality. In fact, more sources may lead to data of a lesser quality as inherent biases will be amplified.

What does make sourcing more reliable, then?

Only a tailor-made sourcing process for individual regions/countries will make data more reliable.⁷ Given the variation in types of violence, available sources, and potential biases amongst countries, ACLED develops sourcing strategies adapted to the specific challenges at hand in each unique context. The goal of these strategies is to construct source combinations that approximate the reality of violence in each country/region.⁸

In addition, ACLED has found that one practice, in particular, tends to increase the reliability of data: prioritizing local sources. In prioritizing local sources, the ACLED approach contrasts starkly with approaches taken by other databases that generate conflict data based on traditional media alone⁹ (for more see ACLED’s report on Comparing Conflict Data). Traditional media (specifically international traditional media) has a number of known biases that create a less accurate picture when taken as the only source type (and this can be further exacerbated if looking at English-language traditional media alone). First, there are certain remote or dangerous locations to which reporters cannot or will not go (e.g. parts of Somalia or Yemen). Second, these sources tend to focus on large or ‘sensational’ events,¹⁰ ignoring those of a smaller scale or protracted conflicts which lack major changes. These biases stem from a number of limitations experienced by traditional media, such as readers’ attention, available space in newspapers, the process of verifying information, and the demands of the 24-hour news-cycle audience. The result is a lack of events that feature violence in rural areas, small-scale skirmishes, violence targeting women, or ongoing conflicts for which a source’s audience has lost appetite, to name a few. Data generated from these sources may not show actual conflict patterns but rather depict the reporting patterns of media.

While no panacea, ACLED finds that local partner and subnational media generally produce more reliable data in the sense that the above-noted biases of traditional international media are avoided. They are thus incredibly useful when attempting to balance against the deficiencies of traditional media. As the mandates of many local organizations are focused on maintaining and building upon existing social networks, they generally account for smaller-scale events and will do so consistently over time. For example, in Myanmar, when faced with particularly biased traditional and new media sources, ACLED uses sources from local partners to fill in reporting gaps.¹¹ Similarly, the complex case of the Congo requires the use of local partner data to account for media fatigue and micro-complexities, given the nature of the violence in the country. However, biases remain.¹² Local organizations will often capture only specific types of violence, often in line with their mandate (e.g. Airwars primarily collects information on violence by airstrikes), or violence from a single region only (e.g. Deep South Watch collects information only on violence in the southern states of Thailand). Finding multiple local partners and stitching together information from various organizations thus may allow for a fuller picture to be created – as is done in ACLED’s coverage of the Syrian War (for more, see this ACLED report on Reliable data on the Syrian conflict by design).

What is the process for producing country and regional sourcing strategies?

ACLED first creates a preliminary source list based on established and prolific local news sources from each country. ACLED Researchers are hired from around the world with relevant language skills as well as regional context knowledge to cultivate an appropriate source list. Next, that preliminary source list is expanded upon with news sources from adjacent countries (e.g. Iraqi sources on Turkey and vice-versa), reports from non-governmental agencies which operate in the region, and vetted social media accounts from journalists, analysts, and organizations. With this information, the reporting patterns and viability of media sources are assessed, as is reporting from active local armed groups. This is often done with advice and expertise from local partners, researchers familiar with the area, country experts, regional media consultants, local organizations, or universities in the region, amongst others.

Each source that is used for coding is assigned a specific source scale value, which depends on the relative ‘distance’ of the source from an event. For example, a source based in Sao Paulo state reporting on an event in Sao Paulo would be given the scale ‘Subnational,’ while the same source reporting on an event in China would be given the scale ‘International.’ From lowest to highest, the seven source scales used are:

Source_scale	Description
Local partner	Partners that provide ACLED with data (e.g. Protect Defenders).
Other	Reports from international organizations, humanitarian groups, conflict parties, governments; essentially any source that is not traditional media or new media (e.g. reports from the Afghan Ministry of Defense in Afghanistan or Human Rights Watch in Brazil).
New media	Sources that are social media accounts (e.g. on Twitter, Telegram, etc.), or rely on social media.
Subnational	Traditional media sources based in the same subnational region of the country as that which is coded in the event (e.g. when Mogadishu Times is used to code an event in Mogadishu, Somalia).
National	Traditional media sources based in the same country as the one coded in the event, though outside of the subnational region within that country where the event occurred (e.g. when the Addis Tribune is used to code an event in western Ethiopia).
Regional	Traditional media sources based in a country outside of the country coded in the event, though in the same region as said country (e.g. when Radio Afrique France is used to code an event in Mali).
International	Traditional media sources based outside of the region of the country coded in the event (e.g. when Reuters is used to code an event in Nigeria).

The source scale value assigned to each source is then used to automatically generate the ‘Source_scale’ variable in the ACLED dataset. The variable tracks the two lowest source scales (at most) used to code an event, listed as a range. For example, if national, regional, and international source scales are used to code an event, the ‘Source_scale’ column would be ‘National-Regional,’ indicating that the lowest source used to code the event was at the national level, and the next lowest was at the regional level. The ‘International’ scale would not be represented in the ‘Source_scale’ column in this case. No scale guarantees more direct information, accuracy, or legitimacy than another, but ACLED supports gathering and using local sources whenever possible.
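The derivation of the range can be illustrated with a minimal Python sketch. This is not ACLED's actual implementation: the function name `source_scale` is hypothetical, the ordering is taken from the table above, and the behavior for an event coded from a single scale (returning that one label) is an assumption, as the text does not specify it.

```python
# Ordering taken from the source scale table, lowest to highest.
SCALE_ORDER = [
    "Local partner",
    "Other",
    "New media",
    "Subnational",
    "National",
    "Regional",
    "International",
]
_RANK = {name: position for position, name in enumerate(SCALE_ORDER)}


def source_scale(scales):
    """Return the (at most) two lowest distinct scales as a hyphenated range.

    Hypothetical helper: an event coded from a single scale is assumed
    to yield that one label. Unknown scale names raise KeyError.
    """
    distinct = sorted(set(scales), key=_RANK.__getitem__)
    return "-".join(distinct[:2])


# The example from the text: the 'International' scale is dropped.
print(source_scale(["International", "National", "Regional"]))  # National-Regional
```

Duplicate scales are collapsed before ranking, so an event coded from several national outlets and one regional outlet would still read ‘National-Regional’ under these assumptions.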

This information can then be used to determine the ‘reporting profile’ of each country: what source(s) produce the most unique events; what source(s) tend to cover certain areas of the country or certain types of conflict; what source(s) may have obvious reporting biases. Based on this, a country-specific sourcing strategy is developed and local partnerships are sought to help address gaps.
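One element of such a reporting profile, namely which sources produce the most unique events, could be approximated as in the sketch below. It assumes a hypothetical mapping from event identifiers to the set of sources cited for each event, and it treats an event as ‘unique’ to a source when no other source reported it. The event identifiers and source names are made up for illustration, and ACLED's own tooling may define and measure this differently.

```python
from collections import Counter


def unique_event_counts(event_sources):
    """Count, per source, the events that no other source reported.

    event_sources: dict mapping a (hypothetical) event id to the set of
    sources cited for that event.
    """
    counts = Counter()
    for sources in event_sources.values():
        if len(sources) == 1:
            # Exactly one source cited: the event is unique to it.
            counts[next(iter(sources))] += 1
    return counts


# Illustrative data only.
events = {
    "E1": {"Source A"},
    "E2": {"Source A", "Source B"},
    "E3": {"Source B"},
    "E4": {"Source A"},
}
profile = unique_event_counts(events)
print(profile.most_common())  # [('Source A', 2), ('Source B', 1)]
```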

If some of ACLED’s sources are known to be biased, why are they being used?

It is true that the credibility of information varies according to the source. Reporting bias is prevalent, especially in the context of an ongoing conflict where political and armed groups have reason to inflate their own achievements and deflate those of their opponents. ACLED has found that reports by local sources, reputed human rights organizations, and the United Nations generally have more detailed verification processes and are less prone to these conflict biases. They are, therefore, preferred in cases of conflicting details.

However, ACLED does use sources that are known to be biased when it is found that they provide reliable information. For example, the Syrian Observatory for Human Rights (SOHR) is sometimes accused of reporting incorrect information; however, after comparing their data with various verified local and regional providers, ACLED has found that SOHR captures the same conflict patterns.

Conflict parties, in particular, have an incentive to exaggerate their achievements, while playing down those of their opponents. At the same time, armed groups typically report small-scale skirmishes or assaults in remote areas where more reputable sources lack access, such as in certain parts of Africa where the Islamic State (IS) may be the most regular source for events within their areas of activity. Therefore, ACLED relies on the same process as with other biased sources and tests whether a conflict actor provides credible information. This may mean that data provided by a particular actor is used for one region, but not another if it is determined to be more reliable within a certain regional context. For example, ACLED includes Telegram reports about IS in Iraq but does not do so for reports from Burkina Faso, where the group’s presence is limited. To account for potential reporting incentives of armed actors, ACLED Researchers corroborate large or unusual events reported by armed groups by triangulating such events with other sources, as these are the types of events that other sources would be expected to have captured had they occurred. Moreover, ACLED considers reported fatalities from these sources to be less reliable and records this reservation in the event notes, as well as in the majority of its published literature.

Does ACLED have processes in place for sourcing quality control?

Yes. In addition to developing country and region-specific sourcing strategies and involving specialized country and regional researchers, ACLED has four mechanisms in place to ensure the continued monitoring of sources, their usage, and the quality of data:

Source control: Every week, ACLED Researchers review thousands of sources in multiple languages, and collect instances of political violence and demonstrations. After Researchers have read and coded these data, ACLED Research Managers check whether all sources have been covered.¹³ Moreover, they use a system to trace the number of citations per source and the geographic spread over time to detect if sources have been missed. Lastly, sources that prove to be inconsistent or unproductive over a long period of time will be taken off the source list to increase efficiency. Any coverage gaps resulting from such changes are then addressed with the addition of equivalent new sources. These quality processes ensure consistency in the data.
Continued identification of new sources: Fixed source lists based on a country strategy carry the risk that new sources are not identified, and important information may be missed. For this reason, new sources are added to ACLED data on an ongoing basis. These include media sources discovered during supplemental research, newly vetted social media accounts, and local research groups that have sought to establish a partnership with ACLED. Sometimes, sources that were prolific may be compelled to cease operations – either temporarily or permanently – when a new government comes into power. In such cases, ACLED proactively seeks out new sources that might be suited to replace the sources that can no longer operate effectively. New sources are reviewed, tested, and – once established as useful – added to the weekly list of sources that Researchers track. To ensure that new source additions do not introduce an artificial spike into trends, supplemental coding is undertaken to review and code that source for past periods.¹⁴
Corrections: The addition of new information means that new events are identified and added to the data. It also means that existing published data have to be updated with more accurate information as new information comes to light. Corrections to data are made in instances where a source offers additional or improved information on a published event. The most common corrections include: a more specific location or event date, updated fatality counts, or updates to actors involved (e.g. if a group claims responsibility for an attack). Corrections to data are made alongside weekly data releases.¹⁵
Anonymization: As ACLED works with local partners in conflict zones, certain partners may ask to remain anonymous for safety reasons, such as ACLED partners in Syria, Burundi, and Somalia, amongst others. These relationships are guided by ACLED’s do-no-harm policy, which ensures regular discussion between ACLED and partners to assess the potential implications of data sharing. Events coded from reports from such partners are attributed in the data as ‘Local partner.’
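The source-control mechanism listed above mentions tracing the number of citations per source over time to detect missed sources. A minimal sketch of one such check follows, under loudly stated assumptions: weeks are represented as consecutive integers, citations are (week, source) pairs, and a source is flagged when it was cited in every one of the previous `lookback` weeks but not in the current one. The function name, the data shape, and the flagging rule are all hypothetical and are not ACLED's actual system.

```python
from collections import defaultdict


def silent_sources(citations, current_week, lookback=4):
    """Flag sources cited in each of the previous `lookback` weeks
    but absent in `current_week` (hypothetical rule).

    citations: iterable of (week_number, source_name) pairs.
    """
    weeks_by_source = defaultdict(set)
    for week, source in citations:
        weeks_by_source[source].add(week)

    previous = set(range(current_week - lookback, current_week))
    return sorted(
        source
        for source, weeks in weeks_by_source.items()
        if previous <= weeks and current_week not in weeks
    )


# Illustrative data only: "B" was cited in weeks 1-4 but not in week 5.
citations = [(w, "A") for w in range(1, 6)] + [(w, "B") for w in range(1, 5)] + [(3, "C")]
print(silent_sources(citations, current_week=5))  # ['B']
```

A flagged source is only a prompt for a manual check: it may have been missed, or it may simply have had nothing to report that week.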