What is covered by these tags?
ACLED uses four automatically generated infrastructure tags when coding events that occur in Ukraine, each covering a vital civilian infrastructure sector: energy, health, education, and residential infrastructure. A tag is applied if the ‘notes’ of an event indicate that the given sector’s infrastructure was damaged. If more than one sector’s infrastructure was damaged in an event, multiple tags are included in the column, separated by semicolons (a short parsing sketch follows the sector descriptions below). The scope of each tag is briefly described below.
Energy:
Broadly covers the electricity, oil, natural gas, nuclear energy, coal, and hydroelectricity sectors (e.g., production, treatment, storage, transmission, distribution, supply)
- Caveats
  - Excluded: military energy-related sites (such as military fuel depots) and gas stations (which relate to transport infrastructure)
  - Included: events whose notes state a ‘power outage’ resulted from an attack, even if no damage is reported (as an outage indicates damage elsewhere in the sector)
Health:
Broadly covers hospitals, clinics, medical centers, pharmacies, dispensaries, and ambulances
- Caveats
  - Included: military health infrastructure, nursing homes
Education:
Broadly covers schools, kindergartens, childcare facilities, colleges, universities, orphanages, and any other educational institutions
- Caveats
  - Excluded: dormitories (of universities/schools), as these are covered under the residential tag
Residential:
Broadly covers residential houses, high-rise and apartment buildings, and residential areas
- Caveats
  - Included: high-rise buildings (even if ‘residential’ is not mentioned specifically, as such buildings have a high likelihood of being residential in Ukraine), dormitories (of universities/schools)
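Because an event can carry several semicolon-separated tags, analyses typically begin by splitting the tag column. Below is a minimal sketch in pandas; the column name ‘infrastructure_tags’ is an assumption for illustration and should be replaced with the actual column name in the published dataset.

```python
import pandas as pd

# Hypothetical two-event sample; 'infrastructure_tags' is an assumed column name.
events = pd.DataFrame({
    "notes": [
        "Shelling damaged a power station and a nearby apartment block.",
        "A missile struck a school building.",
    ],
    "infrastructure_tags": ["energy; residential", "education"],
})

# Split on semicolons and strip whitespace so each event holds a clean tag list.
events["tag_list"] = events["infrastructure_tags"].str.split(";").apply(
    lambda tags: [t.strip() for t in tags]
)

# Example: select all events where energy infrastructure was damaged.
energy_events = events[events["tag_list"].apply(lambda tags: "energy" in tags)]
```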
What types of events are covered by the automated tags?
The infrastructure tags are applied to ACLED events with an event date (event_date column) of January 2022 or later. Only events occurring within Ukraine (country column) are considered, and only the event types (event_type column) ‘Battles’ and ‘Explosions/Remote violence’ are covered.
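As a rough illustration, this scope corresponds to the following filter on a standard ACLED export (the file name is a placeholder):

```python
import pandas as pd

# Load an ACLED export; 'acled_export.csv' is a hypothetical file name.
acled = pd.read_csv("acled_export.csv", parse_dates=["event_date"])

# Keep only events in scope for the infrastructure tags.
in_scope = acled[
    (acled["event_date"] >= "2022-01-01")            # January 2022 onward
    & (acled["country"] == "Ukraine")                # Ukraine only
    & (acled["event_type"].isin(["Battles", "Explosions/Remote violence"]))
]
```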
How do these tags differ from ACLED’s other tags?
Tags that record damaged infrastructure in Ukraine differ from ACLED’s other tags in that they are automatically created using a large language model (LLM), whereas all other tags are coded manually by our researchers. An LLM is a type of artificial intelligence that processes and, in the case of certain LLMs, generates text based on patterns learned from vast amounts of data. In the context of classification models, like those used here, an LLM can analyze and categorize text (for example, identifying whether a text is about a certain topic) by recognizing keywords, phrases, and context.
Using the ‘notes’ column, we fine-tuned an LLM specifically for classifying infrastructure-related ACLED events. ACLED manually collated training data, and the model was iteratively improved until its performance was deemed satisfactory. The model was then applied on an ongoing basis, its performance was monitored weekly, and corrections were fed back into the model for further tuning. This process resulted in a consistently high-performing model over a six-month period of internal testing.
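ACLED does not specify which model or framework it uses, but a fine-tuning workflow of this kind commonly resembles the sketch below, here using the Hugging Face ‘transformers’ and ‘datasets’ libraries with a small pretrained encoder. The model choice, labels, and training examples are all illustrative assumptions, not ACLED’s actual setup.

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Hypothetical hand-labeled training data: 1 = sector infrastructure damaged.
train = Dataset.from_dict({
    "text": [
        "Shelling destroyed a transformer substation, causing a power outage.",
        "Troops clashed near the village; no damage was reported.",
    ],
    "label": [1, 0],
})

model_name = "distilbert-base-uncased"  # any pretrained encoder would do
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Tokenize the 'notes'-style text; the label column is passed through unchanged.
train = train.map(lambda batch: tokenizer(batch["text"], truncation=True), batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sector-tagger", num_train_epochs=3),
    train_dataset=train,
    tokenizer=tokenizer,
)
trainer.train()
```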
How accurate are these LLM-based tags?
After we created the initial fine-tuned versions of the four models, one for each sector, accuracy was determined using data set aside for testing. The test data was balanced across relevant and irrelevant events. Applied to this data, the models achieved the following levels of accuracy (calculated as the share of events tagged correctly):
- Energy: 96.8%
- Health: 100%
- Education: 97.9%
- Residential: 96.2%
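On a balanced test set, accuracy in this sense is simply the share of events whose predicted tag matches the manually assigned label, as in this minimal sketch:

```python
# Share of test events tagged correctly (both positive and negative cases).
def accuracy(predicted: list, actual: list) -> float:
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# E.g., 484 correct out of 500 test events -> 0.968, the energy tag's score.
```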
After achieving this performance, we began applying these models to our data on a weekly basis, followed by a manual review. Specifically, we reviewed every event that either received a tag or matched a keyword for one of the four sectors, so that false positives and probable false negatives would be caught. After six months of these evaluations, we again assessed performance for each sector, calculated as the share of correctly tagged events (positive or negative) among all events with either a tag or a keyword match in that sector. For example, in the case of the ‘energy’ infrastructure tag, we summed all events where an energy-related keyword was matched in the event ‘notes,’ the model assigned an ‘energy’ tag, or both (all energy-related events). We then summed all events that were correctly assigned the ‘energy’ tag, plus those the model correctly identified as irrelevant (i.e., did not assign an ‘energy’ tag to). Dividing the latter by the number of all energy-related events gives the final accuracy score for each tag (a sketch of this calculation follows the list below):
- Energy: 95.6%
- Health: 98.2%
- Education: 96.9%
- Residential: 98.7%
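A minimal sketch of this calculation for one sector follows. The column names ‘keyword_match’, ‘model_tag’, and ‘true_tag’ are illustrative assumptions standing for the keyword match, the model’s decision, and the manually reviewed ground truth.

```python
import pandas as pd

def sector_accuracy(df: pd.DataFrame) -> float:
    # All sector-related events: keyword matched, tag assigned, or both.
    related = df[df["keyword_match"] | df["model_tag"]]
    # An event counts as correct when the model's decision (tag or no tag)
    # agrees with the manually reviewed label.
    correct = (related["model_tag"] == related["true_tag"]).sum()
    return correct / len(related)
```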
Is there any manual supervision of these tags?
ACLED uploads each week’s newest tagged data one day after the latest event data are published on our website, i.e., each Wednesday. To upload this tagged data promptly, newly tagged events are not manually reviewed at the initial point of publication. Over the following week, however, our team reviews the tags manually, and corrected data are uploaded along with the latest week’s tagged data the next Wednesday. In other words, the tags are reviewed on a one-week lag. Users can either wait a week after ACLED’s newest events are published and use the reviewed data, or download the dataset immediately and cross-check it for corrections afterward.

Finally, please note that new events added to this dataset each week will mostly have occurred in the past week, but they may have an event date anytime between January 2022 and the most recent data upload, as backcoding exercises can add historical events or correct existing ones. Tagged data are available from January 2022, and at the time of launching this curated dataset, all tags up to 31 January 2025 had been manually reviewed.
How is this classification model superior to a keyword search?
Training and applying a supervised classification model to the ‘notes’ column can offer advantages over a keyword search in a range of use cases. For instance, when searching for events with damaged infrastructure, a model can be trained not only to recognize events relating to such infrastructure but specifically to identify when it was damaged or destroyed. This approach automatically excludes events mentioning, for instance, a missile hitting near such infrastructure without harming it.
Another use case arises when one is interested in a relatively broad category (e.g., damage to energy infrastructure). The ‘notes’ mention a range of cases relating to this broader category, such as damage to gas pipelines or power plants. Instead of having to capture every single type of energy infrastructure in a keyword list, a model trained on a variety of examples of such events can catch further variations, and it can do so with high accuracy (depending on the size and quality of the training data, accuracy can reach up to 99%, as per ACLED’s own experiments). The contrast with a naive keyword filter is sketched below.
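To make the contrast concrete, the baseline below flags any event whose notes mention an energy-related term. The keyword list is illustrative, not ACLED’s actual list; note that such a filter cannot distinguish damage from a near miss, which is exactly where a trained classifier helps.

```python
# Illustrative keyword list; a real list would be far longer and still incomplete.
ENERGY_KEYWORDS = ("power plant", "substation", "gas pipeline", "power line")

def keyword_match(notes: str) -> bool:
    text = notes.lower()
    return any(keyword in text for keyword in ENERGY_KEYWORDS)

# Both match, although only the first event involves actual damage:
keyword_match("A missile destroyed a gas pipeline near Kharkiv.")       # True
keyword_match("A missile landed near a gas pipeline; no damage done.")  # True
```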
What is the step-by-step approach for fine-tuning such a model and ensuring that rare events/edge cases are caught?
For more information on the modeling approach and best practices for text classification using the ACLED ‘notes’ column, visit ACLED’s Knowledge Base.