This blog is a part of International Justice Monitor’s technology for truth series, which focuses on the use of technology for evidence and features views from key proponents in the field.
This blog post builds upon, and amplifies, Megan Price’s excellent analysis of the challenges of using found data and user generated content in human rights investigations in the previous technology for truth blog post. As someone who has been engaged in the development of new technologies and approaches to human rights fact-finding for the past several years, her warnings rang especially true to me. That said, I would also like to make the case that user generated videos or photographs can provide crucial information and reliable legal evidence in certain instances, even when the information emerges from a biased and unrepresentative sample. These situations revolve primarily around determining what happened at a particular time and place, identifying who was present when an event happened, or determining conditions in a place that is not currently accessible by human rights investigators.
In almost all cases, user generated visual content is the starting point for an investigation, not an end point. No human rights researcher should ever take any data source, particularly one as complex as social media content, at face value as “the Truth.” Rather, the researcher must take the opposite approach: assume that each individual piece of content is potentially false or misleading and that the collected dataset as a whole is biased and incomplete. The investigation begins by seeking to verify the content using the latest available approaches, such as the Verification Handbook for Investigative Reporting, a resource guide for using open source information and user-generated content in investigative work, and Citizen Evidence Lab’s Human Rights Citizen Video Assessment Tool.
Once a video or photograph has been verified as having been taken in the place and at the approximate time stated by the user who uploaded it, the human rights researcher must then gather as much contextual information about the media as possible in order to assess its probative/situational value. Ideally, this will involve communicating directly with the creator of the content to find out the circumstances in which it was generated and why this individual chose to capture a scene in the way he or she did. Did the content producer edit certain things out, did she start her camera after some initial set of events, or did she stop filming before something else happened?
In addition to this communication, and especially when it is not possible (there are many reasons why the person responsible for creating a video or image cannot be reached), the researcher should seek out as many additional accounts of the same situation as possible. Do these other accounts tell the same story as the initial video or photograph, or do other narratives emerge? The researcher should also seek out people with specialized knowledge of the relevant place, situation, or region—either through lived experience or academic study—to determine whether the initial interpretation of the user generated content is plausible. All media, whether created by amateurs or professionals, must be understood in its unique political, cultural, and historical context.
As Price notes, any individual source, or even a large body of videos or photographs, must be assumed to be biased and incomplete unless it can be determined that every single possible angle and time point pertinent to the situation has been captured and that no other information could exist to contradict the narrative that emerges from that body of material. Did certain aspects of an event take place in areas that could not be accessed by those with mobile phones or cameras? Is the Internet shut down in a region, or is access unavailable to certain groups that might suffer from egregious human rights violations? Do particular kinds of people not share videos for cultural or other reasons (such as the assumption that posting such information will put them in grave danger without leading to any meaningful short-term benefits)? This is obviously a very high threshold that cannot be met in most cases. Even if it can be determined that an entire event was captured from start to finish, there are likely to be other events that remain unknown to all but the perpetrators because they are carried out in secret or are reported in very different ways.
Thus, in order to properly analyze social media for trends and patterns, it is crucial to combine that content with as many other data sources as possible. Even then, as Price outlines, rigorous statistical analysis is required to determine possible weaknesses or inconsistencies in the data. The best hope we have of understanding the uncertainty and unreliability of a given source of data is to collect and preserve as much of it as possible before it begins to degrade or disappears entirely.
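To make concrete why combining overlapping sources matters, consider a deliberately simplified illustration (this is not Price's actual methodology, and the numbers are hypothetical): the classic two-list capture-recapture estimator uses the overlap between two independent lists of documented victims to estimate how many victims appear on neither list.

```python
# Toy illustration of two-list capture-recapture (the Lincoln-Petersen
# estimator). If two lists of documented victims were compiled
# independently, the size of their overlap hints at how much of the
# total population both lists missed. All numbers below are hypothetical.

def lincoln_petersen(n_list_a, n_list_b, n_both):
    """Estimate the total population size from two overlapping lists."""
    if n_both == 0:
        raise ValueError("no overlap between lists: estimator is undefined")
    return n_list_a * n_list_b / n_both

# Hypothetical example: 120 victims documented by one group, 90 by
# another, with 30 individuals appearing on both lists.
estimated_total = lincoln_petersen(120, 90, 30)
documented = 120 + 90 - 30
print(estimated_total)              # 360.0 estimated total victims
print(estimated_total - documented)  # 180.0 victims missing from both lists
```

The estimator's independence assumptions rarely hold cleanly in conflict data, which is precisely why the rigorous statistical analysis Price describes is needed before drawing conclusions from combined sources.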
Along these lines, Carnegie Mellon’s Center for Human Rights Science is working with our partners to develop a system we are calling “Human Rights Media Central.” The purpose of this project is to develop a data management system that will allow practitioners and ordinary people to upload, store, verify, organize, search for duplicates or near-duplicates, analyze, and extract information from rights-related user generated content including images, video recordings, and text. Eventually, we would like to semi-automate as much of this work as possible to free up human rights researchers to do what they do best: analysis and advocacy. Through semi-automation, more information can be gathered and stored than is possible through human action alone.
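One of the tasks such a system must automate is finding near-duplicate media, since the same footage is often re-uploaded, recompressed, or cropped across platforms. The sketch below is not the Human Rights Media Central implementation; it illustrates one common approach, perceptual hashing, using a minimal average-hash on toy 8x8 grayscale matrices. Production systems would use more robust hashes over real image data, but the principle is the same: reduce each image to a compact fingerprint, then compare fingerprints by Hamming distance.

```python
# Minimal average-hash (aHash) sketch for near-duplicate detection.
# Images are represented as 8x8 grayscale matrices (lists of lists of
# 0-255 values) to keep the example dependency-free.

def average_hash(pixels):
    """Return a 64-bit fingerprint: 1 where a pixel exceeds the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming_distance(h1, h2):
    """Count the bits on which two fingerprints differ."""
    return bin(h1 ^ h2).count("1")

def near_duplicates(h1, h2, threshold=10):
    """Treat hashes within `threshold` differing bits as near-duplicates."""
    return hamming_distance(h1, h2) <= threshold

# Toy data: an original frame, a slightly brightened re-encoding of it,
# and an unrelated (inverted) frame.
frame = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]
recompressed = [[min(255, v + 3) for v in row] for row in frame]
unrelated = [[255 - v for v in row] for row in frame]

h_a, h_b, h_c = map(average_hash, (frame, recompressed, unrelated))
print(near_duplicates(h_a, h_b))  # True: small distance, likely same image
print(near_duplicates(h_a, h_c))  # False: large distance, different content
```

Because the hash depends on each pixel's relation to the image's own mean, uniform brightness or compression changes barely move the fingerprint, which is what makes this family of techniques useful for clustering re-uploads of the same footage.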
However, there are circumstances, albeit limited ones, in which identifying broad patterns and trends is not required for human rights actors to achieve their goals—e.g., in the context of humanitarian rescue, certain kinds of advocacy, criminal trials revolving around specific events, and ensuring that crimes are not whitewashed from history. User generated content can provide clear and convincing evidence that crimes occurred and that particular individuals, organizations, or governments perpetrated them, were complicit in them, or stood by and watched while they happened. This is particularly important given the long-standing desire of perpetrators and their supporters to conceal history or deny their role in crimes against humanity and human rights abuses. Moving forward, user generated content will be able to help answer questions like:
- Did a particular armed group or military unit knowingly attack civilians?
- Was a particular type of weapon used on civilian populations?
- Did a particular government site house victims of extrajudicial detention in the hours or days leading up to their deaths?
- Did police use excessive force in quelling political demonstrators?
- Who was present during, or participating in, human rights abuses?
User generated content can also help answer questions about the violation of economic, cultural, and political rights.
In most cases, answers to questions like these do not require extensive statistical analysis—it is enough to prove a single violation. There have already been many instances where video and photographs, including those created by perpetrators themselves, have contributed to the resolution of questions of vital importance to justice and accountability efforts, such as whether Thomas Lubanga employed child soldiers in the DRC, whether Theoneste Bagosora was responsible for ordering the Interahamwe to massacre Tutsis during the Rwandan genocide, or whether troops under Radislav Krstic's command participated in the massacre of Bosniak men and boys at Srebrenica.
It would be scientifically incorrect, and a violation of common sense, to claim that any found data source is a complete and unbiased account of a given situation. Yet, at the same time, not mining such content for all relevant information would be an affront to the victims of human rights abuse, and the brave men and women who record and transmit accounts of events that have significant negative impacts on their lives. Human rights researchers must be wary of the agendas that users seek to advance through social media, but they must also encourage ordinary people to continue to share information about the conditions in which they live. The human rights community will only continue to have access to the vast trove of information that social media represents as long as individuals know that someone is listening to them and that they have an open channel to challenge the impunity and political neglect they face on a daily basis.
Jay D. Aronson is Associate Professor of Science, Technology, and Society at Carnegie Mellon University and the Director of Carnegie Mellon’s Center for Human Rights Science.