24 May 2021

If you have ever toyed with the idea of crowdsourcing your project but held back for lack of information and practical guidelines, now is the time to read on.

Crowdsourcing of data and participative research are becoming much more accessible thanks to advances in digital technology and data science. In reality, however, citizen science projects are highly complex and require significant resources to succeed. To support those wanting to board the crowdsourcing ship, SSHOC, in collaboration with UCL ISH, organised an interactive workshop to explore key considerations in planning successful citizen science projects. The focus was on cultural heritage, where the opportunities for participative data are vast, but many other disciplines are equally good candidates for participative research. As Josep Grau-Bove pointed out, there are several types of citizen science model, which vary with regard to the project's aim, resource availability, the level of participant skill required, and so on. More generally, citizen science projects can serve two distinct purposes, educational and scientific, which can overlap to various degrees. It is precisely at this intersection that the specific added value and importance of citizen science projects for contemporary society is established.


Case study learning: when and what to crowdsource

The key difference between a citizen science project and other research projects is that, to succeed, a citizen science project needs to engage the community at both an intellectual and a practical level. The essential step in the planning stages of a crowdsourcing project is therefore community scoping. The researcher should start by answering a first question: considering what data you need and how complex this data might be to obtain, whom should you engage?

Rosie Brigham, software engineer and researcher at the UCL Institute for Sustainable Heritage (UCL ISH), drew on her ample experience to present two distinct speculative projects that lend themselves to different applications of crowdsourcing. With different objectives and requirements, very different approaches were used in the two projects to promote community engagement and participation. The first project, Plant Monitoring at Angkor Wat, aims to monitor the physical change of a historic site, while the second, Digitizing the Naga 'Queen' diaries, aims to digitize a collection of hand-written texts and make it available online. These two examples served as the basis for a detailed discussion of the complexity of crowdsourcing research data.


Crowdsourced data collection

Monitoring physical change through time can be done effectively by processing simple imaging data: photographs. At popular cultural heritage sites, visitor photographs are a rich data source. Once it is established that there is potential for a citizen science project, given (1) the appropriateness of the task and (2) the existence and availability of a relevant and interested community, the second question to answer is: what are the key considerations in planning crowdsourced data collection?

Workshop participants helped compile a wide-ranging list of factors, including how to mobilise participation; how to collect photograph settings and metadata; how to ensure that the entire site is covered and not only the popular areas; and how to develop privacy policies and image processing approaches for people featured in photographs. The discussion highlighted the complexity of the task.
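As an illustration of the coverage concern, the sketch below counts contributed photographs per area of a site and lists the areas that fall below a minimum. The zone names, the `undercovered_zones` helper and the `min_photos` threshold are hypothetical and not taken from the workshop; it assumes each photograph has already been assigned to a zone.

```python
from collections import Counter


def undercovered_zones(photo_zones, all_zones, min_photos=5):
    """Return (zone, count) pairs for zones with fewer than min_photos photos.

    photo_zones: one zone label per contributed photograph (hypothetical labels).
    all_zones:   every zone of the site that the project wants covered.
    """
    counts = Counter(photo_zones)
    gaps = [(zone, counts[zone]) for zone in all_zones if counts[zone] < min_photos]
    # Least-covered zones first, so they can be prioritised in calls for photos.
    return sorted(gaps, key=lambda pair: pair[1])


# Made-up example: most visitors photograph the main gate.
photos = ["main gate"] * 12 + ["east gallery"] * 2
zones = ["main gate", "east gallery", "north wall"]
gaps = undercovered_zones(photos, zones, min_photos=5)
```

A list like `gaps` could feed into how the project asks for contributions, for example by inviting visitors to photograph the least-covered areas.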

From a technical perspective, an essential step that needs to be fully planned and scoped before the project begins is end-user access to the sourced data. Data storage needs to be planned and budgeted from the start, considering the possible duration of the project and the relevance of the data. Data transfer from collectors to users also deserves careful consideration: while social media might seem a readily accessible channel for engagement, most social networks compress images and strip their metadata, possibly rendering them unusable for the desired analysis. Last but not least, the contributor's perspective should also be considered. It is important to define clearly the requirements for the images collected for a project, addressing two questions: how specific does the data need to be, and how will this be communicated to participants?
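To make the metadata point concrete, here is a minimal sketch of how a project might check whether a submitted JPEG still carries an Exif metadata segment, using only the Python standard library. The `has_exif` function is an illustrative helper, not something presented at the workshop. It only detects that an Exif block is present; it does not confirm that specific fields such as the capture time or location survived.

```python
import struct


def has_exif(jpeg_bytes: bytes) -> bool:
    """Return True if the JPEG data contains an APP1 segment holding Exif data."""
    if not jpeg_bytes.startswith(b"\xff\xd8"):  # every JPEG starts with the SOI marker
        return False
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            return False  # not at a marker: malformed or unsupported layout
        marker = jpeg_bytes[pos + 1]
        if marker == 0xFF:  # fill byte before a marker, skip it
            pos += 1
            continue
        if marker in (0xDA, 0xD9):  # start of image data or end of file
            return False
        # The segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        if length < 2:
            return False
        payload = jpeg_bytes[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return True
        pos += 2 + length
    return False
```

In use, the bytes would come from the uploaded file, for example `has_exif(open(path, "rb").read())`, where `path` stands for wherever the project stores a submission. Images that fail the check could be flagged so that the contributor is asked to send the original file instead.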


Crowdsourced data tagging

Crowdsourcing can also be effective in research that requires the processing of large amounts of data, where that processing needs a degree of human validation, such as in the digitizing of hand-written texts. Workshop participants contributed to identifying the key issues to consider in a data tagging project linked to the processing of personal diaries and correspondence. Again, a broad range of challenges was discussed, from legibility of the texts to ethical issues arising from the contents or authenticity concerns.

Once again, the end use of the digitized documents needs to be clearly understood when planning the crowdsourced data tagging exercise. Data augmentation and searchability are two essential factors. To this end, keywords can be considered and made part of the participative process. Researchers can make their work easier by using platforms designed for citizen science, such as Zooniverse, which offer capabilities for enhancing searchability. A level of technical literacy is, however, required to process the data output from these platforms.
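As a rough idea of the kind of processing involved, the sketch below merges keywords suggested by several volunteers for the same document and keeps those proposed by at least a given share of them. It does not use the export format of Zooniverse or any other platform: the input structure, the `consensus_keywords` helper and the `min_share` threshold are assumptions made for illustration, and the example tags are made up.

```python
from collections import Counter


def consensus_keywords(volunteer_tags, min_share=0.5):
    """Keep keywords proposed by at least min_share of the volunteers.

    volunteer_tags: one list of free-text keywords per volunteer, all for the
    same document (a hypothetical structure, not a platform export format).
    """
    if not volunteer_tags:
        return []
    counts = Counter()
    for tags in volunteer_tags:
        # Normalise case and spacing, and count each keyword once per volunteer.
        normalised = {" ".join(tag.lower().split()) for tag in tags}
        normalised.discard("")
        counts.update(normalised)
    threshold = min_share * len(volunteer_tags)
    return sorted(word for word, n in counts.items() if n >= threshold)


# Made-up tags from three volunteers for one diary page.
tags = [["Harvest", "village "], ["harvest", "ceremony"], ["harvest", "Village"]]
keywords = consensus_keywords(tags)
```

A simple agreement threshold like this is only one possible form of validation; a real project might prefer expert review for contested or sensitive keywords.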


Community engagement

A key difference between the two case studies discussed in this workshop is the level of participant skill and engagement they require. While a data collection project can be streamlined to arrive at a simple yet effective mode of participation, a more complex data tagging task might require more significant contributions from individual participants, often including some training and input validation. These differences call for different community engagement strategies and bring different challenges that go beyond the technical.

Community engagement in heritage science projects is an exciting and effective way of making participants aware of the conservation and management challenges of heritage sites. The core motivation for participants is often the opportunity to take part in scientific research and contribute actively to the conservation of a heritage site or asset.

Participation of communities comes with responsibilities for researchers: it should be conceptualised as a two-way relationship between contributor and researcher, with regular feedback and accountability on how participation has served the research. This not only meets ethical principles but can also help maintain engagement throughout the project, which is essential for success.


Written by Alejandra Albuerne


Presentation slides


Workshop Video