Crowdsourcing: Case Studies

In this post we explore the power of a crowd through three mini case studies. We have selected three quite different examples for you: the first is a transcription project, the second an image-based DAM solution, and the third a commercial solution. Each case came with unique challenges that were solved with a customised crowdsourcing solution.

The Case Studies

1: Ngā Ripo pilot project

The problem

The Alexander Heritage & Research Library at Wanganui has a huge historical index of people and subjects, covering the period from 1840 to 2002. For many years, volunteers have been indexing birth, marriage and death notices from local newspapers and other sources onto ‘index cards’.

The Library wanted to make the information contained on the cards accessible online and made a plan to digitise the cards. But that left them with a pile of images that were not searchable. Worse, because a lot of the information on the cards was handwritten rather than typed, OCR would work for only a fraction of them. Paying for professional transcription was beyond the budget. So what to do?

The solution

Create an online tool using the Recollect platform that harnesses the power of enthusiastic volunteers. Because this meant the crowd was ‘wild’, a short online tutorial was developed to train the volunteers to capture the desired content in the correct way. Volunteers had to complete the tutorial successfully before they were ‘let loose’ on real data. Cards were displayed one at a time for the volunteers to transcribe, with each card presented and transcribed at least twice by different people. The transcription entries were compared programmatically, and if they met a prescribed level of accuracy the card was ‘accepted’ and made live.
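To give a feel for how two transcriptions of the same card might be compared programmatically, here is a minimal sketch in Python. It is not the Recollect implementation: the field names, the 0.95 threshold and the use of the standard library's difflib for scoring are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Assumed acceptance level; the real project's "prescribed level of accuracy"
# is not stated in this post.
ACCEPT_THRESHOLD = 0.95


def normalise(text):
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def similarity(a, b):
    """Return a score between 0.0 (nothing in common) and 1.0 (identical)."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def card_accepted(first, second, threshold=ACCEPT_THRESHOLD):
    """Accept a card only if both volunteers filled in the same fields
    and every field agrees to at least the threshold."""
    if first.keys() != second.keys():
        return False
    return all(similarity(first[f], second[f]) >= threshold for f in first)


# Hypothetical entries from two different volunteers for the same card.
entry_a = {"name": "Smith, John", "event": "Death", "date": "12 May 1901"}
entry_b = {"name": "smith,  john", "event": "death", "date": "12 May 1901"}
entry_c = {"name": "Smyth, Joan", "event": "Death", "date": "12 May 1901"}

print(card_accepted(entry_a, entry_b))  # differences in case and spacing only
print(card_accepted(entry_a, entry_c))  # the name is genuinely different
```

In a scheme like this, a card that fails the check would simply go back into the queue to be shown to another volunteer.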

Progress graphs and charts were included on the project dashboard. This allowed the Library team to track the overall project, and also let the volunteers see how many cards they had done, how many were live, and where they sat on the leaderboard.

The pilot project was a trial run on over 1000 cards. It was a major success, with all index cards successfully transcribed by volunteers in a matter of weeks.

2: Antarctica NZ’s ADAM slide collection

The problem

Antarctica NZ had an enormous slide collection (around 25,000 images) that they wanted to digitise before depositing them with Archives New Zealand. Information about each slide, including subject, photographer and dates, was written by hand around the slide mount. This presented two challenges. The first, for digitisation, was to capture the entire mount, so the handwritten metadata could be transcribed, while also capturing the image at a high enough resolution. The second was to find a cost-effective method to check and transcribe the digital images, as well as supply additional metadata to enhance the discoverability of the photographs.

The solution

NZMS developed an innovative solution that allowed both slide image and slide mount to be captured at once. The Recollect team then set up the crowdsourcing module for ADAM’s ‘tame’ crowd to:

  1. Transcribe the metadata from the mount into prescribed fields
  2. Apply new categories of metadata, including subject tags, and vote an image into the “Gallery of Awesomeness”
  3. Hand crop the photo portion from the slide mount to create the final digital image (rotating and flipping the final image where necessary).

The site administrator checked the submissions and ran the process to finalise the crop and make the image live on the site. Again, graphs and charts were used to track progress, but as the crowd was a team of contract staff (a ‘tame’ crowd) an online tutorial was not necessary. A huge benefit for Antarctica NZ was that the project was web-based, so team members based all over the world could take part.

3: Commercial Solution

The problem

A corporate client had 50,000 lines of transcribed data that needed to be checked and validated against images, with additional information such as subject keywords added where appropriate. This task was being undertaken as a commercial service, so timings, throughput and efficiency were directly related to their profit margin.

The solution

Initially the client was working from an Excel spreadsheet, so when the Recollect team discussed a more visual, user-friendly and efficient option using a private, password-protected version of the Recollect platform, our client jumped on the idea. Some of the tasks could then be automated before the (tame) crowd of contract staff even got started on the project, and others could be streamlined by predictively searching the approved thesaurus of subject terms, making the whole process faster and more controlled.
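For readers curious about what predictive searching of an approved thesaurus can look like, here is a small Python sketch. It assumes a simple prefix match over a sorted list of terms; the Thesaurus class, the suggest method and the sample terms are invented for this example and do not describe how the Recollect platform works internally.

```python
from bisect import bisect_left


class Thesaurus:
    """A hypothetical approved list of subject terms with prefix suggestions."""

    def __init__(self, terms):
        # Store each term once, lower-cased and sorted, so terms sharing a
        # prefix sit next to each other.
        self._terms = sorted({t.strip().lower() for t in terms if t.strip()})

    def suggest(self, prefix, limit=5):
        """Return up to `limit` approved terms that start with `prefix`."""
        prefix = prefix.strip().lower()
        if not prefix:
            return []
        suggestions = []
        # Binary search finds the first term that could match the prefix.
        for term in self._terms[bisect_left(self._terms, prefix):]:
            if not term.startswith(prefix) or len(suggestions) >= limit:
                break
            suggestions.append(term)
        return suggestions


# Made-up subject terms, purely for illustration.
subjects = Thesaurus(["Huskies", "Hut Point", "Icebreaker", "Ice shelf", "Penguins"])

print(subjects.suggest("ice"))  # ['ice shelf', 'icebreaker']
print(subjects.suggest("hu"))   # ['huskies', 'hut point']
```

Because staff can only pick from the suggestions, every keyword that gets added is already an approved term, which is where the extra control comes from.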

Additional tools were developed and implemented to allow the project administrator to track timings and individual throughput as these were critical to the project’s success.
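As a rough idea of the kind of calculation such tracking involves, the Python sketch below works out lines checked per hour for each team member. The input format (one record per completed line, with the worker's name and the seconds spent) is an assumption for this example, not the format the project actually used.

```python
from collections import defaultdict


def throughput_per_hour(events):
    """Return lines completed per hour for each worker.

    `events` is an iterable of (worker, seconds_spent) pairs, one pair per
    completed line. This record shape is assumed for illustration.
    """
    totals = defaultdict(lambda: [0, 0.0])  # worker -> [line count, seconds]
    for worker, seconds in events:
        totals[worker][0] += 1
        totals[worker][1] += seconds
    return {
        worker: count * 3600 / seconds
        for worker, (count, seconds) in totals.items()
        if seconds > 0  # skip workers with no recorded time
    }


# Made-up timings: "a" checks two lines in 30 seconds each, "b" takes two minutes.
rates = throughput_per_hour([("a", 30), ("a", 30), ("b", 120)])
print(rates)  # {'a': 120.0, 'b': 30.0}
```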

What could you do?

These are just three sample projects to illustrate how harnessing a crowd can make light work of a large project. Each case had unique requirements that called for a customised solution, and all three were overwhelmingly successful, with large-scale projects completed in a timely and cost-effective manner.

Hopefully these case studies will inspire you to think outside the box for your own project. If the numbers (whether items, hours or dollars) seem overwhelming, harnessing a crowd (tame or wild) using our platform (public or private) may allow you to tackle those ‘wouldn’t it be awesome if…’ blue sky projects.


