Clausner, C (ORCID: https://orcid.org/0000-0001-6041-1002), Hayes, J, Antonacopoulos, A (ORCID: https://orcid.org/0000-0001-9552-0233) and Pletschacher, S (ORCID: https://orcid.org/0000-0003-0541-0968) 2017, Creating a complete workflow for digitising historical census documents: considerations and evaluation, in: 2017 Workshop on Historical Document Imaging and Processing (HIP2017), 10 November 2017, Kyoto, Japan.
Available file: PDF (Author's accepted manuscript), Accepted Version, 1MB
Abstract
The 1961 Census of England and Wales was the first UK census to make use of computers. However, only bound volumes and microfilm copies of printouts remain, locking a wealth of information in a form that is practically unusable for research. In this paper, we describe the process of creating the digitisation workflow that was developed as part of a pilot study for the Office for National Statistics. The emphasis of the paper is on the issues originating from the historical nature of the material and how they were resolved. The steps described include image pre-processing, OCR setup, table recognition, post-processing, data ingestion, crowdsourcing, and quality assurance. Evaluation methods and results are presented for all steps.
| Field | Value |
|---|---|
| Item Type | Conference or Workshop Item (Paper) |
| Schools | Schools > School of Computing, Science and Engineering |
| Journal or Publication Title | Proceedings of the 2017 Workshop on Historical Document Imaging and Processing (HIP2017) |
| Publisher | ACM Digital Library |
| Funders | Office for National Statistics |
| Depositing User | Mr Christian Clausner |
| Date Deposited | 20 Nov 2017 14:25 |
| Last Modified | 15 Feb 2022 22:40 |
| URI | https://usir.salford.ac.uk/id/eprint/44371 |