Making Europe’s historical newspapers searchable

Neudecker, C and Antonacopoulos, Apostolos 2016, 'Making Europe’s historical newspapers searchable', Proceedings of the 12th IAPR International Workshop on Document Analysis Systems (DAS2016). (In Press)

Full text not available from this repository.

Abstract

This paper provides a rare glimpse into the overall approach to refinement, i.e. the enrichment of scanned historical newspapers with text and layout recognition, in the Europeana Newspapers project. Within three years, the project processed more than 10 million pages of historical newspapers from 12 national and major libraries to produce the largest open access and fully searchable text collection of digital historical newspapers in Europe. In the process, a wide variety of legal, logistical, technical and other challenges were encountered. After introducing the background issues in newspaper digitization in Europe, the paper discusses the technical aspects of refinement in greater detail. It explains what decisions were taken in the design of the large-scale processing workflow to address these challenges, what results were produced and what were identified as best practices.

Item Type: Article
Schools: Schools > School of Computing, Science and Engineering
Journal or Publication Title: Proceedings of the 12th IAPR International Workshop on Document Analysis Systems (DAS2016)
Funders: European Commission
Depositing User: Professor Apostolos Antonacopoulos
Date Deposited: 22 Mar 2016 16:13
Last Modified: 22 Mar 2016 16:13
URI: http://usir.salford.ac.uk/id/eprint/38465
