A Digital Workflow for Historical Corpora – from HTR to NER
When & where
- Date: 26th–28th August 2024
- Location: Online via Zoom
- Audience: students, historians of any qualification level (from BA to professorship), archivists, and all other interested parties
Topic
Historians must actively engage with the diverse possibilities and methods arising from rapid developments in digitalisation and artificial intelligence. Digital humanities and digital history are currently changing historical work in many ways and opening up new ways of working with and accessing sources. Considerable progress has been made, particularly in handwritten text recognition and the processing of large amounts of data. Against this backdrop, the FGHO, in collaboration with the Universities of Bern and Bielefeld, is offering a practice-oriented summer school, organised this year in the context of the joint research project ‘The Flow - from Deep-Learning to Digital Analysis and their Role in the Humanities. Creating, Evaluating and Critiquing Workflows for Historical Corpora’.
In particular, working with original sources and the associated competences are subject to change. In the field of Handwritten Text Recognition (HTR), new tools are constantly being developed to support the transcription of handwritten sources. It is not always easy for researchers to maintain an overview of the numerous algorithms and applications with all their advantages and disadvantages. This is why the first block of the event is dedicated to comparing different providers. After an introduction, the focus will be on working with the freely accessible tool eScriptorium. Participants will have the opportunity to try out the necessary HTR work steps – from uploading the files to segmenting and recognising the text – using prepared practice material.
The second thematic block deals with Named Entity Recognition and Nested Entity Recognition: Which questions can be answered with the help of NER? What advantages does NER offer for historical research? After an introduction to the topic, various tools will be presented and participants will then have the opportunity to practise NER tagging with the annotation tool INCEpTION.
Registration
Students and historians of all levels, from BA to professor, archive staff and other interested parties are invited. Applicants should already have some experience in the field of HTR. The summer school will be held in English. Participants will need internet access and a Google account. An application requires a short CV (max. 1 page A4) and a letter of motivation of no more than one page, which should also describe prior knowledge, personal expectations and any current projects of your own. If you are interested, please submit these documents as a PDF to info [at] fgho.eu. The registration deadline is 10th August 2024.
Programme
- Day 1 (Monday, 26th August)
- 9:30–12:00: Introduction
- 12:00–13:30: Lunch break
- 13:30–15:30: Introduction to HTR/NER
- 15:30–16:00: Coffee break
- 16:00–16:30/17:00: Input
- Day 2 (Tuesday, 27th August)
- 9:30–12:00: Introduction to various HTR providers
- 12:00–13:30: Lunch break
- 13:30–15:30: Introduction to eScriptorium
- 15:30–16:00: Coffee break
- 16:00–17:00: Time for questions
- from 17:00: Virtual aperitif
- Day 3 (Wednesday, 28th August)
- 9:30–12:00: Introduction to NER
- 12:00–13:30: Lunch break
- 13:30–15:30: Introduction to INCEpTION
- 15:30–16:00: Coffee break
- 16:00–17:00: Closure with open questions and discussion