Sorbonne University Library
PRET 19 Project
France ✧ 2023-2024
Corpus
82 registers
Artificial intelligence technologies used
- OCR/HTR for historical documents
- Layout analysis and segmentation
- Named-entity extraction
- Matching with external repositories
Objective
Digitization and scientific study of the loan registers of the Sorbonne libraries
Extraction of information from the 19th-century loan registers of the libraries of the Sorbonne (BIS), the École Normale Supérieure (ENS) and Sainte-Geneviève (BSG)
- Automatic extraction of borrowers' names and the titles of borrowed books
- Automatic association of additional information from online databases
Processing workflow
- Exclusion of pages not relevant to the project
- Training of two borrower-zone segmentation models (annotations provided by the client): one for BIS and ENS, and the other for BSG
- Training of two models for extracting information on borrowers (mainly name, surname, addresses and qualifications): one for BIS and ENS, and the other for BSG
- Matching with name references derived from repositories
- Export of CSV files that will populate the project's database
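
The last two steps of the workflow (matching against name references, then CSV export) can be illustrated with a minimal Python sketch. It is not the project's actual pipeline: the fuzzy matching via the standard-library `difflib`, the similarity cutoff of 0.8, the column names, the register identifiers and the sample names are all assumptions chosen for illustration.

```python
import csv
import difflib
import io
import unicodedata
from typing import Optional


def normalize(name: str) -> str:
    """Lowercase, strip accents and collapse whitespace so OCR variants compare fairly."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.lower().split())


def match_name(extracted: str, references: list, cutoff: float = 0.8) -> Optional[str]:
    """Return the closest reference name, or None when nothing is similar enough."""
    by_key = {normalize(ref): ref for ref in references}
    hits = difflib.get_close_matches(normalize(extracted), list(by_key), n=1, cutoff=cutoff)
    return by_key[hits[0]] if hits else None


def export_csv(rows: list, references: list) -> str:
    """Write one CSV line per borrower entry, with the matched reference name (empty if none)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["register", "extracted_name", "matched_name"],
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        matched = match_name(row["extracted_name"], references)
        writer.writerow({**row, "matched_name": matched or ""})
    return buffer.getvalue()


# Placeholder data (hypothetical, not taken from the registers).
REFERENCES = ["Dupont, Jean", "Lefèvre, Marie"]
ROWS = [
    {"register": "BIS-01", "extracted_name": "Dupont, Jeam"},    # OCR-style typo
    {"register": "BSG-02", "extracted_name": "Lefevre, Marie"},  # missing accent
    {"register": "ENS-03", "extracted_name": "Zzz"},             # no plausible match
]

if __name__ == "__main__":
    print(export_csv(ROWS, REFERENCES))
```

Normalizing before comparison keeps accent and case differences from counting as mismatches, while the cutoff leaves unmatched names blank in the export so they can be reviewed by hand rather than linked to the wrong person.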
Link to the project webpage: PRET19, Projet de Répertoire des Emprunteurs et Titres empruntés au XIXe siècle à l’université

Image: Salle Romilly, BIS | © Lise Hébuterne, BIS