Musée du Louvre
Department of Islamic Arts
France ✧ 2025
Project
Indexing of auction catalogues from the documentation library to improve the referencing of works in the collection.
Corpus
Auction catalogues in English and French (Sotheby’s, Bonhams, Christie’s…)
150,000 pages
Processing workflow
- Optical Character Recognition (OCR)
- Segmentation of elements on the page and matching captions to images
- Metadata creation using automatic key/value pair recognition: extraction of elements from captions (title, date, media…) to a spreadsheet
- Extracting inventory numbers and associating them with the corresponding image for object indexing
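
The last two steps can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and does not describe the project's actual tooling: the caption layout ("title, origin, date. media."), the field names, the inventory-number prefixes, and the sample caption are all invented. Real catalogue captions vary far more than a single regular expression can handle.

```python
import csv
import io
import re

# Hypothetical caption layout: "Title, origin, date. Media. ..."
# Illustrative only; real auction captions are much less regular.
CAPTION_RE = re.compile(
    r"^(?P<title>[^,]+),\s*(?P<origin>[^,]+),\s*(?P<date>[^.]+)\.\s*(?P<media>[^.]+)\."
)

# Hypothetical inventory-number shape: short uppercase prefix, then digits.
# The prefixes listed here are assumptions, not a verified list.
INVENTORY_RE = re.compile(r"\b(?:OA|MAO|AD)\s?\d{1,5}\b")

FIELDS = ["title", "origin", "date", "media", "inventory_numbers"]


def parse_caption(caption: str) -> dict:
    """Split one OCR'd caption into key/value pairs."""
    text = caption.strip()
    match = CAPTION_RE.match(text)
    if match:
        row = {key: value.strip() for key, value in match.groupdict().items()}
    else:
        # Leave fields empty rather than guess when the layout is unexpected.
        row = {key: "" for key in FIELDS[:-1]}
    # Collect every inventory number mentioned anywhere in the caption.
    row["inventory_numbers"] = "; ".join(INVENTORY_RE.findall(text))
    return row


def captions_to_csv(captions: list[str]) -> str:
    """Write the parsed captions as CSV text, one row per caption."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for caption in captions:
        writer.writerow(parse_caption(caption))
    return buffer.getvalue()


# Invented sample caption, not taken from any catalogue.
sample = (
    "Ewer with bird motifs, Iran, 12th century. Brass inlaid with silver. "
    "Compare Louvre MAO 123 and OA 4567."
)
print(captions_to_csv([sample]))
```

Captions that do not match the assumed layout come back with empty fields instead of guessed values, so they can be routed to manual review.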
