Open Source

By making our tools open source, we enable our users to engage with a system that offers long-term reliability and trust. Control over the code means that customers can tailor the tools to their specific needs and have the assurance of continuity in the services offered by TEKLIA.


Code improvements and user feedback

All improvements made by TEKLIA to the open source code are made available to all our users, both customers hosted by TEKLIA and those who choose to self-host. The open source nature of our tools fosters a collaborative environment in which the community can request improvements and updates, ensuring that the tools evolve according to user needs.


Development sharing flexibility

Custom developments made for TEKLIA service customers can be made open source and accessible to the public, thereby benefiting the user community. If a customer prefers exclusivity, these developments can also be reserved for internal use, giving customers control over how they are distributed. 

Arkindex

Arkindex is TEKLIA's platform for managing and processing large collections of digitised documents. We have been actively developing Arkindex since 2019 and use it intensively in all our projects.

Callico

Callico is the annotation and validation platform for digitised documents developed by TEKLIA. We use it in all our projects to generate training data for our Deep Learning models. It is available as open source.

Deep Learning libraries and tools

We publish and maintain our code as open source on GitLab.

  • Doc-UFCN, a library for detecting objects in scanned documents. See it on PyPI and GitLab.
  • PyLaia, a handwriting recognition library. See it on PyPI and GitLab.
  • Nerval, a named entity extraction evaluation library. See it on GitLab.
  • DISS, a document image segmentation scoring library. See it on GitLab.

Open deep learning models

We publish our models openly on Hugging Face:

  • Handwriting recognition models for PyLaia
  • Document layout analysis models for Doc-UFCN
  • Named entity recognition models for spaCy

Data tools

Arkindex tools

Open-source tools to interact with Arkindex, the document processing platform.

  • Arkindex command line client: a command line interface to an Arkindex instance. See it on PyPI and GitLab. See the documentation.
  • Arkindex API client: a Python library for communicating with the Arkindex API. See it on PyPI and GitLab. See the documentation.
  • Arkindex Export: a library for exploring and using Arkindex exports in SQLite format. See it on PyPI and GitLab.
  • Arkindex base worker: a base class for integrating processing algorithms in Arkindex. See it on PyPI and GitLab.

Public Databases from TEKLIA projects

We publish ready-to-use datasets on Hugging Face:

  • RIMES: a dataset of handwritten documents in French
  • NorHand: a dataset for handwritten text recognition in Norwegian
  • SIMARA: a dataset of handwritten index cards