arkindex_socface

Document management

Arkindex is the platform developed by Teklia for the automatic processing of large collections of scanned documents. Arkindex offers the following features:

  • Document management: import and organize images of documents from image files (JPEG, TIFF, PNG), PDFs and IIIF manifests. See our video.
  • Manual annotation: annotate images with
    • zones of elements on the image, with type and position
    • text transcriptions at any level (page, paragraph, line, word)
    • classifications
    • metadata
  • Arkindex is fully integrated with Callico, for advanced collaborative annotation campaigns
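The annotation types listed above can be pictured as a small data model. The sketch below is illustrative only: the class names, fields and sample values are assumptions made for this example and do not reflect Arkindex's actual schema or API.

```python
from dataclasses import dataclass, field

# Hypothetical data model for illustration; not Arkindex's actual schema.
Point = tuple[int, int]


@dataclass
class Transcription:
    text: str
    level: str  # "page", "paragraph", "line" or "word"


@dataclass
class Element:
    type: str             # kind of zone, e.g. "page" or "text_line"
    polygon: list[Point]  # position of the zone, in image pixel coordinates
    transcriptions: list[Transcription] = field(default_factory=list)
    classifications: list[str] = field(default_factory=list)
    metadata: dict[str, str] = field(default_factory=dict)


def bounding_box(element: Element) -> tuple[int, int, int, int]:
    """Return (x, y, width, height) of the smallest box enclosing the polygon."""
    xs = [x for x, _ in element.polygon]
    ys = [y for _, y in element.polygon]
    return min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)


# A text line annotated with a transcription, a class and one metadata entry.
line = Element(
    type="text_line",
    polygon=[(100, 200), (900, 200), (900, 260), (100, 260)],
)
line.transcriptions.append(Transcription(text="Jean Dupont, cultivateur", level="line"))
line.classifications.append("handwritten")
line.metadata["language"] = "fr"
```

Here `bounding_box(line)` returns `(100, 200, 800, 60)`. Keeping the zone, its text, its classes and its metadata on one object is only one possible layout; it is chosen here to mirror the list above.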

Document processing

Arkindex is a platform for executing any document processing algorithm: OCR, HTR, feature extraction, captioning, translation, etc. Its architecture is designed to be generic, so that it can store any type of result using generic and configurable types. The following processing types are possible with Arkindex:

Processing type          | Description
-------------------------|------------------------------------------------------------
Image classification     | Associate a class with an image or a portion of an image
Object detection         | Detect an object in an image using a bounding box and identify its type
Object segmentation      | Detect the precise outline of an object in an image and identify its type
Image captioning         | Generate a caption for an image
Transcription            | Transcribe printed or handwritten text from an image
Text classification      | Associate a class with a text
Key-value extraction     | Extract information from an image or text in the form of a key-value association
Table recognition        | Detect and transcribe information presented in the form of a table while preserving its structure
Named entity recognition | Detect and type named entities in text
Entity linking           | Link a named entity to an existing reference system
Translation              | Translate a text from a source language to a target language
Geolocation              | Associate GPS coordinates with an image or text
Object grouping          | Group elements into the same structure

✅ See our Video tutorials

✅ See our comparison page: Arkindex versus the other platforms

Document processing workflow

Arkindex offers extensive capabilities, unmatched by its competitors, for managing complex workflows tailored to your document processing needs:

  1. Customisable workflow design: Arkindex lets you define complex workflows tailored to your processing requirements. From layout analysis and classification to text recognition (OCR/HTR), named entity recognition and metadata generation, you can configure each step to achieve your desired outcome.
  2. Real-time monitoring: You can monitor the progress of each task within your workflow in real time. Arkindex provides an estimated completion time for each step, so you can make informed decisions and adjust resources as necessary.
  3. Error analysis and rerun: Not all processes run perfectly every time. Arkindex provides tools to analyse any errors that occur in your workflow. Once the failing elements are identified, you can rerun processes for those specific elements only, ensuring consistency and accuracy.
  4. Flexible processing nodes: To accommodate different infrastructure requirements, Arkindex can distribute your processing tasks across multiple nodes: on-premises, in a cloud environment or on high-performance computing clusters using SLURM.
  5. Seamless integration with custom and open-source components: Arkindex is not limited to its built-in functionality. You can define your processing steps using your proprietary code or reuse the many open-source components available. Docker support makes these components easy to integrate.
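The workflow design and error rerun points above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how Arkindex implements workflows: the step names, `run_step`, `process` and the simulated failure are all hypothetical, and only the Python standard library (`graphlib`) is used to order the steps by dependency.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each step maps to the set of steps it depends on.
workflow = {
    "layout_analysis": set(),
    "classification": {"layout_analysis"},
    "text_recognition": {"layout_analysis"},
    "named_entity_recognition": {"text_recognition"},
}


def run_step(step, element_id, fail_on=frozenset()):
    """Stand-in for a real processing step; fails for pairs listed in fail_on."""
    if (step, element_id) in fail_on:
        raise RuntimeError(f"{step} failed on {element_id}")


def process(order, pending, fail_on=frozenset()):
    """Run each element through its remaining steps.

    `pending` maps an element id to the first step to run for it.
    Returns {element_id: failed_step} for elements that did not finish.
    """
    failed = {}
    for element_id, first_step in pending.items():
        for step in order[order.index(first_step):]:
            try:
                run_step(step, element_id, fail_on)
            except RuntimeError:
                failed[element_id] = step  # remember where to resume
                break
    return failed


# Linearise the dependency graph so every step runs after its prerequisites.
order = list(TopologicalSorter(workflow).static_order())

# First pass over three pages, with one simulated failure.
pages = ["page-1", "page-2", "page-3"]
first = process(
    order,
    {page: order[0] for page in pages},
    fail_on={("text_recognition", "page-2")},
)

# Rerun only the failed element, resuming from the step that failed.
retry = process(order, first)
```

After the first pass, `first` is `{"page-2": "text_recognition"}`; the rerun touches only that page and `retry` comes back empty. Recording the failing step per element is what makes a targeted rerun possible without reprocessing the whole collection.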

Arkindex is based on IIIF (https://iiif.io/) for images and is fully accessible through a REST API.
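To make the IIIF part concrete, here is a small sketch that builds a request URL following the IIIF Image API pattern `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. The server URL and image identifier are placeholders invented for this example, and the helper functions are hypothetical; they are not part of Arkindex or of any IIIF library.

```python
from urllib.parse import quote


def iiif_image_url(base_url, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API URL for a whole image or one of its regions."""
    # The identifier is percent-encoded, including any "/" it contains.
    encoded = quote(identifier, safe="")
    return (
        f"{base_url.rstrip('/')}/{encoded}/"
        f"{region}/{size}/{rotation}/{quality}.{fmt}"
    )


def region_from_box(x, y, width, height):
    """Format a pixel bounding box as an IIIF region parameter."""
    return f"{x},{y},{width},{height}"


# Placeholder server and identifier, used only for illustration.
url = iiif_image_url(
    "https://iiif.example.org/iiif/2",
    "registers/page_0001.jpg",
    region=region_from_box(100, 200, 800, 60),
)
```

With these placeholder values, `url` is `https://iiif.example.org/iiif/2/registers%2Fpage_0001.jpg/100,200,800,60/max/0/default.jpg`. Requesting a region this way lets a client fetch only the zone of an annotated element, such as a single text line, rather than the full page image.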

Arkindex can be used in the cloud or installed on-premises.

Try it: https://demo.arkindex.org

Projects done with Arkindex:

Arkindex code and releases:

We aim to produce high-quality open-source software at Teklia: