Maybe of interest...
---------- Forwarded message ----------
From: Antoine Doucet <[log in to unmask]>
Date: Wed, Feb 22, 2017 at 9:02 AM
Subject: [SIG-IRList] CFP: ICDAR2017 Competition on Post-OCR Text Correction
To: [log in to unmask]
================= *Call for Participation* ==================
*ICDAR2017 Competition on Post-OCR Text Correction*
More information: http://l3i.univ-larochelle.fr/ICDAR2017PostOCR
================================================
**** GOAL ****
Improve/denoise OCR-ed texts, based on an original corpus of OCR-ed
documents composed of 12M characters (2.1M tokens) aligned with their
corresponding Gold Standard, with an equal share of English- and
French-written materials. You can participate in English, French, or
both. Given the noisy OCR of printed text, participants are invited to
take part in the following two independent subtasks: *1) DETECTION* and
*2) CORRECTION*. Participants are encouraged to compute separate scores for
the English- and the French-written parts of the collection, allowing for
the evaluation of language-specific approaches.
- *TASK 1)* Detection of OCR errors: given the raw OCR-ed text, the
participants are asked to provide the position and the length of the
suspected errors.
- *TASK 2)* Correction of OCR errors: given the OCR errors in their
context, the participants are asked to provide, for each error, either a)
a single correction or b) a ranked list of candidates for correction.
Providing multiple candidates enables the indirect evaluation of
semi-automated techniques, with a human in the loop.
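
To make the two subtasks concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the competition's official format or evaluation: it assumes error positions are character offsets into the OCR-ed string, it derives error spans by diffing against the Gold Standard (which is only possible on the training data), and it ranks candidates by plain string similarity against a hypothetical word list. The function names `error_spans` and `ranked_candidates` are made up for this example.

```python
import difflib


def error_spans(ocr: str, gold: str) -> list[tuple[int, int]]:
    """Return (position, length) pairs, in OCR character offsets (an assumption),
    for every region where the OCR-ed text differs from the gold text."""
    matcher = difflib.SequenceMatcher(None, ocr, gold, autojunk=False)
    spans = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag != "equal":
            # A pure insertion (text missing from the OCR) yields length 0.
            spans.append((i1, i2 - i1))
    return spans


def ranked_candidates(token: str, lexicon: list[str], n: int = 3) -> list[str]:
    """Return up to n lexicon entries, most similar first, as correction
    candidates for one suspected error (Task 2, option b)."""
    return difflib.get_close_matches(token, lexicon, n=n, cutoff=0.7)


if __name__ == "__main__":
    ocr_text = "The qnick brown fox"
    gold_text = "The quick brown fox"
    lexicon = ["quick", "quack", "brick", "brown"]  # hypothetical word list

    for pos, length in error_spans(ocr_text, gold_text):
        print(pos, length)

    print(ranked_candidates("qnick", lexicon))
```

Returning only the first element of the ranked list corresponds to option a) of Task 2; returning the whole list corresponds to option b), where a human could pick among the candidates.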
**** IMPORTANT DATES ****
• Training dataset available online : End-Feb. 2017
• Registration deadline : 30th March 2017
• Result submission : 15th June 2017
• ICDAR2017 Conference : 10th November 2017
*A competition report including both the description of the methodology and
a comparative analysis of participants' performances will be submitted for
publication at the ICDAR2017 conference.*
**** CONTACTS FOR SUBSCRIPTION ****
• Guillaume CHIRON (BnF/L3i) : guillaume.chiron(at)bnf.fr
• Antoine DOUCET (L3i) : antoine.doucet(at)univ-lr.fr
• Jean-Philippe MOREUX (BnF) : jean-philippe.moreux(at)bnf.fr
**** WEBSITE ****
http://l3i.univ-larochelle.fr/ICDAR2017PostOCR