OCR Error Correction for Unconstrained Vietnamese Handwritten Text

Journal
Proceedings of the Tenth International Symposium on Information and Communication Technology - SoICT 2019
Date Issued
2019
Author(s)
Quoc-Dung Nguyen
Duc-Anh Le
Ivan Zelinka
DOI
10.1145/3368926.3369686
Abstract
Post-processing is an essential step in detecting and correcting errors in OCR-generated texts. In this paper, we present an automatic OCR post-processing model that comprises both error detection and error correction phases for OCR output texts of unconstrained Vietnamese handwriting. We propose a hybrid approach to generating and scoring correction candidates for both non-syllable and real-syllable errors, based on linguistic features as well as the error characteristics of OCR outputs. We evaluate the proposed model on a Vietnamese benchmark database at the line level. The experimental results show that our model achieves a character error rate (CER) of 4.17% and a word error rate (WER) of 9.82%, improving the CER and WER of an attention-based encoder-decoder approach by 0.5% and 3.5%, respectively, on the VNOnDB-Line dataset of the Vietnamese online handwritten text recognition competition (VOHTR2018). These results outperform those obtained by the various recognition systems in the VOHTR2018 competition.
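The abstract reports results as CER and WER without defining them. The sketch below shows one common way to compute both: the Levenshtein edit distance between reference and recognized text, divided by the reference length in characters (CER) or whitespace-separated words (WER). The function names, the NFC normalization step, and the sample strings are assumptions made for illustration; the paper's own evaluation script and normalization may differ.

```python
import unicodedata


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def _nfc(text):
    # Assumption: compose diacritics so each accented Vietnamese letter
    # counts as a single character.
    return unicodedata.normalize("NFC", text)


def cer(reference, hypothesis):
    """Character error rate: character edits / reference characters."""
    ref, hyp = _nfc(reference), _nfc(hypothesis)
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def wer(reference, hypothesis):
    """Word error rate: word edits / reference words."""
    ref, hyp = _nfc(reference).split(), _nfc(hypothesis).split()
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Hypothetical sample line, not taken from VNOnDB-Line.
    print(cer("tiếng", "tieng"))
    print(wer("tôi đi học", "tôi di học"))
```

Because a single wrong character invalidates the whole word, WER is usually higher than CER on the same output, which is consistent with the 4.17% and 9.82% figures reported above.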
Subjects
  • Unconstrained Vietnam...
  • OCR
  • Post-processing
  • Error detection
  • Error correction

File(s)
AS383.pdf (800.8 KB)