DocMatcher: Document Image Dewarping via Structural and Textual Line Matching

Document image dewarping is a crucial step in the digitization of physical documents, as it aims to remove the distortions induced by challenging environment settings and document sheet deformations often encountered when using smartphone cameras for image capture. Recently, deep learning-based methods were combined with knowledge about the expected document structure, also known as a template, at inference time to improve the dewarping results.

Our contributions in this work are threefold:

we propose a novel document image dewarping approach that effectively leverages prior knowledge about the document structure by detecting and matching lines in the warped and the template domain;
we introduce a novel evaluation metric, the matched normalized character error rate (mnCER), to overcome the limitations of existing metrics in evaluating the dewarping process; and
we evaluate our approach on the Inv3DReal dataset and show that it outperforms state-of-the-art methods in terms of visual and text-based metrics.

Our approach improves upon the state-of-the-art methods by 32.6% in Local Distortion and 40.2% in mnCER. Our code and models are available on this website.
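
For readers unfamiliar with character error rates, the following is a minimal sketch of a plain normalized character error rate between a reference text and an OCR result. It is not the mnCER definition: the matching step that gives mnCER its name is not described on this page and is not reproduced here. The function names and the choice to normalize by the length of the longer string (which keeps the value in [0, 1]) are assumptions made for illustration.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions each cost 1."""
    # previous[j] holds the distance between the processed prefix of
    # `reference` and the first j characters of `hypothesis`.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def normalized_cer(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the longer string's length (assumed normalization)."""
    longest = max(len(reference), len(hypothesis))
    if longest == 0:
        # Two empty strings are identical, so the error rate is zero.
        return 0.0
    return edit_distance(reference, hypothesis) / longest
```

A lower value means the recognized text is closer to the reference, so a better dewarping result would be expected to yield a lower score once OCR is applied to it.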
