SAGEA / Research First AI Company Building Frontier AI, Agents, Assistants & Services

The Challenge with Devanagari OCR

Modern sequence models demand extensive data scaling. While Latin scripts benefit from massive and diverse datasets, low-resource abugida scripts like Devanagari remain severely bottlenecked by data scarcity.

The intuitive solution, synthesizing word-level data by assembling isolated character fragments, fails in Devanagari in ways that are hard to predict. Words are anchored by a continuous horizontal top-line, the Shirorekha. Naively concatenating characters sampled from different writers produces a broken headline, misaligned baselines, and inconsistent stroke thickness. We refer to this phenomenon as the "Ransom Note" effect. Models trained on such artificial corpora learn to fit arbitrary junction artifacts rather than the underlying stroke morphology.
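To make the "Ransom Note" effect concrete, here is a minimal sketch on toy data. The glyphs are tiny hand-made binary grids rather than real scans, and the function names (`headline_row`, `naive_concat`, `headline_breaks`) are illustrative assumptions, not part of any released tooling. The Shirorekha is estimated as the row with the most ink, which is a common heuristic and not necessarily what a production pipeline would use.

```python
# Toy illustration: glyphs are small binary grids (1 = ink), not real scans.

def headline_row(glyph):
    """Return the index of the row with the most ink (a simple Shirorekha estimate)."""
    ink_per_row = [sum(row) for row in glyph]
    return ink_per_row.index(max(ink_per_row))


def naive_concat(glyphs):
    """Paste equal-height glyphs side by side with no alignment at all."""
    height = len(glyphs[0])
    return [sum((g[r] for g in glyphs), []) for r in range(height)]


def headline_breaks(glyphs):
    """Count junctions where the headline jumps to a different row."""
    rows = [headline_row(g) for g in glyphs]
    return sum(1 for a, b in zip(rows, rows[1:]) if a != b)


# Two hypothetical writers: similar strokes, but the headline sits on different rows.
writer_a = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
writer_b = [
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

word = naive_concat([writer_a, writer_b, writer_a])
breaks = headline_breaks([writer_a, writer_b, writer_a])
```

In this toy word the headline changes row at both junctions, so no single row of `word` carries an unbroken top-line. That discontinuity is the kind of artifact a recognizer can latch onto.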

Introducing the GLUE Engine

To close this gap between synthetic and real handwriting, we propose GLUE (Generative Ligature & Union Engine). GLUE is a spatial and procedural optimization framework that resolves stroke boundary conditions and morphological rules algorithmically instead of relying on naive image concatenation.

GLUE formulates Devanagari construction as an algebraic image integration problem. Instead of hard concatenation, it employs Shirorekha-anchored alignment distributions and Poisson boundary blending to enforce gradient continuity across character junctions, simulating the fluid flow of a single continuous pen stroke.
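The two ideas in this paragraph can be sketched on toy data. The code below is an illustration under stated assumptions, not GLUE's implementation: glyphs are small grids of gray levels, alignment is a deterministic vertical shift (the sampled alignment distributions are not modeled), and the Poisson step is a plain Gauss-Seidel solver for the standard discrete Poisson equation used in seamless cloning. All function names are hypothetical.

```python
# Toy sketch: images are small grids of gray levels. Names are illustrative only.

def headline_row(glyph):
    """Row with the most ink, used as a simple Shirorekha estimate."""
    ink_per_row = [sum(row) for row in glyph]
    return ink_per_row.index(max(ink_per_row))


def align_to_headline(glyphs, background=0.0):
    """Pad each glyph vertically so every headline lands on the same canvas row."""
    tops = [headline_row(g) for g in glyphs]
    anchor = max(tops)
    below = max(len(g) - t for g, t in zip(glyphs, tops))
    aligned = []
    for g, t in zip(glyphs, tops):
        blank = [background] * len(g[0])
        pad_top = anchor - t
        pad_bottom = (anchor + below) - (pad_top + len(g))
        aligned.append(
            [blank[:] for _ in range(pad_top)]
            + [row[:] for row in g]
            + [blank[:] for _ in range(pad_bottom)]
        )
    return aligned


def poisson_blend(target, source, mask, iterations=200):
    """Inside the mask, keep the source's Laplacian while matching the
    target's values on the mask boundary (mask must not touch the image border)."""
    h, w = len(target), len(target[0])
    out = [row[:] for row in target]
    for _ in range(iterations):  # Gauss-Seidel sweeps
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                if not mask[y][x]:
                    continue
                nbrs = ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                neighbor_sum = sum(out[j][i] for j, i in nbrs)
                laplacian = 4 * source[y][x] - sum(source[j][i] for j, i in nbrs)
                out[y][x] = (neighbor_sum + laplacian) / 4.0
    return out


# Alignment: two glyphs whose headlines start on different rows.
glyph_a = [[0, 0, 0], [1, 1, 1], [0, 1, 0], [0, 1, 0]]
glyph_b = [[1, 1, 1], [0, 1, 0], [0, 1, 0]]
aligned = align_to_headline([glyph_a, glyph_b])

# Blending: a crop with a different paper tone pasted into the word canvas.
canvas = [[0.2] * 6 for _ in range(5)]   # canvas paper tone 0.2
patch = [[0.5] * 6 for _ in range(5)]    # crop paper tone 0.5
patch[2][2] = patch[2][3] = 0.9          # a short stroke inside the crop
mask = [[1 <= y <= 3 and 1 <= x <= 4 for x in range(6)] for y in range(5)]
blended = poisson_blend(canvas, patch, mask)
```

After alignment both headlines sit on the same row. After blending, the crop's paper tone matches the canvas while the stroke keeps its contrast against the paper, which is the gradient-continuity property the paragraph describes. A real pipeline would more likely use an optimized solver on full-resolution images.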

Beyond continuous alignment, GLUE natively generates conjuncts (consonant clusters, typically written with half-forms) and matras (dependent vowel signs) via a procedural linguistic ruleset. Rather than treating them as edge cases, GLUE uses them as core synthetic primitives.
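As a rough illustration of what a procedural ruleset for conjuncts and matras might look like, the sketch below composes syllables at the Unicode level: consonants in a cluster are joined with the virama (U+094D) and a dependent vowel sign is appended. The code points are standard Unicode; the function names, the small `MATRAS` table, and the "half"/"full" glyph plan are assumptions made for this example. Real Devanagari shaping also involves dedicated ligatures and special forms of RA, which this sketch ignores.

```python
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA, suppresses the inherent vowel

# A few dependent vowel signs (matras); the inherent "a" has no sign.
MATRAS = {
    "a": "",
    "aa": "\u093E",
    "i": "\u093F",
    "ii": "\u0940",
    "u": "\u0941",
    "e": "\u0947",
    "o": "\u094B",
}

LEFT_SIDE_MATRAS = {"i"}  # the short-i sign is drawn to the left of the cluster


def compose_akshara(consonants, vowel="a"):
    """Encode a syllable: consonants joined by virama, then an optional matra."""
    if not consonants:
        raise ValueError("at least one consonant is required")
    return VIRAMA.join(consonants) + MATRAS[vowel]


def glyph_plan(consonants, vowel="a"):
    """List the glyph primitives to fetch (simplified: half-forms, then a full form).

    Left-side matras come first; all other signs are listed last, although
    visually they attach to the right of, above, or below the final consonant.
    """
    plan = [(c, "half") for c in consonants[:-1]] + [(consonants[-1], "full")]
    if MATRAS[vowel]:
        matra = (MATRAS[vowel], "matra")
        if vowel in LEFT_SIDE_MATRAS:
            plan.insert(0, matra)
        else:
            plan.append(matra)
    return plan


KA, SSA, SA, TA, RA = "\u0915", "\u0937", "\u0938", "\u0924", "\u0930"

kssa = compose_akshara([KA, SSA])            # KA + VIRAMA + SSA
strii = compose_akshara([SA, TA, RA], "ii")  # SA + VIRAMA + TA + VIRAMA + RA + II
plan = glyph_plan([SA, TA], "i")             # matra first, then half SA, full TA
```

Treating clusters and matras as first-class inputs like this means the synthesizer can enumerate them systematically instead of relying on whichever combinations happen to occur in a small real corpus.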

Results and Advancements

We evaluate GLUE with Arva, a proprietary recognition model whose ResNet-based sequence backbone is pre-trained exclusively on GLUE corpora and then fine-tuned on real-world datasets.

Seamless Integration: Poisson blending enforces gradient continuity at character junctions, suppressing the seam artifacts of naive concatenation.
Procedural Fidelity: Linguistically valid conjuncts are synthesized on the fly from the procedural ruleset.
Reduced Error Rates: Models trained on GLUE corpora achieve substantial reductions in Character Error Rate (CER) over naive synthesis baselines.
Extensible Framework: The results favor geometric constraints on synthesis over mere pixel-level variation, yielding a method applicable to other abugida scripts such as Tibetan and Bengali.
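For readers unfamiliar with the metric, Character Error Rate is conventionally the total edit distance between predictions and references divided by the total reference length. The sketch below shows that standard definition; the function names are our own, and it counts Unicode code points, so a dropped virama counts as one error. Whether the reported evaluation counts code points or grapheme clusters is not stated here, so treat that choice as an assumption.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Corpus-level CER: total edits divided by total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars


# A recognizer that drops the virama turns the conjunct KA+VIRAMA+SSA into KA, SSA.
reference = "\u0915\u094D\u0937"
prediction = "\u0915\u0937"
score = cer([reference], [prediction])  # one deletion over three code points
```

Because conjuncts are encoded with an explicit virama, errors at character junctions show up directly in this metric, which makes it a reasonable lens for comparing synthesis strategies.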

Scaling Devanagari OCR With GLUE Engine
