This is one entry from Boilingwater Technology's AI solution library. For handwritten order recognition, we benchmarked general OCR APIs, multimodal large models, and a domain-specific OCR model, then delivered a hybrid production pipeline. On messy handwriting, folded paper, cross-line corrections, and overlapping fields, field-level accuracy improved from 68.4% to 96.1%.
The customer is a fast-moving consumer goods distributor serving hundreds of retail outlets. Orders arrive as handwritten paper forms and must be entered into ERP for fulfillment and reconciliation. The actual paper conditions are far messier than a clean demo sample.
Different sales reps write with different styles, pressure, angles, cross-line notes, corrections, signatures, and overwritten fields. Traditional OCR often fails at the full-row level.
Quantity, unit price, product name, and remarks are often mixed together. Product shorthand must be normalized against business vocabulary and SKU context.
The team had to process 1,200–1,800 orders per day within a narrow window. Two operators were working overtime and still produced avoidable entry errors.
A wrong quantity, price, or customer name affects fulfillment, reconciliation, and month-end settlement. The impact is not inconvenience; it is real financial exposure.
The sample below shows an anonymized handwritten order from a real workflow. We place the original image, the visual recognition overlay, and the final structured JSON side by side so stakeholders can see what the AI actually does.


Creases shift field positions, but all 12 rows are correctly aligned.
Crossed-out prices and handwritten replacements are interpreted with an audit trail.
Remarks overlapping the price column are reassigned through structured post-processing.
For production AI, we do not choose a model first. We benchmark practical options against real samples, then design a hybrid pipeline where each model handles the part it is best suited for.
The delivered system uses a domain OCR primary path, a multimodal semantic correction path, and cloud OCR fallback. Most routine orders finish in under one second. Low-confidence fields are routed to the multimodal model, and poor-quality samples or service failures fall back automatically with manual-review flags.
This structure balances accuracy, cost, latency, and controllability. The engineering principle is simple: use software architecture to turn model uncertainty into business certainty.
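The routing described above can be sketched roughly as follows. This is a minimal illustration, not the delivered implementation: the names `FieldResult` and `route_field`, the 0.85 threshold, and the rule of keeping the higher-confidence guess when both paths remain uncertain are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical threshold; a real value would be tuned on evaluation data.
CONFIDENCE_THRESHOLD = 0.85


@dataclass
class FieldResult:
    name: str
    value: str
    confidence: float
    source: str               # which path produced the value
    needs_review: bool = False


# A recognizer takes a field name and returns a result (stand-in for a model call).
Recognizer = Callable[[str], FieldResult]


def route_field(
    field_name: str,
    primary: Recognizer,
    semantic: Recognizer,
    fallback: Recognizer,
    threshold: float = CONFIDENCE_THRESHOLD,
) -> FieldResult:
    """Try the primary path, escalate on low confidence, fall back on failure."""
    try:
        result = primary(field_name)
        if result.confidence >= threshold:
            return result
        # Low confidence: hand the field to the semantic correction path.
        corrected = semantic(field_name)
        if corrected.confidence >= threshold:
            return corrected
        # Still uncertain (assumed policy): keep the better guess and flag it.
        best = max(result, corrected, key=lambda r: r.confidence)
        best.needs_review = True
        return best
    except Exception:
        # Service failure on either path: use the fallback and flag for review.
        result = fallback(field_name)
        result.needs_review = True
        return result
```

In this sketch the fallback result is always flagged, which mirrors the manual-review behavior described above; an error raised by the fallback itself is left to propagate to the caller.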
We treat AI as a pipeline, not a black box. Every step has a defined responsibility, input, output, and fallback strategy.
Images enter through mobile, scanners, or forms. Distortion, shadow, white balance, and layout are normalized before recognition.
A layout-aware detector identifies headers, rows, columns, and field roles before recognition.
GS-OCR-Hand v2 is fine-tuned on real handwritten samples. Low-confidence fields are routed forward for semantic review.
For corrections, cross-line notes, and context-heavy fields, a multimodal model reads against SKU dictionaries and historical order context.
The output is normalized with unit conversion, price-range checks, customer matching, total checks, and audit logs.
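To make the final normalization step more concrete, here is a small sketch of two of the checks named above: a price-range check and a total check. The function name `validate_order`, the row keys (`sku`, `quantity`, `unit_price`), and the 0.01 tolerance are hypothetical choices for illustration, not the production schema.

```python
from decimal import Decimal


def validate_order(rows, stated_total, price_ranges, tolerance=Decimal("0.01")):
    """Return a list of readable issues; an empty list means the order passed.

    rows         -- list of dicts with hypothetical keys sku, quantity, unit_price
    stated_total -- the total written on the form
    price_ranges -- dict mapping SKU to a (low, high) tuple of Decimal prices
    """
    issues = []
    computed = Decimal("0")
    for i, row in enumerate(rows, start=1):
        # Go through str() so float inputs do not carry binary rounding noise.
        qty = Decimal(str(row["quantity"]))
        price = Decimal(str(row["unit_price"]))
        low, high = price_ranges.get(row["sku"], (None, None))
        if low is None:
            issues.append(f"row {i}: unknown SKU {row['sku']}")
        elif not (low <= price <= high):
            issues.append(f"row {i}: price {price} outside [{low}, {high}]")
        computed += qty * price
    # Total check: the sum of the row amounts should match the written total.
    if abs(computed - Decimal(str(stated_total))) > tolerance:
        issues.append(f"total mismatch: computed {computed}, stated {stated_total}")
    return issues
```

Returning a list of issues rather than raising on the first failure is a deliberate choice in this sketch: a reviewer would see every problem on a form at once, and the same messages could be written to the audit log.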
Five business channels with idempotency, rate limits, and data masking.
Three inference paths plus a confidence-aware router.
Connects OCR output to ERP, reconciliation, and review workflows.
Online corrections flow back into datasets so the model improves monthly.
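Idempotency at the intake layer means that submitting the same order image twice does not create two ERP entries. One possible way to get that behavior is sketched below, under the assumption that the key is a hash of the image content; the class `OrderIngest` and its in-memory store are hypothetical, and a real system might instead use a client-supplied request ID and a persistent store.

```python
import hashlib


class OrderIngest:
    """Toy intake layer that processes each distinct image at most once."""

    def __init__(self):
        self._seen = {}  # idempotency key -> stored result

    @staticmethod
    def idempotency_key(image_bytes: bytes) -> str:
        # Assumption: identical bytes mean the same submission.
        return hashlib.sha256(image_bytes).hexdigest()

    def submit(self, image_bytes: bytes, process):
        """Return (result, is_new). Repeat submissions reuse the stored result."""
        key = self.idempotency_key(image_bytes)
        if key in self._seen:
            return self._seen[key], False
        result = process(image_bytes)
        self._seen[key] = result
        return result, True
```

A content hash only catches byte-identical resubmissions, so a second photo of the same paper form would still pass through; catching that case would need a business-level key, such as customer plus order date.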
We deliver AI systems as measurable engineering projects with formal acceptance. Before launch, the work is milestone-based; after launch, the data loop keeps improving the model.
Walk through the real order flow with business and IT stakeholders.
Collect 4,600 real forms and build the first training and evaluation sets.
Run cloud OCR, multimodal reading, and custom OCR on the same samples.
Build confidence routing, semantic correction, fallback, and pressure tests.
Run AI and human entry in parallel at one warehouse for reconciliation.
Roll out to six warehouses and sign off against KPI targets.
Online errors flow back into the sample store for incremental improvement.
Instead of vague claims, we use same-sample before-and-after metrics and customer feedback to show whether the system solved the real problem.
“Month-end reconciliation used to be our biggest headache. Since this OCR system went live, orders from six warehouses are basically scanned, structured in seconds, and written into ERP. More importantly, the models and data stay in our private cloud.”
IT Director · FMCG distributor in South China
The hybrid OCR + multimodal correction + fallback pattern can be reused for forms, tickets, handwritten logs, and business documents where structure matters.
Stock replenishment forms can go directly into inventory systems.
Sales teams can scan slips into monthly ledgers.
Robust handling for outdoor stains, folds, and handwriting.
Sensitive units and dosage fields can be checked against dictionaries.
Signatures and remarks can be separated from operational fields.
Supports local deployment and end-to-end audit trails.
We are a software engineering and AI implementation team. Over the past three years, we have moved more than 20 AI scenarios from promising demo to stable production operation.

The first strategy call is free. We will unpack the workflow, judge whether AI is worth using, identify the right technical route, and provide a practical initial plan and estimate within five business days.