Translation Operating System
Case study · AI document translation operations
How a controlled AI translation operating system was designed and built for a visa, legal, and official document business — replacing manual file-by-file translation while retaining every document's context, format, and structure, with humans kept in the approval seat. An AI systems engagement delivered with Ongkrong Consulting.
An account of the Ampor Translation engagement — product design, AI pipeline engineering, OCR, structured quality evaluation, human review, layout mirroring for official documents, client operations, invoicing, portal delivery, production hosting, and a companion Telegram AI receptionist. Delivered with Ongkrong Consulting · accurate as of June 2026.
Ampor Translation is a translation and consultancy business, and it rarely receives only clean Word documents. Clients send visa applications, official forms, legal contracts, certificates, scans, phone photos, low-quality PDFs, and mixed-format material. A useful system had to work with the messy front door of the business, not only with ideal inputs. The operational requirement was also broader than translation accuracy: before Ampor Hub, the system built in this engagement, the workflow was fragmented at every step.
The brief: replace human translators for visa applications, official documents, and legal contracts with a controlled AI-assisted workflow in which staff still approve the output, while retaining every document's context, structure, and format. The scope was not just translation but translation operations: intake, OCR, AI pipeline, quality evaluation, human review, layout mirroring, client records, invoices, portal delivery, and customer intake in one system.
The work spanned product, engineering, AI workflow design, deployment, security, and operational handoff — ten workstreams from first intake to customer support, designed as an integrated system where each feeds the next.
Internal command centre — job flow, status tracking, worker pipeline, review, delivery.
DOCX, digital PDF, scanned PDF, image, browser scan — real formats, automatic routing.
12-language registry, language-hub rule, glossary, templates, protected patterns.
AI vision OCR, document classification, 7-dimension QAF, segment approval flow.
Layout analysis, region mapping, layout editor, final DOCX/PDF export.
Client records, job history, notes, services catalogue, invoices, VAT, deposits.
Token-gated portal — clients view and download their own jobs and invoices.
Single container, Postgres, HTTPS, background worker, health checks, runbooks.
Telegram bot — FAQs, booking, document triage, RAG, staff handoff, notifications.
Auth, role gating, upload validation, portal token model, secret handling, runbooks.
| Input type | Detection | Processing path | Pipeline entry |
|---|---|---|---|
| DOCX | File extension + MIME type | Structural XML parser → Document tree extractor → Section analyser | Structured segments |
| Digital PDF | Text layer present | Position-aware text extractor → Region grouper → Section analyser | Positioned segments |
| Scanned PDF | No extractable text layer | Page renderer → AI Vision OCR per page → Text assembler | OCR segments |
| JPG / PNG image | Image MIME type | Direct AI Vision OCR → Text assembler | OCR segments |
| Browser scan | Staff-initiated capture | Multi-page PDF export → Scanned PDF path | OCR segments |
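
The routing in the table above can be sketched as a single dispatch function. This is a minimal illustration, not the production router: the `Path` enum, the `route` function, and the `has_text_layer` flag are hypothetical names, and the sketch assumes text-layer detection has already been done by an upstream PDF step.

```python
from enum import Enum


class Path(Enum):
    """Pipeline entry points named after the table's last column."""
    DOCX = "structured segments"
    DIGITAL_PDF = "positioned segments"
    OCR = "OCR segments"


DOCX_MIME = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"


def route(filename: str, mime: str, has_text_layer: bool = False) -> Path:
    """Pick a processing path from the detection signals in the table."""
    if filename.lower().endswith(".docx") and mime == DOCX_MIME:
        return Path.DOCX
    if mime == "application/pdf":
        # Digital PDFs carry an extractable text layer; scans do not.
        # Browser scans are exported as PDF and land here as scans.
        return Path.DIGITAL_PDF if has_text_layer else Path.OCR
    if mime in ("image/jpeg", "image/png"):
        return Path.OCR
    raise ValueError(f"unsupported upload: {filename} ({mime})")
```

Checking both the extension and the MIME type for DOCX mirrors the "File extension + MIME type" detection rule; anything unrecognised is rejected rather than guessed.
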
| Class | Detection signals | Translation register | Controls activated |
|---|---|---|---|
| Legal | Legal terminology, clause structures, section numbering | High formality — precise legal register | Name + reference protection, glossary enforcement, strict fidelity |
| Official | Government headers, seal markers, form-field patterns | Formal — government / administrative register | Date + ID preservation, place-name and entity protection, layout-mirror path |
| Form | Field labels, blank fields, table-grid structure | Literal / field-mapped translation | Field mapping, exact value preservation, position-aware output |
| Report | Section headers, paragraph structure, numbered lists | Professional — section-aware | Heading translation, section integrity check, structure validation |
| General | Default — no strong structural signal detected | Standard — neutral register | Basic date, ID, and code protection only |
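
One way to read the table above is as a lookup from document class to translation register and active controls. The sketch below is an assumption about shape only: the class names and controls are copied from the table, while `CLASS_PROFILES` and `controls_for` are hypothetical names.

```python
# Registers and controls are copied from the table; the dictionary layout
# and the controls_for helper are illustrative assumptions.
CLASS_PROFILES = {
    "legal": {
        "register": "high formality, precise legal register",
        "controls": ["name and reference protection", "glossary enforcement", "strict fidelity"],
    },
    "official": {
        "register": "formal, government / administrative register",
        "controls": ["date and ID preservation", "place-name and entity protection", "layout-mirror path"],
    },
    "form": {
        "register": "literal / field-mapped translation",
        "controls": ["field mapping", "exact value preservation", "position-aware output"],
    },
    "report": {
        "register": "professional, section-aware",
        "controls": ["heading translation", "section integrity check", "structure validation"],
    },
    "general": {
        "register": "standard, neutral register",
        "controls": ["basic date, ID, and code protection"],
    },
}


def controls_for(doc_class: str) -> dict:
    """Return the profile for a class, falling back to General as the default."""
    return CLASS_PROFILES.get(doc_class.strip().lower(), CLASS_PROFILES["general"])
```

The fallback matches the table's last row: when no strong structural signal is detected, only the basic protections apply.
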
AI accelerates the work; staff still own the final output. Before translation, the system analyses and classifies each document; after translation, the 7-dimension Quality Assessment Framework (QAF) evaluates every segment before it reaches the review surface. No translated document reaches a client without human approval, and the QAF results tell staff exactly where to look first.
| # | Dimension | What is measured | Signal | Threshold |
|---|---|---|---|---|
| 01 | Protected-pattern integrity | Dates, IDs, codes, emails, reference numbers detected in source, verified in output | Hard block | Any mutation detected |
| 02 | Script integrity | Target-language output validated for correct Unicode range and script encoding | Hard block | Invalid characters found |
| 03 | OCR confidence | Character-recognition score assigned per segment by AI Vision OCR | Auto-warn | Score < 0.80 |
| 04 | Semantic fidelity | Word-count ratio between source and translation — extreme deviation signals truncation or hallucination | Auto-warn | > 1.6× or < 0.55× |
| 05 | Terminology compliance | Glossary term-match rate — required terms checked against the active job glossary | Auto-warn | < 95% term match |
| 06 | Format fidelity | Structural element count — headings, tables, lists, field labels matched source to output | Review flag | Any count mismatch |
| 07 | Glossary coverage | % of job-specific terms present in the active glossary before translation | Advisory | < 80% coverage |
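
Two of the dimensions above are simple enough to sketch directly: protected-pattern integrity (01) and the word-count ratio used for semantic fidelity (04). The regular expression below is a hypothetical stand-in for the real protected-pattern set, which the case study does not publish. The whitespace word count is also an assumption, and it would need a proper segmenter for scripts written without spaces.

```python
import re

# Hypothetical patterns standing in for the real protected-pattern set.
PROTECTED = re.compile(
    r"\b\d{4}-\d{2}-\d{2}\b"        # ISO-style dates
    r"|[\w.+-]+@[\w-]+\.[\w.-]+"    # email addresses
    r"|\b[A-Z]{2,}-?\d{3,}\b"       # reference codes such as AB-123456
)


def protected_intact(source: str, target: str) -> bool:
    """Dimension 01: every protected token in the source survives unchanged."""
    return sorted(PROTECTED.findall(source)) == sorted(PROTECTED.findall(target))


def length_ratio_ok(source: str, target: str, low: float = 0.55, high: float = 1.6) -> bool:
    """Dimension 04: warn when the target/source word-count ratio leaves [low, high]."""
    src_words = len(source.split())
    tgt_words = len(target.split())
    if src_words == 0:
        return tgt_words == 0
    return low <= tgt_words / src_words <= high
```

Comparing sorted token lists, rather than sets, also catches a protected value that was dropped once or duplicated in the output.
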
All 7 dimensions pass · confidence ≥ 0.85 · no flags. Batch-approve eligible segments without individual review.
One or more dimensions flagged. Staff must review each segment individually before the job can be approved.
Protected pattern mutated or script integrity failed. Segment must be manually corrected. Delivery blocked until resolved.
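
The three gate outcomes above can be sketched as a small decision function. The `Gate` and `SegmentResult` names are hypothetical, and one behaviour is an assumption: the source does not say where an unflagged segment with confidence below 0.85 goes, so this sketch sends it to individual review.

```python
from dataclasses import dataclass, field
from enum import Enum


class Gate(Enum):
    BATCH_APPROVE = "eligible for batch approval"
    INDIVIDUAL_REVIEW = "individual review required"
    BLOCKED = "manual correction required, delivery blocked"


@dataclass
class SegmentResult:
    confidence: float
    hard_blocks: list[str] = field(default_factory=list)  # dimensions 01 and 02
    flags: list[str] = field(default_factory=list)        # any other flagged dimension


def gate(result: SegmentResult) -> Gate:
    """Map one segment's QAF result to a gate outcome."""
    if result.hard_blocks:
        return Gate.BLOCKED
    # Assumption: low confidence without flags still goes to a reviewer.
    if result.flags or result.confidence < 0.85:
        return Gate.INDIVIDUAL_REVIEW
    return Gate.BATCH_APPROVE
```

Hard blocks are checked first so that a mutated protected pattern can never be batch-approved, whatever the confidence score.
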
Many translation jobs are not plain text. Official documents need layout sensitivity: forms, certificates, tables, headings, seals, margins, and field positions. The system analyses the original layout, maps translated content back into regions by section and field position, lets staff adjust in an editing surface, and exports a DOCX or PDF suitable for client delivery. For official documents, layout fidelity is not cosmetic — it is the delivery standard the client expects. Around that, Ampor Hub also carries the business operations: client records and job history, a services catalogue, and invoices in USD and KHR with deposits, discounts, and VAT — every invoice linked to its job, every job to its client. No separate spreadsheet, no separate invoicing tool, no separate file store.
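
The invoice arithmetic mentioned above (discounts, VAT, deposits, USD and KHR) can be illustrated with decimal arithmetic. Everything specific here is an assumption: the order of operations (discount, then VAT, then deposit), the 10% VAT rate, the exchange rate used in the usage note, and the function names.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def invoice_total(
    subtotal: Decimal,
    discount: Decimal = Decimal("0"),
    vat_rate: Decimal = Decimal("0.10"),  # placeholder rate, not from the source
    deposit: Decimal = Decimal("0"),
) -> dict:
    """Apply discount, then VAT, then subtract any deposit already paid."""
    net = subtotal - discount
    vat = (net * vat_rate).quantize(CENT, rounding=ROUND_HALF_UP)
    total = net + vat
    return {"net": net, "vat": vat, "total": total, "balance_due": total - deposit}


def to_khr(usd: Decimal, rate: Decimal) -> Decimal:
    """Convert a USD amount to whole riel at a caller-supplied rate."""
    return (usd * rate).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
```

For example, a 200.00 USD job with a 20.00 discount and a 50.00 deposit yields a 198.00 total and a 148.00 balance under these assumptions; `to_khr(Decimal("198.00"), Decimal("4100"))` converts the total at an illustrative rate. `Decimal` avoids the rounding drift that binary floats introduce into money values.
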
The platform supports a 12-language registry with a configurable language-hub rule: every job must involve one of the configured hub languages on one side. That lets the business handle multilingual demand while keeping the operational centre of gravity aligned to the markets it serves. The hub model governs which pairs are valid; review-driven feedback then updates the glossary, protected patterns, and templates that later jobs use.
| Pairing type | Validity | Routing |
|---|---|---|
| Hub A ↔ Hub B | ✓ Supported — primary hub bridge | Direct translation · highest-priority pair |
| Hub A ↔ Other | ✓ Supported — hub-A spoke | Direct translation with hub-A-side controls |
| Hub B ↔ Other | ✓ Supported — hub-B spoke | Direct translation with target-script integrity guardrails |
| Other ↔ Other | – Not in scope | Must involve a hub language — server-side validation prevents invalid pairs |
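
The hub rule in the table above reduces to a short server-side check. The language codes below are placeholders because the case study does not name the hub languages, and rejecting same-language pairs is an added assumption.

```python
# Placeholder codes; the configured hub languages are not named in the source.
HUB_LANGUAGES = frozenset({"hub-a", "hub-b"})


def is_valid_pair(source: str, target: str, hubs: frozenset = HUB_LANGUAGES) -> bool:
    """A pair is valid only if at least one side is a hub language."""
    if source == target:
        # Assumption: translating a language into itself is never a valid job.
        return False
    return source in hubs or target in hubs
```

Keeping the hub set configurable, rather than hard-coding the pairs, means adding a spoke language to the registry needs no change to the validation logic.
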
Document enters; pipeline classifies and extracts content. Active glossary, templates, and patterns applied.
Segments translated using the current knowledge base. QAF evaluates every segment before review.
Staff correct errors, edit segments, flag issues. Every edit is a signal the system captures.
Terminology errors, missed patterns, register issues, and glossary gaps identified and logged.
Glossary extended, protected patterns refined, template instructions improved.
Updated controls applied to all subsequent translations — compounding accuracy over time.
Clients receive a token-gated portal link to view their own jobs, download files, and see invoices — scoped to one client, isolated from staff routes, admin-managed tokens. No ad-hoc file sharing, no email attachments, no ambiguity about which version is final.
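
A token-gated portal of this kind is often built by issuing a random token and storing only its hash. The sketch below shows that general pattern with the Python standard library; it is not a description of Ampor Hub's token model, and `issue_token`, `resolve_client`, and the in-memory store are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Optional


def issue_token() -> tuple[str, str]:
    """Return (token for the client link, digest to store server-side)."""
    token = secrets.token_urlsafe(32)
    return token, hashlib.sha256(token.encode()).hexdigest()


def resolve_client(token: str, stored: dict[str, str]) -> Optional[str]:
    """Look up the client that owns a presented token, or None if unknown.

    `stored` maps token digests to client IDs; a real system would query
    a database table instead of a dict.
    """
    digest = hashlib.sha256(token.encode()).hexdigest()
    for known_digest, client_id in stored.items():
        # Constant-time comparison avoids leaking digest prefixes via timing.
        if hmac.compare_digest(known_digest, digest):
            return client_id
    return None
```

Because each token resolves to exactly one client ID, every portal query can be scoped to that client, which is the isolation property described above.
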
Single web container with HTTPS ingress and health checks; internal Postgres with database-backed job queueing and atomic claiming; an in-process background worker with heartbeat monitoring and restart handling; migration runner at boot, plus backup, restore, and handoff runbooks.
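
Atomic claiming on a database-backed queue means two workers can never take the same job. The sketch below demonstrates the idea with SQLite from the standard library so it runs anywhere; the table layout and function names are assumptions. On Postgres, a common way to get the same guarantee is `SELECT ... FOR UPDATE SKIP LOCKED`, though the source does not say which technique Ampor Hub uses.

```python
import sqlite3
from typing import Optional


def make_queue() -> sqlite3.Connection:
    """Create an in-memory jobs table (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE jobs ("
        " id INTEGER PRIMARY KEY,"
        " status TEXT NOT NULL DEFAULT 'queued',"
        " worker TEXT)"
    )
    return db


def claim_next(db: sqlite3.Connection, worker: str) -> Optional[int]:
    """Claim the oldest queued job, or return None if the queue is empty."""
    while True:
        row = db.execute(
            "SELECT id FROM jobs WHERE status = 'queued' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The status guard makes the claim a compare-and-set: if another
        # worker took the job first, zero rows change and we try again.
        cur = db.execute(
            "UPDATE jobs SET status = 'running', worker = ? "
            "WHERE id = ? AND status = 'queued'",
            (worker, row[0]),
        )
        db.commit()
        if cur.rowcount == 1:
            return row[0]
```

A heartbeat column updated by the running worker would let a supervisor requeue jobs whose worker has gone silent, which is one plausible reading of the restart handling mentioned above.
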
A companion Telegram AI receptionist handles FAQs and price guidance from a controlled knowledge base (RAG), appointment booking, document-upload triage, language switching, and staff/owner handoff with context — so staff receive a warm, documented enquiry, not a cold one.
Staff authentication with admin and staff roles and server-side route gating; strong session-secret enforcement, password hashing, and login throttling; portal token isolation; upload validation and size limits; backup and secret-handling runbooks documented for handoff.
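
Password hashing and login throttling, two of the controls listed above, can be sketched with the standard library. The algorithm choice (PBKDF2-HMAC-SHA256), the iteration count, and the sliding-window limits are illustrative assumptions; the source does not state which scheme or limits the system uses.

```python
import hashlib
import hmac
import os
import time
from typing import Optional

ITERATIONS = 600_000  # illustrative work factor, not taken from the source


def hash_password(password: str, salt: Optional[bytes] = None) -> tuple[bytes, bytes]:
    """Return (salt, derived key); a fresh random salt is used when none is given."""
    salt = salt or os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the key and compare in constant time."""
    return hmac.compare_digest(hash_password(password, salt)[1], expected)


class LoginThrottle:
    """Reject logins after too many recent failures for the same key."""

    def __init__(self, max_attempts: int = 5, window_seconds: float = 300.0):
        self.max_attempts = max_attempts
        self.window = window_seconds
        self._failures: dict[str, list[float]] = {}

    def record_failure(self, key: str, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._failures.setdefault(key, []).append(now)

    def allowed(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Keep only failures that are still inside the window.
        recent = [t for t in self._failures.get(key, []) if now - t < self.window]
        self._failures[key] = recent
        return len(recent) < self.max_attempts
```

The throttle key could be a username, an IP address, or both; the in-memory dictionary is enough for a single-container deployment but would reset on restart.
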
Major artifacts produced across the ten workstreams — from translation pipeline to production operations.
Seven evaluation dimensions, four signal types (hard block, auto-warn, review flag, advisory), and three gate outcomes, applied to every segment of every job. Quality is measured continuously through the pipeline, not reviewed only at the end.
Five document formats, automatic classification into five document types, and an OCR path that handles the real inputs a translation business receives — not only clean Word documents.
The continuous-improvement loop feeds review corrections back into glossary, protected patterns, and templates — so each visa, contract, and official document makes the next one better.
If your business processes visa applications, official documents, or legal contracts through manual translators, we can build you a controlled AI translation operation that retains document context, format, and structure while keeping humans in the approval seat. That is the brief Ampor Hub was built against.
Prepared with Ongkrong Consulting. Accurate as of June 2026.