Accessible Thai Document Conversion System

Convert Thai and multilingual documents into accessible formats for blind and visually impaired users. This form supports individual upload, institutional batch processing, and future API integration.

Output targets include accessible HTML, DOCX, TXT, EPUB, MP3 audio, DAISY, Braille BRF, and tagged PDF (PDF/UA).

Conversion Options

Input: PDF, DOCX, TXT, JPG, PNG, TIFF, ZIP

Audio: MP3 and DAISY-ready structure

Braille: BRF output option

Batch: Upload or watch-folder processing
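As a rough illustration of how the input types listed above might be screened before a job is queued, the sketch below sorts an uploaded filename into a handling category by its extension. The function name `classify_upload`, the category labels, and the exact extension set (including the `.jpeg` and `.tif` variants) are assumptions made for this example, not part of the system described here.

```python
from pathlib import Path

# Extensions derived from the "Input" line above. Treating .jpeg and .tif
# as equivalents of .jpg and .tiff is an assumption for this sketch.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}
DOCUMENT_EXTENSIONS = {".pdf", ".docx", ".txt"}
ARCHIVE_EXTENSIONS = {".zip"}


def classify_upload(filename: str) -> str:
    """Return a hypothetical handling category for an uploaded file.

    "image"    -> needs OCR before any conversion
    "document" -> PDF, DOCX, or TXT input
    "batch"    -> ZIP archive to unpack into several jobs
    "rejected" -> extension not in the accepted list
    """
    suffix = Path(filename).suffix.lower()
    if suffix in ARCHIVE_EXTENSIONS:
        return "batch"
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    if suffix in DOCUMENT_EXTENSIONS:
        return "document"
    return "rejected"
```

An extension check like this is only a first filter; a real deployment would likely also inspect the file contents, since an extension can be wrong or missing.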

Start a Conversion

Complete the form below to prepare a document conversion request. Every field has a programmatic label and can be used with a screen reader and keyboard only.

Select one or more documents, choose the desired accessible output formats, and submit the request.

Document files

Accepted files: PDF, Word, text, image files, or ZIP archive for batch upload.

Output formats
Processing options
Language and delivery

Optional. Add any special instructions for this conversion request.

Conversion Status

No conversion request has been submitted yet.

Minimum Open-Source Configuration

The demonstration system is designed around proven open-source technologies. These components form the minimum technical foundation required to process Thai documents, manage batch conversion, and generate accessible output formats.

  • OCRmyPDF

    Adds searchable text layers to scanned PDF files.

  • Tesseract OCR

    Performs optical character recognition with Thai and English language support.

  • PaddleOCR

    Optional OCR engine for improved Thai character recognition and complex layouts.

  • Poppler

    Converts PDF pages into images for OCR and preprocessing.

  • ImageMagick

    Prepares and optimizes scanned images before OCR processing.

  • Python

    Main processing language for document workflows, automation, and AI integration.

  • FastAPI

    Provides REST API access for external systems and institutional integration.

  • Celery

    Handles background jobs and large-scale batch processing.

  • Redis

    Acts as the queue broker for conversion jobs and retry handling.

  • Watchdog

    Monitors incoming directories and triggers automatic batch processing.

  • Pandoc

    Converts structured content into HTML, DOCX, EPUB, and plain text outputs.

  • Calibre

    Supports eBook conversion and EPUB handling.

  • Piper TTS

    Generates offline text-to-speech audio output where Thai voice models are available.

  • eSpeak NG

    Provides lightweight speech synthesis and fallback audio generation.

  • Liblouis

    Converts text into digital Braille formats such as BRF.

  • DAISY Pipeline

    Supports generation of accessible DAISY-style structured audio publications.

  • veraPDF

    Validates accessible PDF output against PDF/UA requirements.

  • Ace by DAISY

    Checks EPUB accessibility and identifies accessibility issues.

  • MySQL / MariaDB

    Stores metadata, job status, user requests, audit records, and output references.

  • Debian Linux

    Provides a stable and secure server operating environment for deployment.
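To make the division of labour between these components more concrete, here is a minimal sketch that maps requested output formats to the tools named in the list and builds an OCRmyPDF command line for Thai and English. The mapping table, the function names `plan_tools` and `build_ocr_command`, and the format keys are assumptions for illustration; the `--language` option with `+`-joined Tesseract language codes is the only OCRmyPDF feature relied on, and it requires the corresponding Tesseract language data to be installed.

```python
from typing import Dict, List, Sequence

# Hypothetical dispatch table, following the component descriptions above.
OUTPUT_TOOLS: Dict[str, str] = {
    "html": "Pandoc",
    "docx": "Pandoc",
    "epub": "Pandoc",
    "txt": "Pandoc",
    "mp3": "Piper TTS",
    "brf": "Liblouis",
    "daisy": "DAISY Pipeline",
}

# Formats that the list pairs with a dedicated accessibility checker.
VALIDATORS: Dict[str, str] = {
    "epub": "Ace by DAISY",
}


def build_ocr_command(input_pdf: str, output_pdf: str,
                      languages: Sequence[str] = ("tha", "eng")) -> List[str]:
    """Build an argument list for OCRmyPDF using Tesseract language codes."""
    return ["ocrmypdf", "--language", "+".join(languages), input_pdf, output_pdf]


def plan_tools(formats: Sequence[str]) -> List[str]:
    """Return the tools needed for the requested formats, in order, without duplicates."""
    plan: List[str] = []
    for fmt in formats:
        key = fmt.lower()
        if key not in OUTPUT_TOOLS:
            raise ValueError(f"unsupported output format: {fmt}")
        for tool in (OUTPUT_TOOLS[key], VALIDATORS.get(key)):
            if tool is not None and tool not in plan:
                plan.append(tool)
    return plan
```

The argument list returned by `build_ocr_command` could be passed to `subprocess.run` inside a background worker. Keeping the table as plain data makes it easy to adjust if a different tool is chosen for a given format.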

Recommended Initial Server Configuration

For the initial demonstration and launch phase, the system should run on a dedicated server with sufficient storage and processing capacity for OCR, audio generation, and batch conversion workloads.

  • Debian Linux
  • Dedicated server
  • 8 TB storage
  • Python workers
  • Redis queue
  • MySQL metadata