Echo: A Machine Translation Research Prototype

Context‑Augmented Machine Translation for CAT Tools

Echo bridges Machine Translation (MT) and Computer‑Assisted Translation (CAT) using smaller, context‑enriched LLMs. It leverages translation memories, lexicons, and recent edits to improve translation quality while keeping data private.

✨ Highlights

🧠 Context‑Enriched LLMs

Smaller models + rich context (TMs, termbases, recency cache) can rival larger LLMs without context.

RAG · HyDE · Cache · Qdrant
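The core retrieval idea can be sketched without any infrastructure. In the actual system, segments live in Qdrant; the in-memory version below (the `cosine` and `top_k` helpers are illustrative names, not Echo's API) just shows how a query embedding is matched against embedded translation-memory entries:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, memory, k=3):
    """Return the k translation-memory entries closest to the query.

    memory: list of (vector, source_segment, target_segment) tuples.
    """
    ranked = sorted(memory, key=lambda e: cosine(query_vec, e[0]), reverse=True)
    return ranked[:k]
```

The retrieved source/target pairs are what get packed into the prompt as context for the smaller model.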

🔁 Self‑Updating Memory

Translator corrections flow back into the system automatically—no manual TM uploads required.

🔌 CAT Tool Integration

A Trados TranslationProvider plugin retrieves context and suggestions directly in the translator’s workflow.

🛡️ Privacy‑First

Local‑first and on‑prem friendly. Keep sensitive data in‑house while improving quality.

🛠️ How it works

1) Ingest

Upload TMX/SDLTM translation memories and lexicons. Segments are embedded and stored in Qdrant for retrieval.
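The first step of ingestion is extracting aligned segment pairs from the uploaded files; embedding and upserting into Qdrant happen afterwards. A minimal stdlib sketch for the TMX case (the `parse_tmx` helper and its language defaults are assumptions for illustration, not Echo's actual ingest code):

```python
import xml.etree.ElementTree as ET

# TMX marks languages with xml:lang, which ElementTree exposes under this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def parse_tmx(tmx_text, src_lang="en", tgt_lang="de"):
    """Extract (source, target) segment pairs from a TMX document string."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):                 # one <tu> per translation unit
        segs = {}
        for tuv in tu.iter("tuv"):             # one <tuv> per language variant
            lang = tuv.get(XML_LANG) or tuv.get("lang")
            seg = tuv.find("seg")
            if lang and seg is not None:
                segs[lang.lower()] = "".join(seg.itertext()).strip()
        if src_lang in segs and tgt_lang in segs:
            pairs.append((segs[src_lang], segs[tgt_lang]))
    return pairs
```

Each returned pair would then be embedded (source side) and written to a Qdrant collection with the target text in the payload.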

2) Translate

The REST API and Trados plugin call LLMs (OpenAI, Anthropic, etc.) with the retrieved context.
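Before the LLM call, the retrieved matches and terminology are assembled into a prompt. A simplified sketch of that assembly step (the `build_prompt` function and its wording are illustrative, not the exact prompt Echo sends):

```python
def build_prompt(source_segment, tm_matches, glossary):
    """Assemble a context-enriched translation prompt.

    tm_matches: list of (source, target) pairs retrieved from memory.
    glossary:   dict mapping source terms to required target terms.
    """
    lines = ["Translate the source segment. Follow the examples and terminology."]
    if tm_matches:
        lines.append("Translation memory matches:")
        lines += [f"- {s} => {t}" for s, t in tm_matches]
    if glossary:
        lines.append("Required terminology:")
        lines += [f"- {s} => {t}" for s, t in glossary.items()]
    lines.append(f"Source: {source_segment}")
    return "\n".join(lines)
```

This is the point where a smaller model gains ground on a larger one: the matches and terms constrain the output toward the client's existing translations.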

3) Improve

Edits are captured and cached (FIFO per session) to increase consistency and quality.
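The FIFO per-session cache is straightforward to sketch with a bounded deque (the `SessionCache` class below is an illustrative sketch, not the shipped implementation):

```python
from collections import deque

class SessionCache:
    """Keeps the N most recent translator edits for one session (FIFO)."""

    def __init__(self, max_edits=50):
        # deque with maxlen drops the oldest entry automatically when full
        self.edits = deque(maxlen=max_edits)

    def record(self, source, corrected_target):
        self.edits.append((source, corrected_target))

    def recent(self, k=5):
        """Most recent edits first, ready for injection into the next prompt."""
        return list(self.edits)[-k:][::-1]
```

Feeding `recent()` back into the prompt is what makes a correction made in segment 12 show up as context for segment 13.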

🗺️ Roadmap

Prototype

  • ✅ Stage 1: End‑to‑end demo (API + plugin) and basic metrics
  • ✅ Stage 2: Automatic feedback to memory
  • ✅ Stage 3: Projects & auth basics
  • 🚧 Stage 4: Session cache (needs online backend)
  • 🎯 Stage 5: HyDE + new models + evaluations

Research

Goal: show that smaller, context‑enriched models match or exceed larger LLMs on standard MT metrics.

BLEU · ROUGE · ChrF · METEOR · COMET
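For real evaluations a library such as sacrebleu should be used, but ChrF is simple enough to sketch from scratch: it is a character n-gram F-score, which makes it more forgiving of morphological variation than word-level BLEU. A simplified version (uniform n-gram averaging, whitespace stripped, no smoothing):

```python
from collections import Counter

def char_ngrams(text, n):
    """Character n-gram counts, ignoring whitespace."""
    text = text.replace(" ", "")
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified ChrF: mean character n-gram F-beta score in [0, 1]."""
    scores = []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # segment shorter than n characters
        overlap = sum((hyp & ref).values())
        prec = overlap / sum(hyp.values())
        rec = overlap / sum(ref.values())
        if prec + rec == 0:
            scores.append(0.0)
        else:
            scores.append((1 + beta**2) * prec * rec / (beta**2 * prec + rec))
    return sum(scores) / len(scores) if scores else 0.0
```

COMET, by contrast, is a learned neural metric and cannot be sketched this way; it is included in the testbed via its reference implementation.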

📊 Metrics Testbed

A reproducible harness for scoring model outputs against reference translations with the metrics listed above, used to compare context-enriched smaller models against larger context-free LLMs.

🎯 Who is this for?

👩‍💻 Translators

Faster, more consistent output within Trados; no new tools to learn.

🏢 Agencies

Quality and throughput improvements with existing TMs and workflows.

🏛️ Enterprise & Gov

On‑prem options for sensitive content; clear path to compliance.

🎓 Academia

A reproducible testbed and publishable results via shared tasks/benchmarks.