For decades, machine translation between natural languages relied fundamentally on human-translated documents known as parallel texts, which provide direct correspondences between source and target sentences. The notion that translation systems could be trained on non-parallel texts, written independently in different languages, was long considered unrealistic. Fast forward to the era of large language models (LLMs), and we now know that, given sufficient computational resources, LLMs exploit incidental parallelism in their vast training data, i.e., they identify parallel messages across languages.