Professor Jörg Tiedemann gave a talk at Charles University in the Czech Republic on 18 June as part of its Fred Jelinek Seminar Series.
Translated texts are semantic mirrors of the original text, and the significant variations that we can observe across languages can be used to disambiguate the meaning of a given expression, using the linguistic signal that is grounded in translation. We are interested in massively parallel corpora covering hundreds, and up to a thousand, different languages, and in how they can be applied as implicit supervision to learn abstractions that could lead to significant improvements in natural language understanding. As a side effect, we can also see how multilingual models pick up essential relationships between languages, building a continuous space with reasonable language clusters. I will talk about some initial results and plans for the future, and I would like to get your feedback on those ideas.