Introduction

Current technologies make it possible to apply digital data processing techniques, such as machine learning, to long-standing research questions in the humanities. Among these are questions about the linguistic interactions across the world that have given rise to the rich variety of the world's languages and their typological relationships. Such questions are highly complex, as is the linguistic data needed to study typology. Machine learning approaches, however, can handle large amounts of linguistic data. Language typology is therefore a promising area in which to combine the technological advances of machine learning with the long-standing research questions of the humanities.

Traditionally, technology and the humanities have remained separate in both their methodology and the questions they consider of interest, and only in recent decades has it been recognized how useful they can be to one another. Bringing the two together has not been a simple task. The disparity itself already stems from the current educational structure, in which the humanities and technology do not interact in a meaningful way: students of the humanities are not familiar enough with what current technological advances can offer their research, and, conversely, technology students are not familiar with the diversity of questions to which they could apply, for instance, machine learning techniques.

We started this project with the aim of bridging this gap. The project involves teams and specialists from two institutions working as academic collaborators: the Indian Institute of Technology in Guwahati (IITG), India, and the University of Helsinki (UH), Finland. Both collaborators conduct active research in prosodic language typology and in the application of machine learning techniques to typological research. Both are also actively involved in preparing new courses and in supervising Master's and PhD students in the domain of digital language typology.