FinSyn – Synthesis experiments

FinSyn – a New Speech Synthesis Corpus for Finnish

Publicly available Finnish speech resources are becoming plentiful; examples include a large parliamentary corpus and the recent Lahjoita puhetta (“Donate Speech”) campaign. These are well suited to many speech technology applications, such as speech recognition, but are of limited value for training speech synthesis, where large, high-quality single-speaker corpora are ideal. So far, such corpora have been absent for Finnish.

To fill this gap, we at the Helsinki phonetics group designed and recorded a new speech corpus in autumn 2021, intended for Finnish speech synthesis research and applications. We collected a dataset of approximately 60 hours of speech by recording two voice talents almost daily over a one-month period.

