Speech samples for the paper "Scaling and bias codes for modeling speaker-adaptive DNN-based speech synthesis systems", presented at the IEEE SLT 2018 Workshop on Spoken Language Technology.
A pre-print version of this paper can be found at https://arxiv.org/abs/1807.11632
More publications (arXiv pre-prints, posters, ...) about speaker adaptation and speech synthesis can be found on my website.
Samples synthesized using the WORLD vocoder
Samples synthesized using the WaveNet vocoder for selected strategies
1st sample
| | Non-linear, 10 | Non-linear, 320 | Linear, 10 | Linear, 320 |
|---|---|---|---|---|
| Natural | ► Play | |||
| Biasm | ► Play | ► Play | ||
| Bias | ► Play | ► Play | not available | not available |
| Scale | not available | not available | not available | not available |
| Affine | not available | not available | ► Play | ► Play |
| Level | not available | not available | ► Play | ► Play |
| Bottle | not available | not available | ► Play | ► Play |
2nd sample
| | Non-linear, 10 | Non-linear, 320 | Linear, 10 | Linear, 320 |
|---|---|---|---|---|
| Natural | ► Play | |||
| Biasm | ► Play | ► Play | ||
| Bias | ► Play | ► Play | not available | not available |
| Scale | not available | not available | not available | not available |
| Affine | not available | not available | ► Play | ► Play |
| Level | not available | not available | ► Play | ► Play |
| Bottle | not available | not available | ► Play | ► Play |