Sonic is a simple algorithm for speeding up or slowing down speech.  Unlike
previous algorithms for changing speech rate, it is optimized for speed-ups of
over 2X.  The Sonic library is a very simple ANSI C library that is designed to
be easily integrated into streaming voice applications, such as TTS back ends.

The primary motivation behind Sonic is to enable the blind and visually impaired
to improve their productivity with open source speech engines, like espeak.
Sonic can also be used by the sighted.  For example, Sonic can improve the
experience of listening to an audio book on an Android phone.

A native Java port of Sonic is in Sonic.java.  Main.java is a simple example of
how to use Sonic.java.  To play with it, you'll need a "talking.wav" file in the
current directory, and you'll want to change the speed, pitch or other
parameters manually in Main.java, in the main method.

Sonic is Copyright 2010, 2011, Bill Cox, all rights reserved.  It is released
under the Apache 2.0 license, to promote usage as widely as possible.

Performance test:

I sped up a 751958176-byte WAV file with Sonic (a 9-hour, 28-minute mono audio
file encoded at 16-bit, 11 kHz), with output writing disabled.  The time
reported, running Ubuntu 11.04 on my HP Pavilion dm4 laptop, was:

real    0m50.839s
user    0m47.370s
sys     0m0.620s

The Java version is not much slower.  It reported:

real    0m52.043s
user    0m51.190s
sys     0m0.310s

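As a rough check of these figures, the sketch below recomputes the audio
duration from the file size and expresses a wall-clock time as a multiple of
real time.  It assumes a sample rate of 11,025 Hz (the text above only gives
the rate as roughly 11 kHz) and ignores the small WAV header.  The function
names are illustrative and are not part of the Sonic API.

```c
#include <assert.h>

/* Duration, in seconds, of raw PCM audio of the given size.
 * Assumption: the WAV header is small enough to ignore. */
double pcm_duration_seconds(double data_bytes, int bytes_per_sample,
                            int channels, int sample_rate_hz)
{
    assert(bytes_per_sample > 0 && channels > 0 && sample_rate_hz > 0);
    return data_bytes / ((double)bytes_per_sample * channels * sample_rate_hz);
}

/* Seconds of audio processed per second of wall-clock time. */
double realtime_factor(double audio_seconds, double wall_seconds)
{
    assert(wall_seconds > 0.0);
    return audio_seconds / wall_seconds;
}
```

Under those assumptions, pcm_duration_seconds(751958176.0, 2, 1, 11025) gives
about 34,102 seconds, which agrees with the stated 9 hours 28 minutes.  The
two "real" times then correspond to roughly 670 times real time for the C
version and roughly 655 times real time for the Java version.
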
Update, May 7, 2017
-------------------
I upgraded the pitch change algorithm to use a 12-point sinc FIR filter for
interpolation, rather than linearly interpolating between points.  This
significantly reduces noise introduced by the pitch change algorithm.  The
improvement is most noticeable in low-sample-rate streams, such as the
11,025 Hz output of the Eloquence TTS engine.  The upgrade is in both the C and
Java versions.


Author: Bill Cox
email: [email protected]