Doctoral Degrees (Computer Science)
Permanent URI for this collectionhttps://hdl.handle.net/10413/7113
Browse
Browsing Doctoral Degrees (Computer Science) by Subject "Automatic speech recognition."
Now showing 1 - 1 of 1
- Results Per Page
- Sort Options
Item The investigation into an algorithm based on wavelet basis functions for the spatial and frequency decomposition of arbitrary signals.(1994) Goldstein, Hilton.; Sartori-Angus, Alan G.The research was directed toward the viability of an O(n) algorithm which could decompose an arbitrary signal (sound, vibration etc.) into its time-frequency space. The well known Fourier Transform uses sine and cosine functions (having infinite support on t) as orthonormal basis functions to decompose a signal i(t) in the time domain to F(w) in the frequency . domain, where the Fourier coefficients F(w) are the contributions of each frequency in the original signal. Due to the non-local support of these basis functions, a signal containing a sharp localised transient does not have localised coefficients, but rather coefficients that decay slowly. Another problem is that the coefficients F(w) do not convey any time information. The windowed Fourier Transform, or short-time Fourier Transform, does attempt to resolve the latter, but has had limited success. Wavelets are basis functions, usually mutually orthonormal, having finite support in t and are therefore spatially local. Using non-orthogonal wavelets, the Dominant Scale Transform (DST) designed by the author, decomposes a signal into its approximate time-frequency space. The associated Dominant Scale Algorithm (DSA) has O(n) complexity and is integer-based. These two characteristics make the DSA extremely efficient. The thesis also investigates the problem of converting a music signal into it's equivalent music score. The old problem of speech recognition is also examined. The results obtained from the DST are shown to be consistent with those of other authors who have utilised other methods. The resulting DST coefficients are shown to render the DST particularly useful in speech segmentation (silence regions, voiced speech regions, and frication). Moreover, the Spectrogram Dominant Scale Transform (SDST), formulated from the DST, was shown to approximate the Fourier coefficients over fixed time intervals within vowel regions of human speech.