THESIS
2022
1 online resource (xii, 114 pages) : illustrations (some color)
Abstract
Humans can localize sound sources with two ears, an ability known as binaural sound localization. Conventional
methods of modeling binaural localization have focused on artificial spatial cues such
as Interaural Time Difference (ITD) and Interaural Level Difference (ILD) to decode the
locational information. In this work, we extracted spatial features with sparse coding
algorithms and then mapped the features to sound locations with a Deep Neural
Network (DNN). The Generative Adaptive Subspace Self-organizing
Map (GASSOM) and Independent Component Analysis (ICA) were
compared as the sparse coding algorithms. Results indicate that GASSOM outperforms ICA. Map size and basis function
length were shown to affect the performance of GASSOM, and the optimal
selections of both parameters are reported in the thesis. To verify the ability
of the GASSOM-DNN sound localization model to simulate human binaural localization
performance, benchmark studies against previously reported empirical data were conducted. Factors
investigated included the influence of bandwidth, center frequency, and duration of
binaural cues, and the mismatch of non-individualized head-related transfer functions (HRTFs). The performance of the computational
model was compared with previously reported human data, and the two were found to be
similar. The future potential of using GASSOM to model binaural sound localization
is discussed.