PEZZOLI MIRCO | Cycle: XXXIII
Tutor: MONTI-GUARNIERI ANDREA VIRGILIO
Advisor: SARTI AUGUSTO

Major research topic: Space-time Parametric approach to Extended Audio Reality (SP-EAR)

Abstract:
The term extended reality refers to all possible interactions between real and virtual (computer-generated) elements and environments. The extended reality field is growing rapidly, especially through augmented and virtual reality applications. The former allows users to bring digital elements into the real world, while the latter lets users experience and interact with a fully virtual environment. While current extended reality implementations primarily focus on the visual domain, the role of auditory perception cannot be underestimated in providing a fully immersive experience. As a matter of fact, effective handling of the acoustic content enriches user engagement. We refer to Extended Audio Reality (EAR) as the subset of extended reality operations related to the audio domain. In this thesis, we propose a parametric approach to EAR conceived to provide an effective and intuitive framework for the implementation of EAR applications. The main challenges of EAR concern the processing of real sound fields and the rendering of virtual acoustic sources (VSs); hence, EAR requires the development of properly designed sound field representations.
As far as sound field representation is concerned, two main paradigms exist in the literature: parametric and non-parametric. In the context of EAR, parametric models represent an appealing approach, since they provide a compressed and intuitive description of the sound field. This characteristic promotes the integration of VSs through the parameters of the model and their manipulation.
Here, we introduce a novel parametric model for sound field representation based on a small set of parameters that allows both the navigation and the manipulation of a recorded sound scene. The main feature of the proposed solution is the modeling of the acoustic source directivity, integrated among the parameters of the representation. The directivity is a function describing the spatial characteristics of the source sound radiation. As a matter of fact, sound sources typically present a directional acoustic emission imposed by their physical characteristics; it follows that our perception of an acoustic scene is influenced by the source directivity. Therefore, the integration of directivity is a fundamental aspect for providing a more natural and immersive EAR, enhancing the user experience. In order to analyze the sound field, we adopt spatially distributed acoustic sensors. This configuration allows us to evaluate the acoustic field from different observation points in order to estimate the parameters required by the proposed representation. Subsequently, we exploit the estimated parameters to provide a sound field reconstruction technique that can be used for six-degrees-of-freedom interaction (virtual navigation) with the sound field.
Conveniently, the parameters adopted for describing the acoustic sources can also be exploited for characterizing a VS. Therefore, we can seamlessly implement EAR within the same parametric representation. Here, the inclusion of the source directivity in the model is appealing, since it allows the accurate rendering of VSs including their directional characteristics. Hence, we can take the real-virtual interaction further by implementing VS replicas of actual acoustic sources. A VS replica mimics the spatial sound radiation of the source through the VS directivity parameters. For instance, the VS parameters can be estimated from measurements on the real source. Conversely, we can rely on fully simulated acoustic sources, e.g., by means of Finite Element Method (FEM) simulations, from which the VS parameters are derived. It follows that accurate estimation, prediction, and analysis of the directivity of VSs is fundamental in order to obtain an effective EAR.
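To make the idea of rendering a VS from a compact parameter set concrete, the following is a minimal free-field sketch, not the thesis implementation: the function name and the first-order (cardioid-family) directivity model are illustrative assumptions, standing in for the richer directivity description of the actual parametric model.

```python
import numpy as np

def render_virtual_source(signal, fs, src_pos, src_axis, listener_pos,
                          directivity_order=1, c=343.0):
    """Render a directive point source at a listener position.

    Free-field toy model: the listener receives the source signal scaled
    by a first-order (cardioid-family) directivity gain and 1/(4*pi*r)
    spherical spreading, delayed by the propagation time r/c.
    """
    r_vec = np.asarray(listener_pos, float) - np.asarray(src_pos, float)
    r = np.linalg.norm(r_vec)
    # Angle between the source main axis and the source-listener direction.
    cos_theta = np.dot(r_vec / r, src_axis / np.linalg.norm(src_axis))
    # Illustrative first-order directivity gain: cardioid 0.5 + 0.5*cos(theta).
    gain = (0.5 + 0.5 * cos_theta) ** directivity_order
    delay = int(round(r / c * fs))          # propagation delay in samples
    out = np.zeros(len(signal) + delay)
    out[delay:] = gain / (4.0 * np.pi * r) * signal
    return out
```

Because both position and directivity are plain parameters, navigating the scene or rotating the source reduces to re-evaluating this function with new values, which is the appeal of the parametric representation.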
In this thesis, we studied the VS implementation through a case study, focusing on the VS implementation of violins. Since violins present a peculiar directional radiation characteristic, we need to carefully analyze and model their directivity in order to provide an accurate VS implementation. Regarding the analysis of the violin directivity, we can outline different solutions according to their invasiveness. In the first place, one can perform measurements directly on a played violin. During our collaboration with the Museo del Violino in Cremona (Italy), we had the unique opportunity to measure, for the first time, a relevant number of valuable historical violins made by renowned old masters such as Antonio Stradivari and played by professional violinists. From the acquired data, we derived a compressed representation of the violin directivity pattern based on the spherical harmonics expansion. Besides the VS modeling, the adopted representation allowed us to study and characterize the directivity patterns of the instruments, giving insight into their directional behavior. Although the measurement of played instruments provides an analysis scenario closer to the actual listening conditions, it might not be applicable to particularly fragile instruments.
Less invasive techniques, such as nearfield acoustic holography (NAH), can be adopted when conventional measurements cannot be carried out. It is known that the acoustic radiation of a vibrating object, such as the violin, is determined by its dynamical behavior. Hence, from the knowledge of the vibration velocity field we can estimate the directivity of the source. NAH allows the contactless estimation of the velocity field of a vibrating source from the acoustic pressure measured in its proximity. Here, we introduced a novel NAH technique based on deep learning. In particular, we proposed a convolutional neural network (CNN) with an autoencoder structure in order to estimate the velocity field of both rectangular and violin plates.
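To illustrate the autoencoder idea only, the sketch below shows the forward pass of a toy convolutional encoder-decoder in plain NumPy: the encoder downsamples a hologram pressure map into a compact representation, and the decoder upsamples it back to a velocity map of the same size. The architecture, kernel sizes, and absence of training are all simplifying assumptions; the actual network in the thesis is a learned CNN, not this toy.

```python
import numpy as np
from scipy.signal import convolve2d

def conv_autoencoder_forward(pressure, enc_kernels, dec_kernels):
    """Forward pass of a toy convolutional autoencoder (no training).

    pressure: 2D array of hologram pressure values.
    enc_kernels / dec_kernels: lists of 2D convolution kernels.
    Encoder: conv -> ReLU -> 2x2 downsampling; the decoder mirrors it
    with nearest-neighbour upsampling, mapping pressure to a velocity map.
    """
    x = np.asarray(pressure, float)
    for k in enc_kernels:                                    # encoder
        x = np.maximum(convolve2d(x, k, mode='same'), 0.0)   # conv + ReLU
        x = x[::2, ::2]                                      # 2x2 downsampling
    for k in dec_kernels:                                    # decoder
        x = np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)    # upsampling
        x = np.maximum(convolve2d(x, k, mode='same'), 0.0)
    return x
```

In the learned version, the kernels are optimized so that the decoder output matches reference velocity fields, so pressure-to-velocity estimation becomes an image-to-image regression.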
Alternatively, simulations allow us to predict the directivity of a source relying on the FEM simulation of its vibroacoustic behavior. This approach minimizes the invasiveness at the cost of reduced accuracy, due to the inherent approximations of the simulated model. It follows that, in order to effectively simulate a violin, a 3D model of the instrument geometry and its mechanical parameters are required. Unfortunately, for existing instruments we can typically acquire only their outer surface. Therefore, we developed a practical technique for the reconstruction of the 3D model of violin plates, starting from outer surface scans and sparse thickness measurements taken at reference points. Furthermore, as regards the estimation of the mechanical parameters of the material, we proposed the evaluation of the Young's modulus from the sound wave velocity of wood. As a matter of fact, the Young's modulus is a fundamental parameter for mechanical simulations. In particular, the developed technique estimates the sound wave velocity from the responses of the wood to an impulsive excitation in a rake receiver fashion. Then, from the knowledge of the sound wave velocity, the Young's modulus is indirectly derived.
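The indirect derivation can be sketched as follows, under a simplifying 1D longitudinal-wave assumption: estimate the time of flight between two sensing points from the cross-correlation of their impulse responses, obtain the wave velocity c = d / t, and then use E = rho * c^2. The function name, sensor spacing, and density value are illustrative assumptions, not the thesis setup.

```python
import numpy as np

def estimate_youngs_modulus(sig_a, sig_b, fs, distance, density):
    """Indirect Young's modulus estimate from the wood sound-wave velocity.

    sig_a, sig_b: responses to the same impulsive excitation at two
    points separated by `distance` metres along the grain.
    The inter-sensor delay is found as the cross-correlation peak
    (time of flight), giving the wave velocity c = distance / delay;
    the 1D longitudinal-wave approximation then yields E = density * c**2.
    """
    corr = np.correlate(sig_b, sig_a, mode='full')
    lag = np.argmax(corr) - (len(sig_a) - 1)   # delay of sig_b w.r.t. sig_a
    tof = lag / fs                             # time of flight [s]
    c = distance / tof                         # wave velocity [m/s]
    return density * c ** 2                    # Young's modulus [Pa]
```

For instance, a delay of 50 samples at fs = 1 MHz over 0.25 m gives c = 5000 m/s; with a density of 450 kg/m^3 (typical for spruce) this yields E = 11.25 GPa, a plausible along-grain value.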
Lastly, we propose an EAR proof of concept through which we showcase the vision of the proposed parametric approach to EAR. We present an EAR scenario in which two VSs, a VS replica of a prestigious violin and a simulated generic model of the instrument, are virtually co-located in a real sound scene in the presence of actual sound sources. The results give a sneak peek of the power of EAR, showing that the proposed parametric approach is able to provide interaction between real and virtual sound elements. Hence, we envision that the proposed solutions will pave the way to the development of parametric EAR frameworks for extended reality applications.