Document details

Modelling and segmentation of the vocal tract during speech production by using...

Author(s): Vasconcelos, Maria João ; Ventura, Sandra Moreira Rua ; Freitas, Diamantino Rui ; Tavares, João Manuel

Date: 2010

Persistent ID: http://hdl.handle.net/10400.22/3528

Origin: Repositório Científico do Instituto Politécnico do Porto

Subject(s): Modelling and simulation; Biomedical engineering systems for future diagnosis & therapy; Image analysis

Description
The first and second authors would like to thank the support of the PhD grants with references SFRH/BD/28817/2006 and SFRH/PROTEC/49517/2009, respectively, from Fundação para a Ciência e Tecnologia (FCT). This work was partially done in the scope of the project “Methodologies to Analyze Organs from Complex Medical Images – Applications to Female Pelvic Cavity”, with reference PTDC/EEA-CRO/103320/2008, financially supported by FCT.
Since ancient times, speech production has attracted particular interest, with the aim of reaching a deeper understanding of the mechanisms involved by considering both morphological and acoustic aspects. The central anatomical aspects and the physiology of the human vocal tract are common to all individuals. However, speech production is an exceptionally complex and individualistic process. Therefore, modelling the mechanisms involved in speech production requires enough flexibility to account for individual variations accurately. In this work, the shape of the vocal tract in the articulation of some European Portuguese (EP) sounds is evaluated by using deformable models applied to Magnetic Resonance (MR) images. Additionally, the deformable models built are afterwards used to automatically segment the modelled vocal tract in MR images. Of the imaging modalities that have been considered for studying the vocal tract shape and articulators, Magnetic Resonance Imaging (MRI) has been the most commonly accepted. MRI allows the study of the entire human vocal tract and, in addition, offers good soft-tissue quality and resolution and uses non-ionizing radiation, which are key advantages. The deformable model used, commonly known as Point Distribution Model (PDM), was built from a set of training images acquired during artificially sustained articulations of 21 EP sounds.
In brief, PDMs are obtained by a statistical analysis of the co-ordinates of landmark points that represent the shape to be modelled: after aligning the training shapes, a Principal Component Analysis is performed in order to obtain the model mean shape and the modes of variation relative to this mean shape. By combining the geometrical information of the PDM with the grey levels of the landmark points used in its building, one can build the Active Shape Models (ASM) and the Active Appearance Models (AAM). With these enhanced models it is possible to segment the modelled shape in new images in a fully automated way. From the experimental results obtained in this work, one may conclude that the PDM built could efficiently characterize the behaviour of the vocal tract shape during the production of the EP sounds studied with MRI. Furthermore, one can verify that the ASM and the AAM built could be used to segment the modelled vocal tract in MR images successfully. Therefore, the deformable models built should be considered for the efficient and automatic study of the vocal tract during speech production with MRI, in particular for enhanced speech production simulation and speech rehabilitation therapies.
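The PDM construction outlined above (align the training shapes, then apply Principal Component Analysis to the landmark co-ordinates) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes 2-D landmarks, a similarity (Procrustes) alignment, and NumPy, and the function names `normalise`, `align_to`, `build_pdm` and `synthesise` are hypothetical.

```python
import numpy as np


def normalise(shape):
    """Remove translation and scale from one shape (n_points x 2)."""
    centred = shape - shape.mean(axis=0)
    return centred / np.linalg.norm(centred)


def align_to(shape, ref):
    """Rotate a normalised shape onto a reference (orthogonal Procrustes)."""
    u, _, vt = np.linalg.svd(shape.T @ ref)
    if np.linalg.det(u @ vt) < 0:  # forbid reflections
        u[:, -1] *= -1
    return shape @ (u @ vt)


def build_pdm(shapes, n_iter=5):
    """Return mean shape vector, modes of variation (rows) and their variances.

    Assumption: every training shape has the same landmarks in the same order.
    """
    shapes = np.array([normalise(s) for s in shapes])
    mean = shapes[0]
    for _ in range(n_iter):  # iterative alignment to the evolving mean
        shapes = np.array([align_to(s, mean) for s in shapes])
        mean = normalise(shapes.mean(axis=0))
    x = shapes.reshape(len(shapes), -1)  # one row per training shape
    mean_vec = x.mean(axis=0)
    # PCA via SVD of the centred data matrix
    _, s, vt = np.linalg.svd(x - mean_vec, full_matrices=False)
    variances = s ** 2 / (len(shapes) - 1)
    return mean_vec, vt, variances


def synthesise(mean_vec, modes, b):
    """Generate a shape from the mean plus weights b on the first len(b) modes."""
    b = np.asarray(b, dtype=float)
    return (mean_vec + b @ modes[: len(b)]).reshape(-1, 2)
```

Calling `synthesise` with a weight vector of zeros returns the mean shape; varying one weight at a time shows how the corresponding mode deforms the modelled contour.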

Document Type: Conference Object
Language: English

Related documents

Speech articulation assessment using dynamic magnetic resonance imaging techniques

Analysis of tongue shape and motion in speech production using statistical modeling