Malay language vowel classification using audio image profile via deep learning for speech disorder rehabilitation assessment

Bibliographic Details
Main Author:	Ahmad Azhar, Nur Syahmina
Format:	Thesis
Language:	English
Published:	2024
Online Access:	http://eprints.utem.edu.my/id/eprint/28630/ https://plh.utem.edu.my/cgi-bin/koha/opac-detail.pl?biblionumber=124339

author	Ahmad Azhar, Nur Syahmina
description	Communication impairments can result from various medical conditions, such as speech problems, hearing loss, brain injuries, strokes, and physical disabilities. These conditions can affect verbal and non-verbal communication and may require specific rehabilitation and therapy. Currently, speech rehabilitation and treatment are time-consuming and involve physical activity, and most facilities still perform the process manually. However, technological advancements such as Artificial Intelligence (AI) have opened up innovative solutions in speech rehabilitation. AI studies have focused on speech classification for various human languages, with the potential to revolutionize speech rehabilitation and make it more accessible to individuals worldwide. As computer vision has influenced this field, machine learning and deep learning have been applied in the medical and healthcare industries to enhance rehabilitation. Convolutional Neural Network (CNN) models have been shown in many studies to classify objects and speech accurately. This research compared the classification accuracy of a proposed network model with that of VGG-Net, AlexNet, and Inception, and performed a comparative analysis to assess each model's classification accuracy and suitability for rehabilitation purposes. This thesis aims to develop a reliable, high-accuracy vowel classification system that can classify vowels in the normal person group, the post-stroke patient group with speech disorders, and the combination of both groups, using two proposed image profiles: the Mel spectrogram and the Mel Frequency Cepstral Coefficients (MFCC).
According to the experimental results, the proposed network model, trained with a batch size of six, 20 epochs, and the Adam optimizer, outperformed the existing comparative network models. The highest accuracies obtained in the analyses were 96.30% for the Mel spectrogram image profile and 98.77% for the MFCC image profile.
format	Thesis
id	utem-28630
institution	Universiti Teknikal Malaysia Melaka
language	English
publishDate	2024
record_format	EPrints
record_pdf	Restricted
spelling	utem-28630 2025-04-03T09:44:05Z http://eprints.utem.edu.my/id/eprint/28630/ 2024 Thesis NonPeerReviewed text en http://eprints.utem.edu.my/id/eprint/28630/1/Malay%20language%20vowel%20classification%20using%20audio%20image%20profile%20via%20deep%20learning%20for%20speech%20disorder%20rehabilitation%20assessment.pdf text en http://eprints.utem.edu.my/id/eprint/28630/2/Malay%20language%20vowel%20classification%20using%20audio%20image%20profile%20via%20deep%20learning%20for%20speech%20disorder%20rehabilitation%20assessment.pdf Ahmad Azhar, Nur Syahmina (2024) Malay language vowel classification using audio image profile via deep learning for speech disorder rehabilitation assessment. Masters thesis, Universiti Teknikal Malaysia Melaka. https://plh.utem.edu.my/cgi-bin/koha/opac-detail.pl?biblionumber=124339
thesis_level	Master
title	Malay language vowel classification using audio image profile via deep learning for speech disorder rehabilitation assessment
url	http://eprints.utem.edu.my/id/eprint/28630/ https://plh.utem.edu.my/cgi-bin/koha/opac-detail.pl?biblionumber=124339
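
The description in this record names two audio image profiles, the Mel spectrogram and MFCC, but does not say how they were computed. As a rough illustration only, the sketch below shows a common textbook construction for a single audio frame: a power spectrum is passed through triangular filters spaced on the Mel scale, the log of each filter energy gives one Mel-spectrogram column, and a DCT-II of those log energies gives MFCCs. The sample rate, frame length, filter count, coefficient count, the synthetic test tone, and every function name here are assumptions chosen for the example; none of them come from the thesis.

```python
import math

def hz_to_mel(hz):
    # Widely used Mel-scale formula (an assumption; the thesis may use another variant).
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters evenly spaced on the Mel scale (rows: filters, columns: FFT bins)."""
    top = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge points, converted from Mel to fractional FFT bin positions.
    edges = [mel_to_hz(top * i / (n_filters + 1)) * n_fft / sample_rate
             for i in range(n_filters + 2)]
    bank = []
    for f in range(1, n_filters + 1):
        lo, mid, hi = edges[f - 1], edges[f], edges[f + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))   # rising slope
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def power_spectrum(frame):
    """Hann-windowed power spectrum via a direct DFT (slow, but dependency-free)."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / n))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2.0 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = sum(x * math.sin(2.0 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spectrum.append((re * re + im * im) / n)
    return spectrum

def log_mel_energies(frame, bank):
    """One column of a log-Mel spectrogram: log energy in each Mel filter."""
    spectrum = power_spectrum(frame)
    return [math.log(sum(w * p for w, p in zip(row, spectrum)) + 1e-10)
            for row in bank]

def mfcc(log_mel, n_coeffs):
    """Unnormalized DCT-II of the log-Mel energies; the first n_coeffs terms are the MFCCs."""
    m = len(log_mel)
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / m) for j, e in enumerate(log_mel))
            for c in range(n_coeffs)]

# Hypothetical settings and a synthetic 1 kHz tone standing in for a recorded vowel.
SAMPLE_RATE, N_FFT, N_FILTERS, N_COEFFS = 8000, 256, 20, 13
tone = [math.sin(2.0 * math.pi * 1000.0 * i / SAMPLE_RATE) for i in range(N_FFT)]
bank = mel_filterbank(N_FILTERS, N_FFT, SAMPLE_RATE)
log_mel = log_mel_energies(tone, bank)
coeffs = mfcc(log_mel, N_COEFFS)
peak_filter = max(range(N_FILTERS), key=lambda f: log_mel[f])
print("strongest Mel filter index:", peak_filter)
```

Stacking such columns over successive frames of a recording would give a two-dimensional array that can be rendered as an image, which is the general idea behind feeding audio to an image classifier such as a CNN. In practice an audio library with a fast FFT would normally replace the direct DFT used here.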

Similar Items