Malay language vowel classification using audio image profile via deep learning for speech disorder rehabilitation assessment

Communication impairments can result from various medical conditions, such as speech problems, hearing loss, brain injuries, strokes, and physical disabilities. These conditions can affect verbal and non-verbal communication and may require specific rehabilitation and therapy. Currently, speech reha...

Bibliographic Details
Main Author: Ahmad Azhar, Nur Syahmina
Format: Thesis
Language: English
Published: 2024
Online Access: http://eprints.utem.edu.my/id/eprint/28630/
https://plh.utem.edu.my/cgi-bin/koha/opac-detail.pl?biblionumber=124339
description Communication impairments can result from various medical conditions, such as speech problems, hearing loss, brain injuries, strokes, and physical disabilities. These conditions can affect verbal and non-verbal communication and may require specific rehabilitation and therapy. Currently, speech rehabilitation and treatment are time-consuming and involve physical activity, with most facilities still performing the process manually. However, technological advances such as Artificial Intelligence (AI) have opened up innovative solutions in speech rehabilitation. AI studies have focused on speech classification for various human languages, with the potential to revolutionize speech rehabilitation and make it more accessible to individuals worldwide. As computer vision has influenced this field, machine learning and deep learning have been applied in the medical and healthcare industries to enhance rehabilitation. Convolutional Neural Network (CNN) models have been shown in many studies to classify objects and speech accurately. This research compared the classification accuracy of the proposed network model with that of VGG-Net, AlexNet, and Inception, and assessed each model's suitability for rehabilitation purposes. This thesis aims to develop a reliable, high-accuracy vowel classification system that can classify vowels spoken by a group of normal speakers, by a group of post-stroke patients with speech disorders, and by both groups combined, using two proposed image profiles: the Mel spectrogram and the Mel Frequency Cepstral Coefficients (MFCC).
According to the experimental results, the proposed network model, trained with a batch size of six, 20 epochs, and Adam as the optimizer, outperformed the existing comparative network models. The highest accuracies obtained with the Mel spectrogram and MFCC image profiles were 96.30% and 98.77%, respectively.
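For readers unfamiliar with the two image profiles named in the abstract, the following is a minimal NumPy-only sketch of how a log Mel spectrogram and MFCCs can be derived from an audio signal. It is not the thesis's preprocessing pipeline: this record does not state the sample rate, frame length, hop size, number of Mel bands, or number of coefficients, so every value below (16 kHz, 512-sample frames, hop of 128, 40 bands, 13 coefficients) is an assumption, the function names are hypothetical, and the input is a synthetic two-tone signal rather than recorded vowels.

```python
import numpy as np

def hz_to_mel(f):
    # Common HTK-style Hz -> mel conversion.
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters evenly spaced on the mel scale.
    Returns an array of shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(signal, sr, n_fft=512, hop=128, n_mels=40):
    """Log-power mel spectrogram in dB, shape (n_mels, n_frames)."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Index matrix that slices the signal into overlapping frames.
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T
    return 10.0 * np.log10(mel + 1e-10).T

def mfcc(log_mel, n_mfcc=13):
    """Unnormalised DCT-II over the mel axis, shape (n_mfcc, n_frames)."""
    n_mels = log_mel.shape[0]
    k = np.arange(n_mfcc)[:, None]
    n = np.arange(n_mels)[None, :]
    dct = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    return dct @ log_mel

# Synthetic stand-in for a vowel: two sinusoids (assumed values, not real speech).
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 700 * t) + 0.5 * np.sin(2 * np.pi * 1200 * t)

mel_img = log_mel_spectrogram(tone, sr)
mfcc_img = mfcc(mel_img)
print(mel_img.shape, mfcc_img.shape)  # (40, 122) (13, 122)
```

Each resulting 2-D array can be rendered or saved as an image and passed to an image classifier such as a CNN. In practice an audio library would normally replace these hand-written helpers; they are spelled out here only to make the two transforms explicit.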
id utem-28630
institution Universiti Teknikal Malaysia Melaka
record_format EPrints
record_pdf Restricted
spelling utem-28630 2025-04-03T09:44:05Z http://eprints.utem.edu.my/id/eprint/28630/ Malay language vowel classification using audio image profile via deep learning for speech disorder rehabilitation assessment Ahmad Azhar, Nur Syahmina 2024 Thesis NonPeerReviewed text en http://eprints.utem.edu.my/id/eprint/28630/1/Malay%20language%20vowel%20classification%20using%20audio%20image%20profile%20via%20deep%20learning%20for%20speech%20disorder%20rehabilitation%20assessment.pdf text en http://eprints.utem.edu.my/id/eprint/28630/2/Malay%20language%20vowel%20classification%20using%20audio%20image%20profile%20via%20deep%20learning%20for%20speech%20disorder%20rehabilitation%20assessment.pdf Ahmad Azhar, Nur Syahmina (2024) Malay language vowel classification using audio image profile via deep learning for speech disorder rehabilitation assessment. Masters thesis, Universiti Teknikal Malaysia Melaka. https://plh.utem.edu.my/cgi-bin/koha/opac-detail.pl?biblionumber=124339
thesis_level Master