Gene selection and classification in autism gene expression data

Also available in printed version

Bibliographic Details
Main Author: Abubakar Yusuf
Other Authors: Rohayanti Hassan
Format: Master's thesis
Language: English
Published: Universiti Teknologi Malaysia 2025
Subjects: Computing
Online Access: https://utmik.utm.my/handle/123456789/57774
_version_ 1854975045755994112
author Abubakar Yusuf
author2 Rohayanti Hassan
author_facet Rohayanti Hassan
Abubakar Yusuf
author_sort Abubakar Yusuf
description Also available in printed version
format Master's thesis
id utm-123456789-57774
institution Universiti Teknologi Malaysia
language English
publishDate 2025
publisher Universiti Teknologi Malaysia
record_format dspace
record_pdf Abstract
spelling utm-123456789-57774 2025-08-20T17:59:58Z Gene selection and classification in autism gene expression data Abubakar Yusuf Rohayanti Hassan Computing Also available in printed version Autism spectrum disorders (ASD) are neurodevelopmental disorders that are currently diagnosed on the basis of abnormal stereotyped behaviour and observable deficits in communication and social functioning. Although a variety of candidate genes have been linked to the disorder, no single gene accounts for more than 1% of the general ASD population. Despite extensive efforts, definitive genes that contribute to autism susceptibility have yet to be identified. The major problems in working with autism gene expression data are the limited number of samples and the high level of noise caused by experimental measurement errors and natural variation. In this study, a systematic combination of three filters, namely t-test (TT), Wilcoxon Rank Sum (WRS) and Feature Correlation (COR), is applied along with an efficient wrapper algorithm based on geometric binary particle swarm optimization-support vector machine (GBPSO-SVM), with the aim of selecting and classifying the genes most strongly associated with autism. A new approach based on median ratio, mean ratio and variance deviation criteria is also applied to reduce the initial dataset before it enters the selection process. The results showed that one gene, CAPS2, recurred among the most discriminative genes identified in both the first and last selection steps, and it was designated the leading ASD risk gene. Fusing the gene subsets selected by the GBPSO-SVM algorithm increased the classification accuracy to about 92.10%, which is higher than the accuracies reported in the literature for the same autism dataset. Notably, an ensemble using random forest (RF) also performed better than previous studies.
However, the ensemble approach that used SVM as the integrator of the fused genes from the output branches of GBPSO-SVM outperformed the RF integrator. The overall improvement was ascribed to the selection strategies used to reduce the dataset and to the use of the efficient wrapper-based GBPSO-SVM algorithm. shafika UTM 114 p. Thesis (Master of Science (Computer Science)) - Universiti Teknologi Malaysia 2025-03-17T04:45:46Z 2025-03-17T04:45:46Z 2017 Master's thesis https://utmik.utm.my/handle/123456789/57774 vital:132618 valet-20200309-103658 ENG Closed Access UTM Complete Completion Unpublished application/pdf Universiti Teknologi Malaysia
spellingShingle Computing
Abubakar Yusuf
Gene selection and classification in autism gene expression data
thesis_level Master
title Gene selection and classification in autism gene expression data
title_full Gene selection and classification in autism gene expression data
title_fullStr Gene selection and classification in autism gene expression data
title_full_unstemmed Gene selection and classification in autism gene expression data
title_short Gene selection and classification in autism gene expression data
title_sort gene selection and classification in autism gene expression data
topic Computing
url https://utmik.utm.my/handle/123456789/57774
work_keys_str_mv AT abubakaryusuf geneselectionandclassificationinautismgeneexpressiondata