An enhanced Malay named entity recognition using clustering and classification approach for crime textual data analysis

Bibliographic Details
Main Author: Salleh, Muhammad Sharilazlan
Format: Thesis
Language: English
Published: 2018
Subjects: Q Science (General); QA76 Computer software
Online Access: http://eprints.utem.edu.my/id/eprint/23326/1/An%20Enhanced%20Malay%20Named%20Entity%20Recognition%20Using%20Clustering%20and%20Classification%20Approach%20For%20Crime%20Textual%20Data%20Analysis.pdf
http://eprints.utem.edu.my/id/eprint/23326/2/An%20enhanced%20malay%20named%20entity%20recognition%20using%20clustering%20and%20classification%20approach%20for%20crime%20textual%20data%20analysis.pdf
author Salleh, Muhammad Sharilazlan
description Named Entity Recognition (NER) is one of the tasks in information extraction. NER extracts and classifies words or entities that belong to the proper noun category in text data, such as person names, locations, organizations and dates. Today, social media and other online sources such as web pages, blogs, Facebook, Twitter, Instagram and online newspapers are among the major contributors to information extraction. These resources contain various types of unstructured data, such as text. However, little work has been done on processing this type of data for Malay Named Entity Recognition (MNER). This deficiency in Malay textual analytics has led to difficulties in extracting information for decision making. This research presents an MNER technique that focuses on crime data analysis in the Malay language, using data extracted from the Polis Diraja Malaysia (PDRM) news web page. The proposed technique uses multi-stage clustering and classification methods, namely Fuzzy C-Means and the K-Nearest Neighbors algorithm. The methods involve multi-layer feature extraction to recognize entities such as person name, location, organization, date and crime type. The multi-stage technique obtained 95.24% accuracy in recognizing named entities for text analysis, particularly in Malay. The proposed technique can improve the accuracy of named entity recognition on crime data by selecting features suited to the Malay language.
format Thesis
id oai:eprints.utem.edu.my:23326
institution Universiti Teknikal Malaysia Melaka
language English
publishDate 2018
record_format eprints
spelling oai:eprints.utem.edu.my:23326 2022-04-20T12:25:25Z http://eprints.utem.edu.my/id/eprint/23326/ Salleh, Muhammad Sharilazlan (2018) An enhanced Malay named entity recognition using clustering and classification approach for crime textual data analysis. Masters thesis, Universiti Teknikal Malaysia Melaka. NonPeerReviewed https://plh.utem.edu.my/cgi-bin/koha/opac-detail.pl?biblionumber=112736
title An enhanced Malay named entity recognition using clustering and classification approach for crime textual data analysis
topic Q Science (General)
QA76 Computer software
url http://eprints.utem.edu.my/id/eprint/23326/1/An%20Enhanced%20Malay%20Named%20Entity%20Recognition%20Using%20Clustering%20and%20Classification%20Approach%20For%20Crime%20Textual%20Data%20Analysis.pdf
http://eprints.utem.edu.my/id/eprint/23326/2/An%20enhanced%20malay%20named%20entity%20recognition%20using%20clustering%20and%20classification%20approach%20for%20crime%20textual%20data%20analysis.pdf
url-record http://eprints.utem.edu.my/id/eprint/23326/
https://plh.utem.edu.my/cgi-bin/koha/opac-detail.pl?biblionumber=112736
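
The description above names a multi-stage pipeline that combines Fuzzy C-Means clustering with K-Nearest Neighbors classification. The record does not give the thesis's features, parameters or data, so the following is only a minimal sketch of how such a two-stage combination could look. The three token features, the toy values, the labels and every function name are hypothetical and chosen for illustration; this is not the author's implementation.

```python
import math
from collections import Counter


def memberships_for(point, centers, m=2.0):
    """Fuzzy membership of one point in each cluster (values sum to 1)."""
    dists = [math.dist(point, ctr) for ctr in centers]
    if 0.0 in dists:
        # Point sits exactly on one or more centers: share membership among them.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((dj / dk) ** power for dk in dists) for dj in dists]


def fuzzy_c_means(points, c, m=2.0, iters=50):
    """Stage 1: return c cluster centers found by Fuzzy C-Means."""
    # Deterministic start (an assumption): evenly spaced points as centers.
    step = max(1, len(points) // c)
    centers = [list(points[j * step]) for j in range(c)]
    dims = len(points[0])
    for _ in range(iters):
        u = [memberships_for(p, centers, m) for p in points]
        for j in range(c):
            weights = [row[j] ** m for row in u]
            total = sum(weights)
            centers[j] = [
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dims)
            ]
    return centers


def knn_predict(train_x, train_y, query, k=3):
    """Stage 2: majority label among the k nearest training vectors."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]


# Hypothetical toy features per token (illustrative values, not thesis data):
# [is capitalized, follows a personal title, follows the preposition "di"]
TRAIN_X = [
    [1.0, 1.0, 0.0], [1.0, 0.9, 0.1], [0.9, 1.0, 0.0],
    [1.0, 0.0, 1.0], [1.0, 0.1, 0.9], [0.9, 0.0, 1.0],
]
TRAIN_Y = ["PERSON", "PERSON", "PERSON", "LOCATION", "LOCATION", "LOCATION"]

CENTERS = fuzzy_c_means(TRAIN_X, c=2)


def augment(x):
    """Append the fuzzy cluster memberships to the raw feature vector."""
    return list(x) + memberships_for(x, CENTERS)


def classify(x, k=3):
    """Run both stages: cluster memberships as extra features, then k-NN."""
    train_aug = [augment(t) for t in TRAIN_X]
    return knn_predict(train_aug, TRAIN_Y, augment(x), k)


print(classify([1.0, 0.8, 0.2]))
```

Feeding the cluster memberships into the classifier as extra features is one common way to chain clustering and classification; whether the thesis chains the two stages this way is not stated in this record.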