An enhanced binary bat and Markov clustering algorithms to improve event detection for heterogeneous news text documents

Event Detection (ED) works on identifying events from various types of data. Building an ED model for news text documents greatly helps decision-makers in various disciplines in improving their strategies. However, identifying and summarizing events from such data is a non-trivial task due to the la...


Bibliographic Details
Main Author: Al-Dyani, Wafa Zubair Abdullah
Format: Thesis
Language: English
Published: 2022
Subjects: QA Mathematics
Online Access: https://etd.uum.edu.my/10228/1/s901775_01.pdf
https://etd.uum.edu.my/10228/
author Al-Dyani, Wafa Zubair Abdullah
description Event Detection (ED) identifies events in various types of data. Building an ED model for news text documents helps decision-makers in various disciplines improve their strategies. However, identifying and summarizing events from such data is a non-trivial task because of the large volume of published heterogeneous news text documents. Such documents create a high-dimensional feature space that degrades the overall performance of the baseline methods in the ED model. To address this problem, this research presents an enhanced ED model that includes improved methods for the crucial phases of the ED model: Feature Selection (FS), ED, and summarization. This work addresses the FS problem by automatically detecting events through a novel wrapper FS method based on the Adapted Binary Bat Algorithm (ABBA) and the Adapted Markov Clustering Algorithm (AMCL), termed ABBA-AMCL. These adaptive techniques were developed to overcome premature convergence in the Binary Bat Algorithm (BBA) and the fast convergence rate of Markov Clustering (MCL). Furthermore, this study proposes four summarization methods to generate informative summaries. The enhanced ED model was tested on 10 benchmark datasets and 2 Facebook news datasets. The effectiveness of ABBA-AMCL was compared with 8 FS methods based on meta-heuristic algorithms and 6 graph-based ED methods. The empirical and statistical results showed that ABBA-AMCL surpassed the other methods on most datasets. The key representative features demonstrated that the ABBA-AMCL method successfully detects real-world events from the Facebook news datasets, with a Precision of 0.96 and a Recall of 1 for dataset 11, and a Precision of 1 and a Recall of 0.76 for dataset 12. To conclude, the novel ABBA-AMCL presented in this research has successfully bridged the research gap and resolved the curse of dimensionality in the high-dimensional feature space of heterogeneous news text documents. Hence, the enhanced ED model can organize news documents into distinct events and provide policymakers with valuable information for decision making.
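For orientation, the baseline Markov Clustering (MCL) procedure that the abstract names can be sketched as below. This is a minimal illustration of standard MCL (alternating expansion and inflation on a column-stochastic matrix), not the thesis's adapted AMCL, whose details are not given in this record. The function names, the inflation value of 2.0, the thresholds, and the toy graph are all assumptions made for the example.

```python
def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain square-matrix product, sufficient for small illustrative graphs."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Standard MCL sketch; parameter defaults are illustrative assumptions."""
    n = len(adjacency)
    # Add self-loops, then normalize columns into transition probabilities.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        expanded = matmul(m, m)  # expansion: two-step random walk
        # Inflation: raise entries to a power and renormalize each column,
        # which strengthens strong flows and weakens weak ones.
        inflated = normalize_columns(
            [[value ** inflation for value in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each row that still carries flow lists the members of one cluster.
    clusters = {tuple(j for j, value in enumerate(row) if value > 1e-6)
                for row in m}
    clusters.discard(())
    return sorted(clusters)


# Toy graph (assumed for illustration): two disconnected triangles.
toy_graph = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(markov_cluster(toy_graph))  # [(0, 1, 2), (3, 4, 5)]
```

In an event-detection setting the graph nodes would typically be documents or terms and the edge weights a similarity score, but that mapping is an assumption here rather than something stated in the record.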
format Thesis
id oai:etd.uum.edu.my:10228
institution Universiti Utara Malaysia
language English
publishDate 2022
record_format EPrints
record_pdf Restricted
spelling oai:etd.uum.edu.my:10228 2025-08-24T06:32:43Z https://etd.uum.edu.my/10228/ 2022 Thesis NonPeerReviewed text en https://etd.uum.edu.my/10228/1/s901775_01.pdf Al-Dyani, Wafa Zubair Abdullah (2022) An enhanced binary bat and Markov clustering algorithms to improve event detection for heterogeneous news text documents. Doctoral thesis, Universiti Utara Malaysia.
thesis_level PhD
title An enhanced binary bat and Markov clustering algorithms to improve event detection for heterogeneous news text documents
topic QA Mathematics
url https://etd.uum.edu.my/10228/1/s901775_01.pdf
https://etd.uum.edu.my/10228/