Optimization of extractive Automatic Text Summarization using Decomposition-based Multi-objective Differential Evolution and parallelization
The central challenge in Automatic Text Summarization (ATS) is efficiently generating machine-generated text summaries through optimization algorithms, a critical component for systems dealing with textual information processing. The current approach encounters a significant hurdle due to the lo...
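| Main author: | Hazmi Wahab, Muhammad Hafizul |
|---|---|
| Format: | Thesis |
| Language: | English |
| Published: | 2024 |
| Subjects: | Optimization algorithms (Computer science); Parallel processing (Computer science); Natural language processing (Computer science) |
| Online access: | http://psasir.upm.edu.my/id/eprint/120029/1/120029.pdf |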
| _version_ | 1846217895674118144 |
|---|---|
| author | Hazmi Wahab, Muhammad Hafizul |
| description | The central challenge in Automatic Text Summarization (ATS) is generating
summaries efficiently with optimization algorithms, a critical component of
systems that process textual information. The current approach is hampered by
long execution times, especially when complex optimization techniques are
combined with a computationally expensive ATS repair operator that must
repair multiple candidate solutions.
While the current approach yields strong Recall-Oriented Understudy for
Gisting Evaluation (ROUGE) scores for the generated summaries, it is
inefficient, mainly because of the substantial optimization time consumed by
the ATS repair operator scheme. To address this, a novel solution called
Decomposition-based Multi-objective Differential Evolution (MODE/D) is
proposed. It builds on Differential Evolution for Multi-objective
Optimization (DEMO) and the weighted sum (WS) method, coupled with a new ATS
repair operator scheme.
Experiments on the Document Understanding Conferences (DUC) datasets validate
MODE/D using ROUGE metrics. The outcomes are twofold: a reduction in serial
execution time and an improvement over existing techniques in the literature,
as evidenced by higher ROUGE-1, ROUGE-2, and ROUGE-L scores.
The multi-core variant of MODE/D explores an alternative computational
environment and is both stable and efficient when static loop scheduling is
employed. In a multi-core environment, parallel multi-core MODE/D attains a
speedup of 2 over the serial version of MODE/D, with the highest efficiency
peaking at 86.35% when employing 6 CPU cores. Additionally, when the input
size is tripled, parallel multi-core MODE/D achieves a speedup of 7.9 with
98.98% efficiency under static scheduling. This speedup comes with a slight
degradation in ROUGE-2. Nevertheless, the result underscores the robustness
and scalability of the proposed approach: it harnesses the computational
power of multiple cores while keeping summary quality metrics stable,
yielding 31 words per second (WPS), a 233.13% increase over its serial
counterpart for topic d061j in DUC2002.
Furthermore, two GPU variants of GMODE/D, variant I and variant II, are
implemented, each with unified and non-unified memory architectures. Variant
I performs sentence scoring at the outset of the accelerator region, while
variant II performs it within the accelerator region. GMODE/D variant I with
unified memory achieves a speedup of 18.17 over the serial variant when a
vector size of 256 is used with an NVIDIA Tesla V100 as the accelerator
device, raising WPS to 215.517. Despite a slight reduction in ROUGE scores,
it exhibits the most stable CV values among the serial, multi-core, and
many-core variants.
These advancements collectively bring optimization-based ATS approaches
closer to real-time applications where thousands of documents could be
involved, demonstrating the versatility and efficiency of the proposed
MODE/D algorithm across diverse computing architectures, including multi-core
and many-core environments. |
| format | Thesis |
| id | oai:psasir.upm.edu.my:120029 |
| institution | Universiti Putra Malaysia |
| language | English |
| publishDate | 2024 |
| record_format | eprints |
| spelling | oai:psasir.upm.edu.my:120029 2025-10-09T08:27:29Z http://psasir.upm.edu.my/id/eprint/120029/ 2024-09 Thesis NonPeerReviewed text en http://psasir.upm.edu.my/id/eprint/120029/1/120029.pdf Hazmi Wahab, Muhammad Hafizul (2024) Optimization of extractive Automatic Text Summarization using Decomposition-based Multi-objective Differential Evolution and parallelization. Doctoral thesis, Universiti Putra Malaysia. http://ethesis.upm.edu.my/id/eprint/18497 |
| title | Optimization of extractive Automatic Text Summarization using Decomposition-based Multi-objective Differential Evolution and parallelization |
| topic | Optimization algorithms (Computer science) Parallel processing (Computer science) Natural language processing (Computer science) |
| url | http://psasir.upm.edu.my/id/eprint/120029/1/120029.pdf |
| url-record | http://psasir.upm.edu.my/id/eprint/120029/ http://ethesis.upm.edu.my/id/eprint/18497 |
| work_keys_str_mv | AT hazmiwahabmuhammadhafizul optimizationofextractiveautomatictextsummarizationusingdecompositionbasedmultiobjectivedifferentialevolutionandparallelization |
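
The abstract in this record quotes speedup, efficiency, and words-per-second (WPS) figures without stating how they relate. The sketch below uses the textbook definitions (efficiency = speedup / number of cores; a p% increase means new = old × (1 + p/100)) to show how such figures can be cross-checked. It is an illustration under those assumptions, not the thesis's own method: the function names are hypothetical, and the core count and serial WPS it derives are inferred values that the abstract does not state.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Textbook parallel efficiency: speedup divided by core count (as a fraction)."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


def implied_cores(speedup: float, efficiency: float) -> float:
    """Core count implied by a reported speedup and efficiency (fraction in (0, 1])."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be a fraction in (0, 1]")
    return speedup / efficiency


def baseline_from_increase(value: float, percent_increase: float) -> float:
    """Baseline that 'value' represents a given percentage increase over."""
    return value / (1 + percent_increase / 100)


if __name__ == "__main__":
    # Figures quoted in the abstract: speedup 7.9 at 98.98% efficiency.
    # Under the textbook definition this implies roughly 8 cores (an inference).
    print(f"implied cores: {implied_cores(7.9, 0.9898):.2f}")

    # 31 WPS is reported as a 233.13% increase over the serial run,
    # which implies a serial throughput of roughly 9.3 WPS (an inference).
    print(f"implied serial WPS: {baseline_from_increase(31, 233.13):.2f}")
```

If the thesis defines efficiency or the percentage increase differently, the derived values above would not apply.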