Efficient management of Top-k queries over Uncertain Data Streams with dynamic Sliding Window Model

Today, the advancement of information technology has led to a growing need for continuous processing of significant events, such as enhanced methods for monitoring road speed and mobile computing. The Uncertain Data Stream (UDS) utilized for query processing can provide challenges in many technic...

Full description

Bibliographic details
Main author: Raja Wahab, Raja Azhan Syah
Format: Thesis
Language: English
Published: 2024
Subjects:
Read online: http://psasir.upm.edu.my/id/eprint/120035/1/120035.pdf
author Raja Wahab, Raja Azhan Syah
description Advances in information technology have created a growing need for continuous processing of significant events, for example in road-speed monitoring and mobile computing. The Uncertain Data Stream (UDS) used for query processing poses challenges in many technical contexts owing to its inherent inconsistency, ambiguity, and delay in interpreting information. The large volume of data generated and its frequent changes over short periods make conventional processing methods insufficient. The main issues are minimizing redundant scans of the whole data set, improving uncertainty computation, and processing only the most recent tuples. In UDS, the number of possible world instances grows exponentially, so answering Top-k queries in the shortest possible time is extremely challenging. More studies are needed on UDS under the Sliding Window Model (SWM). Inefficient approaches to processing continuous queries on UDS over the SWM increase the complexity of the semantic trade-off between answers with maximum probability and high-scoring result sets. Current research on uncertainty centers on specifically tailored algorithms that can operate in the presence of value uncertainty using both count-based and time-based approaches. This study proposes a framework for processing Top-k queries in UDS that leverages the efficiency of the SWM through the SWMTop-kDelta algorithm. After the model's rules and probability theory were established, a method was designed to support the Top-k processing algorithm over the SWM until the potential Top-k candidates expire. This study also gives an overview of an improved optimization method for tackling computational redundancy in SWM and Top-k query computation.
This method reduces computational cost by efficiently handling the insertion and exit policy for the appropriate candidate tuples within a specified window frame. The experiments compare the SWMTop-kDelta algorithm with two algorithms from previous research and two baseline algorithms to evaluate their effectiveness. The algorithm development combines the frameworks from Phases 1 to 3 and is evaluated on real and synthetic data sets. Efficiency is assessed by comparing the number of possible worlds and processing times. Each experiment was run three times and the mean value was recorded. SWMTop-kDelta performs consistently well as the data set size increases, regardless of the value of the parameter k. Even where the initial improvement is slight, performance can be improved consistently through adjustments such as increasing the number of window segmentations, decreasing the window size, reducing the number of queries, and adjusting the probability threshold (d) more frequently. The algorithm demonstrates a significant improvement of 30%–90% compared to other methods, owing to its consistent performance and strong scalability. This study makes a valuable contribution to the field of Top-k computational query processing.
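The abstract's two central ideas, possible-world semantics and a count-based sliding window, can be illustrated with a brute-force sketch. This is not the SWMTop-kDelta algorithm, whose details the record does not give. It is a naive baseline under stated assumptions: each tuple is a hypothetical `(id, score, existence_probability)` triple, tuples exist independently, and the names `topk_probabilities`, `sliding_topk`, and `threshold` are illustrative only. It shows why enumeration is costly: a window of n tuples has 2**n possible worlds.

```python
from collections import deque
from itertools import product


def topk_probabilities(window, k):
    """Return {tuple_id: P(tuple is among the k highest scores)}.

    Brute force: enumerates all 2**n possible worlds of the window,
    assuming each tuple exists independently with its own probability.
    """
    items = list(window)
    probs = {tid: 0.0 for tid, _, _ in items}
    for world in product([False, True], repeat=len(items)):
        world_prob = 1.0
        present = []
        for exists, (tid, score, p) in zip(world, items):
            world_prob *= p if exists else (1.0 - p)
            if exists:
                present.append((score, tid))
        # Credit this world's probability to its k highest-scoring tuples.
        for _, tid in sorted(present, reverse=True)[:k]:
            probs[tid] += world_prob
    return probs


def sliding_topk(stream, window_size, k, threshold):
    """Re-evaluate the Top-k answer after every arrival.

    Count-based window: deque(maxlen=...) drops the oldest tuple
    automatically when a new one is appended (a simple exit policy).
    """
    window = deque(maxlen=window_size)
    results = []
    for item in stream:
        window.append(item)
        probs = topk_probabilities(window, k)
        results.append(sorted(t for t, pr in probs.items() if pr >= threshold))
    return results


if __name__ == "__main__":
    stream = [("a", 90, 0.5), ("b", 80, 1.0), ("c", 70, 0.8)]
    for step, answer in enumerate(sliding_topk(stream, 2, 1, 0.4), start=1):
        print(step, answer)
```

Because every arrival triggers a full re-enumeration, the cost grows exponentially with the window size. That redundancy is the kind of work the abstract says an efficient insertion and exit policy is meant to avoid.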
format Thesis
id oai:psasir.upm.edu.my:120035
institution Universiti Putra Malaysia
language English
publishDate 2024
record_format eprints
spelling oai:psasir.upm.edu.my:120035 2025-10-09T08:29:10Z http://psasir.upm.edu.my/id/eprint/120035/ 2024-07 Thesis NonPeerReviewed text en http://psasir.upm.edu.my/id/eprint/120035/1/120035.pdf Raja Wahab, Raja Azhan Syah (2024) Efficient management of Top-k queries over Uncertain Data Streams with dynamic Sliding Window Model. Doctoral thesis, Universiti Putra Malaysia. http://ethesis.upm.edu.my/id/eprint/18498
title Efficient management of Top-k queries over Uncertain Data Streams with dynamic Sliding Window Model
topic Data streams (Computer science)
Query processing (Computer science)
Uncertainty (Information theory)
url http://psasir.upm.edu.my/id/eprint/120035/1/120035.pdf
url-record http://psasir.upm.edu.my/id/eprint/120035/
http://ethesis.upm.edu.my/id/eprint/18498