Efficient management of Top-k queries over Uncertain Data Streams with dynamic Sliding Window Model
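| Main Author: | Raja Wahab, Raja Azhan Syah |
|---|---|
| Format: | Thesis |
| Language: | English |
| Published: | 2024 |
| Subjects: | Data streams (Computer science); Query processing (Computer science); Uncertainty (Information theory) |
| Online Access: | http://psasir.upm.edu.my/id/eprint/120035/1/120035.pdf |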

| Field | Value |
|---|---|
| author | Raja Wahab, Raja Azhan Syah |
| description | The advancement of information technology has created a growing need
for continuous processing of significant events in applications such as road
speed monitoring and mobile computing. Query processing over an Uncertain Data
Stream (UDS) is challenging in many technical contexts owing to the inherent
inconsistency, ambiguity, and time delay in interpreting the information. The
large volume of data generated, and its frequent changes over short periods,
make conventional processing methods insufficient.
The main issues are minimizing redundant scans of the whole data set,
improving uncertainty computation, and processing only the most recent
tuples. In a UDS, the number of possible world instances grows exponentially,
so answering Top-k queries in the shortest possible time is extremely
challenging. More studies are also needed on UDS processing under the
Sliding Window Model (SWM). Inefficient processing of continuous queries on
UDS over the SWM increases the complexity of the semantic trade-off between
result sets with maximum probability and result sets with high scores. Current
research on tackling uncertainty revolves around tailored algorithms that
operate in the presence of value uncertainty using both count-based and
time-based approaches. This study aims to propose a framework for processing Top-k
queries in UDS, where the focus is on leveraging the efficiency of the SWM,
achieved through the SWMTop-kDelta algorithm. After establishing the
model's rules and probability theory, a method was designed to support the
Top-k processing algorithm over the SWM until the potential Top-k candidates
expire. This study also provides an overview of an improved optimization
method for tackling computational redundancy in SWM and Top-k query
computation. The method reduces computational cost by efficiently handling
the insertion and exit policy for the appropriate candidate tuples within
a specified window frame. The experiments compare the SWMTop-kDelta
algorithm with algorithms from two previous studies and two baseline
approaches to evaluate their effectiveness. The algorithm
development combines the frameworks from Phases 1 to 3, evaluating real
and synthetic datasets. It assesses efficiency by comparing the number of
possible worlds and processing times. Each experiment was run in
triplicate, and the mean of the three runs was recorded. SWMTop-kDelta
performs consistently well as the data set size increases and regardless of
the value of the parameter k. Even where the initial improvement is only
slight, performance can be improved consistently by certain adjustments,
such as increasing the number of window segmentations, decreasing the
window size, reducing the number of queries, and adjusting the probability
threshold (d) more frequently. The algorithm demonstrates a significant
improvement of 30%–90% over the other methods, owing to its consistent
performance and strong scalability. This study will make a valuable
contribution to the field of Top-k computational query processing. |
| format | Thesis |
| id | oai:psasir.upm.edu.my:120035 |
| institution | Universiti Putra Malaysia |
| language | English |
| publishDate | 2024 |
| record_format | eprints |
| spelling | oai:psasir.upm.edu.my:120035 2025-10-09T08:29:10Z http://psasir.upm.edu.my/id/eprint/120035/ 2024-07 Thesis NonPeerReviewed text en http://psasir.upm.edu.my/id/eprint/120035/1/120035.pdf Raja Wahab, Raja Azhan Syah (2024) Efficient management of Top-k queries over Uncertain Data Streams with dynamic Sliding Window Model. Doctoral thesis, Universiti Putra Malaysia. http://ethesis.upm.edu.my/id/eprint/18498 Data streams (Computer science) Query processing (Computer science) Uncertainty (Information theory) |
| title | Efficient management of Top-k queries over Uncertain Data Streams with dynamic Sliding Window Model |
| topic | Data streams (Computer science) Query processing (Computer science) Uncertainty (Information theory) |
| url | http://psasir.upm.edu.my/id/eprint/120035/1/120035.pdf |
| url-record | http://psasir.upm.edu.my/id/eprint/120035/ http://ethesis.upm.edu.my/id/eprint/18498 |
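
The abstract notes that the number of possible worlds grows exponentially and that answers are filtered by a probability threshold. The Python sketch below illustrates that baseline idea only; it is not the thesis's SWMTop-kDelta algorithm. It assumes tuple-level uncertainty with independent existence probabilities, a count-based window, and a threshold-style Top-k semantics. The function names `topk_probabilities` and `sliding_topk` and the `(id, score, probability)` tuple layout are hypothetical, chosen for illustration.

```python
from collections import deque
from itertools import product


def topk_probabilities(window, k):
    """Probability that each tuple is in the top-k, by brute force.

    Assumption: each item is (id, score, probability) and tuples exist
    independently. Every subset of the window is one possible world,
    so n tuples yield 2**n worlds.
    """
    tuples = list(window)
    result = {tid: 0.0 for tid, _, _ in tuples}
    for presence in product([False, True], repeat=len(tuples)):
        world_prob = 1.0
        world = []
        for (tid, score, p), present in zip(tuples, presence):
            world_prob *= p if present else 1.0 - p
            if present:
                world.append((score, tid))
        # Rank the tuples that exist in this world by score, highest first.
        world.sort(reverse=True)
        for _, tid in world[:k]:
            result[tid] += world_prob
    return result


def sliding_topk(stream, window_size, k, threshold):
    """Yield, after each arrival, the ids whose top-k probability
    reaches the threshold within a count-based sliding window."""
    window = deque(maxlen=window_size)  # the oldest tuple expires automatically
    for item in stream:
        window.append(item)
        probs = topk_probabilities(window, k)
        yield sorted(tid for tid, pr in probs.items() if pr >= threshold)


if __name__ == "__main__":
    # Hypothetical stream of (id, score, existence probability) tuples.
    stream = [("a", 10, 0.5), ("b", 8, 1.0), ("c", 5, 0.8)]
    for answer in sliding_topk(stream, window_size=2, k=1, threshold=0.4):
        print(answer)
```

Because every window of n tuples is expanded into 2**n worlds and recomputed from scratch on each arrival, this brute-force form is practical only for very small windows. That repeated work is the kind of redundancy the abstract says an insertion and exit policy is meant to reduce.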