CSQUiD: an index and non-probability framework for constrained skyline query processing over uncertain data

Uncertainty of data, the degree to which data are inaccurate, imprecise, untrusted, and undetermined, is inherent in many contemporary database applications, and numerous research endeavours have been devoted to efficiently answering skyline queries over uncertain data. The literature discussed two dif...

Bibliographic details
Main authors: Lawal, Ma'aruf Mohammed; Ibrahim, Hamidah; Mohd Sani, Nor Fazlida; Yaakob, Razali; Alwan, Ali A.
Format: Article
Language: English
Published: PeerJ 2024
Online access: http://psasir.upm.edu.my/id/eprint/114879/1/114879.pdf
http://psasir.upm.edu.my/id/eprint/114879/
https://peerj.com/articles/cs-2225/
id my.upm.eprints.114879
record_format eprints
spelling my.upm.eprints.114879 2025-02-06T07:52:01Z http://psasir.upm.edu.my/id/eprint/114879/ PeerJ 2024-09-16 Article PeerReviewed text en cc_by_4 http://psasir.upm.edu.my/id/eprint/114879/1/114879.pdf Lawal, Ma'aruf Mohammed and Ibrahim, Hamidah and Mohd Sani, Nor Fazlida and Yaakob, Razali and Alwan, Ali A. (2024) CSQUiD: an index and non-probability framework for constrained skyline query processing over uncertain data. PeerJ Computer Science, 10. art. no. e2225. pp. 1-34. ISSN 2376-5992 https://peerj.com/articles/cs-2225/ 10.7717/PEERJ-CS.2225
institution Universiti Putra Malaysia
building UPM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Putra Malaysia
content_source UPM Institutional Repository
url_provider http://psasir.upm.edu.my/
language English
description Uncertainty of data, the degree to which data are inaccurate, imprecise, untrusted, and undetermined, is inherent in many contemporary database applications, and numerous research endeavours have been devoted to efficiently answering skyline queries over uncertain data. The literature discusses two different methods for handling data uncertainty in which objects have continuous range values. The first method employs a probability-based approach, while the second assumes that the uncertain values are represented by their median values. Nevertheless, neither of these methods seems suitable for modern high-dimensional uncertain databases: the first requires intensive probability calculations, while the second is impractical. Therefore, this work introduces an index-based, non-probability framework named Constrained Skyline Query processing on Uncertain Data (CSQUiD), which aims to reduce the computational time of processing constrained skyline queries over uncertain high-dimensional data. Given a collection of objects with uncertain data, the CSQUiD framework constructs minimum bounding rectangles (MBRs) by employing the X-tree indexing structure. Instead of scanning the whole collection of objects, only objects within the dominant MBRs are analyzed in determining the final skylines. In addition, CSQUiD makes use of a fuzzification approach in which the exact value of each continuous range value of the objects in those dominant MBRs is identified. The proposed CSQUiD framework is validated on real and synthetic data sets through extensive experimentation.
Based on the performance analysis conducted by varying the size of the constrained query, the CSQUiD framework outperformed the most recent methods (the CIS algorithm and the SkyQUD-T framework), with an average improvement of 44.07% and 57.15%, respectively, with regard to the number of pairwise comparisons, while the average improvement in CPU processing time over CIS and SkyQUD-T stood at 27.17% and 18.62%, respectively.
format Article
author Lawal, Ma'aruf Mohammed
Ibrahim, Hamidah
Mohd Sani, Nor Fazlida
Yaakob, Razali
Alwan, Ali A.
title CSQUiD: an index and non-probability framework for constrained skyline query processing over uncertain data
publisher PeerJ
publishDate 2024