Abstract
The top-k dominance (TKD) query is a popular preference query for incomplete data. It analyzes the dominance relationships among the objects in a dataset to reveal the k most valuable objects. The topic has been studied in depth: efficient query algorithms based on various pruning strategies have been proposed, as have optimized algorithms that use a distributed computing framework to process large-scale datasets. As data is updated ever more quickly, traditional TKD query algorithms designed for static data no longer suffice, and an efficient algorithm for dynamically updated datasets is needed. In this paper, we study the TKD query problem for dynamically updated incomplete datasets and propose a dynamic update parallel algorithm based on the MapReduce framework. The algorithm reuses the query results of historical datasets, which avoids re-analyzing the dominance relationships between historical objects, streamlines the computation, and reduces space usage. Experiments show that the dynamic update algorithm has clear advantages over the traditional algorithm.
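The idea of reusing historical results can be illustrated with a minimal single-machine sketch. It is not the paper's MapReduce algorithm, and it rests on assumptions the abstract does not state: missing values are represented as `None`, smaller values are preferred, an object dominates another if it is no worse on every dimension observed in both and strictly better on at least one of them, and an object's score is the number of objects it dominates. The names `dominates` and `IncrementalTKD` are hypothetical.

```python
from heapq import nlargest
from typing import Dict, List, Optional, Sequence, Tuple

Obj = Sequence[Optional[float]]  # None marks a missing value (assumption)


def dominates(a: Obj, b: Obj) -> bool:
    """Return True if a dominates b on the dimensions observed in both.

    Assumes smaller values are better. Dimensions where either value
    is missing are skipped.
    """
    strictly_better = False
    for x, y in zip(a, b):
        if x is None or y is None:
            continue
        if x > y:
            return False
        if x < y:
            strictly_better = True
    return strictly_better


class IncrementalTKD:
    """Keeps dominance scores across batches so old pairs are never re-compared."""

    def __init__(self) -> None:
        self.objects: Dict[str, Obj] = {}
        self.scores: Dict[str, int] = {}

    def insert_batch(self, batch: Dict[str, Obj]) -> None:
        # Assumes object ids are unique across all batches.
        for new_id, new_obj in batch.items():
            score = 0
            # Compare the new object only against objects already stored
            # (historical ones and earlier members of this batch).
            for old_id, old_obj in self.objects.items():
                if dominates(new_obj, old_obj):
                    score += 1
                if dominates(old_obj, new_obj):
                    self.scores[old_id] += 1
            self.objects[new_id] = new_obj
            self.scores[new_id] = score

    def top_k(self, k: int) -> List[Tuple[str, int]]:
        """Return the k (id, score) pairs with the highest scores."""
        return nlargest(k, self.scores.items(), key=lambda kv: kv[1])


tkd = IncrementalTKD()
tkd.insert_batch({"a": (1, None, 3), "b": (2, 2, None), "c": (None, 3, 4)})
tkd.insert_batch({"d": (0, 1, 1)})  # only d-versus-stored pairs are examined
print(tkd.top_k(2))
```

In this sketch the second batch triggers three comparisons (d against a, b, and c) instead of recomputing all six pairs, which is the kind of saving the abstract describes. A distributed version would have to partition these comparisons across workers, which is outside what this example shows.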
Supported by Shandong Provincial Natural Science Foundation (ZR201911150391).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wu, J.MT., Wang, K., Lin, J.CW. (2022). Top-k Dominating Queries on Incremental Datasets. In: Rage, U.K., Goyal, V., Reddy, P.K. (eds) Database Systems for Advanced Applications. DASFAA 2022 International Workshops. DASFAA 2022. Lecture Notes in Computer Science, vol 13248. Springer, Cham. https://doi.org/10.1007/978-3-031-11217-1_6
DOI: https://doi.org/10.1007/978-3-031-11217-1_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-11216-4
Online ISBN: 978-3-031-11217-1
eBook Packages: Computer Science (R0)