Abstract
As many data mining applications involve networked data with dynamically increasing volumes, graph stream classification has recently extracted significant research interest. The aim of graph stream classification is to learn a discriminative model from a stream of graphs represented by sets of edges on a complex network. In this paper, we propose a fast graph stream classification method using DIscriminative Clique Hashing (DICH). The main idea is to employ a fast algorithm to decompose a compressed graph into a number of cliques to sequentially extract clique-patterns over the graph stream as features. Two random hashing schemes are employed to compress the original edge set of the graph stream and map the unlimitedly increasing clique-patterns onto a fixed-size feature space, respectively. The hashed cliques are used to update an “in-memory” fixed-size pattern-class table, which will be finally used to construct a rule-based classifier. DICH essentially speeds up the discriminative clique-pattern mining process and solves the unlimited clique-pattern expanding problem in graph stream mining. Experimental results on two real-world graph stream data sets demonstrate that DICH can clearly outperform the compared state-of-the-art method in both classification accuracy and training efficiency.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Aggarwal, C.C., Wang, H.: Managing and Mining Graph Data. Springer, New York (2010)
Kashima, H., Tsuda, K., Inokuchi, A.: Marginalized kernels between labeled graphs. In: ICML, pp. 321–328 (2003)
Borgwardt, K.M., Kriegel, H.P.: Shortest-path kernels on graphs. In: ICDM, pp. 74–81 (2005)
Mahé, P., Vert, J.P.: Graph kernels based on tree patterns for molecules. Machine Learning 75(1), 3–35 (2009)
Shervashidze, N., Borgwardt, K.: Fast subtree kernels on graphs. In: NIPS, pp. 1660–1668 (2009)
Feigenbaum, J., Kannan, S., McGregor, A., Suri, S., Zhang, J.: Graph distances in the data-stream model. SIAM Journal on Computing 38(5), 1709–1727 (2008)
Aggarwal, C.C., Zhao, Y., Yu, P.S.: On clustering graph streams. In: SDM, pp. 478–489 (2010)
Aggarwal, C.C., Li, Y., Yu, P.S., Jin, R.: On dense pattern mining in graph streams. In: PVLDB, pp. 975–984 (2010)
Aggarwal, C.C.: On classification of graph streams. In: SDM, pp. 652–663 (2011)
Wang, H., Fan, W., Yu, P.S., Han, J.: Mining concept-drifting data streams using ensemble classifiers. In: KDD, pp. 226–235 (2003)
Li, B., Zhu, X., Chi, L., Zhang, C.: Nested subtree hash kernels for large-scale graph classification over streams. In: ICDM, pp. 399–408 (2012)
Vishwanathan, S., Schraudolph, N.N., Kondor, R., Borgwardt, K.M.: Graph kernels. Journal of Machine Learning Research 11, 1201–1242 (2010)
Hido, S., Kashima, H.: A linear-time graph kernel. In: ICDM, pp. 179–188 (2009)
Yan, X., Han, J.: gSpan: Graph-based substructure pattern mining. In: ICDM, pp. 721–724 (2002)
Domingos, P., Hulten, G.: Mining high speed data streams. In: KDD, pp. 71–80 (2000)
Street, W.N., Kim, Y.: A streaming ensemble algorithm (SEA) for large-scale classification. In: KDD, pp. 377–382 (2001)
Soufiani, H.A., Airoldi, E.: Graphlet decomposition of a weighted network. Journal of Machine Learning Research – Proceedings Track 22, 54–63 (2012)
Bron, C., Kerbosch, J.: Algorithm 457: Finding all cliques of an undirected graph. Communications of the ACM 16(9), 575–577 (1973)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chi, L., Li, B., Zhu, X. (2013). Fast Graph Stream Classification Using Discriminative Clique Hashing. In: Pei, J., Tseng, V.S., Cao, L., Motoda, H., Xu, G. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2013. Lecture Notes in Computer Science(), vol 7818. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37453-1_19
Download citation
DOI: https://doi.org/10.1007/978-3-642-37453-1_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37452-4
Online ISBN: 978-3-642-37453-1
eBook Packages: Computer ScienceComputer Science (R0)