Non-negative matrix factorization for semi-supervised data clustering

Chen, Yanhua; Rege, Manjeet; Dong, Ming; Hua, Jing

doi:10.1007/s10115-008-0134-6

Non-negative matrix factorization for semi-supervised data clustering

Regular Paper
Published: 28 March 2008

Volume 17, pages 355–379, (2008)
Cite this article

Knowledge and Information Systems Aims and scope Submit manuscript

888 Accesses
Explore all metrics

Abstract

Traditional clustering algorithms are inapplicable to many real-world problems where limited knowledge from domain experts is available. Incorporating the domain knowledge can guide a clustering algorithm, consequently improving the quality of clustering. In this paper, we propose SS-NMF: a semi-supervised non-negative matrix factorization framework for data clustering. In SS-NMF, users are able to provide supervision for clustering in terms of pairwise constraints on a few data objects specifying whether they “must” or “cannot” be clustered together. Through an iterative algorithm, we perform symmetric tri-factorization of the data similarity matrix to infer the clusters. Theoretically, we show the correctness and convergence of SS-NMF. Moveover, we show that SS-NMF provides a general framework for semi-supervised clustering. Existing approaches can be considered as special cases of it. Through extensive experiments conducted on publicly available datasets, we demonstrate the superior performance of SS-NMF for clustering.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

€32.70 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price includes VAT (France)

Instant access to the full article PDF.

Institutional subscriptions

References

Bansal N, Blum A, Chawla S (2002) Correlation clustering. Proceedings of the 43rd symposium on foundations of computer science, pp 238–247
Bar-Hillel A, Hertz T, Shental N, Weinshall D (2003) Learning distance functions using equivalence relations. Proceedings of the 20th international conference on machine learning, pp 11–18
Basu S, Banerjee A, Mooney RJ (2002) Semi-supervised clustering by seeding. Proceedings of the 19th international conference on machine learning, pp 27–34
Basu S, Bilenko M, Mooney RJ (2004) A probabilistic framework for semi-supervised clustering. Proceedings of the 10th ACM SIGKDD international conference on knowledge discovery and data mining, pp 59–68
Blum A, Mitchell TM (1998) Combining labeled and unlabeled data with co-training. Annual workshop on computational learning theory, Proceedings of the 11th annual conference on Computational learning theory, pp 92–100
Boley D (1998). Principal direction divisive partitioning. Data Mining Knowledge Discovery 2(4): 325–344
Article Google Scholar
Charikar M, Guruswami V, Wirth A (2003) Clustering with qualitative information. Proceedings of the 44th IEEE symposium on foundations of computer science, pp 524–533
Chung FRK (1997) Spectral Graph Theory, American Mathematical Society
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the em algorithm. J R Stat Soc B pp 1–38
Ding C, He X, Simon HD (2005) On the equivalence of nonnegative matrix factorization and spectral clustering. Proceedings of SIAM international conference on data mining, pp 606–610
Ding C, Li T, Peng W, Park H (2006) Orthogonal nonnegative matrix tri-factorizations for clustering. Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining, ACM Press, New York, pp 126–135
Dubnov S, EI-Yaniv R, Gdalyahu Y, Schneidman E, Tishby N and Yona G (2002). A new nonparametric pairwise clustering algorithm based on iterative estimation of distance profiles. Machine Learning 47(1): 35–61
Article MATH Google Scholar
Duda RO, Hart PE and Stork DG (2000). Pattern Classification. Wiley, New York
Google Scholar
Fiedler M (1973). Algebraic connectivity of graphs. Czechoslovak Math J 23: 298–305
MathSciNet Google Scholar
Fiedler M (1975a). Eigenvectors of acyclic matrices. Czechoslovak Math J 25: 607–618
MathSciNet Google Scholar
Fiedler M (1975b). A property of eigenvectors of nonnegative symmetric matrices and its application to graph theory. Czechoslovak Math J 25: 619–633
MathSciNet Google Scholar
Fisher D (1987). Knowledge acquisition via incremental conceptual clustering. Machine Learning 2: 139–172
Google Scholar
Godbole S, Harpale A, Sarawagi S, Chakrabarti S (2004) Document classification through interactive supervision of document and term labels. Proceedings of the 8th European conference on principles and practice of knowledge discovery in databases, pp 185–196
Gondek D and Hofmann T (2007). Non-redundant data clustering. Knowledge Information Systems 12(1): 1–24
Article Google Scholar
Hagen L and Kahng AB (1992). New spectral methods for ratio cut partitioning and clustering. IEEE Trans CAD Integrated Circuits Systems 11(9): 1074–1085
Article Google Scholar
Han E-H, Karypis G (2000) Centroid-based document classification: Analysis and experimental results. Proceedings of the 4th European conference on principles of data mining and knowledge discovery, pp 424–431
Hersh W, Buckley C, Leone T, Hickam D (1994) Ohsumed: an interactive retrival evaluation and new large test collection for research. Proceedings of 17th ACM SIGIR conference on research and development in information retrieval, pp 192–201
Hinneburg A and Keim D (2003). A general approach to clustering in large databases with noise. Knowledge Information Systems 5(4): 387–415
Article Google Scholar
Hinneburg A, Keim DA (1998) An efficient approach to clustering in large multimedia databases with noise. Proceedings of the 4th international conference on knowledge discovery and data mining, pp 58–65
Hotho A, Staab S, Stumme G (2003) Text clustering based on background knowledge, Technical report 425. University of Karlsruhe, Institute AIFB, Karlsruhe
Huang Y, Mitchell TM (2006) Text clustering with extended user feedback. Proceedings of the 29th ACM SIGIR conference on research and development in information retrieval, pp 413–420
Jain AK, Murty MN and Flynn PJ (1999). Data clustering: a review. ACM computing surveys 31(3): 264–323
Article Google Scholar
Ji X, Xu W (2006) Document clustering with prior knowledge. Proceedings of the 29th ACM SIGIR conference on research and development in information retrieval, pp 405–412
Joachims T (1999) Transductive inference for text classification using support vector machines. Proceedings of the 16th international conference on machine learning, pp 200–209
Jones R, McCallum A, Nigam K, Riloff E (1999) Bootstrapping for text learning tasks. Workshop on text mining: foundations, techniques and applications, proceedings of international joint conference on artifical intelligence, pp 52–63
Kamvar SD, Klein D, Manning CD (2003) Spectral learning. Proceedings of the 18th international joint conference on artificial intelligence, pp 561–566
Kaufman L and Rousseeuw P (1990). Finding Groups in Data: an introduction to cluster analysis. Wiley, New York
Google Scholar
Kim H and Lee S (2004). An intelligent information system for organizing online text documents. Knowledge Information Systems 6(2): 125–149
Google Scholar
Klein D, Kamvar S, Manning C (2002) From instance-level constraints to space-level constraints: Making the most of prior knowledge in data clustering. Proceedings of the 19th international conference on machine learning, pp 307–314
Koga H, Ishibashi T and Watanabe T (2007). Fast agglomerative hierarchical clustering algorithm using locality-sensitive hashing. Knowledge Information Systems 12(1): 25–53
Article Google Scholar
Kulis B, Basu S, Dhillon I, Mooney R (2005) Semi-supervised graph clustering: a kernel approach. Proceedings of the 22nd international conference on machine learning, pp 457–464
Lee DD and Seung HS (1999). Learning the parts of objects by non-negative matrix factorization. Nature 401: 788–791
Article Google Scholar
Lee D, Seung H (2001) Algorithms for non-negative matrix factorization. Proceedings of annual conference on neural information processing systems 13, pp 556–562
Lewis DD (1999) Reuters-21578 text categorization test collection distribution 1.0, http://www.research.att/lewis
Li T, Ding C (2006) The relationships among various nonnegative matrix factorization methods for clustering. Proceedings of the 6th IEEE international conference on data mining, pp 362–371
Liu B, Li X, Lee WS, Yu PS (2004) Text classification by labeling words. Proceedings of AAAI conference on artificial intelligence, pp 425–430
Long B, Zhang Z, Yu PS (2005) Co-clustering by block value decomposition. Proceedings of the 11th ACM SIGKDD international conference on knowledge discovery and data mining, pp 635–640
Long B, Zhang Z, Yu PS (2007) Relational clustering by symmetric convex coding. Proceedings the 24th annual international conference on machine learning, pp 569–576
MacQueen J (1967) Some methods for classsification and analysis of multivariate observations. Proceedings of 5th Berkeley symposium on mathematical statistics and probability, pp 281–297
Newman C, Hettich S, Merz C (1998) UCI repository of machine learning databases
Nigam K, McCallum AK, Thrun S, Mitchell TM (1998) Learning to classify text from labeled and unlabeled documents. Proceedings of AAAI conference on artificial intelligence, pp 792–799
Raghavan H, Madani O, Jones R (2005) Interactive feature selection. Proceedings of international joint conference on artificial intelligence, pp 841–846
Sheikholesami G, Chatterjee S, Zhang A (1998) Wavecluster: a multi-resolution clustering approach for very large spatial databases. Proceedings of the international conference on very large databases, pp 428–439
Shi J and Malik J (2000). Normalized cuts and image segmentation. IEEE Trans on PAMI 22(8): 888–905
Google Scholar
TREC (n.d.) Text retrieval conference, http://trec.nist.gov
Wagstaff K, Cardie C, Rogers S, Schroedl S (2001) Constrained k-means clustering with background knowledge. Proceedings of the 18th international conference on machine learning, pp 577–584
Xing EP, Ng AY, Jordan M, Russell S (2002) Distance metric learning, with application to clustering with side-information. Advances in neural information processing systems 15, pp 502–512
Xu W, Liu X, Gong Y (2003) Document clustering based on non-negative matrix factorization. Proceedings of the 26th ACM SIGIR conference on research and development in information retrieval, pp 267–273
Yang Y, Pedersen JO (1997) A comparative study on feature selection in text categorization. Proceedings of the 14th international conference on machine learning, pp 412–420
Zhang T, Ramakrishnan R, Livny M (1996) Birch: An efficient data clustering method for very large databases. Proceedings of the ACM SIGMOD international conference on management of data, pp 103–114

Download references

Author information

Authors and Affiliations

Department of Computer Science, Wayne State University, Detroit, MI, USA
Yanhua Chen, Manjeet Rege, Ming Dong & Jing Hua

Authors

Yanhua Chen
View author publications
You can also search for this author in PubMed Google Scholar
Manjeet Rege
View author publications
You can also search for this author in PubMed Google Scholar
Ming Dong
View author publications
You can also search for this author in PubMed Google Scholar
Jing Hua
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Ming Dong.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Chen, Y., Rege, M., Dong, M. et al. Non-negative matrix factorization for semi-supervised data clustering. Knowl Inf Syst 17, 355–379 (2008). https://doi.org/10.1007/s10115-008-0134-6

Download citation

Received: 29 October 2007
Revised: 19 December 2007
Accepted: 29 January 2008
Published: 28 March 2008
Issue Date: December 2008
DOI: https://doi.org/10.1007/s10115-008-0134-6

Keywords

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

€32.70 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price includes VAT (France)

Instant access to the full article PDF.

Institutional subscriptions

Non-negative matrix factorization for semi-supervised data clustering

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Non-redundant multiple clustering by nonnegative matrix factorization

Semi-supervised symmetric non-negative matrix factorization with graph quality improvement and constraints

Adaptive local learning regularized nonnegative matrix factorization for data clustering

References

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Subscribe and save

Buy Now

Navigation

Non-negative matrix factorization for semi-supervised data clustering

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Non-redundant multiple clustering by nonnegative matrix factorization

Semi-supervised symmetric non-negative matrix factorization with graph quality improvement and constraints

Adaptive local learning regularized nonnegative matrix factorization for data clustering

References

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now

Search

Navigation