Abstract
We propose a new randomized projection method based on entropy optimization of random projection matrices (the ERP method). The concept of compactness indicator of a data matrix, which is stored in projection matrices, is introduced. An ERP algorithm is formulated in the form of a conditional maximization problem for an entropy functional defined on the probability density functions of the projection matrices. A discrete version of this problem is considered, and conditions are obtained for the existence and uniqueness of its positive solution. Procedures are developed for the implementation of entropy-optimal projection matrices by sampling the probability density functions.
Similar content being viewed by others
Notes
Other indicators, e.g., the maximum distance between the points, etc., can also be selected.
REFERENCES
Carreira-Perpinán, M.A., A review of dimension reduction techniques, in Tech. Rep. CS-96-09 , Dep. Comput. Sci., Univ. Sheffield, 1997.
Imola, K., A survey of dimension reduction techniques, Cent. Appl. Sci. Comput, Lawrence Livermore Natl. Lab., 2002.
Cunningham, P., Dimension reduction, in Tech. Rep. UCD-CSI-2007-7 , Univ. Coll. Dublin, 2007.
Aivazyan, S.A., Bukhshtaber, V.M., Enyukov, I.S., and Meshalkin, L.D., Prikladnaya statistika. Klassifikatsiya i snizhenie razmernosti (Applied Statistics. Classification and Dimension Reduction), Moscow: Finansy Stat., 1989.
Friedman, J., Hastie, T., and Tibshirani, R., The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Berlin: Springer, 2001.
Bishop, C., Pattern Recognition and Machine Learning, Ser. Inf. Sci. Stat., Berlin: Springer, 2006.
Comon, P. and Jutten, C., Handbook of Blind Source Separation. Independent Component Analysis and Applications, Oxford UK: Academic Press, 2010.
Bruckstein, A.M., Donoho, D.L., and Elad, M., From sparse solutions of systems of equations to sparse modeling of signals and images, SIAM Rev., 2009, vol. 51, no. 1, pp. 34–81.
Pirson, K., On lines and planes of closest fit to systems of points in space, Philos. Mag., 1901, vol. 2, pp. 559–572.
Kendall, M.G. and Stuart, A., The Advanced Theory of Statistics. Vol. 2. Inference and Relationship, London: Charles Griffin & Co., 1967. Translated under the title: Statisticheskie vyvody i svyazi, Moscow: Nauka, 1973.
Jolliffe, I.T., Principal Component Analysis, New York: Springer-Verlag, 2002.
Polyak, B.T. and Khlebnikov, M.V., Principle component analysis: robust versions, Autom. Remote Control, 2017, vol. 78, pp. 490–506. https://doi.org/10.1134/S0005117917030092
Deerwester, S.C., Dumias, S.T., Landaurer, T.K., Furnas, G.W., and Harshman, R.A., Indexing by latent semantic analysis, J. Am. Soc. Inf. Sci., 1990, vol. 41, no. 6, pp. 391–407.
Fisher, R.A., The use of multiple measurements in taxonomic problems, Ann. Eugen., 1936, vol. 7, pp. 179–188.
McLachlan, G.J., Discriminant Analysis and Statistical Pattern Recognition, New York: Wiley Interscience, 2004.
Johnson, W.B. and Lindenstrauss, J., Extension of Lipshitz mapping into Hilbert space, Conf. Modern Anal. Probab., Am. Math. Soc., 1984, vol. 26, pp. 189–206.
Achlioptas, D., Database-friendly random projections, Proc. Twentieth ACM Symp. Princ. Database Syst., pp. 274–281.
Bingham, E. and Mannila, H., Random projection in dimensionality reduction: applications to image and text data, Proc. Seventh ACM SIGKDD Int. Conf. Knowl. Discovery Data Min., pp. 245–250.
Vempala, S.S., The Random Projection Method. Vol. 65 , Providence, RI: Am. Math. Soc., 2005.
Ganin, I.P., Kosichenko, E.A., and Kaplan, A.Ya., Properties of EEG responses to emotionally significant stimuli using a P300 wave-based brain–computer interface, Neurosci. Behav. Physiol., 2018, vol. 48, no. 9, pp. 1093–1099.
Huber, F. and Zorner, T.O., Threshold cointegration in international exchange rates: a Bayesian approach, Int. J. Forecast., 2019, vol. 35, pp. 458–473.
Kosinskia, M., Stillwella, D., and Groepelb, T., Private traits and attributes are predictable from digital records of human behaviour, Proc. Natl. Acad. Sci. USA, 2013, vol. 110, no. 15, pp. 5802–5805.
Blum, A. and Langly, P., Selection of relevant feature and examples in machine learning, Artif. Intell., 1997, vol. 97, no. 1–2, pp. 245–271.
Cover, T.M. and Thomas, J.A., Elements of Information Theory, New York: John Wiley & Sons, 1991.
Peng, H.C., Long, F., and Ding, C., Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy, IEEE Trans. Pattern Anal. Mach. Intell., 2005, vol. 27, no. 8, pp. 1226–1238.
Zhang, Y., Li, S., Wang, T., and Zhang, Z., Divergence-based feature selection for separate classes, Neurocomputing, 2013, vol. 101, pp. 32–42.
Darhovsky, B.S., Kaplan, A.Ya., and Shishkin, S.L., On an approach to the estimation of the complexity of curves (using as an example an electroencephalogram of a human being), Autom. Remote Control, 2002, vol. 63, no. 3, pp. 468–474.
Darkhovskii, B.S. and Piryatinskaya, A., New approach to the segmentation problem for time series of arbitrary nature, Proc. Steklov Inst. Math., 2014, vol. 287, no. 1, pp. 54–67.
Darhovsky, B. and Piryatinska, A., Quickest detection of changes in the generating mechanism of a time series via the \(\varepsilon \)-complexity of continuous functions, Sequential Anal., 2014, vol. 33, pp. 231–250.
Efron, B., Bootstrap methods: another look at the jackknife, Ann. Stat., 1979, vol. 7, no. 1, pp. 1–26.
Bach, F.R., Bolasso: model consistent lasso estimation through the bootstrap, Proc. 25th Int. Conf. Mach. Learn. (2008), pp. 33–40.
Popkov, Y.S., Asymptotic efficiency of maximum entropy estimates, Dokl. Math., 2020, vol. 102, no. 1, pp. 350–352. https://doi.org/10.1134/S106456242004016X
Ioffe, A.D. and Tikhomirov, V.M., Teoriya ekstremal’nykh zadach (Theory of Extremum Problems), Moscow: Nauka, 1974.
Darkhovsky, B.S., Popkov, Y.S., Popkov, A.Y., and Aliev, A.S., A method of generating random vectors with a given probability density function, Autom. Remote Control, 2018, vol. 79, no. 9, pp. 1569–1581.
Funding
This work was supported by the Russian Foundation for Basic Research, project no. 20-07-00470.
Author information
Authors and Affiliations
Corresponding authors
Additional information
Translated by V. Potapchouck
APPENDIX
Proof of Theorem 1. Let us use conditions (5.13), (5.14) to represent Eq. (5.12) in the form
Rights and permissions
About this article
Cite this article
Popkov, Y.S., Dubnov, Y.A. & Popkov, A.Y. Entropy-Randomized Projection. Autom Remote Control 82, 490–505 (2021). https://doi.org/10.1134/S0005117921030097
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1134/S0005117921030097