Abstract
In this paper, a dual-based stochastic inexact algorithm is developed to solve a class of stochastic nonsmooth convex problems with underlying structure. The algorithm can be regarded as an integration of a deterministic augmented Lagrangian method with stochastic approximation techniques. By exploiting the sparsity of the second-order information, each subproblem is solved efficiently by a superlinearly convergent semismooth Newton method. We establish almost sure convergence properties and the convergence rate of the objective values. Furthermore, under error bound conditions, we derive the convergence rate of the distance between the iterates and the solution set. Numerical results demonstrate that the proposed algorithm compares favorably with some existing methods.
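To make the overall structure concrete, the following is a minimal illustrative sketch, not the paper's actual algorithm or implementation: an outer stochastic proximal-point-style loop whose mini-batch subproblems are solved only inexactly. In the paper the subproblems arise from a dual augmented Lagrangian and are solved by a semismooth Newton method; here a few proximal-gradient (ISTA) steps stand in for that inner solver, and the l1-regularized least-squares instance, batch size, and iteration counts are all illustrative assumptions.

```python
# Minimal structural sketch (NOT the paper's algorithm): an outer stochastic
# proximal-point-style loop whose mini-batch subproblems
#     min_x 0.5*||A_B x - b_B||^2 + lam*||x||_1 + (1/(2*sigma))*||x - x_k||^2
# are solved only approximately by a few proximal-gradient (ISTA) steps,
# standing in for the semismooth Newton inner solver used in the paper.
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau*||.||_1 (handles the nonsmooth term)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def solve_subproblem_inexactly(x_center, A_b, b_b, lam, sigma, n_inner=30):
    """Inexact solution of the regularized mini-batch subproblem."""
    x = x_center.copy()
    L = np.linalg.norm(A_b, 2) ** 2 + 1.0 / sigma   # gradient Lipschitz constant
    for _ in range(n_inner):
        grad = A_b.T @ (A_b @ x - b_b) + (x - x_center) / sigma
        x = soft_threshold(x - grad / L, lam / L)   # proximal-gradient step
    return x

rng = np.random.default_rng(0)
n, d = 2000, 50
A = rng.standard_normal((n, d))
x_true = np.zeros(d); x_true[:5] = rng.standard_normal(5)
b = A @ x_true + 0.05 * rng.standard_normal(n)

x, lam, sigma, batch = np.zeros(d), 0.1, 1.0, 128
for k in range(100):
    idx = rng.choice(n, size=batch, replace=False)          # stochastic sample
    A_b, b_b = A[idx] / np.sqrt(batch), b[idx] / np.sqrt(batch)
    x = solve_subproblem_inexactly(x, A_b, b_b, lam, sigma)
print("estimated support:", np.flatnonzero(np.abs(x) > 1e-2))
```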






















Data availability
The datasets cina0, sido0, lucap0, lucas and reged are available from http://www.causality.inf.ethz.ch/challenge.php?page=datasets#cont. The remaining datasets w8a, mushrooms, pyrim, bodyfat, housing, australian, rcv1, real-sim and covtype are available from the LIBSVM data repository: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/.
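Datasets from the LIBSVM repository are distributed in LIBSVM (svmlight) format. As a minimal sketch (assuming the file, here w8a, has already been downloaded to the working directory), such a file can be read with scikit-learn:

```python
# Minimal sketch: load a dataset in LIBSVM (svmlight) format; the local
# file name "w8a" is an assumption about where the download was saved.
from sklearn.datasets import load_svmlight_file

X, y = load_svmlight_file("w8a")   # X: scipy.sparse CSR feature matrix, y: labels
print(X.shape, y.shape)
```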
Acknowledgements
The authors are very grateful to the associate editor and two anonymous referees for their helpful suggestions and remarks that allowed us to improve the original presentation significantly. The alphabetical order of the authors indicates their equal contributions to the paper.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This work was supported in part by NSFC (Nos. 12101262, 12071280, 11971220, 12222106), Guangdong Basic and Applied Basic Research Foundation (Nos. 2022A1515010263, 2022B1515020082), and Shenzhen Science and Technology Program (No. RCYX20200714114700072).
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Lin, GH., Yang, ZP., Yin, HA. et al. A dual-based stochastic inexact algorithm for a class of stochastic nonsmooth convex composite problems. Comput Optim Appl 86, 669–710 (2023). https://doi.org/10.1007/s10589-023-00504-0