Abstract
Federated Learning aims to train a global model across multiple decentralized devices (i.e., clients) without exchanging their private local data. A key challenge is handling non-i.i.d. (not independent and identically distributed) data across clients, which may induce disparities in their local features. We introduce the Hyperspherical Federated Learning (SphereFed) framework to address the non-i.i.d. issue by constraining the learned representations of data points to lie on a unit hypersphere shared by all clients. Specifically, each client learns its local representations by minimizing a loss with respect to a fixed classifier whose weights span the unit hypersphere. After federated training of the global model, this classifier is further calibrated with a closed-form solution obtained by minimizing a mean-squared loss. We show that the calibration solution can be computed efficiently and in a distributed manner, without direct access to local data. Extensive experiments indicate that SphereFed improves the accuracy of multiple existing federated learning algorithms by a considerable margin (up to 6% on challenging datasets) while enhancing computation and communication efficiency across datasets and model architectures.
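To make the two mechanisms in the abstract concrete, below is a minimal PyTorch sketch of (a) a fixed classifier with orthonormal weights on the unit hypersphere and (b) the distributed closed-form calibration, reconstructed from the abstract alone. The function names, the one-hot MSE target encoding, the `model`-side feature extraction, and the ridge term `lam` are our assumptions for illustration, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def make_fixed_classifier(num_classes: int, feat_dim: int) -> torch.Tensor:
    """Fixed (non-trainable) classifier whose rows are orthonormal unit
    vectors on the hypersphere. QR of a random Gaussian matrix is one way
    to obtain such a basis; the exact construction is an assumption.
    Assumes feat_dim >= num_classes."""
    q, _ = torch.linalg.qr(torch.randn(feat_dim, num_classes))
    return q.T  # shape (num_classes, feat_dim); rows are orthonormal

def local_loss(features: torch.Tensor, labels: torch.Tensor,
               W: torch.Tensor) -> torch.Tensor:
    """Each client fits l2-normalized features (points on the shared unit
    hypersphere) to one-hot targets through the frozen classifier W."""
    z = F.normalize(features, dim=1)               # project onto hypersphere
    scores = z @ W.T                               # cosine similarities
    targets = F.one_hot(labels, W.shape[0]).float()
    return F.mse_loss(scores, targets)

def client_statistics(phi: torch.Tensor, labels: torch.Tensor,
                      num_classes: int):
    """For calibration, a client sends only the sufficient statistics
    (phi^T phi, phi^T Y) of the least-squares problem, never raw data.
    `phi` holds the client's l2-normalized features, one row per sample."""
    Y = F.one_hot(labels, num_classes).float()
    return phi.T @ phi, phi.T @ Y

def calibrate(stats, lam: float = 1e-3) -> torch.Tensor:
    """The server sums per-client statistics and solves the (here ridge-
    regularized) normal equations in closed form for the new classifier."""
    A = sum(a for a, _ in stats)                   # sum of phi^T phi
    B = sum(b for _, b in stats)                   # sum of phi^T Y
    W = torch.linalg.solve(A + lam * torch.eye(A.shape[0]), B)
    return W.T                                     # calibrated weights
```

Because each client contributes only a d×d and a d×C summed statistic, the server can recover the calibrated classifier without ever seeing local features or labels, which is consistent with the abstract's claim of an efficient, distributed, privacy-preserving calibration step.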
Notes
1. The terms representation and feature are used interchangeably.
Acknowledgements
This research was supported in part by the Air Force Research Laboratory under award number FA8750-18-1-0112.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Dong, X., Zhang, S.Q., Li, A., Kung, H. (2022). SphereFed: Hyperspherical Federated Learning. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13686. Springer, Cham. https://doi.org/10.1007/978-3-031-19809-0_10
DOI: https://doi.org/10.1007/978-3-031-19809-0_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19808-3
Online ISBN: 978-3-031-19809-0
eBook Packages: Computer Science, Computer Science (R0)