
SphereFed: Hyperspherical Federated Learning

  • Conference paper
Computer Vision – ECCV 2022 (ECCV 2022)


Federated Learning aims to train a global model from multiple decentralized devices (i.e., clients) without exchanging their private local data. A key challenge is handling non-i.i.d. (i.e., not independent and identically distributed) data across clients, which may induce disparities in their local features. We introduce the Hyperspherical Federated Learning (SphereFed) framework, which addresses the non-i.i.d. issue by constraining the learned representations of data points to lie on a unit hypersphere shared by all clients. Specifically, all clients learn their local representations by minimizing the loss with respect to a fixed classifier whose weights span the unit hypersphere. After federated training has improved the global model, this classifier is further calibrated with a closed-form solution obtained by minimizing a mean squared loss. We show that the calibration solution can be computed efficiently and in a distributed manner without direct access to local data. Extensive experiments indicate that SphereFed improves the accuracy of multiple existing federated learning algorithms by a considerable margin (up to 6% on challenging datasets), with enhanced computation and communication efficiency across datasets and model architectures.
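The two ingredients named in the abstract, a fixed classifier and a closed-form calibration computed without sharing local data, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several choices are assumptions made for illustration only: the fixed classifier is built with orthonormal rows via a QR decomposition, the calibration is a ridge-regularized least-squares fit to one-hot targets, and the names `fixed_orthonormal_classifier`, `client_statistics`, `calibrate`, and `make_client`, together with the synthetic features, are hypothetical.

```python
import numpy as np


def fixed_orthonormal_classifier(num_classes, dim, seed=0):
    """Fixed (never trained) classifier with orthonormal rows in R^dim.

    Assumption: dim >= num_classes; orthogonality is an illustrative choice.
    """
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((dim, num_classes)))
    return q.T  # shape (num_classes, dim), every row has unit norm


def client_statistics(features, labels, num_classes):
    """Local sufficient statistics; the raw features stay on the client."""
    targets = np.eye(num_classes)[labels]  # one-hot targets, shape (n, C)
    return features.T @ features, features.T @ targets


def calibrate(stats, dim, ridge=1e-3):
    """Server side: sum client statistics, then solve the regularized
    least-squares (mean squared loss) problem in closed form."""
    gram = sum(g for g, _ in stats)
    cross = sum(c for _, c in stats)
    return np.linalg.solve(gram + ridge * np.eye(dim), cross).T  # (C, dim)


def make_client(rng, classifier, classes, n=200, noise=0.3):
    """Hypothetical stand-in for a trained encoder: noisy class directions
    projected onto the unit hypersphere."""
    labels = rng.choice(classes, size=n)
    feats = classifier[labels] + noise * rng.standard_normal((n, classifier.shape[1]))
    feats /= np.linalg.norm(feats, axis=1, keepdims=True)
    return feats, labels


rng = np.random.default_rng(1)
W_fixed = fixed_orthonormal_classifier(num_classes=4, dim=16)

# Three clients with label-skewed (non-i.i.d.) data.
clients = [make_client(rng, W_fixed, c) for c in ([0, 1], [1, 2], [2, 3])]
stats = [client_statistics(f, y, 4) for f, y in clients]
W_cal = calibrate(stats, dim=16)

# Reference: the same solve on pooled data, which the server never sees.
all_feats = np.vstack([f for f, _ in clients])
all_labels = np.concatenate([y for _, y in clients])
W_central = calibrate([client_statistics(all_feats, all_labels, 4)], dim=16)
print("max |distributed - centralized| =", np.max(np.abs(W_cal - W_central)))
```

In this sketch each client uploads a `dim × dim` Gram matrix and a `dim × num_classes` cross term, so the upload size does not grow with the number of local samples. Because both statistics are sums over samples, aggregating them on the server gives the same solution as solving on the pooled data, up to floating-point error.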



  1. The terms representation and feature are used interchangeably.




This research was supported in part by the Air Force Research Laboratory under award number FA8750-18-1-0112.

Author information

Corresponding author

Correspondence to Xin Dong.




Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Dong, X., Zhang, S.Q., Li, A., Kung, H. (2022). SphereFed: Hyperspherical Federated Learning. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13686. Springer, Cham.


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19808-3

  • Online ISBN: 978-3-031-19809-0

  • eBook Packages: Computer Science, Computer Science (R0)
