Performance Analysis of Quantum Federated Learning in Data Classification

Hyunsoo Lee♦ and Soohyun Park°

Abstract

Federated learning is a method that allows various institutions to build a global model by sharing model parameters without sharing the data they possess. In addition, with the advent of the quantum computing era, efforts to combine traditional machine learning algorithms with quantum computing are gaining momentum. In this paper, we discuss quantum federated learning (QFL) methods. We examine the fundamentals of quantum computing and the structure of quantum neural networks, and introduce quantum federated learning. QFL demonstrated potential for practical use in the real world, achieving up to 92.4% accuracy in the MNIST data classification task.

Keywords: Federated Learning, Quantum Computing, Quantum Federated Learning

Ⅰ. Introduction

Distributed computing has gained considerable attention in computer science, particularly with the advent of large-scale data processing and real-time applications [1-3]. To handle large-scale data while improving computation speed, various distributed machine learning algorithms have emerged. In particular, federated learning (FL) is a learning method that transmits only model parameters to a central server without transmitting local data [4]. It has the advantage of protecting personal information while reducing the amount of communicated data [5,6].

Meanwhile, quantum computing, which processes data using quantum mechanical phenomena such as entanglement and superposition, is known to solve some complex problems faster than conventional computing methods [7]. Research on quantum machine learning (QML) relies on the characteristics of quantum computing to enhance classical machine learning [8]. In the noisy intermediate-scale quantum (NISQ) era, in which it is still difficult to use a large number of qubits, quantum neural networks (QNNs) have the potential to overcome these limitations.

One of the problems with FL is that sending large amounts of neural network model parameters can critically impact the performance of the global model. Quantum computing has a good chance of compensating for this, since QNNs have the potential to achieve better results with fewer parameters than classical neural networks (NNs). In detail, quantum computers leverage phenomena such as quantum coherence and entanglement to perform computations that are unachievable for classical computers [9]. Quantum federated learning (QFL) is designed on the traditional structure of classical FL, adapted to quantum computing [10]. QFL replaces all classical NNs with QNNs while preserving the overall architectural framework.

In this paper, we investigate the possibility of the practical use of QFL. MNIST, a widely used classification dataset, is applied to evaluate the performance of QFL. We determine the appropriate number of qubits for quantum federated learning by varying the number of qubits used and the number of MNIST data classes.

The rest of this paper is organized as follows. Section II investigates the basics of quantum computing and quantum machine learning, and Section III discusses the detailed explanation and structure of QFL. Numerical results are analyzed in Section IV, and Section V concludes this paper and presents future research directions.

Ⅱ. Basics of Quantum Machine Learning

This section presents quantum computing, the QNN, and their basics. Analogous to the role of bits as the fundamental units in classical computing, qubits serve as the fundamental units in quantum computing. One salient distinction between a qubit and a bit is that a qubit is represented by a two-dimensional quantum state. When leveraging quantum states as units of information, the intrinsic phenomena of quantum mechanics become pivotal in defining the information. Whereas a single bit can take one of two values, 0 or 1, a single qubit is typically represented as a superposition of the states [TeX:] $$|0\rangle$$ and [TeX:] $$|1\rangle$$, denoted as follows:

(1)
[TeX:] $$\alpha|0\rangle+\beta|1\rangle .$$

Here, [TeX:] $$|\alpha|^2$$ and [TeX:] $$|\beta|^2$$ represent the probabilities of measuring the qubit as 0 and 1, respectively. Hence, [TeX:] $$\alpha$$ and [TeX:] $$\beta$$ must satisfy the condition [TeX:] $$|\alpha|^2 + |\beta|^2 = 1$$. Quantum superposition is one of the inherent properties of quantum physics, distinct from classical theory.
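For instance, the equal superposition with [TeX:] $$\alpha = \beta = 1/\sqrt{2}$$ satisfies this condition and yields either measurement outcome with probability 1/2:

[TeX:] $$\frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right), \quad \left|\tfrac{1}{\sqrt{2}}\right|^2 = \tfrac{1}{2}.$$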

In addition, another inherent property of quantum computing is entanglement among qubits.

In the context of qubits, entanglement can be illustrated using a pair of qubits in an entangled state. If two qubits are entangled, measuring one qubit immediately determines the state of the other. However, this does not imply "communication" between the qubits; rather, it is the result of the correlations present in their shared quantum state.
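As a standard example, consider the Bell state

[TeX:] $$|\Phi^+\rangle = \frac{1}{\sqrt{2}}\left(|00\rangle + |11\rangle\right).$$

Measuring the first qubit as 0 guarantees that the second qubit will also be measured as 0, and likewise for 1, even though neither outcome is determined before measurement.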

In conventional computers, semiconductor devices can function as operational amplifiers or switches, and advances in semiconductor fabrication technology have enabled the integration of billions of gates on a single integrated circuit. The output of each gate is a digital electrical signal representing a value of 0 or 1, which is fed into the input of another gate through wires. In contrast, since quantum signals are analog and sensitive to noise, the system should be designed to change the quantum state of the qubits as intended, with the qubits themselves carrying the quantum information, rather than passing quantum signals through gates. Every quantum gate operates by rotating the state vector to another state, preserving the unit norm of the state vector. Among these, the most commonly used gates in quantum operations are the [TeX:] $$R_x, R_y, R_z,$$ and controlled-NOT (CNOT) gates. The [TeX:] $$R_x, R_y, R_z$$ gates rotate the qubit by [TeX:] $$\theta$$ around their respective axes.
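For reference, these single-qubit rotations have the standard matrix forms

[TeX:] $$R_x(\theta)=\begin{pmatrix}\cos\frac{\theta}{2} & -i\sin\frac{\theta}{2}\\ -i\sin\frac{\theta}{2} & \cos\frac{\theta}{2}\end{pmatrix},\; R_y(\theta)=\begin{pmatrix}\cos\frac{\theta}{2} & -\sin\frac{\theta}{2}\\ \sin\frac{\theta}{2} & \cos\frac{\theta}{2}\end{pmatrix},\; R_z(\theta)=\begin{pmatrix}e^{-i\theta/2} & 0\\ 0 & e^{i\theta/2}\end{pmatrix},$$

while the CNOT gate flips the target qubit if and only if the control qubit is in state [TeX:] $$|1\rangle$$.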

A QNN is the quantum counterpart of the traditional neural networks used in machine learning and is composed of encoding, a parameterized quantum circuit (PQC), and measurement, as shown in Fig. 1. Encoding is the process of converting classical data into quantum-state form for use in the QNN. Commonly used encoding techniques in QML include amplitude encoding, angle encoding, and basis encoding. Among these, angle encoding is generally preferred due to its ease of implementation and superior performance. In the encoding layer, rotation gates are typically utilized. The quantum state obtained from the output of the encoding layer then serves as the input to the PQC. The PQC, a core component of a QNN, carries out the desired computation, playing a role equivalent to that of a classical neural network. A PQC is constructed by appropriately combining trainable rotation gates with fixed CNOT gates: the rotation gates enable continuous control of each qubit's state, while the CNOT gates entangle pairs of qubits. The layers in a PQC consist of quantum gates including the [TeX:] $$R_x, R_y, R_z$$ and CNOT gates, and the number of layers can be adjusted as a hyperparameter. The performance of a PQC is affected by the arrangement of the quantum gates that constitute the circuit. Measurement is the sole method of transforming an indeterminate quantum state into a definite value; through this process, we obtain an observable classical output.

Fig. 1.
QNN Architecture
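As a concrete illustration of this encoding-PQC-measurement pipeline, the following is a minimal sketch in PennyLane (our framework choice for illustration; the paper does not prescribe an implementation, and the gate layout and layer count are assumptions):

```python
# Minimal QNN sketch: angle encoding -> trainable PQC -> Pauli-Z measurement.
import pennylane as qml
from pennylane import numpy as np

n_qubits = 4
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def qnn(x, weights):
    # Encoding: angle-encode classical features as R_x rotations.
    qml.AngleEmbedding(x, wires=range(n_qubits), rotation="X")
    # PQC: layers of trainable rotations interleaved with CNOT entanglers.
    qml.BasicEntanglerLayers(weights, wires=range(n_qubits))
    # Measurement: per-qubit Pauli-Z expectations, each in [-1, 1].
    return [qml.expval(qml.PauliZ(w)) for w in range(n_qubits)]

# Two PQC layers; the layer count is a tunable hyperparameter.
weights = np.array(np.random.uniform(0, 2 * np.pi, (2, n_qubits)),
                   requires_grad=True)
print(qnn(np.array([0.1, 0.5, 0.9, 1.3]), weights))
```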

Ⅲ. System Model

In this paper, we apply the QFL model with QNNs on edge devices, as visualized in Fig. 2. Similar to classical FL, in the QFL structure each edge device trains its QNN model using its own dataset. The output of the model, obtained through measurement, [TeX:] $$\langle O\rangle_{x, \theta} \in[-1,1]^{|y|}$$, is referred to as an observable, where |y| denotes the output dimension. From the observable output and the original label of the input, the loss [TeX:] $$\mathcal{L}(\theta)$$ is calculated. Thereafter, the QNN is trained using the stochastic gradient descent algorithm as follows:

(2)
[TeX:] $$\tilde{\theta} \leftarrow \theta-\eta \nabla_\theta \mathcal{L}(\theta),$$

where [TeX:] $$\eta$$ is the learning rate, and the gradient [TeX:] $$\nabla_\theta \mathcal{L}(\theta)$$ is calculated using the parameter-shift rule [9]. In other words, the classical input x is transformed into a quantum state through the encoding stage, and this quantum state is processed through the PQC [TeX:] $$U(\theta)$$. The resulting local parameters are integrated into the global model using FedAvg, a method commonly employed in FL for parameter aggregation [3]. The global QNN parameter [TeX:] $$\tilde{\theta}^G$$ is given by the averaged PQC parameters as follows:

Fig. 2.
A schematic illustration of QFL

(3)
[TeX:] $$\tilde{\theta}^G \leftarrow \frac{1}{\sum_{n=1}^N c_n} \sum_{n=1}^N c_n \cdot \tilde{\theta}^n,$$

where [TeX:] $$\tilde{\theta}^n$$ is the n-th device's local model parameter, and [TeX:] $$c_n \in \{0, 1\}$$ is an indicator returning 1 if the n-th edge device contributes to the global model aggregation and 0 otherwise.
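Putting Eqs. (2) and (3) together, one communication round can be sketched as follows (a minimal NumPy sketch; the device loop, the placeholder gradient, and all names are our assumptions, with the placeholder standing in for the parameter-shift gradient evaluated on a quantum device):

```python
# Sketch of one QFL round: local SGD steps (Eq. (2)) and FedAvg (Eq. (3)).
import numpy as np

N, eta = 4, 0.1                      # number of devices, learning rate eta
theta_g = np.zeros((2, 4))           # global PQC parameters (2 layers x 4 qubits)

def local_grad(theta, data):
    # Placeholder for the parameter-shift gradient of the local loss L(theta).
    return theta - data

def qfl_round(theta_g, datasets, contributes):
    locals_ = []
    for n in range(N):
        theta = theta_g.copy()
        theta -= eta * local_grad(theta, datasets[n])   # Eq. (2): local SGD step
        locals_.append(theta)
    c = np.array(contributes, dtype=float)
    # Eq. (3): average the parameters of contributing devices only.
    return np.tensordot(c / c.sum(), np.stack(locals_), axes=1)

datasets = [np.random.rand(2, 4) for _ in range(N)]
theta_g = qfl_round(theta_g, datasets, contributes=[1, 1, 0, 1])
```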

Ⅳ. Performance Evaluation

We performed MNIST image classification with the global QFL model. John et al. [11] also performed experiments comparing the accuracy of various quantum classification algorithms across multiple datasets. However, it is worth noting that the datasets employed in that paper are limited to binary classification problems, whereas our work extends this by classifying four and ten classes. We utilized four local devices, and training was carried out with the number of qubits varied over 4, 6, and 10. The evaluation was conducted on mini-MNIST, which classifies four classes of digits from 0 to 3, and on full-MNIST, which classifies ten classes of digits from 0 to 9. The training was performed over a total of 100 epochs.
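For clarity, the evaluation setup can be summarized as follows (a hypothetical configuration dictionary; the names are ours and the values are taken from the text above):

```python
# Evaluation setup as reported in this section (layout is illustrative).
settings = {
    "num_devices": 4,                   # local edge devices
    "num_qubits": [4, 6, 10],           # varied across runs
    "epochs": 100,
    "datasets": {
        "mini-MNIST": list(range(4)),   # digit classes 0-3
        "full-MNIST": list(range(10)),  # digit classes 0-9
    },
}
```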

Fig. 3 depicts the training loss for the mini-MNIST and full-MNIST datasets. Regardless of the number of qubits, the training loss converged well in all cases. Fig. 4 illustrates the test accuracy according to the number of qubits. In the four-class setting, a high accuracy of approximately 92.4% was achieved with 4 qubits, while 88.5% and 89.2% were achieved with 6 and 10 qubits, respectively. In this case, there was no linear trend with respect to the number of qubits. However, it was confirmed that high performance can be achieved even when only a small number of qubits is available.

Fig. 3.
Training Losses for Mini-MNIST and Full-MNIST datasets
Fig. 4.
Test accuracies for Mini-MNIST and Full-MNIST datasets

Ⅴ. Conclusions

In this paper, image classification was performed through QFL. We discussed a quantum federated learning system that combines the advantages of federated learning and quantum computing, and reviewed the structure of quantum computing and quantum neural networks. Demonstrating an accuracy of over 90% in performance evaluations, we confirmed the practical applicability of QFL. With the benefits of local data protection and reduced communication volume, federated learning may be more actively utilized in scenarios such as learning from electronic medical record (EMR) data or in environments with poor communication conditions. It is generally known that utilizing a greater number of qubits tends to yield higher performance [12,13]. In future research, we plan to conduct experiments aimed at performance enhancement by modifying the quantum gates and layers within the PQC.

Biography

Hyunsoo Lee

Feb. 2021 : B.S. degree, Soongsil University

Mar. 2021~Current : M.S.-Ph.D. Combined Course, Korea University

[Research Interests] Reinforcement Learning, Electronic Engineering, Communication Engineering

[ORCID:0000-0003-1113-9019]

Biography

Soohyun Park

Feb. 2019 : B.S. degree, Chung-Ang University

Mar. 2020~Aug. 2023 : Ph.D. degree, Korea University

Aug. 2023~Current : Postdoctoral Researcher, Korea University

[Research Interests] Deep learning algorithms and their applications to computer networking, autonomous mobility platforms, and quantum multi-agent distributed autonomous systems.

[ORCID:0000-0002-6556-9746]

References

  • 1 D. Kwon, J. Jeon, S. Park, J. Kim, and S. Cho, "Multiagent DDPG-based deep learning for smart ocean federated learning IoT networks," IEEE Internet of Things J., vol. 7, no. 10, pp. 9895-9903, Oct. 2020. (https://doi.org/10.1109/JIOT.2020.2988033)
  • 2 S. Barbarossa, S. Sardellitti, and P. Di Lorenzo, "Communicating while computing: Distributed mobile cloud computing over 5G heterogeneous networks," IEEE Signal Process. Mag., vol. 31, no. 6, pp. 45-55, Nov. 2014. (https://doi.org/10.1109/MSP.2014.2334709)
  • 3 H. Baek, S. Park, and J. Kim, "Parameter aggregation optimization for federated learning in dynamic user environments," in Proc. KICS Summer Conf. 2023, pp. 1535-1536, Jeju Island, Korea, Jun. 2023.
  • 4 B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, "Communication-efficient learning of deep networks from decentralized data," in Proc. Int. Conf. AISTATS, pp. 1273-1282, Ft. Lauderdale, FL, USA, Apr. 2017. (https://doi.org/10.48550/arXiv.1602.05629)
  • 5 S. Niknam, H. S. Dhillon, and J. H. Reed, "Federated learning for wireless communications: Motivation, opportunities, and challenges," IEEE Commun. Mag., vol. 58, no. 6, pp. 46-51, Jun. 2020. (https://doi.org/10.1109/MCOM.001.1900461)
  • 6 C. Ma, J. Li, M. Ding, H. H. Yang, F. Shu, T. Q. S. Quek, and H. V. Poor, "On safeguarding privacy and security in the framework of federated learning," IEEE Network, vol. 34, no. 4, pp. 242-248, Jul./Aug. 2020. (https://doi.org/10.1109/MNET.001.1900506)
  • 7 F. Arute, K. Arya, R. Babbush, D. Bacon, J. C. Bardin, R. Barends, R. Biswas, S. Boixo, F. G. Brandao, D. A. Buell, et al., "Quantum supremacy using a programmable superconducting processor," Nature, vol. 574, no. 7779, pp. 505-510, Oct. 2019. (https://doi.org/10.1038/s41586-019-1666-5)
  • 8 R. Huang, X. Tan, and Q. Xu, "Learning to learn variational quantum algorithm," IEEE Trans. Neural Netw. Learn. Syst. (Early Access), pp. 1-11, Feb. 2022. (https://doi.org/10.1109/TNNLS.2022.3151127)
  • 9 J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and S. Lloyd, "Quantum machine learning," Nature, vol. 549, pp. 195-202, Sep. 2017. (https://doi.org/10.1038/nature23474)
  • 10 M. Chehimi and W. Saad, "Quantum federated learning with quantum data," in Proc. IEEE ICASSP, pp. 8617-8621, Singapore, May 2022. (https://doi.org/10.1109/ICASSP43922.2022.9746622)
  • 11 M. John, J. Schuhmacher, P. Barkoutsos, I. Tavernelli, and F. Tacchino, "Optimizing quantum classification algorithms on classical benchmark datasets," Entropy, vol. 25, no. 6, pp. 860-873, May 2023. (https://doi.org/10.3390/e25060860)
  • 12 K. Mitarai, M. Negoro, M. Kitagawa, and K. Fujii, "Quantum circuit learning," Physical Rev. A, vol. 98, no. 3, p. 032309, Sep. 2018. (https://doi.org/10.1103/PhysRevA.98.032309)
  • 13 A. Wack, et al., "Quality, speed, and scale: Three key attributes to measure the performance of near-term quantum computers," arXiv preprint arXiv:2110.14108, Oct. 2021. (https://arxiv.org/abs/2110.14108)