Ⅰ. Introduction
Distributed computing has gained considerable attention in computer science, particularly with the advent of large-scale data processing and real-time applications[1][2,3]. To handle large-scale data while improving computation speed, various distributed machine learning algorithms have been proposed. In particular, federated learning (FL) is a learning method in which each device transmits only its model parameters to a central server, without transmitting its local data[4]. It has the advantage of protecting personal information while reducing the amount of communicated data[5,6].
Meanwhile, quantum computing, which processes data using quantum mechanical phenomena such as entanglement and superposition, is known to solve some complex problems faster than conventional computing methods[7]. Research on quantum machine learning (QML) relies on the characteristics of quantum computing to enhance classical machine learning[8]. In the noisy intermediate-scale quantum (NISQ) era, where it is still difficult to use a large number of qubits, quantum neural networks (QNNs) have the potential to overcome this limitation.
One of the problems with FL is that transmitting a large number of neural network model parameters can critically affect the performance of the global model. Quantum computing has a good chance of compensating for this, since a QNN can potentially achieve better performance with fewer parameters than a classical NN. Specifically, quantum computers leverage phenomena such as quantum coherence and entanglement to perform computations that are unachievable for classical computers[9]. Quantum federated learning (QFL) is designed based on the traditional structure of classical FL, adapted to quantum computing[10]. QFL replaces all classical neural networks (NNs) with QNNs while preserving the overall architectural framework.
In this paper, we investigate the possibility of the practical use of QFL. MNIST, a widely used classification dataset, is used to evaluate the performance of QFL. We determine the appropriate number of qubits for QFL by varying the number of qubits used and the number of MNIST data classes.
The rest of this paper is organized as follows. Section II investigates the basics of quantum computing and quantum machine learning, and Section III discusses the detailed explanation and structure of QFL. Numerical results are analyzed in Section IV, and Section V concludes this paper and presents future research directions.
Ⅱ. Basics of Quantum Machine Learning
This section presents the basics of quantum computing and QNNs. Analogous to the role of bits as the fundamental units in classical computing, qubits serve as the fundamental units in quantum computing. One salient distinction between a qubit and a bit is that a qubit is represented by a two-dimensional quantum state. When quantum states are used as units of information, the intrinsic phenomena of quantum mechanics become pivotal in defining the information. Whereas a single bit can take one of two values, 0 or 1, a single qubit is typically represented as a superposition of the states |0> and |1>, denoted as follows:
[TeX:] $$|\psi\rangle=\alpha|0\rangle+\beta|1\rangle$$
Here, [TeX:] $$|\alpha|^2$$ and [TeX:] $$|\beta|^2$$ represent the probabilities of measuring the qubit as 0 and 1, respectively. Hence, [TeX:] $$\alpha$$ and [TeX:] $$\beta$$ must satisfy the condition [TeX:] $$|\alpha|^2 + |\beta|^2 = 1$$. Quantum superposition is one of the inherent properties of quantum physics, distinct from the classical theory.
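The normalization condition and the measurement probabilities can be checked numerically. The following is a minimal sketch using NumPy; the function name `measurement_probabilities` and the example amplitudes are illustrative assumptions, not part of the original study.

```python
import numpy as np

def measurement_probabilities(alpha, beta):
    """Return (P(0), P(1)) for the single-qubit state alpha|0> + beta|1>."""
    norm_sq = abs(alpha) ** 2 + abs(beta) ** 2
    if not np.isclose(norm_sq, 1.0):
        # A valid qubit state must satisfy |alpha|^2 + |beta|^2 = 1.
        raise ValueError("amplitudes must satisfy |alpha|^2 + |beta|^2 = 1")
    return abs(alpha) ** 2, abs(beta) ** 2

# Equal superposition with a relative phase: the phase does not change
# the measurement probabilities, which depend only on the magnitudes.
p0, p1 = measurement_probabilities(1 / np.sqrt(2), 1j / np.sqrt(2))
```

In this example both outcomes are equally likely, and an unnormalized pair such as (1, 1) is rejected.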
In addition, another key property of quantum computing is the entanglement among qubits.
In the context of qubits, entanglement can be illustrated using a pair of qubits in an entangled state. If two qubits are entangled, measuring one qubit can immediately determine the state of the other qubit. However, this does not imply “communication” between the qubits. Instead, it is the result of the correlations present in their shared quantum state.
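The correlation between entangled qubits can be illustrated with a small state-vector simulation. The sketch below assumes a Bell state prepared with a Hadamard gate followed by a CNOT gate (a standard construction that is not described in the text above) and samples joint measurement outcomes with NumPy.

```python
import numpy as np

# Hadamard gate and 2x2 identity.
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
I2 = np.eye(2)

# CNOT with the first qubit as control and the second as target,
# in the basis order |00>, |01>, |10>, |11>.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

# Start in |00>, apply H to the first qubit, then CNOT.
state = np.zeros(4)
state[0] = 1.0
bell = CNOT @ np.kron(H, I2) @ state      # (|00> + |11>) / sqrt(2)
probs = np.abs(bell) ** 2                 # joint outcome probabilities

# Sample joint measurements; each sample encodes both bits as an integer 0..3.
rng = np.random.default_rng(0)
samples = rng.choice(4, size=1000, p=probs)
first_bit = samples // 2
second_bit = samples % 2
```

Every sampled pair has equal bits: the outcome of one qubit fixes the other, although each qubit on its own is 0 or 1 with equal probability.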
In conventional computers, semiconductor devices can function as operational amplifiers or switches, and advances in semiconductor fabrication technology have enabled the integration of billions of gates on a single integrated circuit. The output of each gate is a digital electrical signal representing a value of 0 or 1, which is fed into the input of another gate through wires. On the other hand, quantum signals are analog and sensitive to noise. A quantum system is therefore designed so that gates change the quantum state of the qubits holding the quantum information as intended, rather than passing quantum signals through the gates. Every quantum gate operates by rotating the state vector to another state while preserving its unit magnitude. Among these, the most commonly used gates in quantum operations are the [TeX:] $$R_x, R_y, R_z,$$ and controlled-NOT (CNOT) gates. The [TeX:] $$R_x, R_y, R_z$$ gates rotate the qubit by [TeX:] $$\theta$$ around their respective axes.
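The rotation gates can be written as 2x2 unitary matrices. The sketch below uses the standard textbook matrix forms (the function names `rx`, `ry`, and `rz` are illustrative) and checks that each gate is unitary, which is what preserves the unit magnitude of the state vector.

```python
import numpy as np

def rx(theta):
    """Rotation by theta around the x-axis of the Bloch sphere."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def ry(theta):
    """Rotation by theta around the y-axis of the Bloch sphere."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def rz(theta):
    """Rotation by theta around the z-axis of the Bloch sphere."""
    return np.array([[np.exp(-1j * theta / 2), 0], [0, np.exp(1j * theta / 2)]])

def is_unitary(gate):
    """A gate U is unitary when U^dagger U equals the identity."""
    return np.allclose(gate.conj().T @ gate, np.eye(gate.shape[0]))

# Rotating |0> by pi around the x-axis flips the qubit (up to a global phase).
ket0 = np.array([1.0, 0.0], dtype=complex)
flipped = rx(np.pi) @ ket0
flipped_probs = np.abs(flipped) ** 2
```

After the rotation by pi, the qubit is measured as 1 with probability one, and the norm of the state remains one.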
A QNN is a quantum version of the traditional neural networks used in machine learning and is composed of encoding, a parameterized quantum circuit (PQC), and measurement, as shown in Fig. 1. Encoding is the process of converting classical data into a quantum state for use in the QNN. Commonly used encoding techniques in QML include amplitude-encoding, angle-encoding, and basis-encoding. Among these, angle-encoding is generally preferred due to its ease of implementation and superior performance. In the encoding layer, rotation gates are typically utilized. The quantum state obtained at the output of the encoding layer serves as the input to the PQC. The PQC, a core component of a QNN, carries out the desired computation in a role equivalent to that of a classical neural network. It is constructed by appropriately combining trainable rotation gates with fixed CNOT gates. The rotation gates enable continuous control of each qubit's state, while the CNOT gate performs the task of entangling two qubits. The layers in the PQC consist of quantum gates including [TeX:] $$R_x, R_y, R_z$$ and CNOT gates, and the number of layers can be adjusted as a hyperparameter. The performance of the PQC is affected by the arrangement of the quantum gates that make up the circuit. Measurement is the sole method of transforming an indeterminate quantum state into a definite value. Through this process, we can obtain an observable classical output.
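The three stages of a QNN can be sketched as a two-qubit state-vector simulation. The circuit layout below is an assumption chosen for illustration only: angle-encoding with one R_y gate per qubit, PQC layers made of trainable R_y gates followed by a CNOT, and a Pauli-Z expectation value measured on the first qubit. It is not the circuit used in the experiments.

```python
import numpy as np

def ry(angle):
    """Single-qubit R_y rotation (real-valued matrix)."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]])

# CNOT (first qubit controls the second) and Pauli-Z on the first qubit.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)
Z_FIRST = np.kron(np.diag([1.0, -1.0]), np.eye(2))

def qnn_forward(x, theta):
    """Encode x (2 features), apply the PQC, and return <Z> in [-1, 1].

    theta has shape (n_layers, 2): one trainable angle per qubit per layer.
    """
    state = np.zeros(4)
    state[0] = 1.0                                        # start in |00>
    state = np.kron(ry(x[0]), ry(x[1])) @ state           # angle-encoding
    for layer in np.atleast_2d(theta):
        state = np.kron(ry(layer[0]), ry(layer[1])) @ state  # trainable rotations
        state = CNOT @ state                                  # fixed entangling gate
    return float(state @ Z_FIRST @ state)                 # measurement

out_zero = qnn_forward([0.0, 0.0], np.zeros((1, 2)))
out_flip = qnn_forward([np.pi, 0.0], np.zeros((1, 2)))
out_rand = qnn_forward([0.4, 1.1], np.array([[0.3, -0.7], [1.5, 0.2]]))
```

With all angles at zero the state stays in |00> and the output is +1; encoding pi on the first qubit flips it and the output becomes -1. Any other choice of inputs and parameters yields a value between these two extremes.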
Ⅲ. System Model
In this paper, we apply the QFL model with QNNs in edge devices, as visualized in Fig. 2. Similar to classical FL, in the QFL structure, each edge device trains the QNN model using its own dataset. The output of the model obtained through measurement, [TeX:] $$\langle O\rangle_{x, \theta} \in[-1,1]^{|y|}$$, is referred to as an observable, where |y| signifies the output dimension. From the observable output and the true label of the input, the loss [TeX:] $$\mathcal{L}(\theta)$$ is calculated. Thereafter, the QNN is trained using the stochastic gradient descent algorithm as follows:
[TeX:] $$\theta \leftarrow \theta-\eta \nabla_\theta \mathcal{L}(\theta)$$
where [TeX:] $$\eta$$ is the learning rate, and the gradient [TeX:] $$\nabla_\theta \mathcal{L}(\theta)$$ is calculated using the parameter shift rule [9]. Here, the classical input x is transformed into a quantum state through the encoding stage, and this quantum state is processed through the PQC U([TeX:] $$\theta$$). The locally trained parameters are then utilized in global model training using FedAvg, a commonly employed method in FL for parameter integration [3]. The global QNN parameter [TeX:] $$\tilde{\theta}^G$$ is given by the averaged PQC parameter as follows:
[TeX:] $$\tilde{\theta}^G=\frac{1}{\sum_{n=1}^N c_n} \sum_{n=1}^N c_n \tilde{\theta}^n$$
Fig. 2. A schematic illustration of QFL
where [TeX:] $$\tilde{\theta}^n$$ is the n-th device's local model parameter, and [TeX:] $$c_n \in\{0,1\}$$ is an indicator that equals 1 if the n-th edge device contributes to the global model aggregation and 0 otherwise.
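One communication round of this procedure can be sketched end to end. The example below is a toy illustration under stated assumptions: a single-qubit QNN with one trainable R_y angle, a mean-squared-error loss, and hypothetical helper names (`expectation_z`, `parameter_shift_grad`, `local_sgd_step`, `fedavg`). It shows the parameter shift rule, the gradient descent update, and the indicator-weighted averaging; it is not the configuration used in the experiments.

```python
import numpy as np

def ry(angle):
    """Single-qubit R_y rotation matrix."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]])

def expectation_z(x, theta):
    """<Z> after angle-encoding x and applying the one-parameter PQC R_y(theta)."""
    state = ry(theta) @ ry(x) @ np.array([1.0, 0.0])
    return state[0] ** 2 - state[1] ** 2

def parameter_shift_grad(x, theta):
    """d<Z>/d(theta) from two circuit evaluations shifted by +/- pi/2."""
    shift = np.pi / 2
    return 0.5 * (expectation_z(x, theta + shift) - expectation_z(x, theta - shift))

def mse_loss(theta, xs, ys):
    """Mean-squared error between the observable output and the targets."""
    return float(np.mean([(expectation_z(x, theta) - y) ** 2 for x, y in zip(xs, ys)]))

def local_sgd_step(theta, xs, ys, eta):
    """One update theta <- theta - eta * grad, with the chain rule applied to the MSE."""
    grads = [2.0 * (expectation_z(x, theta) - y) * parameter_shift_grad(x, theta)
             for x, y in zip(xs, ys)]
    return theta - eta * float(np.mean(grads))

def fedavg(local_thetas, contributes):
    """Average the parameters of the devices whose indicator c_n equals 1."""
    c = np.asarray(contributes, dtype=float)
    if c.sum() == 0:
        raise ValueError("at least one device must contribute")
    return np.average(np.asarray(local_thetas, dtype=float), axis=0, weights=c)

# One round with four devices; the last device does not contribute (c_4 = 0).
global_theta = 0.5
toy_xs, toy_ys = [0.0, 0.1], [1.0, 1.0]          # toy inputs and targets
local_thetas = [local_sgd_step(global_theta, toy_xs, toy_ys, eta=0.1) for _ in range(4)]
new_global_theta = fedavg(local_thetas, contributes=[1, 1, 1, 0])
```

For this circuit the output equals cos(x + theta), so the parameter shift result can be compared against the analytic derivative -sin(x + theta). The same `fedavg` function also accepts one parameter vector per device, in which case the average is taken element-wise.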
Ⅳ. Performance Evaluation
We performed MNIST image classification with the global QFL model. John et al.[11] also conducted experiments comparing the accuracy of various quantum computing approaches across multiple datasets. However, the datasets employed in that paper are limited to binary classification problems, whereas our work extends this by classifying four and ten classes. We utilized four local devices, and the training was carried out with 4, 6, and 10 qubits. The evaluation was conducted on mini-MNIST, which classifies the four digit classes from 0 to 3, and on full-MNIST, which classifies the ten digit classes from 0 to 9. The training was performed over a total of 100 epochs.
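The data preparation for such a setup can be sketched as follows. The text does not specify how the data were distributed among the four devices, so the uniform random split below is purely an assumption, and random placeholder arrays stand in for the MNIST images. The helper names `make_class_subset` and `split_across_devices` are hypothetical.

```python
import numpy as np

def make_class_subset(images, labels, classes):
    """Keep only the samples whose label is in `classes` (e.g. 0-3 for mini-MNIST)."""
    mask = np.isin(labels, classes)
    return images[mask], labels[mask]

def split_across_devices(images, labels, num_devices, seed=0):
    """Shuffle the samples and split them into near-equal shards, one per device."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(labels))
    return [(images[idx], labels[idx]) for idx in np.array_split(order, num_devices)]

# Placeholder data with MNIST-like shapes (28x28 images, labels 0-9).
rng = np.random.default_rng(0)
images = rng.random((200, 28, 28))
labels = rng.integers(0, 10, size=200)

mini_x, mini_y = make_class_subset(images, labels, classes=[0, 1, 2, 3])
shards = split_across_devices(mini_x, mini_y, num_devices=4)
```

Each entry of `shards` is the local dataset of one device; passing all ten classes to `make_class_subset` gives the full-MNIST counterpart.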
Fig. 3 depicts the training loss for the mini-MNIST and full-MNIST datasets. Regardless of the number of qubits, the training loss converged well in all cases. Fig. 4 illustrates the test accuracy according to the number of qubits. In the system classifying four classes, a high accuracy of approximately 92.4% was achieved with 4 qubits, while accuracies of 88.5% and 89.2% were obtained with 6 and 10 qubits, respectively. In this case, there was no linear trend with respect to the number of qubits. However, it was confirmed that high performance can be achieved even when only a small number of qubits can be used, depending on the situation.
Fig. 3. Training losses for the mini-MNIST and full-MNIST datasets
Fig. 4. Test accuracies for the mini-MNIST and full-MNIST datasets
Ⅴ. Conclusions
In this paper, image classification was performed through QFL. We discussed a quantum federated learning system that combines the advantages of federated learning and quantum computing, and described the structure of quantum computing and quantum neural networks. Having demonstrated an accuracy of over 90% in the performance evaluation, we confirmed the applicability of QFL. With the benefits of local data protection and reduced communication volume, federated learning may be more actively utilized in scenarios such as learning from electronic medical record (EMR) data or in environments with poor communication conditions. It is generally known that utilizing a greater number of qubits tends to yield higher performance [12,13]. In future research, we plan to conduct experiments aimed at performance enhancement by modifying the quantum gates and layers within the PQC.