Optimizing a Variational Quantum Circuit, studying the character of the optimized cost as a function of layers in the circuit

7 min read · Oct 30, 2020

An approach to optimizing a variational quantum circuit using a gradient-free optimizer, and a study of how the cost function changes as the number of layers in the circuit increases.

Variational Quantum Circuits (VQCs) are quantum algorithms built from one or more parameterized gates whose parameters are trainable. VQCs are commonly used in Quantum Machine Learning and in many other optimization problems.

As shown in the picture, a VQC starts with initial state preparation. In QML problems, various feature-mapping techniques are used for this step, but the circuit can also be left in the zero state if the problem allows it. After the initial state preparation comes the variational circuit, which consists of one or more trainable parameterized gates (and possibly some non-parameterized gates) that we have to optimize. Finally, a measurement is performed and the measured value is fed back to update the trainable parameters of the variational circuit. Importantly, the optimization is not part of the quantum circuit; it is done by a classical optimizer.

A little back story

The problem statement I will be discussing here is one of the four selection tasks for the second cohort of the QOSF mentorship program. You can learn more about the Quantum Open Source Foundation and the mentorship program here.

Disclaimer: Unfortunately😔, I didn’t get through the selection task, as there were far fewer mentors than submissions. Along with the approach, I will also discuss some of the points where I could have done better.

Problem Statement

The task was to study how the optimized cost of a variational circuit changes as the number of layers (also known as repetitions) increases. Each layer specified in the problem statement has two blocks, namely U1 and U2.

U1, also called the ‘odd block’, has four RX gates with trainable parameters, one applied to each of the four qubits. Similarly, U2 (the ‘even block’) consists of four RZ gates applied in the same manner, followed by a specific combination of CZ gates on the four qubits.

Design of Each layer mentioned in the problem statement

The circuit shown in the above picture is the design of each layer, where the left side of the barrier is U1 (odd block) and the right side is U2 (even block). In each layer, there will be a total of eight parameters to be optimized, four for the RX gates, four for the RZ. CZ gates require no parameter.
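To make the layer structure concrete, here is a minimal NumPy sketch of a single layer acting on a four-qubit statevector. It is not the PennyLane code used in this post. The helper names (apply_single_qubit_gate, apply_cz, apply_layer) are hypothetical, and the choice of applying CZ to every pair of qubits is an assumption, since the exact CZ pattern is only shown in the picture.

```python
import itertools
import numpy as np

N_QUBITS = 4

def rx(theta):
    """Matrix of the RX(theta) rotation."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def rz(theta):
    """Matrix of the RZ(theta) rotation."""
    return np.array([[np.exp(-1j * theta / 2), 0], [0, np.exp(1j * theta / 2)]])

def apply_single_qubit_gate(state, gate, qubit):
    """Apply a 2x2 gate to one qubit of a state stored as a (2, 2, 2, 2) tensor."""
    state = np.tensordot(gate, state, axes=([1], [qubit]))
    # tensordot puts the new axis first; move it back to the qubit's position.
    return np.moveaxis(state, 0, qubit)

def apply_cz(state, qubit_a, qubit_b):
    """Apply CZ: flip the sign of every amplitude where both qubits are 1."""
    state = state.copy()
    index = [slice(None)] * N_QUBITS
    index[qubit_a] = 1
    index[qubit_b] = 1
    state[tuple(index)] *= -1
    return state

def apply_layer(state, rx_angles, rz_angles):
    """One layer: U1 (RX on every qubit), then U2 (RZ on every qubit, then CZs)."""
    for q in range(N_QUBITS):
        state = apply_single_qubit_gate(state, rx(rx_angles[q]), q)
    for q in range(N_QUBITS):
        state = apply_single_qubit_gate(state, rz(rz_angles[q]), q)
    # Assumption of this sketch: a CZ gate on every pair of qubits.
    for a, b in itertools.combinations(range(N_QUBITS), 2):
        state = apply_cz(state, a, b)
    return state

# Start from |0000> and apply one layer with arbitrary angles.
zero_state = np.zeros((2,) * N_QUBITS, dtype=complex)
zero_state[(0,) * N_QUBITS] = 1.0
out = apply_layer(zero_state, [0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8])
print(out.reshape(-1).shape)  # (16,)
```

Stacking several layers then just means calling apply_layer repeatedly, each time with its own eight angles.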

Now we need to study the optimized cost, i.e. the minimum distance between the output statevector of the quantum circuit and a random statevector |phi>, which must stay the same for all the circuits.

My approach using Gradient-free optimizer

Here we will go through the implementation using PennyLane, from the necessary imports up to plotting the optimized cost as a function of the number of layers in the circuit. We will also look at how the cost decreased for each circuit over the course of the optimization process.

Necessary Imports

Although I have imported NumPy directly, it is recommended to import the wrapped version of NumPy provided by PennyLane.

Function to initialize theta values

The initialize_theta function takes only the number of layers as its argument. It initializes a random NumPy array with dimensions (layers*2, 4). For each layer there are two rows: one for the RX gate parameters and another for the RZ gate parameters. As the number of qubits is fixed, the number of columns is also fixed at 4.
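A minimal sketch of such an initializer is shown below. The sampling range [0, 2π) and the extra n_qubits and seed arguments are assumptions added for illustration. The rows alternate between RX and RZ angles, following the layer design described earlier.

```python
import numpy as np

def initialize_theta(layers, n_qubits=4, seed=None):
    """Return random angles of shape (2 * layers, n_qubits).

    Row 2*i holds the RX angles of layer i and row 2*i + 1 its RZ angles.
    The [0, 2*pi) range and the seed argument are assumptions of this sketch.
    """
    rng = np.random.default_rng(seed)
    return rng.uniform(0.0, 2.0 * np.pi, size=(2 * layers, n_qubits))

theta = initialize_theta(layers=3, seed=0)
print(theta.shape)  # (6, 4)
```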

Initializing random statevector |Phi>

As mentioned in the problem statement, the cost function should be calculated with respect to a random statevector. As the circuit consists of 4 qubits, there will be 2⁴, i.e. 16, elements in the statevector, which we randomly initialized using the random.randn function of NumPy.

Better approach: Generating |phi> this way is not actually valid. While randomly initializing, we are not making sure that the statevector is normalized, so the sum of the squared moduli of its elements is not equal to one. This led to a large distance between |phi> and the output statevector even after optimization.

So we should either normalize the vector after initializing it with a random function, or create a four-qubit circuit with random gates and take its output statevector as |phi>. The second approach is much more practical.
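The first option, normalizing a randomly drawn vector, could look roughly like this. It is only a sketch: the function name random_statevector, the complex Gaussian sampling, and the seed argument are assumptions, not the code from the original notebook.

```python
import numpy as np

def random_statevector(n_qubits=4, seed=None):
    """Draw a random complex vector and rescale it to unit norm."""
    rng = np.random.default_rng(seed)
    dim = 2 ** n_qubits
    vec = rng.standard_normal(dim) + 1j * rng.standard_normal(dim)
    return vec / np.linalg.norm(vec)

phi = random_statevector(seed=42)
print(phi.shape)  # (16,)
# The squared moduli of a valid statevector must sum to one.
print(np.isclose(np.sum(np.abs(phi) ** 2), 1.0))  # True
```

Fixing the seed keeps |phi> identical across all the circuits, as the problem statement requires.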

Creating the Circuit

The simulate_circuit() function takes the NumPy array of parameters and calculates the number of qubits from the dimensions of the array itself, so we don’t have to pass the number of qubits explicitly.

Although we could have called the circuit() function directly, PennyLane restricts the return type of a circuit function: it can return either a single measured observable value or a tuple of them. So we wrapped the circuit() function in another function named simulate_circuit(), which returns the statevector of the device instance using dev.state.

Better approach: We can request the state of the qubits as a measurement. qml.probs() returns the probability of each computational basis state (which discards the phases of the amplitudes), and newer PennyLane releases also provide qml.state(), which returns the statevector itself. You can learn more about that here.

Calculating the Cost

The cost of the circuit is denoted here by ‘distance’. We take the element-wise difference between the output statevector of the simulated quantum circuit, denoted by |Psi>, and the random vector |phi>, then return the sum of the squared moduli of all the elements.
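In NumPy, this distance can be sketched in a couple of lines. The two single-qubit vectors below are only toy inputs to show the values the function produces.

```python
import numpy as np

def distance(psi, phi):
    """Sum of squared moduli of the element-wise difference between psi and phi."""
    diff = np.asarray(psi) - np.asarray(phi)
    return float(np.sum(np.abs(diff) ** 2))

zero = np.array([1, 0], dtype=complex)
one = np.array([0, 1], dtype=complex)
print(distance(zero, zero))  # 0.0
print(distance(zero, one))   # 2.0
```

For two normalized statevectors this value always lies between 0 (identical states) and 4 (psi equal to minus phi), which is why an unnormalized |phi> makes the cost hard to interpret.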

Optimizing the circuit parameters

There are plenty of both gradient-based and gradient-free optimizers in PennyLane. In this problem we will use Rotosolve, a gradient-free optimizer provided in PennyLane. You can learn more about the built-in optimizers here.

The optimize() function takes the NumPy array of gate parameters and the number of iterations as arguments (and also the learning rate if you are using a gradient-based optimizer). There are two more optional arguments: print_distance and plot_distance. When set to True, they print and plot the cost at each iteration step.

callback: As the circuit becomes more computationally expensive with an increasing number of layers, I tried to cut off unnecessary extra iterations. Although the maximum number of iterations is set to 50 when calling the optimize() function, the optimization is terminated early if the cost of the circuit stays the same up to 2 decimal places for three consecutive optimization steps.
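The stopping rule can be sketched independently of any optimizer. In the snippet below, should_stop and run_with_early_stopping are hypothetical helper names, the toy step function merely stands in for one optimizer step, and "three consecutive steps" is interpreted as the last three recorded costs agreeing after rounding.

```python
def should_stop(costs, patience=3, decimals=2):
    """Return True when the last `patience` costs agree after rounding."""
    if len(costs) < patience:
        return False
    recent = [round(c, decimals) for c in costs[-patience:]]
    return len(set(recent)) == 1

def run_with_early_stopping(step, max_iterations=50):
    """Call step() (one optimization step returning the new cost) until the cost plateaus."""
    costs = []
    for _ in range(max_iterations):
        costs.append(step())
        if should_stop(costs):
            break
    return costs

def make_toy_step():
    """Toy stand-in for an optimizer step: the cost decays towards 0.5."""
    state = {"cost": 2.0}

    def step():
        state["cost"] = 0.5 + (state["cost"] - 0.5) / 2
        return state["cost"]

    return step

costs = run_with_early_stopping(make_toy_step())
print(len(costs) < 50)  # True: the loop stopped before the iteration limit
```

Keeping the check in its own function makes it easy to change the number of decimals or the patience without touching the optimization loop.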

Main Function

Because gradient-free optimizers are computationally expensive, I was able to simulate circuits with only up to 15 layers.

Plotting the optimized cost as a function of no. of layers

As can be observed from the plot, from a single-layer circuit up to 6 layers the optimized cost dropped sharply. From 6 to 10 layers the drop became quite small. Finally, from 10 to 15 layers the optimized cost stayed approximately the same. More importantly, from each circuit to the next one the optimized cost either decreased or remained the same.

No. of iterations as a function of layers in the circuit

Although the maximum number of iterations is set to 50, due to the terminating condition in the optimize() function none of the 15 circuits took more than 40 iterations. You can see the number of iterations each circuit took to reach its minimum cost in this plot.

Optimization steps of a single circuit

For each circuit, irrespective of the number of layers, the trend of the cost function over the course of the optimization process looks quite similar.

The above picture shows the cost at each optimization step for circuits with a single layer and with 12 layers. You can look at the corresponding plots for the rest of the circuits in this notebook.

Here is the GitHub repo, where you can get some more insights along with all the information discussed above 😎.

Qiskit is another popular framework for quantum computing. If you are more familiar with it, I recommend this blog by Mark Cunningham, where he describes his approach to the same problem using Qiskit.

Thanks a lot for reading. Please let me know if you have any corrections or suggestions. Please do 👏 if you like the post. Thanks in advance 😄