Benchmarks
Benchmark results
In our release publication we compare Qibo's performance with that of other publicly available libraries for quantum circuit simulation, and we provide results from different hardware configurations. For convenience, the results can be found in the following examples for various tasks related to circuit or adiabatic evolution simulation:
The libraries used in these benchmarks are shown in the table below with their respective default simulation precision and supported hardware configurations.
| Library | Precision | Hardware |
|---|---|---|
| | single/double | multi-thread CPU, GPU, multi-GPU |
| | single | single-thread CPU |
| | single | single-thread CPU |
| | double | single-thread CPU |
| | double | single-thread CPU |
| | double | multi-thread CPU |
| | single | multi-thread CPU, GPU |
| | double | multi-thread CPU, GPU |
The default precision and hardware configuration are used for each library. Single-thread Qibo numbers were obtained using the taskset utility to restrict the number of threads.
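As a rough illustration of what such a restriction looks like, the sketch below pins the current process to a single CPU core from within Python. It is not taken from the Qibo benchmark scripts: the helper name pin_to_single_cpu is hypothetical, and os.sched_setaffinity is only available on Linux, so the helper does nothing on other platforms. Note that pinning confines all threads of the process to one core; it does not change how many threads a library creates.

```python
import os


def pin_to_single_cpu():
    """Restrict the current process to one CPU core, similar to `taskset -c <cpu>`.

    Hypothetical helper for illustration. Returns the set of CPUs the process
    may run on afterwards, or None if the affinity API is unavailable.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # affinity API not available on this platform
    allowed = sorted(os.sched_getaffinity(0))  # 0 refers to the current process
    os.sched_setaffinity(0, {allowed[0]})      # keep only the first allowed core
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    print("CPUs available to this process:", pin_to_single_cpu())
```

Calling the helper before the simulation starts has an effect comparable to launching the script under taskset with a single core.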
All results presented in the above pages were produced on an NVIDIA DGX Station. The machine has 4x NVIDIA Tesla V100 GPUs with 32 GB of GPU memory each and an Intel Xeon E5-2698 v4 CPU at 2.2 GHz (20 cores/40 threads) with 256 GB of RAM. The operating system is the default Ubuntu 18.04 LTS with CUDA/nvcc 10.1, TensorFlow 2.2.0 and g++ 7.5.
The following sections describe how to run Qibo benchmarks using the scripts found at: https://github.com/QuantumTII/qibo/tree/master/examples/benchmarks.
How to run circuit benchmarks?
The main benchmark script is main.py. It can be executed as python main.py (OPTIONS), where (OPTIONS) can be any of the following options:
nqubits (int): Number of qubits in the circuit.
type (str): Type of benchmark circuit. Available circuit types are shown in the next section. Some circuit types support additional options, which are described below.
backend (str): Qibo backend to use for the calculation. Available backends are "custom", "matmuleinsum", "defaulteinsum", "numpy_defaulteinsum" and "numpy_matmuleinsum". "custom" is the default backend.
precision (str): Complex number precision to use for the benchmark. Available options are 'single' and 'double'.
device (str): TensorFlow device to use for the benchmarks. Example: device /GPU:0 or device /CPU:0.
accelerators (str): Devices to use for distributed execution of the circuit. Example: accelerators 1/GPU:0,1/GPU:1 will distribute the execution on two GPUs. The coefficient of each device denotes the number of times to reuse this device.
memory (int): Limits the GPU memory used for execution. If no limit is given, TensorFlow uses all available GPU memory by default.
nshots (int): Number of measurement shots. This benchmarks the sampling of frequencies, not of individual shot samples. If not given, no measurements are performed and the benchmark terminates once the final state vector is found.
compile (bool): If used, the circuit will be compiled using tf.function. Note that custom operators do not support compilation. Default is False.
fuse (bool): If used, circuit gates will be fused for faster execution of some circuit types. Default is False.
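The options above can be pictured with a small argparse sketch. This is not the actual main.py: it assumes the options are passed as double-dash flags (for example --nqubits), which the text above does not spell out, and the defaults chosen for type and precision are placeholders. The helper parse_accelerators is hypothetical and only illustrates how a string such as 1/GPU:0,1/GPU:1 maps devices to reuse counts.

```python
import argparse


def parse_accelerators(spec):
    """Turn '1/GPU:0,1/GPU:1' into {'/GPU:0': 1, '/GPU:1': 1} (hypothetical helper)."""
    if spec is None:
        return None
    devices = {}
    for item in spec.split(","):
        # Split the leading coefficient from the device name at the first slash.
        count, sep, name = item.strip().partition("/")
        if not sep or not count.isdigit():
            raise ValueError(f"Cannot parse accelerator entry: {item!r}")
        device = "/" + name
        devices[device] = devices.get(device, 0) + int(count)
    return devices


def build_parser():
    """Parser mirroring the documented options (double-dash flags are an assumption)."""
    parser = argparse.ArgumentParser(description="Circuit benchmark options (sketch)")
    parser.add_argument("--nqubits", type=int, required=True)
    parser.add_argument("--type", type=str, default="qft")  # assumed default
    parser.add_argument("--backend", type=str, default="custom")
    parser.add_argument("--precision", type=str, default="double",  # assumed default
                        choices=["single", "double"])
    parser.add_argument("--device", type=str, default=None)
    parser.add_argument("--accelerators", type=str, default=None)
    parser.add_argument("--memory", type=int, default=None)
    parser.add_argument("--nshots", type=int, default=None)
    parser.add_argument("--compile", action="store_true")
    parser.add_argument("--fuse", action="store_true")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--nqubits", "20", "--type", "qft", "--accelerators", "1/GPU:0,1/GPU:1"]
    )
    print(args.nqubits, args.type, parse_accelerators(args.accelerators))
```

Boolean options are modelled as store_true switches here because the text describes them as taking effect "if used".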
When a benchmark is executed, the total simulation time is printed in the terminal once the simulation finishes. Optionally, execution times can be saved by passing the filename (str) flag. All benchmark details are logged in a Python dictionary and saved in a text file using json.dump. The logs include circuit creation and simulation times. If the given filename already exists it will be updated, otherwise it will be created.
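The update-or-create behaviour described above might look roughly like the following sketch. It assumes that the file holds a JSON list with one dictionary per run; the actual layout used by the benchmark scripts may differ, and the function name update_logs and the dictionary keys are hypothetical.

```python
import json
import os
import tempfile


def update_logs(filename, entry):
    """Append one benchmark entry to a JSON log, creating the file if needed."""
    if os.path.isfile(filename):
        with open(filename, "r") as f:
            logs = json.load(f)  # existing file: extend its contents
    else:
        logs = []                # no file yet: start a fresh log
    logs.append(entry)
    with open(filename, "w") as f:
        json.dump(logs, f)
    return logs


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "benchmark_logs.json")
    # Hypothetical keys; the timing values here are placeholders, not measurements.
    update_logs(path, {"nqubits": 20, "creation_time": 0.0, "simulation_time": 0.0})
    print(update_logs(path, {"nqubits": 21, "creation_time": 0.0, "simulation_time": 0.0}))
```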
Available circuit types
As explained above, the circuit to be used in the benchmarks can be selected using the type flag. This accepts one of the following options:
qft: Circuit for the Quantum Fourier Transform. The circuit contains SWAP gates that rearrange output qubits to their original input order.
variational: Example of a variational circuit. Contains a layer of parametrized RY gates followed by a layer of entangling CZ gates. The parameters of the RY gates are sampled randomly from 0 to 2π. Supports the following options:
    nlayers: Total number of layers.
optvariational: Same as variational, using the qibo.abstractions.gates.VariationalLayer gate. This gate optimizes execution by fusing the parametrized gates with the entangling gates before applying them to the state vector. Supports the following options:
    nlayers: Total number of layers.
onequbitgate: Single one-qubit gate applied to all qubits. Supports the following options:
    gatetype: Which one-qubit gate to use.
    nlayers: Total number of layers.
    theta: Value of the free parameter (for parametrized gates).
twoqubitgate: Single two-qubit gate applied to all qubits. Supports the following options:
    gatetype: Which two-qubit gate to use.
    nlayers: Total number of layers.
    theta (and/or phi): Value of the free parameter (for parametrized gates).
ghz: Circuit that prepares the GHZ state.
supremacy: Circuit inspired by the quantum supremacy experiment. Contains alternating layers of random one-qubit gates and CU1 gates. One-qubit gates are randomly selected from the set {RX, RY, RZ} and have random phases. The total number of layers is controlled using nlayers. Supports the following options:
    nlayers: Total number of layers.
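To make the qft entry more concrete, here is a minimal pure-Python sketch of a textbook QFT circuit with the final SWAP gates, together with a tiny state-vector simulator used to check it. It does not use the Qibo API: the gate tuples, function names and the simulator are illustrative assumptions, with qubit 0 taken as the most significant bit.

```python
import cmath
import math


def qft_gates(nqubits):
    """Gate list for the QFT, ending with SWAPs that restore the qubit order."""
    gates = []
    for target in range(nqubits):
        gates.append(("H", target))
        for control in range(target + 1, nqubits):
            theta = math.pi / 2 ** (control - target)
            gates.append(("CU1", control, target, theta))
    for q in range(nqubits // 2):
        gates.append(("SWAP", q, nqubits - 1 - q))
    return gates


def apply_gates(state, gates, nqubits):
    """Apply the gate list to a state vector given as a list of amplitudes."""
    def mask_of(q):
        return 1 << (nqubits - 1 - q)  # qubit 0 is the most significant bit

    for gate in gates:
        new = [0j] * len(state)
        if gate[0] == "H":
            m = mask_of(gate[1])
            for i, amp in enumerate(state):
                sign = -1 if i & m else 1
                new[i & ~m] += amp / math.sqrt(2)
                new[i | m] += sign * amp / math.sqrt(2)
        elif gate[0] == "CU1":
            mc, mt, theta = mask_of(gate[1]), mask_of(gate[2]), gate[3]
            for i, amp in enumerate(state):
                both_set = (i & mc) and (i & mt)
                new[i] = amp * cmath.exp(1j * theta) if both_set else amp
        elif gate[0] == "SWAP":
            m1, m2 = mask_of(gate[1]), mask_of(gate[2])
            for i, amp in enumerate(state):
                # Flip both bits only when they differ, which swaps them.
                j = i ^ (m1 | m2) if bool(i & m1) != bool(i & m2) else i
                new[j] = amp
        state = new
    return state


def qft_of_basis_state(nqubits, x):
    """Run the QFT circuit on the computational basis state |x>."""
    state = [0j] * (2 ** nqubits)
    state[x] = 1.0 + 0j
    return apply_gates(state, qft_gates(nqubits), nqubits)
```

For a basis state |x> on n qubits, the output amplitude of |y> should equal exp(2πi·x·y/2^n)/sqrt(2^n), which is a convenient check of the circuit including the final SWAP gates.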
How to run VQE benchmarks?
A VQE optimization benchmark can be run using vqe.py. This supports the following options:
nqubits (int): Number of qubits in the circuit.
nlayers (int): Total number of layers in the circuit.
method (str): Optimization method.
maxiter (int): Maximum number of iterations for the optimizer.
varlayer: If used, the circuit will be created using the qibo.abstractions.gates.VariationalLayer gate, which fuses one- and two-qubit gates for efficiency.
The script will perform the VQE minimization and will print the optimal energy found and its difference with the exact ground state energy. It will also show the total execution time.
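The kind of output described above can be illustrated with a toy single-qubit VQE written in plain Python. This is only a sketch and is unrelated to what vqe.py actually does: the Hamiltonian (Pauli Z), the one-parameter RY ansatz, the gradient-descent optimizer and all function names are assumptions chosen to keep the example self-contained.

```python
import math
import time


def energy(theta):
    """Expectation <psi|Z|psi> for |psi> = RY(theta)|0> = (cos(theta/2), sin(theta/2))."""
    a, b = math.cos(theta / 2), math.sin(theta / 2)
    return a * a - b * b


def toy_vqe(theta=0.5, maxiter=200, learning_rate=0.4):
    """Minimize the energy with gradient descent; returns (energy, error, seconds)."""
    start = time.perf_counter()
    for _ in range(maxiter):
        # The parameter-shift rule gives the exact gradient for an RY rotation.
        grad = 0.5 * (energy(theta + math.pi / 2) - energy(theta - math.pi / 2))
        theta -= learning_rate * grad
    elapsed = time.perf_counter() - start
    exact = -1.0  # lowest eigenvalue of Pauli Z
    best = energy(theta)
    return best, best - exact, elapsed


if __name__ == "__main__":
    best, error, elapsed = toy_vqe()
    print("Optimal energy:", best)
    print("Difference from exact ground state energy:", error)
    print("Total execution time (s):", elapsed)
```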
How to run QAOA benchmarks?
A QAOA optimization benchmark can be run using qaoa.py. This supports the following options:
nqubits (int): Number of qubits in the circuit.
nangles (int): Number of variational parameters in the QAOA ansatz. The parameters are initialized from a uniform distribution in [0, 0.1].
trotter (bool): If True, the Trotter decomposition is used to apply the exponential operators.
solver (str): Solver to use for applying the exponential operators.
method (str): Optimization method.
maxiter (int): Maximum number of iterations for the optimizer.
The script will perform the QAOA minimization and will print the optimal energy found and its difference with the exact ground state energy. It will also show the total execution time.
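The parameter initialization mentioned for nangles can be sketched as follows. The function name and the seed argument are hypothetical and only serve to make the example reproducible; qaoa.py may draw its initial values differently.

```python
import random


def initial_qaoa_parameters(nangles, seed=None):
    """Draw `nangles` initial variational parameters uniformly from [0, 0.1]."""
    rng = random.Random(seed)  # local generator, so the global state is untouched
    return [rng.uniform(0.0, 0.1) for _ in range(nangles)]


if __name__ == "__main__":
    print(initial_qaoa_parameters(4, seed=0))
```

Starting from small angles keeps the initial exponential operators close to the identity, which is a common choice for this kind of ansatz.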