o Quantum Boltzmann Machines: Theory and Implementation
* Quantum neural networks, wrapping up discussions from last week (see notes from last week at URL:"https://github.com/CompPhysics/QuantumComputingMachineLearning/blob/gh-pages/doc/pub/week15/ipynb/week15.ipynb")
* Classical Boltzmann Machines (BMs)
* Restricted Quantum Boltzmann Machines (RQBM)
* Training Quantum Boltzmann Machines
* Practical Implementation with PennyLane
o Summary of course and work on project 2
!eblock

!split
===== Introduction =====
!bblock
Quantum Boltzmann Machines (QBMs) extend the
classical Boltzmann machine (a probabilistic neural network) into the
quantum domain. QBMs promise richer representations by leveraging
quantum superposition and entanglement, potentially capturing
correlations that classical models cannot.
!eblock
!bblock
In these notes, we first review
classical Boltzmann machines and restricted Boltzmann machines (RBMs).
Thereafter we
introduce QBMs and their restricted variant (RQBM), discuss training
methods, and illustrate practical implementation using
PennyLane.
!eblock

!split
===== Classical Boltzmann machines =====
!split
===== Quantum Boltzmann Machines (QBMs) =====
A Quantum Boltzmann Machine (QBM) extends a classical BM by replacing
each binary unit with a qubit and generalizing the energy to a quantum
Hamiltonian. Concretely, consider a system of $N$ qubits. A
convenient choice is the transverse-field Ising model (TFIM)
Hamiltonian. The model state is then the Gibbs state
$\rho = \exp(-\beta H)/Z$,
with inverse temperature $\beta$ (set to 1 as we did for standard Boltzmann machines) and partition function
$Z = \Tr(\exp(-\beta H))$. The probability of observing a visible
configuration $v$ is obtained by measuring $\rho$ in the computational
basis (and tracing out hidden qubits if any). In effect, the quantum
model can capture richer correlations via superposition and
entanglement.

Figure: Framework for quantum generative modeling. A parameterized quantum circuit $U(\theta)$ is trained so that its output distribution $q_\theta(x)=|\langle x|U(\theta)|0\rangle|^2$ matches the data distribution $p(x)$. The lower part of the figure shows the "Born machine" approach; in a Quantum Boltzmann Machine, one would instead prepare and measure a thermal (Gibbs) state of a Hamiltonian.
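
For reference, the transverse-field Ising Hamiltonian mentioned above is commonly written in the standard form below (the symbols $a_i$, $W_{ij}$ and $\Gamma_i$ for biases, couplings and transverse fields are our notation, not fixed by these notes):

!bt
\[
H = -\sum_i a_i \sigma_i^z - \sum_{i<j} W_{ij}\,\sigma_i^z \sigma_j^z - \sum_i \Gamma_i \sigma_i^x .
\]
!et

For $\Gamma_i = 0$ the Hamiltonian is diagonal in the computational basis and the model reduces to a classical Boltzmann machine.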
!split
===== Classical Boltzmann Machines =====

A classical Boltzmann machine (BM) is an Ising model with binary units
$v_i, h_j$. Its energy can be written
$E(v,h) = - \sum_i a_i v_i - \sum_j b_j h_j - \sum_{i<j} W_{ij} x_i x_j$
(where $x$ runs over all units).
The restricted variant (RBM) enforces no visible-visible or
hidden-hidden couplings, so only $v$-$h$ interactions remain.
Training maximizes the likelihood of the training data by adjusting
$\{a,b,W\}$. In practice this involves computing the "positive phase"
(expectation under the data) and "negative phase" (expectation under
the model), typically by Gibbs sampling or contrastive divergence.
Despite the simplification of the restricted architecture, exact
training of RBMs remains computationally demanding due to the cost of
sampling the model distribution, motivating the exploration of quantum
accelerations.
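
The energy function and the two phases above can be made concrete with a small, self-contained numpy sketch (the parameter values and the 2-visible/1-hidden layout are illustrative choices, not from the course material): it enumerates all configurations of a toy RBM and builds the exact Boltzmann distribution at $\beta = 1$.

```python
import numpy as np
from itertools import product

# Toy RBM: 2 visible and 1 hidden unit; illustrative parameter values
a = np.array([0.1, -0.2])      # visible biases
b = np.array([0.3])            # hidden bias
W = np.array([[0.5], [-0.4]])  # visible-hidden couplings (no v-v or h-h terms)

def energy(v, h):
    # E(v,h) = -sum_i a_i v_i - sum_j b_j h_j - sum_ij W_ij v_i h_j
    return -(a @ v + b @ h + v @ W @ h)

# Exact Boltzmann distribution over all joint configurations (beta = 1)
states = [(np.array(v, float), np.array(h, float))
          for v in product([0, 1], repeat=2)
          for h in product([0, 1], repeat=1)]
weights = np.array([np.exp(-energy(v, h)) for v, h in states])
probs = weights / weights.sum()  # joint p(v, h)
```

This brute-force enumeration is exponential in the number of units, which is exactly why Gibbs sampling or contrastive divergence is used in practice.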
!split
===== Quantum Boltzmann Machines =====
In a Quantum Boltzmann Machine (QBM), the classical energy is replaced
by a Hamiltonian $H$ acting on qubits. The model distribution over
classical bitstrings $v$ is given by the diagonal of the quantum Gibbs
state $\rho = e^{-H}/Z$. A straightforward choice is a stoquastic
Hamiltonian that is diagonal in the computational basis
(e.g. involving only Pauli-$Z$ operators), which yields a probability
distribution very similar to a classical BM. More generally one can
include non-commuting terms, such as transverse-field terms known
from the transverse-field Ising Hamiltonian. However,
non-commutativity makes exact training harder, so many proposals use
either special Hamiltonians or variational approximations.
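
The contrast between a diagonal (classical-like) Hamiltonian and one with non-commuting terms can be checked directly. The following self-contained numpy sketch (two qubits, arbitrary illustrative coefficients; not from the course material) computes the diagonal of the Gibbs state $\rho = e^{-H}/Z$ with and without a transverse field.

```python
import numpy as np

# Pauli matrices and a Kronecker-product helper
I2 = np.eye(2)
Z = np.diag([1.0, -1.0])
X = np.array([[0.0, 1.0], [1.0, 0.0]])

def kron_all(ops):
    out = np.array([[1.0]])
    for op in ops:
        out = np.kron(out, op)
    return out

# Two-qubit Hamiltonian: diagonal part plus an optional transverse field
J, h1, h2, Gamma = 0.7, 0.2, -0.3, 0.5
H_diag = -J * kron_all([Z, Z]) - h1 * kron_all([Z, I2]) - h2 * kron_all([I2, Z])
H_full = H_diag - Gamma * (kron_all([X, I2]) + kron_all([I2, X]))

def gibbs_diagonal(H):
    # Diagonal of rho = e^{-H}/Z in the computational basis (beta = 1)
    w, V = np.linalg.eigh(H)
    rho = (V * np.exp(-w)) @ V.T
    return np.diag(rho) / np.trace(rho)

p_classical = gibbs_diagonal(H_diag)  # same as a classical Boltzmann distribution
p_quantum = gibbs_diagonal(H_full)    # transverse field shifts the probabilities
```

For the diagonal Hamiltonian, p_classical coincides with $\exp(-E)/Z$ for the classical energies on the diagonal; the transverse field produces a genuinely different distribution.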

!split
===== Restricted QBM (RQBM) =====
A Restricted Quantum Boltzmann Machine (RQBM) (also called Quantum RBM
or QRBM) enforces a bipartite structure analogous to the classical
RBM: no hidden-hidden interactions, and possibly limited
hidden-visible connectivity. The simplest RQBM Hamiltonian can be
This is directly analogous to the classical RBM gradient: the update
for each parameter is proportional to the difference between its
Figure: Quantum vs. classical training loop for RBMs.

!split
===== Parameter Optimization and Variational Techniques =====
Given the gradient above, one can optimize $\theta$ by standard
gradient-based methods (SGD, Adam, etc.). In a gate-based setting, we
with provably polynomial complexity under realistic conditions.

!split
===== Implementation with PennyLane =====

and entangling gates that respect the bipartite structure. Below is
illustrative code (in Python) using PennyLane’s default.qubit
simulator.
!bc pycod
import pennylane as qml
import numpy as np

# Number of visible and hidden qubits (values assumed to match the text:
# 2 visible bits and one hidden qubit)
n_v, n_h = 2, 1

dev = qml.device("default.qubit", wires=n_v + n_h)

@qml.qnode(dev)
def circuit(params):
    # Single-qubit rotations, one parameter per qubit
    for i in range(n_v + n_h):
        qml.RY(params[i], wires=i)
    # Entangling gates respecting the bipartite visible-hidden structure
    for i in range(n_v):
        qml.CNOT(wires=[i, n_v])
    # Return probability distribution on visible wires
    return qml.probs(wires=list(range(n_v)))
!ec
This circuit takes a parameter vector params of length n_v+n_h and returns the probabilities $q_\theta(v)$ of measuring each visible bitstring $v$. Notice that we measure only the visible wires (the wires=list(range(n_v)) in qml.probs marginalizes out the hidden qubit).
Next, we train this model to match a target dataset distribution.
Suppose our data has distribution target = [p(00), p(01), p(10),
p(11)]. We can define the (classical) loss as the Kullback-Leibler
divergence $D_{\rm KL}(p_{\rm data}\Vert q_\theta)$ or simply the
negative log-likelihood. Then we update params by gradient descent.
PennyLane's automatic differentiation can compute gradients via the
parameter-shift rule, but we show an explicit parameter-shift
computation for demonstration:
!bc pycod
# Example target distribution over 2 visible bits
target = np.array([0.3, 0.2, 0.1, 0.4]) # must sum to 1
def loss(params):
    probs = circuit(params)  # model probabilities for visible states