1 Histogram Toolkit Tutorial

This notebook contains tutorial material for the histogram toolkit provided as part of DL-MONTE. The aim is to provide an introduction to the idea of reweighting observations and histograms based on results from a simulation.

For exposition, we choose the two-dimensional Ising model. A simple implementation of the Ising model is provided with the toolkit so that results for analysis can be generated by the reader.

  • All the results used for analysis are generated in the first section of the notebook, so users should work through this section first.

Users interested in DL-MONTE per se should look at the accompanying notebook for the introduction to the toolkit in the DL-MONTE context.

In writing this notebook, a number of sources have been drawn upon, including

  1. A Guide to Monte Carlo Simulations in Statistical Physics, D.P. Landau and K. Binder (Cambridge University Press).

  2. Online lecture notes by Kari Rummukainen of Helsinki University available at http://www.helsinki.fi/~rummukai/lectures/montecarlo_oulu/

This work was funded by the UK EPSRC under the ARCHER eCSE Programme.

Note

Please note that this notebook uses print() and so requires Python 3 (the kernel version is usually displayed at the top right of the notebook screen).

You can also check the current version via:

In [ ]:
import sys
print (sys.version_info)



1.1 Python and the Histogram Toolkit

Practical examples of the histogram toolkit are provided throughout the notebook, so the reader will need to know some Python.

Basic requirements beyond the toolkit are limited to the numpy, scipy, and matplotlib packages (this notebook only needs numpy and matplotlib).

The following cell imports all the requirements:

In [ ]:
%matplotlib inline
import numpy
import matplotlib.pyplot as plt

import htk.util
import htk.histogram    # reweighter classes used later in the notebook
import htk.parameter    # Parameter class used later in the notebook
from htk.sources.ising import IsingModel
from htk.sources.isingdata import IsingModelData

Not found?

If you see an error saying htk is not found, then you may need to add the toolkit directory to PYTHONPATH. This can be achieved by running

export PYTHONPATH=$DL_MONTE_HOME/htk:$PYTHONPATH

in the shell, where DL_MONTE_HOME is the location of the DL MONTE installation, and then restarting the notebook.
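
Alternatively, if DL_MONTE_HOME is set in the environment seen by the notebook, the toolkit directory can be appended to the search path from within a cell; a minimal sketch (adjust the path to suit your installation):

In [ ]:
import os
import sys
# Make the toolkit importable without restarting the notebook;
# assumes the DL_MONTE_HOME environment variable is set
sys.path.append(os.path.join(os.environ["DL_MONTE_HOME"], "htk"))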


2. Ising Model Example

Here's an example based on the Ising Model. Suppose we have an Ising model in two dimensions with Hamiltonian:

${\cal H} = -J \sum_{\langle i,j\rangle} \sigma_i \sigma_j - \mu H \sum_i \sigma_i$

where $J$ is a coupling constant, $\mu$ is the magnetic moment per spin, $H$ is the external field and the spins are $\sigma_i = \pm 1$. The interactions involve the four nearest neighbours of each site, denoted $\langle i,j\rangle$. We assume there are a total of $N$ spins and the temperature is $k_b T$. From this point on we will set $\mu = 1$ and, writing $S = \sum_{\langle i,j\rangle} \sigma_i \sigma_j$ and $M = \sum_i \sigma_i$, the Hamiltonian becomes

$ {\cal H} = -JS - HM. $

The free energy of the system is written $F = -k_b T \ln Z$, where $Z$ is the partition function. The internal energy is $U = -T^2\partial(F/T)/\partial T$ where $F$ may depend on other parameters (assumed to be held constant in the derivative). The heat capacity $C_V$ is the rate of change $(\partial U / \partial T)_V$. Fluctuations in the energy can be used to obtain the heat capacity $C_V$ via

$$ C_V = \frac{\langle{\cal H}^2\rangle - \langle{\cal H}\rangle^2}{k_b T^2} = \frac{1}{k_b T^2}\big\langle({\cal H} - \langle{\cal H}\rangle )^2\big\rangle_{NVT}.$$

To compare systems of different sizes, it is often useful to consider the "specific heat" $C_V/N$.

Likewise, fluctuations in the magnetisation $M = \sum_i \sigma_i$ can be used to obtain the isothermal susceptibility $\chi = (\partial M / \partial H)_V$ via

$$\chi = \frac{\langle M^2\rangle - \langle M \rangle^2}{k_b T}.$$

The critical temperature for the two-dimensional Ising model is known exactly to be at $kT_c/J = 2 / \ln(1 + \sqrt{2})$, or $kT_c/J \approx 2.2692$.
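
As a first illustration of how these fluctuation formulas are used later in the notebook, here is a minimal numpy sketch (the array and function names are illustrative; the toolkit provides equivalent utilities such as htk.util.nvt_cv):

In [ ]:
import numpy

def specific_heat(e_total, kt, nsites):
    # C_V/N from fluctuations of the total energy samples (k_b = 1);
    # numpy's var() returns <E^2> - <E>^2 directly
    return e_total.var()/(nsites*kt*kt)

def susceptibility(m_total, kt):
    # chi from fluctuations of the total magnetisation samples
    return m_total.var()/kt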

2.1 Running the Ising Model

The histogram toolkit contains a 2-d IsingModel class which can be used to generate some test data. To use the class, import via

In [ ]:
from htk.sources.ising import IsingModel

The interface to the IsingModel class includes methods for initialisation and execution:

In [ ]:
help(IsingModel)

The following sections discuss how to generate the test results used later in the analysis. Some of the longer Monte Carlo runs may take an unreasonably long time on some hardware, so it may be desirable to use fewer Monte Carlo iterations, particularly for the larger systems.

2.1.1 Running from different initial conditions

We can examine the course of the observable quantities as a function of the number of Monte Carlo sweeps for different initial conditions. Ultimately, the initial condition should not influence the equilibrium result.

Take the example of a 16x16 system at a relatively low temperature kT = 1.0, which we can initialise as either 'hot' or 'cold'. In each case we run the system for 100 Monte Carlo sweeps (nsweeps below) and store the results in a file. Each run should take only a few seconds to execute:

In [ ]:
nlen = 16
J = 1.0
H = 0.0
kT = 1.0
seed = 37
nsweeps = 100
modelh = IsingModel(nlen, J, H, kT, seed, 'hot')
modelh.run(nsweeps, file = 'ising016_ht_1.0_s37.dat')

Instantiate a new model with the same parameters, and use a 'cold' start:

In [ ]:
modelc = IsingModel(nlen, J, H, kT, seed, 'cold')
modelc.run(nsweeps, file = 'ising016_cd_1.0_s37.dat')

Now we need to reload these data from file into an object which can be manipulated by the histogram toolkit. This is done via the IsingModelData class:

In [ ]:
from htk.sources.isingdata import IsingModelData 
datah = IsingModelData(filename = 'ising016_ht_1.0_s37.dat')
datac = IsingModelData(filename = 'ising016_cd_1.0_s37.dat')

Using these objects, we can extract the total energy and the magnetisation as a function of MC sweep (both of which have been recorded in the file), and plot them using standard pyplot methods. Here we plot the total energy and the absolute value of the magnetisation |M| as a function of MC sweep.

Data are accessed via the data attribute of the relevant observable, identified by its key ('e' for energy, and 'm' for magnetisation; the keys are not case sensitive).

It can be seen that for this system size and temperature, it takes about 10-20 MC sweeps for the hot and cold starts to converge to a consistent state.

In [ ]:
%matplotlib inline
import numpy
import matplotlib.pyplot as plt

plt.figure(figsize=(10.,3.0))
plt.subplot(1,2,1)
plt.xlabel("MC sweep")
plt.ylabel("Total energy")
plt.axis([0,nsweeps,-2.5,1.2])
plt.plot(datac.observable('e').data[:], 'bs')
plt.plot(datah.observable('e').data[:], 'ro')

plt.subplot(1,2,2)
plt.xlabel("MC sweep")
plt.ylabel("|Magnetisation|")
plt.axis([0,nsweeps,0.0,1.2])
plt.plot(numpy.abs(datac.observable('M').data), 'bs')
plt.plot(numpy.abs(datah.observable('M').data), 'ro')
plt.show()

We can try again at a higher temperature. We make use of a convenience method in the IsingModelData class (actually the superclass method summary_time_series()) to display time series of the observables in each case. At the higher temperature, comparison will show that the results of the 'hot' and 'cold' starts look very similar.

In [ ]:
kT = 3.0
nsweeps = 500
model3h = IsingModel(nlen, J, H, kT, seed, 'hot')
model3c = IsingModel(nlen, J, H, kT, seed, 'cold')
model3h.run(nsweeps, file = "ising016_ht_3.0_s37.dat")
model3c.run(nsweeps, file = "ising016_cd_3.0_s37.dat")
In [ ]:
data3h = IsingModelData(filename = "ising016_ht_3.0_s37.dat")
data3c = IsingModelData(filename = "ising016_cd_3.0_s37.dat")
data3h.summary_time_series(plt, 'e')
data3c.summary_time_series(plt, 'e')
2.1.2 Generating the Test Results

A number of runs of the Ising model are required to generate the data files used in the analysis. These are generated by the commands below.

Two sets are provided. The "short" set can be generated relatively quickly and provides a reasonable overview of the results. However, the short length of the runs means that the statistics are not really satisfactory.

A "medium" set is also described, which provides better statistics at the cost of increased run time.

All the simulations used here have linear size $4$, $8$ or $16$, and have $J=1$ and $H=0$.

Short set
In [ ]:
J = 1.0; H = 0.0
model = IsingModel(4, J, H, 2.1, 37, 'hot')
model.run(10000, file = "short_u004_2.1_s37_h.dat")
In [ ]:
model = IsingModel(8, J, H, 2.1, 37, 'hot')
model.run(10000, file = "short_u008_2.1_s37_h.dat")
In [ ]:
model = IsingModel(16, J, H, 2.1, 37, 'hot')
model.run(10000, file = "short_u016_2.1_s37_h.dat")
Medium set
In [ ]:
# May take about a minute
model = IsingModel(4, J, H, 2.2692, 39, "hot")
model.run(200000, file = "medium_u004_2.2692_s39_h.dat")
In [ ]:
# May take a few minutes
model = IsingModel(8, 1.0, 0.0, 2.2692, 39, "hot")
model.run(200000, file = "medium_u008_2.2692_s39_h.dat")
In [ ]:
# May take 20-30 minutes
model = IsingModel(16, 1.0, 0.0, 2.2692, 39, "hot")
model.run(200000, file = "medium_u016_2.2692_s39_h.dat")


3. Reweighting

Here we use some of the data generated in the previous section to perform some basic reweighting operations. A number of features of the Histogram Toolkit are also discussed.

3.1 Observables

Suppose we have some data from a previous Ising Model simulation. We can load it using the data source class, and review its contents:

In [ ]:
data1 = IsingModelData(filename = "short_u008_2.1_s37_h.dat")
table = data1.to_table()
print(table)

Internally, the parameters are stored as Parameter objects, each having a Label and a value. Parameters with a given label can be extracted from the data set container:

In [ ]:
p = data1.parameter("kT")
print("The temperature is: ", p)

Note that the Parameter is a number, but also has an associated label:

In [ ]:
print(p.label)

Likewise, observables data are stored in an Observable object, which may be accessed via, e.g.,

In [ ]:
t = data1.observable("t")
print(t)

This shows that the time (something of a misnomer for a MC simulation) observations are stored as an array (a numpy.ndarray, in fact), here with values between 2,020 and 400,000 MC sweeps.

We can use the labels --- which are not case sensitive --- to look at the time series of the various observables. For example, use "M" or "m" to look at the magnetisation:

In [ ]:
data1.summary_time_series(plt, 'm')

3.1.1 Autocorrelation and Errors

The normalised autocorrelation for quantity $A(t)$ is defined as:

$$ \phi(t) = \frac{\langle A(0)A(t)\rangle - \langle A\rangle^2}{\langle A^2\rangle - \langle A\rangle^2} $$

The normaliser $\langle A^2\rangle - \langle A\rangle^2$ is chosen so that $\phi(t = 0) = 1$. This can be used to check that observations are not unduly correlated. For example, looking at the energy in an output with observations recorded at every MC sweep, we can see:

In [ ]:
data2 = IsingModelData(filename = "short_u004_2.1_s37_h.dat")
data2.summary_autocorrelation(plt, 'e')
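
For reference, the normalised autocorrelation and an integrated correlation time can also be estimated directly from the raw time series. A minimal numpy sketch (a naive estimator truncated at a fixed maximum lag; the toolkit's own routines may differ in detail):

In [ ]:
def autocorrelation(a, tmax):
    # phi(t) for t = 0, ..., tmax-1 from a 1-d array of observations a
    da = a - a.mean()
    var = numpy.mean(da*da)
    return numpy.array([numpy.mean(da[:a.size - t]*da[t:])/var
                        for t in range(tmax)])

phi = autocorrelation(data2.observable('e').data, 100)

# Integrated correlation time, in units of the observation interval
tau_int = 0.5 + numpy.sum(phi[1:])
print(tau_int)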

We can get a numerical estimate of the correlation time $\tau$ by integrating the autocorrelation function $\phi(t)$. The utility method autocorrelation_time() used below also returns the "time step" $\delta t$ (in Monte Carlo sweeps).

In [ ]:
(dt, tau) = data2.autocorrelation_time("E")
print(dt, tau, (1.0 + 2.0*tau/dt))

The variance of the estimated mean of an observable quantity is then inflated by the factor

$(1 + 2\tau / \delta t)$

which can be large. See Landau and Binder 4.2.4.
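
For example, a sketch of the corresponding error estimate on the mean (per-site) energy for the data set above; the same construction is used again for the medium data set later in the notebook:

In [ ]:
e = data2.observable('E').data
(dt, tau) = data2.autocorrelation_time("E")
# Statistical error on the mean, allowing for autocorrelation
de = numpy.sqrt((e.var()/e.size)*(1.0 + 2.0*tau/dt))
print(de)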

Different observable quantities may have different characteristic correlation times:

In [ ]:
data2.summary_autocorrelation(plt, 'M')

Ultimately, the autocorrelation time can depend on many factors, including the observable, the temperature, update method in the MC algorithm, and so on.

Histogram

A utility method is provided to plot a histogram of observable measurements, e.g., for the magnetisation

In [ ]:
data1.summary_histogram(plt, "m", 21)

3.2 Observable reweighting

Suppose we have a satisfactory Monte Carlo simulation for the Ising Model in the NVT ensemble which has been run with a given set of parameters near the critical point.

In [ ]:
data04 = IsingModelData()
data04.load(filename = "medium_u004_2.2692_s39_h.dat")
table = data04.to_table()
print(table)

We will need the system size and the temperature from the parameters, and we will also use the total energy (not energy per site) to find reweighted values of the specific heat:

In [ ]:
volume = data04.parameter("V")
kT = data04.parameter("kT")

The data arrays are extracted as follows. We will require the total energy (not that per site), so multiply by the volume. We also generate a set of values for observations $E_i^2$:

In [ ]:
e0 = data04.observable('E').data
e1 = volume*e0[:]
e2 = e1[:]*e1[:]

We reweight a series of observables in NVT via:

$$ \langle O\rangle_{\beta'} = \frac{\sum_i O_i \exp[-(\beta' - \beta)E_i]}{\sum_i \exp[-(\beta' - \beta)E_i]}.$$
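
For orientation, this formula can be implemented directly in a few lines of numpy; a minimal sketch (the toolkit's Reweighter objects, used below, do the equivalent job):

In [ ]:
def reweight_nvt_sketch(obs, e_total, kt_old, kt_new):
    # Single-histogram reweighting of observations obs (O_i) with
    # total energies e_total (E_i) from kt_old to kt_new (k_b = 1)
    db = 1.0/kt_new - 1.0/kt_old           # beta' - beta
    x = -db*e_total
    x -= x.max()                           # guard against overflow in exp()
    w = numpy.exp(x)
    return numpy.sum(obs*w)/numpy.sum(w)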

We define a series of new temperatures; for each temperature we will produce a reweighted $\langle E\rangle$ and a reweighted $\langle E^2 \rangle$, which allows $C_V$ (actually $C_V/N$) to be computed.

In [ ]:
nrw = 75
kt_new = numpy.linspace(start = 1.2, stop = 2.7, num = nrw)

A Reweighter object is supplied by the data set container. This performs the appropriate reweighting for new values of $kT$:

In [ ]:
reweighter = data04.reweighter("kT")
print(reweighter)
In [ ]:
cv04 = numpy.zeros(nrw)
for nt in range(nrw):
    e1r = reweighter.reweight_obs(e1, kt_new[nt])
    e2r = reweighter.reweight_obs(e2, kt_new[nt])
    cv04[nt] = (1.0/(volume*kt_new[nt]**2))*(e2r - e1r*e1r)

The reweighted values of $C_V$ are shown below. We can also plot the single value of $C_V/N$ computed directly from the simulation data at the original temperature (the reweighted value must match here).

In [ ]:
e1bar = e1.mean()
e2bar = e2.mean()
cv04s = (1.0/(volume*kT**2))*(e2bar - e1bar*e1bar)

plt.plot(kt_new, cv04)
plt.plot(kT, cv04s, 'ro')
plt.xlabel("Temperature")
plt.ylabel("$C_V/N$")
plt.show()

To get an estimate of the statistical error in the mean energy, we recall the factor related to the autocorrelation time, $(1 + 2\tau/\delta t)$, which is actually rather small for this example:

In [ ]:
(dt, tau) = data04.autocorrelation_time("E")
de = numpy.sqrt((1.0/e1.size)*(e2bar - e1bar*e1bar)*(1.0 + 2.0*tau/dt))
print(dt, tau, de)

Larger systems

If the total energy becomes large, the terms in the exponential can grow to the point that numerical problems arise. This is handled in the reweighter code by finding $E_{max} = \max_i \{E_i\}$ and computing

$$ \langle O\rangle_{\beta'} = \frac{\sum_i O_i \exp[-(\beta' - \beta)(E_i - E_{max})]}{\sum_i \exp[-(\beta' - \beta)(E_i - E_{max})]}. $$

However, there is also a limit on the range to which reweighting can be carried out accurately (discussed below). We'll see what the situation looks like at first.

Here are the data for an 8x8 system:

In [ ]:
data08 = IsingModelData("medium_u008_2.2692_s39_h.dat")
In [ ]:
volume = data08.parameter("V")
kT = data08.parameter("kT")

e0 = data08.observable("E").data
e1 = volume*e0[:]

Reweight to the same series of new temperatures. Note that the reweighter function will take numpy.ndarray as an argument and return an array of new values.

In [ ]:
reweighter = data08.reweighter("kT")

e1r = reweighter.reweight_obs(e1[:], kt_new[:])
e2r = reweighter.reweight_obs(e1[:]*e1[:], kt_new[:])

cv08 = numpy.zeros(nrw)
cv08[:] = (1.0/(volume*kt_new[:]**2))*(e2r[:] - e1r[:]*e1r[:])

Data for the 16x16 system are also reweighted to provide estimates of $C_V$; note that this time we use the reweighter for $\beta = 1/kT$, so the new values supplied are $1/kT'$:

In [ ]:
data16 = IsingModelData("medium_u016_2.2692_s39_h.dat")
In [ ]:
volume = data16.parameter("V")
kT = data16.parameter("kT")

e0 = data16.observable("E").data
e1 = volume*e0[:]

rewght = data16.reweighter("beta")
e1r = rewght.reweight_obs(e1, 1.0/kt_new[:])
e2r = rewght.reweight_obs(e1[:]*e1[:], 1.0/kt_new[:])

cv16 = numpy.zeros(nrw)
cv16[:] = (1.0/(volume*kt_new[:]**2))*(e2r[:] - e1r[:]*e1r[:])

This plot compares the reweighted values of $C_V$ for the different system sizes.

In [ ]:
plt.plot(kt_new, cv04)
plt.plot(kt_new, cv08)
plt.plot(kt_new, cv16)
plt.xlabel("Temperature", fontsize = 12)
plt.ylabel("$C_V/N$", fontsize = 18)
plt.show()

Clearly, there is something awry with the above picture. In particular, the curve for the largest (16x16) system size looks badly behaved. This is because the range over which reweighting is valid depends on a number of factors.

3.2.1 Valid range of reweighting

If we look at the histogram of the total energy for a given simulation, this gives us an idea of the extent of reliable statistics in energy space. Clearly, away from the expectation value, there are few or no observations.

In [ ]:
data16.summary_histogram(plt, "E", 21)

If we form the discrete histogram of the total energy from the $N$ observations of the total energy $E_i$ as

$ h_{kT}(E_j) = (1/N) \sum_i \delta(E_i, E_j) $

where the $E_j$ represent the histogram bins and $\delta(E_i, E_j)$ counts the observations in bin $j$, it is possible to reweight the histogram to a new temperature $T'$ via

$$ h_{kT'}(E_j) = \frac{\sum_i \delta(E_i, E_j) \exp [-(1/kT' - 1/kT)E_i]}{\sum_i \exp[-(1/kT' - 1/kT)E_i]}$$

If the resulting histogram is too far from the expectation value of the original, statistical errors will grow to dominate.

We can look more closely at the histogram. At this point we will use the numpy.histogram method to perform the binning explicitly. Later, we will see the Histogram Toolkit's Histogram class.

In [ ]:
e0 = data16.observable("E").data
volume = data16.parameter("V")
e1 = volume*e0[:]
kt = data16.parameter("kT")
erange = (-512.0, -128.0)
hist1, bin_edges1 = numpy.histogram(e1, bins = 32, range = erange, density = True)
ax = plt.subplot()
ax.set_ylim(0.0, 0.01)
bars = plt.bar(bin_edges1[:-1], hist1, width = 4.)

We can reweight the histogram to a new temperature by computing the weights $\exp[-(1/kT' - 1/kT)E_i]$ in the numerator of the above equation. The normalisation is taken care of by the numpy.histogram routine, as it is just the sum of the weights. Note that we subtract the largest exponent as a protection against overflow. We change the temperature by 0.1 (a guess).

In [ ]:
w2 = numpy.zeros(e0.size)
smax = e1.max()
# Change kt by 0.1
ktnew = kt + 0.1
db = (1.0/ktnew) - (1.0/kt)
w2[:] = numpy.exp(-db*(e1[:] - smax))
ax = plt.subplot()
ax.set_ylim(0.0, 0.01)
hist2, bin_edges2 = numpy.histogram(e1, bins = 32, range = erange, density = True, weights = w2)
bars2 = plt.bar(bin_edges2[:-1], hist2, width = 8., color = 'r')
bars1 = plt.bar(bin_edges1[:-1], hist1, width = 4., color = 'b')

This shows that we may be getting toward the limit of reweighting for this example. Following Rummukainen, the limit of reweighting may be estimated by asking that the new expectation value lies within one (original) standard deviation of the original expectation, i.e.:

$$ \left| \langle E\rangle_{kT} - \langle E\rangle_{kT'} \right| < \big\langle[E - \langle E\rangle_{kT}]^2 \big\rangle_{kT}^{1/2}. $$

At this point we note that the standard relation $-(\partial U / \partial\beta)_V = \langle{\cal H}^2\rangle - \langle{\cal H}\rangle^2$ may be used to write

$$ kT^2C_V = \langle {\cal H}^2\rangle - \langle {\cal H}\rangle^2 = \big\langle (E - \langle E\rangle)^2 \big\rangle_{NVT} $$

(see Landau and Binder 2.1.1.4). In this case our condition becomes

$$ \left| \langle E\rangle_{kT} - \langle E\rangle_{kT'} \right| < (kT^2 C_V)^{1/2}, $$

where $C_V$ is evaluated at the original temperature. If we make the assumption that the changes are small, we may approximate

$$ C_V = \frac{\partial E }{\partial T}\Bigg|_V \approx \frac{\langle E\rangle_{kT} - \langle E\rangle_{kT'}}{\delta kT} $$

and so the maximum change in temperature which meets the condition is

$$ \delta kT_{max} \approx (kT^2/C_V)^{1/2}. $$

We can check this against what is seen above by computing directly $C_V$ from the expectation values of the energy:

In [ ]:
e2 = e1[:]*e1[:]
e1x = e1.mean()
e2x = e2.mean()
cv = htk.util.nvt_cv(e1x, e2x, kt)

dktmax = kt/numpy.sqrt(cv)
print(kt, dktmax)

which looks about right. Similar arguments can be made for the histograms of other observables, in which case $C_V$ is replaced by the appropriate susceptibility. Note that the maximum temperature range for reweighting decreases with increasing system size (as $V^{-1/2}$, or $N^{-1/2}$).

3.2.2 Re-analysis

If we take a slightly relaxed limit of two standard deviations in the above argument, we can look again at the data for system sizes 4x4, 8x8 and 16x16, and adopt limits of roughly $\delta kT_{max} = 0.8$, $0.4$ and $0.2$ respectively.

If we take the appropriate subset of the results computed above, the analysis then looks like:

In [ ]:
plt.plot(kt_new, cv04)
plt.plot(kt_new[34:], cv08[34:])
plt.plot(kt_new[44:59], cv16[44:59])
plt.xlabel("Temperature", fontsize = 12)
plt.ylabel("$C_V/N$", fontsize = 18)
plt.show()

3.2.3 Reweighting wrt External Magnetic Field

We can also reweight by changing other parameters in the Hamiltonian. If our original simulation was performed at zero external magnetic field $H = 0$, we can reweight to finite $H$.

Let's work out the susceptibility for one of the previous data sets.

In [ ]:
data04 = IsingModelData(filename = "short_u004_2.1_s37_h.dat")
kt = data04.parameter("kT")
volume04 = data04.parameter("V")
m1 = volume04*data04.observable("m").data
m2 = m1[:]*m1[:]
In [ ]:
m1bar = m1.mean()
m2bar = m2.mean()
chi = (1.0/kt)*(m2bar - m1bar*m1bar)
print ("Susceptibility ", chi/(volume04*volume04))

To reweight an observable with respect to the external magnetic field we need $$ \langle O\rangle_{H'} = \frac{\sum_i O_i \exp [-\beta(H' - H)M_i]}{ \sum_i \exp[-\beta(H'-H)M_i]} $$ where $H'$ is the new external field.

In [ ]:
# Create some new external field values
nrw = 40
hnew04 = numpy.linspace(start = 0.0, stop = 0.4, num = nrw)

# Use the pre-supplied reweighter for "H"
reweighter = data04.reweighter("H")
In [ ]:
chinew = numpy.zeros(nrw, dtype = float)


for i in range(nrw):
    m1r = reweighter.reweight_obs(m1, hnew04[i])
    m2r = reweighter.reweight_obs(m2, hnew04[i])
    chinew[i] = (1.0/(volume04*kt))*(m2r - m1r*m1r)

plt.plot(hnew04*volume04, chinew/volume04)
plt.xlabel("$HL^2$", fontsize = 16)
plt.ylabel("$\chi / L^2$", fontsize = 16)
plt.show()
In [ ]:
data08a = IsingModelData(filename = "short_u008_2.1_s37_h.dat")
kt = data08a.parameter("kT")
volume08 = data08a.parameter("V")
m1 = volume08*data08a.observable("M").data
m2 = m1[:]*m1[:]

chinew2 = numpy.zeros(nrw, dtype = float)
hnew08 = numpy.linspace(start = 0.0, stop = 0.1, num = nrw)
reweighter = data08a.reweighter("H")

for i in range(nrw):
    m1r = reweighter.reweight_obs(m1, hnew08[i])
    m2r = reweighter.reweight_obs(m2, hnew08[i])
    chinew2[i] = (1.0/(volume08*kt))*(m2r - m1r*m1r)

ax = plt.subplot()
ax.set_xlim(0.0, 6.0)
plt.plot(hnew04*volume04, chinew/volume04)
plt.plot(hnew08*volume08, chinew2/volume08)
plt.xlabel("$HL^2$", fontsize = 16)
plt.ylabel("$\chi / L^2$", fontsize = 16)
plt.show()

These values can be compared with the results of, e.g., Ferrenberg and Swendsen Physical Review Letters 61, 2635-2638 (1988).

3.2.4 Reweighting wrt two or more parameters at the same time

It may be convenient to reweight more than one parameter, e.g., to reweight with respect to both temperature and external field in one go.

This may be done simply by combining the appropriate Boltzmann weight factors in the reweighting expression, e.g., $$ \langle O\rangle_{kT', H'} = \frac{\sum_i O_i \exp[-(\beta' - \beta)E_i] \exp[-\beta(H' - H)M_i]}{ \sum_i \exp[-(\beta' - \beta)E_i] \exp[-\beta(H'-H)M_i]} $$

where we have a new (inverse) temperature $\beta'$ and a new external magnetic field $H'$.

We will illustrate this by first creating, explicitly, a Reweighter object for the temperature (we could have used the one above).

In [ ]:
data04 = IsingModelData(filename = "short_u004_2.1_s37_h.dat")

# Note these are the Parameter objects, and the Observable object

kt = data04.parameter("kT")
volume = data04.parameter("V")
obs_e1 = data04.observable("E")

# Recall that the energies are per site, so a factor of the volume
# is required to compute the total energy

reweighter_kt = htk.histogram.KTReweighter("kt", kt, volume, obs_e1)
In [ ]:
# You can check the arguments of the KTReweighter initialisation:
help(htk.histogram.KTReweighter)

We note that we can reweight the magnetisation wrt temperature in the normal way (here, in fact, the square of the magnetisation), so

In [ ]:
kt = data04.parameter("kT")
volume = data04.parameter("V")
m1 = data04.observable("M").data
m2 = m1[:]*m1[:]

nrw = 40
kt_new = numpy.linspace(start = 1.2, stop = 3.7, num = nrw)

mr = numpy.zeros(nrw, dtype = float)

for i in range(nrw):
    mr[i] = reweighter_kt.reweight_obs(m2, kt_new[i])

plt.plot(kt_new, mr)
plt.show()

We may recover from the temperature reweighter object (reweighter_kt above) a set of weights appropriate to a single new temperature $kT'$:

In [ ]:
kt_new = 1.5
wkt = reweighter_kt.compute_weights(kt_new)

We can then create a ChainReweighter object which will reweight with respect to the magnetic field, but which includes the Boltzmann weights for the new temperature, thus:

In [ ]:
h_old = data04.parameter("H")
m = data04.observable("M")

const = htk.parameter.Parameter(volume/kt,
                                htk.util.Label("v/kT", "Constant", "sites/k_bT"))

r = htk.histogram.ChainReweighter("h_and_kt", h_old, const, m, wkt)

Use this to reweight to a series of new external fields, and the new temperature:

In [ ]:
kt = data04.parameter("kT")
volume = data04.parameter("V")
m1 = volume*data04.observable("M").data
m2 = m1[:]*m1[:]

nrw = 40
hnew = numpy.linspace(start = 0.0, stop = 0.4, num = nrw)
chinew1 = numpy.zeros(nrw, dtype = float)

for i in range(nrw):
    m1r = r.reweight_obs(m1, hnew[i])
    m2r = r.reweight_obs(m2, hnew[i])
    chinew1[i] = (1.0/(volume*kt_new))*(m2r - m1r*m1r)
    
plt.plot(hnew*volume, chinew/volume)
plt.plot(hnew*volume, chinew1/volume)
plt.xlabel("$HL^2$", fontsize = 16)
plt.ylabel("$\chi / L^2$", fontsize = 16)
plt.show()

3.2.5 Predicting the position of the critical point via reweighting

In the example above, the simulation took place very close to the (known) position of the critical point of the infinite system. But what is the apparent position of the critical point for that system size? We just need to locate the position of the maximum of the curve as plotted.

It is a straightforward task to locate the maximum of the reweighted curve for the existing data at $kT = 2.1$:

In [ ]:
data004a = IsingModelData(filename = "short_u004_2.1_s37_h.dat")
volume = data004a.parameter("V")
kt = data004a.parameter("kt")

nrw = 80
ktnew = numpy.linspace(start = 1.9, stop = 2.7, num = nrw)

cv_a = data004a.reweight_cv(ktnew)
In [ ]:
# Cheat: original cv is...

cva0 = data004a.reweight_cv(kt)

plt.plot(ktnew, cv_a, 'bs')
plt.plot(kt, cva0, 'ro')
plt.xlabel("Temperature", fontsize=16)
plt.ylabel("$C_V/N$", fontsize=16)
plt.show()

One can identify the predicted maximum of $C_V$, e.g.:

In [ ]:
kt_peak = ktnew[cv_a.argmax()]
print ("Maximum approx. ", kt_peak)

Try again nearer the critical temperature, e.g., $kT = 2.3$

In [ ]:
J = 1.0
H = 0.0
seed = 99
model = IsingModel(4, J, H, 2.3, seed, 'hot')
model.run(10000, "short_u004_2.3_s99_h.dat")

And look at the updated results (cf. previous)...

In [ ]:
data004b = IsingModelData(filename = "short_u004_2.3_s99_h.dat")
kt = data004b.parameter("kt")
cv_b = data004b.reweight_cv(ktnew)
cvb0 = data004b.reweight_cv(kt)
In [ ]:
plt.plot(ktnew, cv_a, label = "kT = 2.10")
plt.plot(ktnew, cv_b, label = "kT = 2.30")
plt.plot(2.10, cva0, 'bo')
plt.plot(kt, cvb0, 'go')
plt.legend(loc = 'lower right')
plt.xlabel("Temperature")
plt.ylabel("$C_V/N$")
plt.show()
Additional simulation

Let's try one more at a higher temperature $kT = 2.7$. The accompanying plot shows we obtain three slightly differing estimates of the $C_V$ curve from the three simulations at different temperatures.

In [ ]:
J = 1.0
H = 0.0
seed = 99
model = IsingModel(4, J, H, 2.7, seed, 'hot')
model.run(10000, "short_u004_2.7_s99_h.dat")
In [ ]:
data004c = IsingModelData(filename = "short_u004_2.7_s99_h.dat")
cv_c = data004c.reweight_cv(ktnew)
cvc0 = data004c.reweight_cv(2.7)
In [ ]:
plt.plot(ktnew, cv_a, label = "kT = 2.1")
plt.plot(ktnew, cv_b, label = "kT = 2.3")
plt.plot(ktnew, cv_c, label = "kT = 2.7")
plt.plot(2.10, cva0, 'bo')
plt.plot(2.30, cvb0, 'go')
plt.plot(2.70, cvc0, 'ro')
plt.legend(loc = "lower right")
plt.show()


3.3 Reweighting Two or More Datasets

The previous section has produced a number of simulations of the same system at different temperatures. We can combine more than one set of observations when reweighting a given observable; this should produce better statistics than a single simulation alone. The description here follows Rummukainen [2].

In practice, we need to connect the simulations at different temperatures by coming up with an expression for the (temperature-independent) density of states. This introduces a free energy for each separate simulation, which we denote $f_r$, where we have simulations $r = 1, \ldots, N^\mathrm{run}$. A given simulation $r$ has a number of measurements $N^r$ associated with it (and the number can be different for different runs). Energy measurements are denoted $E_i^r$ for run $r$ with $i = 1,\ldots, N^r$. Each run has a weight associated with it which is related to the statistical inefficiency of its energy measurements: $w_r = (1 + 2\tau_r/\delta t_r)^{-1}$.

The free energies are related by the equation

$$ \exp[-f_r] = \sum_q^{N^\mathrm{run}} \sum_i^{N^q} \frac{w_q \exp[-\beta_r E_i^q]}{\sum_s^{N^\mathrm{run}} N^s w_s \exp[-\beta_s E_i^q + f_s]} $$

Values for the free energies must be computed via an iterative method, and are determined only up to an arbitrary constant (which will remain unspecified).
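
The toolkit provides htk.util.multiple_free_energies for this step (used in the cell below). For orientation, a minimal log-domain sketch of the fixed-point iteration implied by the equation above might look as follows (the names and conventions are illustrative, not those of the toolkit):

In [ ]:
def free_energies_sketch(e_runs, kt_runs, w_runs, niter = 1000, tol = 1.0e-10):
    # e_runs:  list of 1-d arrays of total energies E_i^q, one per run q
    # kt_runs: the temperature kT_q of each run
    # w_runs:  the weight w_q of each run
    beta = 1.0/numpy.array(kt_runs)
    lognw = numpy.log([e.size*w for e, w in zip(e_runs, w_runs)])
    e_all = numpy.concatenate(e_runs)
    logw_all = numpy.concatenate([numpy.full(e.size, numpy.log(w))
                                  for e, w in zip(e_runs, w_runs)])
    f = numpy.zeros(len(e_runs))

    for _ in range(niter):
        # log of the denominator (sum over runs s) for each sample i
        x = lognw - numpy.outer(e_all, beta) + f
        xmax = x.max(axis = 1)
        logden = xmax + numpy.log(numpy.exp(x - xmax[:, None]).sum(axis = 1))
        # -f_r is the log of the double sum over samples, for each run r
        y = logw_all - numpy.outer(beta, e_all) - logden
        ymax = y.max(axis = 1)
        fnew = -(ymax + numpy.log(numpy.exp(y - ymax[:, None]).sum(axis = 1)))
        fnew -= fnew[0]                    # fix the arbitrary constant
        if numpy.allclose(fnew, f, atol = tol):
            break
        f = fnew

    return f

Working in the log domain and subtracting the maximum exponent avoids overflow in the exponentials, in the same spirit as the $E_{max}$ shift discussed earlier.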

To reweight to a new temperature denoted $kT' = 1/\beta'$, we first need the free energy at the new temperature $f'$, computed using the same relation

$$ \exp[-f'] = \sum_q^{N^\mathrm{run}} \sum_i^{N^q} \frac{w_q \exp[-\beta' E_i^q]}{\sum_s^{N^\mathrm{run}} N^s w_s \exp[-\beta_s E_i^q + f_s]}. $$

Observable quantities $O_i^r$ may be reweighted to the new temperature via

$$ \langle O \rangle_{\beta'} = \sum_q^{N^\mathrm{run}} \sum_i^{N^q} \frac{ O_i^q w_q \exp[-\beta' E_i^q - f']}{\sum_s^{N^\mathrm{run}} N^s w_s \exp[-\beta_s E_i^q + f_s]}. $$
In [ ]:
# This will take a minute or two..
volume = 16.0

ea = volume*data004a.observable("E").data
eb = volume*data004b.observable("E").data
ec = volume*data004c.observable("E").data
e = [ea, eb, ec]
obs1 = [ea, eb, ec]
obs2 = [ea[:]*ea[:], eb[:]*eb[:], ec[:]*ec[:]]
kT = [data004a.parameter("kT"), data004b.parameter("kT"),
      data004c.parameter("kt")]

(dt_a, tau_a) = data004a.autocorrelation_time("E")
(dt_b, tau_b) = data004b.autocorrelation_time("E")
(dt_c, tau_c) = data004c.autocorrelation_time("E")
ta = (1.0 + 2.0*tau_a/dt_a)
tb = (1.0 + 2.0*tau_b/dt_b)
tc = (1.0 + 2.0*tau_c/dt_c)
w = [1.0/ta, 1.0/tb, 1.0/tc]

print (dt_a, dt_b, dt_c, tau_a, tau_b, tau_c, ta, tb, tc)

cvm = numpy.zeros(nrw)
fe = htk.util.multiple_free_energies(e, kT, w)
for i in range(nrw):
    e1r = htk.util.multiple_reweight_observable_nvt(e, obs1, kT, fe, w, ktnew[i])
    e2r = htk.util.multiple_reweight_observable_nvt(e, obs2, kT, fe, w, ktnew[i])
    cvm[i] = (1.0/(volume*ktnew[i]**2))*(e2r - e1r*e1r)
In [ ]:
plt.plot(ktnew, cv_a, label = "kT = 2.1")
plt.plot(ktnew, cv_b, label = "kT = 2.3")
plt.plot(ktnew, cv_c, label = "kT = 2.7")
plt.plot(ktnew, cvm,  label = "Combined")
plt.legend(loc = "upper left")
plt.show()

On close inspection it can be seen that the combined reweighted curve is closer to the original lower temperature data at low temperatures, and nearer the original higher temperature data at high temperatures.
