Problem set 8

Correlations and correlation functions

The following relations may be useful:

\[P(A|B)P(B) = P(A, B) = P(B|A)P(A)\]

In this notation \(P(A|B)\) represents a conditional probability, in this case the probability of \(A\), under the condition that \(B\) is given/known. \(P(A,B)\) represents the joint probability of having \(A\) and \(B\) as outcomes. If \(b_1\) and \(b_2\) are the only two possible outcomes of the stochastic variable \(B\), then one can write:

\[P(A) = P(A|B=b_1)P(B=b_1) + P(A|B=b_2)P(B=b_2)\]
  1. Calling \(P(A)\) the probability that someone does the homework and \(P(B)\) the probability that someone passes the exam, the desired quantity for this question is \(P(A|B)\). Then, using the above equations, one finds:

    \[\begin{split}\begin{aligned} P(A|B) &= \frac{P(B|A)P(A)}{P(B)} = \frac{P(B|A)P(A)}{P(B|A)P(A) + P(B|\mathrm{not}\ A)P(\mathrm{not}\ A)} \nonumber \\ &= \frac{0.9\cdot0.6}{0.9\cdot0.6+0.5\cdot0.4} \approx 0.73 \end{aligned}\end{split}\]
  2. A biosensor is designed to detect minute concentrations of a certain protein A. If this molecule is captured by the sensor, there is a 99% probability that it will give a signal. Unfortunately, there are also other proteins in the sample that can weakly interact with the sensor. Protein B is also able to cause a signal, but with a probability of 5%. A certain sample contains a concentration of 2 pM of protein A and 1 nM of protein B. When the solution is flushed through the sensor, it gives a signal. What is the probability that it captured protein A?

    Calling \(P(A)\) the probability that protein \(A\) is captured, and \(P(S)\) the probability that the sensor gives a signal, the desired quantity is \(P(A|S)\). Since the sample contains 2 pM of protein A and 1 nM (= 1000 pM) of protein B, the probability that a captured protein is of type A is \(P(A) = 2/1002 \approx 0.002\). Using the same steps as before,

    \[\begin{split}\begin{aligned} P(A|S) &= \frac{P(S|A)P(A)}{P(S)} = \frac{P(S|A)P(A)}{P(S|A)P(A) + P(S|\mathrm{not}\ A)P(\mathrm{not}\ A)} \nonumber \\ &= \frac{0.99\cdot0.002}{0.99\cdot0.002+0.05\cdot0.998} \approx 0.04 \end{aligned}\end{split}\]
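Both answers follow from the same two-outcome form of Bayes' rule, so the arithmetic is easy to check in a few lines of Python. A minimal sketch (the helper name `bayes_posterior` is just an illustrative choice):

```python
def bayes_posterior(p_e_given_h, p_h, p_e_given_not_h):
    """P(H|E) for a hypothesis H with a single alternative 'not H'."""
    p_e = p_e_given_h * p_h + p_e_given_not_h * (1.0 - p_h)
    return p_e_given_h * p_h / p_e

# Question 1: P(pass|homework) = 0.9, P(homework) = 0.6, P(pass|no homework) = 0.5
print(bayes_posterior(0.9, 0.6, 0.5))      # ~0.73

# Question 2: 2 pM of A among 2 pM + 1 nM of protein, so P(A) ~ 2/1002 ~ 0.002
print(bayes_posterior(0.99, 0.002, 0.05))  # ~0.04
```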

Scattering experiments and correlation functions

  1. Sketch \(g(r)\) for an ideal gas:

    It should be a constant function, with \(g(r)=1\) for all \(r\geq0\) (see lecture slides).

  2. Why does \(g(r)\) converge to 1, for large \(r\)?

    Particles in a fluid are uncorrelated when they are far away from each other, and only influence their neighbourhood over a certain range (think of water molecules forming transient hydrogen bonds with their neighbours, or ions avoiding like charges and seeking oppositely charged ions in their neighbourhood). For two points \(\mathbf{r}\) and \(\mathbf{r'}\) that are far apart, the density pair-correlation function factorises, \(\langle \rho(\mathbf{r})\rho(\mathbf{r'})\rangle \approx \langle \rho(\mathbf{r})\rangle \langle\rho(\mathbf{r'})\rangle\) (meaning, particles behave independently of particles at a large distance), and therefore \(g(|\mathbf{r} - \mathbf{r'}|) \approx 1\).

  3. Sketch \(g(r)\) for sodium ions in water (for concentrations < 0.1 M, so that you can use the information of Homework set 7)

    It should be zero for \(r<d\) (with \(d\) the minimal distance between two hydrated ions), and an exponentially screened function for \(r>d\), starting from some finite value smaller than 1 and approaching 1 at large \(r\) (similar to the solution found in HW set 7, for negative ions near a negative electrode). Actually it is approximately

    \[g(r) \approx 1 - A \frac{ e^{-\kappa r} }{ r }\qquad r>d\]

    with \(A\) some constant depending on temperature, salt concentration, dielectric constant, etc. You do not need to be able to derive this for this course, as long as the sketch has similar qualitative features. (A quick plotting sketch of this shape is given just after this list.)

  4. Sketch \(g(r)\) for a crystal and liquid water at room temperature. (Feel free to do a quick literature search on what this function may look like - there is no way to guess its precise features.)

    [Figure: the radial distribution function of liquid water. It has specific features, but globally has the same qualitative behaviour as many other (simple) liquids, namely that correlations decay over the distance of a few molecules.]
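For question 3, the screened shape of \(g(r)\) is easy to visualise. A minimal plotting sketch, where the contact distance \(d\), the amplitude \(A\) and the inverse screening length \(\kappa\) are purely illustrative numbers, chosen only to show the qualitative features (zero below contact, a dip, and convergence to 1):

```python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative parameters only: contact distance d [nm], amplitude A, inverse screening length kappa [1/nm]
d, A, kappa = 0.5, 0.3, 2.0

r = np.linspace(0.01, 5.0, 500)
g = np.where(r < d, 0.0, 1.0 - A * np.exp(-kappa * r) / r)  # g(r) = 1 - A exp(-kappa r)/r for r > d

plt.plot(r, g)
plt.axhline(1.0, linestyle="--", color="grey")  # g -> 1: particles are uncorrelated at large r
plt.xlabel("r [nm]")
plt.ylabel("g(r)")
plt.show()
```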

The Ising model

The 1-dimensional Ising model

The 1-dimensional Ising model contains \(N\) particles that we will refer to as “spins”, originally representing the magnetic dipole moments of \(N\) atoms, each pointing in an upward or downward direction. The spins are arranged in a row, and each spin can take two values, so \(\sigma_n = 1\) or \(\sigma_n = -1\), using \(\sigma_n\) as the symbol for the spin of the \(n^\mathrm{th}\) atom. The spin \(\sigma\) is dimensionless, which makes the algebra a lot cleaner, and also more general; if desired, one could restore dimensions of angular momentum or magnetic moment. Every spin interacts with an external field \(H\), and with its neighbours by means of a pair interaction \(J\). If a spin is aligned with the external field, and the field points upward (\(H\) is positive), its field energy is \(-H\); if it points opposite to the field, it is \(+H\). In short:

(4)\[ E(\{\sigma\}) = - H \sum \limits_{i=1}^N \sigma_i + J \sum \limits_{i=1}^N \sigma_i\sigma_{i+1},\]

where we use that \(\sigma_{N+1}=\sigma_1\) (i.e. the spins are positioned in a circle). This choice makes the algebra easier, and makes every spin equivalent to the others, so that the expectation values of the individual spins are all the same.

  1. What would be a natural choice for a state vector, describing a microstate of the Ising model?

    A natural choice would be the state vector

    \[\begin{aligned} \mathbf{\sigma} = (\sigma_1, \sigma_2, ..., \sigma_N) \end{aligned}\]

    where the components are the spin states of each individual spin, being \(+1\) or \(-1\).

  2. Argue that the energy of a single microstate can be written as:

    (5)\[ E(\{\sigma\}) = - \frac{1}{2}H \sum \limits_{i=1}^N (\sigma_i+\sigma_{i+1}) + J \sum \limits_{i=1}^N \sigma_i\sigma_{i+1} \]

    [it will become clear later why one would like to write the energy like this…]

    The sum can be separated into two identical sums:

    \[\sum \limits_{i=1}^N (\sigma_i + \sigma_{i+1}) = \sum \limits_{i=1}^N \sigma_i + \sum \limits_{i=1}^N \sigma_{i+1} = 2 \sum \limits_{i=1}^N \sigma_i,\]

    where one has to use the periodic boundary condition \(\sigma_{N+1} = \sigma_1\), which allows the summation index in the second sum to be shifted by one. Substituting this into equation (5) yields equation (4).

  3. What is the probability of a single microstate, given that the system is equilibrated at a constant temperature?

    That probability is given by the Boltzmann distribution (with \(\beta \equiv 1/k_\mathrm{B}T\))

    (6)\[ P(\mathbf{\sigma}) = \frac{1}{Z}e^{-\beta E(\mathbf{\sigma})}\]
  4. Show that the partition function \(Z\) for this system can be written as:

    (7)\[ Z = \sum \limits_{\sigma_1 = \pm 1} \sum \limits_{\sigma_2 = \pm 1} ... \sum \limits_{\sigma_N = \pm 1} \prod \limits_{i=1}^N V(\sigma_{i}, \sigma_{i+1})\]

    Here one uses that the partition function normalises the probability distribution of equation (6), so that

    \[Z = \sum \limits_{\sigma_1 = \pm 1} \sum \limits_{\sigma_2 = \pm 1} ... \sum \limits_{\sigma_N = \pm 1} e^{-\beta E(\mathbf{\sigma})}.\]

    The next step is to write out the expression of the energy and factorise the exponent, using

    \[e^{-\beta E(\mathbf{\sigma})} = e^{\frac{1}{2}\beta H \sum \limits_{i=1}^N (\sigma_i + \sigma_{i+1}) -\beta J \sum \limits_{i=1}^N \sigma_i\sigma_{i+1} } = \prod \limits_{i=1}^N e^{\frac{1}{2}\beta H (\sigma_i + \sigma_{i+1}) -\beta J \sigma_i\sigma_{i+1} }.\]

    The last step is to introduce the function

    \[V(\sigma_i, \sigma_{i+1}) \equiv e^{\frac{1}{2}\beta H (\sigma_i + \sigma_{i+1}) -\beta J \sigma_i\sigma_{i+1} }\]
  5. Show that you can write this as:

    (8)\[ Z = \mathrm{Tr} \left( V^N \right)\]

    with \(V(\sigma_i,\sigma_{i+1})\) a function depending on the interaction parameters \(H\) and \(J\).

    To reduce the previous expression to this form, it may be useful to interpret \(\sigma_i\) as the row index and \(\sigma_{i+1}\) as the column index of the 2 by 2 matrix \(V(\sigma_i, \sigma_{i+1})\). The matrix \(V\) would then be

    \[\begin{split}V \equiv \left( \begin{array}{cc} e^{\beta H - \beta J} & e^{\beta J} \\ e^{\beta J} & e^{-\beta H - \beta J} \end{array} \right)\end{split}\]

    To get from equation (7) to equation (8) one has to interpret the summand of \(Z\) as a product of \(N\) identical matrices \(V\), and the sums over the intermediate spins as matrix multiplications, using

    \[\sum \limits_{\sigma_2 = \pm 1} V(\sigma_1, \sigma_2) V(\sigma_2, \sigma_3) = V^2(\sigma_1, \sigma_3),\]

    following the familiar rules for matrix multiplication

    \[\sum \limits_j A_{ij}B_{jk} = C_{ik},\]

    so that one ends up, after summing over all spins except \(\sigma_1\), with

    \[Z = \sum \limits_{\sigma_1 = \pm 1} V^N(\sigma_1, \sigma_{N+1}) = \sum \limits_{\sigma_1 = \pm 1} V^N(\sigma_1, \sigma_1) = \mathrm{Tr} (V^N)\]
  6. Find the eigenvalues of \(V\) and, using that the trace is basis-independent, show that

    \[Z = \mathrm{Tr} \left( V^N \right)= \mathrm{Tr} \left( \Lambda^N \right) = \lambda_1^N + \lambda_2^N \approx \lambda_1^N,\]

    where \(\Lambda\) is a diagonal matrix, containing the eigenvalues of \(V\), and \(\lambda_1\) is the largest eigenvalue.

    For the eigenvalues, one has to solve the characteristic equation

    \[\det \left( V - \lambda I \right) = 0,\]

    or

    \[\begin{split}\det \left( \begin{array}{cc} e^{\beta H - \beta J}-\lambda & e^{\beta J} \\ e^{\beta J} & e^{-\beta H - \beta J} - \lambda \end{array} \right) = 0,\end{split}\]

    which yields a quadratic equation

    \[\lambda^2 - \lambda(e^{\beta H - \beta J} + e^{-\beta H - \beta J})+e^{-2\beta J}-e^{2\beta J} = 0.\]

    Using the quadratic formula, one finds two solutions, or roots,

    \[\lambda_\pm = \frac{1}{2} (e^{\beta H - \beta J} + e^{-\beta H - \beta J}) \pm \frac{1}{2} \sqrt{(e^{\beta H - \beta J} + e^{-\beta H - \beta J})^2 + 4 (e^{2\beta J} - e^{-2\beta J}) },\]

    which can be written in a shorter form by introducing the hyperbolic sine and cosine, with \(2\sinh{a} = e^a - e^{-a}\), and \(2\cosh{a} = e^a + e^{-a}\), so that

    \[\lambda_\pm = e^{-\beta J} \cosh(\beta H) \pm \sqrt{ e^{- 2 \beta J} \sinh^2{\beta H} + e^{2\beta J} }.\]

    To find \(Z \approx \lambda_1^N\) one has to diagonalise \(V = S \Lambda S^{-1}\), and check that all the transformation matrices \(S\) cancel: since \(V^N = S \Lambda^N S^{-1}\) and the trace is invariant under cyclic permutations, \(\mathrm{Tr}(V^N) = \mathrm{Tr}(S \Lambda^N S^{-1}) = \mathrm{Tr}(\Lambda^N S^{-1} S) = \mathrm{Tr}(\Lambda^N)\) (see also the lecture notes). [Little fact: here you actually show the mathematical property that the trace is basis-independent, i.e. that the trace of a matrix does not depend on the basis in which the matrix is represented.]

    If \(\lambda_1>\lambda_2\), then for large \(N\) certainly \(\lambda_1^N \gg \lambda_2^N\), and

    \[Z = \mathrm{Tr} \left( \Lambda^N \right) = \lambda_1^N + \lambda_2^N \approx \lambda_1^N.\]
  7. One can now derive an expression for the mean spin. Similarly to problem 5.5, one can show that (no need to do it yourself, but it can be done with the same effort as in 5.5)

    \[N \langle \sigma_i \rangle = \frac{\partial}{\partial H} k_\mathrm{B}T \ln Z\]

    Give an expression for \(\langle \sigma_i \rangle\) in terms of the external field \(H\) and interaction parameter \(J\). [The derivation is not too easy. In principle it is only a matter of taking the derivative, but simplifying the final result requires some clever algebraic juggling. Feel free to just “give” the result.]

    The end result is (see also the lecture notes)

    \[\langle \sigma_i \rangle = \frac{e^{-\beta J} \sinh {\beta H}}{\sqrt{e^{-2\beta J} \sinh^2{\beta H} + e^{2 \beta J} }}\]

    Sign conventions may differ in the literature, and sometimes the \(\beta\) is absorbed into the field parameters \(H\) and \(J\), so be careful when comparing. (A numerical cross-check of these transfer-matrix results is sketched below.)
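The transfer-matrix results above are easy to cross-check numerically. The sketch below (all parameter values are illustrative) builds the matrix \(V\), compares \(\mathrm{Tr}(V^N)\) with a brute-force sum over all \(2^N\) microstates of equation (4), compares the numerical eigenvalues with the closed-form \(\lambda_\pm\), and checks the closed-form \(\langle \sigma_i \rangle\) against a numerical derivative of \(\ln Z\):

```python
import itertools
import numpy as np

def transfer_matrix(bH, bJ):
    """V(s, s') = exp(bH (s+s')/2 - bJ s s'), with rows/columns ordered s = +1, -1."""
    return np.array([[np.exp(bH - bJ), np.exp(bJ)],
                     [np.exp(bJ), np.exp(-bH - bJ)]])

def Z_brute_force(bH, bJ, N):
    """Sum exp(-beta E) over all 2^N configurations, with E from eq. (4) and sigma_{N+1} = sigma_1."""
    Z = 0.0
    for spins in itertools.product((1, -1), repeat=N):
        s = np.array(spins)
        betaE = -bH * s.sum() + bJ * np.dot(s, np.roll(s, -1))
        Z += np.exp(-betaE)
    return Z

bH, bJ, N = 0.3, -0.7, 10  # illustrative values of beta*H and beta*J

V = transfer_matrix(bH, bJ)
print(np.trace(np.linalg.matrix_power(V, N)), Z_brute_force(bH, bJ, N))  # should agree

# Eigenvalues: numerics versus the closed form lambda_+/-
root = np.sqrt(np.exp(-2 * bJ) * np.sinh(bH) ** 2 + np.exp(2 * bJ))
print(np.linalg.eigvalsh(V), np.exp(-bJ) * np.cosh(bH) - root, np.exp(-bJ) * np.cosh(bH) + root)

# Mean spin: closed form versus a numerical derivative of ln Z with respect to beta*H
h = 1e-5
sigma_num = (np.log(Z_brute_force(bH + h, bJ, N)) - np.log(Z_brute_force(bH - h, bJ, N))) / (2 * h * N)
print(np.exp(-bJ) * np.sinh(bH) / root, sigma_num)  # close for moderate N; exact in the large-N limit
```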

The spin-spin correlation function in the Ising model

With a similar, but more involved, derivation than in the previous section, one can also derive an expression for the pair-correlation function \(\langle \sigma_i \sigma_{j} \rangle\),

\[\langle \sigma_i \sigma_{j} \rangle - \langle \sigma_i \rangle \langle \sigma_j \rangle = A \left( \frac{\lambda_2}{\lambda_1} \right)^{|j-i|}\]

with the eigenvalues \(\lambda_2 < \lambda_1\) as in the previous section, \(A \equiv \sin^2 2\phi\) and

\[\tan 2\phi \equiv \frac{1}{e^{2\beta J} \sinh (\beta H)}\]
  1. Describe what kind of information the function \(\langle \sigma_i \sigma_{j} \rangle - \langle \sigma_i \rangle \langle \sigma_j \rangle\) provides.

    This expression, the covariance, gives information about the range over which spins are correlated, i.e. over what distance the states of two spins are not mutually independent.

  2. Show that this correlation function drops exponentially as \(e^{- \frac{|i-j|}{\lambda}}\) with a characteristic length scale \(\lambda \equiv [\ln (\lambda_1/\lambda_2) ]^{-1}\).

    One has to use the mathematical identity

    \[a^x = e^{x \ln a},\]

    using \(a \equiv \lambda_2/\lambda_1\) and \(x \equiv |i-j|\). Since \(0 < a < 1\), \(\ln a = -\ln(\lambda_1/\lambda_2)\) is negative, so \((\lambda_2/\lambda_1)^{|i-j|} = e^{-|i-j|/\lambda}\) with \(\lambda \equiv [\ln (\lambda_1/\lambda_2)]^{-1}\).

  3. How does this length scale change with the interaction parameter \(J\) if there is no external field? [so, you may take \(H=0\) for simplicity. Take \(J\) here to be negative.]

    Following the mathematical route, one finds, for \(H=0\)

    \[\frac{\lambda_2}{\lambda_1} = -\tanh {\beta J}.\]

    If \(J\) becomes more negative (spins couple more strongly; alignment decreases the energy), then this fraction increases. This means that \(\lambda\) will increase, and become very large, as \(\lambda_2/\lambda_1\) approaches 1 in the limit of very large \(-\beta J\). This makes sense physically. A stronger coupling between the spins (more negative \(J\)) will make the spins align more strongly, and create larger domains where the spins are all aligned (i.e. strongly correlated). (A small numerical sketch of this growth is given at the end of this list.)

  4. What would be the typical structure of your system for large negative values of \(J\)? What for large positive values? (you can conclude this based on conceptual arguments, without doing algebra)

    For large negative \(J\) there will be large domains where all the spins are in the same state. For large positive \(J\) there will be large domains where each spin is in the opposite state to its two neighbours. Such configurations minimise the energy.

  5. What would a “spin” represent when the Ising model is applied to:

    1. a semi-flexible polymer: the orientation of a segment of the polymer.

    2. a persistent random walk (a random walk where the probability of the next step depends on what the previous step was): the (direction of the) step that the walker is taking.

    3. molecules binding cooperatively to a DNA-molecule: whether a site is occupied or not.

  6. For the systems in the previous question, what would \(J\) and \(H\) represent, or be connected to?

    1. a semi-flexible polymer: \(J\) is the coupling between neighbouring segments, and determines the stiffness of the polymer. It is related to the energy needed to bend two neighbouring segments with respect to each other. \(H\) would be an external field, capable of aligning the segments, e.g. a substrate, a flow, …

    2. a persistent random walk: \(J\) is related to the persistence of the walk, or the probability to keep stepping in the same direction once a step has been made. \(H\) is an external field that gives the walker a bias, by making the walker step in a certain direction on average (independent of the history of the walker).

    3. molecules binding cooperatively to a DNA-molecule: \(J\) is a cooperativity parameter, modelling how much molecules help each other attach to the DNA, by lowering the energy of binding to sites next to a bound molecule. \(H\) is an external potential lowering the general barrier to bind, and could for example be the chemical potential \(\mu\) of the molecules in solution.
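To make question 3 concrete, the correlation length at \(H=0\) can be computed in a few lines. A minimal sketch, with purely illustrative values of \(\beta J\):

```python
import numpy as np

def correlation_length(bJ):
    """Correlation length in units of the lattice spacing, for H = 0 and negative beta*J."""
    ratio = -np.tanh(bJ)           # lambda_2 / lambda_1 = -tanh(beta*J), see question 3
    return -1.0 / np.log(ratio)    # = 1 / ln(lambda_1 / lambda_2)

for bJ in (-0.5, -1.0, -2.0, -3.0):  # increasingly strong coupling (illustrative values)
    print(bJ, correlation_length(bJ))
# The correlation length grows rapidly as -beta*J increases, reflecting larger aligned domains.
```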

The Ising model in 2 dimensions with a Monte Carlo algorithm

  1. A simple example of a Monte Carlo algorithm is a method that can be used to estimate the number \(\pi\). This method generates \(N\) sets of two random numbers \(x_i\) and \(y_i\) from a uniform probability distribution between \(-1\) and \(1\), and counts the number of times \(N_{\mathrm{in}}\) that the pair obeys the relation \(x_i^2 + y_i^2 \leq 1\). The ratio \(4 N_{\mathrm{in}}/N\) approaches \(\pi\) as \(N\) increases. Explain why this is the case (conceptually). [Hint: making a little drawing in the \(x,y\)-plane may be helpful]

    The condition \(x_i^2 + y_i^2 \leq 1\) defines a disc in the \(x,y\)-plane with area \(\pi\), whereas the random numbers are generated from a uniform distribution on a square domain in the \(x,y\)-plane, with area \(4\). On average one would expect that the number of points obeying the condition \(x_i^2 + y_i^2 \leq 1\) is proportional to the area of the disc, while the total number of points generated is proportional to the area of the square, so \(N_{\mathrm{in}}/N\) approaches \(\pi/4\). (A minimal implementation is sketched at the end of this list.)

  2. Use the Python code to see what the 2-dimensional Ising model typically looks like:

    1. at high and low temperature

    2. for large and small negative values of \(J\) and \(H\)

    3. for positive values of \(J\)

    Use the code here… (or see the lecture slides). A minimal Metropolis sketch is also given at the end of this list.

  3. Did the results of the previous question make sense, physically?

    At high temperature, entropy should be dominant, so that the number of spins in the up state should equal the number of spins in the down state on average, and neighbours should be essentially uncorrelated (only small, transient domains). At low temperature, the free energy of the system is mainly determined by the energy of the system, so that the system will tend towards the lowest-energy state. This state may depend a little on whether \(J\) is positive or negative, and whether \(H\) is finite or zero.

    Large values of \(H\) will tend to align all spins in the direction of \(H\). Large negative values of \(J\), while keeping \(H=0\), will tend to create large domains of aligned spins. Large positive values of \(J\) will create large domains where each spin is aligned oppositely to its neighbours.

  4. Bonus: Find the critical point. How did you find it? [hint: trial and error is perfectly acceptable - but one could use several forms of information to check whether the system is close to the critical point or not]

    There are several ways to find out… e.g. close to the critical point, the energy fluctuations will increase strongly. The system will have large fluctuations in the magnetisation (spin density), which cause the energy fluctuations to spike. One can also look at the domains of aligned spins. The size of these domains will increase strongly, while the system still has a mean spin close to zero (if \(H=0\), which is a wise choice if one is seeking the critical point). These domains span all possible length scales that one could simulate. (The Metropolis sketch below records the energy per sweep, which can be used for exactly this kind of fluctuation diagnostic.)
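A minimal Python sketch of the \(\pi\) estimator of question 1:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000

x = rng.uniform(-1.0, 1.0, N)
y = rng.uniform(-1.0, 1.0, N)
N_in = np.count_nonzero(x**2 + y**2 <= 1.0)  # points that fall inside the unit disc

print(4 * N_in / N)  # approaches pi as N grows
```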
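Since the course code is not reproduced here, below is a minimal Metropolis sketch of the 2-dimensional Ising model for questions 2-4, using the same sign convention as equation (4), so that negative \(J\) favours alignment. All parameter values are illustrative; at \(H=0\) the critical point for this convention sits near \(\beta |J| \approx 0.44\), where the recorded energies fluctuate most strongly.

```python
import numpy as np

rng = np.random.default_rng(1)

def metropolis_sweep(s, bJ, bH):
    """One sweep of single-spin-flip Metropolis moves; energies are in units of k_B T."""
    L = s.shape[0]
    for _ in range(L * L):
        i, j = rng.integers(0, L, size=2)
        nn = s[(i + 1) % L, j] + s[(i - 1) % L, j] + s[i, (j + 1) % L] + s[i, (j - 1) % L]
        dE = 2 * s[i, j] * (bH - bJ * nn)  # beta*(E_new - E_old) for flipping spin (i, j)
        if dE <= 0 or rng.random() < np.exp(-dE):
            s[i, j] *= -1

def lattice_energy(s, bJ, bH):
    """beta*E for the whole lattice: eq. (4) generalised to 2D with periodic boundaries."""
    bonds = np.sum(s * np.roll(s, 1, axis=0)) + np.sum(s * np.roll(s, 1, axis=1))
    return bJ * bonds - bH * s.sum()

L, bJ, bH = 32, -0.5, 0.0  # illustrative: beta*|J| = 0.5 is below the critical temperature
s = rng.choice([-1, 1], size=(L, L))

energies = []
for sweep in range(300):
    metropolis_sweep(s, bJ, bH)
    energies.append(lattice_energy(s, bJ, bH))

print("mean spin:", s.mean())  # large aligned domains (|mean spin| -> 1) below the critical point
print("energy fluctuations:", np.var(energies[100:]))  # grows strongly near the critical point
```

Scanning \(\beta |J|\) through a range of values and plotting the variance of the recorded energies (after discarding the first sweeps for equilibration) is one practical way to locate the critical point of question 4.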