A common problem in Quantum Chemistry is the computation of select eigenvalues of a matrix. This problem can be solved much more cheaply than by complete diagonalization if the number of required eigenvalues is reasonably small.

## The Theory

A great account of the theory of iterative projection methods can be found in the book “Templates for the Solution of Algebraic Eigenvalue Problems” (ISBN 0-89871-471-0), see especially chapter 3.2 by Y. Saad. Here we consider only the Davidson method:

Say we want to find an approximation, ũ, to the lowest eigenvector of matrix A in the subspace spanned by some orthonormal basis vectors v₁,v₂, … Such a vector will have contributions in the direction of the exact eigenvector (for which Au-λu=0 holds) and some orthogonal contributions. If ũ is truly ideal in the {v}-subspace, then the so-called Galerkin condition must hold:

r := (Aũ-λũ) ⟂ v₁,v₂, …

where r is called the residue. To write this condition in a computationally meaningful way, let’s expand ũ in (v₁,v₂,…) which we write as the columns of a matrix, V, for convenience

ũ = V * [y₁; y₂; …]

(The coefficients {y} still need to be determined.) The Galerkin condition can now be written

0 = (Aũ-λũ)'*V = y'*V'*A'*V - λ*y'*V'*V

or equivalently, after transposing and using A' = A (A is symmetric) as well as V'*V = 1 (the basis is orthonormal),

(V'*A*V)*y = λ*y

Solving this equation will therefore give us the best approximation to the eigenvalues, λ, and corresponding coefficients, y, within the subspace. These coefficients can then be used to compute residue vectors which can be added to the optimization subspace until the desired eigenvalues and vectors are found with sufficient accuracy.
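To make the subspace idea concrete, here is a minimal toy sketch (my own illustrative example, not part of the derivation above): projecting a diagonal matrix onto the subspace spanned by its first three coordinate vectors and solving the small eigenproblem recovers the three lowest eigenvalues exactly, because that subspace happens to be invariant.

```julia
using LinearAlgebra

# Toy Rayleigh-Ritz check: A is diagonal with eigenvalues 1..6,
# V holds the first three coordinate vectors as an orthonormal basis.
A = Symmetric(diagm(0 => 1.0:6.0))
V = Matrix{Float64}(I, 6, 3)

T = Symmetric(V' * A * V)     # projected 3×3 matrix (V'*A*V)
println(eigen(T).values)      # → [1.0, 2.0, 3.0], the three lowest eigenvalues
```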

## The Code

Let’s look at some code now to make this idea more clear.

### Preconsiderations

At the time of this writing the eachcolumn function has not yet been added to Julia, so I’ll just define a naive version myself for now:

```julia
function eachcolumn(V::AbstractMatrix)
    [V[:,i] for i in 1:size(V,2)]
end
```

We also need a way to orthonormalize a set of basis vectors V. To this end, we can simply use the QR decomposition (from the LinearAlgebra standard library), which in effect goes through the columns of a matrix, orthonormalizing each column against all previous ones while updating the R-factor accordingly. This means we can just use it and throw away the R-factor.

```julia
using LinearAlgebra  # provides qr, I

function orthonormalize(V::Matrix{Float64})
    Q,R = qr(V)
    return Q*Matrix{Float64}(I,size(V)...)
end
```

Note that I have multiplied the matrix Q with an explicit identity matrix. This is possibly the weirdest step in my code, so let me explain: for performance reasons, the QR decomposition does not actually form the Q matrix; only the Householder transformations are stored. While we could use the resulting Q as if it were a regular matrix, indexing into its elements would require applying these transformations on the fly, which is relatively slow. We could also force a conversion into a regular matrix once, but that would compute the full orthogonal matrix. Since we are really only interested in the first few orthonormal vectors/columns, we can get exactly those cheaply by multiplying with the first columns of an identity matrix, which is what I have done here.
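A quick standalone check of this trick (with made-up example dimensions): multiplying the implicit Q by a rectangular identity materializes exactly the first k columns, and those columns are orthonormal.

```julia
using LinearAlgebra

V = rand(10, 3)                          # some non-orthonormal basis
Q, R = qr(V)                             # Q is stored implicitly (Householder form)
Vorth = Q * Matrix{Float64}(I, 10, 3)    # materialize only the first 3 columns

println(norm(Vorth' * Vorth - I))        # ≈ 0: the columns are orthonormal
```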

### Davidson Algorithm

Now let’s finally get to the good stuff. Our function should compute not only the single lowest eigenvalue, but the lowest `NTargetEigvals` many. We also want to grow the subspace by `subspaceincrement` vectors at a time. We then start with an initial guess for V and iterate until the desired eigenvalues converge to some threshold, ϑ, or the subspace reaches `maxsubspacesize`, in which case we abort, returning an error message.

```julia
function Davidson(A::AbstractMatrix, NTargetEigvals::Int=1,
                  subspaceincrement::Int=8, maxsubspacesize::Int=200,
                  ϑ::Float64=1e-8)
    k = subspaceincrement
    M = min(maxsubspacesize, size(A,2))
    V = Matrix{Float64}(I, size(A,1), k)
    eigvals = ones(NTargetEigvals)
    for m in k:k:M
        #...
        eigvalsN = eig.values[1:NTargetEigvals]
        eigvals = norm(eigvalsN-eigvals) >= ϑ ? eigvalsN : return eig.values
    end
    error("Davidson did not converge in $maxsubspacesize-dimensional subspace")
end
```

Now let’s see what we actually do in every iteration. We first orthonormalize the {v}-basis

```julia
V = orthonormalize(V)
```

and solve for the eigenvalues in the spanned subspace

```julia
AV = A * V
T = Symmetric(V' * AV)
eig = eigen(T)
```

I compute the matrix product AV in a separate step only so that I can reuse it in the following steps. The final step is the computation of the residue vectors and their addition to the basis

```julia
w = [AV*y - λ*(V*y) for (λ,y) in zip(eig.values[1:k], eachcolumn(eig.vectors)[1:k])]
V = [V w...]
```
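As a consistency check (a separate sketch with made-up data, reusing nothing from the loop), the residue computed this way is indeed orthogonal to the current subspace, exactly as the Galerkin condition demands:

```julia
using LinearAlgebra

A = Symmetric(rand(20, 20))
V = Matrix{Float64}(I, 20, 4)    # an orthonormal trial basis
AV = A * V
eig = eigen(Symmetric(V' * AV))
λ, y = eig.values[1], eig.vectors[:, 1]
r = AV*y - λ*(V*y)               # residue of the lowest Ritz pair

println(norm(V' * r))            # ≈ 0: r ⟂ v₁,…,v₄
```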

And that’s it! Here’s the complete code again

```julia
function Davidson(A::AbstractMatrix, NTargetEigvals::Int=1,
                  subspaceincrement::Int=8, maxsubspacesize::Int=200,
                  ϑ::Float64=1e-8)
    k = subspaceincrement
    M = min(maxsubspacesize, size(A,2))
    V = Matrix{Float64}(I, size(A,1), k)
    eigvals = ones(NTargetEigvals)
    for m in k:k:M
        V = orthonormalize(V)
        AV = A * V
        T = Symmetric(V' * AV)
        eig = eigen(T)
        w = [AV*y - λ*(V*y) for (λ,y) in zip(eig.values[1:k], eachcolumn(eig.vectors)[1:k])]
        V = [V w...]
        eigvalsN = eig.values[1:NTargetEigvals]
        eigvals = norm(eigvalsN-eigvals) >= ϑ ? eigvalsN : return eig.values
    end
    error("Davidson did not converge in $maxsubspacesize-dimensional subspace")
end
```

## Results

As a quick test, I’ll run the same example as Joshua did in his post

```julia
testmatrix = Symmetric(diagm(0=>1:1200) + 0.0001*rand(1200,1200))
println("Exact eigenvalues: ", eigen(testmatrix).values[1:4])
@time eigen(testmatrix).values;
println("Davidson eigenvls: ", Davidson(testmatrix,4)[1:4])
@time Davidson(testmatrix,4);
```

and here is some example output to show our speedup:

```
Exact eigenvalues: [1.00007, 2.00003, 3.00002, 4.00007]
  0.887761 seconds (20 allocations: 33.390 MiB)
Davidson eigenvls: [1.00007, 2.00003, 3.00002, 4.00007]
  0.026738 seconds (356 allocations: 3.476 MiB)
```

## About This Post

Joshua Goings has a nice presentation of the topic on his blog, in which he shows a nice and straightforward Python implementation. After reading that page, I thought it could be interesting to do the same again in Julia: first and foremost to help spread understanding of this beautiful idea, which lies at the heart of several Quantum Chemistry problems, but also – as a long-time Julia aficionado (who has been using Julia since version 0.2) – to give interested readers a chance to compare for themselves the look and feel of Julia code with that of equivalent Python code. For this reason, I chose to use a structure as close to Joshua’s as possible. Please don’t take this as a face-off, however, as I’m sure that both codes were written with only didactic (not performance) purposes in mind. Big thanks to Joshua for the inspiration.