Ground state energy increase after 10th sweep

I'm calculating the spin 1 bilinear biquadratic model at the critical point $\theta=-PI/4$. For N=32, I did 13 sweeps. The ground state energy after each sweep is:
Sweep 01: -88.497967398468
Sweep 02: -88.909137146883
Sweep 03: -88.917526981673
Sweep 04: -88.917544395609
Sweep 05: -88.917545655638
Sweep 06: -88.917545847203
Sweep 07: -88.917545866916
Sweep 08: -88.917545867250
Sweep 09: -88.917545867323
Sweep 10: -88.917545867276
Sweep 11: -88.917545867230
Sweep 12: -88.917545867091
Sweep 13: -88.917545867087

We can see from sweep 10, the ground state energy start to increase at order 10^(-10) instead of decreasing. Is it because of the largest truncation error I set is 10^-10? Do I consider this as convergence or wrong result? (The sweep info and code is attached in the end)

Besides this, I have a few questions:
1. How should one set "niter" in each sweep? Large or small number?

1. How many sweeps should I do to reach convergence? Should I keep the same "maxm" and do a few sweeps or keep increasing "maxm" to do sweeps (as long as the "maxm" I set is larger than the largest m during sweep)?

2. About the mid step information produced by "dmrg", say this example, I get:
Sweep=1, HS=1, Bond=(1,2)
I 0 q 2E+00 E -65.7609306503
I 2 q 4E-14 E -66.9810071517

What does "I 0" "q 2E+00" "E -65.7609306503" mean? And why are there 2 sets of data?

1. If I want to measure the correlation functions <Sz(j)Sz(N-j+1)>, how can I determine if the correlation function is convergent? Do I compare the correlation function after each sweep? And what's the accuracy of the correlation function compare to the ground state energy?

Thank you.

Sweep information:
input
{
N = 32
nsweeps = 13
sw_table
{
maxm minm cutoff niter noise
20 2 1E-8 9 1E-8
40 2 1E-8 8 1E-9
80 2 1E-10 7 1E-10
120 2 1E-10 6 1E-10
160 2 1E-10 5 0
200 2 1E-10 4 0
240 2 1E-10 4 0
280 2 1E-10 4 0
320 2 1E-10 5 0
360 2 1E-10 6 0
360 2 1E-10 7 0
360 2 1E-10 8 0
360 2 1E-10 9 0
}
quiet = no
}

Code:

include <math.h>

using namespace itensor;
using std::vector;
using std::string;
using std::min;
using std::max;

define PI 3.14159265358979323846

int main(int argc, char* argv[])
{
if(argc != 2)
{
//reminds us to give an input file if we forget
printfln("Usage: %s inputfile",argv[0]);
return 0;
}

auto input = InputGroup(argv[1],"input");
auto N = input.getInt("N");
auto nsweeps = input.getInt("nsweeps");
auto sw_table = InputGroup(input,"sw_table");
auto quiet = input.getYesNo("quiet",true);
auto sweeps = Sweeps(nsweeps,sw_table);
println(sweeps);
println(quiet);

double theta = -PI/4;
double cost = cos(theta);
double sint = sin(theta);

auto sites = SpinOne(N);

auto ampo = AutoMPO(sites);
for(int j = 1; j < N; ++j)
{
ampo += cost,"Sz",j,"Sz",j+1;
ampo += cost*0.5,"S+",j,"S-",j+1;
ampo += cost*0.5,"S-",j,"S+",j+1;
ampo += sint,"Sz*Sz",j,"Sz*Sz",j+1;
ampo += sint*0.5,"Sz*S+",j,"Sz*S-",j+1;
ampo += sint*0.5,"Sz*S-",j,"Sz*S+",j+1;
ampo += sint*0.5,"S+*Sz",j,"S-*Sz",j+1;
ampo += sint*0.5,"S-*Sz",j,"S+*Sz",j+1;
ampo += sint*0.25,"S+*S+",j,"S-*S-",j+1;
ampo += sint*0.25,"S-*S-",j,"S+*S+",j+1;
ampo += sint*0.25,"S+*S-",j,"S-*S+",j+1;
ampo += sint*0.25,"S-*S+",j,"S+*S-",j+1;
}
auto H = IQMPO(ampo);

auto state = InitState(sites);
for(int i = 1; i <= N; ++i)
{
if(i%2 == 1)
state.set(i,"Up");
else
state.set(i,"Dn");
}
auto psi = IQMPS(state);

auto energy = dmrg(psi,H,sweeps,{"Quiet",quiet});

//
// Print the final energy reported by DMRG
//
printfln("\nGround State Energy = %.10f",energy);

return 0;


}

+1 vote
answered Jun 26, 2017 by (20,240 points)

Interesting question. I can't easily say why the energy rises slightly toward the end of your calculation. My best guess would be that it's a subtle interplay of the convergence of the Davidson algorithm (the solver that ITensor DMRG uses at the core 2-site step) with the truncation errors caused by your parameters (cutoff of 1E-10). Because the truncation at each bond should tend to raise the energy, while the Davidson is always trying to lower it, it's conceivable that the behavior is not always a monotonic decrease in energy but rather some small oscillatory behavior as one reaches the fixed point of the algorithm (for your particular settings I mean).

One of your questions (1) is related to what you observed. Typically one wants to set niter to 2 (the minimum sensible value; niter=1 doesn't do anything for historical reasons). The reason is that for "easy" systems such as spin chains one wants to only do a single Davidson step at each bond (niter=2 is a single Davidson step). This is because each Davidson step at every bond contributes to the overall lowering of the energy, and there is usually not a good reason to fully converge the Davidson at each bond in the presence of an unconverged environment of the rest of the MPS. The Davidson steps are the most expensive part of a normal DMRG calculation so one doesn't want to do more than necessary.

However, there can be good reasons to raise niter larger than 2. One reason could be to overcome a poor initial state in the first few sweeps, later lowering niter to 2. Another reason could be for applying DMRG to a system with multiple, widely separated energy scales. Steve White and myself found in our recent quantum chemistry calculations (https://arxiv.org/abs/1702.03650) that taking niter = 6 worked better than niter =2 for converging in a smaller number of sweeps.

Finally, in your particular case of wanting a very high-accuracy result, taking niter > 2 could be a key step. When you get toward the end of your calcualtion, the fact that you are limiting niter=2 could mean that the Davidson is not fully converging, so that your errors are dominated by the truncation. This is consistent with your observation that the errors are of the order of your truncation cutoff.

About your question (2), the symbols printed by the Davidson algorithm are I n q x where n is the Davidson iteration number (only 0 and 2 shown usually) and x is the value of the "residual" vector "q" which is a measurement of how well converged the algorithm is. Ideally q should rapidly decrease with the number of iterations. Finally the number after E is the current energy at that step.

Finally, about correlation functions. If your wavefunction is converged well then measurements of it will be converged too. This is one of the main principles behind DMRG and MPS techniques, namely that reducing the error made when truncating the reduced density matrix also reduces the errors in local observables (since these are obtained by tracing with reduced density matrices after all). For a technical discussion of the rate of convergence of energy versus observables in DMRG and MPS, see the comments in this paper by White and Chernyshev: https://arxiv.org/abs/0705.2746 at the beginning of page 2.