lnd: In-memory graph population is very slow when running with Postgres backend

Background

I’m testing LND with Postgres locally and it is running strangely slowly. Opening the databases takes 2-3 minutes before I can unlock. Running a getinfo command can take minutes to return. However, later on the getinfo calls take the normal 600ms. This isn’t a slight performance hit, it’s a substantial slowdown, so I’m wondering if I’m doing something wrong or if there is some other weird issue.

This is all running locally with Postgres in a Docker container. LND is using Neutrino as its chain backend. Another odd thing is that I don’t actually see LND doing operations against Postgres. It can definitely connect, and it creates tables and values, but I hardly ever see active connections. I’m using this query to check:

SELECT * FROM pg_stat_activity WHERE datname = 'postgres' and state = 'active';

However, I do see some idle connections.

Command I’m running

lnd --bitcoin.active --bitcoin.node=neutrino --bitcoin.mainnet --lnddir=/Users/me/lnd --neutrino.feeurl=https://nodes.lightning.computer/fees/v1/btc-fee-estimates.json --db.backend=postgres --db.postgres.dsn=postgres://postgres:password@localhost:5432/postgres?sslmode=disable --db.postgres.maxconnections=1000

Logs

2021-12-21 06:46:52.978 [INF] CHDB: Checking for schema update: latest_version=24, db_version=24
2021-12-21 06:46:52.978 [INF] LTND: Database(s) now open (time_to_open=3m6.356551208s)!

GetInfo Command

$ time lncli getinfo
{
    "version": "0.14.1-beta commit=v0.14.1-beta",
    ...
}

real	1m40.783s
user	0m0.038s
sys	0m0.036s

Your environment

LND 0.14.1
macOS
Neutrino
Postgres v14.1

Steps to reproduce

Run Postgres in a Docker container locally, point LND at it, and start LND.

Expected behaviour

I would expect LND to behave similarly to how it does with bbolt in this scenario. Sure, there are some differences in how it interacts with the database, but this is a large slowdown.

Actual behaviour

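Things get really slow: opening the databases takes about 3 minutes, and a getinfo call can take over a minute and a half until the node has been running for a while.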

About this issue

State: closed
Created 3 years ago
Comments: 35 (10 by maintainers)

Most upvoted comments

I think that sounds like a good idea as well re: making cache population async. Would at least let some things continue.
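
To illustrate what "making cache population async" could look like, here is a minimal Go sketch. It is not lnd's actual code: graphCache, populateAsync, isReady, numNodes and slowLoad are hypothetical names made up for this example, and the "graph" is just a map. The idea is only that startup returns immediately while the load runs in a goroutine, and that callers needing the cache can either check readiness or block until it is filled.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// graphCache is a hypothetical stand-in for an in-memory graph cache.
// It is NOT lnd's real type; it only illustrates the pattern.
type graphCache struct {
	mu    sync.RWMutex
	nodes map[string]int // node alias -> channel count (made-up data)
	ready chan struct{}  // closed once population has finished
}

func newGraphCache() *graphCache {
	return &graphCache{
		nodes: make(map[string]int),
		ready: make(chan struct{}),
	}
}

// populateAsync runs the (possibly slow) load in a goroutine and returns
// immediately, so the caller can continue starting up.
func (c *graphCache) populateAsync(load func() map[string]int) {
	go func() {
		loaded := load()

		c.mu.Lock()
		c.nodes = loaded
		c.mu.Unlock()

		// Signal every current and future waiter that the cache is filled.
		close(c.ready)
	}()
}

// isReady reports, without blocking, whether population has finished.
func (c *graphCache) isReady() bool {
	select {
	case <-c.ready:
		return true
	default:
		return false
	}
}

// numNodes blocks until the cache is populated, then returns its size.
func (c *graphCache) numNodes() int {
	<-c.ready

	c.mu.RLock()
	defer c.mu.RUnlock()
	return len(c.nodes)
}

// slowLoad simulates a slow read from the database backend.
func slowLoad() map[string]int {
	time.Sleep(50 * time.Millisecond)
	return map[string]int{"alice": 2, "bob": 1}
}

func main() {
	cache := newGraphCache()
	cache.populateAsync(slowLoad)

	// Startup is not blocked here; only calls that need the cache wait.
	fmt.Println("ready right after start:", cache.isReady())
	fmt.Println("nodes once populated:", cache.numNodes())
}
```

With this shape, only the calls that actually depend on the graph would wait for the load, while everything else could proceed. Whether a call like GetInfo should block or return partial data while the cache is still filling is a separate design decision that this sketch does not settle.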

I don’t have a profile of the GetInfo but I can try to make one later. I’m sure it’s related to the graph cache population. As soon as that finishes GetInfo returns like normal.

gkrizek on Jan 4, 2022