lnd: Unable to use lncli and tor

Background

We are unable to use lncli on this node. Instead of connecting to 127.0.0.1, it connects to 36.37.242.94. Similarly, lnd is unable to connect to Tor when Tor is activated. The lnd-tor error also references 36.37.242.94:9051 instead of the expected 127.0.0.1:9051.

We tried deleting all the Tor data and restarting the containers, but the error persists. That makes me suspect this IP address (36.37.242.94) is being stored in one of the lnd databases.

The RPC endpoints are still working from other services we have on the node.

Your environment

  • Lnd 0.7.0.
  • Linux casa-node-apollo 4.14.70-v7+ #2 SMP Wed Sep 19 07:49:26 UTC 2018 armv7l GNU/Linux
  • Bitcoind 0.18.0.
  • Running on a Raspberry Pi 3 B+ in Docker containers.
  • Tor version 0.3.5.8.

Steps to reproduce

On this device, any lncli call results in an error. Lnd will run with Tor off. Lnd will crash if Tor is active.

Expected behaviour

lncli commands should succeed.

Actual behaviour

bash-4.4# lncli getinfo
[lncli] rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 36.37.242.94:10009: i/o timeout"

Tor error is something similar, but I don’t have it on me. Can find it if it helps.

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 29 (9 by maintainers)

Most upvoted comments

@cedracine glad you were able to resolve the issue! thanks for the help @stridentbean 🎉

The network router settings were the problem. Enabling the use of 1.1.1.1 and 8.8.8.8 in the “PPPoE Advanced Settings” solved it. LN is now stable and the channels are connected. Thanks to everyone for the great support. Cedric

rpcserver=127.0.0.1:10009 will only get lncli to work, but it doesn’t solve the tor issue. We are going to try tor.control=127.0.0.1:9051. My working theory is that the default tor control (localhost) is being updated to an ip address other than 127.0.0.1 AND localhost isn’t being updated to the other ip address in other places in lnd.
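As a rough sketch of that workaround, both addresses can be pinned to the loopback IP instead of relying on how `localhost` resolves. `--rpcserver` (lncli) and `--tor.control` (lnd) are existing flags, and the ports are the ones mentioned in this thread. The helper names `lncli_local` and `tor_control_flag` and the `TOR_CONTROL_COMMAND` variable are made up here to match the style of the start script; they are not part of lnd or of the Casa setup.

```shell
#!/usr/bin/env bash

# Hypothetical wrapper: always talk to the local lnd RPC by IP, so that
# lncli never has to resolve the name "localhost".
lncli_local() {
    lncli --rpcserver=127.0.0.1:10009 "$@"
}

# Hypothetical helper: print the flag that pins the Tor control address,
# but only when Tor is enabled (first argument is "true").
tor_control_flag() {
    if [ "$1" = true ]; then
        echo "--tor.control=127.0.0.1:9051"
    fi
}

# Mirrors the *_COMMAND variables in the start script: empty unless
# LND_TOR is set to true.
TOR_CONTROL_COMMAND=$(tor_control_flag "${LND_TOR:-false}")
```

With this in place, `lncli_local getinfo` would replace the bare `lncli getinfo` call, and `$TOR_CONTROL_COMMAND` would be passed to lnd alongside the other Tor flags. It only works around the symptom; it does not explain why `localhost` resolves to a public address.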

Could possibly be related to the use of NAT traversal?

It’s easy enough for us (Casa) to hard code in rpcserver=127.0.0.1:10009 to our lnd.conf or our startup script. It’s just really strange that it is needed and really just for this one node. We have a large fleet of nodes and haven’t seen this issue before.

@cedracine What does your lnd.conf look like? (you can leave out the rpcuser and rpcpass)

The lnd.conf for Casa Nodes is as follows:

# Log file location
logdir=/usr/local/casa/chains/lnd/logs

# Static Channel Backup location
backupfilepath=/root/.lnd/data/scb/channels.backup

# Logging level
debuglevel=warn

Our start_lnd.sh is more complicated, but it still has no reference to --rpcserver:

#!/usr/bin/env bash

# exit from script if error was raised.
set -e

# error function is used within a bash function in order to send the error
# message directly to the stderr output and exit with a non-zero status.
error() {
    echo "$1" >&2
    exit 1
}

# return is used within bash function in order to return the value.
return() {
    echo "$1"
}

# set_default function moves the setting of default env variables from the
# Dockerfile to the script, which lets the user override them during
# container start.
set_default() {
    # docker initializes env variables with a blank string, so we can't just
    # use the -z flag as usual.
    BLANK_STRING='""'
    VARIABLE="$1"
    DEFAULT="$2"

    if [[ -z "$VARIABLE" || "$VARIABLE" == "$BLANK_STRING" ]]; then

        if [ -z "$DEFAULT" ]; then
            error "You should specify default variable"
        else
            VARIABLE="$DEFAULT"
        fi
    fi

   return "$VARIABLE"
}

# default lnd pid to 1.
lnd_pid=1

LND_NETWORK=$(set_default "$LND_NETWORK" "regtest")
CHAIN=$(set_default "$CHAIN" "bitcoin")
BACKEND=$(set_default "$BACKEND" "bitcoind")
AUTOPILOT=$(set_default "$AUTOPILOT" false)

MACAROON_PATH="/root/.lnd/data/chain/bitcoin/testnet/admin.macaroon"
if [ "$LND_NETWORK" = "mainnet" ]; then
  MACAROON_PATH="/root/.lnd/data/chain/bitcoin/mainnet/admin.macaroon"
fi

term_handler() {
  # run the stop request in the background so that $! refers to it
  lncli \
    --macaroonpath="$MACAROON_PATH" \
    stop &
  stopLndCommand="$!"
  wait "$stopLndCommand"
  wait "$lnd_pid"
}

# Setup signal handlers
trap "term_handler" SIGTERM

# Need to supply $RPC_USER and $RPC_PASSWORD, otherwise lnd's RPC calls to
# bitcoind will fail authentication.
if [[ -z "$RPC_USER" || -z "$RPC_PASSWORD" ]]; then
  error '$RPC_USER and $RPC_PASSWORD need to be specified'
fi

# default is #3399FF
COLOR_COMMAND=''
if [[ -n "$COLOR" ]]; then
    COLOR_COMMAND="--color"="$COLOR"
fi

# default is 20000
MIN_CHANNEL_COMMAND=''
if [[ -n "$MIN_CHAN_SIZE" ]]; then
    MIN_CHANNEL_COMMAND="--minchansize"="$MIN_CHAN_SIZE"
fi

AUTOPILOT_COMMAND=''
if [ "$AUTOPILOT" = true ]; then
    AUTOPILOT_COMMAND="--autopilot.active"\ "--autopilot.maxchannels"="$MAX_CHANNELS"\ "--autopilot.maxchansize"="$MAX_CHAN_SIZE"
fi

ALIAS_COMMAND=''
if [[ -n "$LND_NODE_ALIAS" ]]; then
    ALIAS_COMMAND="--alias"="$LND_NODE_ALIAS"
fi

# determine if UPnP is available on this network
# if an error occurs, UPnP is unavailable and lnd cannot use nat.
# if nat is unavailable, we have the option to use an external ip.
EXTERNAL_IP_COMMAND=''

# we need to turn off exit on error temporarily for the upnpc command
set +e
upnpError=$(upnpc -s 2>&1 >out.tmp)
upnpSuccess=$(<out.tmp)
rm out.tmp
set -e

NAT_COMMAND="--nat"

# if a UPnP device is unavailable, or is available but failing in any way, then don't use the nat flag
# if EXTERNAL_IP is not empty, the user is manually specifying their external IP
if [[ -n "$upnpError" || $upnpSuccess =~ "failed" || -n "$EXTERNAL_IP" ]]; then
  NAT_COMMAND=''

  if [[ -n "$EXTERNAL_IP" ]]; then
    EXTERNAL_IP_COMMAND="--externalip"="$EXTERNAL_IP"
  fi
fi

# set bitcoind mainnet and testnet rpc ports
RPC_PORT="8332"
if [ "$LND_NETWORK" = "testnet" ]; then
    RPC_PORT="18332"
fi

# Tor commands
TOR_ACTIVE_COMMAND=""
if [[ "$LND_TOR" = true ]]; then
    TOR_ACTIVE_COMMAND="--tor.active"
fi

TOR_VERSION_COMMAND=""
if [[ "$LND_TOR" = true ]]; then
    TOR_VERSION_COMMAND="--tor.v3"
fi

LISTEN_COMMAND=""
if [[ "$LND_TOR" = true ]]; then
    LISTEN_COMMAND="--listen=127.0.0.1"
    EXTERNAL_IP_COMMAND=''
fi

# turn off nat discovery if Tor is active
if [[ "$LND_TOR" = true ]]; then
    NAT_COMMAND=""
fi

exec lnd \
    "--configfile"="/usr/local/casa/chains/lnd/conf/lnd.conf" \
    "--$CHAIN.active" \
    "--$CHAIN.$LND_NETWORK" \
    "--$CHAIN.node"="$BACKEND" \
    "--$BACKEND.rpcuser"=$RPC_USER \
    "--$BACKEND.rpcpass"=$RPC_PASSWORD \
    "--$BACKEND.rpchost"="127.0.0.1:"$RPC_PORT \
    "--$BACKEND.zmqpubrawblock"="tcp://127.0.0.1:28332" \
    "--$BACKEND.zmqpubrawtx"="tcp://127.0.0.1:28333" \
    "--rpclisten=127.0.0.1:10009" \
    "--restlisten=127.0.0.1:8080" \
    $NAT_COMMAND \
    $ALIAS_COMMAND \
    $COLOR_COMMAND \
    $MIN_CHANNEL_COMMAND \
    $AUTOPILOT_COMMAND \
    $EXTERNAL_IP_COMMAND \
    $TOR_ACTIVE_COMMAND \
    $TOR_VERSION_COMMAND \
    $LISTEN_COMMAND \
    "$@"

lnd_pid="$!"
wait ${!}

lncli doesn’t have any persistent storage, so I don’t think this is a database issue.

@wpaulino’s idea makes the most sense at this point. Is localhost being overridden in /etc/hosts? FWIW that IP belongs to Metfone, a Cambodian ISP.
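One way to test that theory from inside the container is to check what `localhost` actually resolves to. The sketch below is only an illustration: the function names are invented here, and it assumes `getent` is available in the image. It shows the system resolver's view, which may not match exactly what lnd's Go runtime resolves.

```shell
#!/usr/bin/env bash

# is_loopback: succeed if the address looks like 127.x.x.x or ::1.
is_loopback() {
    case "$1" in
        127.*|::1) return 0 ;;
        *) return 1 ;;
    esac
}

# hosts_entries: print the non-comment lines of a hosts file that mention
# the given name. The file defaults to /etc/hosts.
hosts_entries() {
    grep -v '^[[:space:]]*#' "${2:-/etc/hosts}" | grep -w -- "$1"
}

# resolve_localhost: print the first address the system resolver returns
# for "localhost" (assumes getent is installed).
resolve_localhost() {
    getent hosts localhost | awk 'NR==1 {print $1}'
}

# check_localhost: report whether localhost resolves to a loopback address.
check_localhost() {
    resolved_addr=$(resolve_localhost)
    if is_loopback "$resolved_addr"; then
        echo "ok: localhost -> $resolved_addr"
    else
        echo "warning: localhost -> ${resolved_addr:-<unresolved>}"
        return 1
    fi
}
```

Running `check_localhost` and `hosts_entries localhost` on the affected node would show whether the override comes from /etc/hosts. If /etc/hosts looks normal but `localhost` still resolves to a public address, the answer is coming from DNS instead.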