longhorn: [BUG] Excessive s3 call count

Describe the bug Longhorn seems to call s3_list_objects every 6 seconds, resulting in cumulative costs due to API call fees.

To Reproduce Steps to reproduce the behavior:

  1. Install Longhorn and deploy workloads with persistent volumes.
  2. Connect Longhorn to Backblaze B2 via their S3 API.
  3. Accumulate about 12 GB of stored backups.
  4. Watch usage on the Caps & Alerts and Reports pages rise excessively, exceeding the 2.5k free calls per day.

Expected behavior Minimal backup storage checking: at most once an hour, and at least each time a backup has to happen (currently scheduled every Sunday).

Log I’ve tried to look for logs where Longhorn shows some sort of residual polling from a browser window left open somewhere, but I found none.

Environment:

  • Longhorn version: v1.0.0
  • Kubernetes version: “v1.17.5+k3s1”
  • Node OS type and version: K3OS v0.10.1

Additional context My Class C transaction count for “today” as of 14:20 CEST is 8080; the “report” page shows that only s3_list_objects has such an excessively high number.

My Class B transaction count for today (same time) is 4616, split equally between s3_get_object and s3_head_object calls.
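
The roughly six-second figure can be cross-checked from the counts above. The following is a minimal sketch, assuming the daily counter reset at local midnight (the reset time is an assumption, and `seconds_per_call` is a name made up for illustration):

```python
from datetime import time


def seconds_per_call(calls: int, observed_at: time, reset_at: time = time(0, 0)) -> float:
    """Average spacing between calls, given a daily counter and the time it was read."""
    # Elapsed seconds between the assumed counter reset and the observation.
    elapsed = (observed_at.hour - reset_at.hour) * 3600 \
        + (observed_at.minute - reset_at.minute) * 60
    if elapsed <= 0 or calls <= 0:
        raise ValueError("need a positive elapsed time and call count")
    return elapsed / calls


# Counts reported in the issue at 14:20 CEST.
list_interval = seconds_per_call(8080, time(14, 20))     # Class C: s3_list_objects
class_b_interval = seconds_per_call(4616, time(14, 20))  # Class B: get + head combined

print(f"list_objects: one call every {list_interval:.1f} s")
print(f"class B:      one call every {class_b_interval:.1f} s")
```

If the counter actually resets at a different time (for example midnight UTC), pass that as `reset_at`; the result stays in the same range of a few seconds per call.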

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 2
  • Comments: 19 (10 by maintainers)

Most upvoted comments

Not sure if it’s worth reopening, but there still seems to be an excessive number of Class C Transactions (Backblaze terminology) happening. I have roughly thirty volumes, totaling maybe 2 TB of storage. When setting Longhorn (v1.4) to use B2 directly via S3, my monthly Backblaze Class C cost is more than my monthly storage cost. When changing Longhorn to back up to a local NFS target, and then using rclone to sync the data from the NFS server to B2, the Class C transactions drop from tens of thousands per day to hundreds per day.

Leaving some notes:

The backupstore poll monitor is run via the settings_controller, which runs as part of the longhorn-manager (DaemonSet), so there is one per node. On each backupstore poll interval, every settings_controller requests the list of backup volumes. Backupstore poll call count: O(nodes) * O(vol * 4)

Cost of List:

List:
    getVolumeNames:
        driver.list // base (1) listObjects call
        for each first level dir
            driver.list // first level
            for each second level dir
                driver.list // second level
                
    foreach volumeName:
        loadVolume:
            fileExists (1) // fileSize
            read (1)       // getObject
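
The breakdown above can be simulated to see how the calls add up for one poll. This is only a sketch against an in-memory directory tree: `CountingDriver`, `build_tree`, and the two-level prefix layout are hypothetical stand-ins, not Longhorn's actual backupstore code.

```python
from collections import Counter


class CountingDriver:
    """Hypothetical in-memory stand-in for an S3 driver that counts API calls."""

    def __init__(self, tree):
        self.tree = tree        # nested dicts: directory name -> children
        self.calls = Counter()

    def list(self, path):
        self.calls["list_objects"] += 1
        node = self.tree
        for part in path:
            node = node[part]
        return sorted(node)

    def file_exists(self, name):
        self.calls["head_object"] += 1
        return True

    def read(self, name):
        self.calls["get_object"] += 1
        return b"{}"


def list_backup_volumes(driver):
    """Mirror the List pseudocode: walk two directory levels, then load each volume."""
    names = []
    for first in driver.list(()):                       # base listing
        for second in driver.list((first,)):            # first-level listing
            names.extend(driver.list((first, second)))  # second-level listing
    for name in names:
        if driver.file_exists(name):                    # one head per volume
            driver.read(name)                           # one get per volume
    return names


def build_tree(volume_count):
    """Spread volumes over an assumed two-level prefix layout."""
    tree = {}
    for i in range(volume_count):
        first, second = f"{i % 7:02d}", f"{i % 11:02d}"
        tree.setdefault(first, {}).setdefault(second, {})[f"vol-{i}"] = {}
    return tree


driver = CountingDriver(build_tree(30))
volumes = list_backup_volumes(driver)
print(len(volumes), dict(driver.calls))
```

In this model the head and get counts are always equal, which matches the even split between s3_head_object and s3_get_object reported in the issue, while the list count grows with the number of prefix directories.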

@yasker we should refactor this so that there is only a single backupstore monitor for the whole Longhorn installation. Besides the S3 call counts, on each poll all the backupstore monitors will also try to update each volume's lastBackup information.

For v1.1.1 we have added a change to the backupstore polling so that it’s only run by a single node. If you don’t use the DR volume feature you can still leave the backupstore polling turned off. Otherwise you can now safely enable it and the S3 call count should be heavily reduced since the equation now only depends on the volume count and no longer on the node count.

https://github.com/longhorn/longhorn-manager/pull/846
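
To get a feel for what removing the node factor means, here is a rough estimate based on the O(nodes) * O(vol * 4) figure quoted earlier. The node count, the 300-second poll interval, and the `daily_calls` helper are illustrative assumptions, not measured values or Longhorn code.

```python
def daily_calls(volumes, nodes, poll_interval_s, calls_per_volume=4, single_poller=False):
    """Estimate S3 calls per day caused by backupstore polling."""
    if poll_interval_s == 0:
        return 0  # an interval of 0 means polling is turned off
    polls_per_day = 86400 // poll_interval_s
    pollers = 1 if single_poller else nodes  # v1.1.1 and later: one node polls
    return polls_per_day * pollers * volumes * calls_per_volume


# Assumed cluster: 30 volumes, 3 nodes, polling every 300 seconds.
before = daily_calls(volumes=30, nodes=3, poll_interval_s=300)
after = daily_calls(volumes=30, nodes=3, poll_interval_s=300, single_poller=True)
disabled = daily_calls(volumes=30, nodes=3, poll_interval_s=0)

print(f"every node polling: {before} calls/day")
print(f"single poller:      {after} calls/day")
print(f"polling disabled:   {disabled} calls/day")
```

Even with a single poller the estimate still scales with the volume count and the poll frequency, so a longer interval or disabled polling remains the main lever for users who do not need DR volumes.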

This seems to be a big issue. In my case I pay more than 4 USD per month to back up my Longhorn volumes, which only consume 1 GB of storage.

Note the answer given, which is to set the “backup store poll” to 0. This fixed my problem personally; maybe it does for you as well.