badger: Seek iterator with Reverse doesn't work

Code:


package main

import (
	"fmt"
	"io/ioutil"
	"log"
	"os"

	"github.com/dgraph-io/badger"
)

func main() {
	dir, err := ioutil.TempDir("", "badger")
	if err != nil {
		log.Fatal(err)
	}
	defer os.RemoveAll(dir)

	opts := badger.DefaultOptions
	opts.Dir = dir
	opts.ValueDir = dir

	db, err := badger.Open(opts)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	bkey := func(i int) []byte {
		return []byte(fmt.Sprintf("%09d", i))
	}
	bval := func(i int) []byte {
		return []byte(fmt.Sprintf("%025d", i))
	}

	txn := db.NewTransaction(true)

	// Fill in 1000 items
	n := 1000
	for i := 0; i < n; i++ {
		err := txn.Set(bkey(i), bval(i))
		if err != nil {
			log.Fatal(err)
		}
	}

	err = txn.Commit(nil)
	if err != nil {
		log.Fatal(err)
	}

	opt := badger.DefaultIteratorOptions
	opt.PrefetchSize = 10
	opt.Reverse = true

	// Iterate over 1000 items
	var count int
	err = db.View(func(txn *badger.Txn) error {
		it := txn.NewIterator(opt)
		for it.Seek([]byte("0")); it.ValidForPrefix([]byte("0")); it.Next() {
			count++
		}
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Counted %d elements", count)
}

This code returns 0 elements in reverse mode and 1000 elements with reverse = false

About this issue

  • Original URL
  • State: closed
  • Created 6 years ago
  • Comments: 15 (3 by maintainers)

Most upvoted comments

You can set reverse and then Seek to prefix + 0xFF byte. That’d bring you to prefix.

@manishrjain I realize this comment is going on 3 years old, but it’s subtly incorrect, and reverse iterating over a prefix with badger is still very difficult to get right.

The method suggested in this issue: to find the last key with prefix [0x1, 0x2], set reverse=true and then Seek to [0x1, 0x2, 0xFF] But this doesn’t quite solve the problem when data exists with both the prefix and the 0xFF byte, as in the following example DB:

0x1 0x2 0x3
0x1 0x2 0x4
0x1 0x2 0xFF <----- reverse-Seek([0x1, 0x2, 0xFF])
0x1 0x2 0xFF 0x1
0x1 0x2 0xFF 0x2
0x1 0x3
0x1 0x3 0x1

I believe the solution that accounts for all cases is much more involved.

  1. You need to compute the first key lexicographically greater than [prefix, 0xFF, 0xFF, ..., 0xFF], call this increment(prefix) a. increment([0x1, 0x2]) = [0x1, 0x3] b. increment([0x1, 0xFF, 0xFF]) = [0x2] c. increment([0xFF, 0xFF, 0xFF]) = nil, will need to call it.Rewind() instead i. ^ it.Rewind() does NOT work with a prefix AND reverse=True, will need to append one more 0xFF byte than the key length ii. So one special case when all bytes are 0xFF: increment([0xFF, 0xFF, 0xFF]) = [0xFF, 0xFF, 0xFF, 0xFF]
  2. Then reverse-Seek on increment(prefix)
  3. Finally, adjust by calling it.Next() if you arrive at a valid key that doesn’t match the prefix. In this case, you have overshot the last key with the prefix by exactly one.

DB without exact match on incremented key:

0x1 0x2 0x3
0x1 0x2 0x4
0x1 0x2 0xFF 
0x1 0x2 0xFF 0x1
0x1 0x2 0xFF 0x2 <--- reverse-Seek([0x1, 0x3]), it.Valid() == true, arrive at the correct final key for the prefix [0x1, 0x2]
0x1 0x3 0x1

DB with exact match on incremented key:

0x1 0x2 0x3
0x1 0x2 0x4
0x1 0x2 0xFF 
0x1 0x2 0xFF 0x1
0x1 0x2 0xFF 0x2 <--- after it.Next(), arrive at the correct final key for the prefix [0x1, 0x2]
0x1 0x3 <----- reverse-Seek([0x1, 0x3]), it.Valid() == false, it.Item() != nil
0x1 0x3 0x1

It’s all about the choice of seek key. If you want the first data from reverse, you can just call Rewind after setting Reverse option.