nushell: Don't keep duplicate entries in history file
Related problem
Currently, the history file contains duplicates. Removing them with `uniq` takes what appears to be quadratic time.
```
❯ benchmark { history | reverse | where exit_status != 1 | uniq }
39sec 115ms 678µs 598ns
❯ benchmark { history | reverse | where exit_status != 1 }
39ms 329µs 969ns
```
That is 39 seconds versus 39 milliseconds; the gap is huge.
One alternative is to pass only one column to `uniq`:

```
❯ benchmark { history | select command exit_status | where exit_status != 1 | get command | uniq }
```

This is faster in the command-line benchmark, but the lag is still noticeable in the history menu.
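For scale: order-preserving deduplication does not have to be quadratic. Here is a hypothetical sketch in Python (not nushell's actual `uniq` implementation) showing the hash-set approach that makes it linear:

```python
def dedup_keep_first(commands):
    """Drop later duplicates while preserving order.

    Runs in O(n) average time: each command is tested against a hash
    set instead of being compared against every earlier entry.
    """
    seen = set()
    out = []
    for cmd in commands:
        if cmd not in seen:  # O(1) average-case membership test
            seen.add(cmd)
            out.append(cmd)
    return out

print(dedup_keep_first(["ls", "git status", "ls", "cargo build", "git status"]))
# → ['ls', 'git status', 'cargo build']
```

If `uniq` compared every pair of entries instead, a history with tens of thousands of lines would show exactly the kind of multi-second stall measured above.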
Here is an excerpt of my history menu configuration:

```nu
{
  # List all unique successful commands
  name: all_history_menu
  only_buffer_difference: true
  marker: "? "
  type: {
    layout: list
    page_size: 10
  }
  style: {
    text: green
    selected_text: green_reverse
  }
  source: { |buffer, position|
    history
    | select command exit_status
    | where exit_status != 1
    | where command =~ $buffer
    | each { |it| {value: $it.command} }
    | reverse
    | uniq # ⚠️ this is the slow step
  }
}
```
Without `uniq` it is instantaneous, but then you get duplicates everywhere.
Describe the solution you’d like
We should have a feature similar to bash's `HISTCONTROL=ignoreboth:erasedups`.
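The `erasedups` behavior can be sketched in a few lines. This is a hypothetical illustration of the semantics, not bash's or nushell's actual code: whenever a command is appended, every earlier copy is erased first, so the history never accumulates duplicates in the first place:

```python
def append_with_erasedups(history, command):
    """Mimic bash's HISTCONTROL=erasedups: remove all earlier copies
    of `command`, then append it as the newest entry."""
    history = [c for c in history if c != command]
    history.append(command)
    return history

h = []
for cmd in ["ls", "make", "ls", "git push", "ls"]:
    h = append_with_erasedups(h, cmd)
print(h)
# → ['make', 'git push', 'ls']
```

With this policy the deduplication cost is paid once per appended command, instead of once per menu render over the whole file.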
Describe alternatives you’ve considered
Another user mentioned an alternative: meaningful-ooo/sponge, "🧽 Clean fish history from typos automatically."
Sponge quietly runs in the background and keeps your shell history clean from typos, incorrectly used commands, and everything you don’t want to store due to privacy reasons.
Additional context and details
This issue is the outcome of our discussion about why the history menu took so long.
In the meantime, I use this script:
```nu
#!/usr/bin/env nu
# clean-nushell-db

# Expand ~ so the external sqlite3 call receives an absolute path.
let db = ("~/dotfiles/nushell/.config/nushell/history.sqlite3" | path expand)

def get_current_row [db: string] {
    let current_row = (^sqlite3 $db "SELECT COUNT(*) FROM history h")
    echo $"current rows: ($current_row)"
}

get_current_row $db

# Remove failed commands
^sqlite3 $db "DELETE FROM history WHERE exit_status != 0"

# Remove duplicates, but keep one copy of each command.
# https://stackoverflow.com/a/53693544/6000005
^sqlite3 $db "DELETE FROM history WHERE id NOT IN (SELECT MIN(id) FROM history h GROUP BY command_line);"

get_current_row $db
```
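To see what the script's two DELETE statements do, here is a small rehearsal using Python's stdlib sqlite3 on a throwaway in-memory table. The real nushell history table has more columns (timestamps, working directory, and so on); this keeps only the ones the queries touch:

```python
import sqlite3

# Minimal stand-in for nushell's history.sqlite3 schema.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE history (id INTEGER PRIMARY KEY, command_line TEXT, exit_status INTEGER)"
)
rows = [("ls", 0), ("make", 2), ("ls", 0), ("git push", 0), ("ls", 1)]
con.executemany(
    "INSERT INTO history (command_line, exit_status) VALUES (?, ?)", rows
)

# Step 1: remove failed commands.
con.execute("DELETE FROM history WHERE exit_status != 0")

# Step 2: remove duplicates, keeping the oldest copy (lowest id) of each.
con.execute(
    "DELETE FROM history WHERE id NOT IN "
    "(SELECT MIN(id) FROM history GROUP BY command_line)"
)

print([r[0] for r in con.execute("SELECT command_line FROM history ORDER BY id")])
# → ['ls', 'git push']
```

The `MIN(id)` subquery keeps the earliest row per `command_line`; swapping it for `MAX(id)` would keep the most recent occurrence instead, which may match the history-menu use case better.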
Related:
About this issue
- Original URL
- State: open
- Created 2 years ago
- Reactions: 6
- Comments: 15 (11 by maintainers)
An option to skip duplicates on the fly would be great. Keeping the full history for statistics may be useful, but only if you need statistics. And how many users are actually interested in analyzing their history?
At the moment I have 4500 history entries in .zsh_history with deduplication enabled, collected over a few years, and 10000 entries in ~/.config/nushell/history.txt from a few weeks of usage. This is definitely not scalable. While housekeeping is an option (like removing dups/repacking history once per week or so), on-the-fly deduplication is better (and faster, since with unique entries you are unlikely to need to handle thousands of lines).
It would be even better to have a setting that updates the previous occurrence of the same command with the current timestamp.