alacritty: Performance regression in scrollback

I’ve just tested the scrollback PR because I wanted to make sure that scrolling is still faster than on master, and it is. However, while testing I noticed some significant regressions in the scrollback PR compared to current master.

Test Results:

Test                      Master   Scrollback
scrolling                 6s       4.4s
alt-screen-random-write   8.5s     12s
scrolling-in-region       4s       13s

These tests are based on https://github.com/jwilm/alacritty.

I’m not sure why I haven’t noticed this earlier, but it seems like these regressions have been present ever since the scrollback PR was created. Especially the regression in scrolling-in-region seems quite massive.
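To put the table in relative terms, here is a minimal sketch that computes the percentage change from master to scrollback for each test. The times are copied from the table above; the `relative_change` helper is only an illustration and is not part of alacritty or vtebench.

```python
# Wall-clock times in seconds, copied from the table above: (master, scrollback).
results = {
    "scrolling": (6.0, 4.4),
    "alt-screen-random-write": (8.5, 12.0),
    "scrolling-in-region": (4.0, 13.0),
}


def relative_change(master: float, scrollback: float) -> float:
    """Percentage change from master to scrollback; positive means slower."""
    return (scrollback - master) / master * 100.0


for name, (master, scrollback) in results.items():
    print(f"{name:<26}{relative_change(master, scrollback):+7.1f}%")
```

By this measure scrolling is roughly 27% faster on scrollback, alt-screen-random-write is roughly 41% slower, and scrolling-in-region takes more than three times as long.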

These results were captured on X11/i3/compton/amd/mesa.

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 21 (19 by maintainers)

Most upvoted comments

MacBook Pro 13-inch with Iris 550 here. I’m getting some weird timings; I discussed them with chrisduerr on IRC, but neither of us has any idea what’s going on. Tested in fullscreen.

cargo run --release -- -b 50000000 -w $(tput cols) -h $(tput lines) -c alt-screen-random-write > out.vte

### scrollback
> time cat out.vte
real    0m49,175s
user    0m0,004s
sys     0m1,171s

> time cat out.vte
real    0m38,315s
user    0m0,003s
sys     0m1,094s

### master
> time cat out.vte
real    0m25,754s
user    0m0,004s
sys     0m1,125s

> time cat out.vte
real    0m24,456s
user    0m0,004s
sys     0m1,098s

### kitty // weird thing is, even Mac terminal was about as fast as this
> time cat out.vte
real    0m3,915s
user    0m0,003s
sys     0m0,401s

> time cat out.vte
real    0m4,143s
user    0m0,003s
sys     0m0,407s

Timings for a roughly 70MB plain-text file, in case that helps:

scrollback
real 0m16,870s
user 0m0,004s
sys 0m2,360s

master
real 0m14,022s
user 0m0,004s
sys 0m2,129s

kitty 
real    0m27,617s
user    0m0,002s
sys     0m2,694s
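For anyone who wants to compare the alt-screen-random-write runs at the top of this comment without doing the arithmetic by hand, here is a rough sketch that parses the "real" values (including the locale's decimal comma) and averages the two runs per terminal. The `parse_real` helper is a hypothetical name, not something shipped with any of the tools involved.

```python
import re

# Matches bash `time` values such as "0m49,175s" or "1m02.500s"
# (the decimal separator depends on the locale).
_TIME_RE = re.compile(r"(\d+)m(\d+)[,.](\d+)s")


def parse_real(value: str) -> float:
    """Convert a bash `time` value like '0m49,175s' to seconds."""
    match = _TIME_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"unrecognised time value: {value!r}")
    minutes, seconds, fraction = match.groups()
    return int(minutes) * 60 + float(f"{seconds}.{fraction}")


# "real" values of the two alt-screen-random-write runs per terminal.
runs = {
    "scrollback": ["0m49,175s", "0m38,315s"],
    "master": ["0m25,754s", "0m24,456s"],
    "kitty": ["0m3,915s", "0m4,143s"],
}

means = {name: sum(map(parse_real, values)) / len(values) for name, values in runs.items()}
for name, mean in means.items():
    print(f"{name:<11}{mean:7.3f}s  ({mean / means['master']:.2f}x master)")
```

With only two runs per terminal, and the two scrollback runs more than ten seconds apart, the averages are a rough indication at best. On these numbers scrollback comes out at roughly 1.7 times the master time.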

I’m not quite sure. There definitely is still a regression from master to scrollback when it comes to alt-screen-random-write. However, the obvious problems have been fixed.

I’d still like to keep track of places where a potential performance improvement could be achieved, but since I haven’t been able to figure out how to resolve this and the performance hit is not massive, I wouldn’t block scrollback on this.

So I’d either keep this open to track that or maybe open a new issue to indicate that the main issues mentioned in here have been resolved.

I just went through some debugging and performance testing with @chrisduerr on IRC, and we discovered a discrepancy between timing vtebench directly and timing cat on a file containing vtebench’s output. I’ve rerun all the performance tests from @jwilm’s comment above.

The format of this test session is as follows.

=== name of test ===

test command (direct vtebench usage)

test commands (file buffer)

--- master ---

time output (direct vtebench usage)

time output (file buffer)

--- scrollback ---

time output (direct vtebench usage)

time output (file buffer)

Sorry, I’m too lazy to format them as beautifully as he did.

=== scrolling-in-region-1 ===

time ~/Code/vtebench/target/release/vtebench --term xterm -b 500000000 -w (tput cols) -h (tput lines) scrolling-in-region --lines-from-bottom 1

~/Code/vtebench/target/release/vtebench --term xterm -b 500000000 -w (tput cols) -h (tput lines) scrolling-in-region --lines-from-bottom 1 > scrolling-in-region-1.test
time cat scrolling-in-region-1.test

--- master ---

0.03user 65.69system 1:05.87elapsed 99%CPU (0avgtext+0avgdata 4108maxresident)k
0inputs+0outputs (0major+197minor)pagefaults 0swaps

0.00user 66.42system 1:06.53elapsed 99%CPU (0avgtext+0avgdata 1872maxresident)k
0inputs+0outputs (0major+104minor)pagefaults 0swaps

--- scrollback ---

0.03user 55.04system 0:55.15elapsed 99%CPU (0avgtext+0avgdata 3940maxresident)k
0inputs+0outputs (0major+197minor)pagefaults 0swaps

0.00user 54.32system 0:54.37elapsed 99%CPU (0avgtext+0avgdata 1700maxresident)k
0inputs+0outputs (0major+101minor)pagefaults 0swaps

=== scrolling-in-region-2 ===

time ~/Code/vtebench/target/release/vtebench --term xterm -b 500000000 -w (tput cols) -h (tput lines) scrolling-in-region --lines-from-bottom 2

~/Code/vtebench/target/release/vtebench --term xterm -b 500000000 -w (tput cols) -h (tput lines) scrolling-in-region --lines-from-bottom 2 > scrolling-in-region-2.test
time cat scrolling-in-region-2.test

--- master ---

0.01user 65.61system 1:05.74elapsed 99%CPU (0avgtext+0avgdata 3936maxresident)k
0inputs+0outputs (0major+197minor)pagefaults 0swaps

0.00user 65.42system 1:05.51elapsed 99%CPU (0avgtext+0avgdata 1956maxresident)k
0inputs+0outputs (0major+105minor)pagefaults 0swaps

--- scrollback ---

0.02user 55.41system 0:55.52elapsed 99%CPU (0avgtext+0avgdata 3928maxresident)k
0inputs+0outputs (0major+195minor)pagefaults 0swaps

0.00user 54.56system 0:54.61elapsed 99%CPU (0avgtext+0avgdata 1696maxresident)k
0inputs+0outputs (0major+101minor)pagefaults 0swaps

=== scrolling-in-region-50 ===

time ~/Code/vtebench/target/release/vtebench --term xterm -b 500000000 -w (tput cols) -h (tput lines) scrolling-in-region --lines-from-bottom 50

~/Code/vtebench/target/release/vtebench --term xterm -b 500000000 -w (tput cols) -h (tput lines) scrolling-in-region --lines-from-bottom 50 > scrolling-in-region-50.test
time cat scrolling-in-region-50.test

--- master ---

0.02user 68.02system 1:08.13elapsed 99%CPU (0avgtext+0avgdata 3952maxresident)k
0inputs+0outputs (0major+196minor)pagefaults 0swaps

0.00user 67.96system 1:08.07elapsed 99%CPU (0avgtext+0avgdata 2000maxresident)k
0inputs+0outputs (0major+107minor)pagefaults 0swaps

--- scrollback ---

0.03user 54.36system 0:54.48elapsed 99%CPU (0avgtext+0avgdata 3940maxresident)k
0inputs+0outputs (0major+197minor)pagefaults 0swaps

0.00user 53.30system 0:53.37elapsed 99%CPU (0avgtext+0avgdata 1784maxresident)k
0inputs+0outputs (0major+103minor)pagefaults 0swaps

=== alt-screen-random-write ===

time ~/Code/vtebench/target/release/vtebench --term xterm -b 500000000 -w (tput cols) -h (tput lines) alt-screen-random-write

~/Code/vtebench/target/release/vtebench --term xterm -b 500000000 -w (tput cols) -h (tput lines) alt-screen-random-write > alt-screen-random-write.test
time cat alt-screen-random-write.test

--- master ---

22.32user 2.64system 0:27.36elapsed 91%CPU (0avgtext+0avgdata 4056maxresident)k
0inputs+0outputs (0major+193minor)pagefaults 0swaps

0.00user 2.65system 0:05.89elapsed 45%CPU (0avgtext+0avgdata 1708maxresident)k
0inputs+0outputs (0major+102minor)pagefaults 0swaps

--- scrollback ---

22.34user 2.13system 0:26.88elapsed 91%CPU (0avgtext+0avgdata 3992maxresident)k
0inputs+0outputs (0major+193minor)pagefaults 0swaps

0.00user 3.35system 0:10.31elapsed 32%CPU (0avgtext+0avgdata 1700maxresident)k
0inputs+0outputs (0major+103minor)pagefaults 0swaps

=== alt-screen-random-write-color ===

time ~/Code/vtebench/target/release/vtebench --term xterm -b 500000000 -w (tput cols) -h (tput lines) -c alt-screen-random-write

~/Code/vtebench/target/release/vtebench --term xterm -b 500000000 -w (tput cols) -h (tput lines) -c alt-screen-random-write > alt-screen-random-write-color.test
time cat alt-screen-random-write-color.test

--- master ---

28.69user 2.43system 0:33.86elapsed 91%CPU (0avgtext+0avgdata 4092maxresident)k
0inputs+0outputs (0major+197minor)pagefaults 0swaps

0.00user 2.76system 0:06.51elapsed 42%CPU (0avgtext+0avgdata 1868maxresident)k
0inputs+0outputs (0major+102minor)pagefaults 0swaps

--- scrollback ---

29.12user 2.26system 0:34.27elapsed 91%CPU (0avgtext+0avgdata 4004maxresident)k
0inputs+0outputs (0major+194minor)pagefaults 0swaps

0.00user 3.44system 0:09.72elapsed 35%CPU (0avgtext+0avgdata 1700maxresident)k
0inputs+0outputs (0major+103minor)pagefaults 0swaps

This shows fairly clearly that vtebench itself was a bottleneck for the alt-screen-random-write tests: run directly, master and scrollback both take about 27s (about 34s with color), while replaying the file takes 5.89s on master and 10.31s on scrollback (6.51s and 9.72s with color). I’ll let @chrisduerr follow up if he has anything else to add. For anyone else following this thread, it’s also worth noting that these performance numbers are highly dependent on the screen size and basically all of the configuration options, not to mention the host machine. So the only meaningful comparisons are the percentage difference between master and scrollback and the difference between the two methods of running. This should be somewhat obvious.
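As a rough illustration of that percentage comparison, the sketch below pulls the elapsed time out of the first line of each time report above and computes the change from master to scrollback for both ways of running alt-screen-random-write. The helper names `elapsed_seconds` and `percent_change` are made up for this example, and the regular expression assumes the report format shown in this comment.

```python
import re

# Matches the first line of the time reports above, e.g.
# "0.00user 2.65system 0:05.89elapsed 45%CPU (...)".
_TIME_LINE_RE = re.compile(
    r"(?P<user>[\d.]+)user\s+(?P<system>[\d.]+)system\s+"
    r"(?:(?P<hours>\d+):)?(?P<minutes>\d+):(?P<seconds>[\d.]+)elapsed"
)


def elapsed_seconds(line: str) -> float:
    """Return the elapsed wall-clock time of one report line, in seconds."""
    match = _TIME_LINE_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognised time line: {line!r}")
    hours = int(match["hours"] or 0)
    return hours * 3600 + int(match["minutes"]) * 60 + float(match["seconds"])


def percent_change(master: float, scrollback: float) -> float:
    """Percentage change from master to scrollback; positive means slower."""
    return (scrollback - master) / master * 100.0


# alt-screen-random-write report lines, copied from the results above.
samples = {
    "direct vtebench": (
        "22.32user 2.64system 0:27.36elapsed 91%CPU",
        "22.34user 2.13system 0:26.88elapsed 91%CPU",
    ),
    "file buffer": (
        "0.00user 2.65system 0:05.89elapsed 45%CPU",
        "0.00user 3.35system 0:10.31elapsed 32%CPU",
    ),
}

for method, (master_line, scrollback_line) in samples.items():
    master = elapsed_seconds(master_line)
    scrollback = elapsed_seconds(scrollback_line)
    print(f"{method:<16}{master:7.2f}s -> {scrollback:7.2f}s "
          f"({percent_change(master, scrollback):+.1f}%)")
```

Run directly, the two branches are within about 2% of each other. Replayed from the file, scrollback is roughly 75% slower, which is the regression the direct runs were hiding.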

Relevant system info. I use X11, running i3. My video card is a GTX 980 running nouveau drivers. Some additional info below:

uname -a
Linux pukak 4.16.9-1-ARCH #1 SMP PREEMPT Thu May 17 02:10:09 UTC 2018 x86_64 GNU/Linux

cat /proc/cpuinfo | grep "model name" | head -n 1
model name	: Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz

Ok, I’ll give it a shot on a few of my systems. I might need a little hand holding, I’ll just find someone on IRC if I do.

Disabling this branch by changing its condition to if false && region.start == Line(0) { fixes all of the performance issues with scrolling-in-region, so the culprit is probably in there.