napari: Incorrect processing of keybindings consisting of a combination of modifiers without a key

šŸ› Bug

napari incorrectly handles keybindings consisting of a combination of modifiers without a key such as Control-Alt. If you try to add a callback on that combination, it doesn’t work.

The issue comes from the app-model package in two different ways:

  1. It cannot correctly and consistently parse a combination of modifiers without a key from a string. For example, Ctrl-Alt and Alt-Ctrl produce two different keybindings (https://github.com/pyapp-kit/app-model/blob/332d59b06e5f85e8c39f8e72810b2ad6a344f6a0/src/app_model/types/_keys/_keybindings.py#L236-L270)
  2. When Ctrl-Alt comes from the system it produces two different keybindings depending on which key is pressed first:
  • SimpleKeyBinding(ctrl=True, alt=True, key=<KeyCode.Alt>) if Ctrl precedes Alt.
  • SimpleKeyBinding(ctrl=False, alt=True, key=<KeyCode.Ctrl>) if Alt precedes Ctrl.

You may not believe it, however this issue can be overcome by using some hacky key strings. To reliable handle, for example, the Control-Alt combination, you need to bind your callback to the following 2 combinations:

bind_key('Control-Alt-Alt', callback)  
bind_key('Alt-Control', callback)

Remember, that the order is very important! Control-Alt instead of Alt-Control doesn’t work šŸ˜„ .

To Reproduce

Steps to reproduce the behavior:

  1. Try to bind your callback with this: @napari.Viewer.bind_key('Control-Alt').
  2. It doesn’t work.

cc @tlambert03

About this issue

  • Original URL
  • State: open
  • Created a year ago
  • Comments: 36 (34 by maintainers)

Most upvoted comments

one more potential approach I want to put out there is the possibility of having the on-press only trigger after a certain time if there are conflicts (e.g. with Ctrl+Z and Ctrl bound, the Ctrl keybinding will only trigger after 200ms)

it is responsibility of a person who does it. If they do it, it means they do it intentionally.

this is a good point. I’d be open to this on the sole condition that it is done by the user. I don’t want a plugin dev setting this as their default keybinding for something and then making users confused when they’re pressing Shift as part of a combo and something weird happens

I commented on the other PR, but I’m a big fan of the functional shortcuts built around easy reach and gestures, rather than F is for fill T is for tool, etc. When using copy/paste/cut/undo, you naturally have a hand on that sector of the keyboard, so using the modifier keys and ASDFZXCV and the mouse/stylus/trackpad is very natural. For example, I’ve restored the old modifiers: alt for eraser and ctrl for fill, shift to preserve labels—it’s easier than reaching to the numbers or looking for P on the right side of the keyboard. Photoshop uses both the letter -> mnemonic and the modifiers on clicks/drags/etc—often tool specific. You need to learn it, but it becomes second nature–so says my wife anyways, who uses photoshop all the time.

In the brush mode CVAT allows to change the brush size by holding Ctrl+Alt + mouse move. I’d say it is much more faster and convenient to control the brush size in that way than using any keyboard keys for decreasing and increasing it by a fixed step.

Ah yes, Blender does this too but with the F key in sculpting mode (and Shift+F for intensity) - it’s incredibly useful

@kne42 You can use individual modifiers in other than just mouse mode scenarios. For example, I bind Shift to make a labels layer invisible while you hold it, and make it visible again when you release it (you can inspect that a mask is drawn correctly as the mask itself covers the original image). It’s really handy for me and for other people that I work with as the Shift is a big key that many people can automatically find and press without much thinking, which is very important when it comes to providing efficient annotation UI. You also can apply some temporary visual transformations on Shift/Ctrl/Alt when you hold them and so on.

The biggest advantage of these keys that they are much easier to locate and remember for many people than usual keys like A, B, etc. So, I’m strongly against removing individual modifiers from binding.

which do you think would be more appropriate?

I’m not sure the distinction matters all that much. given the amount of time you’ve spent thinking about it. I suspect everyone would be very happy to hear your opinion (and of course, the conversation will inevitably follow)

mouse events

yeah, I agree that modifier-mouse events are definitely useful. one way or the other, that should be possible

If someone tries to bind a modifier that conflicts with other shortcuts, issue a warning message that shows a list of possible conflicting shortcuts.

This seems like something that should go in no matter what we choose. It would also work with plugins, and it would allow users to know what to re-bind if they are using conflicting plugin.

Please avoid using the argument that we don’t care about our users.

@jni I’m really sorry if my phrase is interpreted like that, I didn’t want at all to imply that you don’t care about your users. I just wanted to emphasize that there is a possibility that you may not fully understand the real needs of your users, and by doing something with the best of intentions you could actually make it worse (if you disabled or limited using single key modifiers it would be exactly this case for me personally).

Your argument is that Shift is more useful than V because it’s easier to find, right?

One of the advantages, yes. First, it is a big well-known key, that almost anyone can find automatically without looking at the keyboard. Second, I place my left hand in the left bottom corner of the keyboard during segmentation, so that my fingers are on Ctrl, Alt, Shift, Z, X at the same time. It allows to do the following things without moving the hand at all:

  • Undo the last operation Ctrl+Z
  • Switch between adding and removing by pressing X
  • Hide/show labels by holding Shift
  • Change the brush size by holding Ctrl+Alt + mouse move

All of the mentioned actions can be performed tens of times during editing a single segmentation mask.

If you placed hide/show on V, I would need to move my index finger to find that key, which is an unnecessary movement and it would be easier to accidentally miss and hit another key. The whole thing about ergonomics is to minimize any unnecessary movements during the process as much as possible.

Do you really have struggling users with spurious Ctrl triggering when they press Ctrl-Z given the fact that they bind Ctrl themselves or any similar scenarios? I highly doubt so.

No, when binding is initiated by the user, this hasn’t happened. But on an application level, we indeed already had this very discussion:

https://github.com/napari/napari/pull/3072#issuecomment-893199773

That is my point, I’m not saying forbid this behaviour, I’m saying, we should be careful in how we expose it to minimise user confusion.

I think if you care about your users and devs, you should not decrease flexibility.

Please avoid using the argument that we don’t care about our users. (I recommend following this rule both here and in other projects.) Maintaining a popular open source project is hard enough (e.g.) without being accused of not caring about our users. It is very demotivating. We have hundreds of ongoing conversations like this one about many design decisions because we care about our users. The whole conversation is about reducing confusing behaviour for users.

As to whether more flexibility is always better, I recommend this blog post by Paul Ganssle:

https://blog.ganssle.io/articles/2023/01/attractive-nuisances.html

A particularly insidious type of attractive nuisance occurs when you have a feature that may have legitimate niche use cases, but also provides a solution to a particularly common XY Problem. Imagine that 0.1% of all your users would find feature X useful, but it will induce 5% of your users into using an anti-pattern — by adding the feature, you would get a 50:1 misuse to use ratio. This isn’t even so bad when the misuses are easily detectable (e.g. pandas’ SettingWithCopyWarning), but sometimes the anti-pattern is something that causes problems only at scale, or only way down the road.

I’m not saying that this is the case here! I’m saying that it is worth pausing and considering whether this is the case, before rushing to implement something.

Everyone here is trying to consider carefully the best way forward, and arguments like ā€œother approaches are bad if you care about your usersā€ do not promote careful consideration.

I apologise @ksofiyuk if all that came across as harsh; it is not my intent: I am just trying to express how I felt above and why I felt that way, so that we can work better together in the future. I totally appreciate the effort you’re putting into napari! And I’m confident we can find a solution here that satisfies everyone.

At any rate:

It’s Python, even if you remove this functionality from public API, I can get access to the internals of napari or Qt and change its behavior in completely unexpected ways by patching methods or doing other manipulations.

See above about ā€œattractive nuisanceā€ — the discussion is not about what’s possible, it’s about what’s encouraged, and how.

I’m sure there are other people who find very useful, for example, binding some temporary visual effects like making the labels layer visible/invisible to Shift

Oh yeah this is definitely useful. I remember some user logs for people tasked with proofreading segmentations at Janelia — one of the proofreaders pressed the F key (which was the shortcut to toggle hide/show segmentation) >18,000 times per day! 😵 Actually in napari this used to be V for the currently-selected layer, but that’s currently broken. I think this is a known issue but I can’t find it — I wonder if @psobolewskiPhD or @kne42 remember?

Your argument is that Shift is more useful than V because it’s easier to find, right?

Anyway, let’s work together to understand all the requirements, risks, and possible solutions — there’s been a lot of discussion and different examples (CVAT, Shift, modifier order), each of which might have different solutions.

@ksofiyuk

If someone binds something to Ctrl and it can be triggered by Ctrl-Z, it is responsibility of a person who does it. If they do it, it means they do it intentionally.

I think this argument goes out the window once we allow plugins to set key bindings. At that point, if a plugin enables this key binding, there is a high probability that napari gets a bug report about the weird visual flicker when they do Ctrl-Z, and it becomes the core devs’ job to hunt down which plugin caused this weird thing to happen.

@ksofiyuk I would love to see that PR, please ping me for review if you open it!

I think that kind of behavior is unacceptable for software that tries to be useful for annotation as there are many possible ways to effectively utilize modifiers keys. If you look at similar software such as CVAT, it has some useful hotkeys that are bind to only modifiers.

Circling back to this could you provide specific examples of what they’re doing since I’m not familiar with their software? That would help inform my proposal greatly.

Currently I’m thinking about allowing single-modifier bindings for users only but not multi-modifier only bindings due to the added complexity.

the above would be better implemented by adding some contexts around whether or not modifiers are pressed which could be queried directly during the mouse callback