ksh: Autocomplete should not fill partial multibyte characters

If I have files 'XXXá' and 'XXXë' ($'XXX\xc3\xa1' and $'XXX\xc3\xab') and have typed XX as a command argument, the first Tab should append X to show XXX. What actually happens is that autocomplete attempts to complete to $'XXX\xc3', since the first byte of á and ë is the same. This displays XXX^ and leaves the editor in a bad state where subsequent keypresses can move the cursor to before the start of the line.
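
The effect described above can be reproduced outside ksh. The following is a minimal sketch, not ksh's actual code: it assumes valid UTF-8 input (ksh itself works with the locale's multibyte encoding), and the function names common_prefix_bytes and common_prefix_utf8 are made up for illustration. It compares a byte-wise common prefix with one that is backed up to a character boundary.

```c
#include <assert.h>
#include <stddef.h>

/* True if c is a UTF-8 continuation byte (bit pattern 10xxxxxx). */
static int is_continuation(char c)
{
    return ((unsigned char)c & 0xC0) == 0x80;
}

/* Length in bytes of the longest common prefix, compared byte by byte.
 * This naive comparison can stop in the middle of a multibyte character. */
size_t common_prefix_bytes(const char *a, const char *b)
{
    size_t n = 0;
    assert(a != NULL && b != NULL);
    while (a[n] != '\0' && a[n] == b[n])
        n++;
    return n;
}

/* Same prefix, shortened so that it never ends inside a character.
 * Assumption of this sketch: both inputs are valid UTF-8. */
size_t common_prefix_utf8(const char *a, const char *b)
{
    size_t n = common_prefix_bytes(a, b);
    /* If the byte just past the prefix is a continuation byte, the
     * prefix ended mid-character: back up to that character's lead byte. */
    while (n > 0 && (is_continuation(a[n]) || is_continuation(b[n])))
        n--;
    return n;
}
```

For the two file names from the report, "XXX\xc3\xa1" and "XXX\xc3\xab", the byte-wise function returns 4 (the prefix ends after the shared lead byte 0xc3), while the boundary-aware one returns 3, which is the XXX the user expects.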

About this issue

State: closed
Created 3 years ago
Comments: 21

Commits related to this issue

Add --globcasedetect shell option for globbing and completion
One of the best-kept secrets of libast/ksh93 is that the code includes support for case-insensitive file name generation (a.k.a. pathname... — committed to ksh93/ksh by McDutchie 3 years ago

Most upvoted comments

Answering the question asked here, I think if you do that, you violate least surprise. The expectation is that what I typed was used as the match.

Probably not one person in 100 ever thinks about the underlying filesystem being case-insensitive, ESPECIALLY if it is case-preserving, as HFS+ and APFS are… or whack ideas like NTFS where the underlying filesystem is case-sensitive but the API calls are not.

macOS Catalina:

$ cat reproducer.sh
touch A
touch AB
touch AbC

echo A*
echo AB*
$ ksh reproducer.sh 
A AB AbC
AB

Mint Linux:

mint $ cat reproducer.sh 
touch A
touch AB
touch AbC

echo A*
echo AB*
mint $ ksh reproducer.sh 
A AB AbC
AB

I would not expect the second echo to produce AB AbC on macOS and not on Mint, and I know perfectly well that one of those is case-insensitive and the other is not.

Worse would be if the behavior was different.

It’s an issue of “correct” vs “right”.

posguy99 on Mar 19, 2021

Is this a crazy idea?

It’s probably the only portable-ish way of doing it and should give accurate results, so not crazy. There may be different platform-specific solutions that you can use instead, though, that might be good enough. A search brings up that pathconf can take _PC_CASE_SENSITIVE on macOS or _PC_CASE_INSENSITIVE on Cygwin, though I have not tested these.

hvdijk on Mar 17, 2021
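
A rough sketch of the pathconf approach mentioned in the comment above, untested on macOS and Cygwin just like the original suggestion. The function name fs_case_insensitive is hypothetical. The sketch assumes that both constants are preprocessor macros where they exist, that _PC_CASE_SENSITIVE yields a nonzero value for case-sensitive file systems, and that _PC_CASE_INSENSITIVE yields a positive value for case-insensitive ones. On platforms that define neither, it reports "unknown".

```c
#include <assert.h>
#include <unistd.h>

/* Returns 1 if the file system holding dir appears to be case-insensitive,
 * 0 if it appears to be case-sensitive, and -1 if this cannot be determined. */
int fs_case_insensitive(const char *dir)
{
    assert(dir != NULL);
#if defined(_PC_CASE_SENSITIVE)
    {
        /* macOS (assumed semantics): nonzero means case-sensitive. */
        long r = pathconf(dir, _PC_CASE_SENSITIVE);
        return r < 0 ? -1 : (r == 0);
    }
#elif defined(_PC_CASE_INSENSITIVE)
    {
        /* Cygwin (assumed semantics): positive means case-insensitive. */
        long r = pathconf(dir, _PC_CASE_INSENSITIVE);
        return r < 0 ? -1 : (r > 0);
    }
#else
    /* No platform-specific query is available here. */
    (void)dir;
    return -1;
#endif
}
```

A caller would treat -1 as "fall back to case-sensitive matching", so that platforms without either constant behave exactly as before.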

I think that’s fine. I think the bug is actually at https://github.com/ksh93/ksh/blob/14352ba0a7383151b9503757ae6b8838b57e7000/src/cmd/ksh93/edit/completion.c#L81-L82 and if I change this to

       register const char *strnext;
       while((strnext=str,c= mbchar(strnext)) && (d= mbchar(newstr),charcmp(c,d,nocase)))
               str=strnext;

then the problem is avoided. However, charcmp is not designed to be called with anything other than unsigned chars converted to int, so this will need more work.

hvdijk on Mar 16, 2021
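
One possible shape for the "more work" mentioned in the comment above is a comparison that accepts decoded wide characters rather than unsigned chars. This is only a sketch under assumptions: wcharcmp is a hypothetical name, not a ksh function, and it returns nonzero when the two characters match, which is how the loop in the comment uses the result of charcmp. It also assumes the values passed in are wide-character codes suitable for towlower.

```c
#include <assert.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical wide-character counterpart to a byte-oriented compare.
 * Returns nonzero if a and b match, optionally ignoring case. */
int wcharcmp(wint_t a, wint_t b, int nocase)
{
    assert(a != WEOF && b != WEOF);
    if (nocase) {
        /* towlower leaves characters without a lowercase form unchanged. */
        a = towlower(a);
        b = towlower(b);
    }
    return a == b;
}
```

Case folding through towlower depends on the current locale, so for non-ASCII letters the result is only meaningful after setlocale has selected a suitable locale.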