ksh: Failure to launch external commands due to race condition in 'posix_spawn' usage

There has been a bug report of an exec format error when executing “cat TODO | while read line; do ls; done”.

Ref:

I was able to reproduce this issue on 93u+m 2020-07-14 and it appears to be an issue at least since 93u+ 2012-08-01.

$ cat TODO
1
2
$ cat TODO | while read line; do ls; done
ls: ls: cannot execute [Exec format error]
ls: ls: cannot execute [Exec format error]

Tagging @chrisbertin and @mirabilos - reporters of the issue.

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 88

Commits related to this issue

Most upvoted comments

Here are some new tests on my Mac laptop.

With ksh’s racy use of posix_spawn:

$ ./ksh_POSIX_SPAWN
$ printf "%s\n" 1 2 3 | while read line; do ls /dev/null; done
/dev/null
ls: ls: cannot execute [Exec format error]
ls: ls: cannot execute [Exec format error]
$ time for ((i=1; i!=10000; i++)); do /usr/bin/true; done

real	0m13,84s
user	0m04,48s
sys	0m08,30s

With ksh’s vfork fallback, after applying the patch based on OpenSUSE’s:

$ ./ksh_VFORK
$ printf "%s\n" 1 2 3 | while read line; do ls /dev/null; done
/dev/null
/dev/null
/dev/null
$ time for ((i=1; i!=10000; i++)); do /usr/bin/true; done

real	0m13,97s
user	0m04,54s
sys	0m08,39s

…and, for reference, the classic fork/exec that ksh2020 reverted to:

$ ./ksh_FORK
$ printf "%s\n" 1 2 3 | while read line; do ls /dev/null; done
/dev/null
/dev/null
/dev/null
$ time for ((i=1; i!=10000; i++)); do /usr/bin/true; done

real	0m21,76s
user	0m07,23s
sys	0m11,65s

@JohnoKing’s Linux tests and my own Mac tests (above) suggest that vfork performs just as well as posix_spawn (if there is any difference, it’s lost in the margin of error), and it doesn’t have these problems. So in terms of performance, this is not going to be a significant loss at all.

I’ll note that there is currently a patch that adds a POSIX_SPAWN_TCSETPGROUP flag to glibc: https://patchwork.sourceware.org/project/glibc/patch/20211223173003.1037286-1-adhemerval.zanella@linaro.org

If that patch gets merged into glibc’s upstream codebase then perhaps we could use that flag to safely avoid usage of fork() in interactive shells.

I just merge this patch [1] and it will be included on glibc 2.35.

[1] https://sourceware.org/git/?p=glibc.git;a=commit;h=342cc934a3bf74ac618e2318d738f22ac93257ba

I’ll note that there is currently a patch that adds a POSIX_SPAWN_TCSETPGROUP flag to glibc: https://patchwork.sourceware.org/project/glibc/patch/20211223173003.1037286-1-adhemerval.zanella@linaro.org

If that patch gets merged into glibc’s upstream codebase then perhaps we could use that flag to safely avoid usage of fork() in interactive shells.

So, after this slightly desperate megathread, could the fix really be this trivial? Please test…

diff --git a/src/cmd/ksh93/sh/xec.c b/src/cmd/ksh93/sh/xec.c
index 4f99919..0991406 100644
--- a/src/cmd/ksh93/sh/xec.c
+++ b/src/cmd/ksh93/sh/xec.c
@@ -1587,7 +1587,7 @@ int sh_exec(register const Shnode_t *t, int flags)
 #else
 #if SHOPT_SPAWN
 #   ifdef _lib_fork
-                               if(com)
+                               if(com && !job.jobcontrol)
                                        parent = sh_ntfork(shp,t,com,&jobid,ntflag);
                                else
                                        parent = sh_fork(shp,type,&jobid);

@hyenias, yes – disabling SHOPT_SPAWN (by adding -DSHOPT_SPAWN=0 to the compiler option) will still revert to the classic fork/exec mechanism, just as it does now. The patch causes the AST spawnveg() function to use vfork, but disabling SHOPT_SPAWN causes spawnveg() to not be used at all.

That’s what Apple has been doing all along for their /bin/ksh, which is how I know this. I found I had to remove -DSHOPT_SPAWN=0 as a default compiler option from cc.darwin in order for vfork to be used.

@aweeraman, hold off a little – this fix exposes a variant of #86 that only occurs with job control active. I’m working out a good way to fix it.

This is brilliant, @McDutchie @JohnoKing et al. Thanks for your persistence!

@chrisbertin and @mirabilos - this will be included in the next upload fairly shortly.

I think it makes sense now. If job control is not active, the terminal process group ID never needs to be set; child processes simply inherit the terminal group from the parent.

However, with job control active, each child process spawned from the main shell environment (not subshells) needs to be in its own terminal process group, or you can’t separate the jobs from each other. But you can’t do this with posix_spawn. Hence this bug.

So it’s clear that this posix_spawn usage is incompatible with job control and that is why this is so trivial to fix/avoid.

What I don’t understand is why it doesn’t fail consistently.

That patch is an effective fix for the race conditions, at least from my own testing.

@JohnoKing, first a question: I’ve sent you a test email, did you receive it? I think there may be a delivery problem.

I did receive the email. I also sent a reply.

Second, something occurred to me. The original bug report above is only reproducible on an interactive shell. So I decided to test for the testprocessgroup.c race condition on FreeBSD from a script. And I can’t reproduce it. Can you confirm the race condition cannot be reproduced if the test loop is run from a script?

I can confirm that the race condition is not reproducible in scripts with either posix_spawn or vfork; it only occurs in interactive shells (tested on Linux and FreeBSD). This means we can keep using posix_spawn, but only in scripts.

I’ve done some testing and the race condition still occurs on FreeBSD when spawnveg uses vfork. It looks like fork will have to be used there. ~as a reference the other shell that uses vfork on Linux (dash) uses fork on FreeBSD (going off of the output from truss)~. The feature test will still need to be fixed though since it fails with fork, even though the race condition doesn’t actually occur with fork.

The ENOTTY error makes sense: the argument to tcgetpgrp(2) is a file descriptor, and 2 is standard error. And standard error is not on a tty; iffe redirects standard error (to /dev/null, probably) so that negative test results don’t clutter the terminal with compiler errors.

I know how I would deal with that in shell: I’d use a 2>/dev/tty redirection. In C I’d probably just open /dev/tty using open(2) and use the returned file descriptor instead of 2, something like this:

/* testgrp.c */
#include <stdlib.h>
#include <unistd.h>
#include <termios.h>
#include <sys/wait.h>
#include <fcntl.h>

int main(void)
{
    int tty_fd = open("/dev/tty", O_RDWR | O_NONBLOCK | O_NOCTTY);
    
    if (tty_fd < 0)
        exit(2);

    /* note: on older OSes ioctl will need to be used instead */
    /*tcsetpgrp(tty_fd, getpid());*/
    pid_t parent_tcpgrp = tcgetpgrp(tty_fd);
    
    pid_t child = vfork();
    if(!child)
        tcsetpgrp(tty_fd, getpid());
    /* The wait is in case vfork is just fork */
    wait(NULL);
    if(tcgetpgrp(tty_fd) == parent_tcpgrp)
        exit(1);
    exit(0);
}

Re the crash (which I can reproduce on Linux but not on the Mac), I think it makes some vague kind of sense that this would terminate an interactive shell, as you’re running tcsetpgrp on the terminal associated with the interactive shell, which I think could cause an inconsistent state. I could be off base there, but my testing says that it works when run non-interactively, like:

$ arch/*/bin/ksh -c 'v=$(./testgrp; echo $?); echo $v' 2>/dev/null
0

I note that on Linux, $? is 0, but on the Mac and FreeBSD, it is 1… bad news?

If vfork can in fact be used validly (as dash does), then why wouldn’t the same apply to posix_spawn?

The posix_spawn API does not have a standard way of setting the terminal group in a child process. POSIX_SPAWN_SETPGROUP exists, but that doesn’t set the terminal group. From Glenn Fowler’s response to a similar question on the old mailing list:

the ast spawnveg() when iffe’d for posix_spawn() uses POSIX_SPAWN_SETPGROUP this doesn’t help with tcsetattr() however since process group != terminal group

Section of code in spawnveg that uses POSIX_SPAWN_SETPGROUP: https://github.com/ksh93/ksh/blob/bd88cc7f4f7207b48104e37cca90848adf5cb10c/src/lib/libast/comp/spawnveg.c#L62-L63

As far as extensions to posix_spawn are concerned, the only one I’m aware of that may fix this problem is POSIX_SPAWN_TCSETPGROUP, which is only present in QNX and BlackBerry OS: http://www.qnx.com/developers/docs/7.0.0/index.html#com.qnx.doc.neutrino.lib_ref/topic/p/posix_spawn.html https://developer.blackberry.com/native/reference/core/com.qnx.doc.neutrino.lib_ref/topic/p/posix_spawn.html

I found the trigger for the freeze. It is the redirect 2>/dev/null at the end of the test, which suppresses the extra newline ksh prints at the end of an interactive session. Removing that fixes the freeze, but prints an extra newline while running the tests. But an explicit exit also works to suppress that newline, and does not cause a freeze.

Here is a new version of the regression test that avoids the freeze. Note that stderr needs to be on a terminal to trigger a test failure on ksh with posix_spawn usage.

# ======
# Test for bug in ksh binaries that use posix_spawn() for external commands.
# See discussion at: https://github.com/ksh93/ksh/issues/79
actual=$(
	"$SHELL" -i <<-\EOF 2>/dev/tty
	printf '%s\n' 1 2 3 4 5 | while read
	do	ls /dev/null
	done 2>&1
	# suppress extra newline printed by ksh -i on termination
	exit
	EOF
)
expect=$'/dev/null\n/dev/null\n/dev/null\n/dev/null\n/dev/null'
[[ $actual == "$expect" ]] || err_exit 'Race condition while launching external commands' \
	"(expected $(printf %q "$expect"), got $(printf %q "$actual"))"

This is the code spawnveg runs in the child process after fork and vfork: https://github.com/ksh93/ksh/blob/bd88cc7f4f7207b48104e37cca90848adf5cb10c/src/lib/libast/comp/spawnveg.c#L207-L225 tcsetpgrp and ioctl aren’t run when launching commands. Inserting error(ERROR_warn(0) in the if(m) statements can be used to confirm it:

# Add a debug warning before tcsetpgrp
--- a/src/lib/libast/comp/spawnveg.c
+++ b/src/lib/libast/comp/spawnveg.c
@@ -216,7 +216,10 @@ spawnveg(const char* path, char* const argv[], char* const envv[], pid_t pgid)
 				setpgid(pgid, 0);
 #if _lib_tcgetpgrp
 			if (m)
+			{
+				error(ERROR_warn(0), "[DEBUG]: Setting process group");
 				tcsetpgrp(2, pgid);
+			}
 #else
 #ifdef TIOCSPGRP
 			if (m)
$ $(whence -p true)
# Even with the above patch, there is no debug output after running the external `true` command.
# This means the child process didn't set the terminal process group.

This issue can be fixed for vfork and fork by running tcsetpgrp (or ioctl) before setpgid. Below is a quick patch that does that:

--- a/src/lib/libast/comp/spawnveg.c
+++ b/src/lib/libast/comp/spawnveg.c
@@ -211,7 +211,14 @@ spawnveg(const char* path, char* const argv[], char* const envv[], pid_t pgid)
 		{
 			m = 0;
 			if (pgid == 1 || pgid == -2 && (m = 1))
+			{
 				pgid = getpid();
+				if(!m)
+				{
+					error(ERROR_warn(0), "[DEBUG]: Setting process group");
+					tcsetpgrp(2, pgid);
+				}
+			}
 			if (setpgid(0, pgid) < 0 && errno == EPERM)
 				setpgid(pgid, 0);
$ $(whence -p true)
arch/linux.i386-64/bin/ksh: /usr/bin/true: warning: [DEBUG]: Setting process group
$ 

After applying the above patch, I’m no longer able to reproduce the race condition with fork or vfork when SHOPT_SPAWN is enabled (reproducer). edit: This only fixes the race condition on FreeBSD when spawnveg uses fork. vfork can’t be used on FreeBSD, although it works fine on Linux (see https://github.com/ksh93/ksh/issues/79#issuecomment-661426491).

Since the thread is now quite long, here are direct links to the OpenSUSE patches: Disable posix_spawn (fallback to vfork): https://build.opensuse.org/package/view_file/shells/ksh/ksh-qemu.patch?expand=1

Disable vfork (combine with above patch to force spawnveg to use fork): https://build.opensuse.org/package/view_file/shells/ksh/ksh93-disable-vfork.dif?expand=1

I was reading through this issue’s content again. In it, you referenced a Tue, 14 Jun 2011 13:33:15 -0700 email from David Korn that stated:

ksh93 uses posix_spawn() rather than fork/exec to run a simple command for performance reasons. However, posix_spawnv() doesn’t have a way to set the terminal group before the exec(). The parent process sets the terminal group which leads to the race.

There is a simple change you can make to a program to guarentee that the terminal will get set at the beginning of the program. You can add two lines. One line is #include <termios.h>

and for the other line
tcsetpgrp(2,getpgrp()); Alternatively, you can write a program that does these two lines and then execs the original program.

Is it possible to create a shim program that performs these two lines of code then executes the original program? If so, does that resolve the issue, or would that be too messy and be slower than the fork solution? The shim program might even be able to be a builtin if possible. Perhaps an environment flag passed along to the subshell that tells ksh to fixup the terminal.

vfork() is a pre-VM, or, at least pre-COW, semi-kludge and I thought it would have been retired by now… IMHO…

Chris (chris.bertin@gmail.com or chris.bertin@protonmail.com)

On Sat, Jul 18, 2020 at 9:00 AM Martijn Dekker notifications@github.com wrote:

@hyenias https://github.com/hyenias, yes – disabling SHOPT_SPAWN (by adding -DSHOPT_SPAWN=0 to the compiler option) will still revert to the classic fork/exec mechanism, just as it does now. The patch causes the AST spawnveg() function to use vfork, but disabling SHOPT_SPAWN causes spawnveg() to not be used at all.

That’s what Apple has been doing all along for their /bin/ksh, which is how I know this. I found I had to remove -DSHOPT_SPAWN=0 as a default compiler option from cc.darwin in order for vfork to be used.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ksh93/ksh/issues/79#issuecomment-660502713, or unsubscribe https://github.com/notifications/unsubscribe-auth/AQKGJUO23JAMHYWJYJGFRIDR4HBJDANCNFSM4O65E6KA .

@JohnoKing: Thank you for your review of the vfork caveats. Would it be possible to introduce the vfork implementation so one could selectively compile using vfork, posix_spawn, or fork [something like SHOPT_SPAWN]? I think having the ability to alter the way ksh creates a child process may assist in confirming its use and/or troubleshooting in the future. It also might be helpful to have some feature flag identifying it.

The lib_poll_fd_2 feature test doesn’t need to be removed to fix the posix_spawn related bugs. That change was likely made to fix something specific to QEMU.

Some notes regarding ksh usage of vfork:

If posix_spawn is removed it shouldn’t be done by removing the SHOPT_SPAWN ifdef; OpenSUSE’s patch should be instead. Results from performance testing on my Linux machine (ksh is compiled with CCFLAGS=-O2):

# Normal build with posix_spawn
$ ./ksh-posix-spawn -c 'time for ((i=1; i!=10000; i++)); do /usr/bin/true; done'

real    0m02.03s
user    0m01.73s
sys 0m00.36s

# posix_spawn removed with OpenSUSE's patch
$ ./ksh-no-posix-spawn -c 'time for ((i=1; i!=10000; i++)); do /usr/bin/true; done'

real    0m01.85s
user    0m01.60s
sys 0m00.28s

# SHOPT_SPAWN ifdef removed with unifdef(1)
$ ./ksh-no-shopt-spawn -c 'time for ((i=1; i!=10000; i++)); do /usr/bin/true; done'

real    0m03.15s
user    0m02.37s
sys 0m00.89s

Relevant strace output from ksh -c '/usr/bin/true; /usr/bin/true':

#  Normal build with posix_spawn
getpid()                                = 4296
mmap(NULL, 36864, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7fd12c55f000
rt_sigprocmask(SIG_BLOCK, ~[], [], 8)   = 0 
clone(child_stack=0x7fd12c567ff0, flags=CLONE_VM|CLONE_VFORK|SIGCHLD) = 4297
munmap(0x7fd12c55f000, 36864)           = 0 
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 rt_sigaction(SIGINT, {sa_handler=0x55f8f72bbfd0, sa_mask=[], sa_flags=SA_RESTORER|SA_INTERRUPT, sa_restorer=0x7fd12c39d3e0}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0

# posix_spawn removed with OpenSUSE's patch
getpid()                                = 4115
rt_sigprocmask(SIG_BLOCK, [HUP INT QUIT PIPE CHLD], [], 8) = 0
vfork()                                 = 4116
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigaction(SIGINT, {sa_handler=0x56194d18b9f0, sa_mask=[], sa_flags=SA_RESTORER|SA_INTERRUPT, sa_restorer=0x7fb7625d73e0}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0

# SHOPT_SPAWN ifdef removed with unifdef(1)
stat(".", {st_mode=S_IFDIR|0755, st_size=130, ...}) = 0
rt_sigaction(SIGCHLD, {sa_handler=0x55e765fe75a0, sa_mask=[], sa_flags=SA_RESTORER|SA_INTERRUPT, sa_restorer=0x7f9afb8523e0}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f9afb814e50) = 12221
rt_sigaction(SIGINT, {sa_handler=0x55e765fd9a40, sa_mask=[], sa_flags=SA_RESTORER|SA_INTERRUPT, sa_restorer=0x7f9afb8523e0}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0

The OpenSUSE patch causes vfork to be used instead of posix_spawn which appears to increase performance slightly while still fixing the exec format error. Removing SHOPT_SPAWN causes the OS implementation of fork to be used instead of vfork and posix_spawn, which is slower (on Linux fork uses clone, but that doesn’t make up for the performance deficit).

Pointing out, tho, that if you follow those 2020 bugs all the way through, they lead you back to the performance degradation that was an large part of the original ksh2020 kerfuffle.

On macOS, the compiler flags used by the Apple version (which I’ve incorporated into src/cmd/INIT/cc.darwin) disable SHOPT_SPAWN. I can confirm that, if I change that flag to -DSHOPT_SPAWN=1 and recompile from scratch, I can reproduce this bug on the Mac.

I can also reproduce this bug on FreeBSD, on which SHOPT_SPAWN is not disabled by default.

Neither of these two systems use GNU libc, so it seems likely that this is a bug in ksh and not in any C library. Does anyone here have an idea how to fix it?

Alternatively, I wonder if there would be any disadvantage to simply removing the use of posix_spawn altogether.

According to the reporter, they get a working ksh93 if they disable posix_spawn configure option so it may be a, very widespread, bug in GNU libc (Linux/kFreeBSD/Hurd-only, I presume).

I can confirm that disabling posix_spawn causes this bug to go away. OpenSUSE disables it in their build of ksh with a patch titled ksh-qemu.patch, which implies there is another bug related to posix_spawn that affects QEMU.