openssl: openssl hang at crypto/rand/rand_unix.c:494
Hello, after upgrading OpenSSL to 1.1.1d, our sshd process hangs in the function wait_random_seeded, because /dev/random on our old system is always unavailable.
compat_futex(0x2ad32eb0, 0x81 /* FUTEX_??? */, 2147483647) = 0
shmget(0x72, 1, 0) = -1 ENOENT (No such file or directory)
newuname({sys="Linux", node="qd02-s00c08h4", ...}) = 0
open("/dev/random", O_RDONLY) = 3
compat_select(4, [3], NULL, NULL, NULL /// hang here
Could this be enhanced to break out of the wait after a timeout?
Thanks, Mark
/* Open /dev/random and wait for it to be readable */
if ((fd = open(DEVRANDOM_WAIT, O_RDONLY)) != -1) {
    if (DEVRANDM_WAIT_USE_SELECT && fd < FD_SETSIZE) {
        FD_ZERO(&fds);
        FD_SET(fd, &fds);
        /* line 494: the process hangs in this select() call */
        while ((r = select(fd + 1, &fds, NULL, NULL, NULL)) < 0
               && errno == EINTR);
    } else {
        while ((r = read(fd, &c, 1)) < 0 && errno == EINTR);
    }
About this issue
- State: closed
- Created 5 years ago
- Comments: 31 (26 by maintainers)
The point is that /dev/urandom will likely produce the same random-looking numbers each time the system boots. You basically had no security at all with 1.1.1b on a system that has no entropy source. The only thing that changed is that OpenSSL is now aware of the situation.
Adding a timeout here would be bad. Without a source of entropy, there is no security: it’s the same as running telnetd.
To get going again:
- Get /dev/random working.
- Use --with-rand-seed=egd.
- Add --with-rand-seed=none to the configure line and accept that there will be no security.
- Define DEVRANDOM_WAIT to be something that won’t block (e.g. /dev/zero) and accept that there will be no security.
The change log needs to be read from the bottom up, so the newest entry is “DEVRANDOM … improved for older Linux systems”. But since your kernel is 4.19.10, your Linux has the getrandom syscall, and your system will most likely “hang” in this call:
syscall(__NR_getrandom, buf, buflen, 0);
As you said, that can take several minutes, but it will only happen once; after that no further delay is expected.
While I respect your ability to assess the risk of the potential insecurity in your specific situation, I will state for the broader audience that virtio-rng-pci is the generally recommended option for this sort of scenario, and the kernel PRNG is functioning as designed in a VM environment with poor entropy-collection characteristics.
A timeout is probably reasonable, but using SIGALRM to do so is not: the library shouldn’t be messing with the application’s signal handlers more than we already are.
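For anyone who wants to check whether their kernel is in the state described above, here is a minimal sketch of a non-blocking probe. It is not OpenSSL code: the helper name kernel_rng_seeded is made up for illustration, and it assumes a glibc new enough (2.25 or later) to ship the getrandom() wrapper in <sys/random.h>. With GRND_NONBLOCK the call fails with EAGAIN instead of blocking while the kernel CRNG is still uninitialised.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/random.h>   /* getrandom(), GRND_NONBLOCK (glibc >= 2.25) */

/*
 * Hypothetical helper, not part of OpenSSL: report whether the kernel
 * CRNG has been initialised, without blocking.
 * Returns 1 if seeded, 0 if not yet seeded, -1 on any other error.
 */
int kernel_rng_seeded(void)
{
    unsigned char byte;
    ssize_t n;

    /* Retry only if a signal interrupted the call */
    do {
        n = getrandom(&byte, sizeof(byte), GRND_NONBLOCK);
    } while (n < 0 && errno == EINTR);

    if (n == 1)
        return 1;               /* a byte was delivered: CRNG is ready */
    if (n < 0 && errno == EAGAIN)
        return 0;               /* CRNG not initialised yet */
    return -1;                  /* e.g. ENOSYS on kernels without getrandom */
}
```

A return value of 0 shortly after boot would be consistent with the multi-minute delay reported in this thread; it only diagnoses the situation and does nothing to fix it.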
@bernd-edlinger got it spot on. /dev/urandom isn’t going to provide security in this instance.
@beldmit, a diagnostic after a timeout is a good suggestion. Assuming the wait continues afterwards…
@t8m you just invented the perfect perpetuum mobile which solves all our entropy source problems: let’s just send a few kBytes of Lorem ipsum to the syslog daemon before reading from /dev/random.
There could be a syslog message created after some timeout and then openssl could go back to waiting for entropy. Actually producing the syslog message could theoretically even help in producing entropy in the kernel if the syslog message is saved to a hard drive or sent over network somewhere.
Noted. We’ll investigate more and see if this is something that we can reasonably fit into our configuration.
I do, however, think that the release notes should clarify the situation around DEVRANDOM_WAIT as they’re currently very contradictory. It’s explicitly stated that it’s disabled for Linux, yet the functionality remains enabled for Linux.
Edit: And thank you guys for taking the time to comment, I really appreciate it.
@bernd-edlinger From openssl-1.1.1d/NEWS (==> added by me):
From openssl-1.1.1d/CHANGES:
This is a fresh installation of Gentoo (OpenSSL 1.1.1d built from source) on the 4.19.10 kernel (yes, it’s a bit out of date; no, I unfortunately cannot change it). This system is running as a virtual machine and hangs until the crng successfully initializes (which can take upwards of 6 minutes). The older VM we had that used OpenSSL 1.0.2o did not exhibit this issue. One of my coworkers was able to add a virtio-rng-pci device and get it to initialize in a more normal amount of time, but we’d prefer not to have to tweak our VM config as a workaround for this issue. Our current plan is to rebuild OpenSSL 1.1.1d with DEVRANDOM_WAIT set to /dev/urandom, as we’re not overly worried about the potential insecurity in our specific situation.