openssl: OSSL_DECODER_CTX_set_selection doesn't apply the selection value properly

I considered whether to ask this on the openssl-users@openssl.org mailing list or to open an issue here on GitHub. My question is about the OpenSSL API as used in the OpenSSL Ruby bindings, where I am trying to fix a bug. It seemed closer to developing OpenSSL than to using OpenSSL, so I opened this issue. If you think it belongs on the mailing list, please let me know and I will happily post it there.


I am debugging the OpenSSL Ruby bindings to fix a bug. Please let me know what’s wrong in the code. Perhaps the APIs are wrongly called?

You can reproduce this bug by cloning the branch on my fork, https://github.com/junaruga/openssl/tree/wip/fips-read-report, which adds some debugging commits on top of the master branch. The reproducing steps are a bit complicated, so please let me know if there are commands you want me to run to get additional information.

Reproducing steps

Environment

My local environment is Fedora 37. I was also able to reproduce this issue on Ubuntu (ubuntu-latest) on GitHub Actions. The issue happens both with OpenSSL built from source without any patches and with the OpenSSL RPM package on RHEL 9.1.

In the reproducing steps below, the OpenSSL version is 3.0.8, compiled from source without any patches. LD_LIBRARY_PATH is used to load this OpenSSL.

$ cat /etc/fedora-release 
Fedora release 37 (Thirty Seven)

$ rpm -q gcc
gcc-12.2.1-4.fc37.x86_64

$ gcc --version
gcc (GCC) 12.2.1 20221121 (Red Hat 12.2.1-4)
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ LD_LIBRARY_PATH=/home/jaruga/.local/openssl-3.0.8-fips-debug/lib/ \
  ~/.local/openssl-3.0.8-fips-debug/bin/openssl version
OpenSSL 3.0.8 7 Feb 2023 (Library: OpenSSL 3.0.8 7 Feb 2023)

1. Install OpenSSL with FIPS mode option.

I compiled OpenSSL with FIPS mode enabled and with debug flags (-O0 -g3 -ggdb3 -gdwarf-5) because I wanted to debug it. The issue also happens with OpenSSL compiled without the debug flags.

$ ./Configure --prefix=$HOME/.local/openssl-3.0.8-fips-debug --libdir=lib shared linux-x86_64 enable-fips -O0 -g3 -ggdb3 -gdwarf-5
$ make -j4
$ make install

And here is the OpenSSL config file used in the later process.

$ cat ~/.local/openssl-3.0.8-fips-debug/ssl/openssl_fips.cnf
config_diagnostics = 1
openssl_conf = openssl_init

.include /home/jaruga/.local/openssl-3.0.8-fips-debug/ssl/fipsmodule.cnf
#.include ./fipsmodule.cnf

[openssl_init]
providers = provider_sect
alg_section = algorithm_sect

[provider_sect]
fips = fips_sect
base = base_sect

[base_sect]
activate = 1

[algorithm_sect]
default_properties = fips=yes

Then I used this program to check if the fips mode is available.

$ LD_LIBRARY_PATH=/home/jaruga/.local/openssl-3.0.8-fips-debug/lib/ \
  OPENSSL_CONF=/home/jaruga/.local/openssl-3.0.8-fips-debug/ssl/openssl_fips.cnf \
  ~/git/openssl-test/fips_mode
FIPS mode provider available: 1
FIPS mode enabled: 1

2. Install Ruby and Compile OpenSSL Ruby bindings.

Below are the steps to install Ruby and to compile the OpenSSL Ruby bindings with the latest stable Ruby 3.2. I am compiling with the -O0 -g3 -ggdb3 -gdwarf-5 flags. You can skip this section. Note that Ruby 3.2.1 was the latest stable release at that time; Ruby 3.2.2 is the latest stable release now.

Install Ruby

To compile from the source on GitHub:

$ pwd
/home/jaruga/git/ruby

$ git clone https://github.com/ruby/ruby.git

$ cd ruby

$ pwd
/home/jaruga/git/ruby/ruby

$ git checkout v3_2_1

Run autoconf to generate the configure script.

$ ./autogen.sh

Alternatively, download the Ruby 3.2.1 source archive from the official Ruby website.

Then I installed Ruby with the commands below. The --enable-mkmf-verbose option makes the bundle exec rake compile command in the later process print the C compiler (gcc) command lines.

$ ./configure \
  --prefix=/usr/local/ruby-3.2.1 \
  --enable-shared \
  --enable-mkmf-verbose
$ make
$ make install

If you use the Ruby RPM package on Fedora Linux instead, bundle exec rake compile fails with an error in the later process. Here is a workaround.

$ sudo dnf install ruby ruby-devel

Then you can set the PATH for the installed Ruby.

.bashrc

...
PATH="/usr/local/ruby-3.2.1/bin:${PATH}"
PATH="${HOME}/.gem/ruby/3.2.0/bin:${PATH}"
...
export PATH

Compile OpenSSL Ruby bindings.

If you want to compile with the branch on my forked repository to reproduce this issue ticket:

$ git clone -b wip/fips-read-report https://github.com/junaruga/openssl.git

or if you want to compile with the original repository:

$ git clone https://github.com/ruby/openssl.git

Then I installed the dependency RubyGems packages.

$ cd openssl

$ pwd
/home/jaruga/git/ruby/openssl

$ which ruby
/usr/local/ruby-3.2.1/bin/ruby

$ which bundle
/usr/local/ruby-3.2.1/bin/bundle

$ ruby -v
ruby 3.2.1 (2023-02-08 revision 31819e82c8) [x86_64-linux]

$ bundle install --standalone

I compiled the OpenSSL Ruby bindings.

$ bundle exec rake compile

If you want to clean up and compile again with bundle exec rake compile, run the command below.

$ rm -rf tmp/ lib/openssl.so

3. Run the command raising the error.

I created a testing pem file.

$ openssl genrsa -out key.pem 4096

Then I read the pem file with the OpenSSL Ruby bindings. In the command output you can see the error message “Could not parse PKey (OpenSSL::PKey::PKeyError)”. It is raised by the OpenSSL Ruby bindings because OSSL_DECODER_from_bio(dctx, bio) returns 0. See below.

In the rest of the output, the [DEBUG] ... lines come from my printf debugging, and the ... Input type: ... lines come from ERR_print_errors_fp(stdout). The ossl_pkey_read_generic function is called 2 times, and each call to it calls OSSL_DECODER_from_bio 3 times.

$ LD_LIBRARY_PATH=/home/jaruga/.local/openssl-3.0.8-fips-debug/lib/ \
  OPENSSL_CONF=/home/jaruga/.local/openssl-3.0.8-fips-debug/ssl/openssl_fips.cnf \
  ruby -I lib -e "require 'openssl'; OpenSSL::PKey.read(File.read('key.pem'))"
[DEBUG] Calling ossl_pkey_read_generic from ossl_dh_initialize.
[DEBUG] Calling OSSL_DECODER_from_bio 1.
003C0D92E17F0000:error:1E08010C:DECODER routines:OSSL_DECODER_from_bio:unsupported:crypto/encode_decode/decoder_lib.c:101:No supported data to decode. Input type: DER
[DEBUG] Calling OSSL_DECODER_from_bio 2.
003C0D92E17F0000:error:1E08010C:DECODER routines:OSSL_DECODER_from_bio:unsupported:crypto/encode_decode/decoder_lib.c:101:No supported data to decode. Input type: PEM
[DEBUG] Calling OSSL_DECODER_from_bio 3.
[DEBUG] Calling ossl_pkey_read_generic from ossl_pkey_new_from_data.
[DEBUG] Calling OSSL_DECODER_from_bio 1.
003C0D92E17F0000:error:1E08010C:DECODER routines:OSSL_DECODER_from_bio:unsupported:crypto/encode_decode/decoder_lib.c:101:No supported data to decode. Input type: DER
[DEBUG] Calling OSSL_DECODER_from_bio 2.
[DEBUG] Calling OSSL_DECODER_from_bio 3.
-e:1:in `read': Could not parse PKey (OpenSSL::PKey::PKeyError)
	from -e:1:in `<main>'

$ echo $?
1

The error comes from OSSL_DECODER_from_bio returning 0.

https://github.com/junaruga/openssl/blob/41bc792df2cf54660264bd6fc6368044f2877e99/ext/openssl/ossl_pkey.c#L149

ext/openssl/ossl_pkey.c#L149

 145     OSSL_BIO_reset(bio);
 146     OSSL_DECODER_CTX_set_selection(dctx, 0);
 147     while (1) {
 148         printf("[DEBUG] Calling OSSL_DECODER_from_bio 3.\n");
 149         if (OSSL_DECODER_from_bio(dctx, bio) == 1) /* <= This OSSL_DECODER_from_bio returns 0! */
 150             goto out;
 151         ERR_print_errors_fp(stdout);
 152         if (BIO_eof(bio))
 153             break;
 154         pos2 = BIO_tell(bio);
 155         if (pos2 < 0 || pos2 <= pos)
 156             break;
 157         ossl_clear_error();
 158         pos = pos2;
 159     }

Debugging

ltrace

First, I captured an ltrace log with the command below, because ltrace is a good way to see how the OpenSSL APIs are called in the process. In the log you can see that OSSL_DECODER_from_bio is called 6 times in total, and it is the 6th call that fails and causes the error. Note that, unfortunately, the log contains the ltrace output in the first part and the ruby output in the second part.

$ LD_LIBRARY_PATH=/home/jaruga/.local/openssl-3.0.8-fips-debug/lib/ \
  OPENSSL_CONF=/home/jaruga/.local/openssl-3.0.8-fips-debug/ssl/openssl_fips.cnf \
  ltrace -ttt -f -l openssl.so -l libssl.so.3 -l libcrypto.so.3 \
  ruby -I lib -e "require 'openssl'; OpenSSL::PKey.read(File.read('key.pem'))" >& ltrace_ttt.log

GDB

Debug around the OSSL_DECODER_from_bio

I debugged with gdb using the command below. I set LD_LIBRARY_PATH at the gdb prompt because the gdb command itself depends on the system OpenSSL. gdb fails to start if LD_LIBRARY_PATH makes it load the manually installed OpenSSL instead of the system one.

$ OPENSSL_CONF=/home/jaruga/.local/openssl-3.0.8-fips-debug/ssl/openssl_fips.cnf \
  gdb --args ruby -I lib -e "require 'openssl'; OpenSSL::PKey.read(File.read('key.pem'))"
(gdb) set environment LD_LIBRARY_PATH /home/jaruga/.local/openssl-3.0.8-fips-debug/lib/

After some steps, the session below is right after the 6th OSSL_DECODER_from_bio call returned 0, indicating an error. The values of the input arguments *dctx and *bio are shown below.

(gdb) b ossl_pkey_read_generic
(gdb) r
(gdb) c
(gdb) n
(gdb) f
#0  ossl_pkey_read_generic (bio=0x7be610, pass=4) at ../../../../ext/openssl/ossl_pkey.c:151
151	        ERR_print_errors_fp(stdout);
(gdb) p *dctx
$4 = {start_input_type = 0x7fffe57d5489 "PEM", input_structure = 0x0, selection = 0, decoder_insts = 0x7bf030, construct = 0x7fffe51e5640 <decoder_construct_pkey>, 
  cleanup = 0x7fffe51e5984 <decoder_clean_pkey_construct_arg>, construct_data = 0x69ed30, pwdata = {type = is_pem_password, _ = {expl_passphrase = {
        passphrase_copy = 0x7fffe5792cdd <ossl_pem_passwd_cb> "UH\211\345SH\203\354HH\211}ȉuĉU\300H\211M\270H\213E\270H\211E\350H\213E\350H\211\307\350q\361\377\377\204\300\017\204", <incomplete sequence \356>, passphrase_len = 4}, pem_password = {password_cb = 0x7fffe5792cdd <ossl_pem_passwd_cb>, password_cbarg = 0x4}, ossl_passphrase = {passphrase_cb = 0x7fffe5792cdd <ossl_pem_passwd_cb>, passphrase_cbarg = 0x4}, 
      ui_method = {ui_method = 0x7fffe5792cdd <ossl_pem_passwd_cb>, ui_method_data = 0x4}}, flag_cache_passphrase = 1, cached_passphrase = 0x0, cached_passphrase_len = 0}}
(gdb) p *bio
$5 = {libctx = 0x0, method = 0x7fffe54c90a0 <mem_method>, callback = 0x0, callback_ex = 0x0, cb_arg = 0x0, init = 1, shutdown = 1, flags = 512, retry_reason = 0, num = 0, ptr = 0x7c1c30, next_bio = 0x0, 
  prev_bio = 0x0, references = 1, num_read = 1504, num_write = 0, ex_data = {ctx = 0x0, sk = 0x0}, lock = 0x6c7730}

And here is the backtrace.

(gdb) bt
#0  ossl_pkey_read_generic (bio=0x7be610, pass=4) at ../../../../ext/openssl/ossl_pkey.c:151
#1  0x00007fffe57ad75c in ossl_pkey_new_from_data (argc=1, argv=0x7ffff7443048, 
    self=140737035361920) at ../../../../ext/openssl/ossl_pkey.c:222
#2  0x00007ffff7b309f7 in vm_call_cfunc_with_frame (ec=0x40a0c0, reg_cfp=0x7ffff7542f90, 
    calling=<optimized out>) at /home/jaruga/src/ruby-3.2.1/vm_insnhelper.c:3268
#3  0x00007ffff7b35d44 in vm_sendish (method_explorer=<optimized out>, 
    block_handler=<optimized out>, cd=<optimized out>, reg_cfp=<optimized out>, 
    ec=<optimized out>) at /home/jaruga/src/ruby-3.2.1/vm_callinfo.h:367
#4  vm_exec_core (ec=0x0, initial=initial@entry=0) at /home/jaruga/src/ruby-3.2.1/insns.def:820
#5  0x00007ffff7b3bdf9 in rb_vm_exec (ec=0x40a0c0, jit_enable_p=jit_enable_p@entry=true)
    at vm.c:2383
#6  0x00007ffff7b3cde8 in rb_iseq_eval_main (iseq=<optimized out>) at vm.c:2633
#7  0x00007ffff7951755 in rb_ec_exec_node (ec=ec@entry=0x40a0c0, n=n@entry=0x7ffff7e7bab8)
    at eval.c:289
#8  0x00007ffff7957c7b in ruby_run_node (n=0x7ffff7e7bab8) at eval.c:330
#9  0x0000000000401102 in rb_main (argv=0x7fffffffda48, argc=5) at ./main.c:38
#10 main (argc=<optimized out>, argv=<optimized out>) at ./main.c:57

Debug deeply in the OSSL_DECODER_from_bio

For reference, I stepped into OSSL_DECODER_from_bio. Running GDB from the start again, here is the part of OSSL_DECODER_from_bio that causes the error: ok is 0, so decoder_process returns 0.

(gdb) f
#0  decoder_process (params=0x7fffffffd100, arg=0x7fffffffd2b0)
    at crypto/encode_decode/decoder_lib.c:747
747	            ok = (rv > 0);
(gdb) p rv
$6 = 0

Here are input arguments of the function decoder_process and local variables at the same point crypto/encode_decode/decoder_lib.c:747.

(gdb) p *params
$8 = {key = 0x7fffe53ffb0b "data-structure", data_type = 4, data = 0x7fffe53ffb5e, 
  data_size = 14, return_size = 18446744073709551615}
(gdb) p *data
$9 = {ctx = 0x7be7c0, bio = 0x0, current_decoder_inst_index = 36, recursion = 1, 
  flag_next_level_called = 1, flag_construct_called = 1, flag_input_structure_checked = 0}
(gdb) i lo
rv = 0
p = 0x7fffe53ffb1f
trace_data_structure = 0x7fffffffd178 ""
data = 0x7fffffffd2b0
ctx = 0x7be7c0
decoder_inst = 0x7c0460
decoder = 0x7b9b20
cbio = 0x0
bio = 0x0
loc = 140737039563551
i = 1
ok = 0
new_data = {ctx = 0x7be7c0, bio = 0x0, current_decoder_inst_index = 0, recursion = 2, 
  flag_next_level_called = 0, flag_construct_called = 0, flag_input_structure_checked = 0}
data_type = 0x0
data_structure = 0x0
__func__ = "decoder_process"

Here is the backtrace.

(gdb) bt
#0  decoder_process (params=0x7fffffffd100, arg=0x7fffffffd2b0)
    at crypto/encode_decode/decoder_lib.c:747
#1  0x00007fffe5363268 in pem2der_decode (vctx=0x7c0440, cin=0x7c0b30, selection=0, 
    data_cb=0x7fffe51e36e9 <decoder_process>, data_cbarg=0x7fffffffd2b0, 
    pw_cb=0x7fffe525bc84 <ossl_pw_passphrase_callback_dec>, pw_cbarg=0x7be7f8)
    at providers/implementations/encode_decode/decode_pem2der.c:204
#2  0x00007fffe51e3d6e in decoder_process (params=0x0, arg=0x7fffffffd3e0)
    at crypto/encode_decode/decoder_lib.c:962
#3  0x00007fffe51e248a in OSSL_DECODER_from_bio (ctx=0x7be7c0, in=0x7bdea0)
    at crypto/encode_decode/decoder_lib.c:81
#4  0x00007fffe57ad64a in ossl_pkey_read_generic (bio=0x7bdea0, pass=4)
    at ../../../../ext/openssl/ossl_pkey.c:149
#5  0x00007fffe57ad75c in ossl_pkey_new_from_data (argc=1, argv=0x7ffff7443048, 
    self=140737035361920) at ../../../../ext/openssl/ossl_pkey.c:222
#6  0x00007ffff7b309f7 in vm_call_cfunc_with_frame (ec=0x40a0c0, reg_cfp=0x7ffff7542f90, 
    calling=<optimized out>) at /home/jaruga/src/ruby-3.2.1/vm_insnhelper.c:3268
#7  0x00007ffff7b35d44 in vm_sendish (method_explorer=<optimized out>, 
    block_handler=<optimized out>, cd=<optimized out>, reg_cfp=<optimized out>, 
    ec=<optimized out>) at /home/jaruga/src/ruby-3.2.1/vm_callinfo.h:367
#8  vm_exec_core (ec=0x7fffe53d6ef4, initial=140737039563551, initial@entry=0)
    at /home/jaruga/src/ruby-3.2.1/insns.def:820
#9  0x00007ffff7b3bdf9 in rb_vm_exec (ec=0x40a0c0, jit_enable_p=jit_enable_p@entry=true)
    at vm.c:2383
#10 0x00007ffff7b3cde8 in rb_iseq_eval_main (iseq=<optimized out>) at vm.c:2633
#11 0x00007ffff7951755 in rb_ec_exec_node (ec=ec@entry=0x40a0c0, n=n@entry=0x7ffff7e7bab8)
    at eval.c:289
#12 0x00007ffff7957c7b in ruby_run_node (n=0x7ffff7e7bab8) at eval.c:330
#13 0x0000000000401102 in rb_main (argv=0x7fffffffda48, argc=5) at ./main.c:38
#14 main (argc=<optimized out>, argv=<optimized out>) at ./main.c:57

Please let me know if you want to see additional information. I am happy to help for that! Thank you for reading this, and thank you for your help.

About this issue

  • State: open
  • Created a year ago
  • Comments: 46 (18 by maintainers)

Most upvoted comments

Just to avoid having to deal with Ruby, I made a test program that essentially does what your extension does, but limits itself to the problem domain. I can confirm seeing the same problem in my runs.

https://gist.github.com/levitte/7a27cebdb9537ff0a59641c9a5bed53d

What is the status of this issue?

As for the 0 for selection - that’s slightly underdocumented but citing the OSSL_DECODER_CTX_new_for_pkey() manpage:

The search of decoder implementations can also be limited with I<keytype>
and I<selection>, which specifies the expected resulting keytype and contents.
NULL and zero are valid and signify that the decoder implementations will
find out the keytype and key contents on their own from the input they get.

Yeah, but this also demands the cooperation of surrounding code… and that cooperation is lacking for the moment. That’s an implementation detail, however, and is therefore not quite suitable for that particular manual.

I do have some possible ideas (yup, an old branch I have lying around)… it needs quite a bit of testing, though.

Yeah, your workaround is correct.

As for the 0 for selection - that’s slightly underdocumented but citing the OSSL_DECODER_CTX_new_for_pkey() manpage:

The search of decoder implementations can also be limited with I<keytype>
and I<selection>, which specifies the expected resulting keytype and contents.
NULL and zero are valid and signify that the decoder implementations will
find out the keytype and key contents on their own from the input they get.
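
To make the "zero means no restriction" convention from the quoted manpage text concrete, here is a minimal sketch in C. It is only an illustration: the SEL_* constants and both function names are hypothetical and are not OpenSSL macros or APIs. It shows one way a decoder could resolve a requested selection against what it is able to produce.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative selection bits (hypothetical names, not OpenSSL's macros). */
enum {
    SEL_PRIVATE_KEY = 0x01,
    SEL_PUBLIC_KEY  = 0x02,
    SEL_PARAMETERS  = 0x04,
    SEL_KEYPAIR     = SEL_PRIVATE_KEY | SEL_PUBLIC_KEY | SEL_PARAMETERS
};

/*
 * Resolve the caller's selection against what one decoder can produce.
 * A requested value of 0 means "no restriction", so it expands to
 * everything the decoder supports. Otherwise only the overlap remains.
 */
int effective_selection(int requested, int supported)
{
    assert(supported != 0); /* a decoder must support at least one part */
    if (requested == 0)
        return supported;
    return requested & supported;
}

/* A decoder is worth trying only if the resolved selection is non-empty. */
bool decoder_applies(int requested, int supported)
{
    return effective_selection(requested, supported) != 0;
}
```

Under this reading, a request of 0 never rules a decoder out, while a non-zero request rules out any decoder whose supported parts do not overlap with it. Every layer that sees the selection has to apply the same rule for the convention to hold end to end.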

Thank you!! I am still reading your comment to understand it. And I will apply your patch above to my OpenSSL Ruby binding code!

I figured it out…

This is caused by the combination of a decoder in one provider and the keymgmt in another. This causes an export/import dance ('cause you must assume that they have different internal representations of keys, even if they are the same key type), seen here:

https://github.com/openssl/openssl/blob/40f4884990a1717755df366e2aa06d01a1affd63/crypto/encode_decode/decoder_pkey.c#L151-L166

That’s not the problem per se. However, there are indeed two bugs:

  1. In the snippet above, data->selection is used, which is fine in itself. It’s set in ossl_decoder_ctx_setup_for_pkey(), precisely here:

    https://github.com/openssl/openssl/blob/40f4884990a1717755df366e2aa06d01a1affd63/crypto/encode_decode/decoder_pkey.c#L413

    However, later calls to OSSL_DECODER_CTX_set_selection() never update this, so the initial selection from the OSSL_DECODER_CTX_new_for_pkey() call remains throughout the lifetime of that OSSL_DECODER_CTX. This means that the setting of the selection EVP_PKEY_KEYPAIR later on in ossl_pkey_read_generic() has no effect.

    This is the reason that Calling OSSL_DECODER_from_bio 2 failed.

  2. The selection 0 isn’t treated right. It should mean “gimme whatever you’ve got”, but our keymgmt implementations don’t cooperate with that mindset in the export/import scenario… I’m not quite sure how this should be resolved.

I think it shouldn’t be too hard to fix the first bug. The second… not so sure.
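
The first bug can be pictured with a small toy model in C. This is a sketch under stated assumptions, not OpenSSL's code or its eventual fix: the struct and function names are hypothetical stand-ins. The point is only that a value copied once at setup time goes stale when a later setter updates the original field but not the copy.

```c
/* Hypothetical, simplified stand-ins for the real structures. */
struct construct_data {
    int selection;              /* copy taken once, at setup time */
};

struct decoder_ctx {
    int selection;              /* field the public setter updates */
    struct construct_data cd;   /* what the constructor actually reads */
};

/* Models context creation: the selection is copied into construct data. */
void ctx_setup(struct decoder_ctx *ctx, int selection)
{
    ctx->selection = selection;
    ctx->cd.selection = selection;
}

/* Models the buggy setter: only the context field changes. */
void ctx_set_selection_buggy(struct decoder_ctx *ctx, int selection)
{
    ctx->selection = selection;
}

/* One possible fix (an assumption): keep the cached copy in sync. */
void ctx_set_selection_synced(struct decoder_ctx *ctx, int selection)
{
    ctx->selection = selection;
    ctx->cd.selection = selection;
}

/* The constructor consults only the cached copy. */
int constructor_selection(const struct decoder_ctx *ctx)
{
    return ctx->cd.selection;
}
```

In this model, calling ctx_setup() with 0 and then ctx_set_selection_buggy() with a key pair selection leaves constructor_selection() returning 0, which mirrors a later selection change having no effect on an existing context.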

With an OpenSSL built with enable-trace, I added these lines to my program:

    BIO *trace_bio = BIO_new_fp(stderr, BIO_NOCLOSE | BIO_FP_TEXT);
    OSSL_trace_set_channel(OSSL_TRACE_CATEGORY_DECODER, trace_bio);

That’s a lot of output, but one line that I think tells a bit of the story is this (where {n} is really 0 or 1):

(ctx 0x...) >> Running constructor => {n}

When running with the FIPS module, the last such line has {n} being 0, while with the default module, it’s 1. That gives me an indication where to look:

https://github.com/openssl/openssl/blob/40f4884990a1717755df366e2aa06d01a1affd63/crypto/encode_decode/decoder_pkey.c#L68-L70

If this is a universal problem then I’d anyway check via strace if openssl config and providers are loaded.