extendr: `no_na()` always returns `FALSE`

The following code

use extendr_api::prelude::*;

fn main() {
    test! {
        let vals1 = Integers::from_values([Rint(1)]);
        dbg!(vals1.no_na());

        let vals2 = Integers::from_values([Rint::na()]);
        dbg!(vals2.no_na());

        let vals3 = Integers::from_values([Rint(1), Rint(2)]);
        dbg!(vals3.no_na());

        let vals4 = Integers::from_values([Rint(1), Rint::na()]);
        dbg!(vals4.no_na());
    }
}

outputs

[src/main.rs:6] vals1.no_na() = FALSE
[src/main.rs:9] vals2.no_na() = FALSE
[src/main.rs:12] vals3.no_na() = FALSE
[src/main.rs:15] vals4.no_na() = FALSE

whereas I would expect the first and third lines to be TRUE. The same happens for Doubles, and I imagine for the other wrappers as well.

I’m using R 4.2.0 and extendr-api at master (4de57caf577371a9105f2a1669111a00e406f68f).

About this issue

  • State: open
  • Created 2 years ago
  • Comments: 15 (13 by maintainers)

Most upvoted comments

OK, now I think I am beginning to understand. If *_NO_NA(...) returns TRUE, we can be sure there are no NAs (as with the ALTREP 1:5). However, for a regular vector (e.g., c(1, 5)), it performs no real check and returns FALSE, even though the vector may still contain no NAs. If that is indeed the case, then I am not a fan of the .no_na() name, as it is misleading. It should be something like no_na_or_unknown() or a similar variation. And perhaps it should return not a bool or Rbool, but an enum with NoNA and Unknown variants. That way, if we test an ALTREP vector guaranteed to have no NAs, we get a NoNA response. Otherwise, we get Unknown. Not exactly a thin wrapper, but at least it won’t confuse people that much.
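The enum suggested above could look roughly like the following plain-Rust sketch. It is an illustration only and not extendr's API: `NaHint`, `na_hint`, and `contains_na` are hypothetical names, the `bool` argument stands in for whatever `*_NO_NA()` reports, and the vector is modelled as a slice of `i32` with NA encoded as `i32::MIN`, which is how R represents `NA_integer_`.

```rust
/// R encodes NA_integer_ as the minimum 32-bit integer.
pub const NA_INTEGER: i32 = i32::MIN;

/// Hypothetical result type: what the cheap hint can actually tell us.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NaHint {
    /// The vector is guaranteed to contain no NA.
    NoNa,
    /// Nothing is known: the vector may or may not contain an NA.
    Unknown,
}

/// Translates the raw flag (an assumed stand-in for `*_NO_NA()`) into the enum.
/// A `false` flag maps to `Unknown`, never to "has NA".
pub fn na_hint(raw_no_na_flag: bool) -> NaHint {
    if raw_no_na_flag {
        NaHint::NoNa
    } else {
        NaHint::Unknown
    }
}

/// Gives a definitive answer: trusts a positive hint, otherwise scans the data.
pub fn contains_na(values: &[i32], hint: NaHint) -> bool {
    match hint {
        NaHint::NoNa => false,
        NaHint::Unknown => values.iter().any(|&v| v == NA_INTEGER),
    }
}
```

With this shape, a caller cannot mistake "unknown" for "contains NA": `na_hint(false)` yields `NaHint::Unknown`, and only `contains_na` answers the question the original report was asking.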

In my understanding, this behavior is not strange. The *_NO_NA() functions exist to let developers take a shortcut when the vector is known to contain no NA; the developer can implement a “no NA” version of a function for efficiency. Just as is_sorted() == FALSE doesn’t necessarily mean the vector is unsorted, no_na() == FALSE doesn’t guarantee there is an NA. This might sound strange, but this is how it is (that said, I haven’t used these functions myself, so I might be wrong).

I have yet to figure out how to expose these “shortcut” methods properly. They simply look too cryptic for ordinary humans…
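To make the “shortcut” idea concrete, here is a minimal sketch in plain Rust, under stated assumptions: `sum_with_shortcut` is a hypothetical function, `known_no_na` stands in for a TRUE result from `*_NO_NA()`, and NA is modelled as `i32::MIN` (R's encoding of `NA_integer_`). It is not how extendr or R implement anything; it only shows why a FALSE flag has to be read as “unknown”.

```rust
/// R encodes NA_integer_ as the minimum 32-bit integer.
pub const NA_INTEGER: i32 = i32::MIN;

/// Sums the values, returning `None` if an NA is encountered.
/// `known_no_na` is an assumed stand-in for a positive `*_NO_NA()` result.
pub fn sum_with_shortcut(values: &[i32], known_no_na: bool) -> Option<i64> {
    if known_no_na {
        // Fast path: the caller guarantees no NA, so skip per-element checks.
        return Some(values.iter().map(|&v| i64::from(v)).sum());
    }
    // General path: the flag being false only means "unknown",
    // so every element has to be checked.
    let mut total = 0i64;
    for &v in values {
        if v == NA_INTEGER {
            return None;
        }
        total += i64::from(v);
    }
    Some(total)
}
```

Both paths agree whenever the data really has no NA; the flag only decides whether the per-element check can be skipped.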

@andy-thomason, I found this out the hard way just now. Very strange behaviour, TBH.

cpp11::cpp_function("int test_fn(SEXP x) {return INTEGER_NO_NA(x);}")
testthat::expect_equal(
    test_fn(1L:2L),
    1
)
testthat::expect_equal(
    test_fn(c(1L, 2L)),
    1
)
#> Error: test_fn(c(1L, 2L)) not equal to 1.
#> 1/1 mismatches
#> [1] 0 - 1 == -1

Created on 2022-05-17 by the reprex package (v2.0.1)