arrow: [R] arrow implementation of lubridate::dmy parses invalid date "00001976" as date
Sorry for so many issues, but I think this is another bug.
The arrow implementation of lubridate::dmy behaves incorrectly.
An invalid date such as ‘00001976’ is being parsed as a valid (and completely unrelated) date.
    # In R
    '00001976' %>% dmy
    [1] NA
    Warning message:
    All formats failed to parse. No formats found.
    # In arrow
    q <- data.table(x = c('00001976', '30111976', '01011976'))
    q %>% write_dataset('q')
    q2 <- 'q' %>% open_dataset %>% mutate(x2 = dmy(x)) %>% collect
    q2
    x
    1: 1975-11-30
    2: 1976-11-30
    3: 1976-01-01
    # Notice '00001976' is an invalid date. The first row of x2 should be NA!
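Until the arrow binding is fixed, one possible workaround is to validate the raw strings in plain R after `collect`, instead of relying on the arrow-side parse. The following is only a minimal sketch under that assumption: `safe_dmy` is a hypothetical helper (it is not part of arrow or lubridate), and it uses base R's `as.Date` with an explicit `%d%m%Y` format, which should return NA for impossible day or month values.

```r
# Hypothetical helper, not part of arrow or lubridate.
# Parses ddmmyyyy strings with base R and yields NA for impossible dates.
safe_dmy <- function(x) {
  # Accept only strings made of exactly eight digits; NA input is rejected too.
  well_formed <- grepl("^[0-9]{8}$", x)

  # Start from an all-NA Date vector and fill in the well-formed entries.
  parsed <- rep(as.Date(NA), length(x))
  parsed[well_formed] <- as.Date(x[well_formed], format = "%d%m%Y")
  parsed
}

# Same inputs as in the reprex above.
x <- c("00001976", "30111976", "01011976")
result <- safe_dmy(x)
print(result)
```

This trades speed for correctness, since the parsing happens in R after the data has been collected rather than inside the arrow query.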
Reporter: Lucas Mation / @lucasmation
Note: This issue was originally created as ARROW-18242. Please see the migration documentation for further details.
About this issue
- State: open
- Created 2 years ago
- Comments: 16 (1 by maintainers)
@paleolimbot As discussed, I tested this out on Windows, setting the locale to “C”. There, I get the same results as shown in the initial reprex and changing the locale doesn’t fix it.