arrow: [C++] R Session Aborted, R encountered a fatal error after write_dataset command

Describe the bug, including details regarding any error messages, version, and platform.

I am trying to save GPS data from an online biologging database to a local Arrow dataset, using RStudio with R 4.2.3. After I run the code below, I get the message "R Session Aborted. R encountered a fatal error. The session was terminated."

gps %>%
  filter(
    dep_id %in% dd, # Only dep_ids from our list dd
    deployed == 1   # only data collected on the bird
  ) %>%
  collect() %>%
  group_by(site, subsite, species, year, metal_band, dep_id) %>%
  arrow::write_dataset('raw_data/gps', format = "csv")

The problem seems to be with the write_dataset function from the arrow package, but I'm not sure why it happens or what I could try to prevent R from aborting.

Any troubleshooting advice would be greatly appreciated!

Thank you in advance!

Component(s)

R

About this issue

  • State: open
  • Created a year ago
  • Comments: 26 (23 by maintainers)

Most upvoted comments

Thank you @thisisnic! Here is the output of the two commands:

arrow::arrow_info()

Arrow package version: 11.0.0.3

Capabilities:

dataset    TRUE
substrait  FALSE
parquet    TRUE
json       TRUE
s3         TRUE
gcs        TRUE
utf8proc   TRUE
re2        TRUE
snappy     TRUE
gzip       TRUE
brotli     TRUE
zstd       TRUE
lz4        TRUE
lz4_frame  TRUE
lzo        FALSE
bz2        TRUE
jemalloc   FALSE
mimalloc   TRUE

Arrow options():

arrow.use_threads FALSE

Memory:

Allocator  mimalloc
Current    1.69 Mb
Max        1.79 Mb

Runtime:

SIMD Level           avx2
Detected SIMD Level  avx2

Build:

C++ Library Version   11.0.0
C++ Compiler          GNU
C++ Compiler Version  10.3.0
Git ID                58286965ec6974f700ff9fe3f7dcbe56095878d7

sessionInfo()

R version 4.2.3 (2023-03-15 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_Canada.utf8 LC_CTYPE=English_Canada.utf8
[3] LC_MONETARY=English_Canada.utf8 LC_NUMERIC=C
[5] LC_TIME=English_Canada.utf8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] dplyr_1.0.10 config_0.3.1 arrow_11.0.0.3 RPostgres_1.4.5
[5] RPostgreSQL_0.7-5 DBI_1.1.3

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.9 dbplyr_2.2.1 pillar_1.8.1 compiler_4.2.3
 [5] seabiRds_0.1.0 tools_4.2.3 bit_4.0.4 lubridate_1.8.0
 [9] lifecycle_1.0.2 tibble_3.1.8 lattice_0.20-45 pkgconfig_2.0.3
[13] rlang_1.0.5 cli_3.4.0 rstudioapi_0.14 yaml_2.3.5
[17] crawl_2.3.0 mvtnorm_1.1-3 terra_1.6-17 raster_3.6-3
[21] generics_0.1.3 vctrs_0.4.1 hms_1.1.2 bit64_4.0.5
[25] grid_4.2.3 tidyselect_1.1.2 glue_1.6.2 R6_2.5.1
[29] fansi_1.0.3 sp_1.5-0 tzdb_0.3.0 purrr_0.3.4
[33] blob_1.2.3 magrittr_2.0.3 codetools_0.2-19 ellipsis_0.3.2
[37] assertthat_0.2.1 utf8_1.2.2

I will try writing a subset of rows to disk as requested and will get back to you ASAP.
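
For reference, a rough sketch of what that subset test could look like. This is only an illustration, not output from my session: gps_local is a small made-up data frame standing in for the collected gps table, its column values are invented, and the output folder is a temporary directory rather than raw_data/gps. It relies on write_dataset() using the current group_by() columns as partitions, which I believe is its default behaviour.

```r
library(arrow)
library(dplyr)

# Hypothetical stand-in for the collected `gps` data (values are made up)
gps_local <- data.frame(
  site    = c("A", "A", "B", "B"),
  species = c("sp1", "sp1", "sp2", "sp2"),
  dep_id  = c("d1", "d1", "d2", "d3"),
  lon     = c(-63.10, -63.20, -70.50, -70.60),
  lat     = c(46.10, 46.20, 48.30, 48.40)
)

# Write to a temporary folder so the real dataset is not touched
out_dir <- file.path(tempdir(), "gps_subset")
unlink(out_dir, recursive = TRUE)

n_rows <- 3  # size of the subset to try first

gps_local %>%
  head(n_rows) %>%                      # only the first few rows
  group_by(site, species, dep_id) %>%   # group columns become partitions
  write_dataset(out_dir, format = "csv")

# Inspect what was written and read it back
written_files <- list.files(out_dir, recursive = TRUE)
print(written_files)

back <- open_dataset(out_dir, format = "csv") %>% collect()
print(nrow(back))
```

If a small subset like this writes without crashing, the same call can be repeated with a larger n_rows or with fewer grouping columns, to see whether the crash depends on the amount of data or on the number of partitions.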