netcdf-c: nc_get_vars incredibly slow in Windows compared to Linux
OS: Windows 10
NetCDF version: 4.9.1
I am trying to read a 3D double variable (2000 x 512 x 512) from a netCDF4 file with the following parameters:
start[] = {0, 0, 0}
count[] = {1000, 256, 256}
stride[] = {2, 2, 2}
chunk size: {20, 10, 10}
shuffle: no
deflate: yes
deflate_level: 6
I time the call to nc_get_vars. On Debian 11, it takes ~25 seconds. On Windows 10, it takes ~130 seconds.
I would expect Windows to be slightly slower, but a slowdown of more than 5x is unexpected. I see similar slowdowns with ‘nc_get_vars_double’.
In contrast, using ‘nc_get_var_double’ or ‘nc_get_var’ to read the whole variable is significantly faster (~3 seconds on Linux and ~1 second on Windows).
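A rough sanity check on these parameters, sketched in C++: deflate is applied per chunk, so any chunk containing at least one selected element has to be decompressed in full. The helper names `chunksTouched` and `chunksInDim` below are hypothetical and not part of netCDF. The sketch only counts chunks along one dimension and assumes nothing about how nc_get_vars is implemented internally.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Number of distinct chunks along ONE dimension that contain at least one
// index selected by (start, count, stride). Hypothetical helper, not netCDF.
std::size_t chunksTouched(std::size_t start, std::size_t count,
                          std::size_t stride, std::size_t chunkLen)
{
    assert(stride >= 1 && chunkLen >= 1);
    std::set<std::size_t> chunks;
    for (std::size_t i = 0; i < count; ++i) {
        // Index of the i-th selected element, mapped to its chunk number.
        chunks.insert((start + i * stride) / chunkLen);
    }
    return chunks.size();
}

// Total number of chunks along one dimension (the last one may be partial).
std::size_t chunksInDim(std::size_t dimLen, std::size_t chunkLen)
{
    assert(chunkLen >= 1);
    return (dimLen + chunkLen - 1) / chunkLen;
}
```

With the values from this report, chunksTouched(0, 1000, 2, 20) and chunksInDim(2000, 20) both give 100, and chunksTouched(0, 256, 2, 10) and chunksInDim(512, 10) both give 52. The stride-2 read therefore touches all 100 x 52 x 52 = 270,400 chunks, the same set of chunks that a full read of the variable decompresses.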
- Is there a way to optimize the performance of ‘nc_get_vars’ or ‘nc_get_vars_double’ so that Windows performance is closer to Linux performance?
- Is reading the whole variable into memory with ‘nc_get_var’ and slicing it afterwards an option? I have seen some discussion of this (https://github.com/Unidata/netcdf-c/issues/908), and a change was submitted to make strided reads faster. But for my variable, reading the whole variable still seems to be significantly faster than a strided read (especially on Windows).
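For illustration, the read-everything-then-slice approach could look roughly like the sketch below. It assumes the full variable has already been read into a row-major buffer (for example with nc_get_var_double) and that enough memory is available: 2000 x 512 x 512 doubles is about 3.9 GiB. The function name `subsample3d` is hypothetical, not a netCDF API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copy a strided subset out of a row-major 3D array that is already in memory.
// dims   : full extent of the source array (slowest to fastest dimension)
// start  : first index to take along each dimension
// count  : number of elements to take along each dimension
// stride : step between taken elements along each dimension
std::vector<double> subsample3d(const std::vector<double>& full,
                                const std::size_t dims[3],
                                const std::size_t start[3],
                                const std::size_t count[3],
                                const std::size_t stride[3])
{
    assert(full.size() == dims[0] * dims[1] * dims[2]);
    for (int d = 0; d < 3; ++d) {
        assert(stride[d] >= 1);
        // The last selected index must still lie inside the source array.
        assert(count[d] == 0 ||
               start[d] + (count[d] - 1) * stride[d] < dims[d]);
    }

    std::vector<double> out;
    out.reserve(count[0] * count[1] * count[2]);
    for (std::size_t i = 0; i < count[0]; ++i) {
        const std::size_t z = start[0] + i * stride[0];
        for (std::size_t j = 0; j < count[1]; ++j) {
            const std::size_t y = start[1] + j * stride[1];
            // Offset of element (z, y, 0) in the flattened source buffer.
            const std::size_t rowBase = (z * dims[1] + y) * dims[2];
            for (std::size_t k = 0; k < count[2]; ++k) {
                out.push_back(full[rowBase + start[2] + k * stride[2]]);
            }
        }
    }
    return out;
}
```

For the read in this report, dims would be {2000, 512, 512}, start {0, 0, 0}, count {1000, 256, 256} and stride {2, 2, 2}. The trade-off is peak memory: the full buffer and the subset are both resident while the copy runs.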
Please find the link to the nc file here. Here is my code:
#include <stdio.h>
#include <netcdf.h>
#include <cstddef>
#include <cstdlib>
#include <iostream>
#include <chrono>

int
main()
{
    int status;
    int ncid;
    int varid;

    // extent of the strided subset along each dimension
    int elems_x = 256;
    int elems_y = 256;
    int elems_z = 1000;

    // buffer for the strided subset (1000 x 256 x 256 doubles, ~500 MiB)
    double* outData = (double*)malloc(elems_x * elems_y * elems_z * sizeof(double));
    if (outData == NULL) {
        printf("Could not allocate output buffer.\n");
        return 1;
    }

    size_t start[] = {0, 0, 0};
    size_t count[] = {1000, 256, 256};
    ptrdiff_t stride[] = {2, 2, 2};

    // open the NetCDF-4 file
    status = nc_open("repro_nc4file.nc", NC_NOWRITE, &ncid);
    if (status != NC_NOERR) {
        printf("Could not open file: %s\n", nc_strerror(status));
        free(outData);
        return 1;
    }

    // get the varid
    status = nc_inq_varid(ncid, "my_var", &varid);
    printf("status after inq var = %d\n", status);
    if (status != NC_NOERR) {
        // varid is not valid, so do not attempt the read
        nc_close(ncid);
        free(outData);
        return 1;
    }
    printf("varid = %d\n", varid);

    // get the strided subset (this is the call being timed)
    auto timestart = std::chrono::high_resolution_clock::now();
    status = nc_get_vars(ncid, varid, start, count, stride, outData);
    auto timeend = std::chrono::high_resolution_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::seconds>(timeend - timestart);
    std::cout << "Execution time: " << duration.count() << " seconds" << std::endl;
    printf("status after getting strided subset = %d\n", status);

    // close the file
    status = nc_close(ncid);
    printf("status after close = %d\n", status);

    free(outData);
    printf("End of test.\n\n");
    return 0;
}
About this issue
- State: open
- Created a year ago
- Comments: 15 (8 by maintainers)
Thanks @WardF for taking a look.