mne-python: Calling get_data() on preloaded Epochs returns wrong units
Description of the problem
Hello,
When creating an Epochs object with the argument preloaded set to True, the get_data method does not work as expected when passing a unit that is not volts. It will work well the first time, but subsequent calls will each time return data that is multiplied by the unit factor. Please note that this only happens when the preload argument is passed as True due to the if statement of line 1670 of epochs.py.
Steps to reproduce
# Create fake EEG Epochs: All channels are a constant value of 1uV.
eeg = np.ones((8, 2500)) * 1e-6
info = mne.create_info(ch_names=[str(i) for i in range(1,9)], sfreq=250, ch_types="eeg")
raw = mne.io.RawArray(eeg, info)
epochs = mne.make_fixed_length_epochs(raw, duration=0.5, overlap=0, preload=True)
# Get the data for the first time: Works well
data = epochs.get_data(units="uV")
print(data[0,0,0]) # This will correctly print the first sample of the first channel of the first epoch in uV (1 uV)
# Get the data for the second time: Unexpected result
data = epochs.get_data(units="uV")
print(data[0,0,0]) # This will return the value from before, but multiplied by 1e6, which is unexpected (100000 uV)
Link to data
No response
Expected results
Subsequent calls to epochs.get_data(units=“uV”) should always return the same value, since the underlying data is not supposed to be modified. In the previous code, this value should be 1uV.
Actual results
After the first call to epochs.get_data(units=“uV”) the underlying data is multiplied by 10^6 (if the unit is uV) each time that get_data is called, therefore giving different results each time.
Additional information
Platform macOS-10.16-x86_64-i386-64bit Python 3.8.13 (default, Mar 28 2022, 06:16:26) [Clang 12.0.0 ] Executable /Users/pmainar/miniconda3/envs/xdf/bin/python CPU i386 (8 cores) Memory 16.0 GB
Core ├☑ mne 1.5.0 ├☑ numpy 1.22.4 (OpenBLAS 0.3.18 with 4 threads) ├☑ scipy 1.9.0 ├☑ matplotlib 3.7.0 (backend=module://ipympl.backend_nbagg) ├☑ pooch 1.6.0 └☑ jinja2 3.1.2
Numerical (optional) ├☑ sklearn 1.0.2 ├☑ numba 0.57.1 ├☑ pandas 1.4.2 └☐ unavailable nibabel, nilearn, dipy, openmeeg, cupy
Visualization (optional) ├☑ qtpy 2.2.0 (None=None) ├☑ ipympl 0.9.3 └☐ unavailable pyvista, pyvistaqt, ipyvtklink, vtk, pyqtgraph, mne-qt-browser
Ecosystem (optional) └☐ unavailable mne-bids, mne-nirs, mne-features, mne-connectivity, mne-icalabel, mne-bids-pipeline
About this issue
- Original URL
- State: closed
- Created 8 months ago
- Comments: 19 (18 by maintainers)
I think correctness/safety outweighs efficiency here. Returning a view sometimes and a copy others is dangerous and can lead to silent and hard to diagnose (or even see the symptoms of in the first place) bugs. A use could very easily .get_data, do an in place operation on the array, and now their Epochs are silently broken.
If we really want to return a view sometimes I think we’d have to add a copy=True (default) that people can set to false to get non-copied data when it’s possible to get it.
But in the latter case, I think we should really warn (“you requested a view, but this is not possible with your combination of arguments, so you’re getting a copy instead”).
Argh actually it looks like historically (for at least 10 years!) we returned a view / non-copy, e.g.:
https://github.com/mne-tools/mne-python/blob/265b9bfdd43ba571fca1e7271b0481a2ae2eb903/mne/epochs.py#L444
So I guess we should live with it. Looking at https://mne.tools/stable/generated/mne.Epochs.html#mne.Epochs.get_data I don’t see any warning about this so @pablomainar would you be up for adding the minimal fix (in the case of unit scaling, add a copy) and also adding a
.. warning::toepochs.get_datathat says sometimes the result will be a view, which means that modifying it in-place will modify the data inEpochsitself to be incorrect?