mne-python: How to deal with anonymization if one of the dates slips out of range?

Working with the somato dataset, I ran into an issue: the measurement date I intend to anonymize differs from the date in info['file_id']['secs']. As a result, when I apply the daysback I need for the measurement date, info['file_id']['secs'] goes out of range, even though the shifted measurement date itself would still be within the valid range.

MWE:

import mne
import os.path as op
from datetime import datetime

data_path = mne.datasets.somato.data_path()
raw_fname = op.join(data_path, 'sub-01', 'meg', 'sub-01_task-somato_meg.fif')

raw = mne.io.read_raw_fif(raw_fname, verbose='error')
print(f"Date in info['meas_date']: {raw.info['meas_date']}")
print(f"Date in info['file_id']:   {datetime.fromtimestamp(raw.info['file_id']['secs'])}")

# Works
print('\nAnonymizing with daysback -> 10 years')
daysback = 10 * 365
raw_anon = raw.copy().anonymize(daysback=daysback)
print(f"Date in info['meas_date']: {raw_anon.info['meas_date']}")
print(f"Date in info['file_id']:   {datetime.fromtimestamp(raw_anon.info['file_id']['secs'])}")

# Does not work
print('\nAnonymizing with daysback -> 90 years')
daysback = 90 * 365
raw_anon = raw.copy().anonymize(daysback=daysback)
print(f"Date in info['meas_date']: {raw_anon.info['meas_date']}")
print(f"Date in info['file_id']:   {datetime.fromtimestamp(raw_anon.info['file_id']['secs'])}")

Output:

Date in info['meas_date']: 2007-07-05 11:17:11.172243+00:00
Date in info['file_id']:   1970-01-01 01:00:00

Anonymizing with daysback -> 10 years
Date in info['meas_date']: 1997-07-07 11:17:11.172243+00:00
Date in info['file_id']:   1960-01-04 01:35:47

Anonymizing with daysback -> 90 years
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Untitled-2 in 
     21 print('\nAnonymizing with daysback -> 90 years')
     22 daysback = 90 * 365
---> 23 raw_anon = raw.copy().anonymize(daysback=daysback)
     24 print(f"Date in info['meas_date']: {raw_anon.info['meas_date']}")
     25 print(f"Date in info['file_id']:   {datetime.fromtimestamp(raw_anon.info['file_id']['secs'])}")

 in anonymize(self, daysback, keep_his, verbose)

~/Development/mne-python/mne/channels/channels.py in anonymize(self, daysback, keep_his, verbose)
    593         .. versionadded:: 0.13.0
    594         """
--> 595         anonymize_info(self.info, daysback=daysback, keep_his=keep_his,
    596                        verbose=verbose)
    597         self.set_meas_date(self.info['meas_date'])  # unify annot update

 in anonymize_info(info, daysback, keep_his, verbose)

~/Development/mne-python/mne/io/meas_info.py in anonymize_info(info, daysback, keep_his, verbose)
   2245                 'daysback parameter was too large.'
   2246                 'Underlying Error:\n')
-> 2247     _check_dates(info, prepend_error=err_mesg)
   2248 
   2249     return info

~/Development/mne-python/mne/io/meas_info.py in _check_dates(info, prepend_error)
   1454                 if (value[key_2] < np.iinfo('>i4').min or
   1455                         value[key_2] > np.iinfo('>i4').max):
-> 1456                     raise RuntimeError('%sinfo[%s][%s] must be between '
   1457                                        '"%r" and "%r", got "%r"'
   1458                                        % (prepend_error, key, key_2,

RuntimeError: anonymize_info generated an inconsistent info object. daysback parameter was too large.Underlying Error:
info[file_id][secs] must be between "-2147483648" and "2147483647", got "-2838237853"

How shall I go about anonymizing this dataset? I want to use daysback large enough to move the measurement date before 1925, as is required by BIDS.
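The conflict can be made concrete with a little arithmetic. The following is a minimal sketch, not mne-python code: it assumes file_id['secs'] is 0 (as the 1970 date printed above suggests) and takes the int32 lower bound from the error message. It compares the largest daysback the file_id timestamp tolerates with the smallest daysback that moves the measurement date before 1925.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 86400
INT32_MIN = -2**31  # lower bound reported in the error message above

# Values taken from the output above; a file_id timestamp of 0 is an
# assumption based on the printed 1970 date.
meas_date = datetime(2007, 7, 5, 11, 17, 11, 172243, tzinfo=timezone.utc)
file_id_secs = 0

# Largest daysback that keeps file_id['secs'] representable as int32.
max_daysback = (file_id_secs - INT32_MIN) // SECONDS_PER_DAY

# Smallest daysback that moves meas_date before 1925-01-01 (UTC).
cutoff = datetime(1925, 1, 1, tzinfo=timezone.utc)
min_daysback = (meas_date - cutoff).days + 1

print(f"max daysback allowed by file_id: {max_daysback}")
print(f"min daysback needed for BIDS:    {min_daysback}")
print(f"both satisfiable: {min_daysback <= max_daysback}")
```

Under these assumptions no single daysback satisfies both constraints, which is why the two dates have to be made consistent before anonymizing.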

cc @agramfort

About this issue

  • State: open
  • Created 4 years ago
  • Comments: 32 (32 by maintainers)

Most upvoted comments

I think @bloyl worked on this part, if I’m not mistaken. I can take a look, but I have only used this functionality as an end user with mne-bids.

Here is a fix:

import mne
import os.path as op
from datetime import datetime, timezone

data_path = mne.datasets.somato.data_path()
raw_fname = op.join(data_path, 'sub-01', 'meg', 'sub-01_task-somato_meg.fif')

raw = mne.io.read_raw_fif(raw_fname, verbose='error')

print(f"Date in info['meas_date']: {raw.info['meas_date']}")
print(f"Date in info['file_id']:   {datetime.utcfromtimestamp(raw.info['file_id']['secs'])}")

print('\nAfter calling set_meas_date():')
meas_date = datetime.utcfromtimestamp(
    raw.info['file_id']['secs'] + 1e-6 * raw.info['file_id']['usecs'])
meas_date = meas_date.replace(tzinfo=timezone.utc)
raw.set_meas_date(meas_date)
print(f"Date in info['meas_date']: {raw.info['meas_date']}")
print(f"Date in info['file_id']:   {datetime.utcfromtimestamp(raw.info['file_id']['secs'])}")

raw.anonymize(daysback=20000)
print(f"Date in info['meas_date']: {raw.info['meas_date']}")
print(f"Date in info['file_id']:   {datetime.utcfromtimestamp(raw.info['file_id']['secs'])}")

I don’t know what happened to this file. What bothers me about the current code is that, when you anonymize, the default for file_id is 1970 and for meas_date it’s 2000, so you quickly end up with data where file_id and meas_date are not consistent. To me, this should not happen. This deserves a first fix, and then I think we should make sure that when set_meas_date is called, file_id is adjusted as well.

thoughts? @alexrockhill @larsoner ?

True, I was only thinking of between subjects. I think the argument should support a date in addition to days.

I was thinking the same, but then – again! – it wouldn’t be easy to preserve relative date differences!

Considering this, having a default date as we have now is not very useful in this regard either. However, changing it to a fixed number of days could lead to an easy de-anonymization if users forget about manually passing daysback.

Hence I conclude: we actually cannot have any reasonable default.
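The trade-off between shifting by days and setting a fixed date can be illustrated with plain datetime arithmetic. This is only a sketch with made-up session dates, not mne-python behavior: a common day offset keeps the gap between two recordings, while a fixed target date erases it.

```python
from datetime import datetime, timedelta, timezone

# Two hypothetical sessions of one subject, 14 days apart.
session_1 = datetime(2007, 7, 5, 11, 17, tzinfo=timezone.utc)
session_2 = session_1 + timedelta(days=14)

# Option A: shift both sessions back by the same number of days.
daysback = timedelta(days=20000)
gap_shifted = (session_2 - daysback) - (session_1 - daysback)

# Option B: replace both dates with one fixed target date.
target = datetime(1924, 1, 1, tzinfo=timezone.utc)
gap_fixed = target - target

print(f"gap after shifting by days: {gap_shifted}")
print(f"gap after fixed date:       {gap_fixed}")
```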

But really this is beyond the scope of this particular issue 😃

Here I will only focus on this:

@agramfort

OK, you can try to warn if the dates are inconsistent and use only meas_date as the true value.

I will make a PR shortly.
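A rough sketch of what such a check could look like is shown here. The helper check_date_consistency and its tolerance parameter are hypothetical and not part of mne-python; the sketch assumes a dict-like info whose 'meas_date' is a timezone-aware datetime and whose 'file_id' holds 'secs' and 'usecs', as in the examples above.

```python
import warnings
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def check_date_consistency(info, tolerance=timedelta(days=1)):
    """Warn if file_id and meas_date disagree; return meas_date.

    Hypothetical helper for illustration, not an mne-python function.
    """
    file_id = info['file_id']
    # Build the file_id date from the epoch so negative values also work.
    file_date = EPOCH + timedelta(seconds=file_id['secs'],
                                  microseconds=file_id['usecs'])
    if abs(info['meas_date'] - file_date) > tolerance:
        warnings.warn(
            f"file_id date ({file_date}) and meas_date "
            f"({info['meas_date']}) are inconsistent; "
            "treating meas_date as the reference.")
    return info['meas_date']


# Plain dict standing in for an Info object, with the dates seen above.
info = {
    'meas_date': datetime(2007, 7, 5, 11, 17, 11, tzinfo=timezone.utc),
    'file_id': {'secs': 0, 'usecs': 0},
}
reference_date = check_date_consistency(info)
print(reference_date)
```

The one-day tolerance is an arbitrary choice for this sketch; a real implementation would have to decide how much drift between the two fields is acceptable.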

mne-bids will complain. mne-python is not tied to BIDS. You will write BIDS files with mne-bids, so I don’t see the urge to change anything here. Now we need to make sure mne-bids warns properly.