photutils: Memory errors with refactored code

As noted in another thread, I’m consistently getting out-of-memory errors when running the new PSFPhotometry fitter.

My fitting runs have died at the following stages as ID’d by the progressbar:

Fit source/group:   6%|▋         | 11347/177215 [05:24<1:34:59, 29.10it/s]
Fit source/group:   5%|▍         | 11405/228909 [05:57<30:55:50,  1.95it/s]
Fit source/group:   4%|▍         | 11486/262107 [07:02<26:22:39,  2.64it/s]
Fit source/group:  11%|█         | 11379/102664 [06:45<2:06:09, 12.06it/s]
Fit source/group:   2%|▏         | 11396/591444 [06:51<8:59:34, 17.92it/s]

These endpoints are pretty consistent: every run dies after roughly 11,300 to 11,500 fits, regardless of the total number of sources.

I suspect the problem is that fit_info is being stored in memory. IIRC, fit_info includes at least one, and maybe several, copies of the data. Can we minimize fit_info before storing it? I think only param_cov is used downstream.
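To make the suggestion concrete, here is a minimal sketch of trimming a fit_info-style dictionary before storing it. The dictionary contents, array sizes, and the helper name `slim_fit_info` are assumptions for illustration only; this is not how photutils stores its results.

```python
import numpy as np

# Illustrative stand-in for a fitter's fit_info dict.
# The keys and array sizes here are assumptions, not real fitter output.
fit_info = {
    "param_cov": np.eye(3),          # small: one row/column per fitted parameter
    "fvec": np.zeros(10_000),        # residuals, scales with the cutout size
    "fjac": np.zeros((3, 10_000)),   # Jacobian, scales with the cutout size
    "message": "converged",
}


def slim_fit_info(info, keep=("param_cov",)):
    """Return a new dict holding only the entries needed downstream."""
    return {key: info[key] for key in keep if key in info}


slim = slim_fit_info(fit_info)
print(sorted(slim))
```

Only the keys listed in `keep` survive, so the large per-source arrays can be garbage collected once the original dictionary goes out of scope.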

Note that I have 256GB of memory allocated for these runs, which imo is a very large amount to dedicate to photometry of a single JWST field-of-view.

About this issue

  • State: open
  • Created a year ago
  • Comments: 24 (24 by maintainers)

Most upvoted comments

Thanks. Past 15k already, so it looks like an improvement.

Thanks. Your PSF model is ~20 MB, so 12_000 of them is ~233 GB (just for the PSF models, not the data, results, etc.). That seems to be the culprit. The code returns a copy of each fit model, but it copies the entire model. For the GriddedPSFModel that is unnecessary because the PSF grid is identical for each model. I can fix this.
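The arithmetic and the copying behaviour described above can be sketched in a few lines. `ToyGriddedModel` is a hypothetical stand-in, not the photutils GriddedPSFModel, and the shallow-copy approach shown is only one possible way to share the grid; it is not necessarily the fix that was implemented.

```python
import copy

import numpy as np


def estimated_gib(n_models, mb_per_model):
    """Memory needed if every fitted model carries its own copy of the grid."""
    return n_models * mb_per_model / 1024


class ToyGriddedModel:
    """Hypothetical stand-in for a gridded PSF model (not the photutils class)."""

    def __init__(self, grid, x_0=0.0, y_0=0.0, flux=1.0):
        self.grid = grid  # large array, identical for every source
        self.x_0 = x_0    # small per-source parameters
        self.y_0 = y_0
        self.flux = flux

    def with_params(self, x_0, y_0, flux):
        """Copy the per-source parameters but keep a reference to the same grid."""
        new = copy.copy(self)  # shallow copy: the grid array is shared, not duplicated
        new.x_0, new.y_0, new.flux = x_0, y_0, flux
        return new


base = ToyGriddedModel(np.zeros((16, 101, 101)))
full = copy.deepcopy(base)               # duplicates the whole grid
light = base.with_params(10.0, 20.0, 500.0)  # shares the grid with base

print(estimated_gib(12_000, 20))
print(np.shares_memory(full.grid, base.grid), np.shares_memory(light.grid, base.grid))
```

With one deep copy per source, memory grows linearly with the number of fits, which is consistent with the runs dying at nearly the same source count. Sharing the grid keeps the per-source cost down to a few scalars.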

Yes, I’m using a webbpsf model. It can be reproduced with:

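                    import webbpsf

                    # Observation date used to look up the wavefront (OPD) measurement
                    obsdate = '2022-08-28'
                    nrc = webbpsf.NIRCam()
                    nrc.load_wss_opd_by_date(f'{obsdate}T00:00:00')
                    nrc.filter = 'F405N'
                    nrc.detector = 'NRCA5'
                    # Build a 16-PSF grid for the selected detector only
                    grid = nrc.psf_grid(num_psfs=16, all_detectors=False, verbose=True, save=True)
                    # The grid is used directly as the PSF model for the fit
                    psf_model = grid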

I think… I haven’t tested this; in production, the obsdate and some other variables come from FITS headers.

EDIT: tested, this works now.