astropy: FittingWithOutlierRemoval doesn't work for model sets
Attempting to use FittingWithOutlierRemoval
in place of a simple LinearLSQFitter
instance produces a ValueError: Input argument u'x' does not have the correct dimensions in model_set_axis=0 for a model set with n_models=4176
. This blocks the very common use case of fitting each row of an image with rejection, as in IRAF’s fit1d
. One could, of course, fit each row with a separate model instance, but that would be much slower – and even with model sets, fitting already seems unpleasantly slow compared with IRAF, so it probably wouldn’t be viable.
Unfortunately, I think fixing this might be non-trivial: Within FittingWithOutlierRemoval
, L557 (also L567-568) selects only good points from an input array of x
co-ordinates, like so:
Obviously the rejected points will differ from one row/model to another, which means that FittingWithOutlierRemoval
cannot simply pass the underlying Fitter
a single array of x
co-ordinates (in N-1 dimensions) to use for every model. Unlike model evaluation, I think fitting of model sets in general only works with a common x
array for all the models, so LinearLSQFitter
and friends may need substantial changes for this to work. Moreover, I believe the matrix equation solved by the underlying np.linalg
routine assumes a 2D Vandermonde matrix, corresponding to a single set of points in x
– I’m not even sure (off the top of my head) that the problem can be expressed in a form that the back-end solver can work with, which would be rather unfortunate.
Would you mind confirming that this is a correct analysis, @nden (assuming you know off hand)? I’ve always had a bit of a hard time discerning from the documentation exactly what dimensionality of inputs the models and fitters accept under different circumstances.
About this issue
- Original URL
- State: closed
- Created 7 years ago
- Comments: 17 (17 by maintainers)
Commits related to this issue
- Re-write FittingWithOutlierRemoval to work for model sets (where the fitter supports masked input) & add a couple of tests [#6819]. — committed to jehturner/astropy by jehturner 6 years ago
- Re-write FittingWithOutlierRemoval to work for model sets (where the fitter supports masked input) & add a few tests [#6819]. — committed to jehturner/astropy by jehturner 6 years ago
- Re-write FittingWithOutlierRemoval to work for model sets (where the fitter supports masked input) & add a few tests [#6819]. — committed to jehturner/astropy by jehturner 6 years ago
- Re-write FittingWithOutlierRemoval to work for model sets (where the fitter supports masked input) & add a few tests [#6819]. — committed to SaOgaz/astropy by jehturner 6 years ago
- Re-write FittingWithOutlierRemoval to work for model sets (where the fitter supports masked input) & add a few tests [#6819]. — committed to astromancer/astropy by jehturner 6 years ago
Not in the linear fitter. It solves a matrix equation where one of the terms is a Vandermonde matrix corresponding to a common set of points for all the models. If someone can point out an option I’m overlooking, great, but I don’t think so… (note that
np.linalg
complains if you try to give it a higher-dimensional matrix).I had just started having a go at this when my disk failed 🙁.
Nearly half of the execution time is spent in
np.lapack_lite.dgelsd
, which looks quite a thin wrapper for the lapack routine, so there’s not much prospect for removing the order-of-magnitude overhead when looping over models without writing non-trivial Fortran extensions. I’m not sure whynumpy.linalg.lstsq
has to calldgelsd
twice per invocation though (it doesn’t look accidental)… Update: ah, I see that’s just howDGELSD
is documented to work; the first invocation doesn’t actually do the calculation.