Ax: suitability for small and medium-sized datasets, with GPU computing. Is 15,000 samples too many?

I am using ~15,000 samples across 8 dimensions (15000 x 8). Incidentally, I am also using Ray Tune per the Ax tutorial. New samples are somewhat expensive, around 20 min to 20 hrs per simulation depending on the chosen parameters. Hoping to do somewhere between 100 and 1000 iterations of adaptive design, with max_parallel somewhere between 8 and 12, based on the number of CPUs available.

Does this seem feasible with consumer hardware (e.g. RTX 2060-Ti) and less than a week of runtime?
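A quick back-of-envelope check of the simulation budget alone (ignoring GP fitting and acquisition overhead, which is really what the question is about) can be sketched like this; the average simulation time and the midpoints of the iteration/parallelism ranges are assumptions picked from the numbers above:

```python
# Hypothetical feasibility estimate; real per-simulation time varies
# from 20 min to 20 hrs depending on the chosen parameters.
avg_sim_hours = 2.0   # assumed average simulation time
n_iterations = 500    # middle of the 100-1000 range
max_parallel = 10     # middle of the 8-12 range

# With max_parallel simulations running at once, wall-clock time is
# roughly one "batch" of avg_sim_hours per max_parallel iterations.
batches = -(-n_iterations // max_parallel)  # ceiling division
wall_clock_hours = batches * avg_sim_hours
wall_clock_days = wall_clock_hours / 24

print(f"~{wall_clock_hours:.0f} h (~{wall_clock_days:.1f} days) of simulation time")
```

Under these assumptions the simulation budget fits in a week; whether model fitting at 15k points also fits is the open question.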

Will probably do some forecasting of how the runtime scales with the number of training points for this problem, and use that to decide whether to switch to a genetic algorithm (or downselect the training points to a more manageable size). Figured it was worth asking whether anyone has experience running with over 10,000 initialized points.
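The forecasting idea above can be sketched as a power-law fit of fit time versus training-set size on log-log axes (exact GP inference is roughly cubic in n, so an exponent near 3 is expected). The (n, seconds) pairs below are made-up placeholders; substitute measured fit times from small subsets of the real data:

```python
# Hypothetical runtime-scaling forecast: fit t ~ c * n**p in log space.
import numpy as np

n_points = np.array([1000, 2000, 4000, 8000])
fit_seconds = np.array([1.2, 7.8, 55.0, 410.0])  # placeholder timings

# Linear fit in log-log space: log t = p * log n + log c
p, log_c = np.polyfit(np.log(n_points), np.log(fit_seconds), 1)

def forecast_seconds(n):
    """Extrapolate the fitted power law to n training points."""
    return np.exp(log_c) * n ** p

print(f"estimated exponent p = {p:.2f}")
print(f"forecast at n=15000: {forecast_seconds(15000) / 3600:.2f} h per fit")
```

If the forecast at 15k points blows past the runtime budget, that is the signal to downselect points or switch methods.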

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 15 (15 by maintainers)

Most upvoted comments

@Balandat thanks for this! This looks pretty doable with the tutorial in https://botorch.org/tutorials/custom_botorch_model_in_ax (swapping out the kernel, as described in the link you mentioned). Perhaps there’s a simpler way of swapping out the kernel as well. Will also give this a try!

Can’t give high enough praise for how responsive and helpful everyone has been over the last few months.

@saitcakmak tried out your suggestion, and that certainly seems to reduce the memory consumption. I’m still planning to give KeOps a try.

Another alternative here would be to try using KeOps. I’m currently off the grid-ish and can’t put together a demo, but this could be a starting point: https://github.com/cornellius-gp/gpytorch/blob/master/examples/02_Scalable_Exact_GPs/KeOps_GP_Regression.ipynb

@saitcakmak fantastic, thank you for looking into this! I will give it a try (and it’s promising to hear that you were able to run 12k points - our group has a machine with better GPUs that I’d run this on after the testing stage).