cmdstanpy: Bad interaction between num_chains and fixed_param sampler

Summary:

Somewhere late in PR https://github.com/stan-dev/cmdstanpy/pull/492 we introduced a bug. Possibly around the change to the logic for what happens when the user explicitly tries to run chains in parallel?

Description:

Given a model compiled with stan_threads==True, the following call to sample:

datagen_fit = datagen_model.sample(iter_sampling=100000, chains=7, parallel_chains=2)

makes the sampler try to run multi-threaded. The resulting CmdStan call is:

DEBUG:cmdstanpy:running CmdStan, num_threads: 2
DEBUG:cmdstanpy:CmdStan args: ['/Users/mitzi/github/stan-dev/cmdstanpy/test/data/datagen_poisson_glm', 'id=1', 'random', 'seed=27365', 'output', 'file=/var/folders/db/4jnggnf549s42z50bd61jskm0000gq/T/tmpn5muntyf/datagen_poisson_glm-20211112112133.csv', 'method=sample', 'num_samples=100000', 'algorithm=hmc', 'adapt', 'engaged=1', 'num_chains=7']

which fails because there are now only 2 output CSV files, not the expected 7. Investigate the logic and fix.

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 21 (21 by maintainers)

Most upvoted comments

It’s done the same way the help command works, which also isn’t a classic argument: https://github.com/stan-dev/cmdstan/blob/da4be4530793a23e021fc94b7976513b5cde2816/src/cmdstan/arguments/argument_parser.hpp#L91 That is, it’s not the cleanest code I have ever written 😃

I see you are talking about obtaining the number of parameters. You can do that via stanc3 and the --info flag. It returns a JSON file. That doesn’t help you when the CmdStanModel object was made from the exe_file, but I would say that is the less common case.
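A rough sketch of that stanc3 route, for illustration only: the helper below counts the entries under a "parameters" key in the JSON that --info emits. The key name and the sample payload are assumptions about the output layout, not something taken from this thread, so check them against the stanc3 version you have installed. count_parameters is a hypothetical helper, not part of cmdstanpy.

```python
import json


def count_parameters(info_json: str) -> int:
    """Count parameters in stanc3 --info output (layout is an assumption)."""
    info = json.loads(info_json)
    # Assumption: parameters appear as a JSON object keyed by parameter name.
    return len(info.get("parameters", {}))


# Hand-written sample payload for illustration; not real stanc3 output.
sample_info = json.dumps({
    "inputs": {"N": {"type": "int", "dimensions": 0}},
    "parameters": {
        "alpha": {"type": "real", "dimensions": 0},
        "beta": {"type": "real", "dimensions": 1},
    },
})

n_params = count_parameters(sample_info)
```

A count of zero would be the case of interest here, since that is where the fixed_param sampler comes into play.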

The fix we should go with for now, IMO, is to add a check at these lines:

https://github.com/stan-dev/cmdstanpy/blob/3cbfae4c89c91c9c1c41910c20d41be41ac217dd/cmdstanpy/model.py#L1467-L1473

If one_process is true and num_chains > 1, throw an error there telling the user that fixed_param doesn’t support it and that they should set force_one_process_per_chain to True.

Edit: Actually, we probably need to do this in sample, not in _run_cmdstan, but the idea is the same.
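A minimal sketch of what such a check could look like, written as a standalone function so it runs on its own. check_chain_args and its fixed_param flag are hypothetical names used for illustration; the real check would live inside cmdstanpy's sample method and use whatever variables are in scope there.

```python
def check_chain_args(one_process: bool, num_chains: int, fixed_param: bool) -> None:
    """Hypothetical guard: reject multi-chain, single-process fixed_param runs."""
    if fixed_param and one_process and num_chains > 1:
        raise ValueError(
            "fixed_param does not support running multiple chains in one "
            "process; set force_one_process_per_chain=True."
        )


# Passes silently: one chain per process is fine with fixed_param.
check_chain_args(one_process=False, num_chains=7, fixed_param=True)
```

Raising early, before any CmdStan process is launched, gives the user an actionable message instead of a failure later on about missing output CSV files.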

Ah, this is because of that incredibly annoying fixed_param behavior