scikit-learn: is random_state in LogisticRegression useless?

In LogisticRegression:

  • the constructor accepts a random_state parameter that is never used
  • the 'liblinear' solver has an optional random_state parameter, but it is not used either.

I tried wiring the random_state parameter through to the solver correctly and tested it with multiple configurations, but I never obtained different results for different random_state values.

Note that the previous test (below) was doubly useless: the parameter was not wired to the solver, and it does not seem to change the result anyway.

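from numpy.testing import assert_array_almost_equal
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression


def test_liblinear_random_state():
    # Both models are fit on the same data with the same random_state.
    X, y = make_classification(n_samples=20)
    lr1 = LogisticRegression(random_state=0)
    lr1.fit(X, y)
    lr2 = LogisticRegression(random_state=0)
    lr2.fit(X, y)
    assert_array_almost_equal(lr1.coef_, lr2.coef_)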

Please also note that the new 'sag' solver (#4738) will require a 'random_state' parameter.

About this issue

  • State: closed
  • Created 9 years ago
  • Comments: 25 (24 by maintainers)

Most upvoted comments

OK, I got it: in liblinear, the primal optimization uses the TRON solver, which does not use any shuffling (or any randomness at all). However, the optimization for the dual problem does use shuffling.

The random_state is therefore useful only if dual=True. In that case, I get different results for different random states, and the same results for the same random state.
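
A minimal sketch of how this could be checked, assuming a recent scikit-learn release in which liblinear is no longer the default solver and so has to be requested with solver="liblinear". The helper fit_coef and the data seed 42 are illustrative choices, not part of the original test. The script prints the differences without asserting a particular size, since how far two seeds diverge in the dual case depends on the data and on convergence.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression


def fit_coef(X, y, dual, seed):
    """Fit a liblinear logistic regression and return its coefficients.

    Hypothetical helper for this comparison only.
    """
    clf = LogisticRegression(solver="liblinear", dual=dual, random_state=seed)
    return clf.fit(X, y).coef_


# Fix the data seed so that only the estimator's random_state varies.
X, y = make_classification(n_samples=20, random_state=42)

# Primal problem (dual=False): two different seeds.
primal_a = fit_coef(X, y, dual=False, seed=0)
primal_b = fit_coef(X, y, dual=False, seed=1)

# Dual problem (dual=True): the same seed twice, then a different seed.
dual_a = fit_coef(X, y, dual=True, seed=0)
dual_a_again = fit_coef(X, y, dual=True, seed=0)
dual_b = fit_coef(X, y, dual=True, seed=1)

print("primal, seeds 0 vs 1, max abs diff:", np.abs(primal_a - primal_b).max())
print("dual, seed 0 twice, max abs diff:", np.abs(dual_a - dual_a_again).max())
print("dual, seeds 0 vs 1, max abs diff:", np.abs(dual_a - dual_b).max())
```

If the explanation above holds, the first two differences should be zero, and only the last one can be non-zero.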