scikit-learn: is random_state in LogisticRegression useless?
In `LogisticRegression`:
- the constructor adds a parameter `random_state` that is never used
- the solver `'liblinear'` has an optional `random_state` parameter, but it is not used.
I tried wiring the `random_state` parameter correctly to the solver and tested it with multiple configurations, but I never obtained different results for different `random_state` values.
Note that the previous test was doubly useless: the parameter was not wired, and it does not seem to change the result anyway.
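    from numpy.testing import assert_array_almost_equal
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    def test_liblinear_random_state():
        # Two fits with the same seed: passes even if random_state is ignored.
        X, y = make_classification(n_samples=20)
        lr1 = LogisticRegression(random_state=0)
        lr1.fit(X, y)
        lr2 = LogisticRegression(random_state=0)
        lr2.fit(X, y)
        assert_array_almost_equal(lr1.coef_, lr2.coef_)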
Please also note that the new solver `'sag'` (#4738) will require a `random_state` parameter.
About this issue
- State: closed
- Created 9 years ago
- Comments: 25 (24 by maintainers)
OK, I got it: in liblinear, the primal optimization uses the TRON solver, which does not use any shuffling (or any randomness at all). However, the optimization for the dual problem does use shuffling.
So `random_state` is useful only if `dual=True`. In this case, I get different results for different random states, and the same results for the same random state.
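A quick way to check this behaviour is sketched below. It is only an illustration, not the test that was added to scikit-learn: the helper `fit_coef` and the dataset size are assumptions made for the example, and `solver="liblinear"` is passed explicitly because the default solver may differ between versions. The gap between two dual fits with different seeds is printed rather than asserted, since its size depends on the data and the stopping tolerance.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression


def fit_coef(X, y, dual, seed):
    """Hypothetical helper: fit a liblinear model and return its coefficients."""
    clf = LogisticRegression(solver="liblinear", dual=dual, random_state=seed)
    clf.fit(X, y)
    return clf.coef_.copy()


# Fix the data seed so that only the estimator's random_state varies.
X, y = make_classification(n_samples=50, random_state=0)

# Primal problem (dual=False): no shuffling, so the seed should not matter.
primal_gap = np.abs(fit_coef(X, y, False, 0) - fit_coef(X, y, False, 1)).max()

# Dual problem (dual=True): same seed should reproduce the same coefficients.
dual_same_gap = np.abs(fit_coef(X, y, True, 0) - fit_coef(X, y, True, 0)).max()

# Dual problem with different seeds: coefficients may differ slightly.
dual_diff_gap = np.abs(fit_coef(X, y, True, 0) - fit_coef(X, y, True, 1)).max()

print("primal, seeds 0 vs 1:", primal_gap)
print("dual,   seeds 0 vs 0:", dual_same_gap)
print("dual,   seeds 0 vs 1:", dual_diff_gap)
```

A more useful regression test than the one quoted earlier would compare fits with `dual=True`, because that is the only configuration discussed here in which the seed can influence the result.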