xgboost: model can't be reproduced when data is big with tree_method = "gpu_hist"?
I ran some tests in R (xgboost 0.81.0.1). When the data size N is big, models trained on the GPU with identical parameters (tree_method = "gpu_hist") differ from run to run; when N is relatively small, the models are identical. When I train repeatedly on the CPU with tree_method = "hist", all the models are the same. I don't know what happens during GPU training. Is it due to floating-point precision?
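For intuition on the precision guess: floating-point addition is not associative, so if gpu_hist accumulates gradient histograms in an order that depends on thread scheduling, fixed seeds alone will not make runs bit-identical. A minimal R illustration of the underlying effect:
(0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3)   # FALSE: grouping changes the rounded sum
(0.1 + 0.2) + 0.3 - (0.1 + (0.2 + 0.3))  # about 1.1e-16
With 600,000 training rows and deep trees, such rounding differences can compound into visible prediction gaps.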
GPU test code, big data:
library(xgboost)
# Simulate an N x p random matrix with a continuous response depending on all p columns
set.seed(111)
N <- 800000
p <- 100
X <- matrix(runif(N * p), ncol = p)
beta <- runif(p)
y <- X %*% beta + rnorm(N, mean = 0, sd = 0.1)
tr <- sample.int(N, N * 0.75)
param <- list(nrounds = 10, num_parallel_tree = 1, nthread = 1L, eta = 0.3, max_depth = 30,
seed = 2018, colsample_bytree = 0.4, subsample = 0.6, min_child_weight = 1000,
tree_method = 'gpu_hist', grow_policy = "lossguide", max_leaves = 1e4, max_bin = 256,
n_gpus = 1, gpu_id = 3, verbose = FALSE)
param$data <- X[tr,]
param$label <- y[tr]
set.seed(2019)
bst_gpu1 <- do.call(xgboost::xgboost, param)
test_pred1 <- predict(bst_gpu1, newdata = X)
set.seed(2019)
bst_gpu2 <- do.call(xgboost::xgboost, param)
test_pred2 <- predict(bst_gpu2, newdata = X)
set.seed(2019)
bst_gpu3 <- do.call(xgboost::xgboost, param)
test_pred3 <- predict(bst_gpu3, newdata = X)
set.seed(2019)
bst_gpu4 <- do.call(xgboost::xgboost, param)
test_pred4 <- predict(bst_gpu4, newdata = X)
set.seed(2019)
bst_gpu5 <- do.call(xgboost::xgboost, param)
test_pred5 <- predict(bst_gpu5, newdata = X)
all_pred <- cbind(test_pred1, test_pred2, test_pred3, test_pred4, test_pred5)
head(all_pred)
# test_pred1 test_pred2 test_pred3 test_pred4 test_pred5
# [1,] 22.43434 22.65794 22.46917 22.60526 22.43433
# [2,] 24.28225 24.42978 24.34619 24.60111 24.28225
# [3,] 23.11788 23.15692 23.07406 23.22111 23.11788
# [4,] 23.74367 23.92602 24.26277 24.11207 23.74367
# [5,] 22.97502 23.24378 23.25752 22.92594 22.97502
# [6,] 23.34638 23.52209 23.47491 23.71274 23.34638
summary(test_pred1 - test_pred2)
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# -1.2855778 -0.1688867 -0.0002308 -0.0002195 0.1685147 1.3110085
summary(test_pred1 - test_pred3)
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# -1.3292294 -0.1703205 0.0000973 -0.0001312 0.1701469 1.3229237
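A compact way to get the worst-case disagreement between any two of the five runs, using the all_pred matrix built above:
# largest absolute pairwise difference between any two prediction vectors
max(dist(t(all_pred), method = "maximum"))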
The differences are large. But if I change N to 80000, or replace tree_method = "gpu_hist" with tree_method = "hist", the results are identical across repeated runs.
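For reference, a minimal sketch of that CPU check (same script as above, with the GPU-only parameters dropped):
param$tree_method <- "hist"  # CPU histogram method
param$n_gpus <- NULL         # drop GPU-only parameters
param$gpu_id <- NULL
set.seed(2019)
bst_cpu1 <- do.call(xgboost::xgboost, param)
set.seed(2019)
bst_cpu2 <- do.call(xgboost::xgboost, param)
# expected TRUE, per the CPU results described above
identical(predict(bst_cpu1, newdata = X), predict(bst_cpu2, newdata = X))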
About this issue
- State: closed
- Created 6 years ago
- Comments: 15 (7 by maintainers)
Using single_precision_histogram = F will give you reproducible results.
@trivialfis This line sets the seed globally: https://github.com/dmlc/xgboost/blob/0a0d4239d32cc90e48fb95be0b122164582c9d96/src/learner.cc#L300
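A sketch of applying that suggestion to the script above, assuming the installed build exposes the single_precision_histogram parameter for gpu_hist (it may require a newer xgboost than 0.81.0.1):
param$tree_method <- "gpu_hist"
# assumption: this build supports double-precision histogram accumulation
param$single_precision_histogram <- FALSE
set.seed(2019)
bst_fp64a <- do.call(xgboost::xgboost, param)
set.seed(2019)
bst_fp64b <- do.call(xgboost::xgboost, param)
# the run-to-run differences should now shrink to zero (or near zero)
summary(predict(bst_fp64a, newdata = X) - predict(bst_fp64b, newdata = X))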