auto-sklearn: Classifier processes eat up all memory and freeze

I run auto-sklearn in multiprocessing mode using the Python Pool. My dataset is fairly small and mostly needs little memory (about 1 GB per process), but some processes grow to as much as 58 GB and then sit idle forever. Once four of them have reached that size, the box seems to be out of memory, so the other processes appear to be blocked as well.
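One possible safety net for a setup like this is to cap each Pool worker's address space, so a runaway fit raises MemoryError instead of exhausting the machine. The sketch below is not the code from this report: `fit_one`, `run_all`, and `LIMIT_MB` are hypothetical names, the 5000 mirrors the value passed via `-M` in the failed runsolver calls (on the assumption that it is a limit in megabytes), and it uses Python 3 syntax and the Unix-only `resource` module, whose `RLIMIT_AS` is enforced on Linux.

```python
import multiprocessing
import resource

# Hypothetical per-worker cap in MB (assumption: same meaning as runsolver's -M 5000).
LIMIT_MB = 5000


def clamp_limit(limit_bytes, hard_limit):
    """Return a soft limit that never exceeds the existing hard limit."""
    if hard_limit == resource.RLIM_INFINITY:
        return limit_bytes
    return min(limit_bytes, hard_limit)


def cap_worker_memory():
    """Pool initializer: cap the address space of the current worker process."""
    _, hard = resource.getrlimit(resource.RLIMIT_AS)
    soft = clamp_limit(LIMIT_MB * 1024 ** 2, hard)
    resource.setrlimit(resource.RLIMIT_AS, (soft, hard))


def fit_one(task_id):
    """Placeholder for one model fit; the real call is not shown in the report."""
    try:
        # ... the actual fit for this task would go here ...
        return task_id, "ok"
    except MemoryError:
        # With the cap in place, an oversized allocation fails here
        # instead of growing until the box runs out of memory.
        return task_id, "out of memory"


def run_all(task_ids, workers=4):
    """Run every task in a Pool whose workers are memory-capped."""
    # maxtasksperchild=1 gives each task a fresh process, so memory held
    # by one fit is released when that worker exits.
    with multiprocessing.Pool(workers, initializer=cap_worker_memory,
                              maxtasksperchild=1) as pool:
        return pool.map(fit_one, task_ids)
```

Calling `run_all(range(8))` under an `if __name__ == "__main__":` guard would run eight placeholder tasks on four capped workers. Whether such a cap helps here depends on where the memory is actually allocated, which the report does not establish.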

14177 ekobylki  20   0 58.9g  58g 1648 S  0.0 24.6  37:56.59 python
14183 ekobylki  20   0 57.0g  56g 1528 S  0.0 23.8  37:24.30 python
14191 ekobylki  20   0 56.9g  55g   12 S  0.0 23.6  38:27.54 python
14190 ekobylki  20   0 56.9g  54g   12 S  0.0 23.2  39:12.09 python
28971 ekobylki  20   0  931m  29m  816 S  0.0  0.0   0:00.09 python
26886 ekobylki  20   0  931m  28m    4 S  0.0  0.0   0:00.03 python
26785 ekobylki  20   0  931m  28m    4 S  0.0  0.0   0:00.03 python
26743 ekobylki  20   0  931m  28m    4 S  0.0  0.0   0:00.00 python
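To put numbers on the listing above, the following sketch sums the resident memory and the memory share of the large processes. It assumes the default top column order (PID, USER, PR, NI, VIRT, RES, SHR, S, %CPU, %MEM, TIME+, COMMAND), since the header row was not pasted; `to_gib` and `summarize` are illustrative helper names. Under that assumption the four large processes hold about 223 GB resident, or 95.2% of memory.

```python
# Rows copied from the listing above; the header row is assumed, not shown.
TOP_ROWS = """\
14177 ekobylki  20   0 58.9g  58g 1648 S  0.0 24.6  37:56.59 python
14183 ekobylki  20   0 57.0g  56g 1528 S  0.0 23.8  37:24.30 python
14191 ekobylki  20   0 56.9g  55g   12 S  0.0 23.6  38:27.54 python
14190 ekobylki  20   0 56.9g  54g   12 S  0.0 23.2  39:12.09 python
28971 ekobylki  20   0  931m  29m  816 S  0.0  0.0   0:00.09 python
26886 ekobylki  20   0  931m  28m    4 S  0.0  0.0   0:00.03 python
26785 ekobylki  20   0  931m  28m    4 S  0.0  0.0   0:00.03 python
26743 ekobylki  20   0  931m  28m    4 S  0.0  0.0   0:00.00 python
"""


def to_gib(field):
    """Convert a top memory field (plain KiB, or with an m/g suffix) to GiB."""
    if field.endswith("g"):
        return float(field[:-1])
    if field.endswith("m"):
        return float(field[:-1]) / 1024
    return float(field) / 1024 ** 2


def summarize(rows, threshold_gib=1.0):
    """Return (count, total RES in GiB, total %MEM) for processes above the threshold."""
    count, res_total, mem_total = 0, 0.0, 0.0
    for line in rows.splitlines():
        cols = line.split()
        res = to_gib(cols[5])          # RES column (assumed position)
        if res > threshold_gib:
            count += 1
            res_total += res
            mem_total += float(cols[9])  # %MEM column (assumed position)
    return count, res_total, mem_total


count, res_total, mem_total = summarize(TOP_ROWS)
print(count, round(res_total), round(mem_total, 1))
```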

The following are the only errors reported in the run-err*.txt files, so the processes listed above may or may not be the libsvm_svc models that eat up this memory.

21:04:20.572 [CLI TAE (Master Thread - #0)] ERROR c.u.c.b.a.t.b.c.CommandLineAlgorithmRun - The following algorithm call failed: cd "/x/truffles/./nfs_share/atsklrn_tmp" ; runsolver --watcher-data /dev/null -W 9865 -d 30 -M 5000 python /home/ekobylkin/anaconda2/lib/python2.7/site-packages/AutoSklearn-0.0.1.dev0-py2.7-linux-x86_64.egg/autosklearn/cli/SMAC_interface.py holdout ./nfs_share/atsklrn_tmp/.auto-sklearn/datamanager.pkl 9865.0 2147483647 -1 -balancing:strategy 'none' -classifier:choice 'libsvm_svc' -classifier:libsvm_svc:C '967.1406585393779' -classifier:libsvm_svc:coef0 '-0.8043440523197929' -classifier:libsvm_svc:degree '4' -classifier:libsvm_svc:gamma '0.9629176596789994' -classifier:libsvm_svc:kernel 'poly' -classifier:libsvm_svc:max_iter '-1' -classifier:libsvm_svc:shrinking 'False' -classifier:libsvm_svc:tol '3.780913961396449E-5' -imputation:strategy 'median' -one_hot_encoding:minimum_fraction '6.536726975871556E-4' -one_hot_encoding:use_minimum_fraction 'True' -preprocessor:choice 'no_preprocessing' -rescaling:choice 'standardize'
21:11:58.611 [CLI TAE (Master Thread - #0)] ERROR c.u.c.b.a.t.b.c.CommandLineAlgorithmRun - The following algorithm call failed: cd "/x/truffles/./nfs_share/atsklrn_tmp" ; runsolver --watcher-data /dev/null -W 9865 -d 30 -M 5000 python /home/ekobylkin/anaconda2/lib/python2.7/site-packages/AutoSklearn-0.0.1.dev0-py2.7-linux-x86_64.egg/autosklearn/cli/SMAC_interface.py holdout ./nfs_share/atsklrn_tmp/.auto-sklearn/datamanager.pkl 9865.0 2147483647 -1 -balancing:strategy 'weighting' -classifier:choice 'libsvm_svc' -classifier:libsvm_svc:C '980.1693781609448' -classifier:libsvm_svc:coef0 '0.9655846663184429' -classifier:libsvm_svc:degree '5' -classifier:libsvm_svc:gamma '1.1160123630458856' -classifier:libsvm_svc:kernel 'poly' -classifier:libsvm_svc:max_iter '-1' -classifier:libsvm_svc:shrinking 'False' -classifier:libsvm_svc:tol '0.06035882680156773' -imputation:strategy 'most_frequent' -one_hot_encoding:use_minimum_fraction 'False' -preprocessor:choice 'no_preprocessing' -rescaling:choice 'standardize'
20:51:51.392 [CLI TAE (Master Thread - #0)] ERROR c.u.c.b.a.t.b.c.CommandLineAlgorithmRun - The following algorithm call failed: cd "/x/truffles/./nfs_share/atsklrn_tmp" ; runsolver --watcher-data /dev/null -W 9865 -d 30 -M 5000 python /home/ekobylkin/anaconda2/lib/python2.7/site-packages/AutoSklearn-0.0.1.dev0-py2.7-linux-x86_64.egg/autosklearn/cli/SMAC_interface.py holdout ./nfs_share/atsklrn_tmp/.auto-sklearn/datamanager.pkl 9865.0 2147483647 -1 -balancing:strategy 'weighting' -classifier:choice 'libsvm_svc' -classifier:libsvm_svc:C '1928.806880985533' -classifier:libsvm_svc:coef0 '0.4875440101240127' -classifier:libsvm_svc:degree '5' -classifier:libsvm_svc:gamma '0.2262949152205443' -classifier:libsvm_svc:kernel 'poly' -classifier:libsvm_svc:max_iter '-1' -classifier:libsvm_svc:shrinking 'False' -classifier:libsvm_svc:tol '1.994581419473514E-5' -imputation:strategy 'most_frequent' -one_hot_encoding:minimum_fraction '0.002428618650930115' -one_hot_encoding:use_minimum_fraction 'True' -preprocessor:choice 'no_preprocessing' -rescaling:choice 'standardize'
16:45:32.235 [CLI TAE (Master Thread - #0)] ERROR c.u.c.b.a.t.b.c.CommandLineAlgorithmRun - The following algorithm call failed: cd "/x/truffles/./nfs_share/atsklrn_tmp" ; runsolver --watcher-data /dev/null -W 9865 -d 30 -M 5000 python /home/ekobylkin/anaconda2/lib/python2.7/site-packages/AutoSklearn-0.0.1.dev0-py2.7-linux-x86_64.egg/autosklearn/cli/SMAC_interface.py holdout ./nfs_share/atsklrn_tmp/.auto-sklearn/datamanager.pkl 9865.0 2147483647 -1 -balancing:strategy 'weighting' -classifier:choice 'multinomial_nb' -classifier:multinomial_nb:alpha '0.26529447713685506' -classifier:multinomial_nb:fit_prior 'True' -imputation:strategy 'most_frequent' -one_hot_encoding:minimum_fraction '0.010693674573559887' -one_hot_encoding:use_minimum_fraction 'True' -preprocessor:choice 'no_preprocessing' -rescaling:choice 'min/max'
20:37:52.185 [CLI TAE (Master Thread - #0)] ERROR c.u.c.b.a.t.b.c.CommandLineAlgorithmRun - The following algorithm call failed: cd "/x/truffles/./nfs_share/atsklrn_tmp" ; runsolver --watcher-data /dev/null -W 9865 -d 30 -M 5000 python /home/ekobylkin/anaconda2/lib/python2.7/site-packages/AutoSklearn-0.0.1.dev0-py2.7-linux-x86_64.egg/autosklearn/cli/SMAC_interface.py holdout ./nfs_share/atsklrn_tmp/.auto-sklearn/datamanager.pkl 9865.0 2147483647 -1 -balancing:strategy 'none' -classifier:choice 'libsvm_svc' -classifier:libsvm_svc:C '1113.6255293600818' -classifier:libsvm_svc:coef0 '-0.9090303757992946' -classifier:libsvm_svc:degree '4' -classifier:libsvm_svc:gamma '0.5701674840721168' -classifier:libsvm_svc:kernel 'poly' -classifier:libsvm_svc:max_iter '-1' -classifier:libsvm_svc:shrinking 'True' -classifier:libsvm_svc:tol '1.65010853496975E-5' -imputation:strategy 'most_frequent' -one_hot_encoding:minimum_fraction '0.021107927190034653' -one_hot_encoding:use_minimum_fraction 'True' -preprocessor:choice 'no_preprocessing' -rescaling:choice 'standardize'
00:01:14.702 [CLI TAE (Master Thread - #0)] ERROR c.u.c.b.a.t.b.c.CommandLineAlgorithmRun - The following algorithm call failed: cd "/x/truffles/./nfs_share/atsklrn_tmp" ; runsolver --watcher-data /dev/null -W 9865 -d 30 -M 5000 python /home/ekobylkin/anaconda2/lib/python2.7/site-packages/AutoSklearn-0.0.1.dev0-py2.7-linux-x86_64.egg/autosklearn/cli/SMAC_interface.py holdout ./nfs_share/atsklrn_tmp/.auto-sklearn/datamanager.pkl 9865.0 2147483647 -1 -balancing:strategy 'none' -classifier:choice 'libsvm_svc' -classifier:libsvm_svc:C '175.35300387835784' -classifier:libsvm_svc:coef0 '-0.44666123056615636' -classifier:libsvm_svc:degree '5' -classifier:libsvm_svc:gamma '4.213441046368573' -classifier:libsvm_svc:kernel 'poly' -classifier:libsvm_svc:max_iter '-1' -classifier:libsvm_svc:shrinking 'False' -classifier:libsvm_svc:tol '1.540015151031467E-4' -imputation:strategy 'most_frequent' -one_hot_encoding:minimum_fraction '0.0021976812568160094' -one_hot_encoding:use_minimum_fraction 'True' -preprocessor:choice 'no_preprocessing' -rescaling:choice 'standardize'
21:14:14.883 [CLI TAE (Master Thread - #0)] ERROR c.u.c.b.a.t.b.c.CommandLineAlgorithmRun - The following algorithm call failed: cd "/x/truffles/./nfs_share/atsklrn_tmp" ; runsolver --watcher-data /dev/null -W 9865 -d 30 -M 5000 python /home/ekobylkin/anaconda2/lib/python2.7/site-packages/AutoSklearn-0.0.1.dev0-py2.7-linux-x86_64.egg/autosklearn/cli/SMAC_interface.py holdout ./nfs_share/atsklrn_tmp/.auto-sklearn/datamanager.pkl 9865.0 2147483647 -1 -balancing:strategy 'none' -classifier:choice 'libsvm_svc' -classifier:libsvm_svc:C '16.117183011803604' -classifier:libsvm_svc:coef0 '-0.2687983462436514' -classifier:libsvm_svc:degree '5' -classifier:libsvm_svc:gamma '1.322366658499254' -classifier:libsvm_svc:kernel 'poly' -classifier:libsvm_svc:max_iter '-1' -classifier:libsvm_svc:shrinking 'False' -classifier:libsvm_svc:tol '0.05943499267707366' -imputation:strategy 'mean' -one_hot_encoding:minimum_fraction '0.0036462574639183' -one_hot_encoding:use_minimum_fraction 'True' -preprocessor:choice 'no_preprocessing' -rescaling:choice 'standardize
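A quick tally of the failed calls shows which configurations dominate. The sketch below counts the `-classifier:choice` and `-classifier:libsvm_svc:kernel` values with regular expressions; `LOG` holds abbreviated copies of the seven entries above (the `...` marks omitted text), and `tally` is an illustrative helper, not part of auto-sklearn or SMAC. The patterns accept any quote character around the value, so they also work on the raw run-err*.txt contents. In this log, six of the seven failures are libsvm_svc, all with a poly kernel of degree 4 or 5, and one is multinomial_nb.

```python
import re
from collections import Counter

# Abbreviated copies of the seven failed calls; "..." marks omitted text.
LOG = """\
21:04:20.572 ... -classifier:choice 'libsvm_svc' ... -classifier:libsvm_svc:kernel 'poly' ...
21:11:58.611 ... -classifier:choice 'libsvm_svc' ... -classifier:libsvm_svc:kernel 'poly' ...
20:51:51.392 ... -classifier:choice 'libsvm_svc' ... -classifier:libsvm_svc:kernel 'poly' ...
16:45:32.235 ... -classifier:choice 'multinomial_nb' ...
20:37:52.185 ... -classifier:choice 'libsvm_svc' ... -classifier:libsvm_svc:kernel 'poly' ...
00:01:14.702 ... -classifier:choice 'libsvm_svc' ... -classifier:libsvm_svc:kernel 'poly' ...
21:14:14.883 ... -classifier:choice 'libsvm_svc' ... -classifier:libsvm_svc:kernel 'poly' ...
"""

# \W matches the surrounding quote, whether straight or typographic.
CHOICE = re.compile(r"-classifier:choice\s+\W(\w+)\W")
KERNEL = re.compile(r"-classifier:libsvm_svc:kernel\s+\W(\w+)\W")


def tally(log_text):
    """Count classifier choices and SVC kernels across all failed calls."""
    choices = Counter(CHOICE.findall(log_text))
    kernels = Counter(KERNEL.findall(log_text))
    return choices, kernels


choices, kernels = tally(LOG)
print(dict(choices))
print(dict(kernels))
```

The tally only shows which configurations failed. It does not by itself tie them to the four large processes in the listing, since the log does not include process IDs.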

About this issue

  • State: closed
  • Created 8 years ago
  • Comments: 16 (14 by maintainers)

Most upvoted comments

Yes, that would probably mitigate some issues. I also added an example for parallel processing in the examples directory of the development branch since I had to change the interface a little bit.