ML: Undefined offset in metric class while training Multilayer Perceptron Classifier

Describe the bug

When attempting to train a Multilayer Perceptron Classifier, I occasionally get the following exception. I have been able to reproduce it with both the MCC and FBeta metrics. Unfortunately, the exception does not occur consistently, even with the same dataset.

[2020-04-04 22:32:21] production.ERROR: Undefined offset: 0 {"exception":"[object] (ErrorException(code: 0): Undefined offset: 0 at /[REDACTED]/vendor/rubix/ml/src/CrossValidation/Metrics/MCC.php:107)
[stacktrace]
#0 /[REDACTED]/vendor/rubix/ml/src/CrossValidation/Metrics/MCC.php(107): Illuminate\\Foundation\\Bootstrap\\HandleExceptions->handleError()
#1 /[REDACTED]/vendor/rubix/ml/src/Classifiers/MultilayerPerceptron.php(414): Rubix\\ML\\CrossValidation\\Metrics\\MCC->score()
#2 /[REDACTED]/vendor/rubix/ml/src/Classifiers/MultilayerPerceptron.php(360): Rubix\\ML\\Classifiers\\MultilayerPerceptron->partial()
#3 /[REDACTED]/vendor/rubix/ml/src/Pipeline.php(189): Rubix\\ML\\Classifiers\\MultilayerPerceptron->train()
#4 /[REDACTED]/vendor/rubix/ml/src/PersistentModel.php(191): Rubix\\ML\\Pipeline->train()
#5 /[REDACTED]/app/Console/Commands/TrainModel.php(89): Rubix\\ML\\PersistentModel->train()
#6 [internal function]: App\\Console\\Commands\\TrainModel->handle()
#7 /[REDACTED]/vendor/laravel/framework/src/Illuminate/Container/BoundMethod.php(32): call_user_func_array()
#8 /[REDACTED]/vendor/laravel/framework/src/Illuminate/Container/Util.php(36): Illuminate\\Container\\BoundMethod::Illuminate\\Container\\{closure}()
#9 /[REDACTED]/vendor/laravel/framework/src/Illuminate/Container/BoundMethod.php(90): Illuminate\\Container\\Util::unwrapIfClosure()
#10 /[REDACTED]/vendor/laravel/framework/src/Illuminate/Container/BoundMethod.php(34): Illuminate\\Container\\BoundMethod::callBoundMethod()
#11 /[REDACTED]/vendor/laravel/framework/src/Illuminate/Container/Container.php(592): Illuminate\\Container\\BoundMethod::call()
#12 /[REDACTED]/vendor/laravel/framework/src/Illuminate/Console/Command.php(134): Illuminate\\Container\\Container->call()
#13 /[REDACTED]/vendor/symfony/console/Command/Command.php(255): Illuminate\\Console\\Command->execute()
#14 /[REDACTED]/vendor/laravel/framework/src/Illuminate/Console/Command.php(121): Symfony\\Component\\Console\\Command\\Command->run()
#15 /[REDACTED]/vendor/symfony/console/Application.php(912): Illuminate\\Console\\Command->run()
#16 /[REDACTED]/vendor/symfony/console/Application.php(264): Symfony\\Component\\Console\\Application->doRunCommand()
#17 /[REDACTED]/vendor/symfony/console/Application.php(140): Symfony\\Component\\Console\\Application->doRun()
#18 /[REDACTED]/vendor/laravel/framework/src/Illuminate/Console/Application.php(93): Symfony\\Component\\Console\\Application->run()
#19 /[REDACTED]/vendor/laravel/framework/src/Illuminate/Foundation/Console/Kernel.php(129): Illuminate\\Console\\Application->run()
#20 /[REDACTED]/artisan(37): Illuminate\\Foundation\\Console\\Kernel->handle()
#21 {main}
"}

To Reproduce

The following code occasionally reproduces this error.

$estimator = new PersistentModel(
    new Pipeline(
        [
            new TextNormalizer(),
            new WordCountVectorizer(10000, 3, new NGram(1, 3)),
            new TfIdfTransformer(),
            new ZScaleStandardizer()
        ],
        new MultilayerPerceptron([
            new Dense(100),
            new PReLU(),
            new Dense(100),
            new PReLU(),
            new Dense(100),
            new PReLU(),
            new Dense(50),
            new PReLU(),
            new Dense(50),
            new PReLU(),
        ], 100, null, 1e-4, 1000, 1e-4, 10, 0.1, null, new MCC())
    ),
    new Filesystem($modelPath.'classifier.model')
);

$estimator->setLogger(new Screen('train-model'));

$estimator->train($dataset);

The labelled dataset used is a series of text files split into different directories that indicate their class names. This dataset is built using the following function.

    public static function buildLabeled(): Labeled
    {
        $samples = $labels = [];

        $directories = glob(storage_path('app/dataset/*'));

        foreach ($directories as $directory) {
            foreach (glob($directory.'/*.txt') as $file) {
                $text = file_get_contents($file);
                $samples[] = [$text];
                $labels[] = basename($directory);
            }
        }

        return Labeled::build($samples, $labels);
    }

Expected behavior

Training should complete without any errors within the metric class.

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 16 (16 by maintainers)

Most upvoted comments

Summarizing what we talked about in chat …

This error is caused by a chain of silent errors that starts with numerical underflow or overflow, due to a learning rate that is too high for the user’s particular dataset. As a result, the network produces NaN values at the output layer, which in turn produce a prediction of false when run through the argmax function. This false value is then silently converted (thanks PHP) to the integer 0 when used as the key/index of the array entry that accumulates false positives in the MCC and FBeta metrics.
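
The chain described above can be reproduced in isolation. The sketch below is not the library's code: argmaxSketch() is a hypothetical stand-in that assumes argmax is built on array_search() and max(), and the class labels and the $tally array are made up for illustration.

```php
<?php

// Hypothetical stand-in for an argmax helper (assumption: the real one
// works along these lines). Returns the key of the largest value.
function argmaxSketch(array $values)
{
    return array_search(max($values), $values);
}

// Output-layer activations after the network has become unstable.
$activations = ['spam' => NAN, 'ham' => NAN];

// NAN never compares equal to anything, including itself, so
// array_search() finds no match and returns false.
$prediction = argmaxSketch($activations);

// Per-class counters, similar in spirit to what a metric might keep.
$tally = ['spam' => 0, 'ham' => 0];

// PHP casts a boolean array key to an integer, so false addresses offset 0.
// The ?? guard avoids the notice here; a plain $tally[$prediction]++ is the
// kind of access that reports "Undefined offset: 0" on PHP 7.
$tally[$prediction] = ($tally[$prediction] ?? 0) + 1;

var_dump($prediction);
print_r(array_keys($tally));
```

After this runs, $tally holds a stray integer key 0 next to the two real class labels, which is the silent corruption that the metric later trips over.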

The solution is to decrease the learning rate of the Gradient Descent optimizer to prevent the network from blowing up. To help the user identify when the network has become unstable, we will catch NaN values before scoring the validation set and then throw an informative exception.
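
A guard of the kind described here might look roughly like the following. This is only a sketch under stated assumptions: assertFiniteActivations() is a hypothetical helper, not the check that was added to the library, and the exception type and message are illustrative.

```php
<?php

// Hypothetical helper (not part of Rubix ML): scans a batch of output
// activations and fails loudly if any value is NaN.
function assertFiniteActivations(array $activations): void
{
    foreach ($activations as $row) {
        foreach ($row as $value) {
            if (is_nan($value)) {
                throw new RuntimeException(
                    'Numerical instability detected,'
                    . ' try lowering the learning rate.'
                );
            }
        }
    }
}

$stable = [[0.9, 0.1], [0.2, 0.8]];
$unstable = [[0.9, 0.1], [NAN, NAN]];

// A healthy batch passes through without any side effects.
assertFiniteActivations($stable);

// An unstable batch is rejected before it can reach the metric.
$caught = null;

try {
    assertFiniteActivations($unstable);
} catch (RuntimeException $e) {
    $caught = $e->getMessage();
}

echo $caught, PHP_EOL;
```

Running the check before the validation predictions are scored turns the misleading undefined offset error into a message that points at the actual cause.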

Here is a good article on exploding gradients and why decreasing the learning rate stabilizes training: https://machinelearningmastery.com/exploding-gradients-in-neural-networks/