talos: Reporting Data has incorrect column associations to frame
- I’m up-to-date with the latest release:
  pip install -U talos
- I’ve confirmed that my Keras model works outside of Talos.
 
I noticed that after scanning a Parameter dictionary, the inner data object of the Reporter has the column/row associations incorrectly ordered in Python 2.7.
For example, something like:
import talos as ta

p = {
    'compile_loss': ['mean_squared_error'],
    'compile_optimizer': ['sgd'],
    'hidden_units': [64, 128, 512],
    'inner_activation': ['relu'],
    'output_activation': ['relu'],
    'recurrent_activation': ['relu'],
    'lstm_layers': [0],
    'gru_layers': [0],
    'dropout_ratio': [0.2],
    'activate_regularizers': [1, 0],
    'batch_window_size': [5, 50],
    'epochs': [100]
}
# Called from inside a class method, hence the reference to self.
scan = ta.Scan(train_inputs, train_outputs, p, self._create_keras_model)
reporting = ta.Reporting(scan)
print(reporting.data.columns)
Prints out:
   round_epochs                  acc         loss              val_acc  \
1           10  0.20000000298023224  315008640.0  0.20000000298023224   
2           10  0.30000001192092896  170956672.0  0.20000000298023224   
      val_loss    lr recurrent_activation inner_activation  \
1  308987328.0  0.01                  0.2             relu   
2  169688720.0  0.01                  0.2             relu   
  activate_regularizers epochs lstm_layers dropout_ratio hidden_units  \
1    mean_squared_error      0        relu            50          sgd   
2    mean_squared_error      0        relu            50          sgd   
  compile_loss gru_layers batch_window_size compile_optimizer  \
1           10          0               128              relu   
2           10          0               512              relu   
  output_activation  
1                 0  
As the output above shows, the scan values appear to be one column index off (for example, recurrent_activation holds 0.2 and dropout_ratio holds 50), which leads to incorrect reporting and makes it difficult to reconstruct the best parameter associations.
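One way to spot this kind of misalignment programmatically is to check every reported value against the values allowed for that parameter. The sketch below is only an illustration: `find_misaligned_columns` is a hypothetical helper, not part of Talos, and the row is a hand-copied subset of the output above. It also assumes the values keep their original types; if the report stores them as strings, they would need converting first.

```python
def find_misaligned_columns(param_space, row):
    """Return the names of columns whose value is not allowed for that parameter.

    param_space : dict mapping parameter name -> list of allowed values
    row         : dict mapping column name -> value reported for one round
    """
    bad = []
    for name, allowed in param_space.items():
        if name in row and row[name] not in allowed:
            bad.append(name)
    return sorted(bad)


# Subset of the parameter dictionary from this issue.
param_space = {
    'recurrent_activation': ['relu'],
    'inner_activation': ['relu'],
    'dropout_ratio': [0.2],
}

# Values copied by hand from row 1 of the output above.
row = {
    'recurrent_activation': 0.2,
    'inner_activation': 'relu',
    'dropout_ratio': 50,
}

misaligned = find_misaligned_columns(param_space, row)
print(misaligned)
```

A column that passes this check is not necessarily correct (inner_activation holds 'relu' here, which several parameters allow), so an empty result is weaker evidence than a non-empty one.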
This originally came up when I realized that best_params is an array, which leaves no clear way of reconstructing the original parameter dictionary associations in order to store the params for replay later.
The ask is to simply offer a way of extracting the best_params in a format that allows restoration of the values to the original parameter keys.
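To make the request concrete, here is a minimal sketch of the kind of restoration I have in mind. `restore_params` is a hypothetical helper, not an existing Talos function, and the keys and values are made up for illustration. It assumes the caller knows the exact key order that was used to build the array, which is the part that currently cannot be trusted.

```python
from collections import OrderedDict


def restore_params(param_keys, best_values):
    """Pair each value in one best-params row with its parameter name.

    param_keys  : parameter names, in the order used to build the row
    best_values : one row of best parameter values (list, tuple, or array)
    """
    if len(param_keys) != len(best_values):
        raise ValueError(
            "expected %d values, got %d" % (len(param_keys), len(best_values))
        )
    # OrderedDict keeps the key order stable, including on Python 2.7.
    return OrderedDict(zip(param_keys, best_values))


# Illustrative keys and values only.
keys = ['hidden_units', 'dropout_ratio', 'epochs']
best_row = [128, 0.2, 100]

restored = restore_params(keys, best_row)
print(dict(restored))
```

The length check only catches a missing or extra value; it cannot detect two values that have swapped places, so the result is only as reliable as the key order passed in.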
BTW, thank you for this module.
About this issue
 - State: closed
 - Created 6 years ago
 - Reactions: 2
 - Comments: 28 (16 by maintainers)
 
@felixriese
Amazing. That’s a huge relief, as the next steps would have become pretty involved 😃 Yes, I will implement OrderedDict as a default, but I first have to put some time into it, as it relates to a key aspect of the mainline problem. It should be OK, since it really just retains the order of the dict, but better safe than sorry. I think this will be incorporated by v.0.4.8, so maybe a few weeks down the road.
I’ve created a new issue, #172, to handle this. Given that OP’s issue is resolved, as well as yours, I’m closing this one. Thanks for taking the time to jump through the hoops. Great that we squashed it!
@felixriese Below I will propose a very simple solution that might just fix this issue for you.
In the interest of transparency, I will also share with you what I know so far. The problem you are seeing was previously unknown, and it is not the same as OP’s problem, which is a shift of the column headers by one (a problem related to an older version of Talos).
The line of code that this likely relates to is:
    [_rr_out.append(key) for key in self.params.keys()]
…where the resulting list becomes the columns after the last key from the history object (which in your case is val_loss). Right now my working theory is that, for some reason, the order of the dictionary on your system is changed in what looks like an arbitrary manner. If you add the following to /utils/results.py at line 16 (before the current line 16):
    print(self.params.keys())
…you will most likely get a printout that is identical to the column order you are getting. Why that would happen on your system and not on any other, I’m not clear on. The good news is that if this is indeed the case, the fix might be very simple.
The proposed solution:
Instead of doing:
    ta.Scan(... params=p ...)
Try:
When you do that, what happens?
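The working theory above can be sketched roughly as follows. This is an illustrative stand-in, not the actual code in /utils/results.py: `build_columns` and the sample keys are hypothetical. The point is that the parameter columns simply follow the iteration order of the params dictionary, so an OrderedDict (mentioned elsewhere in this thread) pins that order even on Python 2.7, where the order of a plain dict is arbitrary.

```python
from collections import OrderedDict


def build_columns(history_keys, params):
    """Append the parameter names after the history keys, in dict iteration order."""
    columns = list(history_keys)
    for key in params.keys():
        columns.append(key)
    return columns


# Hypothetical history keys; val_loss is the last one, as in the description above.
history_keys = ['round_epochs', 'loss', 'val_loss']

# An OrderedDict iterates in insertion order on every Python version.
params = OrderedDict([
    ('lr', [0.01]),
    ('hidden_units', [64, 128, 512]),
    ('epochs', [100]),
])

columns = build_columns(history_keys, params)
print(columns)
```

If the values for each round are collected in a different order than the keys, the headers and values drift apart, which would be consistent with the symptom reported here.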
Here is the screenshot from the terminal:
As you can see, the problem still exists (e.g. padding). The columns look like this:
And with Talos, it looks the same (also wrong):
I use this Dockerfile and install Talos afterwards with
which is essentially what you are proposing.
The OS of the machine that starts the Docker container (is that important?) is: