ludwig: ValueError: Invalid reduction dimension 2 for input with 2 dimensions for translation model training after updating Ludwig
Describe the bug
Hi,
I previously used Ludwig when it ran on the TensorFlow 1.x backend, and I created a machine translation project with it. After updating Ludwig to the latest version, I can no longer run the same project.
I am basically following the configuration mentioned in the example for translation: https://ludwig-ai.github.io/ludwig-docs/examples/#machine-translation
Below is my model_definition.yaml:
training:
    epochs: 500
    early_stop: 50
    batch_size: 128
    # dropout_rate: 0.3
input_features:
    -
        name: column1
        type: text
        level: word
        encoder: parallel_cnn
        representation: sparse
        reduce_output: null
        preprocessing:
            word_tokenizer: space
output_features:
    -
        name: column2
        type: text
        level: word
        decoder: generator
        cell_type: lstm
        attention: bahdanau
        loss:
            type: sampled_softmax_cross_entropy
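As a quick sanity check that the YAML itself is well formed (and that `reduce_output: null` really reaches Ludwig as Python `None`, meaning the encoder's sequence output is not reduced), the config can be parsed with PyYAML; this is just an illustrative sketch, not part of Ludwig itself:

```python
import yaml  # PyYAML, which Ludwig also uses to read configs

# the model_definition.yaml from above, reproduced inline for illustration
config_text = """
training:
    epochs: 500
    early_stop: 50
    batch_size: 128
input_features:
    - name: column1
      type: text
      level: word
      encoder: parallel_cnn
      representation: sparse
      reduce_output: null
      preprocessing:
          word_tokenizer: space
output_features:
    - name: column2
      type: text
      level: word
      decoder: generator
      cell_type: lstm
      attention: bahdanau
      loss:
          type: sampled_softmax_cross_entropy
"""
config = yaml.safe_load(config_text)

# YAML `null` parses to Python None, i.e. "do not reduce the encoder output"
print(config["input_features"][0]["reduce_output"])
```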
My terminal command:
ludwig train --experiment_name translate-1 --dataset training_file.csv --config_file model_definition.yaml --output_directory results
This is the error that is produced:
ValueError: Invalid reduction dimension 2 for input with 2 dimensions. for '{{node ecd/text_output_feature/Max}} = Max[T=DT_FLOAT, Tidx=DT_INT32, keep_dims=false](ecd/text_output_feature/Abs_1, ecd/text_output_feature/Max/reduction_indices)' with input shapes: [128,256], [] and with computed input tensors: input[1] = <2>.
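Reading the traceback, the `Max` op is asked to reduce over dimension 2 of a tensor whose shape is `[128, 256]` (batch size 128, hidden size 256), i.e. a 2-D tensor with no third axis. The analogous failure can be reproduced outside TensorFlow with NumPy (a sketch for illustration only, not the actual Ludwig code path):

```python
import numpy as np

# the tensor named in the traceback has shape [128, 256]
logits = np.zeros((128, 256))

# the failing op reduces over dimension 2, which a 2-D tensor does not have
err_name = None
try:
    logits.max(axis=2)
except ValueError as e:  # numpy's AxisError subclasses ValueError
    err_name = type(e).__name__

print(err_name)
```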
Here are a few lines from training_file.csv:
column1,column2
k k klk k hjkj hg k kg h k jlk k kj kg hk k k k k k k k klk kjh jkj hg ghk kj kh khgh hg,N S SHL S LHHL LL H SL H H LHL S SL HL HH S S S S S S S SHL SLL HHL LL SHH SL HL HLLH SL
hk lk kh klk l lmlk lml mn m klm mn m ml lj kl klk kjhj jh h h h klm l lm l l l l lk lmkl lkk hjh h h klk kj k klm mlkj kl ml lk lk m lk jkjh jh k k hkh hg hk lm kj gh hg hjk jh,NH HL SL HHL H SHLL HHL HH L LHH SH L SL SL HH LHL SLLH SL S S S HHH L SH L S S S SL HHLH SLS LHL S S HHL SL H SHH SLLL HH HL SL HL H LL LHLL HL H S LHL SL HH HH LL LH SL HHH LL
kj klkjkjh ghg hj j j jh jk hj ghjh hg g fg g g g hjhg hjh gf gh hkjklkjh hjhg hg,NL HHLLHLL LHL HH S S SL HH LH LHHL SL S LH S S S HHLL HHL LL HH SHLHHLLL SHLL HL
g j k l k h k k g g g k k kj g hk k kj h kj h g g,N H H H L L H S L S S H S SL L HH S SL L HL L L S
hkj k k k k kkk kh kl kmlkjk kj h hl l l lk kmlm jlkk j hl l l lk k k kmlm k k k kmlk k k k k kml k kkk hjkjhj jh jkl ljl lmlkj kjhg hg kk hk h kkkh jkl lmlk lkj,NHL H S S S SSS SL HH LHLLLH SL L SH S S SL SHLH LHLS L LH S S SL S S SHLH L S S SHLL S S S S SHL L SSS LHHLLH SL HHH SLH SHLLL HLLL HL HS LH L HSSL HHH SHLL HLL
e e e e de dcded edb cb dc de dc bcdc cb c ac c c bcdcb d dededc bc dcbc ba,N S S S LH LLHHL HLL HL HL HH LL LHHL SL H LH S S LHHLL H SHLHLL LH HLLH LL
g j k l l l k h l k j g h g f gh h k k jk h h g g,N H H H S S L L H L L L H L L HH S H S LH L S L S
f fedf d dfe fg g g ggf ed df ef d dhj h hgf g f efg fe de dd c ed d cd d df ghgfg e,N SLLH L SHL HH S S SSL LL SH LH L SHH L SLL H L LHH LL LH LS L HL S LH S SH HHLLH L
d fgh g g g g gh g g g ge g fgh fe dfc de fefed dc f ghg g gh g g gh g ghkjh jkjh g ghjhgh hg h gf g h kl kjkl lk h gf hkj klk kjh jh hg hg gh g g g fgh fe dfc de fefed g ghgf gh g ghkjh jkjh ghjhgh hg,N HHH L S S S SH L S S SL H LHH LL LHL HH HLHLL SL H HHL S SH L S SH L SHHLL HHLL L SHHLLH SL H LL H H HH LLHH SL L LL HHL HHL SLL HL SL HL SH L S S LHH LL LHL HH HLHLL H SHLL HH L SHHLL HHLL LHHLLH SL
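Since both columns are plain space-separated token sequences, the `space` word tokenizer in the config splits each cell on whitespace. A small stdlib-only sketch (using made-up short rows in the same shape as training_file.csv) shows how the source and target align token for token:

```python
import csv
from io import StringIO

# hypothetical short rows in the same two-column shape as training_file.csv
sample = "column1,column2\ng j k l k h,N H H H L L\ne e e de,N S S LH\n"

reader = csv.DictReader(StringIO(sample))
for row in reader:
    # the `space` word_tokenizer splits each cell on whitespace,
    # so source and target token counts can be compared directly
    src = row["column1"].split()
    tgt = row["column2"].split()
    print(len(src), len(tgt))
```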
Environment (please complete the following information):
- OS: Windows 10
- Python version 3.6.8
- Ludwig version 0.33
About this issue
- State: closed
- Created 3 years ago
- Reactions: 1
- Comments: 19
We are looking into this issue, and will update soon.
@farazk86 re: the sampled_softmax issue. We are still looking at it. In the meantime, you should be able to use the regular softmax cross-entropy loss function.

@w4nderlust Yes, I was using v0.4.1, but I just installed v0.5 on a separate cluster to test, and it seems to have resolved the issue. Thanks for your help! (N.B. I mentioned a different KeyError after switching to v0.5 in an earlier version of this reply, but I was able to resolve it, so please disregard it if you caught it pre-edit.)
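For anyone hitting the same error before the fix lands, the workaround suggested above amounts to swapping the loss type in the output feature. A minimal sketch, assuming the rest of the config stays unchanged:

```yaml
output_features:
    -
        name: column2
        type: text
        level: word
        decoder: generator
        cell_type: lstm
        attention: bahdanau
        loss:
            type: softmax_cross_entropy  # regular softmax instead of sampled_softmax_cross_entropy
```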
@farazk86 thank you for the feedback. Looks like there may be another subtle issue still left over from the latest PR. Let me take a look at it.