ludwig: ValueError: Invalid reduction dimension 2 for input with 2 dimensions for translation model training after updating Ludwig

Describe the bug

Hi, I previously used Ludwig when it ran on the TensorFlow 1.x backend, and I built a machine translation project with it. After updating Ludwig to the latest version, I can no longer run the same project.

I am basically following the configuration mentioned in the example for translation: https://ludwig-ai.github.io/ludwig-docs/examples/#machine-translation

Below is my model_definition.yaml:

training:
    epochs: 500
    early_stop: 50
    batch_size: 128
#    dropout_rate: 0.3

input_features:
    -
        name: column1
        type: text
        level: word
        encoder: parallel_cnn
        representation: sparse
        reduce_output: null
        preprocessing:
            word_tokenizer: space

output_features:
    -
        name: column2
        type: text
        level: word
        decoder: generator
        cell_type: lstm
        attention: bahdanau
        loss:
            type: sampled_softmax_cross_entropy

My terminal command:

ludwig train --experiment_name translate-1 --dataset training_file.csv --config_file model_definition.yaml --output_directory results

This is the error that is produced:

ValueError: Invalid reduction dimension 2 for input with 2 dimensions. for '{{node ecd/text_output_feature/Max}} = Max[T=DT_FLOAT, Tidx=DT_INT32, keep_dims=false](ecd/text_output_feature/Abs_1, ecd/text_output_feature/Max/reduction_indices)' with input shapes: [128,256], [] and with computed input tensors: input[1] = <2>.
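
Reading the message literally: a Max op was asked to reduce over axis 2 of a tensor whose shape is [128, 256], and a rank-2 tensor only has axes 0 and 1. The snippet below is a minimal pure-Python sketch of that kind of axis check, not TensorFlow's or Ludwig's actual code. `check_reduction_axis` is a hypothetical helper, and the rank-3 shape is made up purely for contrast.

```python
def check_reduction_axis(shape, axis):
    """Return the shape left after reducing `axis`, or raise if the axis is invalid.

    Hypothetical helper for illustration only; it is not part of TensorFlow or Ludwig.
    """
    rank = len(shape)
    # A rank-r tensor accepts reduction axes in the range -r .. r-1.
    if not -rank <= axis < rank:
        raise ValueError(
            f"Invalid reduction dimension {axis} for input with {rank} dimensions."
        )
    axis %= rank  # normalise negative axes
    return shape[:axis] + shape[axis + 1:]


# Shape and axis taken from the traceback: [128, 256] reduced over axis 2.
try:
    check_reduction_axis((128, 256), 2)
except ValueError as err:
    message = str(err)
    print(message)

# The same reduction is fine on a rank-3 tensor (the 40 here is an assumed sequence length).
print(check_reduction_axis((128, 40, 256), 2))
```

So the op seems to expect a tensor with a third axis, while the tensor it actually receives has only two.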

Here are a few lines from the training_file.csv

column1,column2
k k klk k hjkj hg k kg h k jlk k kj kg hk k k k k k k k klk kjh jkj hg ghk kj kh khgh hg,N S SHL S LHHL LL H SL H H LHL S SL HL HH S S S S S S S SHL SLL HHL LL SHH SL HL HLLH SL
hk lk kh klk l lmlk lml mn m klm mn m ml lj kl klk kjhj jh h h h klm l lm l l l l lk lmkl lkk hjh h h klk kj k klm mlkj kl ml lk lk m lk jkjh jh k k hkh hg hk lm kj gh hg hjk jh,NH HL SL HHL H SHLL HHL HH L LHH SH L SL SL HH LHL SLLH SL S S S HHH L SH L S S S SL HHLH SLS LHL S S HHL SL H SHH SLLL HH HL SL HL H LL LHLL HL H S LHL SL HH HH LL LH SL HHH LL
kj klkjkjh ghg hj j j jh jk hj ghjh hg g fg g g g hjhg hjh gf gh hkjklkjh hjhg hg,NL HHLLHLL LHL HH S S SL HH LH LHHL SL S LH S S S HHLL HHL LL HH SHLHHLLL SHLL HL
g j k l k h k k g g g k k kj g hk k kj h kj h g g,N H H H L L H S L S S H S SL L HH S SL L HL L L S
hkj k k k k kkk kh kl kmlkjk kj h hl l l lk kmlm jlkk j hl l l lk k k kmlm k k k kmlk k k k k kml k kkk hjkjhj jh jkl ljl lmlkj kjhg hg kk hk h kkkh jkl lmlk lkj,NHL H S S S SSS SL HH LHLLLH SL L SH S S SL SHLH LHLS L LH S S SL S S SHLH L S S SHLL S S S S SHL L SSS LHHLLH SL HHH SLH SHLLL HLLL HL HS LH L HSSL HHH SHLL HLL
e e e e de dcded edb cb dc de dc bcdc cb c ac c c bcdcb d dededc bc dcbc ba,N S S S LH LLHHL HLL HL HL HH LL LHHL SL H LH S S LHHLL H SHLHLL LH HLLH LL
g j k l l l k h l k j g h g f gh h k k jk h h g g,N H H H S S L L H L L L H L L HH S H S LH L S L S
f fedf d dfe fg g g ggf ed df ef d dhj h hgf g f efg fe de dd c ed d cd d df ghgfg e,N SLLH L SHL HH S S SSL LL SH LH L SHH L SLL H L LHH LL LH LS L HL S LH S SH HHLLH L
d fgh g g g g gh g g g ge g fgh fe dfc de fefed dc f ghg g gh g g gh g ghkjh jkjh g ghjhgh hg h gf g h kl kjkl lk h gf hkj klk kjh jh hg hg gh g g g fgh fe dfc de fefed g ghgf gh g ghkjh jkjh ghjhgh hg,N HHH L S S S SH L S S SL H LHH LL LHL HH HLHLL SL H HHL S SH L S SH L SHHLL HHLL L SHHLLH SL H LL H H HH LLHH SL L LL HHL HHL SLL HL SL HL SH L S S LHH LL LHL HH HLHLL H SHLL HH L SHHLL HHLL LHHLLH SL

Environment:

  • OS: Windows 10
  • Python version 3.6.8
  • Ludwig version 0.33

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 19

Most upvoted comments

We are looking into this issue anyway, will update soon.

@farazk86 re: the sampled_softmax issue. We are still looking at it. In the meantime, you should be able to use the regular softmax cross entropy loss function.
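
For anyone applying that workaround, here is a rough sketch of the change expressed on the config as a Python dict. It only swaps the loss type on the output feature shown in the issue. `use_regular_softmax` is a hypothetical helper, and the value `softmax_cross_entropy` is my assumption for the name of the regular loss, so please check it against the docs for your Ludwig version.

```python
import copy


def use_regular_softmax(config):
    """Return a copy of `config` with sampled softmax losses swapped for the regular one.

    Hypothetical helper; "softmax_cross_entropy" is an assumed loss name.
    """
    patched = copy.deepcopy(config)  # leave the caller's config untouched
    for feature in patched.get("output_features", []):
        loss = feature.get("loss")
        if loss and loss.get("type") == "sampled_softmax_cross_entropy":
            loss["type"] = "softmax_cross_entropy"
    return patched


# Output feature copied from the model_definition.yaml in the issue.
config = {
    "output_features": [
        {
            "name": "column2",
            "type": "text",
            "level": "word",
            "decoder": "generator",
            "cell_type": "lstm",
            "attention": "bahdanau",
            "loss": {"type": "sampled_softmax_cross_entropy"},
        }
    ]
}

patched = use_regular_softmax(config)
print(patched["output_features"][0]["loss"]["type"])
```

The same edit can of course be made by hand in model_definition.yaml by changing the `type` under `loss`.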

@w4nderlust Yes, I was using v0.4.1 but I just installed v0.5 on a separate cluster to test and it seems to have resolved the issue. Thanks for your help! (N.B. I mentioned a different KeyError after switching to v0.5 in an earlier version of this reply but was able to resolve it, so please disregard it if you caught it pre-edit).

@farazk86 thank you for the feedback. Looks like there may be another subtle issue still left over from the latest PR. Let me take a look at it.