MidiTok: ValueError: invalid literal for int() with base 10: '3.6.8' OR ValueError: not enough values to unpack (expected 2, got 1)
First of all, using the framework has been very useful already!
I am running into two kinds of errors and I don’t know why. I use the GPT-2 architecture from the repository example notebook, successfully trained, with MidiTok 1.1.9.
Code structure
Encoding:
pitch_range = range(21, 109)
beat_res = {(0, 4): 8}
nb_velocities = 32
additional_tokens = {'Chord': False, 'Rest': False, 'Tempo': True, 'Program': True, 'TimeSignature': True,
                     'nb_tempos': 32,
                     'tempo_range': (40, 250),
                     'time_signature_range': (8, 2)}
tokenizer = Octuple(pitch_range, beat_res, nb_velocities, additional_tokens)
Preprocessing:
import json
import pathlib as path

# Converts MIDI files to tokens saved as JSON files
tokenizer.tokenize_midi_dataset(paths, relative_path_to_json, midi_valid)

json_paths = list(path.Path(relative_path_to_json).glob('*.json'))
entire_pop909_json_with_bools = []
for json_file in json_paths:
    with open(json_file) as f:
        data = json.load(f)
        entire_pop909_json_with_bools.extend(data)  # elements are found in the list of lists

entire_pop909_json_list = []
# keep only the song tokens, not the boolean track signs
for slot in entire_pop909_json_with_bools:
    if False not in slot[0]:  # TAKE CARE: just for the POP909 dataset
        entire_pop909_json_list.append(slot)

flatten_different_songs = [item for sublist in entire_pop909_json_list for item in sublist]
# flatten the time steps so the tokens fit the [4, 1024] shape, otherwise it would be [4, 1024, 8]
flatten_time_steps = [item for sublist in flatten_different_songs for item in sublist]
train_data = []
train_data.extend(flatten_time_steps)
Output tensors shape from DataLoader:
Train loader
X shape: torch.Size([4, 1024])
Target shape: torch.Size([4, 1024])
Generating from scratch:
rand_seq = model.generate(torch.Tensor([1]), target_seq_length=512)
out = rand_seq[0].cpu().numpy().tolist()
converted_back_midi = tokenizer.tokens_to_midi([out], None)
converted_back_midi.dump('output.mid')
Errors
When the generation part is executed, one of two errors can appear. The first one:
MidiTok Model Generator
Generating sequence of max length: 512
50 / 512
100 / 512
150 / 512
200 / 512
250 / 512
300 / 512
350 / 512
400 / 512
450 / 512
500 / 512
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/tmp/ipykernel_5234/3425966451.py in <module>
14 out = rand_seq[0].cpu().numpy().tolist()
15
---> 16 converted_back_midi = tokenizer.tokens_to_midi([out], None)
17 converted_back_midi.dump('4_model_1_OUTPUT(256).mid')
18
~/miniconda3/envs/remiTest/lib/python3.9/site-packages/miditok/octuple.py in tokens_to_midi(self, tokens, _, output_path, time_division)
230
231 if self.additional_tokens['TimeSignature']:
--> 232 time_sig = self._parse_token_time_signature(self.tokens_to_events(tokens[0])[-1].value)
233 else: # default
234 time_sig = TIME_SIGNATURE
~/miniconda3/envs/remiTest/lib/python3.9/site-packages/miditok/midi_tokenizer_base.py in _parse_token_time_signature(token_time_sig)
447 :return: the numerator and denominator of a time signature
448 """
--> 449 numerator, denominator = map(int, token_time_sig.split('/'))
450 return numerator, denominator
451
ValueError: invalid literal for int() with base 10: '3.6.8'
Or this one:
MidiTok Model Generator
Generating sequence of max length: 512
50 / 512
100 / 512
150 / 512
200 / 512
250 / 512
300 / 512
350 / 512
400 / 512
450 / 512
500 / 512
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/tmp/ipykernel_5234/761086941.py in <module>
14 out = rand_seq[0].cpu().numpy().tolist()
15
---> 16 converted_back_midi = tokenizer.tokens_to_midi([out], None)
17 converted_back_midi.dump('output.mid')
18
~/miniconda3/envs/remiTest/lib/python3.9/site-packages/miditok/octuple.py in tokens_to_midi(self, tokens, _, output_path, time_division)
230
231 if self.additional_tokens['TimeSignature']:
--> 232 time_sig = self._parse_token_time_signature(self.tokens_to_events(tokens[0])[-1].value)
233 else: # default
234 time_sig = TIME_SIGNATURE
~/miniconda3/envs/remiTest/lib/python3.9/site-packages/miditok/midi_tokenizer_base.py in _parse_token_time_signature(token_time_sig)
447 :return: the numerator and denominator of a time signature
448 """
--> 449 numerator, denominator = map(int, token_time_sig.split('/'))
450 return numerator, denominator
451
ValueError: not enough values to unpack (expected 2, got 1)
In the ValueError: invalid literal for int() with base 10: '3.6.8' case, the ‘x.x.x’ literal can change on every execution.
Thanks in advance!
PS: Sorry if I made it too long, just wanted to be clear on each point 😃.
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 24 (12 by maintainers)
Hi Env,
After a few tests I did not run into bugs, so I released the update in v1.2.0 ! If you get new bugs / crashes please re-open this issue or create a new one ! 😃
BTW Octuple is pretty “demanding” in compute resources, meaning the multi input / output requires a relatively high number of model parameters (and therefore GPU memory). The original authors used 8 V100s (32GB VRAM), which is quite a lot. My results with one V100 weren’t very good either, with the model often producing errors like predicting Bars / Positions already passed (going backward in time). For smaller hardware / model sizes, representations like REMI / Structured are more suitable.
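For instance, switching the earlier encoding to REMI only needs the tokenizer line to change (a sketch, reusing the same parameters as in the first post):
from miditok import REMI
tokenizer = REMI(pitch_range, beat_res, nb_velocities, additional_tokens)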
Amazing!
I will try to train for many epochs on an Amazon GPU, so when I have results I can tell you. The new version seems to work well!
Hi 😃
I just corrected it in 960cbfa8eac1750aec1fb95d623e3ab2a51370f1 (a really stupid bug ahah). Hoping not to find any other bugs; if you do, please tell me. And if you don’t encounter any other bug after testing with generated tokens, please also report that so I can release this in the next version. 😃
I get what you’re saying. Maybe that’s deeper to build than I’m able to, but I will try my best these days 😄
I’ll report everything I find 👍
You were right. I used your Transformer class and Octuple with some changes, and it seems to work now. The only problem is that I am struggling with the predict function: how would you use it to generate a sequence without a primer melody?
PS: Should I open another issue for this, or has this one drifted too far?
Yes I am 99% confident this error was caused by the “flattening”.
If this takes you too much time, maybe you could just switch to a 1D representation (REMI, Structured, etc.). I currently have things running; when it’s done I’ll try this version of Octuple.
If this can help you, here is how to compute the several losses:
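Something along these lines, as an untested sketch assuming the model returns one logits tensor per token type (all names below are placeholders):
from torch.nn import CrossEntropyLoss

criterion = CrossEntropyLoss()

def compute_losses(all_logits, target):
    # all_logits: list of 8 tensors of shape (batch, seq_len, vocab_size_i), one per token type
    # target: LongTensor of shape (batch, seq_len, 8), one column per token type
    losses = [criterion(logits.transpose(1, 2), target[..., i])
              for i, logits in enumerate(all_logits)]
    return sum(losses) / len(losses)  # average the per-type cross-entropies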
Thank you !
By tokens I am referring to a token sequence produced by the model (a list of lists of integers in the case of Octuple).
I looked at the GPT2Model from Hugging Face, and the problem (for us here) is that it automatically comes with an Embedding layer, so it can’t be used with multiple embeddings.
But if you are using PyTorch, the Transformer module is almost exactly the same. Here is how to create the model, with multi input / output modules for Octuple (I did not test it as is, I just wrote it from code blocks I had):
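A sketch of what such a module can look like (untested; the hyper-parameter values and class name are placeholders):
from torch import nn

class OctupleTransformer(nn.Module):
    def __init__(self, num_classes, d_model=512, nhead=8, num_layers=6, dim_feedforward=2048):
        # num_classes: list with the vocabulary size of each of the 8 token types
        super().__init__()
        self.embeddings = nn.ModuleList([nn.Embedding(n, d_model) for n in num_classes])
        encoder_layer = nn.TransformerEncoderLayer(d_model, nhead, dim_feedforward, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers)
        self.to_logits = nn.ModuleList([nn.Linear(d_model, n) for n in num_classes])

    def forward(self, x, attn_mask=None):
        # x: (batch, seq_len, 8) integer tokens, one column per token type
        emb = sum(embedding(x[..., i]) for i, embedding in enumerate(self.embeddings))
        hidden = self.transformer(emb, mask=attn_mask)  # pass a causal mask for autoregressive training
        return [head(hidden) for head in self.to_logits]  # one logits tensor per token type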
And then create the model:
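Again just a sketch; how you get the vocabulary sizes depends on your setup, here I assume one vocabulary per token type as in the coming update:
num_classes = [len(vocab) for vocab in tokenizer.vocab]  # assumption: one vocabulary per token type
model = OctupleTransformer(num_classes)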
For your last question, by 443 and 580 do you mean the sums of the vocabularies? And yes, the sizes can change between datasets: as the durations of the files differ, the length of the Bar vocabulary will also differ.
Let me try this week; hopefully within a few days I can give you the results.
Hi @envilk, thanks for your comment and for this bug report ! I’ll look into it in the next few days to fix it.
My guess is that the decoded token is not of type TimeSignature (3.6.8 looks like a Duration token). A check might solve it.
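Something along these lines could do it (just a sketch, not the actual library code; the exact token type name may differ):
last_event = self.tokens_to_events(tokens[0])[-1]
if self.additional_tokens['TimeSignature'] and last_event.type == 'TimeSig':
    time_sig = self._parse_token_time_signature(last_event.value)
else:  # fall back to the default time signature
    time_sig = TIME_SIGNATURE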
Also, for the Octuple, CP Word and MuMIDI tokenizations, I will soon push an update so that each tokenizer has several vocabularies, one for each token type. This makes it easier to create Embedding layers of appropriate sizes, and lets a model return several sequences of logits of the associated sizes.
Nathan