trlx: (perhaps) data preparation bug in RLHF
🐛 Describe the bug
I ran into several data preparation problems while running `summarize_rlhf`.
- In `get_prompt_dataset` (https://github.com/CarperAI/trlx/blob/f115eeaa3cfd2c997a345b8891b5f9427f1a08ee/examples/summarize_rlhf/trlx_gptj_text_summarization.py#L58), the current implementation first truncates the original prompt with `max_length-5`. However, I noticed that in some cases the tokenization of the prompt changes after `\nTL;DR:` is appended. The appended `\nTL;DR:` suffix then gets truncated itself, which finally leads to a KeyError at https://github.com/CarperAI/trlx/blob/f115eeaa3cfd2c997a345b8891b5f9427f1a08ee/examples/summarize_rlhf/trlx_gptj_text_summarization.py#L83
- Even after I fix this truncation bug in `get_prompt_dataset`, the current implementation still raises a KeyError at https://github.com/CarperAI/trlx/blob/f115eeaa3cfd2c997a345b8891b5f9427f1a08ee/examples/summarize_rlhf/trlx_gptj_text_summarization.py#L83 during PPO training.
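To make the failure mode concrete, here is a minimal sketch of one way to keep the prompt and the lookup key in sync. It is not the trlx code: `toy_tokenize`, `toy_detokenize`, `format_prompt`, `build_summary_lookup`, and `find_summary` are hypothetical names, and a whitespace splitter stands in for the real tokenizer. The idea is to truncate only the post body, append the suffix after truncation so it can never be cut off, and key the summary table by the exact string that formatting produces.

```python
SUFFIX = "\nTL;DR:"


def toy_tokenize(text):
    """Hypothetical stand-in for a real tokenizer: split on single spaces."""
    return text.split(" ")


def toy_detokenize(tokens):
    """Inverse of toy_tokenize."""
    return " ".join(tokens)


def format_prompt(post, max_tokens):
    """Truncate the post body only, then append the suffix untouched."""
    # Reserve room for the suffix before truncating the body.
    budget = max_tokens - len(toy_tokenize(SUFFIX))
    body = toy_detokenize(toy_tokenize(post)[:budget]).strip()
    return body + SUFFIX


def build_summary_lookup(posts, summaries, max_tokens):
    """Key the reference summaries by the exact prompts the model will see."""
    return {format_prompt(p, max_tokens): s for p, s in zip(posts, summaries)}


def find_summary(sample, lookup):
    """Recover the prompt from a generated sample and look it up.

    Returns None instead of raising KeyError when the prompt is unknown.
    """
    prompt = sample.split("TL;DR:")[0].strip() + SUFFIX
    return lookup.get(prompt)


posts = ["SUBREDDIT: r/test POST: one two three four five six"]
summaries = ["a short reference summary"]
lookup = build_summary_lookup(posts, summaries, max_tokens=6)

prompt = format_prompt(posts[0], max_tokens=6)
sample = prompt + " model output goes here"
print(find_summary(sample, lookup))
```

With a real subword tokenizer the decode and re-encode round trip may not be exact, so the same principle would apply at the string level: whatever function produces the final prompt string should also be the one that produces the dictionary keys.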
Which trlX version are you using?
main
Additional system and package information
transformers==4.26.0
About this issue
- State: closed
- Created a year ago
- Reactions: 1
- Comments: 35 (3 by maintainers)
The hack has produced a similar error:
KeyError: "SUBREDDIT: r/relationships\nTITLE: [Update 2] I [18 M] want to ask out a girl [18 F] out on a date, general tips needed.\nPOST: [Original](\n(Clarification on this one, I didn’t mean the one as the girl I wanted to marry)\nTL;DR: "
I’ve reproduced this problem before. The temporary (terrible) hack I used was to truncate the OpenAI summary keys to 272 chars. Something like:
Still looking for a proper fix…