FasterTransformer: GPT-NeoX HuggingFace Converter does not work
Branch/Tag/Commit
main
Docker Image Version
not-specific-to-docker-image
GPU name
all GPUs
CUDA Driver
n/a
Reproduced Steps
Merely running the example at https://github.com/NVIDIA/FasterTransformer/blob/main/examples/pytorch/gptneox/utils/huggingface_jp_gptneox_convert.py does not appear to work, even with the version of Transformers pinned to the one listed in the comment. This seems to be because the HuggingFace weights have names like `gpt_neox.layers.0.post_attention_layernorm.weight`, not the `transformer.X`-style names the code expects, as the snippet below illustrates.
Am I missing something here? It seems this converter does not apply to this model's configuration.
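A minimal sketch of the mismatch, assuming a recent `transformers` install; the checkpoint name is just an example of a GPT-NeoX-architecture model, not the one from the converter's comment:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-70m")

# Print the first few parameter names to show the actual prefix.
for name in list(model.state_dict())[:5]:
    print(name)

# Typical output follows the `gpt_neox.*` scheme, e.g.
#   gpt_neox.embed_in.weight
#   gpt_neox.layers.0.input_layernorm.weight
#   gpt_neox.layers.0.post_attention_layernorm.weight
# whereas the converter looks up keys with a `transformer.` prefix,
# so every lookup misses.
```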
About this issue
- Original URL
- State: open
- Created a year ago
- Comments: 17
https://github.com/TabbyML/tabby/blob/main/tabby/tools/converter/huggingface_gptneox_convert.py
This is a working version for generic GPT-NeoX models.
converted model: https://huggingface.co/TabbyML/NeoX-1.3B/tree/main/triton
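The core of any such fix is remapping the `gpt_neox.*` parameter names onto the layout the conversion code expects before slicing the tensors. A minimal sketch of that idea (the target names here are illustrative only, not the exact ones the linked converter emits):

```python
import re

def remap_key(hf_name: str) -> str:
    """Hypothetical helper: translate a HuggingFace GPT-NeoX parameter
    name to a `transformer.*`-style name. Source prefixes match current
    HuggingFace naming; target names are placeholders for whatever the
    converter actually expects."""
    # Per-layer weights: gpt_neox.layers.<i>.* -> transformer.layers.<i>.*
    hf_name = re.sub(r"^gpt_neox\.layers\.(\d+)\.",
                     r"transformer.layers.\1.", hf_name)
    # Embedding and final layer norm live outside the layer stack.
    hf_name = hf_name.replace("gpt_neox.embed_in.", "transformer.wte.")
    hf_name = hf_name.replace("gpt_neox.final_layer_norm.",
                              "transformer.final_layernorm.")
    return hf_name

# Usage: rename every tensor in a loaded state dict before conversion.
# state_dict = {remap_key(k): v for k, v in model.state_dict().items()}
```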