FasterTransformer: GPT-NeoX HuggingFace Converter does not work
Branch/Tag/Commit
main
Docker Image Version
not-specific-to-docker-image
GPU name
all GPUs
CUDA Driver
n/a
Reproduced Steps
Merely running the example at https://github.com/NVIDIA/FasterTransformer/blob/main/examples/pytorch/gptneox/utils/huggingface_jp_gptneox_convert.py does not appear to work, even with the version of Transformers pinned to the one listed in the comment. This seems to be because the HuggingFace weights have names like `gpt_neox.layers.0.post_attention_layernorm.weight`, not the `transformer.X`-style names the code expects, as the snippet below illustrates.
Am I missing something here? It seems this converter does not apply to this model's configuration.
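A minimal sketch of the mismatch, assuming a recent `transformers` install; the checkpoint name is just an example of a GPT-NeoX-architecture model, not the one from the converter's comment:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-70m")

# Print the first few parameter names to show the actual prefix.
for name in list(model.state_dict())[:5]:
    print(name)

# Typical output follows the `gpt_neox.*` scheme, e.g.
#   gpt_neox.embed_in.weight
#   gpt_neox.layers.0.input_layernorm.weight
#   gpt_neox.layers.0.post_attention_layernorm.weight
# whereas the converter looks up keys with a `transformer.` prefix,
# so every lookup misses.
```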
About this issue
- Original URL
- State: open
- Created a year ago
- Comments: 17
https://github.com/TabbyML/tabby/blob/main/tabby/tools/converter/huggingface_gptneox_convert.py
This is a working version for generic GPT-NeoX models.
converted model: https://huggingface.co/TabbyML/NeoX-1.3B/tree/main/triton
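The core of any such fix is remapping the `gpt_neox.*` parameter names onto the layout the conversion code expects before slicing the tensors. A minimal sketch of that idea (the target names here are illustrative only, not the exact ones the linked converter emits):

```python
import re

def remap_key(hf_name: str) -> str:
    """Hypothetical helper: translate a HuggingFace GPT-NeoX parameter
    name to a `transformer.*`-style name. Source prefixes match current
    HuggingFace naming; target names are placeholders for whatever the
    converter actually expects."""
    # Per-layer weights: gpt_neox.layers.<i>.* -> transformer.layers.<i>.*
    hf_name = re.sub(r"^gpt_neox\.layers\.(\d+)\.",
                     r"transformer.layers.\1.", hf_name)
    # Embedding and final layer norm live outside the layer stack.
    hf_name = hf_name.replace("gpt_neox.embed_in.", "transformer.wte.")
    hf_name = hf_name.replace("gpt_neox.final_layer_norm.",
                              "transformer.final_layernorm.")
    return hf_name

# Usage: rename every tensor in a loaded state dict before conversion.
# state_dict = {remap_key(k): v for k, v in model.state_dict().items()}
```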