DeepSpeed: CUDA 11 not supported
I wanted to run DeepSpeed on an RTX 3090, which supports only CUDA 11. In your Docker release I updated the PyTorch version to 1.7.0 and hit this error:
building GPT2 model ...
Traceback (most recent call last):
File "pretrain_gpt2.py", line 711, in <module>
main()
File "pretrain_gpt2.py", line 659, in main
model, optimizer, lr_scheduler = setup_model_and_optimizer(args)
File "pretrain_gpt2.py", line 158, in setup_model_and_optimizer
model = get_model(args)
File "pretrain_gpt2.py", line 69, in get_model
parallel_output=True)
File "/data/Megatron-LM/model/gpt2_modeling.py", line 81, in __init__
checkpoint_num_layers)
File "/data/Megatron-LM/mpu/transformer.py", line 384, in __init__
[get_layer() for _ in range(num_layers)])
File "/data/Megatron-LM/mpu/transformer.py", line 384, in <listcomp>
[get_layer() for _ in range(num_layers)])
File "/data/Megatron-LM/mpu/transformer.py", line 380, in get_layer
output_layer_init_method=output_layer_init_method)
File "/data/Megatron-LM/mpu/transformer.py", line 259, in __init__
self.input_layernorm = LayerNorm(hidden_size, eps=layernorm_epsilon)
File "/usr/local/lib/python3.6/dist-packages/apex/normalization/fused_layer_norm.py", line 133, in __init__
fused_layer_norm_cuda = importlib.import_module("fused_layer_norm_cuda")
File "/usr/lib/python3.6/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 994, in _gcd_import
File "<frozen importlib._bootstrap>", line 971, in _find_and_load
File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 658, in _load_unlocked
File "<frozen importlib._bootstrap>", line 571, in module_from_spec
File "<frozen importlib._bootstrap_external>", line 922, in create_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
ImportError: /usr/local/lib/python3.6/dist-packages/fused_layer_norm_cuda.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe26detail37_typeMetaDataInstance_preallocated_32E
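For context, an ImportError like this with an undefined caffe2 symbol usually means the prebuilt `fused_layer_norm_cuda` extension was compiled against a different PyTorch build than the one now installed. A minimal diagnostic sketch (the `check_extension` helper is hypothetical, not part of apex or this repo):

```python
import importlib

def check_extension(name):
    """Attempt to import a compiled extension module.

    An ImportError mentioning 'undefined symbol' typically means the
    extension was built against a different PyTorch/CUDA combination
    than the one currently installed, so it needs to be rebuilt.
    """
    try:
        importlib.import_module(name)
        return "ok"
    except ImportError as exc:
        return "rebuild needed: {}".format(exc)

# The apex extension from the traceback above.
print(check_extension("fused_layer_norm_cuda"))
```

If this prints "rebuild needed", reinstalling the extension from source against the current PyTorch is the usual next step.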
How can I use it on the 3090?
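One common fix for this class of error, sketched here as an assumption based on apex's own install instructions rather than anything confirmed in this thread, is to rebuild apex from source against the PyTorch that is actually installed:

```shell
# Rebuild NVIDIA apex against the PyTorch currently in the environment.
pip uninstall -y apex
git clone https://github.com/NVIDIA/apex
cd apex
# --cuda_ext compiles the fused CUDA kernels, including fused_layer_norm_cuda.
pip install -v --no-cache-dir \
    --global-option="--cpp_ext" --global-option="--cuda_ext" ./
```

This requires a CUDA toolkit on the host matching the CUDA version PyTorch was built with (CUDA 11.x for the 3090).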
About this issue
- State: closed
- Created 4 years ago
- Comments: 15 (7 by maintainers)
@jeffra, I am excited. Yeah, it worked very well on the RTX 3090 with CUDA 11.0 and PyTorch 1.8.0 (or PyTorch 1.7.0). This issue can be closed. Thanks again.