DeepSpeed: [RESOLVED] An error occurs when running DeepSpeed CPU inference

Describe the bug

I am using version 0.9.4, which already includes the two PRs below, but CPU inference is not working:

  • CUDA optional deepspeed ops: https://github.com/microsoft/DeepSpeed/pull/2507
  • Enable page-locked tensors without CUDA: https://github.com/microsoft/DeepSpeed/pull/2775

To Reproduce

  • code

    import deepspeed
    import torch

    # model, world_size, and args are defined earlier in the script
    torch_dtype = torch.bfloat16
    ds_engine = deepspeed.init_inference(
        model,
        tensor_parallel={
            "tp_size": world_size,
        },
        dtype=torch_dtype,
        replace_with_kernel_inject=True,
        save_mp_checkpoint_path=args.save_mp_checkpoint_path,
        # injection_policy=injection_policy,
    )
    model = ds_engine.module

  • error
    /opt/conda/envs/py310/lib/python3.10/site-packages/deepspeed/ops/op_builder/builder.py:545 in compute_capability_args

      542                 if cc not in ccs:
      543                     ccs.append(cc)
      544             ccs = sorted(ccs)
    ❱ 545             ccs[-1] += '+PTX'
      546         else:
      547             # Cross-compile mode, compile for various architectures
      548             # env override takes priority
    
  • In the linked code, this error always occurs when CUDA is not available (see the sketch below): https://github.com/microsoft/DeepSpeed/blob/7711bdbbd27c62ab4986f35c1ed01a0268fed92f/op_builder/builder.py#L528-L537
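
For context, here is a minimal sketch of how that code path appears to fail on a CPU-only machine. This is my reconstruction from the traceback, not DeepSpeed's exact source: with no visible GPU, torch.cuda.device_count() returns 0, the list of compute capabilities stays empty, and ccs[-1] raises an IndexError.

    import torch

    # Collect the compute capability of every visible GPU, then tag the
    # newest one with '+PTX' -- the pattern shown in the traceback above.
    ccs = []
    for i in range(torch.cuda.device_count()):  # 0 iterations on a CPU-only box
        major, minor = torch.cuda.get_device_capability(i)
        cc = f"{major}.{minor}"
        if cc not in ccs:
            ccs.append(cc)
    ccs = sorted(ccs)

    if ccs:
        ccs[-1] += '+PTX'
    else:
        # With no CUDA device, ccs stays empty, so the original
        # `ccs[-1] += '+PTX'` raises IndexError: list index out of range.
        print("No CUDA device detected; compute capability list is empty")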

Expected behavior

If there is a guide for CPU inference, such as an option that avoids this code path, please point me to it. I may simply have missed it, but it is hard to find information about CPU inference in the documentation.

System info (please complete the following information):

  • OS: Ubuntu 22.04
  • GPU count and types: CPU only
  • Python version: 3.10
  • DeepSpeed version: 0.9.4

About this issue

  • Original URL
  • State: closed
  • Created 10 months ago
  • Comments: 19 (6 by maintainers)

Most upvoted comments

I solved it by updating the GCC version.

From the error posted, DeepSpeed seems to be going through the CUDA path instead of the CPU path. Can you check the log to see whether there is a line like the following?

Setting ds_accelerator to cpu (auto detect)

There is a section in the DeepSpeed tutorial showing how to run a DeepSpeed model on CPU: https://www.deepspeed.ai/tutorials/accelerator-abstraction-interface/#run-deepspeed-model-on-cpu
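
A quick way to confirm which accelerator DeepSpeed auto-detected is the accelerator abstraction interface described in that tutorial. The snippet below is a sketch; the expected 'cpu' output assumes the CPU accelerator was detected correctly on a CPU-only machine.

    from deepspeed.accelerator import get_accelerator

    # Prints the accelerator DeepSpeed selected. On a CPU-only machine this
    # should report 'cpu'; if it reports 'cuda', DeepSpeed is still on the
    # CUDA path and will try to JIT-build CUDA ops, hitting the builder error.
    print(get_accelerator().device_name())

If the CUDA path is still being selected, the accelerator abstraction tutorial linked above also describes how the accelerator is chosen and how to run on CPU.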