ColossalAI: [BUG]: Memory consumption by fp16 is not normal
🐛 Describe the bug
When I use PyTorch's native AMP, the GPU memory usage is much lower than with ColossalAI. Why? The config is:
from colossalai.amp import AMP_TYPE
from colossalai.zero.shard_utils import TensorShardStrategy
from colossalai.nn.optimizer import HybridAdam
fp16 = dict(
mode=AMP_TYPE.TORCH,
)
optimizer = dict(
type=HybridAdam,
lr=0.001,
# weight_decay=1e-2,
)
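For context, in the legacy 0.1.x-era ColossalAI workflow a config like the one above is typically passed to `colossalai.launch_from_torch` and the run is wrapped in an Engine via `colossalai.initialize`. The sketch below is an assumption about how the "single machine + Engine" run was set up, not the actual training script from this issue: the model (torchvision's `resnet18` standing in for ir18), the dummy data standing in for the private dataset, and the config path are all placeholders.

```python
import colossalai
import torch
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import resnet18  # placeholder stand-in for the ir18 backbone
from colossalai.nn.optimizer import HybridAdam

# Launch using the config shown above (the path is a placeholder).
colossalai.launch_from_torch(config='./config.py')

model = resnet18(num_classes=10).cuda()
criterion = torch.nn.CrossEntropyLoss()
optimizer = HybridAdam(model.parameters(), lr=0.001)

# Dummy data standing in for the private dataset; batch size 64 as in the table below.
dataset = TensorDataset(torch.randn(1024, 3, 112, 112), torch.randint(0, 10, (1024,)))
train_dataloader = DataLoader(dataset, batch_size=64, shuffle=True)

# initialize() wraps model/optimizer/criterion in an Engine and applies the
# fp16 policy (AMP_TYPE.TORCH) declared in the config.
engine, train_dataloader, _, _ = colossalai.initialize(
    model=model,
    optimizer=optimizer,
    criterion=criterion,
    train_dataloader=train_dataloader,
)

engine.train()
for img, label in train_dataloader:
    img, label = img.cuda(), label.cuda()
    engine.zero_grad()
    output = engine(img)
    loss = engine.criterion(output, label)
    engine.backward(loss)
    engine.step()
```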
| model | dataset | machine | batch | gradient accumulate size | ZeRO | speed | GPU memory | optimizer | tensor_placement_policy | setup |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ir18 | private dataset | 1 | 64 | 1 | no ZeRO | 12.43 it/s (2089/8549, 24%) | 8703 MiB | HybridAdam | | single machine + Engine |
| ir18 | private dataset | 1 | 64 | 1 | no ZeRO | 11.17 it/s (1599/8549, 19%) | 5769 MiB | HybridAdam | | single machine + w/o Engine + PyTorch native fp16 |
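For comparison, the lower-memory run in the second row ("w/o Engine + PyTorch native fp16") presumably corresponds to a plain `torch.cuda.amp` training loop. A minimal sketch of that baseline, under the same assumptions as above (stand-in model and dummy data, HybridAdam kept as the optimizer to match the table):

```python
import torch
from torch.cuda.amp import GradScaler, autocast
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import resnet18  # placeholder stand-in for ir18
from colossalai.nn.optimizer import HybridAdam  # same optimizer as the Engine run, per the table

model = resnet18(num_classes=10).cuda()
criterion = torch.nn.CrossEntropyLoss()
optimizer = HybridAdam(model.parameters(), lr=0.001)
scaler = GradScaler()

dataset = TensorDataset(torch.randn(1024, 3, 112, 112), torch.randint(0, 10, (1024,)))
loader = DataLoader(dataset, batch_size=64, shuffle=True)

model.train()
for img, label in loader:
    img, label = img.cuda(), label.cuda()
    optimizer.zero_grad(set_to_none=True)
    with autocast():                      # fp16 autocast region
        loss = criterion(model(img), label)
    scaler.scale(loss).backward()         # scaled backward to avoid fp16 underflow
    scaler.step(optimizer)
    scaler.update()
```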
Environment
No response
About this issue
- State: closed
- Created 2 years ago
- Comments: 26 (11 by maintainers)
The command is