FastChat: chatglm3 error: why are there multiple <|assistant|> and <|user|> tags in the generated data?
How should the API request payload and the stop parameter be formatted? The generated results are not correct.
deploy model: python -m fastchat.serve.model_worker --model-path chatglm3-6b
code:
import json
import requests

headers = {"Content-Type": "application/json"}
pload = {
    "model": "chatglm3-6b",
    # Prompt text: "Introduce Guangzhou"
    "prompt": "<|user|>\n 介绍下广州<|assistant|>",
    # Alternative stop settings (commented out):
    # "stop": [
    #     64795,
    #     64797,
    #     2,
    # ],
    # "stop": ["<|user|>", "<|observation|>", "</s>", "<|assistant|>"],
    "stop": "###",
    "max_new_tokens": 512,
}
response = requests.post("http://19***1:21002/worker_generate_stream", headers=headers, json=pload, stream=True, timeout=3)
# print(response.text)
# The worker streams JSON objects separated by null bytes.
for chunk in response.iter_lines(chunk_size=1024, decode_unicode=False, delimiter=b"\0"):
    if chunk:
        # print(chunk.decode("utf-8"))
        data = json.loads(chunk.decode("utf-8"))
        print(data["text"])
output:
[gMASK]sop <|user|>
介绍下广州<|assistant|> 广州,简称“粤”,省会、副省级市,是广东省会,地处中国南部、广东省中部、珠江三角洲北部。全市总面积11,396平方千米,人口约为1530万(截至2021年底)。广州自古以来就是商业和文化的中心,被称为“羊城”,因 historical landmarks such as the Forbidden City and Lenin Memorial Hall.广州还是一些 large-scale shopping centers such as the Chengdu Road, and is famous for its delicious cuisine, including dim sum, roasted goose, and other traditional dishes.
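Until the stop setting is sorted out on the server side, one workaround is to trim the text on the client. The following is only a minimal sketch, assuming the worker echoes the prompt in front of the reply (as in the output above) and that the role tags appear as literal strings in the returned text. `extract_reply` and `STOP_MARKERS` are hypothetical names introduced here for illustration, not part of FastChat.

```python
from typing import Iterable

# Role tags of the ChatGLM3 chat format plus the end-of-sequence marker.
# Assumption for illustration: adjust this list to the template you use.
STOP_MARKERS = ("<|user|>", "<|observation|>", "<|assistant|>", "</s>")


def extract_reply(
    text: str,
    prompt: str,
    echoed: bool = True,
    assistant_tag: str = "<|assistant|>",
    stop_markers: Iterable[str] = STOP_MARKERS,
) -> str:
    """Return only the first assistant reply from a generated text."""
    start = 0
    if echoed:
        # Skip one assistant tag in the text for each one in the prompt.
        # Counting tags instead of matching the whole prompt still works
        # if whitespace in the echoed prompt was changed.
        for _ in range(prompt.count(assistant_tag)):
            pos = text.find(assistant_tag, start)
            if pos == -1:
                break
            start = pos + len(assistant_tag)
    reply = text[start:]

    # Cut at the earliest stop marker found in the remaining text.
    cut = len(reply)
    for marker in stop_markers:
        pos = reply.find(marker)
        if pos != -1:
            cut = min(cut, pos)
    return reply[:cut].strip()
```

In the streaming loop this would be called as `extract_reply(data["text"], pload["prompt"])` instead of printing `data["text"]` directly. It only hides the extra turns; the model still spends tokens generating them, so a working stop condition on the worker remains the real fix.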
About this issue
- State: open
- Created 7 months ago
- Reactions: 3
- Comments: 15
OP, did you get any errors when testing the API? For example, when you call v1/model, is the returned data an empty list?
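A quick way to run that check is sketched below. It assumes the OpenAI-compatible API server is running in addition to the model worker and that it serves a model list at `/v1/models` in the usual OpenAI shape (`{"object": "list", "data": [...]}`). The base URL passed to `fetch_model_ids` and both function names are hypothetical.

```python
import json
import urllib.request


def model_ids(models_response: dict) -> list:
    """Extract model names from an OpenAI-style model list response."""
    return [entry["id"] for entry in models_response.get("data", [])]


def fetch_model_ids(base_url: str) -> list:
    """Query <base_url>/v1/models and return the listed model names."""
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=10) as resp:
        return model_ids(json.load(resp))


# Hand-written sample responses, not real server output.
sample = {"object": "list", "data": [{"id": "chatglm3-6b", "object": "model"}]}
print(model_ids(sample))                          # ['chatglm3-6b']
print(model_ids({"object": "list", "data": []}))  # []
```

If `fetch_model_ids(...)` returns an empty list, that would usually suggest no worker has registered with the controller, which is a different problem from the stop handling above.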