FastChat: chatglm3 error: why are there multiple <|assistant|> and <|user|> tags in the generated data?

What is the correct format for the API request payload and the "stop" field? The generated results are not correct.

deploy model: python -m fastchat.serve.model_worker --model-path chatglm3-6b

code:

import json

import requests

headers = {"Content-Type": "application/json"}
pload = {
    "model": "chatglm3-6b",
    # The prompt text means "Tell me about Guangzhou".
    "prompt": "<|user|>\n 介绍下广州<|assistant|>",
    # Alternative "stop" values (commented out):
    # "stop": [64795, 64797, 2],
    # "stop": ["<|user|>", "<|observation|>", "</s>", "<|assistant|>"],
    "stop": "###",
    "max_new_tokens": 512,
}
response = requests.post(
    "http://19***1:21002/worker_generate_stream",
    headers=headers,
    json=pload,
    stream=True,
    timeout=3,
)
# Each streamed chunk is parsed as a JSON object; chunks are split on NUL bytes.
for chunk in response.iter_lines(chunk_size=1024, decode_unicode=False, delimiter=b"\0"):
    if chunk:
        data = json.loads(chunk.decode("utf-8"))
        print(data["text"])
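
The loop above prints `data["text"]` once per chunk. If the worker sends the cumulative text in every chunk rather than only the newly generated tokens (an assumption about the worker's behavior, not something confirmed in this issue), the prompt and its role tags would be printed again with each chunk. A minimal sketch that keeps only the text of the final chunk; `last_text` is a hypothetical helper and the sample bytes are made up for illustration:

```python
import json


def last_text(raw: bytes, delimiter: bytes = b"\0") -> str:
    """Return the "text" field of the last non-empty JSON chunk in raw."""
    text = ""
    for chunk in raw.split(delimiter):
        if not chunk:
            continue  # skip the empty piece after a trailing delimiter
        data = json.loads(chunk.decode("utf-8"))
        text = data.get("text", text)
    return text


# Made-up sample: two cumulative chunks, each terminated by a NUL byte.
sample = (
    json.dumps({"text": "<|user|>\n hi<|assistant|> Hel"}).encode("utf-8") + b"\0"
    + json.dumps({"text": "<|user|>\n hi<|assistant|> Hello"}).encode("utf-8") + b"\0"
)
print(last_text(sample))
```

With a live response, the same idea amounts to overwriting a variable inside the loop and printing it once after the loop ends.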

output:

[gMASK]sop <|user|>
 介绍下广州<|assistant|> 广州,简称“粤”,省会、副省级市,是广东省会,地处中国南部、广东省中部、珠江三角洲北部。全市总面积11,396平方千米,人口约为1530万(截至2021年底)。广州自古以来就是商业和文化的中心,被称为“羊城”,因 historical landmarks such as the Forbidden City and Lenin Memorial Hall.广州还是一些 large-scale shopping centers such as the Chengdu Road, and is famous for its delicious cuisine, including dim sum, roasted goose, and other traditional dishes.
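
If the worker does not apply the "stop" value for this model (an assumption; the issue does not establish this), the reply can still be trimmed on the client. The sketch below drops the echoed prompt and cuts the reply at the first role tag. `extract_reply` and `STOP_MARKERS` are hypothetical names, and the markers are the ones from the commented-out "stop" list in the request code:

```python
STOP_MARKERS = ["<|user|>", "<|observation|>", "</s>", "<|assistant|>"]


def extract_reply(text: str, prompt: str, stop_markers=STOP_MARKERS) -> str:
    """Strip the echoed prompt, then truncate at the earliest stop marker."""
    # Drop everything up to and including the echoed prompt, if present.
    idx = text.find(prompt)
    if idx != -1:
        text = text[idx + len(prompt):]
    # Find the earliest position of any stop marker in the remaining text.
    cut = len(text)
    for marker in stop_markers:
        pos = text.find(marker)
        if pos != -1:
            cut = min(cut, pos)
    return text[:cut].strip()


prompt = "<|user|>\n Introduce Guangzhou<|assistant|>"
raw = "[gMASK]sop " + prompt + " Guangzhou is a city.<|user|> extra turn"
print(extract_reply(raw, prompt))
```

This only hides the extra turns on the client; the model still spends tokens generating them, so a server-side stop condition remains preferable where it works.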

About this issue

  • State: open
  • Created 7 months ago
  • Reactions: 3
  • Comments: 15

Most upvoted comments

OP, did you get any errors when testing the API? For example, when you call v1/models, is the returned data an empty list?
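
The check suggested in this comment can be sketched as follows, assuming an OpenAI-compatible API server is running in front of the worker. The base URL `http://localhost:8000` is a placeholder, and `model_ids` and `fetch_model_ids` are hypothetical helpers, not part of FastChat:

```python
import json
from urllib.request import urlopen


def model_ids(payload: dict) -> list:
    """Extract model ids from a parsed /v1/models response body."""
    return [m["id"] for m in payload.get("data", []) if "id" in m]


def fetch_model_ids(base_url: str = "http://localhost:8000") -> list:
    # base_url is a placeholder; point it at your own API server.
    with urlopen(f"{base_url}/v1/models", timeout=5) as resp:
        return model_ids(json.load(resp))
```

An empty list from `fetch_model_ids()` would suggest that no worker has registered a model with the server.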