====== Transformer token principles ======

===== Jinja template code rendered during tokenization =====

<code>
{%- if tools %}
<|im_start|>system
{%- if messages[0]['role'] == 'system' %}
{{ messages[0]['content'] }}
{%- else %}
You are a helpful assistant.
{%- endif %}
# Tools
You may call one or more functions to assist with the user query.
You are provided with function signatures within XML tags:
{%- for tool in tools %}
{{ tool | tojson }}
{%- endfor %}
For each function call, return a json object with function name and arguments within XML tags:
{"name": , "arguments": }
<|im_end|>
{%- else %}
{%- if messages[0]['role'] == 'system' %}
<|im_start|>system{{ messages[0]['content'] }}<|im_end|>
{%- else %}
<|im_start|>systemYou are a helpful assistant.<|im_end|>
{%- endif %}
{%- endif %}
{%- for message in messages %}
{%- if (message.role == "user") or (message.role == "system" and not loop.first) or (message.role == "assistant" and not message.tool_calls) %}
<|im_start|>{{ message.role }}{{ message.content }}<|im_end|>
{%- elif message.role == "assistant" %}
<|im_start|>{{ message.role }}
{%- if message.content %}
{{ message.content }}
{%- endif %}
{%- for tool_call in message.tool_calls %}
{%- if tool_call.function is defined %}
{%- set tool_call = tool_call.function %}
{%- endif %}
{{ '\n\n{"name": "' + tool_call.name + '", "arguments": ' + tool_call.arguments | tojson + '}\n' }}
{%- endfor %}
<|im_end|>
{%- elif message.role == "tool" %}
{%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != "tool") %}
<|im_start|>user
{%- endif %}
{{ '\n\n' + message.content + '\n' }}
{%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
<|im_end|>
{%- endif %}
{%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
<|im_start|>assistant
{%- endif %}
</code>

With the system message ''You are a helpful assistant.'' and the user message ''Hello, please introduce yourself.'' (no tools, ''add_generation_prompt'' enabled), the template renders to:

<code>
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello, please introduce yourself.<|im_end|>
<|im_start|>assistant
</code>
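
The no-tools path of the template above can be traced by hand in a few lines of plain Python. This is only a rough sketch, not the template engine itself: ''render_chatml'' and ''DEFAULT_SYSTEM'' are hypothetical names, tool calls and tool messages are left out, and the newline after each role name and after each ''<|im_end|>'' is an assumption based on the rendered output shown above.

```python
# Hypothetical stand-in for the no-tools branch of the chat template.
# Handles plain system/user/assistant messages only (no tool calls).
DEFAULT_SYSTEM = "You are a helpful assistant."


def render_chatml(messages, add_generation_prompt=True):
    """Return the prompt string for a list of {"role", "content"} dicts."""
    # Like the template, fall back to a default system prompt
    # when the first message is not a system message.
    if messages and messages[0]["role"] == "system":
        system_text = messages[0]["content"]
        rest = messages[1:]  # the template skips it via `not loop.first`
    else:
        system_text = DEFAULT_SYSTEM
        rest = messages

    # Assumed layout: role, newline, content, end marker, newline.
    parts = [f"<|im_start|>system\n{system_text}<|im_end|>\n"]
    for message in rest:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )

    # The generation prompt opens an assistant turn for the model to complete.
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


conversation = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello, please introduce yourself."},
]
print(render_chatml(conversation))
```

The resulting string is what gets split into tokens afterwards; the ''<|im_start|>'' and ''<|im_end|>'' markers delimit each turn, and the trailing open assistant turn is where generation continues.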