langchain: Issue: openai functions agent does not respect tools and arguments
Issue you’d like to raise.
When mixing the gpt-3.5-turbo-0613 model, the openai-functions agent, and the `PythonAstREPLTool` tool, GPT-3.5 stops respecting the tool name and the arguments hack introduced in the `OpenAIFunctionsAgent`.

The error log is:

```
Could not parse tool input: {'name': 'python', 'arguments': "len(cases_df['case_id'].unique())"} because the `arguments` is not valid JSON.
```

This means the model isn't following the function spec accurately. In my case, the confusion was always the same: the tool name comes back as `python` instead of `python_repl_ast`, and `arguments` contains the raw Python code instead of the requested object format with the `__arg1` attribute.
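To make the mismatch concrete, here is roughly what the agent expects the model's `function_call` to look like versus what it actually returns (illustrative values, reconstructed from the log above):

```python
# What the OpenAIFunctionsAgent expects: the registered tool name and a JSON
# string wrapping the code in the special `__arg1` argument.
expected_function_call = {
    "name": "python_repl_ast",
    "arguments": '{"__arg1": "len(cases_df[\'case_id\'].unique())"}',
}

# What gpt-3.5-turbo-0613 actually returns: a shortened tool name and the raw
# code in `arguments`, which is not valid JSON.
observed_function_call = {
    "name": "python",
    "arguments": "len(cases_df['case_id'].unique())",
}
```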
Suggestion:

I temporarily fixed it by:

1. Extending the `OpenAIFunctionsAgent` and overriding `_parse_ai_message` to handle the arguments confusion.
2. Extending the `PythonAstREPLTool` and altering its name and description a bit.
```python
from langchain.tools.python.tool import PythonAstREPLTool


class CustomPythonAstREPLTool(PythonAstREPLTool):
    name = "python"
    description = (
        "A Python shell. Use this to execute python commands. "
        "The input must be an object as follows: "
        "{'__arg1': 'a valid python command.'} "
        "When using this tool, sometimes output is abbreviated - "
        "Make sure it does not look abbreviated before using it in your answer. "
        "Don't add comments to your python code."
    )
```
```python
import json
from json import JSONDecodeError
from typing import Union

from langchain.agents.openai_functions_agent.base import _FunctionsAgentAction
from langchain.schema import AgentAction, AgentFinish, AIMessage, BaseMessage


def _parse_ai_message(message: BaseMessage) -> Union[AgentAction, AgentFinish]:
    """Parse an AI message."""
    if not isinstance(message, AIMessage):
        raise TypeError(f"Expected an AI message got {type(message)}")

    function_call = message.additional_kwargs.get("function_call", {})

    if function_call:
        function_name = function_call["name"]
        try:
            _tool_input = json.loads(function_call["arguments"])
        except JSONDecodeError:
            # Instead of raising, fall back to the raw string so the agent can
            # still run the code the model sent.
            print(
                f"Could not parse tool input: {function_call} because "
                f"the `arguments` is not valid JSON."
            )
            _tool_input = function_call["arguments"]

        # HACK HACK HACK:
        # The code that encodes tool input into Open AI uses a special variable
        # name called `__arg1` to handle old style tools that do not expose a
        # schema and expect a single string argument as an input.
        # We unpack the argument here if it exists.
        # Open AI does not support passing in a JSON array as an argument.
        if "__arg1" in _tool_input:
            tool_input = _tool_input["__arg1"]
        else:
            tool_input = _tool_input

        content_msg = "responded: {content}\n" if message.content else "\n"

        return _FunctionsAgentAction(
            tool=function_name,
            tool_input=tool_input,
            log=f"\nInvoking: `{function_name}` with `{tool_input}`\n{content_msg}\n",
            message_log=[message],
        )

    return AgentFinish(return_values={"output": message.content}, log=message.content)
```
```python
from typing import Any, List, Tuple

from langchain.agents.openai_functions_agent.base import (
    OpenAIFunctionsAgent,
    _format_intermediate_steps,
)
from langchain.callbacks.manager import Callbacks


class CustomOpenAIFunctionsAgent(OpenAIFunctionsAgent):
    def plan(
        self,
        intermediate_steps: List[Tuple[AgentAction, str]],
        callbacks: Callbacks = None,
        **kwargs: Any,
    ) -> Union[AgentAction, AgentFinish]:
        """Given input, decide what to do.

        Args:
            intermediate_steps: Steps the LLM has taken to date,
                along with observations.
            **kwargs: User inputs.

        Returns:
            Action specifying what tool to use.
        """
        user_input = kwargs["input"]
        agent_scratchpad = _format_intermediate_steps(intermediate_steps)
        prompt = self.prompt.format_prompt(
            input=user_input, agent_scratchpad=agent_scratchpad
        )
        messages = prompt.to_messages()
        predicted_message = self.llm.predict_messages(
            messages, functions=self.functions, callbacks=callbacks
        )
        # Use the custom parser above instead of the library's default.
        agent_decision = _parse_ai_message(predicted_message)
        return agent_decision
```
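A quick illustration of what the override does with the problematic payload from the log above (tool names and code as in my runs):

```python
# Feed the problematic function_call through the custom parser.
msg = AIMessage(
    content="",
    additional_kwargs={
        "function_call": {
            "name": "python",
            "arguments": "len(cases_df['case_id'].unique())",  # raw code, not JSON
        }
    },
)

action = _parse_ai_message(msg)
print(action.tool)        # "python" -> matches the renamed CustomPythonAstREPLTool
print(action.tool_input)  # the raw code string, passed through instead of raising
```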
Not sure if this will be improved on the API level, but it is worth looking into. Improving the fake argument names and tool names might also help, as it seems related to the issue.
About this issue
- Original URL
- State: open
- Created a year ago
- Reactions: 7
- Comments: 21 (8 by maintainers)
Hey @li-xiaohui, I had to initialize the agent manually; here is how.
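Roughly like the sketch below, assuming the `CustomPythonAstREPLTool` and `CustomOpenAIFunctionsAgent` from above and a DataFrame named `cases_df` (the model name and settings are just examples):

```python
from langchain.agents import AgentExecutor
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo-0613", temperature=0)

# Expose the DataFrame to the REPL tool so the generated code can use it.
tools = [CustomPythonAstREPLTool(locals={"cases_df": cases_df})]

prompt = CustomOpenAIFunctionsAgent.create_prompt()
agent = CustomOpenAIFunctionsAgent(llm=llm, tools=tools, prompt=prompt)

agent_executor = AgentExecutor.from_agent_and_tools(agent=agent, tools=tools, verbose=True)
agent_executor.run("How many unique case ids are there?")
```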
It’s a general performance issue with the model - it hallucinates and ignores instructions such as enum constraints. Using natural-language or pseudocode prompts to return JSON gives consistently better results. The solution is not to use functions.
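For example, something along these lines (a rough sketch, not a drop-in replacement for the agent):

```python
import json

from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

# Describe the tool and the required JSON shape in plain language instead of
# relying on the functions API.
prompt = """You have one tool available:
- python_repl_ast: runs a single Python expression and returns the result.

Reply ONLY with a JSON object of the form:
{"tool": "python_repl_ast", "tool_input": "<a valid python expression>"}

Question: How many unique case ids are in cases_df?"""

reply = llm([HumanMessage(content=prompt)])
action = json.loads(reply.content)  # parse the JSON the model was asked to emit
```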
I also encountered a similar problem (when using gpt-3.5-turbo), but this issue disappeared when I switched to using gpt4.
It also works better when you add an `args_schema` to `PythonAstREPLTool`. Code:
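A minimal sketch of what I mean (the schema class name and field description are just examples):

```python
from pydantic import BaseModel, Field

from langchain.tools.python.tool import PythonAstREPLTool


class PythonInputs(BaseModel):
    """Schema the model sees for the python tool (illustrative)."""

    query: str = Field(..., description="A valid python command to execute.")


tool = PythonAstREPLTool(args_schema=PythonInputs)
```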
This is really interesting, because it is supposed to be the other way around. This model was fine-tuned to return the requested JSON correctly, so its error rate should be lower than with a plain prompt. I don't know what they are doing internally, but I think they convert the functions list into a prompt in the same format they used to fine-tune the model, so it is really weird that we might get better results by crafting our own prompt instead of using functions.
This somehow makes sense, as I think the model does care about the function and parameter names, hence the confusion when it calls back the python tool. I believe improving this piece in langchain might yield better results, as I haven't had any issues since implementing the changes above.