crewAI: Invalid Format: Missing 'Action:' after 'Thought
My agent keeps running into this error with almost every model I run locally (I tried llama2, openhermes, starling, and Mistral). The only model that didn't run into this problem was Mistral.
Very often this error is followed by another error:
“Error executing tool. Missing exact 3 pipe (|) separated values. For example, coworker|task|information.”
Whenever either of these errors appeared, I couldn't get any valid output. I was experimenting with simple examples like internet search with DuckDuckGo and a custom Reddit scraping tool. Also worth mentioning: I don't have these problems when I use OpenAI.
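For context, crewAI's ReAct-style loop expects the model to emit literal 'Thought:', 'Action:' and 'Action Input:' lines, and the built-in delegation tool expects three pipe-separated values in the input. A well-formed step looks roughly like this (reconstructed from the error messages; the tool name is illustrative):

```
Thought: I need background information before writing the newsletter.
Action: Delegate work to co-worker
Action Input: Researcher|Scrape the subreddit|We need today's top posts summarized
```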
I found another cause of this bug. When the task prompt is too strong, it changes some important (internal) keywords, like “Action:”, and the agent will fail to parse the text.
For example, I had a task asking the agent to use markdown headers to denote sections. It transformed “Action:” into “**Action:**”. To resolve this, I add the following to all of my task prompts:
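The exact instruction text is not preserved above, so here is a minimal sketch of the described workaround, with assumed wording for the prompt suffix and an illustrative agent/task pair:

```python
from crewai import Agent, Task

# Assumed wording: the original comment's exact suffix is not preserved.
# The point is a plain-text reminder so the model never reformats the
# internal keywords the parser looks for.
KEYWORD_GUARD = (
    "IMPORTANT: always write the keywords 'Thought:', 'Action:' and "
    "'Action Input:' as plain text. Never wrap them in markdown, "
    "e.g. never write **Action:**."
)

writer = Agent(
    role="Newsletter writer",
    goal="Turn scraped subreddit posts into a short newsletter",
    backstory="A concise technical writer.",
)

newsletter_task = Task(
    description=(
        "Write a short newsletter from the scraped posts, using "
        "markdown headers to denote sections.\n\n" + KEYWORD_GUARD
    ),
    expected_output="A markdown newsletter under 300 words.",
    agent=writer,
)
```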
Hey folks, finally catching up to this!
Indeed, smaller models do struggle with certain tools, especially more complex ones. I think there is room for us to optimize the crew prompts a bit; I'll look into that. But at the end of the day, smaller models do struggle with cognition.
I'm collecting data to fine-tune these models into agentic models trained to behave more like agents, which should make even small models far more reliable.
I think a good next action here might be to mention the best models in our new docs and to test slightly changing the prompts for smaller models. I'll take a look at that. Meanwhile I'm closing this one, but I'm open to reopening it if there are requests 😃
I had success running a simple crew with one function. Benchmarks of the different models, and whether they worked with function calling, are below. Hopefully this helps someone! All testing was done using LM Studio as the API server.
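For anyone reproducing that setup, here is a minimal sketch (my assumption of the wiring, not from the comment) of pointing crewAI at LM Studio's OpenAI-compatible server, which listens on http://localhost:1234/v1 by default:

```python
import os

# LM Studio exposes an OpenAI-compatible endpoint; crewAI reaches it
# through the standard OpenAI client configuration.
os.environ["OPENAI_API_BASE"] = "http://localhost:1234/v1"
os.environ["OPENAI_API_KEY"] = "lm-studio"  # any non-empty string works
# The model name must match whatever model is loaded in LM Studio.
os.environ["OPENAI_MODEL_NAME"] = "TheBloke/dolphin-2.2.1-mistral-7B-GGUF"
```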
Hello, I experienced the same issue with OpenHermes before, but since I set the temperature to 0.1, it works great.
I was having the looping problem before as well, but with Gemini Pro at a temperature of 0.6, all the issues are gone.
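For local models served through Ollama, a minimal sketch of passing a low temperature to a crewAI agent (assuming a crewAI version that accepts a LangChain LLM, and that the model is already pulled):

```python
from crewai import Agent
from langchain_community.llms import Ollama

# A low temperature makes the model more likely to reproduce the exact
# 'Thought:' / 'Action:' scaffolding instead of improvising around it.
llm = Ollama(model="openhermes", temperature=0.1)

researcher = Agent(
    role="Researcher",
    goal="Collect recent posts from a subreddit",
    backstory="Methodical and terse.",
    llm=llm,
)
```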
@kingychiu could you please tell us where exactly you changed the above? In my case the model fails to follow the instructions, so it never returns output in the format the agent tool expects.
Hi everyone, thank you all for replying and sharing your experiences. I want to share my observations; somebody might find them helpful and save some time.
Over the last 10 days, I've experimented with 15 different models. My laptop has 16 GB RAM, and my goal was for my agents to scrape data from a particular subreddit and turn it into a simple, short newsletter written in layman's terms.
Of those 15 models, only 2 were able to accomplish the task: GPT-4 and Llama 2 13B (base model).
The models I played with that failed were:
I tried tweaking my prompts and played with the Modelfile, setting all kinds of parameters, but the only conclusion I came to is: more parameters = more reasoning.
The agents failed because they either:
I have one more theory, but I can't test it due to insufficient RAM on my laptop. I wonder whether a model with 7B parameters but a 16K-token context window would be able to perform the task. In other words, would a bigger context window = more reasoning?
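That theory is at least easy to set up with Ollama's context-window parameter; a sketch under the same assumptions as above (values illustrative, and memory use grows with the window):

```python
from langchain_community.llms import Ollama

# num_ctx raises the context window: a 7B model with a 16K-token window
# is exactly the configuration the theory above asks about.
llm = Ollama(
    model="mistral",   # any 7B model pulled into Ollama
    temperature=0.1,
    num_ctx=16384,     # 16K-token context window
)
```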
With the @kingychiu hack, I got “Error executing tool. Missing exact 3 pipe (|) separated values.” I had to add “Action Input should be formatted as coworker|task|context.”
@kingychiu that worked. I'm running TheBloke/dolphin-2.2.1-mistral-7B-GGUF on LM Studio.
Sorry, I didn't check my email because I was on vacation. OpenHermes.
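Putting the two prompt fixes from this thread together, the suffix from the earlier sketch would grow to something like this (wording assumed, as before):

```python
# Hypothetical wording, echoing the hint quoted in the comment above.
PIPE_HINT = (
    "When delegating, Action Input should be formatted as "
    "coworker|task|context: exactly three pipe-separated values."
)

# newsletter_task is the Task from the earlier sketch.
newsletter_task.description += "\n" + PIPE_HINT
```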
I tried running a few 13B models (Llama 2 and Vicuna). I assumed that a bigger model = better results, but that wasn't the case. I think “losing track” is the right way to describe the issue. It looks like the local model totally forgets all the prompts and starts looping.