continue: Allow configuring asyncio timeout in continue server
Is your feature request related to a problem? Please describe.
I am running inference on a local LLM in CPU RAM, so inference can be slow. I am using the extension to automate non-time-critical work (inserting comments on code that I’ve just written), but the request often takes long enough that asyncio throws a TimeoutError before it finishes, and the entire suggestion is tossed.
```
Traceback (most recent call last):
  File "continuedev/src/continuedev/core/autopilot.py", line 368, in _run_singular_step
    observation = await step(self.continue_sdk)
  File "continuedev/src/continuedev/core/main.py", line 359, in __call__
    return await self.run(sdk)
  File "continuedev/src/continuedev/plugins/steps/chat.py", line 98, in run
    async for chunk in generator:
  File "continuedev/src/continuedev/libs/llm/ggml.py", line 105, in _stream_chat
    async for chunk in generator():
  File "continuedev/src/continuedev/libs/llm/ggml.py", line 79, in generator
    async for line, end in resp.content.iter_chunks():
  File "aiohttp/streams.py", line 51, in __anext__
  File "aiohttp/streams.py", line 433, in readchunk
  File "aiohttp/streams.py", line 303, in _wait
  File "aiohttp/helpers.py", line 721, in __exit__
asyncio.exceptions.TimeoutError
```
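This failure mode is consistent with aiohttp’s default client timeout (`ClientTimeout(total=300)`, i.e. five minutes per request), which the stream reader enforces in `streams._wait`. Below is a minimal sketch of the streaming pattern involved, not Continue’s actual code; the function name, URL, and payload are placeholders:

```python
# Sketch only: shows how an explicit ClientTimeout lifts aiohttp's
# default five-minute total deadline on a streaming request.
import aiohttp

async def stream_completion(url: str, payload: dict, timeout_seconds=None):
    # total=None disables aiohttp's overall deadline for the request;
    # a number sets the deadline in seconds.
    timeout = aiohttp.ClientTimeout(total=timeout_seconds)
    async with aiohttp.ClientSession(timeout=timeout) as session:
        async with session.post(url, json=payload) as resp:
            # iter_chunks() yields (bytes, end_of_http_chunk) pairs;
            # this is the call that raised TimeoutError in the traceback.
            async for chunk, _end in resp.content.iter_chunks():
                yield chunk
```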
Describe the solution you’d like
I get why the timeout exists, but I’d like a configuration option that allows setting the timeout value (or 0 to disable it).
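If such a setting were added, the “0 disables it” convention could map onto aiohttp’s notion of “no timeout” (`total=None`). A hypothetical helper, not part of Continue:

```python
import aiohttp

def timeout_from_setting(seconds: float) -> aiohttp.ClientTimeout:
    # Treat 0 (or any non-positive value) as "no timeout"; aiohttp
    # expresses that as total=None.
    return aiohttp.ClientTimeout(total=seconds if seconds > 0 else None)
```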
About this issue
- Original URL
- State: closed
- Created 10 months ago
- Reactions: 1
- Comments: 15 (10 by maintainers)
@shrikrishnaholla this is now ready, and on every LLM class. Appreciate the patience : )
@sestinj ah, I’ve been there. No worries. I’ve set it to a large number now and it’s no longer complaining. Let me know when you have a fresh version out. Love the quick responses. ❤️
@sestinj Tried it. Same result.
For context, I am loading a 7B LLaMA-based model through Oobabooga’s OpenAI-compatible interface.
@shrikrishnaholla I just released a new version (v0.0.362) that will allow you to do this:
GGML(..., timeout=None) for no timeout, or GGML(..., timeout=3600) for a one-hour timeout, for example.
First of all, this is a super cool use case. We’ve been thinking a lot about what better UI there might be for doing “async” tasks like this in the background; cool to see the first real example of usage.
And yes, I can add this as an argument. I’ll release a version soon and let you know.
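For reference, a sketch of how the released timeout parameter might be wired into ~/.continue/config.py. Only the `timeout` keyword on GGML is confirmed by the comments above; the import paths, the ContinueConfig/Models structure, and the `server_url` value are assumptions based on Continue’s Python configuration of that era:

```python
# ~/.continue/config.py (sketch; see assumptions noted above)
from continuedev.src.continuedev.core.config import ContinueConfig
from continuedev.src.continuedev.core.models import Models
from continuedev.src.continuedev.libs.llm.ggml import GGML

config = ContinueConfig(
    models=Models(
        # timeout=None disables the request timeout entirely;
        # timeout=3600 allows up to an hour per request.
        default=GGML(server_url="http://localhost:5000", timeout=3600),
    ),
)
```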