continue: Allow configuring asyncio timeout in continue server
Is your feature request related to a problem? Please describe.
I am running inference on a local LLM in CPU RAM, so inference can be slow. I am using the extension to automate non-time-critical work (inserting comments on code that I’ve just written), but the request often takes long enough that asyncio throws a TimeoutError before it finishes, and the entire suggestion is tossed.
```
Traceback (most recent call last):
  File "continuedev/src/continuedev/core/autopilot.py", line 368, in _run_singular_step
    observation = await step(self.continue_sdk)
  File "continuedev/src/continuedev/core/main.py", line 359, in __call__
    return await self.run(sdk)
  File "continuedev/src/continuedev/plugins/steps/chat.py", line 98, in run
    async for chunk in generator:
  File "continuedev/src/continuedev/libs/llm/ggml.py", line 105, in _stream_chat
    async for chunk in generator():
  File "continuedev/src/continuedev/libs/llm/ggml.py", line 79, in generator
    async for line, end in resp.content.iter_chunks():
  File "aiohttp/streams.py", line 51, in __anext__
  File "aiohttp/streams.py", line 433, in readchunk
  File "aiohttp/streams.py", line 303, in _wait
  File "aiohttp/helpers.py", line 721, in __exit__
asyncio.exceptions.TimeoutError
```
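This failure mode is consistent with aiohttp’s default client timeout (`ClientTimeout(total=300)`, i.e. five minutes per request), which the stream reader enforces in `streams._wait`. Below is a minimal sketch of the streaming pattern involved, not Continue’s actual code; the function name, URL, and payload are placeholders:

```python
# Sketch only: shows how an explicit ClientTimeout lifts aiohttp's
# default five-minute total deadline on a streaming request.
import aiohttp

async def stream_completion(url: str, payload: dict, timeout_seconds=None):
    # total=None disables aiohttp's overall deadline for the request;
    # a number sets the deadline in seconds.
    timeout = aiohttp.ClientTimeout(total=timeout_seconds)
    async with aiohttp.ClientSession(timeout=timeout) as session:
        async with session.post(url, json=payload) as resp:
            # iter_chunks() yields (bytes, end_of_http_chunk) pairs;
            # this is the call that raised TimeoutError in the traceback.
            async for chunk, _end in resp.content.iter_chunks():
                yield chunk
```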
Describe the solution you’d like
I get why the timeout exists, but I’d like a configuration option that allows setting the timeout value (or 0 to disable it).
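If such a setting were added, the “0 disables it” convention could map onto aiohttp’s notion of “no timeout” (`total=None`). A hypothetical helper, not part of Continue:

```python
import aiohttp

def timeout_from_setting(seconds: float) -> aiohttp.ClientTimeout:
    # Treat 0 (or any non-positive value) as "no timeout"; aiohttp
    # expresses that as total=None.
    return aiohttp.ClientTimeout(total=seconds if seconds > 0 else None)
```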
About this issue
- Original URL
- State: closed
- Created 10 months ago
- Reactions: 1
- Comments: 15 (10 by maintainers)
@shrikrishnaholla this is now ready, and on every LLM class. Appreciate the patience : )
@sestinj ah, I’ve been there. No worries. I’ve set it to a large number now and it’s no longer complaining. Let me know when you have a fresh version out. Love the quick responses. ❤️
@sestinj Tried it. Same result.
For context, I am loading a 7B LLaMA-based model through Oobabooga’s OpenAI-compatible interface.
@shrikrishnaholla I just released a new version (v0.0.362) that will allow you to do this:
GGML(..., timeout=None) for no timeout, or GGML(..., timeout=3600) for a one-hour timeout, for example.
First of all, this is a super cool use case. We’ve been thinking a lot about what better UI there might be for doing “async” tasks like this in the background; cool to see the first real example of usage.
And yes, I can add this as an argument. I’ll release a version soon and let you know.
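For reference, a sketch of how the released timeout parameter might be wired into ~/.continue/config.py. Only the `timeout` keyword on GGML is confirmed by the comments above; the import paths, the ContinueConfig/Models structure, and the `server_url` value are assumptions based on Continue’s Python configuration of that era:

```python
# ~/.continue/config.py (sketch; see assumptions noted above)
from continuedev.src.continuedev.core.config import ContinueConfig
from continuedev.src.continuedev.core.models import Models
from continuedev.src.continuedev.libs.llm.ggml import GGML

config = ContinueConfig(
    models=Models(
        # timeout=None disables the request timeout entirely;
        # timeout=3600 allows up to an hour per request.
        default=GGML(server_url="http://localhost:5000", timeout=3600),
    ),
)
```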