spark: Support for event-based (non-blocking) request processing.
Currently, for request processing, SparkJava relies entirely on Jetty's HTTP thread pool (by default 8 to 200 threads).
That pool is non-blocking on the networking end, but blocking on the request-processing (business logic) side (Handlers/Filters/Matchers…). Any blocking operation (I/O-bound work, JDBC, etc.) can potentially exhaust Jetty's HTTP thread pool.
In that sense, Spark currently does not leverage the asynchronous Servlet 3.1 implementation that Jetty already provides.
To increase Spark's performance potential, the framework needs to support event-based (non-blocking) processing on its own thread pool.
This is easily achievable today with a combination of Servlet 3.1 and the Java 8 CompletableFuture API.
With this combination there is no need to integrate high-level frameworks such as Akka or RxJava.
The following sample code achieves the goal stated above:
/**
 * Simple Jetty Handler
 *
 * @author Per Wendel
 */
public class JettyHandler extends SessionHandler {

    private final Filter filter;

    public JettyHandler(Filter filter) {
        this.filter = filter;
    }

    @Override
    public void doHandle(String target,
                         Request baseRequest,
                         HttpServletRequest request,
                         HttpServletResponse response) throws IOException, ServletException {

        // Spark's wrapper around the raw servlet request
        HttpRequestWrapper wrapper = new HttpRequestWrapper(request);

        if (NOT_ASYNC) {
            // current behaviour: run Handlers/Filters on Jetty's HTTP thread
            filter.doFilter(wrapper, response, null);
        } else {
            // switch to Servlet 3.1 async mode and run the business
            // logic on Spark's own executor instead of Jetty's pool
            AsyncContext asyncContext = wrapper.startAsync();
            asyncContext.setTimeout(60000);
            CompletableFuture
                .runAsync(() -> {
                    try {
                        filter.doFilter(wrapper, response, null);
                    } catch (IOException | ServletException ex) {
                        throw new RuntimeException(ex);
                    }
                }, executor)
                .thenAccept(ignored -> {
                    baseRequest.setHandled(!wrapper.notConsumed());
                    asyncContext.complete();
                });
        }
    }
}
The `executor` above (Spark's own async executor) may look like this:

private static final ThreadPoolExecutor executor =
        new ThreadPoolExecutor(200, 200, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());

static {
    executor.allowCoreThreadTimeOut(true);
}
If you run a benchmark with a significant number of simultaneously active client sockets, and monitor the application's threads with the change above, you'll see that Jetty creates several times fewer HTTP threads (you can identify them by the "qtp" prefix) than it typically does, and those it does create alternate nicely between busy and parked. Instead, Spark's own thread pool is created, with all of its threads 100% busy under high load, which is exactly the goal of the event-based approach. And if one of Spark's own threads blocks, it will not degrade Jetty's performance.
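As a hedged aside, the thread counts described above can also be checked from inside the JVM itself. The helper below is a minimal sketch; the class and method names are my own, not part of Spark or Jetty:

```java
public class ThreadCount {

    // Count live JVM threads whose name starts with the given prefix.
    // Jetty's pool threads are named with a "qtp" prefix, so
    // countByPrefix("qtp") approximates the HTTP pool's current size.
    static long countByPrefix(String prefix) {
        return Thread.getAllStackTraces().keySet().stream()
                     .filter(t -> t.getName().startsWith(prefix))
                     .count();
    }

    public static void main(String[] args) {
        System.out.println("qtp threads: " + countByPrefix("qtp"));
    }
}
```

Calling this periodically during the benchmark shows the "qtp" count staying small while a custom executor's threads do the work.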
The idea is to cut the reliance on Jetty's HTTP thread pool, and not force Spark users to do the following:

public static void main(String[] args) {
    get("/benchmark", (request, response) -> {
        AsyncContext ac = request.raw().startAsync();
        CompletableFuture<?> cf = CompletableFuture.supplyAsync(() -> getMeSomethingBlocking());
        cf.thenAccept(result -> ac.complete());
        return "ok";
    });
}
About this issue
- State: open
- Created 8 years ago
- Reactions: 13
- Comments: 51 (21 by maintainers)
Commits related to this issue
- Issue #549 - async future tests — committed to mj1618/spark by mj1618 7 years ago
- Issue #549 - async impl — committed to mj1618/spark by mj1618 7 years ago
- Issue #549 - use correct response — committed to mj1618/spark by mj1618 7 years ago
- Issue #549 - test comment for sleep — committed to mj1618/spark by mj1618 7 years ago
- Issue #549 - remove unnecessary output — committed to mj1618/spark by mj1618 7 years ago
NOT AGAIN! Asynchronous Spark has already been beaten to death in another issue. -1500456 😦
Not all software we create is intended to grow big. Why waste energy in unblocking a piece of software that is blocking to begin with and then use it for stuff that does not grow big? Now THAT is a waste. Use spark for what it is good at. If you grow big you have time to switch to something that fits your future needs. Only using tools and libraries that are geared towards big-bigger-biggest is lunacy. Why would I want to gear up a bunch of replicated hardware with all the difficulties that brings with it for a system that only needs to process let’s say 10 requests per minute? Blocking IO in that case is a much easier to understand model and Spark is covering that excellently. Keep in mind that a lot of systems are small and that only the very lucky few will grow big. And those that do will go through lots of increments and lots of changes in the tooling.
In fact, if you say on the front page that spark can be used as a great replacement for node, you should provide this functionality… otherwise it’s not entirely true, is it?
I haven’t had time to look through it properly, but my first impression from skimming through the code was good. The internal change seems quite big, and it looks like some refactoring is needed. I’ll have a closer look next year (holidays soon). Ultimately it’s not up to me though.
Hey guys - just had a go at a basic implementation, just so we’ve got something concrete to look at. I know this isn’t good to merge and will get plenty of feedback, but at least it’s something to chat about.
https://github.com/perwendel/spark/pull/953
Basically if you return a CompletableFuture, then it will make the request async. Otherwise everything remains the same as before.
I did break up the two quite long functions, though I understand if this is too much change and I’d be happy to rewrite it based on your comments.
Thoughts?
#208 - spark is simple and blocking. that is why it is simple to use.
You know what? I give up.
I encountered this thread while researching the framework. I understand that the spark developer would trade async support for ease of usage (not suitable for some cases, but a good selling point).
What I don't understand is why you let people like @ruurd, who clearly doesn't understand how and why asynchronous processing can mitigate IO latency, keep trolling almost every single issue about async support. Frankly, I thought @ruurd was one of the spark maintainers (which is unbelievable). Read the above and #208, then judge for yourself.
Hehheh… I also have to find some time to do that. After doing my 8-hour shift at work.
@phongphan I disagree about @ruurd trolling and that he “clearly doesn’t understand”. His posts are a bit abrasive and exaggerated, but he makes some good points, and he understands the intention behind Spark. We’re considering async for v3, but we’ll only do it if we can maintain the simplicity Spark currently has.
Only if you can make it work without bothering @ruurd 😁
I know that seems hard since he’s bothered by people even discussing it, but if you manage to introduce it to the framework in a way that non-async people don’t notice it, I think it should be included.
I think @mlengle is right.

> no modern software should be blocking

This is the whole point. Blocking is simple, but it is a kind of waste. All of us agree blocking should be avoided if you want to grow big.

@ruurd @Lewiscowles1986 There's the Play framework, heavily built upon async principles; check it out if you're curious. There are things like streaming and websockets, which create long-living connections by definition. And yes, there are high-latency services, sometimes because of bad architecture, sometimes because of a huge amount of data to process, or heavy load on limited hardware resources, and storage systems can have their own failures. That's reality, and web frameworks have to deal with real-world problems, not with perfectly polished backend code. And we're talking here not about some ugly hack, but about a popular principle in modern software development: "don't block".
hey @ruurd yeah, I read your comments - you make some good points and I really respect your opinion.
Spark is simple and let's keep it that way - 100% agree with you on that; I think most people do. And given that most Java libraries are sync + blocking (e.g. the JDBC protocol itself is sync blocking), most request handlers are going to be the same. However, it does seem the community is keen to at least have the option of async on the odd occasion it comes in handy.
For me at least, I’d like to have the option to use an async library in my request handlers. E.g. an async http client, or a call to a streaming application such as kafka.
The reason is not to offload to another thread pool - I agree thread shifting is just unnecessary overhead. But for truly non-blocking calls (similar to what NIO provides) it can be better to not chew up a thread waiting for bits to go over the wire.
But considering lean principles I think it’s prudent to introduce only the most basic capability though - and to keep the default sync + blocking. Then we can see how the community responds - and see if it actually does open up any opportunities. If not - you can always remove it! A lot of great open source projects remove features that made sense at the time but turned out to be unnecessary, or were surpassed - e.g. GC in Rust.
One downside to `request.async()` is that there is still a required return value, so this might be confusing when doing async. Express gets around this by having the response passed to the `response` object and not having a return value. But that would change the protocol here.

How about we just handle the return value being a Future? That way it would change nothing about the current interfaces, so to the average sync+blocking handlers it would make no difference. And when we see a Future as the return value, we async the request and respond on the future completing?
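That dispatch rule can be sketched in a few lines. The names below are hypothetical stand-ins, not Spark's actual internals; only the "is the return value a Future?" decision matters, and the real change would live inside Spark's route-matching code:

```java
import java.util.concurrent.CompletableFuture;

public class FutureDispatchSketch {

    // stand-in for a Spark route handler
    interface Handler { Object handle(); }

    // If the handler returns a CompletableFuture, go async and respond when
    // it completes; otherwise keep today's sync + blocking behaviour.
    static CompletableFuture<String> dispatch(Handler handler) {
        Object result = handler.handle();
        if (result instanceof CompletableFuture) {
            // async path: the response body is written on future completion
            return ((CompletableFuture<?>) result).thenApply(String::valueOf);
        }
        // sync path: unchanged for existing handlers
        return CompletableFuture.completedFuture(String.valueOf(result));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(dispatch(() -> "sync-ok").get());
        System.out.println(dispatch(() ->
                CompletableFuture.supplyAsync(() -> "async-ok")).get());
    }
}
```

Existing handlers hit the second branch and behave exactly as before, which is what keeps the change invisible to sync+blocking users.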
If it’s ok with you @ruurd I’ll code up some sample code for you and @tipsy to review.
I think it’s good to have the option to respond to a request asynchronously. I don’t think it has to compromise the simplicity of the current blocking features.
Simply allowing the `async()` method will let people use libraries that provide callbacks instead of blocking for IO. In this case I couldn't find a way to make Spark allow the response to happen on this callback; `startAsync()` doesn't seem to work for me, the output stream seems to get closed anyway.

I think it's an important feature, as more and more Java libraries out there are moving towards non-blocking IO, and making this feature optional to Spark users means the framework has more flexibility. And as long as it's optional I don't see how it would make anything more complicated.
Tbh I don't see the point of another thread pool for this; that doesn't seem necessary to accomplish async.
Is this something you’re keen to have? If so I don’t mind doing some work on it.
@tipsy Because this is the completely wrong way to a non-blocking web layer 😃 The right way with Jetty is described in the docs: https://wiki.eclipse.org/Jetty/Feature/Continuations#Using_Continuations
So: we should have a separate async handler which suspends the continuation, the async-supporting controller code should return a CompletionStage, and we add a processing stage like this:
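The comment stops short of the actual code. As a hedged illustration of the shape it describes (suspend the request, then complete it when the controller's CompletionStage finishes), here is a self-contained sketch; Jetty's real Continuation needs a running server, so it is replaced by a plain CompletableFuture stand-in, and all names are illustrative:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

public class ContinuationSketch {

    // stand-in for the suspended servlet response
    static final CompletableFuture<String> suspendedResponse = new CompletableFuture<>();

    // async-supporting controller: returns a CompletionStage, never blocks here
    static CompletionStage<String> controller() {
        return CompletableFuture.supplyAsync(() -> "hello");
    }

    public static void main(String[] args) throws Exception {
        // handler: the request is "suspended"; when the stage completes,
        // write the result (or an error) and complete the response
        controller().whenComplete((result, error) ->
                suspendedResponse.complete(error == null ? result : "500"));

        // blocking get() here only to demonstrate the eventual response body
        System.out.println(suspendedResponse.get());
    }
}
```

With the real API, `suspendedResponse.complete(...)` corresponds to writing to the suspended response and calling the continuation's complete method.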
@ruurd Nice trolling, but no. And reread the link I've sent; it's not about async IO.