process-compose: Too much log output seems to cause processes to freeze

Hi! We use this tool in our team to orchestrate a web application on our development machines. It’s been very pleasant so far.

One of the tools in our stack managed by process-compose is Hasura. Specifically, the process is a shell script that sets environment variables and then executes graphql-engine.

For the longest time, we’ve observed graphql-engine occasionally freezing up for no apparent reason. A process-compose restart is enough to get it to behave again. Today I discovered something new - the problem is:

  • Reliably reproducible by having graphql-engine output more logs (when running it through process-compose)
  • Impossible to reproduce with a lower graphql-engine verbosity (when running it through process-compose)
  • Impossible to reproduce when running graphql-engine outside of process-compose

One way I can run graphql-engine through process-compose without any freezing is to redirect stdout to a log file, i.e. adding 1>graphql-engine.log to the process shell script. Setting a log_location and modifying log_level in process-compose.yaml did not fix the freezing.
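
For reference, here is a minimal sketch of what that workaround looks like in config. The process name and script path are illustrative, not our actual setup:

# process-compose.yaml
processes:
  hasura:
    # Redirecting stdout to a file keeps process-compose from having to
    # scan these lines itself.
    command: "./run-graphql-engine.sh 1>graphql-engine.log"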

I can’t say whether the sheer amount of logs is what causes process-compose to choke or if something else is going on. If there is an error log of process-compose itself, or any other information that may be useful, let me know and I’ll try to provide it.

I’ll expand this issue if I discover anything more.

About this issue

  • State: open
  • Created a year ago
  • Reactions: 2
  • Comments: 17 (8 by maintainers)

Most upvoted comments

I ran into this issue as well. One of my processes logs an excessively long line, and it hangs while writing to stdout. I had a lot of goroutines in my process stuck on the zap logger trying to log to stdout (under a lock, presumably to prevent mangled lines).

I recompiled process-compose with pprof enabled and dumped the goroutines. At a glance, nothing seems to be deadlocked, but I do see a bunch of goroutines scanning lines (see https://gist.github.com/appaquet/d960f5f4bf7aae018971735c5f73b0a4), as expected.

I then tried increasing the max token size to a bigger value, and it does fix my issue:

diff --git a/src/app/process.go b/src/app/process.go
index f595e9c..cf070aa 100644
--- a/src/app/process.go
+++ b/src/app/process.go
@@ -4,7 +4,6 @@ import (
 	"bufio"
 	"context"
 	"fmt"
-	"github.com/f1bonacc1/process-compose/src/types"
 	"io"
 	"math/rand"
 	"os"
@@ -14,6 +13,8 @@ import (
 	"syscall"
 	"time"
 
+	"github.com/f1bonacc1/process-compose/src/types"
+
 	"github.com/f1bonacc1/process-compose/src/command"
 	"github.com/f1bonacc1/process-compose/src/health"
 	"github.com/f1bonacc1/process-compose/src/pclog"
@@ -335,6 +336,7 @@ func (p *Process) handleInput(pipe io.WriteCloser) {
 
 func (p *Process) handleOutput(pipe io.ReadCloser, handler func(message string)) {
 	outscanner := bufio.NewScanner(pipe)
+	outscanner.Buffer(make([]byte, 1024), 1024*1024)
 	outscanner.Split(bufio.ScanLines)
 	for outscanner.Scan() {
 		handler(outscanner.Text())

Perhaps making this max line size configurable per process could be a fix.
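
For anyone debugging this, here is a minimal, self-contained sketch (not process-compose code, just the standard-library behavior the diff above relies on) of why one over-long line can stall the whole process: bufio.Scanner gives up with ErrTooLong once a token exceeds its maximum size (bufio.MaxScanTokenSize, 64 KiB by default), the read loop exits, nobody drains the pipe anymore, the OS pipe buffer fills, and the child blocks on its next write to stdout.

package main

import (
	"bufio"
	"fmt"
	"strings"
)

func main() {
	// One "line" of 1 MiB with no newline, larger than the 64 KiB default limit.
	input := strings.Repeat("x", 1024*1024)

	// Default scanner: Scan returns false on the oversized token and the loop
	// exits immediately. In process-compose this is the point where the
	// child's output pipe stops being drained.
	s := bufio.NewScanner(strings.NewReader(input))
	for s.Scan() {
	}
	fmt.Println(s.Err()) // bufio.Scanner: token too long

	// With an enlarged buffer, as in the diff above, the same input scans fine.
	s = bufio.NewScanner(strings.NewReader(input))
	s.Buffer(make([]byte, 1024), 2*1024*1024) // 1 KiB initial, 2 MiB max
	for s.Scan() {
		fmt.Println("scanned", len(s.Text()), "bytes") // scanned 1048576 bytes
	}
	fmt.Println(s.Err()) // <nil>
}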

@dzmitry-lahoda Do you mind opening a separate issue for this scenario?

Not sure if that response was directed at me :) But did you try one big line, ~5 MB or so? Not many lines, but one big one?

I tried to cat a 1 GB file: OK. But when I replaced the \n characters in a 20 MB file, turning it into a single line, it didn’t go so well. Do you think it’s a realistic scenario for a single line to be 23040578 chars long?
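
(For scale: bufio’s default limit, MaxScanTokenSize, is 64 * 1024 = 65,536 bytes, so a 23040578-character line overshoots it by a factor of roughly 350.)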

Not sure that it’s the same issue @john-rodewald is facing.