mtail: file tailing not working when a relative log path name is passed to --logs flag

Hi!

Using the count_lines sample, since version rc4 I can’t get tailing to work. I get the right results if I run mtail in one_shot mode, but no metric on the Prometheus metrics endpoint.

log:

$ ./mtail_rc10 --progs="/apps/solidmon/agent/mtail/programs/" --logs="logs/a.log" -v=10 -logtostderr
I0405 06:10:19.350138   62463 main.go:108] mtail version v3.0.0-rc10 git revision 60ed333f3672ec835c09c507f095fdd0be6ec1d7 go version go1.9.4
I0405 06:10:19.350205   62463 main.go:109] Commandline: ["./mtail_rc10" "--progs=/apps/solidmon/agent/mtail/programs/" "--logs=logs/a.log" "-v=10" "-logtostderr"]
I0405 06:10:19.350429   62463 lexer.go:179] Emitting COUNTER at count_lines.mtail:2:23
I0405 06:10:19.350455   62463 lexer.go:179] Emitting ID at count_lines.mtail:2:9-18
I0405 06:10:19.350467   62463 lexer.go:179] Emitting NL at count_lines.mtail:3:20
I0405 06:10:19.350478   62463 lexer.go:179] Emitting DIV at count_lines.mtail:3:1
I0405 06:10:19.350506   62463 yaccpar:954] position marked at count_lines.mtail:3:1
I0405 06:10:19.350515   62463 driver.go:76] Entering regex
I0405 06:10:19.350521   62463 lexer.go:179] Emitting REGEX at count_lines.mtail:3:2
I0405 06:10:19.350530   62463 lexer.go:555] Exiting regex
I0405 06:10:19.350535   62463 lexer.go:556] Regex at line 2, startcol 2, col 2
I0405 06:10:19.350542   62463 lexer.go:179] Emitting DIV at count_lines.mtail:3:3
I0405 06:10:19.350554   62463 lexer.go:179] Emitting LCURLY at count_lines.mtail:3:5
I0405 06:10:19.350564   62463 lexer.go:179] Emitting NL at count_lines.mtail:4:7
I0405 06:10:19.350574   62463 lexer.go:179] Emitting ID at count_lines.mtail:4:3-12
I0405 06:10:19.350585   62463 lexer.go:179] Emitting INC at count_lines.mtail:4:13-14
I0405 06:10:19.350595   62463 lexer.go:179] Emitting NL at count_lines.mtail:5:16
I0405 06:10:19.350605   62463 lexer.go:179] Emitting RCURLY at count_lines.mtail:5:1
I0405 06:10:19.350618   62463 lexer.go:179] Emitting EOF at count_lines.mtail:5:2
I0405 06:10:19.350660   62463 checker.go:84] found sym &{line_count variable typeVar4 count_lines.mtail:2:9-18 <nil> 0 false}
I0405 06:10:19.350691   62463 types.go:285] Unifying Int and typeVar4
I0405 06:10:19.350700   62463 types.go:285] Unifying typeVar4 and Int
I0405 06:10:19.350707   62463 types.go:300] Making "typeVar4" type "Int"
I0405 06:10:19.350746   62463 loader.go:200] Loaded program count_lines.mtail
I0405 06:10:19.350755   62463 loader.go:219] Program count_lines.mtail has goroutine marker 0x636f756e
I0405 06:10:19.350785   62463 vm.go:640] Starting program count_lines.mtail
I0405 06:10:19.350800   62463 loader.go:223] Started count_lines.mtail
I0405 06:10:19.350849   62463 mtail.go:55] Tail pattern "logs/a.log"
I0405 06:10:19.350872   62463 tail.go:129] glob matches: [logs/a.log]
I0405 06:10:19.350931   62463 tail.go:222] Read: 0 EOF
I0405 06:10:19.350942   62463 tail.go:227] Suspected truncation.
I0405 06:10:19.350948   62463 tail.go:194] current seek position at 629148964
I0405 06:10:19.350955   62463 tail.go:204] File size is 629148964
I0405 06:10:19.350961   62463 tail.go:230] handletrunc with error 'no truncate appears to have occurred'
I0405 06:10:19.350975   62463 tail.go:404] EOF on first read
I0405 06:10:19.351001   62463 tail.go:423] Tailing logs/a.log
I0405 06:10:19.351137   62463 mtail.go:227] Listening on port :3903
I0405 06:10:27.916078   62463 log_watcher.go:72] watcher event "logs/a.log": WRITE
I0405 06:10:27.916197   62463 log_watcher.go:72] watcher event "logs/a.log": WRITE
I0405 06:10:27.916243   62463 loader.go:97] Skipping logs/a.log due to file extension.
I0405 06:10:27.916260   62463 loader.go:97] Skipping logs/a.log due to file extension.
I0405 06:10:27.916307   62463 tail.go:435] Event type watcher.UpdateEvent{Pathname:"logs/a.log"}
I0405 06:10:27.916334   62463 tail.go:435] Event type watcher.UpdateEvent{Pathname:"logs/a.log"}

At 06:10:27 I added a line to the tailed file using echo "Hello" >>logs/a.log. in the metrics endpoint I get:

# TYPE line_count counter
# line_count defined at count_lines.mtail:2:9-18
line_count{prog="count_lines.mtail"} 0

The same scenario works perfectly under version rc2:

$ ./mtail_rc2 --progs="/apps/solidmon/agent/mtail/programs/" --logs="logs/a.log" -v=10 -logtostderr  I0405 06:14:47.558366   63852 main.go:101] mtail version v3.0.0-rc2 git revision 5e6d38908091a8648c0f26c44ebd708e241f3814 go version go1.9.4
I0405 06:14:47.558424   63852 main.go:102] Commandline: ["./mtail_rc2" "--progs=/apps/solidmon/agent/mtail/programs/" "--logs=logs/a.log" "-v=10" "-logtostderr"]
I0405 06:14:47.558702   63852 loader.go:196] Loaded program count_lines.mtail
I0405 06:14:47.558713   63852 loader.go:215] Program count_lines.mtail has goroutine marker 0x636f756e
I0405 06:14:47.558730   63852 loader.go:218] Started count_lines.mtail
I0405 06:14:47.558772   63852 mtail.go:103] Tail pattern "logs/a.log"
I0405 06:14:47.558796   63852 tail.go:125] glob matches: [logs/a.log]
I0405 06:14:47.559063   63852 tail.go:349] Tailing /apps/solidmon/agent/mtail/logs/a.log
I0405 06:14:47.559087   63852 vm.go:715] Starting program count_lines.mtail
I0405 06:14:47.559131   63852 mtail.go:290] Listening on port :3903

with metrics output:

# TYPE line_count counter
# line_count defined at count_lines.mtail:2:9-18
line_count{prog="count_lines.mtail"} 1

In both versions running with the option one_shot works perfectly.

Am I doing something wrong or is tailing broken currently?

I’ve tried with the precompiled binaries and builds I made myself. I’ve tested in ubuntu 16.04, an old redhat and wsl on windows with Ubuntu 16.04.

About this issue

  • Original URL
  • State: closed
  • Created 6 years ago
  • Comments: 43 (22 by maintainers)

Commits related to this issue

Most upvoted comments

Well, following the traditions of our ancestors, this bug is hereby named “riclib”. 😃

woot!

Tagged as rc11

On 23 April 2018 at 22:41, Ricardo Liberato notifications@github.com wrote:

Tested and fix works. Thanks @jaqx0r https://github.com/jaqx0r !

riclib@libMac  ~/mt  clear riclib@libMac  ~/mt  echo date >> a.log riclib@libMac  ~/mt  echo date >> a.log riclib@libMac  ~/mt  curl http://localhost:3903/metrics

TYPE line_count counter

line_count defined at linecount.mtail:4:9-18

line_count{prog=“linecount.mtail”} 2 riclib@libMac  ~/mt  echo date >> a.log riclib@libMac  ~/mt  curl http://localhost:3903/metrics

TYPE line_count counter

line_count defined at linecount.mtail:4:9-18

line_count{prog=“linecount.mtail”} 3

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/google/mtail/issues/151#issuecomment-383561079, or mute the thread https://github.com/notifications/unsubscribe-auth/AC5b--8PG6FabZxmfdpEn6nF3W_qJcm7ks5trcwUgaJpZM4TIM1A .

bugs called riclib are like cockroaches, hard to get rid of @jaqx0r @jnovack 😉

Yep, that’s what I’m thinkgin too. I managed to reproduce the bug with a relative path to the logfile directly, instead of either an absolute path or a relative path to the directory holding the log, and I’m seeing log lines that indicate that the pattern match from fsnotify is different than what’s in the ‘watched’ map.

I’m trying to use filepath.Abs to resolve the absolute path of the files for comparison, but it’s making some other tests fail so the fix will take a little while.

I’m glad I was able to reproduce it here though. I didn’t realise before that you were using relative paths to a single file and that’s what it took to trigger the bug.

Seems to be a much more mundane error. If I put a full absolute path it works, if I put a relative path it doesn’t @jaqx0r. 😱

If I put /Home/riclib/mt/a.log (same file) as input to --logs than tailing works. mtail -logtostderr -v 1 -progs ./progs -logs /Users/riclib/mt/a.log --vmodule=tail=2,log_watcher=2

I’m not concerned about the old rc releases at this point. I’m concerned that you are suggesting there’s a inconsistency between v3.0.0-rc10-106-g517231f running in Docker, and v3.0.0-rc10-106-g517231f running as a native OSX binary.

Can you confirm (just hit the Thumbs Up), that v3.0.0-rc10-106-g517231f Docker WORKS as expected (does NOT have the issue), and v3.0.0-rc10-106-g517231f native OSX binary does NOT work (has the issue)?