go: cmd/compile: performance problems with many long, exported identifiers

What version of Go are you using (go version)?

1.7, 1.8+

What operating system and processor architecture are you using (go env)?

win64 and nix64

What did you do?

I tried to compile 150 MB of code in a single package. In practice it was a simple main.go that referenced this huge package, and I ran “go build”, but it is the compile executable that consumed the excessive resources. The problem persists even when I cut the code down to half that size (~75 MB). Note that disabling inlining and/or optimizations only delays the extreme resource usage; it does not eliminate it.

What did you expect to see?

A successful build with modest RAM usage that does not climb to extreme proportions, ideally one that “streams” the build so it is limited only by the disk space needed for the result.

What did you see instead?

Memory consumption proportional to code size, ending either in an out-of-memory failure (Win) or in swapping indefinitely until the process receives a kill signal (Nix).

I am afraid I cannot provide the large set of code right now. But if necessary, I can build a Go generator that emits a large number of interfaces, structs, and functions. Original thread: https://groups.google.com/forum/#!topic/golang-nuts/sBBkQ1_xf2Q.

About this issue

  • State: open
  • Created 7 years ago
  • Comments: 22 (15 by maintainers)

Most upvoted comments

I’ll leave a compile running overnight to see whether I can get some useful pprof output.

OnUnmappableCharacter__desc____obj__Java__nio__charset__CodingErrorAction__ret____obj__Java__nio__charset__CharsetEncoder – wow.

@laboger thanks, but to mangle Tolstoy, every interminable compile is interminable in its own way.

After the CLs above, time is reduced to a half hour, and max rss is down a bit:

     1461.67 real      1588.13 user       374.84 sys
11645718528  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
  38278408  page reclaims
        75  page faults
         0  swaps
       312  block input operations
       193  block output operations
         0  messages sent
         0  messages received
         3  signals received
    167410  voluntary context switches
  11764445  involuntary context switches
  • parse now takes 11s.
  • dumpobj now takes 7m.

For reference, here’s an alloc_space profile output:

alloc_space_after_fuse.pdf

Aside from the things I’ve already mentioned, disabling inlining would probably help noticeably now with memory usage.

There might be more optimizations available to speed up dumpobj further or to shrink the object file by reusing more strings somewhere, but I’ll leave that to @griesemer (export info) and @crawshaw (reflect info).
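To illustrate the general idea of reusing strings in serialized output, here is a minimal sketch of a string table that stores each distinct string once and hands out an index for every later reference. The `StringTable` type and its methods are hypothetical and are not the compiler's actual export or reflect data format.

```go
package main

import "fmt"

// StringTable stores each distinct string once and refers to it by index.
// Hypothetical illustration; not the format used by cmd/compile.
type StringTable struct {
	index map[string]int
	strs  []string
}

// NewStringTable returns an empty table.
func NewStringTable() *StringTable {
	return &StringTable{index: make(map[string]int)}
}

// Ref returns the index for s, adding s to the table on first use.
func (t *StringTable) Ref(s string) int {
	if i, ok := t.index[s]; ok {
		return i
	}
	i := len(t.strs)
	t.index[s] = i
	t.strs = append(t.strs, s)
	return i
}

// Strings returns the distinct strings in insertion order.
func (t *StringTable) Strings() []string { return t.strs }

func main() {
	tab := NewStringTable()
	names := []string{
		"Java__nio__charset__CharsetEncoder",
		"Java__nio__charset__CodingErrorAction",
		"Java__nio__charset__CharsetEncoder", // repeated: reuses the first entry
	}
	for _, n := range names {
		fmt.Println(tab.Ref(n), n)
	}
	fmt.Println("distinct strings stored:", len(tab.Strings()))
}
```

With many long, frequently repeated identifiers, each repeat costs one small integer instead of a full copy of the name.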

Thanks for the new test case; I’ll take a look at that later or tomorrow.

Here’s a CPU profile:

cf.pdf

Aha!

Hello, gc.testdclstack. This is not the first time we’ve had problems with testdclstack. See #14781. Robert suggested only enabling it in a special debug mode in the compiler. It is probably time to do that, perhaps even for Go 1.8. I’ll see about sending a CL soon.
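The pattern being proposed, running an expensive consistency check only in a debug mode, can be sketched as below. This is a toy model and not the compiler's code: `declStack`, `check`, `declareAll`, and the `debugCheckStack` switch are all hypothetical names, and the sketch assumes the check walks the whole stack each time it runs, which is what would make the total cost grow quadratically with the number of declarations.

```go
package main

import "fmt"

// debugCheckStack is a hypothetical debug switch; the expensive check
// runs only when it is true.
var debugCheckStack = false

// declStack is a toy declaration stack. An empty string stands in for a
// leftover scope marker, which the check treats as an error.
type declStack struct {
	entries []string
	scans   int // total entries visited by check, for illustration
}

func (s *declStack) push(name string) { s.entries = append(s.entries, name) }

// check walks the entire stack, so each call costs O(len(entries)).
func (s *declStack) check() error {
	for i, e := range s.entries {
		s.scans++
		if e == "" {
			return fmt.Errorf("unbalanced marker at index %d", i)
		}
	}
	return nil
}

// declareAll pushes n names and runs check after each push if checkEach is set.
func declareAll(n int, checkEach bool) (*declStack, error) {
	s := &declStack{}
	for i := 0; i < n; i++ {
		s.push(fmt.Sprintf("Name%d", i))
		if checkEach {
			if err := s.check(); err != nil {
				return s, err
			}
		}
	}
	return s, nil
}

func main() {
	always, _ := declareAll(1000, true)
	gated, _ := declareAll(1000, debugCheckStack)
	fmt.Println("entries visited with the check always on:", always.scans)
	fmt.Println("entries visited with the check gated off:", gated.scans)
}
```

In this model, checking after every one of n declarations visits n(n+1)/2 entries in total, while the gated version visits none unless the debug switch is turned on.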

With gc.testdclstack eliminated, the parse phase drops from 11m to 13s. Still waiting to see how much it helps elsewhere.

Eliminating gc.testdclstack won’t help with memory usage, though. My compile is still at 7 GB and growing. Changes to ssa.fuseBlockPlain may help; experimentation is still to come.

If it’s really very long identifiers that cause problems with the compile time, we should try to get to the bottom of this, rather than find work-arounds.

I don’t think it’s just very long exported identifiers. It is also the sheer number of them, and probably also some structural things (per the other comments I’ve made). Squinting at the profiles, the long identifiers account for maybe 10% of the memory issue; I just suggested shortening them to @cretz as a good, straightforward first step (and an experiment to confirm what I’m seeing).

No prob. I appreciate that y’all are leaving this open so it can be revisited in the future if anyone wants a crazy test case for compiler performance.