clangd: clangd assertion failure when switching git branches

Unfortunately, I don’t have a foolproof way to reproduce this. It happens quite often when switching branches while working on https://github.com/systemd/systemd

Logs

LLVM ERROR: SmallVector unable to grow. Requested capacity (18446744073709551212) is larger than maximum value for size type (4294967295)
/lib64/libLLVM-12.so(_ZN4llvm3sys15PrintStackTraceERNS_11raw_ostreamEi+0x36)[0x7ffaa0a073f6]
/lib64/libLLVM-12.so(_ZN4llvm3sys17RunSignalHandlersEv+0x34)[0x7ffaa0a052e4]
/lib64/libLLVM-12.so(+0xc04466)[0x7ffaa0a05466]
/lib64/libpthread.so.0(+0x13a20)[0x7ffaa8a69a20]
/lib64/libc.so.6(gsignal+0x142)[0x7ffa9f90c2a2]
/lib64/libc.so.6(abort+0x116)[0x7ffa9f8f58a4]
/lib64/libLLVM-12.so(_ZN4llvm18report_fatal_errorERKNS_5TwineEb+0x8e)[0x7ffaa0953d2e]
/lib64/libLLVM-12.so(+0xb52e94)[0x7ffaa0953e94]
/lib64/libLLVM-12.so(+0xb918b9)[0x7ffaa09928b9]
/lib64/libLLVM-12.so(_ZN4llvm15SmallVectorBaseIjE8grow_podEPvmm+0xc1)[0x7ffaa09929d1]
/lib64/libclang-cpp.so.12(_ZN5clang17CharLiteralParserC1EPKcS2_NS_14SourceLocationERNS_12PreprocessorENS_3tok9TokenKindE+0xb29)[0x7ffaa6377439]
/lib64/libclang-cpp.so.12(_ZN5clang4Sema22ActOnCharacterConstantERKNS_5TokenEPNS_5ScopeE+0xdf)[0x7ffaa6cd73cf]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser19ParseCastExpressionENS0_13CastParseKindEbRbNS0_13TypeCastStateEbPb+0x364)[0x7ffaa641d4c4]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser19ParseCastExpressionENS0_13CastParseKindEbNS0_13TypeCastStateEbPb+0x3e)[0x7ffaa64202ce]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser25ParseAssignmentExpressionENS0_13TypeCastStateE+0x39)[0x7ffaa6422ec9]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser21ParseBraceInitializerEv+0x347)[0x7ffaa6440717]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser30ParseCompoundLiteralExpressionENS_9OpaquePtrINS_8QualTypeEEENS_14SourceLocationES4_+0xd8)[0x7ffaa6423058]
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
I[13:38:49.938] <-- textDocument/documentSymbol(1199)
/lib64/libclang-cpp.so.12(_ZN5clang6Parser20ParseParenExpressionERNS0_16ParenParseOptionEbbRNS_9OpaquePtrINS_8QualTypeEEERNS_14SourceLocationE+0x13ff)[0x7ffaa64297bf]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser19ParseCastExpressionENS0_13CastParseKindEbRbNS0_13TypeCastStateEbPb+0x711)[0x7ffaa641d871]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser19ParseCastExpressionENS0_13CastParseKindEbNS0_13TypeCastStateEbPb+0x3e)[0x7ffaa64202ce]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser25ParseAssignmentExpressionENS0_13TypeCastStateE+0x39)[0x7ffaa6422ec9]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser15ParseExpressionENS0_13TypeCastStateE+0xd)[0x7ffaa642400d]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser20ParseParenExpressionERNS0_16ParenParseOptionEbbRNS_9OpaquePtrINS_8QualTypeEEERNS_14SourceLocationE+0x6cd)[0x7ffaa6428a8d]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser19ParseCastExpressionENS0_13CastParseKindEbRbNS0_13TypeCastStateEbPb+0x711)[0x7ffaa641d871]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser19ParseCastExpressionENS0_13CastParseKindEbNS0_13TypeCastStateEbPb+0x3e)[0x7ffaa64202ce]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser25ParseAssignmentExpressionENS0_13TypeCastStateE+0x39)[0x7ffaa6422ec9]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser19ParseExpressionListERN4llvm15SmallVectorImplIPNS_4ExprEEERNS2_INS_14SourceLocationEEENS1_12function_refIFvvEEE+0x6f)[0x7ffaa6423a5f]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser28ParsePostfixExpressionSuffixENS_12ActionResultIPNS_4ExprELb1EEE+0xce0)[0x7ffaa6425360]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser19ParseCastExpressionENS0_13CastParseKindEbRbNS0_13TypeCastStateEbPb+0x2aa)[0x7ffaa641d40a]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser19ParseCastExpressionENS0_13CastParseKindEbNS0_13TypeCastStateEbPb+0x3e)[0x7ffaa64202ce]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser25ParseAssignmentExpressionENS0_13TypeCastStateE+0x39)[0x7ffaa6422ec9]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser15ParseExpressionENS0_13TypeCastStateE+0xd)[0x7ffaa642400d]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser18ParseExprStatementENS0_17ParsedStmtContextE+0x50)[0x7ffaa647fca0]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser42ParseStatementOrDeclarationAfterAttributesERN4llvm11SmallVectorIPNS_4StmtELj32EEENS0_17ParsedStmtContextEPNS_14SourceLocationERNS0_25ParsedAttributesWithRangeE+0x27b)[0x7ffaa647ce3b]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser27ParseStatementOrDeclarationERN4llvm11SmallVectorIPNS_4StmtELj32EEENS0_17ParsedStmtContextEPNS_14SourceLocationE+0x8c)[0x7ffaa647e4cc]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser26ParseCompoundStatementBodyEb+0x832)[0x7ffaa6483c02]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser26ParseFunctionStatementBodyEPNS_4DeclERNS0_10ParseScopeE+0xca)[0x7ffaa648695a]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser23ParseFunctionDefinitionERNS_17ParsingDeclaratorERKNS0_18ParsedTemplateInfoEPNS0_18LateParsedAttrListE+0x3df)[0x7ffaa64aaa4f]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser14ParseDeclGroupERNS_15ParsingDeclSpecENS_17DeclaratorContextEPNS_14SourceLocationEPNS0_12ForRangeInitE+0x83f)[0x7ffaa63fa95f]
/lib64/libclang-cpp.so.12(+0xadca05)[0x7ffaa64a5a05]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser24ParseExternalDeclarationERNS0_25ParsedAttributesWithRangeEPNS_15ParsingDeclSpecE+0x5a6)[0x7ffaa64ad7f6]
/lib64/libclang-cpp.so.12(_ZN5clang6Parser17ParseTopLevelDeclERNS_9OpaquePtrINS_12DeclGroupRefEEEb+0x158)[0x7ffaa64ae9e8]
/lib64/libclang-cpp.so.12(_ZN5clang8ParseASTERNS_4SemaEbb+0x229)[0x7ffaa63d8129]
/lib64/libclang-cpp.so.12(_ZN5clang14FrontendAction7ExecuteEv+0xc9)[0x7ffaa7b16f09]
clangd(+0x2b424e)[0x55bd175c924e]
clangd(+0x30d51a)[0x55bd1762251a]
clangd(+0x30a200)[0x55bd1761f200]
clangd(+0x48d492)[0x55bd177a2492]
/lib64/libLLVM-12.so(+0xc07b8b)[0x7ffaa0a08b8b]
/lib64/libpthread.so.0(+0x9299)[0x7ffaa8a5f299]
/lib64/libc.so.6(clone+0x43)[0x7ffa9f9cf353]
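
A side note on the first log line: the requested capacity is exactly 2^64 - 404, which is what a small negative length looks like once it has been converted to an unsigned 64-bit size. That is only one plausible reading of the number, not something the log proves. A minimal check of the arithmetic, assuming a 64-bit `size_t` on this machine (the helper name `as_signed` is made up for illustration):

```python
# Reinterpret the "requested capacity" from the LLVM ERROR line as a signed value.
SIZE_T_BITS = 64  # assumption: 64-bit size_t, suggested by the /lib64 paths

requested = 18446744073709551212  # copied from the LLVM ERROR line
size_type_max = 4294967295        # copied from the LLVM ERROR line


def as_signed(value: int, bits: int = SIZE_T_BITS) -> int:
    """Reinterpret an unsigned `bits`-wide integer as two's complement."""
    return value - (1 << bits) if value >= 1 << (bits - 1) else value


print(as_signed(requested))        # -404
print(size_type_max == 2**32 - 1)  # True
```

Under that reading, something computed a length of -404 from positions it believed were consistent with each other, and the growth request was rejected because it exceeds the 32-bit size type used by this SmallVector.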

System information

Output of clangd --version: 12.0.1

Editor/LSP plugin: vscode-clangd

Operating system: Fedora

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 17 (1 by maintainers)

Most upvoted comments

I could have sworn I pushed that patch to the repo, maybe I messed up somewhere?

Ah you’re right! It has indeed been committed, the commit message just didn’t have the Differential Revision field set (I’m guessing because you didn’t use arc to submit for review) and so the Phabricator review did not get closed by the commit.

So, @RishabhRD, it looks like this should be fixed in clangd 14.

@DaanDeMeyer @sam-mccall The patch at https://reviews.llvm.org/D114003 has been accepted – should we merge it to mitigate this issue for the time being?

We were trying to puzzle out what’s going on with the crash.

My understanding is it’s a skew between the AST and sources seen by SourceManager:

  • we build a preamble (PCH), which includes a macro with a number in its body
  • we build an AST from a file using the PCH and expanding the macro
  • while expanding the macro, the parser asks to re-lex the token (rather than store it in its lexed form and reuse that)
  • the PP does this by getting the spelling location and looking at the text there
  • the text comes from the SourceManager’s ContentCache::getBufferOrNone() -> FileManager::getBufferForFile() -> opening the file on disk.
    • if there were no PCH involved, the buffer would be in memory already (in clangd, we avoid mmap to ensure this is stable)
    • if the file is unchanged, we’re going to get the same result
    • but if you’ve switched branches, then you’re likely to see different content on disk than the preamble was built from
  • Now the parser “knows” there must be a valid number there, because it saw one before. But the content is changed, so the assertion is unsafe.
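
The sequence above can be imitated outside clang with a toy model. This is only a sketch under stated assumptions: `lex_char_literal` and `spelling` are hypothetical stand-ins, not clang APIs, and the character literal was chosen only because the stack trace goes through `CharLiteralParser`. The "preamble" remembers an offset and length for a token, the file is rewritten as a branch switch would do, and the re-read at the remembered location no longer yields a literal.

```python
import os
import tempfile


def lex_char_literal(text: str) -> str:
    """Toy stand-in for a literal parser: return the body of a 'x' literal."""
    if len(text) < 2 or text[0] != "'" or text[-1] != "'":
        raise ValueError(f"not a character literal: {text!r}")
    return text[1:-1]


def spelling(path: str, offset: int, length: int) -> str:
    """Re-read a token's text from disk at a remembered location."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length).decode()


with tempfile.TemporaryDirectory() as tmp:
    header = os.path.join(tmp, "macros.h")  # hypothetical header name

    # Branch A: the "preamble" is built and remembers where the token lives.
    branch_a = "#define SEP ','\n"
    with open(header, "w") as f:
        f.write(branch_a)
    offset, length = branch_a.index("','"), 3
    assert lex_char_literal(spelling(header, offset, length)) == ","

    # Branch B: same path, different content; the remembered offset is stale.
    with open(header, "w") as f:
        f.write("#pragma once\n#define SEP ';'\n")

    stale = spelling(header, offset, length)
    try:
        lex_char_literal(stale)
        skew_detected = False
    except ValueError:
        skew_detected = True

print(repr(stale), skew_detected)
```

The toy parser raises a clean error on the stale text. The point of the discussion is that the real parser skips that check because the token was already validated once, so the stale bytes reach code that assumes a well-formed literal.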

The idea of avoiding the assertion in D114003 makes sense to avoid the crash, but:

  • we’re still definitely going to parse the code wrong
  • it seems at least plausible there are other similar code paths where re-reading the wrong code leads to crashes

Ideally we’d find some way to make clang/clangd behave as if seeing a consistent view of the code. e.g. by embedding the source in the PCH, or caching it in the VFS layer, or avoiding the need for relexing, or…
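
To make the "caching it in the VFS layer" option concrete, here is a minimal sketch of a read-through cache that pins the bytes of a file the first time it is read. It is an illustration only: `SnapshotFS` and its methods are hypothetical names and do not correspond to any existing clang or LLVM interface.

```python
import os
import tempfile


class SnapshotFS:
    """Hypothetical read-through cache: the first read of a path pins its bytes."""

    def __init__(self) -> None:
        self._pinned = {}  # path -> bytes seen on first read

    def read(self, path: str) -> bytes:
        if path not in self._pinned:
            with open(path, "rb") as f:
                self._pinned[path] = f.read()
        return self._pinned[path]

    def invalidate(self) -> None:
        """Drop all pinned content, e.g. when the preamble is rebuilt."""
        self._pinned.clear()


with tempfile.TemporaryDirectory() as tmp:
    header = os.path.join(tmp, "macros.h")  # hypothetical header name
    with open(header, "wb") as f:
        f.write(b"#define SEP ','\n")

    fs = SnapshotFS()
    at_preamble_time = fs.read(header)

    # Simulate a branch switch rewriting the file on disk.
    with open(header, "wb") as f:
        f.write(b"#pragma once\n#define SEP ';'\n")

    at_relex_time = fs.read(header)  # still the pinned bytes
    fs.invalidate()
    after_rebuild = fs.read(header)  # fresh bytes from disk
```

The trade-off in this sketch is memory: every file read during the preamble build stays resident until the cache is invalidated, which is the cost of guaranteeing that later re-reads see the same bytes the preamble saw.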

No conclusion yet, still going to look at the patch which may make sense in its own right.