clangd: Source file chosen to infer compile commands for a header sometimes has the wrong language
Whether this qualifies as a bug report or a feature request is, I guess, open to discussion.
After reporting https://llvm.discourse.group/t/header-file-heuristics-issue/1749 and investigating further, I have come to the conclusion that the issue I am describing is probably just a logical consequence of how the clangd heuristics for header files interact with my project structure.
A short summary of the issue:
- My project contains C and C++ files.
- I have a compilation database which contains an entry for every C or C++ file, but no entry for header files, which I guess is quite standard.
- My C++ header files have the extension .hpp.
- In the clangd log (10.0.0 or 11.0.0 rc1), I can sometimes see entries of the following type:
  Updating file /path/to/Header.hpp with command inferred from /other/path/to/arbitrary.c
  I.e. clangd picks a C file instead of a C++ file.
Sometimes, however, clangd will pick a C++ file which does not necessarily include the header file in question. While I understand that picking the right translation unit for a header file is a little arbitrary (there could be several correct answers, with different compiler flags), it feels like picking a C++ file instead of a C file in the case above should be feasible, and could also be a non-negligible improvement for some use cases (like mine, because I would then get the correct flags, like -std=c++17 etc.).
It seems like the heuristics use some kind of distance in the file hierarchy between the header file and the inferred C/C++ file as a criterion, which I guess is fine as a general principle, but it would be nice if files of the same language as the header file (if clangd knows the language, which in my case is clear from the extension) were picked with a higher priority.
About this issue
- State: open
- Created 4 years ago
- Reactions: 13
- Comments: 35 (7 by maintainers)
Commits related to this issue
- [clangd] Use flags from open files when opening headers they include Currently our strategy for getting header compile flags is something like: A) look for flags for the header in compile_commands.j... — committed to llvm/llvm-project by sam-mccall 3 years ago
- [clangd] Adjust compile flags so they work when applied to other file(type)s. It's reasonable to want to use the command from one file to compile another. In particular, the command from a translatio... — committed to llvm/llvm-project by sam-mccall 3 years ago
@bradlarsen, thanks, really interesting reads. I was just trying to write down some thinking about header files in compilation databases when you posted your entry about compdb. My thinking was that, given the objectives of compilation databases, it would not have seemed entirely crazy to extend the file format to include translation unit dependencies.
Like:
or something similar.
That would seem better than making up compilation commands for header files when they do not really exist. And clangd’s (and others’!) job would be much simpler, because of course any proper build system has that information, since it needs it to ensure that TUs only get recompiled when updated.
I have also found this CMake issue, which I guess is relevant.
I would like to submit an additional use case related to this. I have header files with preprocessor conditions, meaning the active portions of a header file vary depending on which file includes it. I would like to be able to tell clangd: show me the analysis for this header file as seen by this source file or that source file, and be able to switch at runtime.
I imagine a flow like this:
Would that make sense?
While looking at solving this another way, I ran across the -M (--dependencies) clang flag, which dumps the headers used by a particular file. It really seems to me like the right solution is to have clangd infer commands for headers from the source files that use them! Perhaps we could mostly reuse infrastructure from that flag.
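To illustrate the idea, here is a minimal Python sketch that inverts make-style dependency output (the format printed by -M) into a header-to-sources map. The function names are made up for this example, and it assumes that the first prerequisite in each rule is the main source file and that paths contain no spaces or colons; it is not how clangd works internally.

```python
from __future__ import annotations

from collections import defaultdict


def parse_make_deps(text: str) -> tuple[str, list[str]]:
    """Parse one make-style rule ("target: dep dep ...").

    Assumption: no spaces or colons inside paths.
    """
    # A backslash at the end of a line continues the rule on the next line.
    joined = text.replace("\\\n", " ")
    target, _, deps = joined.partition(":")
    return target.strip(), deps.split()


def headers_to_sources(dep_outputs: list[str]) -> dict[str, list[str]]:
    """Map each header to the source files whose dependency list mentions it."""
    mapping: dict[str, list[str]] = defaultdict(list)
    for text in dep_outputs:
        _target, deps = parse_make_deps(text)
        if not deps:
            continue
        # Assumption: the first prerequisite is the main source file.
        source, headers = deps[0], deps[1:]
        for header in headers:
            mapping[header].append(source)
    return dict(mapping)
```

Feeding it the dependency output for every entry in the compilation database would give, for each header, the list of translation units that could serve as a source of flags.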
In your case, the issue is not that clangd is picking a wrong-language file from the CDB, but rather that it’s not using the CDB for out-of-tree headers at all.
For each open file, clangd looks for a CDB in the directory containing the file and its ancestor directories. If your CDB is in your workspace root and you open a file outside the workspace root, this algorithm therefore will not find the CDB for such files.
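The lookup described above can be sketched in a few lines of Python. This is a simplified model, not clangd's code: find_cdb_dir is a hypothetical name, and the sketch only checks each ancestor directory for a compile_commands.json file.

```python
from pathlib import Path
from typing import Optional


def find_cdb_dir(source_file: Path) -> Optional[Path]:
    """Return the nearest ancestor directory holding compile_commands.json."""
    start = source_file.resolve().parent
    # Walk from the file's own directory up to the filesystem root.
    for directory in [start, *start.parents]:
        if (directory / "compile_commands.json").is_file():
            return directory
    return None
```

With a CDB in the workspace root, a file under workspace/src resolves to the workspace directory, while a header in a sibling tree never passes through that directory on its way up, so the workspace CDB is not found for it.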
You can force clangd to use a CDB for all files, including out-of-tree files, by passing --compile-commands-dir=<directory> as a command-line argument to clangd, where <directory> is the directory containing your CDB.
Hi guys, is there any news on this issue? I’m trying to use the VSCode extension together with a GCC cross-compiler for an embedded target. My project contains several files whose code relies on the compiler type, and since headers are not present in the compilation database, Clang tries to pretend to be the default compiler of the OS (in my case MSVC), which leads to IntelliSense issues. It’s possible to partially overcome the problem with clangd.fallbackFlags, but then Clangd is not able to find standard headers, since --query-driver is not taken into account.
When choosing a file to infer the commands from, we always prefer one in the same language, but we only consider some candidates, so if there are no C++ files nearby and no C++ file has an identical name, we’ll choose a C file instead.
https://reviews.llvm.org/D87253 makes it so that if there’s even one C++ file in the project, that one would be preferred. I’m honestly not sure if that’s the right thing to do or not. Let’s see what others think.
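A toy model may make the trade-off easier to discuss. The Python sketch below is not clangd's implementation: the extension sets, the distance function, the max_distance threshold and the consider_all switch are all invented for illustration. It only mimics the behaviour described above, where the same-language preference applies within a limited candidate set, and contrasts it with considering every file in the project.

```python
from __future__ import annotations

from pathlib import PurePosixPath

CXX_EXTS = {".cpp", ".cc", ".cxx", ".hpp", ".hh", ".hxx"}
C_EXTS = {".c"}


def language(path: str) -> str | None:
    """Guess the language from the extension (".h" stays ambiguous)."""
    ext = PurePosixPath(path).suffix
    if ext in CXX_EXTS:
        return "c++"
    if ext in C_EXTS:
        return "c"
    return None


def distance(a: str, b: str) -> int:
    """Directory steps needed to get from a's directory to b's directory."""
    da = PurePosixPath(a).parent.parts
    db = PurePosixPath(b).parent.parts
    common = 0
    for x, y in zip(da, db):
        if x != y:
            break
        common += 1
    return (len(da) - common) + (len(db) - common)


def pick_proxy(header: str, cdb_files: list[str],
               max_distance: int = 2, consider_all: bool = False) -> str | None:
    """Pick a CDB file whose command is reused for the header (toy heuristic)."""
    stem = PurePosixPath(header).stem
    lang = language(header)
    if consider_all:
        candidates = list(cdb_files)
    else:
        # Only nearby files and files with the same base name are candidates.
        candidates = [f for f in cdb_files
                      if distance(header, f) <= max_distance
                      or PurePosixPath(f).stem == stem]
        candidates = candidates or list(cdb_files)
    # Same language first, then the smallest directory distance.
    return min(candidates,
               key=lambda f: (language(f) != lang, distance(header, f)),
               default=None)
```

For /p/lib/a/Header.hpp with a C file in the same directory and the only C++ file far away, the limited candidate set yields the C file, while consider_all=True yields the distant C++ file, at the price of flags that may have little to do with the header.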
@cpsauer You’re too kind 😃
This is a pretty minor distinction.
-Wunused apply in header files but not main files, clang warns on #include_next in the main file,
I’ve landed D116167, so now this bug is just about the original topic: preferring (more) files from the same language.
https://reviews.llvm.org/D116167 should address this recent issue raised by @cpsauer: explicitly enriching the database by simply copying the argv should work.
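As a rough illustration of "enriching the database by copying the argv", here is a Python sketch that adds one entry per header by cloning the entry of a chosen source file and changing only the "file" field. The function name and the header_to_source mapping are hypothetical; how you decide which source file represents a header is up to your build setup.

```python
from __future__ import annotations


def enrich_with_headers(entries: list[dict],
                        header_to_source: dict[str, str]) -> list[dict]:
    """Return the CDB entries plus one cloned entry per listed header."""
    by_file = {entry["file"]: entry for entry in entries}
    extra = []
    for header, source in header_to_source.items():
        # Skip headers that already have an entry or whose source is unknown.
        if header in by_file or source not in by_file:
            continue
        clone = dict(by_file[source])  # shallow copy; the argv is left untouched
        clone["file"] = header
        extra.append(clone)
    return entries + extra
```

The result can be written back with json.dump. Note that the copied command still names the original source file in its arguments; the sketch relies on clangd adjusting such a command when it is applied to a different file, as the comment above describes.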
Regarding other ideas discussed here:
So I think we should maybe-fix the second bullet (to be discussed in review) and then close this bug.
I’m having the same problem but with cross-compilation.
I’m compiling a Yocto project. One of my files includes the C++ “vector” header. In my IDE (vscode), it opens the correct file (the header used by Yocto’s cross-compiler, i.e. “…/<project>/recipe-sysroot/usr/include/…”). If I then try to open one of the files included by “vector”, e.g. “vector.tcc”, the file from my host system is opened instead, i.e. “/usr/include/…”. This becomes really problematic for headers that don’t exist on the host, i.e. most library headers besides STL/C++/C.
Hi @HighCommander4! It’s partially related to this issue. I don’t have any C++ files in my CDB and still Clangd is trying to parse an external header file as a C++ header with fallback flags. Here are parts from the verbose log:
Header file
C file
As you can see with the main.c file, Clangd correctly used standard headers from the provided query driver, but with an out-of-tree header file it used C++ flags. Please let me know if you would like to see the entire log.
Clangd should still be picking some source file from the compilation database to infer compile commands from for a header.
The issue in this ticket is that it sometimes picks a source file of the wrong language (e.g. a C source file for a C++ header in a project that contains a mix of source files in both languages). Is that the case for your project?
If your issue is that clangd is not inferring a source file at all, that’s a different issue that can potentially be solved on your end. (For example, if the header is outside the directory containing the compilation database, clangd won’t use the CDB for it by default, but it can be made to with the --compile-commands-dir option.)
If you share a clangd log (from VSCode’s Output view, “Clangd Language Server” option in the dropdown), that should shed some light on things.