The commit f11b056c (#76304) breaks `clang` and other tools if they are
used from a RAMDrive. `GetFinalPathNameByHandleW()` may return 0 and
GetLastError 0x28. This patch fixes that issue. Note `real_path()` uses
`openFileForRead()` but it reports the error only if failed to open a
file. Getting `RealPath` is optional functionality.
BTW, `sys::fs::real_path()` resolves not only symlinks, but also network
drives and virtual drives created by the `subst` tool. It may break an
automation. It is better to detect symlinks and resolve only symlinks.
Fixes https://github.com/llvm/llvm-project/issues/83046
There is a race condition when calling `GetFileAttributesW` that can
cause it to return `ERROR_ACCESS_DENIED` on a path which exists, which
is unexpected for callers using this function to check for file
existence by passing `AccessMode::Exist`. This was manifesting as a
compiler crash on Windows downstream in the Swift compiler when using
the `-index-store-path` flag (more information in
https://github.com/apple/llvm-project/issues/8224).
I looked for alternate APIs to avoid bringing in `shlwapi.h`, but didn't
see any good candidates. I'm not tied at all to this solution, any
feedback and alternative approaches are more than welcome.
LLVM is inconsistent about how it converts `errno` to `std::error_code`.
This can cause problems because values outside of `std::errc` compare
differently if one is system and one is generic on POSIX systems.
This is even more of a problem on Windows where use of the system
category is just wrong, as that is for Windows errors, which have a
completely different mapping than POSIX/generic errors. This patch fixes
one instance of this mistake in `JSONTransport.cpp`.
This patch adds `errnoAsErrorCode()` which makes it so people do not
need to think about this issue in the future. It also cleans up a lot of
usage of `errno` in LLVM and Clang.
Relands #81708, which was reverted by
f410f74cd5b26319b5796e0404c6a0f3b5cc00a5, now with a corrected unit
test. Origionally the test failed on Windows when run with lit as
`GetConsoleWindow` could not retrieve a window handle regardless of
whether `DetachProcess` was `true` or `false`. The test now uses
`GetStdHandle(STD_OUTPUT_HANDLE)` which does not rely on a console
window existing. Original commit message below.
Adds a new parameter, `bool DetachProcess` with a default option of
`false`, to `llvm::sys::ExecuteNoWait`, which, when set to `true`,
executes the specified program without a controlling terminal.
Functionality added so that the module build daemon can be run without a
controlling terminal.
Adds a new parameter, `bool DetachProcess` with a default option of
`false`, to `llvm::sys::ExecuteNoWait`, which, when set to `true`,
executes the specified program without a controlling terminal.
Functionality added so that the module build daemon can be run without a
controlling terminal.
In acd8791c2619f2afc0347c1bff073b32fbffb5d6, a call to FlushFileBuffers
was added to work around a rare kernel bug. In
3b9b4d2156673edda50584086fbfb0d66460b4d2, the scope of that workaround
was limited, for performance reasons, as the flushes are quite
expensive.
On VirtualBox shared folders, closing a memory mapping that has been
written to, also needs to be explicitly flushed, if renaming the output
file before it is closed. Contrary to the kernel bug, this always
happens on such mounts. In these cases, the output ends up as a file of
the right size, but the contents are all zeros.
The sequence to trigger the issue on the VirtualBox Shared Folders is
this, summarized:
file = CreateFile()
mapping = CreateFileMapping(file)
mem = MapViewOfFile()
CloseHandle(mapping)
write(mem)
UnmapViewOfFile(mem)
SetFileInformationByHandle(file, FileRenameInfo)
CloseHandle(file)
With this sequence, the output file always ends up with all zeros. See
https://github.com/mstorsjo/llvm-mingw/issues/393 for a full
reproduction example.
To avoid this issue, call FlushFileBuffers() when the file may reside on
a VitualBox shared folder. As the flushes are expensive, only do them
when the output isn't on a local file system.
The issue with VirtualBox shared folders could also be fixed by calling
FlushViewOfFile before UnmapViewOfFile, and doing that could be slightly
less expensive than FlushFileBuffers.
Empirically, the difference between the two is very small though, and as
it's not easy to verify whether switching FlushFileBuffers to
FlushViewOfFile helps with the rare kernel bug, keep using
FlushFileBuffers for both cases, for code simplicity.
This fixes downstream bug
https://github.com/mstorsjo/llvm-mingw/issues/393.
Rather than conditionally using the output from GetFileAttributesW move
the branch to avoid calling GetFileAttributesW entirely if not required.
This avoids hitting IO an extra time for a small performance
improvement.
This makes the Windows implementation for `getMainExecutable()` behave
the same as its Linux counterpart, in regards to symlinks. Previously,
when using `cmake ... -DLLVM_USE_SYMLINKS=ON`, calling this function
wouldn't resolve to the "real", non-symlinked path.
This patch renames {starts,ends}with to {starts,ends}_with for
consistency with std::{string,string_view}::{starts,ends}_with in
C++20. Since there are only a handful of occurrences, this patch
skips the deprecation phase and simply renames them.
The FileIndex values returned from GetFileInformationByHandle are
considered stable and uniquely identifying a file, as long as the
handle is open. When handles are closed, there are no guarantees
for their stability or uniqueness. On some file systems (such as
NTFS), the indices are documented to be stable even across handles.
But with some file systems, in particular network mounts, file
indices can be reused very soon after handles are closed.
When such file indices are used for LLVM's UniqueID, files are
considered duplicates as soon as the filesystem driver happens to
have used the same file index for the handle used to inspect the
file. This caused widespread, non-obvious (seemingly random)
breakage. This can happen e.g. if running on a directory that is
shared via Remote Desktop or VirtualBox.
To avoid the issue, use a hash of the canonicalized path for the
file as unique identifier, instead of using FileIndex.
This fixes https://github.com/llvm/llvm-project/issues/61401 and
https://github.com/llvm/llvm-project/issues/22079.
Performance wise, this adds (usually) one extra call to
GetFinalPathNameByHandleW for each call to getStatus(). A test
cases such as running clang-scan-deps becomes around 1% slower
by this, which is considered tolerable.
Change the equivalent() function to use getUniqueID instead of
checking individual file_status fields. The
equivalent(Twine,Twine,bool& result) function calls status() on
each path successively, without keeping the file handles open,
which also is prone to such false positives. This also gets rid
of checks of other superfluous fields in the
equivalent(file_status, file_status) function - the unique ID of
a file should be enough (that is what is done for Unix anyway).
This comes with one known caveat: For hardlinks, each name for
the file now gets a different UniqueID, and equivalent() considers
them different. While that's not ideal, occasional false negatives
for equivalent() is usually that fatal (the cases where we strictly
do need to deduplicate files with different path names are quite
rare) compared to the issues caused by false positives for
equivalent() (where we'd deduplicate and omit totally distinct files).
The FileIndex is documented to be stable on NTFS though, so ideally
we could maybe have used it in the majority of cases. That would
require a heuristic for whether we can rely on FileIndex or not.
We considered using the existing function is_local_internal for that;
however that caused an unacceptable performance regression
(clang-scan-deps became 38% slower in one test, even more than that
in another test).
Differential Revision: https://reviews.llvm.org/D155579
When the environment LLVM_ENABLE_SYMBOLIZER_MARKUP is set, if
llvm-symbolizer fails or is disabled, this change will print a backtrace
in llvm-symbolizer markup instead of falling back to in-process
symbolization mechanisms.
This allows llvm-symbolizer to be run on the output later to produce a
high quality backtrace, even for fully-stripped LLVM utilities.
Reviewed By: mcgrathr
Differential Revision: https://reviews.llvm.org/D139750
This fixes the following warning on Windows with latest Clang:
```
[160/3057] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/Signals.cpp.obj
In file included from C:/git/llvm-project/llvm/lib/Support/Signals.cpp:260:
C:/git/llvm-project/llvm/lib/Support/Windows/Signals.inc(834,15): warning: comparison of integers of different signs: 'int' and 'unsigned int' [-Wsign-compare]
if (RetCode == (0xE0000000 | EX_IOERR))
~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~
1 warning generated.```
The sys::fs::equivalent(file_status, file_status) function is meant to
judge whether two file_status structs denote the same individual file.
On Unix, this is implemented by only comparing the st_dev and st_ino
numbers (from stat), ignoring all other fields.
On Windows, lacking reliable fields corresponding to st_dev and st_ino,
equivalent(file_status, file_status) compares essentially all fields.
However, since 1e39ef331b2b78baec84fdb577d497511cc46bd5
(https://reviews.llvm.org/D18456), file_status also contains the file
access time.
Including the file access time in equivalent(file_status, file_status)
makes it possible to spuriously break. In particular, when invoking
equivalent(Twine, Twine), with two paths, there's a race condition - the
function calls status() on both input paths. Even if the two paths
are the same, the comparison can fail, if the file was accessed
between the two status() calls.
Thus, it seems like the inclusion of the access time in
equivalent(file_status, file_status) was a mistake.
This race condition can cause spurious failures when linking with
LLD/ELF, where LLD uses equivalent() to check whether a file
exists within a specific sysroot, and that sysroot is accessed by other
processes concurrently.
This fixes downstream issue
https://github.com/msys2/MINGW-packages/issues/15695.
Differential Revision: https://reviews.llvm.org/D144172
This has been obsoleted by C++ thread_local for a long time.
As far as I know, Xcode was the last supported toolchain to add
support for C++ thread_local in 2016.
As a precaution, use LLVM_THREAD_LOCAL which provides even greater
backwards compatibility, allowing this to function even pre-C++11
versions of GCC.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D141349
This has been obsoleted by C++ thread_local for a long time.
As far as I know, Xcode was the last supported toolchain to add
support for C++ thread_local in 2016.
As a precaution, use LLVM_THREAD_LOCAL which provides even greater
backwards compatibility, allowing this to function even pre-C++11
versions of GCC.
Reviewed By: majnemer
Differential Revision: https://reviews.llvm.org/D141347
Before Windows 11 and Windows Server 2022, only one 'processor group' is assigned by default to a starting process, then the program is responsible for dispatching its own threads on more 'processor groups'. That is what 8404aeb56a73ab24f9b295111de3b37a37f0b841 was doing, allowing LLVM tools to automatically use all hardware threads in the machine.
After Windows 11 and Windows Server 2022, the OS takes care of that. This has an adverse effect reported in #56618 which is that using `GetProcessAffinityMask()` API in some edge cases seems buggy now. That API is used to detect if an affinity mask was set, and adjust accordingly the available threads for a ThreadPool.
With this patch, on one hand, we let the OS dispatch threads on all 'processor groups', but only for Windows 11 & Windows Server 2022 and after. We retain the old behavior for older OS versions. On the other hand, a workaround was added to mitigate the `GetProcessAffinityMask()` issue described above (see Threading.inc, L226).
Differential Revision: https://reviews.llvm.org/D138747
Currently the process is terminated after the timeout. Add an option
to let the process resume after the timeout instead.
https://reviews.llvm.org/D138952
This is a fairly large changeset, but it can be broken into a few
pieces:
- `llvm/Support/*TargetParser*` are all moved from the LLVM Support
component into a new LLVM Component called "TargetParser". This
potentially enables using tablegen to maintain this information, as
is shown in https://reviews.llvm.org/D137517. This cannot currently
be done, as llvm-tblgen relies on LLVM's Support component.
- This also moves two files from Support which use and depend on
information in the TargetParser:
- `llvm/Support/Host.{h,cpp}` which contains functions for inspecting
the current Host machine for info about it, primarily to support
getting the host triple, but also for `-mcpu=native` support in e.g.
Clang. This is fairly tightly intertwined with the information in
`X86TargetParser.h`, so keeping them in the same component makes
sense.
- `llvm/ADT/Triple.h` and `llvm/Support/Triple.cpp`, which contains
the target triple parser and representation. This is very intertwined
with the Arm target parser, because the arm architecture version
appears in canonical triples on arm platforms.
- I moved the relevant unittests to their own directory.
And so, we end up with a single component that has all the information
about the following, which to me seems like a unified component:
- Triples that LLVM Knows about
- Architecture names and CPUs that LLVM knows about
- CPU detection logic for LLVM
Given this, I have also moved `RISCVISAInfo.h` into this component, as
it seems to me to be part of that same set of functionality.
If you get link errors in your components after this patch, you likely
need to add TargetParser into LLVM_LINK_COMPONENTS in CMake.
Differential Revision: https://reviews.llvm.org/D137838
I found the interaction between SecondsToWait and
WaitUntilChildTerminates confusing. Rather than have a boolean to
ignore the value of SecondsToWait, combine these into one Optional
parameter.
This reverts commit 5e50b8089aee249d77542ea858d956568ec6581f.
This commit breaks the build for BOLT:
bolt/lib/Profile/DataAggregator.cpp:264:66: error: no viable
conversion from 'Optional<StringRef>[3]' to
'ArrayRef<std::optional<StringRef>>'
This change is focussed on simplifying `Support/Host.h` to only do
target detection. In this case, this function is close in usage to
existing functions in `Support/Threading.h`, so I moved it into there.
The function is also renamed to `llvm::get_physical_cores()` to match
the style of threading's functions.
The big change here is that now if you have threading disabled,
`llvm::get_physical_cores()` will return -1, as if it had not been able
to work out the right info. This is due to how Threading.cpp includes
OS-specific code/headers. This seems ok, as if threading is disabled,
LLVM should not need to know the number of physical cores.
Differential Revision: https://reviews.llvm.org/D137836
Apply clang-format on llvm/lib/Support/Windows/ and llvm/lib/Support/Unix/ since .inc files in these folders aren't picked up by default. Eventually we need to add this extension in the monorepo .clang-format file.
Differential Revision: https://reviews.llvm.org/D138714
This reverts commit 9969ceb36b440eaafa17c486f29a69c7a7da3b3b.
On Windows:
lld-link: error: undefined symbol: int __cdecl computeHostNumPhysicalCores(void)
>>> referenced by LLVMSupport.lib(Support.Host.obj):(int __cdecl llvm::sys::getHostNumPhysicalCores(void))
LLVM contains a helpful function for getting the size of a C-style
array: `llvm::array_lengthof`. This is useful prior to C++17, but not as
helpful for C++17 or later: `std::size` already has support for C-style
arrays.
Change call sites to use `std::size` instead.
Differential Revision: https://reviews.llvm.org/D133429
Add mapped_file_region::sync(), equivalent to POSIX msync,
synchronizing written content to disk without unmapping the region.
Asserts if the mode is not mapped_file_region::readwrite.
Note that I don't have access to a Windows machine, so I can't
easily run those unit tests.
Change by dexonsmith
Differential Revision: https://reviews.llvm.org/D95494
Prevents deadlock between MiniDumpWriteDump and
CryptAcquireContextW (called via fs::createTemporaryFile) in
WriteWindowsDumpFile.
However, there's no guarantee that deadlock can't still occur between
MiniDumpWriteDump and some other Win32 API call. But that would appear
to be the "accepted" risk of using MiniDumpWriteDump in this manner.
Differential Revision: https://reviews.llvm.org/D129004
Currently the backtrace emitted on windows when llvm-symbolizer is not
available includes addresses which cannot be easily decoded because
the addresses have the containing module's run-time base address added
into them, but we don't know what those base addresses are. This
change emits a module offset rather than an address.
There are a couple of related changes which were included as a result
of the review discussion for this patch:
- I have also removed the parameter printing as it adds noise to the
dump and doesn't seem useful.
- I have added the exception code to the backtrace.
Differential Review: https://reviews.llvm.org/D127915
This patch adds an llvm-driver multicall tool that can combine multiple
LLVM-based tools. The build infrastructure is enabled for a tool by
adding the GENERATE_DRIVER option to the add_llvm_executable CMake
call, and changing the tool's main function to a canonicalized
tool_name_main format (i.e. llvm_ar_main, clang_main, etc...).
As currently implemented llvm-driver contains dsymutil, llvm-ar,
llvm-cxxfilt, llvm-objcopy, and clang (if clang is included in the
build).
llvm-driver can be enabled from builds by setting
LLVM_TOOL_LLVM_DRIVER_BUILD=On.
There are several limitations in the current implementation, which can
be addressed in subsequent patches:
(1) the multicall binary cannot currently properly handle
multi-dispatch tools. This means symlinking llvm-ranlib to llvm-driver
will not properly result in llvm-ar's main being called.
(2) the multicall binary cannot be comprised of tools containing
conflicting cl::opt options as the global cl::opt option list cannot
contain duplicates.
These limitations can be addressed in subsequent patches.
Differential revision: https://reviews.llvm.org/D109977
Paths that start with `\\?\` are absolute paths, and aren't expected
to be used with wildcard expressions.
Previously, the `?` at the start of the path triggered the condition
for a potential wildcard, which caused the path to be split and
reassembled. In builds with `LLVM_WINDOWS_PREFER_FORWARD_SLASH=ON`,
this caused a path like e.g. `\\?\D:\tmp\hello.cpp` to be reassembled
into `\\?\D:\tmp/hello.cpp` which isn't a valid path (as such
absolute paths must use backslashes consistently).
This fixes https://github.com/mstorsjo/llvm-mingw/issues/280.
I'm not sure if there's any straightforward way to add a test
for this case, unfortunately.
Differential Revision: https://reviews.llvm.org/D126675
On Apple Silicon Macs, using a Darwin thread priority of PRIO_DARWIN_BG seems to
map directly to the QoS class Background. With this priority, the thread is
confined to efficiency cores only, which makes background indexing take forever.
Introduce a new ThreadPriority "Low" that sits in the middle between Background
and Default, and maps to QoS class "Utility" on Mac. Make this new priority the
default for indexing. This makes the thread run on all cores, but still lowers
priority enough to keep the machine responsive, and not interfere with
user-initiated actions.
I didn't change the implementations for Windows and Linux; on these systems,
both ThreadPriority::Background and ThreadPriority::Low map to the same thread
priority. This could be changed as a followup (e.g. by using SCHED_BATCH for Low
on Linux).
See also https://github.com/clangd/clangd/issues/1119.
Reviewed By: sammccall, dgoldman
Differential Revision: https://reviews.llvm.org/D124715
Bugzilla #47579: if you invoke clang on Windows via a pathname in
which a quoted section closes just after a backslash, e.g.
"C:\Program Files\Whatever\"clang.exe
then cmd.exe and CreateProcess will correctly find the binary, because
when they parse the program name at the start of the command line,
they don't regard the \ before the " as having any kind of escaping
effect. This is different from the behaviour of the Windows standard C
library when it parses the rest of the command line, which would
consider that \" not to close the quoted string.
But this confuses windows::GetCommandLineArguments, because the
Windows API function GetCommandLineW() will return a command line
containing that \" sequence, and cl::TokenizeWindowsCommandLine will
tokenize the whole string according to the C library's rules. So it
will misidentify where the program name stops and the arguments start.
To fix this, I've introduced a new variant function
cl::TokenizeWindowsCommandLineFull(), intended to be applied to the
string returned from GetCommandLineW(). It parses the first word of
the command line according to CreateProcess's rules, considering \ to
never be an escaping character; thereafter, it switches over to the C
library rules for the rest of the command line.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D122914