The commit f11b056c (#76304) breaks `clang` and other tools if they are
used from a RAMDrive. `GetFinalPathNameByHandleW()` may return 0 and
GetLastError 0x28. This patch fixes that issue. Note `real_path()` uses
`openFileForRead()` but it reports the error only if failed to open a
file. Getting `RealPath` is optional functionality.
BTW, `sys::fs::real_path()` resolves not only symlinks, but also network
drives and virtual drives created by the `subst` tool. It may break an
automation. It is better to detect symlinks and resolve only symlinks.
Fixes https://github.com/llvm/llvm-project/issues/83046
There is a race condition when calling `GetFileAttributesW` that can
cause it to return `ERROR_ACCESS_DENIED` on a path which exists, which
is unexpected for callers using this function to check for file
existence by passing `AccessMode::Exist`. This was manifesting as a
compiler crash on Windows downstream in the Swift compiler when using
the `-index-store-path` flag (more information in
https://github.com/apple/llvm-project/issues/8224).
I looked for alternate APIs to avoid bringing in `shlwapi.h`, but didn't
see any good candidates. I'm not tied at all to this solution, any
feedback and alternative approaches are more than welcome.
In acd8791c2619f2afc0347c1bff073b32fbffb5d6, a call to FlushFileBuffers
was added to work around a rare kernel bug. In
3b9b4d2156673edda50584086fbfb0d66460b4d2, the scope of that workaround
was limited, for performance reasons, as the flushes are quite
expensive.
On VirtualBox shared folders, closing a memory mapping that has been
written to, also needs to be explicitly flushed, if renaming the output
file before it is closed. Contrary to the kernel bug, this always
happens on such mounts. In these cases, the output ends up as a file of
the right size, but the contents are all zeros.
The sequence to trigger the issue on the VirtualBox Shared Folders is
this, summarized:
file = CreateFile()
mapping = CreateFileMapping(file)
mem = MapViewOfFile()
CloseHandle(mapping)
write(mem)
UnmapViewOfFile(mem)
SetFileInformationByHandle(file, FileRenameInfo)
CloseHandle(file)
With this sequence, the output file always ends up with all zeros. See
https://github.com/mstorsjo/llvm-mingw/issues/393 for a full
reproduction example.
To avoid this issue, call FlushFileBuffers() when the file may reside on
a VitualBox shared folder. As the flushes are expensive, only do them
when the output isn't on a local file system.
The issue with VirtualBox shared folders could also be fixed by calling
FlushViewOfFile before UnmapViewOfFile, and doing that could be slightly
less expensive than FlushFileBuffers.
Empirically, the difference between the two is very small though, and as
it's not easy to verify whether switching FlushFileBuffers to
FlushViewOfFile helps with the rare kernel bug, keep using
FlushFileBuffers for both cases, for code simplicity.
This fixes downstream bug
https://github.com/mstorsjo/llvm-mingw/issues/393.
Rather than conditionally using the output from GetFileAttributesW move
the branch to avoid calling GetFileAttributesW entirely if not required.
This avoids hitting IO an extra time for a small performance
improvement.
This makes the Windows implementation for `getMainExecutable()` behave
the same as its Linux counterpart, in regards to symlinks. Previously,
when using `cmake ... -DLLVM_USE_SYMLINKS=ON`, calling this function
wouldn't resolve to the "real", non-symlinked path.
This patch renames {starts,ends}with to {starts,ends}_with for
consistency with std::{string,string_view}::{starts,ends}_with in
C++20. Since there are only a handful of occurrences, this patch
skips the deprecation phase and simply renames them.
The FileIndex values returned from GetFileInformationByHandle are
considered stable and uniquely identifying a file, as long as the
handle is open. When handles are closed, there are no guarantees
for their stability or uniqueness. On some file systems (such as
NTFS), the indices are documented to be stable even across handles.
But with some file systems, in particular network mounts, file
indices can be reused very soon after handles are closed.
When such file indices are used for LLVM's UniqueID, files are
considered duplicates as soon as the filesystem driver happens to
have used the same file index for the handle used to inspect the
file. This caused widespread, non-obvious (seemingly random)
breakage. This can happen e.g. if running on a directory that is
shared via Remote Desktop or VirtualBox.
To avoid the issue, use a hash of the canonicalized path for the
file as unique identifier, instead of using FileIndex.
This fixes https://github.com/llvm/llvm-project/issues/61401 and
https://github.com/llvm/llvm-project/issues/22079.
Performance wise, this adds (usually) one extra call to
GetFinalPathNameByHandleW for each call to getStatus(). A test
cases such as running clang-scan-deps becomes around 1% slower
by this, which is considered tolerable.
Change the equivalent() function to use getUniqueID instead of
checking individual file_status fields. The
equivalent(Twine,Twine,bool& result) function calls status() on
each path successively, without keeping the file handles open,
which also is prone to such false positives. This also gets rid
of checks of other superfluous fields in the
equivalent(file_status, file_status) function - the unique ID of
a file should be enough (that is what is done for Unix anyway).
This comes with one known caveat: For hardlinks, each name for
the file now gets a different UniqueID, and equivalent() considers
them different. While that's not ideal, occasional false negatives
for equivalent() is usually that fatal (the cases where we strictly
do need to deduplicate files with different path names are quite
rare) compared to the issues caused by false positives for
equivalent() (where we'd deduplicate and omit totally distinct files).
The FileIndex is documented to be stable on NTFS though, so ideally
we could maybe have used it in the majority of cases. That would
require a heuristic for whether we can rely on FileIndex or not.
We considered using the existing function is_local_internal for that;
however that caused an unacceptable performance regression
(clang-scan-deps became 38% slower in one test, even more than that
in another test).
Differential Revision: https://reviews.llvm.org/D155579
The sys::fs::equivalent(file_status, file_status) function is meant to
judge whether two file_status structs denote the same individual file.
On Unix, this is implemented by only comparing the st_dev and st_ino
numbers (from stat), ignoring all other fields.
On Windows, lacking reliable fields corresponding to st_dev and st_ino,
equivalent(file_status, file_status) compares essentially all fields.
However, since 1e39ef331b2b78baec84fdb577d497511cc46bd5
(https://reviews.llvm.org/D18456), file_status also contains the file
access time.
Including the file access time in equivalent(file_status, file_status)
makes it possible to spuriously break. In particular, when invoking
equivalent(Twine, Twine), with two paths, there's a race condition - the
function calls status() on both input paths. Even if the two paths
are the same, the comparison can fail, if the file was accessed
between the two status() calls.
Thus, it seems like the inclusion of the access time in
equivalent(file_status, file_status) was a mistake.
This race condition can cause spurious failures when linking with
LLD/ELF, where LLD uses equivalent() to check whether a file
exists within a specific sysroot, and that sysroot is accessed by other
processes concurrently.
This fixes downstream issue
https://github.com/msys2/MINGW-packages/issues/15695.
Differential Revision: https://reviews.llvm.org/D144172
Apply clang-format on llvm/lib/Support/Windows/ and llvm/lib/Support/Unix/ since .inc files in these folders aren't picked up by default. Eventually we need to add this extension in the monorepo .clang-format file.
Differential Revision: https://reviews.llvm.org/D138714
LLVM contains a helpful function for getting the size of a C-style
array: `llvm::array_lengthof`. This is useful prior to C++17, but not as
helpful for C++17 or later: `std::size` already has support for C-style
arrays.
Change call sites to use `std::size` instead.
Differential Revision: https://reviews.llvm.org/D133429
Add mapped_file_region::sync(), equivalent to POSIX msync,
synchronizing written content to disk without unmapping the region.
Asserts if the mode is not mapped_file_region::readwrite.
Note that I don't have access to a Windows machine, so I can't
easily run those unit tests.
Change by dexonsmith
Differential Revision: https://reviews.llvm.org/D95494
This patch adds an llvm-driver multicall tool that can combine multiple
LLVM-based tools. The build infrastructure is enabled for a tool by
adding the GENERATE_DRIVER option to the add_llvm_executable CMake
call, and changing the tool's main function to a canonicalized
tool_name_main format (i.e. llvm_ar_main, clang_main, etc...).
As currently implemented llvm-driver contains dsymutil, llvm-ar,
llvm-cxxfilt, llvm-objcopy, and clang (if clang is included in the
build).
llvm-driver can be enabled from builds by setting
LLVM_TOOL_LLVM_DRIVER_BUILD=On.
There are several limitations in the current implementation, which can
be addressed in subsequent patches:
(1) the multicall binary cannot currently properly handle
multi-dispatch tools. This means symlinking llvm-ranlib to llvm-driver
will not properly result in llvm-ar's main being called.
(2) the multicall binary cannot be comprised of tools containing
conflicting cl::opt options as the global cl::opt option list cannot
contain duplicates.
These limitations can be addressed in subsequent patches.
Differential revision: https://reviews.llvm.org/D109977
This reverts commit ef8206320769ad31422a803a0d6de6077fd231d2.
- It conflicts with the existing llvm::size in STLExtras, which will now
never be called.
- Calling it without llvm:: breaks C++17 compat
Replace a few `reserve()` / `set_size()` pairs with
`resize_for_overwrite()` / `truncate()` in the platform-specific
code for Windows.
Differential Revision: https://reviews.llvm.org/D115390
On *NIX systems, this API calls madvise(MADV_DONTNEED) on read-only file mappings.
It should not be used on a writable buffer.
The API is used to implement ld.lld LTO memory saving trick (D116367).
Note: on read-only file mappings, Linux's MADV_DONTNEED semantics match POSIX
POSIX_MADV_DONTNEED and BSD systems' MADV_DONTNEED.
On Windows, VirtualAllocEx MEM_COMMIT/MEM_RESET have similar semantics
but are unfortunately not drop-in replacements. dontNeedIfMmap is currently a no-op.
Reviewed By: aganea
Differential Revision: https://reviews.llvm.org/D116366
This normalizes most paths (except ones input from the user as command
line arguments) into the preferred form, if `real_style()` evaluates to
`windows_forward`.
Differential Revision: https://reviews.llvm.org/D111880
Since D81803 / 79657e2339b58bc01fe1b85a448bb073d57d90bb, temp files
created on network shares don't set "Disposition.DeleteFile = true".
This flag normally takes care of removing the temp file both if the
process exits abnormally (either crashing or killed externally), and
when the file is closed cleanly.
For network shares, we voluntarily choose to not set the flag, and
if the operation to inspect the file handle (as a prerequisite to
setting the flag since 79657e2339b58bc01fe1b85a448bb073d57d90bb)
fails we also error out. In both of these cases, we can at least make
sure to remove the temp files when they are closed cleanly.
Adjust the semantics of "OF_Delete" to not set the delete
disposition, but only set the access mode for allowing deletion.
Move the call to setDeleteDisposition into TempFile::create,
where we can check if it failed, and if it did, set a flag noting
that the file should be removed manually at the end.
This does leak files on crash, but at least doesn't leak files
in regular successful runs. (Technically, the alternative codepath
could use the RemoveFileOnSignal function, but that might complicate
the TempFile implementation further.)
This fixes https://github.com/mstorsjo/llvm-mingw/issues/233 and
https://bugs.llvm.org/show_bug.cgi?id=52080.
Differential Revision: https://reviews.llvm.org/D111875
This is a mechanical change. This actually also renames the
similarly named methods in the SmallString class, however these
methods don't seem to be used outside of the llvm subproject, so
this doesn't break building of the rest of the monorepo.
incorrect std::string use. (Also remove redundant call to
RemoveFileOnSignal.)
Clang writes object files by first writing to a .tmp file and then
renaming to the final .obj name. On Windows, if a compile is killed
partway through the .tmp files don't get deleted.
Currently it seems like RemoveFileOnSignal takes care of deleting the
tmp files on Linux, but on Windows we need to call
setDeleteDisposition on tmp files so that they are deleted when
closed.
This patch switches to using TempFile to create the .tmp files we write
when creating object files, since it uses setDeleteDisposition on Windows.
This change applies to both Linux and Windows for consistency.
Differential Revision: https://reviews.llvm.org/D102876
This reverts commit 20797b129f844d4b12ffb2b12cf33baa2d42985c.
Clang writes object files by first writing to a .tmp file and then
renaming to the final .obj name. On Windows, if a compile is killed
partway through the .tmp files don't get deleted.
Currently it seems like RemoveFileOnSignal takes care of deleting the
tmp files on Linux, but on Windows we need to call
setDeleteDisposition on tmp files so that they are deleted when
closed.
This patch switches to using TempFile to create the .tmp files we write
when creating object files, since it uses setDeleteDisposition on Windows.
This change applies to both Linux and Windows for consistency.
Differential Revision: https://reviews.llvm.org/D102876
Update llvm::sys::fs::mapped_file_region to have a move constructor and
a move assignment operator, allowing it to be used as an Optional. Also,
update FileOutputBuffer's OnDiskBuffer to take advantage of this,
avoiding an extra allocation from the unique_ptr.
A nice follow-up would be to make the mapped_file_region constructor
private and replace its use with a factory function, such as
mapped_file_region::create(), that returns an Expected (or ErrorOr). I
don't plan on doing that immediately, but I might swing back later.
No functionality change, besides the saved allocation in OnDiskBuffer.
Differential Revision: https://reviews.llvm.org/D100159
Problem:
On SystemZ we need to open text files in text mode. On Windows, files opened in text mode adds a CRLF '\r\n' which may not be desirable.
Solution:
This patch adds two new flags
- OF_CRLF which indicates that CRLF translation is used.
- OF_TextWithCRLF = OF_Text | OF_CRLF indicates that the file is text and uses CRLF translation.
Developers should now use either the OF_Text or OF_TextWithCRLF for text files and OF_None for binary files. If the developer doesn't want carriage returns on Windows, they should use OF_Text, if they do want carriage returns on Windows, they should use OF_TextWithCRLF.
So this is the behaviour per platform with my patch:
z/OS:
OF_None: open in binary mode
OF_Text : open in text mode
OF_TextWithCRLF: open in text mode
Windows:
OF_None: open file with no carriage return
OF_Text: open file with no carriage return
OF_TextWithCRLF: open file with carriage return
The Major change is in llvm/lib/Support/Windows/Path.inc to only set text mode if the OF_CRLF is set.
```
if (Flags & OF_CRLF)
CrtOpenFlags |= _O_TEXT;
```
These following files are the ones that still use OF_Text which I left unchanged. I modified all these except raw_ostream.cpp in recent patches so I know these were previously in Binary mode on Windows.
./llvm/lib/Support/raw_ostream.cpp
./llvm/lib/TableGen/Main.cpp
./llvm/tools/dsymutil/DwarfLinkerForBinary.cpp
./llvm/unittests/Support/Path.cpp
./clang/lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp
./clang/lib/Frontend/CompilerInstance.cpp
./clang/lib/Driver/Driver.cpp
./clang/lib/Driver/ToolChains/Clang.cpp
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D99426
The function utilizes Windows' SearchPathW function, which as I found out today, may also return directories. After looking at the Unix implementation of the file I found that it contains a check whether the found path is also executable. While fixing the Windows implementation, I also learned that sys::fs::access returns successfully when querying whether directories are executable, which the Unix version does not.
This patch makes both of these functions equivalent to their Unix implementation and insures that any path returned by sys::findProgramByName on Windows may only be executable, just like the Unix implementation.
The equivalent additions I have made to the Windows implementation, in the Unix implementation are here:
sys::findProgramByName: 39ecfe6143/llvm/lib/Support/Unix/Program.inc (L90)
sys::fs::access: c2a84771bb/llvm/lib/Support/Unix/Path.inc (L608)
I encountered this issue when running the LLVM testsuite. Commands of the form not test ... would fail to correctly execute test.exe, which is part of GnuWin32, as it actually tried to execute a folder called test, which happened to be in a directory on my PATH.
Differential Revision: https://reviews.llvm.org/D99357
As reported here: https://bugs.llvm.org/show_bug.cgi?id=48378#c0
and here: https://github.com/rust-lang/rust/issues/81051
since 79657e2339b58bc01fe1b85a448bb073d57d90bb, some programs such as llvm-ar
don't work properly on Windows 7.
The issue is shown in the snippet by Oleksandr Prodan:
https://pastebin.com/v51m3uBU
In essence, once the 'DeleteFile' flag has been set on FILE_DISPOSITION_INFO,
the file path can't be queried anymore with GetFinalPathNameByHandleW. This
however works on Windows 10, GetFinalPathNameByHandleW would return sucessfully.
To workaround the issue, we simply reset the 'DeleteFile' flag before even
checking if we're dealing with a network file.
Tested with `llvm-ar r empty.a a.obj` ran on a network mount. At the moment, we
cannot specifically add a test coverage for this, since it requres mounting a
network drive.
On Windows, after commit 881ba104656c40098d4bc90c52613c08136f0fe1, tools
using TempFile would error with "bad file descriptor" when writing the
file on a network drive. It appears that setting the delete-on-close bit via
SetFileInformationByHandle/FileDispositionInfo prevented it from
accessing the file on network drives, and although using
FILE_DISPOSITION_INFO seems to work, it causes other troubles.
Differential Revision: https://reviews.llvm.org/D81803
Measure amount of high-level or fixed-cost operations performed during
building/loading modules and during header search. High-level operations
like building a module or processing a .pcm file are motivated by
previous issues where clang was re-building modules or re-reading .pcm
files unnecessarily. Fixed-cost operations like `stat` calls are tracked
because clang cannot change how long each operation takes but it can
perform fewer of such operations to improve the compile time.
Also tracking such stats over time can help us detect compile-time
regressions. Added stats are more stable than the actual measured
compilation time, so expect the detected regressions to be less noisy.
On relanding drop stats in MemoryBuffer.cpp as their value is pretty low
but affects a lot of clients and many of those aren't interested in
modules and header search.
rdar://problem/55715134
Reviewed By: aprantl, bruno
Differential Revision: https://reviews.llvm.org/D86895
This reverts commit c4bacc3c9b333bb7032fb96f41d6f5b851623132.
Test "LLVM :: ThinLTO/X86/funcimport-stats.ll" is failing. Reverting now
and will recommit after making the test not fail with the added stats.
Measure amount of high-level or fixed-cost operations performed during
building/loading modules and during header search. High-level operations
like building a module or processing a .pcm file are motivated by
previous issues where clang was re-building modules or re-reading .pcm
files unnecessarily. Fixed-cost operations like `stat` calls are tracked
because clang cannot change how long each operation takes but it can
perform fewer of such operations to improve the compile time.
Also tracking such stats over time can help us detect compile-time
regressions. Added stats are more stable than the actual measured
compilation time, so expect the detected regressions to be less noisy.
rdar://problem/55715134
Reviewed By: aprantl, bruno
Differential Revision: https://reviews.llvm.org/D86895
`GetFinalPathNameByHandleW(,,N,)` returns:
- `< N` on success (this value does not include the size of the terminating null character)
- `>= N` if buffer is too small (this value includes the size of the terminating null character)
So, when `N == Buffer.capacity() - 1`, we need to resize buffer if return value is > `Buffer.capacity() - 2`.
Also, we can set `N` to `Buffer.capacity()`.
Thus, without this patch `realPathFromHandle()` returns unfilled buffer when length of the final path of the file is equal to `Buffer.capacity()` or `Buffer.capacity() - 1`.
Reviewed By: andrewng, amccarth
Differential Revision: https://reviews.llvm.org/D86564
This is recommit of f51bc4fb60fb, reverted in 8577595e03fa, because
the function `flock` is not available on Solaris. In this variant
`flock` was replaced with `fcntl`, which is a POSIX function.
New functions `lockFile`, `tryLockFile` and `unlockFile` implement
simple file locking. They lock or unlock entire file. This must be
enough to support simulataneous writes to log files in parallel builds.
Differential Revision: https://reviews.llvm.org/D78896
Fix incorrect use of the size of Path when accessing PathUTF16, as the
UTF-16 path can be shorter. Added unit test for coverage of this test
case.
Thanks to Ding Fei (danix800) for the code fix, see
https://reviews.llvm.org/D83321.
Differential Revision: https://reviews.llvm.org/D83689
New functions `lockFile`, `tryLockFile` and `unlockFile` implement
simple file locking. They lock or unlock entire file. This must be
enough to support simulataneous writes to log files in parallel builds.
Differential Revision: https://reviews.llvm.org/D78896
This reverts commit fb5fd74685e728b1d5e68d33e9842bcd734b98e6.
Re-instates commit 53913a65b408ade2956061b4c0aaed6bba907403
The fix is to trim off trailing separators, as in `/foo/bar/` and
produce `/foo/bar`. VFS tests rely on this. I added unit tests for
remove_dots.
This reverts commit 53913a65b408ade2956061b4c0aaed6bba907403.
Breaks VFSFromYAMLTest.DirectoryIterationSameDirMultipleEntries
in SupportTests on non-Windows.
LLD calls this on every source file string in every object file when
writing PDBs, so it is somewhat hot.
Avoid rewriting paths that do not contain path traversal components
(./..). Use find_first_not_of(separators) directly instead of using the
path iterators. The path component iterators appear to be slow, and
directly searching for slashes makes it easier to find double separators
that need to be canonicalized.
I discovered that the VFS relies on remote_dots to not canonicalize
early slashes (/foo or C:/foo) on Windows, so I had to leave that
behavior behind with unit tests for it. This is undesirable, but I claim
that my change is NFC.
This reverts commit ad38f4b371bdca214e3a3cda9a76ec2213215c68.
As it broke building the unittests:
.../sources/llvm-project/llvm/unittests/Support/Path.cpp:334:5: error: use of undeclared identifier 'set'
set(Value);
^
1 error generated.
Summary:
This patch adds a function that is similar to `llvm::sys::path::home_directory`, but provides access to the system cache directory.
For Windows, that is %LOCALAPPDATA%, and applications should put their files under %LOCALAPPDATA%\Organization\Product\.
For *nixes, it adheres to the XDG Base Directory Specification, so it first looks at the XDG_CACHE_HOME environment variable and falls back to ~/.cache/.
Subsequently, the Clangd Index storage leverages this new API to put index files somewhere else than the users home directory.
Fixes https://github.com/clangd/clangd/issues/341
Reviewers: sammccall, chandlerc, Bigcheese
Reviewed By: sammccall
Subscribers: hiraditya, ilya-biryukov, MaskRay, jkorous, dexonsmith, arphaman, kadircet, ormris, usaxena95, cfe-commits, llvm-commits
Tags: #clang-tools-extra, #clang, #llvm
Differential Revision: https://reviews.llvm.org/D78501