Work around documented Linux mmap bug. (#152595)

On Linux, mmap doesn't always zero-fill slack bytes ([man page]),
despite being required to do so by POSIX. If the final page of a file is
in the page cache and the bytes past the end of the file get overwritten
by some process, those bytes then remain non-zero until the page falls
out of the cache or another process overwrites them.

Stop trusting that mmap behaves properly and instead check
whether the buffer was indeed properly terminated. If not, fall back to
using `read` to read the file contents.
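
As a rough illustration of the check-then-fall-back approach described above, here is a standalone POSIX sketch. It is not the LLVM change itself: `loadFile`, `Out`, and `UsedMmap` are hypothetical names, and copying the mapping into a `std::string` is a simplification made only to keep the example short.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <string>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (not part of LLVM): loads the file at Path into Out.
// UsedMmap reports whether the mmap path was trusted or the read fallback ran.
bool loadFile(const char *Path, std::string &Out, bool &UsedMmap) {
  UsedMmap = false;
  int FD = open(Path, O_RDONLY);
  if (FD < 0)
    return false;

  struct stat St;
  if (fstat(FD, &St) != 0 || St.st_size < 0) {
    close(FD);
    return false;
  }
  size_t Size = static_cast<size_t>(St.st_size);
  long Page = sysconf(_SC_PAGESIZE);

  // Only try mmap when the file does not end on a page boundary; otherwise
  // there is no slack byte after the contents that could act as a terminator.
  if (Size > 0 && Page > 0 && Size % static_cast<size_t>(Page) != 0) {
    void *Map = mmap(nullptr, Size, PROT_READ, MAP_PRIVATE, FD, 0);
    if (Map != MAP_FAILED) {
      const char *Data = static_cast<const char *>(Map);
      // The byte just past the end of the file should be zero, but check it
      // rather than assume it.
      bool Terminated = Data[Size] == '\0';
      if (Terminated) {
        Out.assign(Data, Size);
        UsedMmap = true;
      }
      munmap(Map, Size);
      if (Terminated) {
        close(FD);
        return true;
      }
    }
  }

  // Fallback: read the contents into a buffer we own and terminate ourselves.
  Out.assign(Size, '\0');
  size_t Done = 0;
  while (Done < Size) {
    ssize_t N = read(FD, &Out[Done], Size - Done);
    if (N < 0) {
      if (errno == EINTR)
        continue;
      close(FD);
      return false;
    }
    if (N == 0)
      break; // The file shrank while we were reading it.
    Done += static_cast<size_t>(N);
  }
  Out.resize(Done);
  close(FD);
  // std::string always keeps a null terminator after its contents.
  assert(Out.c_str()[Out.size()] == '\0');
  return true;
}
```

A real implementation would keep the mapping alive instead of copying it; the point here is only the order of operations: map, inspect the byte at the end of the buffer, and use `read` when that byte is not zero.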

This fixes an obscure clang crash bug that can occur if another program
(such as an editor) mmaps a source file and writes past the end of the
mmap'd region shortly before clang or clangd attempts to parse the file.

 [man page]: https://man7.org/linux/man-pages/man2/mmap.2.html#BUGS
Richard Smith 2025-08-13 12:39:25 -07:00 committed by GitHub
parent 36d31b0c00
commit 85cd3d9868

@@ -501,9 +501,15 @@ getOpenFileImpl(sys::fs::file_t FD, const Twine &Filename, uint64_t FileSize,
     std::unique_ptr<MB> Result(
         new (NamedBufferAlloc(Filename)) MemoryBufferMMapFile<MB>(
             RequiresNullTerminator, FD, MapSize, Offset, EC));
-    if (!EC)
-      return std::move(Result);
+    if (!EC) {
+      // On at least Linux, and possibly on other systems, mmap may return pages
+      // from the page cache that are not properly filled with trailing zeroes,
+      // if some prior user of the page wrote non-zero bytes. Detect this and
+      // don't use mmap in that case.
+      if (!RequiresNullTerminator || *Result->getBufferEnd() == '\0')
+        return std::move(Result);
+    }
   }
 
 #ifdef __MVS__
   ErrorOr<bool> NeedsConversion = needConversion(Filename.str().c_str(), FD);