Since commit 613a077b05b8352a48695be295037306f5fca151, `flang` doesn't
build any longer on Solaris/amd64:
```
flang/lib/Evaluate/intrinsics-library.cpp:225:26:
error: address of overloaded function 'acos' does not match required type '__float128 (__float128)'
225 | FolderFactory<F, F{std::acos}>::Create("acos"),
| ^~~~~~~~~
```
That patch led to the version of `quadmath.h` deep inside `/usr/gcc/<N>`
to be found, thus `HAS_QUADMATHLIB` is defined. However, the `struct
HostRuntimeLibrary<__float128, LibraryVersion::Libm>` template is
guarded by `_POSIX_C_SOURCE >= 200112L || _XOPEN_SOURCE >= 600`, while
`clang` only predefines `_XOPEN_SOURCE=500`.
This code dates back to commit 0c1941cb055fcf008e17faa6605969673211bea3
back in 2012. Currently, this is long obsolete and `gcc` prefefines
`_XOPEN_SOURCE=600` instead since GCC 4.6 back in 2011.
This patch follows that.
Tested on `amd64-pc-solaris2.11` and `sparcv9-sun-solaris2.11`.
(cherry picked from commit e71c8ea3cc73c8f7b0382468f355a254166d3a72)
The current check may fail if the addition overflows. I've observed
failures of macho-invalid.test on 32-bit due to this.
Instead, compare against the remaining bytes until the end of the
object.
(cherry picked from commit 3f29acb51739a3e6bfb8cc623eb37cb734c98a63)
This struct will be removed from glibc-2.42 and has been deprecated for
a very long time.
Fixes#137321
(cherry picked from commit 59978b21ad9c65276ee8e14f26759691b8a65763)
The `[G]CSRXCHG` instruction must not use R0 or R1 as the `rj` operand,
as encoding `rj` as 0 or 1 will be interpreted as `[G]CSRRD` OR
`[G]CSRWR`, respectively, rather than `[G]CSRXCHG`.
This patch introduces a new register class `GPRNoR0R1` and updates the
`[G]CSRXCHG` instruction definition to use it for the `rj` operand,
ensuring the register allocator avoids assigning R0 or R1.
Fixes#140842
(cherry picked from commit bd8578c3574d77bc1231f047bced4a0053a1b000)
Fix a use-after-free issue related to annotateTableJump in the LoongArch
target.
Previously, `LoongArchPreRAExpandPseudo::annotateTableJump()` recorded a
reference to a MachineOperand representing a jump table index. However,
later optimizations such as the `BranchFolder` pass may delete the
instruction containing this operand, leaving a dangling reference.
This led to an assertion failure in
`LoongArchAsmPrinter::emitJumpTableInfo()` when trying to access a freed
MachineOperand via `getIndex()`.
The fix avoids holding a reference to the MachineOperand. Instead, we
extract and store the jump table index at the time of annotation. During
`emitJumpTableInfo()`, we verify whether the recorded index still exists
in the MachineFunction's jump table. If not, we skip emission for that
entry.
Fixes#140904
(cherry picked from commit 4e186f20e2f2be2fbf95d9713341a0b6507e707d)
The load not being included in the chain meant that it could materialize
after a `@llvm.lifetime.end` annotation on the pointer. This could
result in miscompiles if the stack slot is reused for another value.
Fixes https://github.com/llvm/llvm-project/issues/140491
(cherry picked from commit c9d62491981fe720c1b3255fa2f9ddf744590c65)
As reported in #135665, C++20 parenthesis initializer list expressions
are not handled correctly and were causing crashes. This commit attempts
to fix the issue by handing parenthesis initializer lists along side
existing initializer lists.
(cherry picked from commit 5dc9d55eb04d94c01dba0364b51a509f975e542a)
Since P2280R4 Unknown references and pointers was implemented,
HandleLValueBase now has to deal with referneces:
D.MostDerivedType->getAsCXXRecordDecl()
will return a nullptr if D.MostDerivedType is a ReferenceType. The fix
is to use getNonReferenceType() to obtain the Pointee Type if we have a
reference.
Fixes: https://github.com/llvm/llvm-project/issues/139452
(cherry picked from commit 136f2ba2a7bca015ef831c91fb0db5e5e31b7632)
# Conflicts:
# clang/docs/ReleaseNotes.rst
Prvious `fp_to_uint/fp_to_sint` patterns for `v4f64 -> v4i32` are wrong.
Conversion error was triggered after pr
https://github.com/llvm/llvm-project/pull/126456.
(cherry picked from commit b5c7724f82b6afe98761d0a1c5b6ee7cd2330ada)
Recently some users reported that they observed large increases of
runtime (up to +600% on some translation units) when they upgraded to a
more recent (slightly patched, internal) clang version. Bisection
revealed that the bulk of this increase was probably caused by my
earlier commit bb27d5e5c6b194a1440b8ac4e5ace68d0ee2a849 ("Don't assume
third iteration in loops").
As I evaluated that earlier commit on several open source project, it
turns out that on average it's runtime-neutral (or slightly helpful: it
reduced the total analysis time by 1.5%) but it can cause runtime spikes
on some code: in particular it more than doubled the time to analyze
`tmux` (one of the smaller test projects).
Further profiling and investigation proved that these spikes were caused
by an _increase of analysis scope_ because there was an heuristic that
placed functions on a "don't inline this" blacklist if they reached the
`-analyzer-max-loop` limit (anywhere, on any one execution path) --
which became significantly rarer when my commit ensured the analyzer no
longer "just assumes" four iterations. (With more inlining significantly
more entry points use up their allocated budgets, which leads to the
increased runtime.)
I feel that this heuristic for the "don't inline" blacklist is
unjustified and arbitrary, because reaching the "retry without inlining"
limit on one path does not imply that inlining the function won't be
valuable on other paths -- so I hope that we can eventually replace it
with more "natural" limits of the analysis scope.
However, the runtime increases are annoying for the users whose project
is affected, so I created this quick workaround commit that approximates
the "don't inline" blacklist effects of ambiguous loops (where the
analyzer doesn't understand the loop condition) without fully reverting
the "Don't assume third iteration" commit (to avoid reintroducing the
false positives that were eliminated by it).
Investigating this issue was a team effort: I'm grateful to Endre Fülöp
(gamesh411) who did the bisection and shared his time measurement setup,
and Gábor Tóthvári (tigbr) who helped me in profiling.
(cherry picked from commit 9600a12f0de233324b559f60997b9c2db153fede)
In my previous attempt (#126913) of fixing the flaky case was on a good
track when I used the begin locations as a stable ordering. However, I
forgot to consider the case when the begin locations are the same among
the Exprs.
In an `EXPENSIVE_CHECKS` build, arrays are randomly shuffled prior to
sorting them. This exposed the flaky behavior much more often basically
breaking the "stability" of the vector - as it should.
Because of this, I had to revert the previous fix attempt in #127034.
To fix this, I use this time `Expr::getID` for a stable ID for an Expr.
Hopefully fixes#126619
Hopefully fixes#126804
(cherry picked from commit f378e52ed3c6f8da4973f97f1ef043c2eb0da721)
The order of operation was slightly incorrect, as we were checking for
incomplete types *before* handling reference types.
Fixes#129397
---------
Co-authored-by: Erich Keane <ekeane@nvidia.com>
In powerpc64-unknown-linux-musl, signal.h does not include asm/ptrace.h,
which causes "member access into incomplete type 'struct pt_regs'"
errors. Include the header explicitly to fix this.
Also in sanitizer_linux_libcdep.cpp, there is a usage of TlsPreTcbSize
which is not defined in such a platform. Guard the branch with macro.
(cherry picked from commit 801b519dfd01e21da0be17aa8f8dc2ceb0eb9e77)
In the original case, the third call to `getCheaperNegatedExpression`
deletes the SDNode returned by the first call.
Similar to 74e6030bcbcc8e628f9a99a424342a0c656456f9, this patch uses
`HandleSDNodes` to prevent nodes from being deleted by subsequent calls.
Closes https://github.com/llvm/llvm-project/issues/138944.
(cherry picked from commit 143cce72b1f50bc37363315793b80ae92d2b0ae3)
FEAT_FP8DOT4 and FEAT_FP8FMA are supported by FUJITSU-MONAKA. These were
previously enabled due to dependencies, but now require explicit
activation due to modifications in the dependencies.
(cherry picked from commit 9d5a5424f0356bd6ee01c751dd6957299783b41b)
This pull request implements mangling for ConstantMatrixType, allowing
matrices to be used on Windows.
Related issues: #53158, #127127
This example code:
```cpp
#include <typeinfo>
#include <stdio.h>
typedef float Matrix4 __attribute__((matrix_type(4, 4)));
int main()
{
printf("%s\n", typeid(Matrix4).name());
}
```
Outputs this:
```
struct __clang::__matrix<float,4,4>
```
(cherry picked from commit f5a30f111dc4ad6422863722eb708059a68a9d5c)
Towards
This change moves WasmSym from a static global struct to an instance
owned by Ctx, allowing it to be reset cleanly between linker runs. This
enables safe support for multiple invocations of wasm-ld within the same
process
Changes done
- Converted WasmSym from a static struct to a regular struct with
instance members.
- Added a std::unique_ptr<WasmSym> wasmSym field inside Ctx.
- Reset wasmSym in Ctx::reset() to clear state between links.
- Replaced all WasmSym:: references with ctx.wasmSym->.
- Removed global symbol definitions from Symbols.cpp that are no longer
needed.
Clearing wasmSym in ctx.reset() ensures a clean slate for each link
invocation, preventing symbol leakage across runs—critical when using
wasm-ld/lld as a reentrant library where global state can cause subtle,
hard-to-debug errors.
---------
Co-authored-by: Vassil Vassilev <v.g.vassilev@gmail.com>
(cherry picked from commit 9cbbb74d370c09e13b8412f21dccb7d2c4afc6a4)
These changes align with these lock types and allows builds and tests to
pass with various SDKS.
rdar://147067322
(cherry picked from commit 7cc4472037b43971bd3ee373fe75b5043f5abca9)
Follows the discussion here:
https://github.com/llvm/llvm-project/pull/129309
Recently, the test
`TestRtsan.AccessingALargeAtomicVariableDiesWhenRealtime` has been
failing on newer MacOS versions, because the internal locking mechanism
in `std::atomic<T>::load` (for types `T` that are larger than the
hardware lock-free limit), has changed to a function that wasn't being
intercepted by rtsan.
This PR introduces an interceptor for `_os_nospin_lock_lock`, which is
the new internal locking mechanism.
_Note: we'd probably do well to introduce interceptors for
`_os_nospin_lock_unlock` (and `os_unfair_lock_unlock`) too, which also
appear to have blocking implementations. This can follow in a separate
PR._
(cherry picked from commit 481a55a3d9645a6bc1540d326319b78ad8ed8db1)
This avoids type suffixes for integer constants when the type can be
inferred from the template parameter, such as the unsigned parameter
of A<1> and A<2> in the added test.
The recently announced IBM z17 processor implements the architecture
already supported as "arch15" in LLVM. This patch adds support for "z17"
as an alternate architecture name for arch15.
This patch also add the scheduler description for the z17 processor,
provided by Jonas Paulsson.
Manual backport of https://github.com/llvm/llvm-project/pull/135254
The `_LIBCPP_DISABLE_AVAILABILITY` macro was removed in afae1a5f32bb as an
intended no-op. It turns out that some projects are making use of that
macro to work around a Clang bug with availability annotations that
still exists: https://github.com/llvm/llvm-project/issues/134151.
Since that Clang bug still hasn't been fixed, I feel that we must sill
honor that unfortunate macro until we've figured out how to get rid of
it without breaking code.
(cherry picked from commit 25fc52e655fb4bfd3bb89948d5cbfe011e1b8984)
Check this error for more context
(https://github.com/compiler-research/CppInterOp/actions/runs/14749797085/job/41407625681?pr=491#step:10:531)
This fails with
```
* thread #1, name = 'CppInterOpTests', stop reason = signal SIGSEGV: address not mapped to object (fault address: 0x55500356d6d3)
* frame #0: 0x00007fffee41cfe3 libclangCppInterOp.so.21.0gitclang::PragmaNamespace::~PragmaNamespace() + 99
frame #1: 0x00007fffee435666 libclangCppInterOp.so.21.0gitclang::Preprocessor::~Preprocessor() + 3830
frame #2: 0x00007fffee20917a libclangCppInterOp.so.21.0gitstd::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 58
frame #3: 0x00007fffee224796 libclangCppInterOp.so.21.0gitclang::CompilerInstance::~CompilerInstance() + 838
frame #4: 0x00007fffee22494d libclangCppInterOp.so.21.0gitclang::CompilerInstance::~CompilerInstance() + 13
frame #5: 0x00007fffed95ec62 libclangCppInterOp.so.21.0gitclang::IncrementalCUDADeviceParser::~IncrementalCUDADeviceParser() + 98
frame #6: 0x00007fffed9551b6 libclangCppInterOp.so.21.0gitclang::Interpreter::~Interpreter() + 102
frame #7: 0x00007fffed95598d libclangCppInterOp.so.21.0gitclang::Interpreter::~Interpreter() + 13
frame #8: 0x00007fffed9181e7 libclangCppInterOp.so.21.0gitcompat::createClangInterpreter(std::vector<char const*, std::allocator<char const*>>&) + 2919
```
Problem :
1) The destructor currently handles no clearance for the DeviceParser
and the DeviceAct. We currently only have this
9764938224/clang/lib/Interpreter/Interpreter.cpp (L416-L419)
2) The ownership for DeviceCI currently is present in
IncrementalCudaDeviceParser. But this should be similar to how the
combination for hostCI, hostAction and hostParser are managed by the
Interpreter. As on master the DeviceAct and DeviceParser are managed by
the Interpreter but not DeviceCI. This is problematic because :
IncrementalParser holds a Sema& which points into the DeviceCI. On
master, DeviceCI is destroyed before the base class ~IncrementalParser()
runs, causing Parser::reset() to access a dangling Sema (and as Sema
holds a reference to Preprocessor which owns PragmaNamespace) we see
this
```
* frame #0: 0x00007fffee41cfe3 libclangCppInterOp.so.21.0gitclang::PragmaNamespace::~PragmaNamespace() + 99
frame #1: 0x00007fffee435666 libclangCppInterOp.so.21.0gitclang::Preprocessor::~Preprocessor() + 3830
```
(cherry picked from commit 529b6fcb00aabbed17365e5fb3abbc2ae127c967)
If the LocationSize is larger than the index space of the pointer type,
bail out instead of triggering an APInt assertion.
Fixes the issue reported at
https://github.com/llvm/llvm-project/pull/119365#issuecomment-2849874894.
(cherry picked from commit 027b2038140f309467585298f9cb10d6b37411e7)
Summary:
Different ordering modes aren't supported for an atomic load, so we just
do an add of zero as the same thing. It's less efficient, but it works.
Fixes https://github.com/llvm/llvm-project/issues/138560
(cherry picked from commit dfcb8cb2a92c9f72ddde5ea08dadf2f640197d32)
This reverts commit 6a1bdd9 and re-instate behavior that matches what
MSVC link.exe does, that is, error out when trying to dllimport a symbol
from a static library.
A hint is now displayed in stdout, mentioning that we should rather dllimport the symbol
from a import library.
Fixes https://github.com/llvm/llvm-project/issues/131807
If we are attempting to combine shuffle+bitcast but the bitcast is
pairable with a subsequent bitcast, we should not fold the shuffle as
doing so can block further simplifications.
The motivation for this is a long-standing regression affecting SIMDe on
AArch64, introduced indirectly by the AlwaysInliner (1a2e77cf). Some
reproducers:
* https://godbolt.org/z/53qx18s6M
* https://godbolt.org/z/o5e43h5M7
(cherry picked from commit c91c3f930cfc75eb4e8b623ecd59c807863aa6c0)
…dy.py
Currently, run_clang_tidy.py does not correctly display the list of
checks picked up from the top-level .clang-tidy file. The reason for
that is that we are passing an empty string as input file.
However, that's not how we are supposed to use clang-tidy to list
checks. Per
65eccb463d,
we simply should not pass any file at all - the internal code of
clang-tidy will pass a "dummy" file if that's the case and get the
.clang-tidy file from the current working directory.
Fixes#136659
Co-authored-by: Carlos Gálvez <carlos.galvez@zenseact.com>
(cherry picked from commit 014ab736dc741f24c007f9861e24b31faba0e1e7)
A PURE subprogram can't have a local variable with the SAVE attribute.
An ASSOCIATE or SELECT TYPE construct entity whose selector is a
variable will return true from IsSave(); exclude them from the local
variable check.
Fixes https://github.com/llvm/llvm-project/issues/131356.
(cherry picked from commit b99dab25879449cb89c1ebd7b4088163543918e3)
In #132722 spills of ZT0 were disabled around all SME ABI routines to
avoid a case where ZT0 is spilled before ZA is enabled (resulting in a
crash).
It turns out that the ABI does not promise that routines will preserve
ZT0 (however in practice they do), so generally disabling ZT0 spills for
ABI routines is not correct.
The case where a crash was possible was "aarch64_new_zt0" functions with
ZA disabled on entry and a ZT0 spill around __arm_tpidr2_save. In this
case, ZT0 will be undefined at the call to __arm_tpidr2_save, so this
patch avoids the ZT0 spill by marking the callsite with
"aarch64_zt0_undef". This attribute only applies to callsites and marks
that at the point the call is made ZT0 is not defined, so does not need
preserving.
This caused ZT0 to be preserved around `__arm_tpidr2_save` in functions
with "aarch64_new_zt0". The block in which `__arm_tpidr2_save` is called
is added by the SMEABIPass and may be reachable in cases where ZA has
not been enabled* (so using `str zt0` is invalid).
* (when za_save_buffer is null and num_za_save_slices is zero)
TargetLoweringBase::getIRStackGuard refers to a platform-specific guard
variable. Before this change, TargetLoweringBase::getSDagStackGuard only
referred to a different variable.
This means that SelectionDAGBuilder's getLoadStackGuard does not get
memory operands. However, AArch64InstrInfo::expandPostRAPseudo assumes
that the passed MachineInstr has nonzero memoperands, causing a
segfault.
We have two possible options here: either disabling the LOAD_STACK_GUARD
node entirely in AArch64TargetLowering::useLoadStackGuardNode or just
making the platform-specific values match across TargetLoweringBase.
Here, we try the latter.
(cherry picked from commit c180e249d0013474d502cd779ec65b33cf7e9468)
925e195 introduced a regression since which we started to accept invalid
trailing commas in many expression lists where they're not allowed by
the grammar. The issue came from the fact that an additional invalid
state - previously handled by ParseExpressionList - was overlooked in
that patch.
Fixes https://github.com/llvm/llvm-project/issues/136254
No release entry because I want to backport it.
(cherry picked from commit c7daab259c3281cf8f649583993bad2536febc02)
This would be more convenient, if ADDITIONAL_COMPILE_FLAGS(target=...)
could be set with a regular expression, just like within e.g. XFAIL
lines.
(cherry picked from commit d6622df115c0de92bf8fc10f8787345ff96b26cc)