Premerge CI is currently failing with the following after the update to
clang v22:
```
/home/gha/llvm-project/clang-tools-extra/clangd/benchmarks/IndexBenchmark.cpp:92:1: error: '__COUNTER__' is a C2y extension [-Werror,-Wc2y-extensions]
92 | BENCHMARK(dexQueries);
| ^
```
Some initial work on this landed in
df1d786c460e0e47c9074f3533f098190ebfbc1b, and an equivalent fix later
landed in upstream Google benchmark in
d8db2f90b6.
The original work done in the patch implementing this feature doesn't
seem to account for as many cases as the upstream patch does. This patch
reverts the diff in df1d786c460e0e47c9074f3533f098190ebfbc1b and applies
the applicable hunks from the upstream patch.
In cmake this value is set in llvm-config.h. We're not really handling
that the same way in bazel, so we can just allow all targets to inherit
it as disabled; otherwise the build fails, since lldb assumes it is
always set to something.
The only expectations change here is that `__stack_pointer` is
no longer exported in the `archive-export.test` test. This is because
we don't enable the mutable-globals feature (since the assembly files
don't contain all the now-default features of the generic CPU).
This PR refactors blocking support for the ConvertLayout op so that it
can be unrolled, rather than only removing it in the specialized case.
It also removes the foldable attribute from the ConvertLayout op, as we
expect the op to be explicitly handled by XeGPU lowering.
It adds subgroup-to-lane distribution support for the ConvertLayout op.
Adding the `operator==` and `operator!=` for SBBlock. This should allow
us to compare blocks within a frame, like:
```python
block = frame.GetBlock()
while block.IsValid():
    if block == frame.GetFrameBlock():
        # We're at the top function scope.
        break
    # We're at an inner block scope; walk outward to the parent block.
    block = block.GetParent()
```
!$omp must never be recognized as a compiler directive sentinel when it
is immediately followed by anything other than a space or &. Revert a
bit of a recent change that broke this.
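The rule can be sketched in Python as follows. This is only an illustration, not the parser's actual code: the helper name `is_omp_sentinel` is hypothetical, and the case-insensitive match and the choice to reject a bare `!$omp` at end of line are assumptions of this sketch.

```python
def is_omp_sentinel(line: str) -> bool:
    """Return True if `line` starts with !$omp followed by a space or '&'."""
    text = line.lstrip()
    prefix = "!$omp"
    if not text.lower().startswith(prefix):
        return False
    rest = text[len(prefix):]
    # Assumption of this sketch: a bare "!$omp" with nothing after it
    # is not treated as a sentinel.
    return rest[:1] in (" ", "&")
```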
Add a fold() method to RsqrtOp, matching the pattern used by SqrtOp and
other math unary ops. The fold computes `1.0 / sqrt(x)` using APFloat
division.
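As a rough illustration of the folding rule, here is a Python sketch using ordinary floats rather than APFloat. The function name `fold_rsqrt` is hypothetical, and declining to fold for zero or negative operands is an assumption of this sketch, not something the patch description states.

```python
import math
from typing import Optional


def fold_rsqrt(operand: Optional[float]) -> Optional[float]:
    """Fold rsqrt(x) to 1.0 / sqrt(x) when the operand is a known constant."""
    if operand is None:
        # The operand is not a constant, so there is nothing to fold.
        return None
    if operand <= 0.0:
        # Assumption: this sketch simply declines to fold these inputs.
        return None
    return 1.0 / math.sqrt(operand)
```

For example, `fold_rsqrt(4.0)` yields `0.5`, while a non-constant operand (`None`) is left unfolded.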
---------
Signed-off-by: Ian Wood <ianwood@u.northwestern.edu>
Add the ability to generate a C source file, which is in addition to the
existing functionality of generating binary.
An example of the generated source:
```c
#ifdef __APPLE__
#define FORMATTER_SECTION "__DATA_CONST,__lldbformatters"
#else
#define FORMATTER_SECTION ".lldbformatters"
#endif
__attribute__((used, section(FORMATTER_SECTION)))
unsigned char _Account_synthetic[] =
// version
"\x01"
// remaining record size
"\x15"
// type name size
"\x07"
// type name
"Account"
// flags
"\x00"
// sig_get_num_children
"\x02"
// program size
"\x02"
// program
"\x20\x01"
// sig_get_child_at_index
"\x04"
// program size
"\x06"
// program
"\x02\x20\x00\x23\x11\x60"
;
```
The CombineSetCC helpers and performAnyAllCombine generate MVT::i1
results.
However MVT::i1 is an illegal type in WebAssembly, and this combiner can
run either before or after legalization. Directly creating the intrinsic
and negating its result using XOR instead of i1 and a NOT operation
avoids this problem.
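The XOR trick can be illustrated with a small Python sketch. It assumes the intrinsic result is already a 0 or 1 held in a 32-bit integer; the helper name `negate_bool_i32` is hypothetical.

```python
def negate_bool_i32(value: int) -> int:
    """Logically negate a 0/1 value held in a 32-bit integer by XOR-ing with 1."""
    # Assumption: the value being negated is already exactly 0 or 1.
    assert value in (0, 1), "sketch assumes a 0/1 input"
    return (value ^ 1) & 0xFFFFFFFF
```

Because the value never leaves the wider integer type, no i1 value has to be created at any point.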
Fixes #183842
Which is under discussion in
https://github.com/llvm/llvm-project/issues/179036
Add new options -ffixed_r{8-15} for the clang X86 target, like the
"-ffixed_x" option for the RISCV/AArch64 targets.
Also, add target-feature +reserve-r{8-15} for the X86 backend.
Registers specified as reserved will not be used in
RegAlloc/CalleeSave, so the user can maintain them directly. This is
useful for runtime/interpreter implementations.
Other registers are used by specific instructions or mechanisms, so they
can't be reserved.
Summary:
We already support floating point arguments for the standard atomic
functions. LLVM supports these in most cases as well. This PR unifies
the handling and allows this in the cases that the LLVM IR supports.
In #184259, Jim noticed that Debugger::FindTargetWithProcess and
Debugger::FindTargetWithProcessID are rather poorly designed APIs as
they allow code running in one Debugger to mess with Targets from
another Debugger. The only use is Process::SetProcessExitStatus which
isn't actually used.
I have an unwind failure where the eh_frame for a
trap handler states that the caller's return address is in eh_frame
register 33, which lldb treats as cpsr.
https://github.com/ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst#dwarf-register-names
Register 33 is ELR_mode, which isn't defined as a register in any of the
AArch64 register definition files in lldb today, so I'm not adding it to
the header files.
rdar://170602999
Some of the DAP tests define a static method named `TeatUpTestSuite`
which is calling `SBDebugger::Terminate`. Besides the typo, the correct
method is `TearDownTestSuite`, which GoogleTest calls after running the
last test in the test suite.
When addressing this, I realized that currently you can't really call
Initialize and Terminate multiple times in the same process. This
depends on:
- https://github.com/llvm/llvm-project/pull/184259
- https://github.com/llvm/llvm-project/pull/184261
Rename TENSOR_LOAD_TO_LDS to TENSOR_LOAD_TO_LDS_d4;
Rename TENSOR_STORE_FROM_LDS to TENSOR_STORE_FROM_LDS_d4;
Also rename function names in a couple of tests to reflect this change.
Replace the single `cir.binop` operation (dispatched via a `BinOpKind`
enum) with nine distinct ops — `cir.add`, `cir.sub`, `cir.mul`,
`cir.div`, `cir.rem`, `cir.and`, `cir.or`, `cir.xor`, and `cir.max` —
each with precise type constraints and only the attributes it needs
(nsw/nuw/sat on add/sub via `BinaryOverflowOp`).
A new `BinaryOpInterface` provides uniform `getLhs`/`getRhs`/`getResult`
access for passes and analyses.
The monolithic switch-based CIRToLLVMBinOpLowering is replaced by per-op
patterns generated through the existing CIRLowering.inc TableGen
infrastructure, with shared dispatch factored into two helpers:
`lowerSaturatableArithOp` for add/sub and `lowerIntFPBinaryOp` for
div/rem.
Remove call_once wrappers around PluginManager::RegisterPlugin. Plugins
can be registered and unregistered in Initialize and Terminate
respectively. In its current state, after having called Terminate, a
plugin can never be re-initialized.
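The re-initialization problem can be reproduced with a small Python model. All names here (`PluginRegistry`, `OncePlugin`, `PlainPlugin`) are hypothetical stand-ins, not lldb classes; the once-flag plays the role of the call_once wrapper.

```python
class PluginRegistry:
    """Hypothetical stand-in for the plugin manager."""

    def __init__(self):
        self.plugins = set()

    def register(self, name):
        self.plugins.add(name)

    def unregister(self, name):
        self.plugins.discard(name)


class OncePlugin:
    """Registration guarded by a once-flag, modelling the call_once wrapper."""

    _registered_once = False

    @classmethod
    def initialize(cls, registry):
        if not cls._registered_once:
            cls._registered_once = True
            registry.register("once")

    @staticmethod
    def terminate(registry):
        registry.unregister("once")


class PlainPlugin:
    """Registers on every initialize call, so it survives a terminate cycle."""

    @staticmethod
    def initialize(registry):
        registry.register("plain")

    @staticmethod
    def terminate(registry):
        registry.unregister("plain")
```

After an initialize, terminate, initialize cycle, only the plugin without the once-flag is registered again.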
Roughly 10 years ago, in aacb80853a46bd544fa76a945667302be1de706c, Greg
deleted the call to delete g_debugger_list_ptr because of a race
condition:
> Fixed a threading race condition where we could crash after calling
Debugger::Terminate().
>
> The issue was we have two global variables: one that contains a
> DebuggerList pointer and one that contains a std::mutex pointer. These
> get initialized in Debugger::Initialize(), and everywhere that uses
> these does:
>
>     if (g_debugger_list_ptr && g_debugger_list_mutex_ptr)
>     {
>         std::lock_guard<std::recursive_mutex> guard(*g_debugger_list_mutex_ptr);
>         // do work while mutex is locked
>     }
>
> Debugger::Terminate() was deleting and nulling out g_debugger_list_ptr
which meant we had a race condition where someone might do the if
statement and it evaluates to true, then another thread calls
Debugger::Terminate() and deletes and nulls out g_debugger_list_ptr
while holding the mutex, and another thread then locks the mutex and
tries to use g_debugger_list_ptr. The fix is to just not delete and null
out the g_debugger_list_ptr variable.
However, this isn't necessary as long as we persist ("leak") the mutex
and always check it first. That's exactly what this patch does. Without
it, the assert in Debugger::Initialize is incorrect.
```
assert(g_debugger_list_ptr == nullptr &&
"Debugger::Initialize called more than once!");
```
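The idea of persisting the mutex and always checking it first can be modelled in Python as below. This is a sketch under stated assumptions, not the lldb code: `DebuggerRegistry` and its methods are hypothetical names, and the lock created in the constructor stands in for the leaked mutex.

```python
import threading


class DebuggerRegistry:
    """Hypothetical model: the lock outlives the list it protects."""

    def __init__(self):
        # Created once and never torn down, so it is always safe to lock.
        self._lock = threading.RLock()
        # Stands in for g_debugger_list_ptr; None means "not initialized".
        self._debuggers = None

    def initialize(self):
        with self._lock:
            assert self._debuggers is None, "initialize called more than once!"
            self._debuggers = []

    def terminate(self):
        with self._lock:
            # Safe to clear: every reader checks the list under this same lock.
            self._debuggers = None

    def add_debugger(self, name):
        with self._lock:
            if self._debuggers is not None:
                self._debuggers.append(name)

    def count(self):
        with self._lock:
            # Take the lock first, then check the list.
            return 0 if self._debuggers is None else len(self._debuggers)
```

Because the list is only ever inspected while the lock is held, clearing it in `terminate` does not race with readers, and the assertion in `initialize` holds across repeated cycles.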
This introduces a new pass to lower from a flattened, target-independent
form of CIR to a form that uses Itanium-specific representation for
exception handling. It also includes a small amount of code needed to
lower the Itanium form to LLVM IR.
Substantial amounts of this PR were created using agentic AI tools, but
I have carefully reviewed the code, comments, and tests and made changes
as needed.
Issue:
Building RISCVInstrInfo.td fails with the following TableGen error
during the generation of RISCVGenInstrInfo.inc:
` error: In test: Could not infer all types in pattern!`
Root Cause:
The riscv_swap_csr node has a polymorphic result type (i32 or i64
depending on the target architecture). When used inside the SwapSysReg
class pattern, TableGen's type inference engine cannot automatically
deduce the exact return type solely from the GPR:$rd output, leading to
the ambiguity error.
Fix:
This patch resolves the type ambiguity by explicitly wrapping the
riscv_swap_csr node with XLenVT, allowing TableGen to infer the types
correctly.
Fixes a logic issue in the `faceforward` pattern matcher in
`SPIRVCombinerHelper.cpp`.
Previously when `mi_match` failed, we would still go through the nested
`Pred == CmpInst::FCMP_OGT || Pred == CmpInst::FCMP_UGT` check. It was
possible that whatever garbage was in Pred could randomly pass this
check and make us continue through the code. This change fixes that
logic issue by returning false as soon as `mi_match` fails.
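The shape of the fix can be sketched in Python. The names `try_match` and `matches_faceforward_compare` and the dictionary-based instruction are hypothetical; `try_match` only stands in for `mi_match` and does not reflect its real signature.

```python
from typing import Optional

FCMP_OGT, FCMP_UGT, FCMP_OLT = "ogt", "ugt", "olt"


def try_match(instr: dict) -> Optional[str]:
    """Hypothetical stand-in for mi_match: the predicate on success, else None."""
    if instr.get("opcode") != "fcmp":
        return None
    return instr.get("pred")


def matches_faceforward_compare(instr: dict) -> bool:
    pred = try_match(instr)
    if pred is None:
        # Bail out as soon as the match fails, so the predicate is never
        # inspected when it was not set by a successful match.
        return False
    return pred in (FCMP_OGT, FCMP_UGT)
```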
Likely fixes #177803. Can't confirm since it seems another change has
obscured the crash.
Instead of storing CmpInst::Predicate/GepNoWrapFlags, only store their
raw bitfield values. This reduces the size of VPIRFlags from 12 to 3
bytes.
PR: https://github.com/llvm/llvm-project/pull/181571
The range is an unsigned integer where a value of `UINT32_MAX` denotes
an unbounded range.
The current implementation implied that any size that is negative when
interpreted as a signed integer was unbounded, which is incorrect.
Adds a note to the docs.
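The difference between the two interpretations can be seen in a short Python sketch. Both function names are hypothetical, and `is_unbounded_signed_check` models the earlier behaviour only as described above.

```python
UINT32_MAX = 0xFFFFFFFF


def is_unbounded(range_size: int) -> bool:
    """Only the exact value UINT32_MAX marks an unbounded range."""
    return range_size == UINT32_MAX


def is_unbounded_signed_check(range_size: int) -> bool:
    """Earlier interpretation: any value that is negative as a signed 32-bit integer."""
    return range_size >= 0x80000000
```

A large but bounded size such as `0x80000000` is treated as unbounded by the signed check, but not by the exact comparison.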
As a GNU extension, clang supports math on void* and function pointers
in C mode only. From a CIR perspective, it makes sense to leave these
types in the IR, since it might be useful to do analysis.
During lowering, we already properly lower these to a size-1 element, so
no changes are needed besides letting this get through CIR generation.
This patch does that, plus adds some tests.
This patch implements the CodeGen logic for calling __llvm_omp_indirect_call_lookup
on the device when an indirect function call or a virtual function call is made
within an OpenMP target region.
---------
Co-authored-by: Youngsuk Kim
Finding reproducers for these that don't use the deferred vtable (which
we haven't yet implemented) was a bit of a challenge, but I found
this setup to get these to be emitted. Fortunately it is a quite easy
implementation that doesn't do awfully much.
This patch implements both, plus the name through the Itanium ABI.
The current validity message prints both "TILE" and "COLLAPSE" even
if only one of them is used, which is confusing for the user. This
makes the messages state precisely which clause is not allowed
(separate messages are issued when both clauses are used).