If the copy being hoisted was undef, we have the same problems that
eliminateUndefCopy needs to solve. We would effectively be introducing a
new live out implicit_def. We need to add an undef flag to avoid
artificially introducing a live through undef value. Previously, the
verifier would fail due to the dead def inside the loop providing the
live in value for the %1 use.
Currently on mcpu=v3 we do not support the sdiv and srem instructions, and
the backend crashes with a stack trace and core dump. This is misleading
for end users, as it is not a "bug".
Add LLVM diagnostic reporting for sdiv/srem in the ISel legalize-op phase.
With the clang frontend we get a detailed source location and error report.
$ build/bin/clang -g -target bpf -c local/sdiv.c
local/sdiv.c:1:35: error: unsupported signed division, please convert to
unsigned div/mod.
1 | int sdiv(int a, int b) { return a / b; }
| ^
1 error generated.
Fixes: #70433
Fixes: #48647
This also improves error handling for dynamic stack allocation:
local/vla.c:2:3: error: unsupported dynamic stack allocation
2 | int b[n];
| ^
1 error generated.
Fixes: https://github.com/llvm/llvm-project/issues/57171
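The signed-division diagnostic above asks users to convert to unsigned
div/mod. A minimal sketch of one such conversion follows. The helper names
`sdiv_via_udiv` and `srem_via_urem` are hypothetical and are not part of the
BPF backend. Whether this is the right rewrite for a given BPF program is an
assumption. It divides the magnitudes as unsigned values and restores the sign
afterwards, which gives the usual truncating semantics. Division by zero is
still the caller's responsibility, and INT32_MIN / -1 wraps on mainstream
compilers instead of being undefined.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only.

// Magnitude of a signed value, computed in unsigned arithmetic so that
// INT32_MIN does not overflow.
static uint32_t magnitude(int32_t v) {
    return v < 0 ? 0u - static_cast<uint32_t>(v) : static_cast<uint32_t>(v);
}

// Truncating signed division using only an unsigned divide.
int32_t sdiv_via_udiv(int32_t a, int32_t b) {
    assert(b != 0 && "caller must rule out division by zero");
    uint32_t uq = magnitude(a) / magnitude(b);
    // The quotient is negative exactly when the operand signs differ.
    bool negative = (a < 0) != (b < 0);
    return static_cast<int32_t>(negative ? 0u - uq : uq);
}

// Signed remainder using only an unsigned modulo.
int32_t srem_via_urem(int32_t a, int32_t b) {
    assert(b != 0 && "caller must rule out division by zero");
    uint32_t ur = magnitude(a) % magnitude(b);
    // The remainder takes the sign of the dividend.
    return static_cast<int32_t>(a < 0 ? 0u - ur : ur);
}
```

For example, `sdiv_via_udiv(-7, 2)` yields -3 and `srem_via_urem(-7, 2)`
yields -1, matching C's `/` and `%` on signed operands.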
This reverts commit 323451ab88866c42c87971cbc670771bd0d48692.
Code with these section names in the wild doesn't compile because
support for large globals in the small code model is not complete yet.
This adds support to help LLDB when a binary is built with split DWARF and
has a .debug_names accelerator table and a DWP file.
The final linked binary might have Type Units (TUs) with the same type
signature in multiple compilation units. Although the signature is the
same, TUs are not guaranteed to be bit identical. This is not a problem
when they are in .o/.dwo files, as LLDB can find the right one based on
DW_AT_comp_dir/DW_AT_name in the skeleton CU. Once the DWP is created, TUs
are de-duplicated, and we need to know which CU the remaining one came
from.
This approach allows LLDB to figure it out with minimal changes to the rest
of the tooling. Modifying the .debug_tu_index section in the DWP instead
would have required larger changes.
This will result in larger atomic operations getting expanded to
`__atomic_*` libcalls via AtomicExpandPass, which matches what Clang
already does in the frontend.
This is a boring mechanical update to support DPValues that look like
dbg.declares in SelectionDAG.
The tests will become "live" once #74090 lands (see there for more info).
Rename to canonicalizeShuffleWithOp and begin adding SHUFFLE(UNARYOP(X),UNARYOP(Y)) -> UNARYOP(SHUFFLE(X,Y)) fold support.
This is only kicking in after legalization, so targets that expand bit counts are still duplicating but it helps with a few initial cases.
I'm investigating adding support for extensions/conversions as well, but this is a first step.
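The SHUFFLE(UNARYOP(X),UNARYOP(Y)) -> UNARYOP(SHUFFLE(X,Y)) fold relies on an
elementwise unary op commuting with a shuffle. Below is a minimal scalar
sketch of that identity, not the DAG combine itself. `Vec4`, `Mask4`,
`shuffle`, `ctpop` and the two wrapper functions are hypothetical names, a bit
count stands in for the unary op, and undef mask elements are not modelled.
The mask follows the shufflevector convention of indexing into the
concatenation of the two inputs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<uint32_t, 4>;
using Mask4 = std::array<int, 4>;

// Two-input shuffle: mask values 0..3 select from x, 4..7 select from y.
Vec4 shuffle(const Vec4 &x, const Vec4 &y, const Mask4 &mask) {
    Vec4 out{};
    for (std::size_t i = 0; i < 4; ++i) {
        assert(mask[i] >= 0 && mask[i] < 8 && "undef lanes are not modelled");
        out[i] = mask[i] < 4 ? x[mask[i]] : y[mask[i] - 4];
    }
    return out;
}

// Elementwise bit count, standing in for an arbitrary unary op.
Vec4 ctpop(const Vec4 &v) {
    Vec4 out{};
    for (std::size_t i = 0; i < 4; ++i) {
        uint32_t bits = v[i];
        uint32_t count = 0;
        while (bits != 0) {
            count += bits & 1u;
            bits >>= 1;
        }
        out[i] = count;
    }
    return out;
}

// Before the fold: two unary ops feed one shuffle.
Vec4 shuffle_of_unary(const Vec4 &x, const Vec4 &y, const Mask4 &m) {
    return shuffle(ctpop(x), ctpop(y), m);
}

// After the fold: one shuffle feeds a single unary op.
Vec4 unary_of_shuffle(const Vec4 &x, const Vec4 &y, const Mask4 &m) {
    return ctpop(shuffle(x, y, m));
}
```

Both forms produce the same lanes for any in-range mask, and the folded form
needs only one bit count instead of two.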
This partially reverts 33819f3bfb9c. The asm comments become a lot messier in #73509, so we're better off ensuring the constant data is the correct type in the DAG.
Previously we bailed if we encountered a pseudo without a VL op, e.g.
vmv.x.s, which prevented us from preserving VL and VTYPE. It looks like
this was copied over from a time when this code was operating on the
MachineInstrs in place, see https://reviews.llvm.org/D127870
However, because we no longer mutate the MIs, we can just get rid of this
early exit, which allows us to preserve VL and VTYPE when dealing with
vmv.x.s.
- Use `BlockFrequencyInfoWrapperPass` in legacy pass so member
`std::unique_ptr<BranchProbabilityInfo> BPI` could be removed.
- Member `DominatorTree *DT = nullptr` is unused, remove it.
The pipeliner needs to mark store-store order dependences as
loop carried dependences. Otherwise, the stores may be scheduled
further apart than the MII. The order dependences imply that
the first instance of the dependent store is scheduled before the
second instance of the source store instruction.
This will result in larger atomic operations getting expanded to
`__atomic_*` libcalls via AtomicExpandPass, which matches what Clang
already does in the frontend.
Additionally, adjust some comments, and remove partial code dealing with
larger-than-128bit atomics, as it's now unreachable.
AArch64 always supports 128-bit atomics, so there are no conditionals
needed here. (Though: we really ought to require that a 128-bit load is
available, not just a cmpxchg, which would mean conditioning on LSE2.
But that's future work.)
The arm64-irtranslator.ll test was adjusted as it was using an i258 type
as a hack to avoid IR atomic lowering to test GlobalISel behavior. Pass
-mattr=+lse and use i32, instead, to accomplish that goal in a way that
continues to work.
Support was added for the following fusions:
auipc-addi, slli-srli, ld-add
Some parts of the code became repetitive, so a small refactoring of the
existing lui-addi fusion was done.
sm_80 only has f32->bf16 conversions; the remaining integer conversions
arrived with sm_90. Use a two-step conversion for sm_80.
There doesn't seem to be a way to express this promotion directly within
the legalization framework, so fall back on Custom lowering.
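The two-step conversion can be illustrated on the host with a minimal scalar
sketch that goes integer -> f32 -> bf16. This is not the NVPTX lowering. The
function names are hypothetical, and round-to-nearest-even for the f32 -> bf16
step is an assumption made for the example. bf16 is treated as the upper 16
bits of an IEEE-754 binary32 value. NaN handling is omitted because an integer
input never produces one.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes a 32-bit IEEE-754 float");

// Step 2: narrow f32 to bf16 with round-to-nearest-even (assumed rounding
// mode). The result is the raw 16-bit bf16 pattern.
uint16_t f32_to_bf16_rne(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    // Add half of the discarded range, plus the lowest kept bit, so that
    // exact ties round toward an even bf16 mantissa.
    uint32_t lowest_kept_bit = (bits >> 16) & 1u;
    bits += 0x7FFFu + lowest_kept_bit;
    return static_cast<uint16_t>(bits >> 16);
}

// Step 1 then step 2: widen the integer to f32, then narrow to bf16.
uint16_t s32_to_bf16_two_step(int32_t v) {
    return f32_to_bf16_rne(static_cast<float>(v));
}
```

Note that going through f32 rounds twice for integers that do not fit in
24 bits of precision, so the result can differ from a single direct
conversion in rare cases.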
Neoverse N2 was incorrectly marked as an Armv8.5a core. This has been
changed to an Armv9.0a core. However, crypto options are not enabled
by default for Armv9 cores, so -mcpu=neoverse-n2+crypto is required
to enable crypto for this core.
Neoverse N2 Technical Reference Manual:
https://developer.arm.com/documentation/102099/0003/
Create a new constant pool entry directly instead of going via a BUILD_VECTOR node, which makes constant pool reuse more difficult.
Helps with some regressions in #73509
I accidentally closed
https://github.com/llvm/llvm-project/pull/74806
If the dynamic allocation size is 0, then we will still probe the
current sp value despite not decrementing sp! This results in
overwriting stack data, in my case the stack canary.
The fix here is just to load the value of [sp] into xzr which is
essentially a no-op but still performs a read/probe of the new page.