52796 Commits

Author SHA1 Message Date
Matt Arsenault
538a83e4b9
RegisterCoalescer: Add undef flags in removePartialRedundancy (#75152)
If the copy being hoisted was undef, we have the same problems that
eliminateUndefCopy needs to solve. We would effectively be introducing a
new live out implicit_def. We need to add an undef flag to avoid
artificially introducing a live through undef value. Previously, the
verifier would fail due to the dead def inside the loop providing the
live in value for the %1 use.
2023-12-13 15:02:53 +07:00
Yeting Kuo
6095e21130
[RISCV] Bump zicfilp to 0.4 (#75134)
Bump to https://github.com/riscv/riscv-cfi/releases/tag/v0.4.0. Actually
there is no functional change here.
2023-12-13 14:50:24 +08:00
Yingchi Long
ddf85b92aa
[BPF] improve error handling by custom lowering & fail() (#75088)
Currently, with mcpu=v3, we do not support the sdiv and srem
instructions, and the backend crashes with a stack trace and core dump.
This is misleading for end users, because it is not a "bug".

Add LLVM error reporting for sdiv/srem in the ISel legalize-op phase.

With the clang frontend we get a detailed source location and error report.

    $ build/bin/clang -g -target bpf -c local/sdiv.c
    local/sdiv.c:1:35: error: unsupported signed division, please convert to unsigned div/mod.
        1 | int sdiv(int a, int b) { return a / b; }
          |                                   ^
    1 error generated.

Fixes: #70433
Fixes: #48647

This also improves error handling for dynamic stack allocation:

    local/vla.c:2:3: error: unsupported dynamic stack allocation
        2 |   int b[n];
          |   ^
    1 error generated.

Fixes: https://github.com/llvm/llvm-project/issues/57171
2023-12-13 13:41:52 +08:00
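The diagnostic in the commit above asks for signed division to be rewritten with unsigned div/mod. One way to do that in source code might look like the sketch below. `sdiv_via_udiv` is a hypothetical helper name, not something the commit adds, and the sketch assumes `b != 0`.

```cpp
#include <cstdint>

// Hypothetical helper, not part of LLVM or of this commit.
// Signed 32-bit division (truncating toward zero) built only from
// unsigned division. Assumes b != 0.
constexpr std::int32_t sdiv_via_udiv(std::int32_t a, std::int32_t b) {
  // Take magnitudes in unsigned arithmetic so INT32_MIN does not overflow.
  const std::uint32_t ua = a < 0 ? 0u - static_cast<std::uint32_t>(a)
                                 : static_cast<std::uint32_t>(a);
  const std::uint32_t ub = b < 0 ? 0u - static_cast<std::uint32_t>(b)
                                 : static_cast<std::uint32_t>(b);
  const std::uint32_t q = ua / ub; // the only division is unsigned
  // The quotient is negative exactly when the operand signs differ.
  const std::uint32_t signed_q = ((a < 0) != (b < 0)) ? 0u - q : q;
  // Conversion back to int32_t is modular on mainstream compilers
  // (and guaranteed to be so since C++20).
  return static_cast<std::int32_t>(signed_q);
}

static_assert(sdiv_via_udiv(-7, 2) == -3, "truncates toward zero");
```

The same idea extends to remainder by computing `ua % ub` and giving the result the sign of `a`.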
paperchalice
a930fec033
[CodeGen] Port InterleavedLoadCombine to new pass manager (#75164) 2023-12-13 12:46:22 +08:00
Arthur Eubanks
19fff85893 Revert "[X86] Set SHF_X86_64_LARGE for globals with explicit well-known large section name (#74381)"
This reverts commit 323451ab88866c42c87971cbc670771bd0d48692.

Code with these section names in the wild doesn't compile because
support for large globals in the small code model is not complete yet.
2023-12-12 16:31:41 -08:00
Arthur Eubanks
3959231695 [X86][FastISel] Bail out on large objects when materializing a GlobalValue
To avoid crashes with explicitly large objects.

I will clean up fast-isel with large objects/medium code model soon.
2023-12-12 12:45:20 -08:00
Stanislav Mekhanoshin
7f54070194
[AMDGPU] Precommit test for LDS DMA waitcounts. NFC. (#75240) 2023-12-12 12:29:09 -08:00
Simon Pilgrim
06613095c4 [X86] avx512-vbroadcast.ll - fix orphan check prefixes
The AVX512F/AVX512BW checks had been removed despite still being used
2023-12-12 18:34:53 +00:00
Jay Foad
8005ee6da3
[AMDGPU] CodeGen for GFX12 64-bit scalar add/sub (#75070) 2023-12-12 17:41:40 +00:00
Tuan Chuong Goh
32532c2bbe [AArch64][GlobalISel] Test Pre-Commit for Look into array's element 2023-12-12 15:38:21 +00:00
Evgenii Kudriashov
ef35da825f
[X86][GlobalISel] Add instruction selection for G_SELECT (#70753) 2023-12-12 16:08:08 +01:00
Alexander Yermolovich
e8e9a33583
[LLVM][DWARF] Add compilation directory and dwo name to TU in dwo section (#74909)
This adds support to help LLDB when a binary is built with split DWARF
and has a .debug_names accelerator table and a DWP file.
The final linked binary might have Type Units (TUs) with the same type
signature in multiple compilation units. Although the signature is the
same, TUs are not guaranteed to be bit identical. This is not a problem
when they are in .o/.dwo files, as LLDB can find the right one based on
DW_AT_comp_dir/DW_AT_name in the skeleton CU. Once the DWP is created,
TUs are de-duplicated, and we need to know which CU the remaining one
came from.

This approach allows LLDB to figure it out with minimal changes to the
rest of the tooling. Modifying the .debug_tu_index section in the DWP
would have required such changes.
2023-12-12 07:01:20 -08:00
James Y Knight
ed4194bb8d
[X86] Set MaxAtomicSizeInBitsSupported. (#75112)
This will result in larger atomic operations getting expanded to
`__atomic_*` libcalls via AtomicExpandPass, which matches what Clang
already does in the frontend.
2023-12-12 08:16:55 -05:00
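As a rough mental model of the effect described in this commit (not the code of AtomicExpandPass itself), the decision can be sketched as a size comparison. The names `AtomicLowering` and `choose_lowering` are made up for illustration. The limit is passed in as a parameter because the commit message does not state the value chosen for X86, and the real pass may take other properties of the operation into account as well.

```cpp
// Simplified, hypothetical model; names are illustrative only.
enum class AtomicLowering { Inline, Libcall };

// An atomic operation wider than the target's supported maximum is
// assumed to be expanded to an __atomic_* library call.
constexpr AtomicLowering choose_lowering(unsigned size_in_bits,
                                         unsigned max_supported_bits) {
  return size_in_bits > max_supported_bits ? AtomicLowering::Libcall
                                           : AtomicLowering::Inline;
}

static_assert(choose_lowering(256, 128) == AtomicLowering::Libcall,
              "wider than the limit goes to a libcall");
```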
Orlando Cazalet-Hyams
5ee088134f
[DebugInfo][RemoveDIs] Handle dbg.declares in SelectionDAGISel (#73496)
This is a boring mechanical update to support DPValues that look like
dbg.declares in SelectionDAG.

The tests will become "live" once #74090 lands (see for more info).
2023-12-12 11:32:19 +00:00
Simon Pilgrim
3be65325f9 [X86] canonicalizeShuffleWithBinOps - generalize to handle some unary ops
Rename to canonicalizeShuffleWithOp and begin adding SHUFFLE(UNARYOP(X),UNARYOP(Y)) -> UNARYOP(SHUFFLE(X,Y)) fold support.

This is only kicking in after legalization, so targets that expand bit counts are still duplicating but it helps with a few initial cases.

I'm investigating adding support for extensions/conversions as well, but this is a first step.
2023-12-12 10:59:38 +00:00
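The SHUFFLE(UNARYOP(X),UNARYOP(Y)) -> UNARYOP(SHUFFLE(X,Y)) fold named in this commit relies on the unary op acting on each lane independently. The sketch below checks that identity on a 4-lane scalar model. `Vec4`, `Mask4`, and the choice of absolute value as the unary op are assumptions made for illustration and are unrelated to the actual DAG combine code.

```cpp
// Hypothetical 4-lane model of a two-input shuffle and a lane-wise unary op.
struct Vec4 { int lane[4]; };
struct Mask4 { int idx[4]; }; // 0..3 selects from x, 4..7 selects from y

constexpr Vec4 shuffle(Vec4 x, Vec4 y, Mask4 m) {
  Vec4 r{};
  for (int i = 0; i < 4; ++i)
    r.lane[i] = m.idx[i] < 4 ? x.lane[m.idx[i]] : y.lane[m.idx[i] - 4];
  return r;
}

// Stand-in unary op: absolute value, applied to every lane.
constexpr Vec4 unary_op(Vec4 v) {
  for (int i = 0; i < 4; ++i)
    v.lane[i] = v.lane[i] < 0 ? -v.lane[i] : v.lane[i];
  return v;
}

constexpr bool same(Vec4 a, Vec4 b) {
  for (int i = 0; i < 4; ++i)
    if (a.lane[i] != b.lane[i]) return false;
  return true;
}

// True when applying the op before or after the shuffle gives the same lanes.
constexpr bool fold_holds(Vec4 x, Vec4 y, Mask4 m) {
  return same(shuffle(unary_op(x), unary_op(y), m),
              unary_op(shuffle(x, y, m)));
}

static_assert(fold_holds(Vec4{{1, -2, 3, -4}}, Vec4{{-5, 6, -7, 8}},
                         Mask4{{1, 4, 3, 6}}),
              "one unary op after the shuffle replaces two before it");
```

The rewritten form needs one unary op instead of two, which is the kind of saving such a fold can offer.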
Mariusz Sikora
a97028ac51
[AMDGPU] Update VOP instructions for GFX12 (#74853)
Co-authored-by: Mirko Brkusanin <Mirko.Brkusanin@amd.com>
2023-12-12 11:38:24 +01:00
Simon Pilgrim
1d56138d74 [X86] X86FixupVectorConstants - create f32/f64 broadcast constants if the source constant data was f32/f64
This partially reverts 33819f3bfb9c - the asm comments become a lot messier in #73509 - we're better off ensuring the constant data is the correct type in DAG
2023-12-12 10:32:04 +00:00
Saiyedul Islam
777b6de7a4
[AMDGPU][NFC] Test autogenerated llc tests for COV5 (#74339)
Regenerate a few llc tests to test for COV5 instead of the default ABI
version.
2023-12-12 14:35:13 +05:30
Luke Lau
c87eb63abf [RISCV] Move test to RVV directory. NFC
Just a nit, moving the test so that it gets picked up by
check-codegen-riscv-rvv since it contains vector code
2023-12-12 17:31:59 +09:00
Luke Lau
39445046dc
[RISCV] Remove unecessary early exit in transferBefore (#74040)
Previously we bailed if we encountered a pseudo without a VL op, e.g.
vmv.x.s, which prevented us from preserving VL and VTYPE. It looks like
this was copied over from a time when this code was operating on the
MachineInstrs in place, see https://reviews.llvm.org/D127870

However, because we no longer mutate the MIs, we can just get rid of
this early exit, which allows us to preserve VL and VTYPE when dealing
with vmv.x.s.
2023-12-12 17:25:19 +09:00
Yingchi Long
c4ac1d239f
[BPF][GlobalISel] select non-PreISelGenericOpcode (#75034)
This selects non-PreISelGenericOpcode as-is.

Depends on: #74999

Co-authored-by: Origami404 <Origami404@foxmail.com>
2023-12-12 16:19:34 +08:00
paperchalice
b0cc42ae0f
[CodeGen] Port SjLjEHPrepare to new pass manager (#75023)
`doInitialization` in `SjLjEHPrepare` is trivial.

This is the last pass with the `ehprepare` suffix.
2023-12-12 16:07:26 +08:00
Kazu Hirata
8f1accfb35 Revert "[RISCV] Update the interface of sifive vqmaccqoq (#74284)"
This reverts commit dc5570319676c14c48440b4ee87c8cfb35102ff6.

Several bots seem to be failing:

https://lab.llvm.org/buildbot/#/builders/10/builds/34651
https://lab.llvm.org/buildbot/#/builders/178/builds/6320
https://lab.llvm.org/buildbot/#/builders/77/builds/32918
2023-12-11 22:46:43 -08:00
Shreyansh Chouhan
5d12274646
[AArch64]: Added code for generating XAR instruction (#75085)
Fixes #61584
2023-12-12 05:48:45 +00:00
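For readers unfamiliar with the instruction, XAR combines an exclusive OR with a rotate right by an immediate on 64-bit lanes. A scalar sketch of one lane follows. `xar64` is a hypothetical name, and this only models the operation, not the pattern matching the commit implements.

```cpp
#include <cstdint>

// Hypothetical scalar model of one 64-bit lane of XAR:
// XOR the two inputs, then rotate the result right by imm (taken mod 64).
constexpr std::uint64_t xar64(std::uint64_t a, std::uint64_t b, unsigned imm) {
  const std::uint64_t x = a ^ b;
  const unsigned r = imm & 63u;
  // Guard r == 0 to avoid an undefined shift by 64.
  return r == 0 ? x : (x >> r) | (x << (64u - r));
}

static_assert(xar64(0xF0, 0x0F, 4) == 0xF00000000000000FULL,
              "0xFF rotated right by 4");
```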
Brandon Wu
dc55703196
[RISCV] Update the interface of sifive vqmaccqoq (#74284)
The spec
(https://sifive.cdn.prismic.io/sifive/60d5a660-3af0-49a3-a904-d2bbb1a21517_int8-matmul-spec.pdf)
is updated.
2023-12-12 13:17:47 +08:00
paperchalice
ce08c7ee1e
[CodeGen] Port SelectOptimize to new pass manager (#74920)
- Use `BlockFrequencyInfoWrapperPass` in legacy pass so member
`std::unique_ptr<BranchProbabilityInfo> BPI` could be removed.
- Member `DominatorTree *DT = nullptr` is unused, remove it.
2023-12-12 12:09:30 +08:00
Arthur Eubanks
f82c85d21f
[X86] Handle unsized types in TargetMachine::isLargeGlobalObject() (#74952)
isLargeGlobalObject() didn't handle opaque types, resulting in crashes.
2023-12-11 19:13:09 -08:00
bcahoon
a19c7c403f
[MachinePipeliner] Fix store-store dependences (#72575)
The pipeliner needs to mark store-store order dependences as
loop carried dependences. Otherwise, the stores may be scheduled
further apart than the MII. The order dependences imply that
the first instance of the dependent store is scheduled before the
second instance of the source store instruction.
2023-12-11 21:10:34 -06:00
Arthur Eubanks
3850131197
[X86] Handle ifuncs in TargetMachine::isLargeGlobalObject() (#74911)
isLargeGlobalObject() didn't handle GlobalIFuncs, resulting in crashes.

Treat ifuncs the same as normal Functions.
2023-12-11 19:01:44 -08:00
Arthur Eubanks
843ea98437
[X86] Allow constant pool references under medium code model in X86InstrInfo::foldMemoryOperandImpl() (#75011)
The medium code model assumes that the constant pool is referenceable
with 32-bit relocations.
2023-12-11 19:00:56 -08:00
Fangrui Song
072cea668e [test] Change llc -march to -mtriple
Similar to d20190e68413634b87f0f9426312a0e9d8456d18
2023-12-11 15:42:12 -08:00
James Y Knight
876816ff18
[AArch64] Set MaxAtomicSizeInBitsSupported. (#74385)
This will result in larger atomic operations getting expanded to
`__atomic_*` libcalls via AtomicExpandPass, which matches what Clang
already does in the frontend.

Additionally, adjust some comments, and remove partial code dealing with
larger-than-128bit atomics, as it's now unreachable.

AArch64 always supports 128-bit atomics, so there are no conditionals
needed here. (Though: we really ought to require that a 128-bit load is
available, not just a cmpxchg, which would mean conditioning on LSE2.
But that's future work.)

The arm64-irtranslator.ll test was adjusted as it was using an i258 type
as a hack to avoid IR atomic lowering to test GlobalISel behavior. Pass
-mattr=+lse and use i32, instead, to accomplish that goal in a way that
continues to work.
2023-12-11 17:55:07 -05:00
Mikhail Gudim
29ee66f4a0
[RISCV] Macro-fusion support for veyron-v1 CPU. (#70012)
Support was added for the following fusions:
  auipc-addi, slli-srli, ld-add
Some parts of the code became repetitive, so a small refactoring of
the existing lui-addi fusion was done.
2023-12-11 16:34:13 -05:00
Benjamin Kramer
9458bae553 [NVPTX] Custom lower integer<->bf16 conversions for sm_80 (#74827)
sm_80 only has f32->bf16 conversions, the remaining integer conversions
arrived with sm_90. Use a two-step conversion for sm_80.

There doesn't seem to be a way to express this promotion directly within
the legalization framework, so fall back on Custom lowering.
2023-12-11 21:06:46 +01:00
Jonathan Thackray
f576cbe44e
[AArch64] Correctly mark Neoverse N2 as an Armv9.0a core (#75055)
Neoverse N2 was incorrectly marked as an Armv8.5a core. This has been
changed to an Armv9.0a core. However, crypto options are not enabled
by default for Armv9 cores, so -mcpu=neoverse-n2+crypto is required
to enable crypto for this core.

Neoverse N2 Technical Reference Manual:
   https://developer.arm.com/documentation/102099/0003/
2023-12-11 18:52:25 +00:00
Jonas Paulsson
07056c2274
[SystemZ] Use LCGR/AGHI for i64 XOR with -1 (#74882)
LCGR/AGHI is a more compact way of implementing a 64-bit NOT.
2023-12-11 17:28:12 +01:00
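The equivalence behind this commit is the two's-complement identity ~x = -x - 1: a negate followed by adding -1 gives the same bits as XOR with -1. The sketch below checks it in plain C++. The function name is hypothetical, and the comments mapping each step to LCGR and AGHI reflect the commit title rather than generated code.

```cpp
#include <cstdint>

// Hypothetical model: 64-bit NOT computed as negate-then-add(-1).
constexpr std::uint64_t not_via_negate_add(std::uint64_t x) {
  const std::uint64_t negated = 0 - x; // step modelled on LCGR (complement)
  return negated - 1;                  // step modelled on AGHI with -1
}

static_assert(not_via_negate_add(0x1234) == ~std::uint64_t{0x1234},
              "matches XOR with -1");
```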
Simon Pilgrim
33819f3bfb [X86] X86FixupVectorConstants - create f32/f64 broadcast constants if the source constant data was ANY floating point type
We don't need an exact match, this is mainly cleanup for cases where v2f32 style types have been cast to f64 etc.
2023-12-11 16:23:04 +00:00
Simon Pilgrim
a7d8d11a14 [X86] combineConcatVectorOps - constant fold vector load concatenation directly into a new load.
Create a new constant pool entry directly instead of going via a BUILD_VECTOR node, which makes constant pool reuse more difficult.

Helps with some regressions in #73509
2023-12-11 16:23:04 +00:00
Yingchi Long
2460bf2fac
[BPF][GlobalISel] add initial gisel support for BPF (#74999)
This adds initial codegen support for BPF backend.

Only the IR translator for "RET" is implemented (instruction selection is not supported).

Depends on: #74998
2023-12-11 19:58:34 +08:00
Simon Pilgrim
d1deeae094
[X86] Rename VBROADCASTF128/VBROADCASTI128 to VBROADCASTF128rm/VBROADCASTI128rm (#75040)
Add missing rm postfix to show these are load instructions
2023-12-11 11:52:53 +00:00
Simon Pilgrim
21be9114ab [X86] evex-to-vex-compress.mir - strip trailing whitespace 2023-12-11 11:10:03 +00:00
Jay Foad
35ebd92d3d
[GlobalISel] Add G_PREFETCH (#74863) 2023-12-11 11:06:50 +00:00
Pierre van Houtryve
dd32d26a37
[AMDGPU] Form V_MAD_U64_U32 from mul24 (#72393)
Fixes SWDEV-421067
2023-12-11 11:38:27 +01:00
Serge Pavlov
18959c46e3 [NFC] Modify test to use autogenerated assertions 2023-12-11 14:38:01 +07:00
paperchalice
d1a83ff3e0
[CodeGen] Rename winehprepare -> win-eh-prepare (#75024)
Forgot to rename `winehprepare` for the legacy pass when porting this
pass to the new pass manager.
2023-12-11 13:55:27 +08:00
wanglei
af999c4be9
[LoongArch] Add codegen support for [X]VF{MSUB/NMADD/NMSUB}.{S/D} instructions (#74819)
This is similar to single and double-precision floating-point
instructions.
2023-12-11 10:37:22 +08:00
paperchalice
9bd32d78a9
[CodeGen] Update DwarfEHPreparePass references in CodeGenPassBuilder.h (#74068)
Forgot to update the counterpart in `CodeGenPassBuilder.h`. Also rename `dwarfehprepare` -> `dwarf-eh-prepare`.
2023-12-11 09:26:01 +08:00
Oskar Wirga
9930f3e298
[AArch64] Fix case of 0 dynamic alloc when stack probing (#74877)
I accidentally closed
https://github.com/llvm/llvm-project/pull/74806

If the dynamic allocation size is 0, then we will still probe the
current sp value despite not decrementing sp! This results in
overwriting stack data, in my case the stack canary.

The fix here is just to load the value of [sp] into xzr which is
essentially a no-op but still performs a read/probe of the new page.
2023-12-10 08:01:29 -05:00
James Y Knight
e79ef93c83 [X86] Rearrange a few atomics tests. NFC. 2023-12-09 20:53:29 -05:00
Justin Bogner
7a13e410fd
[DirectX] Move ROV info into HLSL metadata. NFC
Pull Request: https://github.com/llvm/llvm-project/pull/74896
2023-12-09 10:42:45 -08:00