187775 Commits

Author SHA1 Message Date
Steven Wu
ba8d9ce8d4
[ADT] Fix unused variable from #69528 (#114114)
Remove unused variable to fix build failures from bot.
2024-10-29 13:00:59 -07:00
David Majnemer
5c12434906 [X86] Emit comments explaining the immediate in vfpclass
This makes the assembly a lot more readable at a glance.

As an example:
```
  vfpclasspd $4, %zmm0, %k0 # k0 = isNegativeZero(zmm0)
```
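
A minimal sketch of how such a comment could be derived from the immediate, assuming the usual VFPCLASS encoding in which each of the eight immediate bits selects one floating-point class. Only `isNegativeZero` appears in this message; the other predicate names and the helper `describeVFPClassImm` are hypothetical and chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Bit i of the immediate selects one class; a lane matches if it belongs to
// any selected class. Names other than isNegativeZero are illustrative.
std::string describeVFPClassImm(uint8_t Imm) {
  static const char *const Names[8] = {
      "isQuietNaN",         "isPositiveZero", "isNegativeZero",
      "isPositiveInfinity", "isNegativeInfinity", "isSubnormal",
      "isNegativeFinite",   "isSignalingNaN"};
  if (Imm == 0)
    return "false"; // no class selected, so no lane can match
  std::string Out;
  for (unsigned Bit = 0; Bit < 8; ++Bit) {
    if (!(Imm & (1u << Bit)))
      continue;
    if (!Out.empty())
      Out += " | ";
    Out += Names[Bit];
  }
  return Out;
}
```

With this sketch, an immediate of 4 (bit 2) maps to `isNegativeZero`, matching the example above.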
2024-10-29 19:54:34 +00:00
Maryam Moghadas
8a0cb9ac86
[PowerPC] Add custom lowering for ssubo (#111748)
This patch improves the codegen for the i32 ssubo node in 64-bit
mode by custom lowering it.
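
For reference, the semantics of an i32 signed-subtract-with-overflow can be sketched as below. This is one plausible way to use 64-bit arithmetic for the check and is not claimed to be the sequence the patch emits; `ssubOverflowsI32` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Computes A - B as a wrapped i32 in Result and returns true on signed
// overflow. The operands are widened to 64 bits, where the subtraction
// cannot overflow, and the wide result is checked against the i32 range.
bool ssubOverflowsI32(int32_t A, int32_t B, int32_t &Result) {
  int64_t Wide = static_cast<int64_t>(A) - static_cast<int64_t>(B);
  Result = static_cast<int32_t>(static_cast<uint32_t>(Wide)); // wrap to 32 bits
  return Wide < std::numeric_limits<int32_t>::min() ||
         Wide > std::numeric_limits<int32_t>::max();
}
```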
2024-10-29 15:43:05 -04:00
Adam Yang
3a1228a543
[SPIRV] Add GroupMemoryBarrierWithGroupSync intrinsic (#111888)
partially fixes #70103

### Changes
* Added int_spv_group_memory_barrier_with_group_sync intrinsic in
IntrinsicsSPIRV.td
* Added lowering for int_spv_group_memory_barrier_with_group_sync in
SPIRVInstructionSelector.cpp
* Added SPIRV backend test case

### Related PRs
* [[clang][HLSL] Add GroupMemoryBarrierWithGroupSync intrinsic
#111883](https://github.com/llvm/llvm-project/pull/111883)
* [[DXIL] Add GroupMemoryBarrierWithGroupSync intrinsic
#111884](https://github.com/llvm/llvm-project/pull/111884)
2024-10-29 12:40:01 -07:00
Rahul Joshi
a18af41c20
[LLVM] Change error messages to start with lower case (#113748)
Change LLVM Asm and TableGen Lexer/Parser error messages to begin with
lower case.
2024-10-29 12:26:33 -07:00
Ellis Hoag
9cc5a4bf66
Remove llvm::shouldOptForSize() from Utils.h (#112630)
Remove `llvm::shouldOptForSize()` from `Utils.h` since we can use
`llvm::shouldOptimizeForSize()` from `SizeOpts.h` instead.

Depends on https://github.com/llvm/llvm-project/pull/112626
2024-10-29 14:23:47 -05:00
Kazu Hirata
c79827cd15 [SandboxIR] Fix a warning
This patch fixes:

  llvm/lib/SandboxIR/Context.cpp:684:22: error: unused variable
  'MaxRegisteredCallbacks' [-Werror,-Wunused-const-variable]
2024-10-29 12:05:18 -07:00
Lang Hames
9e37cbb469 [ORC] Add some missing FIXMEs, move a temporary Error into an if condition. 2024-10-29 11:12:48 -07:00
Min-Yih Hsu
ba65710908
[RISCV] Avoid redundant SchedRead on _TIED VPseudos (#113940)
_TIED and _MASK_TIED pseudos have one less operand compared to other
pseudos, thus we shouldn't attach the same number of SchedRead for these
instructions.

I don't think we have a way to (explicitly) check scheduling classes. So
I only test this patch with existing tests.
2024-10-29 10:49:35 -07:00
Harald van Dijk
950ee75909
[RISC-V] Fix check of minimum vlen. (#114055)
If we have a minimum vlen, we were adjusting StackSize to change the
unit from vscale to bytes, and then calculating the required padding
size for alignment in bytes. However, we then used that padding size as
an offset in vscale units, resulting in misplaced stack objects.

While it would be possible to adjust the object offsets by dividing
AlignmentPadding by ST.getRealMinVLen() / RISCV::RVVBitsPerBlock, we can
simplify the calculation a bit if instead we adjust the alignment to be
in vscale units.
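
The unit mismatch can be illustrated with a small sketch. All names here are hypothetical, and the sketch assumes the alignment in bytes is a multiple of the minimum vscale.

```cpp
#include <cassert>
#include <cstdint>

// Amount needed to round Size up to a multiple of Alignment.
uint64_t paddingToAlign(uint64_t Size, uint64_t Alignment) {
  assert(Alignment != 0 && "alignment must be non-zero");
  return (Alignment - Size % Alignment) % Alignment;
}

// Mismatched units: the padding is computed in bytes, yet the caller treats
// the result as an offset in vscale units.
uint64_t paddingMixedUnits(uint64_t SizeVScale, uint64_t AlignBytes,
                           uint64_t MinVScale) {
  return paddingToAlign(SizeVScale * MinVScale, AlignBytes);
}

// Consistent units: convert the alignment to vscale units first, so the
// padding is already in the unit used for object offsets.
uint64_t paddingVScaleUnits(uint64_t SizeVScale, uint64_t AlignBytes,
                            uint64_t MinVScale) {
  return paddingToAlign(SizeVScale, AlignBytes / MinVScale);
}
```

For a stack size of 24 vscale units, a 64-byte alignment, and a minimum vscale of 2, the first helper yields 16 (bytes), while the second yields 8 (vscale units), which is the same 16 bytes expressed in the unit the offsets use.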

@topperc This fixes a bug I am seeing after #110312, but I am not 100%
certain I am understanding the code correctly, could you please see if
this makes sense to you?
2024-10-29 17:30:30 +00:00
Steven Wu
b510cdb895
[ADT] Add TrieRawHashMap (#69528)
Implement TrieRawHashMap, which can be used to store objects together with
their associated hashes. The user needs to supply a strong hashing function to
guarantee the uniqueness of the hashes of the objects to be inserted. Hash
collisions are not supported and will lead to an error or a failed insertion.

TrieRawHashMap is thread-safe and lock-free and can be used as a
foundation data structure to implement content-addressable storage.
TrieRawHashMap owns the data stored in it and is designed to be:
* Fast to look up.
* Fast to "insert" if the data has already been inserted.
* Usable without locks, without any knowledge of the
participating threads or extra coordination between threads.

It is not currently designed to be used to insert unique new data with
high contention, due to the limitation on the memory allocator.
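
The general idea of a trie indexed by hash bits can be sketched as follows. This is a single-threaded toy with hypothetical names (`ToyHashTrie`, a 32-bit hash, 4 bits per level); it is not the TrieRawHashMap interface and does not attempt the lock-free behaviour described above.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>

// Toy trie keyed by a 32-bit hash, consuming 4 bits per level (8 levels).
class ToyHashTrie {
  struct Node {
    std::array<std::unique_ptr<Node>, 16> Children;
    std::unique_ptr<std::string> Value; // set only on nodes at depth 8
  };
  Node Root;

public:
  // Stores Value under Hash, or returns the value already stored there.
  const std::string &insert(uint32_t Hash, const std::string &Value) {
    Node *N = &Root;
    for (int Shift = 28; Shift >= 0; Shift -= 4) {
      std::unique_ptr<Node> &Child = N->Children[(Hash >> Shift) & 0xF];
      if (!Child)
        Child = std::make_unique<Node>();
      N = Child.get();
    }
    if (!N->Value)
      N->Value = std::make_unique<std::string>(Value);
    return *N->Value;
  }

  // Returns the stored value, or nullptr if Hash was never inserted.
  const std::string *find(uint32_t Hash) const {
    const Node *N = &Root;
    for (int Shift = 28; Shift >= 0 && N; Shift -= 4)
      N = N->Children[(Hash >> Shift) & 0xF].get();
    return N ? N->Value.get() : nullptr;
  }
};
```

Inserting under a hash that is already present returns the existing value, which mirrors the 'fast to "insert" if the data has already been inserted' property in spirit only.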
2024-10-29 10:29:39 -07:00
Afanasyev Ivan
4e1b9d34f9
[mir-strip-debug] Fix debug location info strip for bundled instructions (#113676)
Fix a bug where the `mir-strip-debug` pass does not remove debug locations
from bundled instructions.

The problem arises when testing that debug info does not affect the output
of optimization passes (`llvm-lit` with
`-Dllc="llc -debugify-and-strip-all-safe"`) and a pass operates on MIR with
bundled instructions and memory operands.

Suppose a MIR test contains checks like:

```
CHECK-NEXT: BUNDLE {
CHECK-NEXT:   $r3 = LD $r1, $r2 :: (load (s64) from %ir.a, !tbaa !2)
CHECK-NEXT: }
```

Because the `mir-strip-debug` pass does not process bundled instructions,
running `llc -debugify-and-strip-all-safe` on the test produces the
following output:

```
BUNDLE {
  $r3 = LD $r1, $r2, debug-location !DILocation(line: 3, column: 1, scope: <0x608cb2b99b10>) :: (load (s64) from %ir.a, !tbaa !2)
}
```

The test then fails, although it should not.

The root cause appears to be that the `mir-strip-debug` pass does not remove
debug locations from bundled instructions.
2024-10-29 10:26:15 -07:00
Adam Yang
9a5b3a1bbc
[DXIL] Add GroupMemoryBarrierWithGroupSync intrinsic (#111884)
fixes #112974
partially fixes #70103

### Changes
- Added new tablegen based way of lowering dx intrinsics to DXIL ops.
- Added int_dx_group_memory_barrier_with_group_sync intrinsic in
IntrinsicsDirectX.td
- Added expansion for int_dx_group_memory_barrier_with_group_sync in
DXILIntrinsicExpansion.cpp
- Added DXIL backend test case

### Related PRs
* [[clang][HLSL] Add GroupMemoryBarrierWithGroupSync intrinsic
#111883](https://github.com/llvm/llvm-project/pull/111883)
* [[SPIRV] Add GroupMemoryBarrierWithGroupSync intrinsic
#111888](https://github.com/llvm/llvm-project/pull/111888)
2024-10-29 10:17:35 -07:00
Craig Topper
b1d0fe095b [RISCV] Remove trailing whitespace. NFC 2024-10-29 10:09:28 -07:00
Jubilee
f53889ffca
[RISCV] Allow crypto features to imply dependents (#112659)
This relationship is a logical dependency.

Note Zvbc and Zvknhb. They are explicitly called out in the spec as
requiring 64 bits:
-
56ed7952d1/doc/vector/riscv-crypto-spec-vector.adoc
2024-10-29 10:07:20 -07:00
SpencerAbson
2a9dd8af5a
[AArch64] Add assembly/disassembly for zeroing SVE FCVT{X} and BFCVT (#113916)
This patch adds assembly/disassembly support for the following SVE2.2
instructions

    - FCVT (zeroing)
    - FCVTX (zeroing)
    - BFCVT (zeroing)
    
In accordance with:
https://developer.arm.com/documentation/ddi0602/2024-09/SVE-Instructions
2024-10-29 16:55:19 +00:00
Fangrui Song
318bdd0aeb
[StackSafetyAnalysis] Bail out when calling ifunc
An assertion failure arises when a call instruction calls a GlobalIFunc.
Since we cannot reason about the underlying function, just bail out.

Fix #87923

Pull Request: https://github.com/llvm/llvm-project/pull/113841
2024-10-29 09:26:47 -07:00
Jorge Gorbe Moya
4df71ab78e
[SandboxIR] Add callbacks for instruction insert/remove/move ops (#112965) 2024-10-29 09:25:51 -07:00
Jay Foad
a156362e93
[AMDGPU] Fix machine verification failure after SIFoldOperandsImpl::tryFoldOMod (#113544)
Fixes #54201
2024-10-29 14:59:37 +00:00
Sarah Spall
75e7ba8c0b
[HLSL] Re-implement countbits with the correct return type (#113189)
Restricts hlsl countbits to always return a uint32.
Implements a lowering from llvm.ctpop which has an overloaded return
type to dxil cbits op which always returns uint32.
Closes #112779
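
The intended behaviour, a population count whose result is always 32 bits wide regardless of the operand width, can be sketched as below. The helper name `countBits` is hypothetical and this is not the lowering itself.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Population count that returns uint32_t for every operand width, unlike an
// overloaded operation whose result type follows the operand type.
template <typename T> uint32_t countBits(T Value) {
  static_assert(std::is_unsigned<T>::value, "expects an unsigned integer");
  uint32_t Count = 0;
  for (; Value != 0; Value &= Value - 1) // clear the lowest set bit
    ++Count;
  return Count;
}
```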
2024-10-29 07:56:05 -07:00
Shilei Tian
e268398fa8
[NFC][AMDGPU] Use !foreach to replace explicit list of registers (#114005) 2024-10-29 10:50:06 -04:00
Elvina Yakubova
80a09735ac
Revert "[clang][AArch64] Add getHostCPUFeatures to query for enabled … (#114066)
…features in cpu info (#97749)"

This reverts commit d732c0b13c55259177f2936516b6087d634078e0.

This is breaking buildbots
https://lab.llvm.org/buildbot/#/builders/190/builds/8413,
https://lab.llvm.org/buildbot/#/builders/56/builds/10880 and a few
others.
2024-10-29 14:43:01 +00:00
Momchil Velikov
b6a84e77b6
[AArch64] Add assembly/disassembly for FMOP4A (widening, 4-way) instructions (#113347)
The new instructions are described in
https://developer.arm.com/documentation/ddi0602/2024-09/SME-Instructions
2024-10-29 14:36:07 +00:00
neildhickey
d732c0b13c
[clang][AArch64] Add getHostCPUFeatures to query for enabled features in cpu info (#97749)
Add getHostCPUFeatures into the AArch64 Target Parser to query the 
cpuinfo for the device in the case where we are compiling with 
-mcpu=native.
Add LLVM_CPUINFO environment variable to test mock /proc/cpuinfo
files for -mcpu=native

Co-authored-by: Elvina Yakubova <eyakubova@nvidia.com>
2024-10-29 13:34:43 +00:00
Matt Arsenault
88e23eb2cf
DAG: Fix legalization of vector addrspacecasts (#113964) 2024-10-29 08:08:50 -05:00
Lukacma
3c2d77185e
[AARCH64] Add assembly/disassembly for FMMLA instructions (#113313)
This patch adds assembly/disassembly for the following instructions:
FMMLA (widening, FP16 to FP32)
FMMLA (widening, FP8 to FP16)
FMMLA (widening, FP8 to FP32)

According to [1]

[1]https://developer.arm.com/documentation/ddi0602
2024-10-29 13:02:46 +00:00
Hari Limaye
e19a5fc6d3
[FuncSpec] Improve accounting of specialization codesize growth (#113448)
Only accumulate the codesize increase of functions that are actually
specialized, rather than for every candidate specialization that we
analyse.

This fixes a subtle bug where prior analysis of candidate
specializations that were deemed unprofitable could prevent subsequent
profitable candidates from being recognised.
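
The effect of the accounting change can be sketched with a simplified model. The `Candidate` struct, the single growth budget, and the function name are assumptions made for illustration and do not reflect the actual FunctionSpecializer code.

```cpp
#include <cassert>
#include <vector>

struct Candidate {
  unsigned CodeSize; // estimated size of the specialized clone
  bool Profitable;   // outcome of the cost/benefit analysis
};

// Counts how many candidates end up specialized under a growth budget.
// ChargeAllAnalysed=true models the old accounting, where every analysed
// candidate consumed budget; false charges only actual specializations.
unsigned countSpecialized(const std::vector<Candidate> &Cands, unsigned Budget,
                          bool ChargeAllAnalysed) {
  unsigned Growth = 0, Specialized = 0;
  for (const Candidate &C : Cands) {
    if (Growth + C.CodeSize > Budget)
      continue; // over budget: candidate rejected
    if (ChargeAllAnalysed)
      Growth += C.CodeSize;
    if (!C.Profitable)
      continue;
    if (!ChargeAllAnalysed)
      Growth += C.CodeSize;
    ++Specialized;
  }
  return Specialized;
}
```

With a budget of 100, an unprofitable candidate of size 80 followed by a profitable one of size 60 yields no specialization under the old accounting and one under the new.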
2024-10-29 11:53:12 +00:00
Momchil Velikov
ec427df2b9
[AArch64] Add assembly/disassembly for FMOP4{A,S} (non-widening) half-precision instructions (#113343)
The new instructions are described in
https://developer.arm.com/documentation/ddi0602/2024-09/SME-Instructions
2024-10-29 11:50:29 +00:00
Jay Foad
2443549b85
[IR] Remove some uses of StructType::setBody. NFC. (#113685)
It is simple to create the struct body up front, now that we have
transitioned to opaque pointers.
2024-10-29 11:44:53 +00:00
Hari Limaye
06664fdc76
[FuncSpec] Enable SpecializeLiteralConstant by default (#113442)
Enable specialization on literal constant arguments by default in
Function Specialization.

---------

Co-authored-by: Alexandros Lamprineas <alexandros.lamprineas@arm.com>
2024-10-29 11:41:25 +00:00
Lukacma
98c8d64353
[AArch64] Add assembly/disassembly for BFSCALE instructions (#113538)
This patch adds assembly/disassembly for the following instructions:
   BFSCALE (multiple and single vector)
   BFSCALE (multiple vectors)

As specified in https://developer.arm.com/documentation/ddi0602/2024-09

Co-authored-by: Momchil Velikov <momchil.velikov@arm.com>
2024-10-29 11:08:36 +00:00
Benjamin Maxwell
c3260c65e8
[IR] Add llvm.sincos intrinsic (#109825)
This adds the `llvm.sincos` intrinsic, legalization, and lowering.

The `llvm.sincos` intrinsic takes a floating-point value and returns
both the sine and cosine (as a struct).

```
declare { float, float }          @llvm.sincos.f32(float  %Val)
declare { double, double }        @llvm.sincos.f64(double %Val)
declare { x86_fp80, x86_fp80 }    @llvm.sincos.f80(x86_fp80  %Val)
declare { fp128, fp128 }          @llvm.sincos.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 }  @llvm.sincos.ppcf128(ppc_fp128  %Val)
declare { <4 x float>, <4 x float> } @llvm.sincos.v4f32(<4 x float>  %Val)
```

The lowering is built on top of the existing FSINCOS ISD node, with
additional type legalization to allow for f16, f128, and vector values.
2024-10-29 10:52:20 +00:00
Rohit Aggarwal
dfb60bb919
Adding more vector calls for -fveclib=AMDLIBM (#109662)
AMD has its own implementation of vector calls.
New vector calls are introduced in the library for exp10, log10, sincos, and finite asin/acos.
Please refer to https://github.com/amd/aocl-libm-ose

---------

Co-authored-by: Rohit Aggarwal <Rohit.Aggarwal@amd.com>
2024-10-29 10:09:55 +00:00
CarolineConcatto
8d38fbf2f0
[LLVM][AArch64] Add assembly/disassembly for SVE Integer Unary Arithm… (#113670)
…etic Predicated instructions

This patch adds the following instructions:

SVE bitwise unary operations (predicated)
CLS, CLZ, CNT, CNOT, FABS, FNEG, NOT

SVE integer unary operations (predicated)
SXT{B,H,W}, UXT{B,H,W}, ABS, NEG

SVE2 integer unary operations (predicated)
URECPE, URSQRTE, SQABS, SQNEG

According to https://developer.arm.com/documentation/ddi0602

Co-authored-by: Spencer Abson <Spencer.Abson@arm.com>
2024-10-29 09:09:55 +00:00
CarolineConcatto
d4197f3ac1
[LLVM][AArch64] Add assembly/disassembly for MUL/BFMUL SME instructions (#113535)
According to https://developer.arm.com/documentation/ddi0602

Co-authored-by: Momchil-Velikov <Momchil.Velikov@arm.com>
2024-10-29 09:09:13 +00:00
Alex Bradbury
7544d3af0e
[RISCV] Mark RVB23U64 and RVB23S64 as non-experimental (#113918)
The specification was recently ratified

<https://github.com/riscv/riscv-profiles/blob/main/src/rvb23-profile.adoc>.
2024-10-29 07:57:34 +00:00
Craig Topper
3f4468faaa
[RISCV] Teach expandRV32ZdinxStore to handle memoperand not being present. (#113981)
I received a report that the outliner drops memoperands and causes this
code to crash. Handle this by only copying the memoperand if it exists.

Similar for expandRV32ZdinxLoad
2024-10-28 22:37:47 -07:00
NAKAMURA Takumi
828467a54e Fix warnings introduced in #111434 [-Wnontrivial-memaccess] 2024-10-29 14:18:24 +09:00
Craig Topper
635c344dfb
[X86] Add vector_compress patterns with a zero vector passthru. (#113970)
We can use the kz form to automatically zero the extra elements.

Fixes #113263.
2024-10-28 19:59:00 -07:00
Yingwei Zheng
18311093ab
[InstCombine] Do not fold shufflevector(select) if the select condition is a vector (#113993)
Since `shufflevector` is not element-wise, we cannot fold it into a
select when the select condition is a vector.
For a shufflevector that doesn't change the vector length, the fold doesn't
crash, but it is still a miscompilation: https://alive2.llvm.org/ce/z/s8saCx

Fixes https://github.com/llvm/llvm-project/issues/113986.
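
A small model of why the fold is only valid for a scalar condition, using fixed 4-lane vectors. The helper names are hypothetical, the shuffle is simplified to a single input vector, and a uniform condition vector stands in for a scalar `i1`.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<int, 4>;
using Mask4 = std::array<int, 4>;
using Cond4 = std::array<bool, 4>;

// Result lane I takes lane Mask[I] of the input vector.
Vec4 shuffle(const Vec4 &V, const Mask4 &Mask) {
  Vec4 R{};
  for (int I = 0; I < 4; ++I)
    R[I] = V[Mask[I]];
  return R;
}

// Lane-wise select: lane I comes from X if C[I] is set, otherwise from Y.
Vec4 vselect(const Cond4 &C, const Vec4 &X, const Vec4 &Y) {
  Vec4 R{};
  for (int I = 0; I < 4; ++I)
    R[I] = C[I] ? X[I] : Y[I];
  return R;
}
```

With a uniform condition, shuffling the select result equals selecting between the shuffled operands. With a per-lane condition such as `{true, false, false, false}` and a reversing mask, the two differ, because the shuffle moves lanes while the condition stays in place.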
2024-10-29 10:39:07 +08:00
c8ef
0c1c37bfbe
[TLI] Add support for the tgamma libcall. (#113791)
This patch adds the `tgamma` libcall.
2024-10-29 10:08:38 +08:00
Lang Hames
6128ff6630 [JITLink][MachO] Add convenience functions for default text/data sections.
The getMachODefaultTextSection and getMachODefaultRWDataSection functions
return the "__TEXT,__text" and "__DATA,__data" sections respectively, creating
empty sections if the default sections are not already present in the graph.
These functions can be used by utilities that want to add code or data to these
standard sections (e.g. these functions can be used to supply the section
argument to the createAnonymousPointerJumpStub and
createPointerJumpStubBlock functions in the various targets).
2024-10-28 18:05:40 -07:00
vporpo
a461869db3
[SandboxIR][Pass] Implement Analyses class (#113962)
The Analyses class provides a way to pass around commonly used Analyses
to SandboxIR passes through the `runOnFunction()` and `runOnRegion()`
functions.
2024-10-28 18:00:52 -07:00
Matt Arsenault
1ceccbb0dd
VirtRegRewriter: Add implicit register defs for live out undef lanes (#112679)
If an undef subregister def is live into another block, we need to
maintain a physreg def to track the liveness of those lanes. This
would manifest as a verifier error after branch folding, when the cloned
tail block use no longer had a def.

We need to detect interference with other assigned intervals to avoid
clobbering the undef lanes defined in other intervals, since the undef
def didn't count as interference. This is pretty ugly and adds a new
dependency on LiveRegMatrix, keeping it live for one more pass. It also
adds a lot of implicit operand spam (we really should have a better
representation for this).

There is a missing verifier check for this situation. Added an xfailed
test that demonstrates this. We may also be able to revert the changes
in 47d3cbcf842a036c20c3f1c74255cdfc213f41c2.

It might be better to insert an IMPLICIT_DEF before the instruction
rather than using the implicit-def operand.

Fixes #98474
2024-10-28 17:33:53 -07:00
Igor Kudrin
757d0e4764
Revert "[CFI][LowerTypeTests] Fix indirect call with alias" (#113978)
Reverts llvm/llvm-project#106185

This is breaking Sanitizer bots:
https://lab.llvm.org/buildbot/#/builders/66/builds/5449/steps/8/logs/stdio
2024-10-28 16:13:32 -07:00
David Majnemer
902acde341 [InstCombine] Optimize away certain additions using modular arithmetic
We can turn:
```
  %add = add i8 %arg, C1
  %and = and i8 %add, C2
  %cmp = icmp eq i8 %and, C3
```

into:
```
  %and = and i8 %arg, C2
  %cmp = icmp eq i8 %and, (C3 - C1) & C2
```

This is only worth doing if the sequence is the sole user of the addition
operation.
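
The equivalence can be checked exhaustively for i8. The message does not spell out the preconditions on the constants; the sketch below assumes C2 is a low-bit mask (2^k - 1) and C3 has no bits outside C2, in which case both forms test the same congruence modulo 2^k. `foldIsSound` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// Checks, for every 8-bit Arg, that
//   ((Arg + C1) & C2) == C3   <=>   (Arg & C2) == ((C3 - C1) & C2).
bool foldIsSound(uint8_t C1, uint8_t C2, uint8_t C3) {
  for (unsigned Arg = 0; Arg < 256; ++Arg) {
    uint8_t Sum = static_cast<uint8_t>(Arg + C1);  // wraps like an i8 add
    uint8_t Diff = static_cast<uint8_t>(C3 - C1);  // wraps like an i8 sub
    bool Before = (Sum & C2) == C3;
    bool After = (Arg & C2) == (Diff & C2);
    if (Before != After)
      return false;
  }
  return true;
}
```

For example, the check passes for C1 = 5, C2 = 0x0F, C3 = 3, but fails for C1 = 1, C2 = 0x02, C3 = 0x02, where C2 is not a low-bit mask and a carry out of bit 0 can change the masked bit.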
2024-10-28 22:51:35 +00:00
Matthias Braun
5903c6af44
InstCombine: Fold shufflevector(select) and shufflevector(phi) (#113746)
- Transform `shufflevector(select(c, x, y), C)` to
  `select(c, shufflevector(x, C), shufflevector(y, C))` by re-using
  the `FoldOpIntoSelect` helper.
- Transform `shufflevector(phi(x, y), C)` to
  `phi(shufflevector(x, C), shufflevector(y, C))` by re-using the
  `foldOpIntoPhi` helper.
2024-10-28 15:35:17 -07:00
vporpo
bf4b31ad54
[SandboxVec][Legality] Check Fastmath flags (#113967) 2024-10-28 15:32:20 -07:00
vporpo
5ea694816b
[SandboxVec][Legality] Check opcodes and types (#113741) 2024-10-28 14:05:58 -07:00
joaosaffran
481bce018e
Adding splitdouble HLSL function (#109331)
- Adding hlsl `splitdouble` intrinsics
- Adding DXIL lowering
- Adding SPIRV lowering
- Adding test
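
For context, a sketch of the operation on the host side, assuming `splitdouble` reinterprets the 64-bit pattern of a double as a low and a high 32-bit unsigned half (this message does not spell out the semantics). `splitDouble` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Splits the 64-bit pattern of a double into its low and high 32-bit halves.
void splitDouble(double D, uint32_t &Low, uint32_t &High) {
  uint64_t Bits;
  static_assert(sizeof(Bits) == sizeof(D), "expects a 64-bit double");
  std::memcpy(&Bits, &D, sizeof(Bits)); // bit-exact copy, no conversion
  Low = static_cast<uint32_t>(Bits);
  High = static_cast<uint32_t>(Bits >> 32);
}
```

For an IEEE 754 double, 1.0 splits into a low half of 0 and a high half of 0x3FF00000.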

Fixes: #108901

---------

Co-authored-by: Joao Saffran <jderezende@microsoft.com>
2024-10-28 13:26:59 -07:00