6047 Commits

Author SHA1 Message Date
Kazu Hirata
5b5b241edf
[TableGen] Avoid repeated hash lookups (NFC) (#120619) 2024-12-19 13:02:55 -08:00
Kazu Hirata
b0a4b5b35a
[TableGen] Avoid repeated hash lookups (NFC) (#120532) 2024-12-19 08:00:02 -08:00
Sergei Barannikov
d3750412aa
[TableGen][GISel] Improve dead register handling (#120426)
A dead implicit def wasn't marked as dead if it is also an implicit use.
The new approach should also be more straightforward and simplifies
future changes for supporting optional defs and physical register defs.

Pull Request: https://github.com/llvm/llvm-project/pull/120426
2024-12-18 18:58:37 +03:00
Sergei Barannikov
1941f34172
[TableGen][GISel] Import more "multi-level" patterns (#120332)
Previously, if the destination DAG has an untyped leaf, we would import
the pattern only if that leaf is defined by the *top-level* source DAG.
This is an unnecessary restriction.

Here is an example of such pattern:
```
def : Pat<(add (mul v8i16:$vA, v8i16:$vB), v8i16:$vC),
          (VMLADDUHM $vA, $vB, $vC)>;
```

Previously, it failed to import because `add` doesn't define neither
`$vA` nor `$vB`.

This change reduces the number of skipped patterns as follows:

```
AArch64: 8695 ->  8548 (-147)
AMDGPU: 11333 -> 11240 (-93)
ARM:     4297 ->  4278 (-1)
PowerPC: 3955 ->  3010 (-945)
```

Other GISel-enabled targets are unaffected.
2024-12-18 14:44:55 +03:00
Sergei Barannikov
cf4375d107
[TableGen][GISel] Extract common function for determining MI's regclass (#120135)
Add some comments that hopefully clarify a few things.

This was supposed to be NFC, but there is a difference in the inferred
register class for EXTRACT_SUBREG.

Pull Request: https://github.com/llvm/llvm-project/pull/120135
2024-12-17 18:03:22 +03:00
Sergei Barannikov
73eecb70c2
[TableGen][GISel] Don't use std::optional with pointers (NFC) (#120026)
Pointers already have a well-defined null value.
2024-12-16 04:45:06 +03:00
Sergei Barannikov
97c3c32372
[TableGen][SystemZ] Correctly check the range of a leaf immediate (#119931)
The "Size >= 32" check probably dates back to when TableGen integers
were 32-bit. Delete it and simplify code by using `isInt`/`isUInt`.
2024-12-14 13:58:23 +03:00
Sergei Barannikov
d1f51c67fd
[TableGen] Add TreePatternNode::children and use it in for loops (NFC) (#119877) 2024-12-13 22:05:57 +03:00
Sergei Barannikov
c9070cce09
[TableGen] Allow empty terminator in SequenceToOffsetTable (#119751)
Some clients do not want to emit a terminator after each sub-sequence
(they have other means of determining the length of sub-sequences).

This moves `Term` argument from `emit` method to the constructor and
makes it optional. It couldn't be made optional while still on the
`emit` method because if the terminator wasn't specified, it has to be
taken into account in `layout` method as well.

The fact that `layout` method was called is now recorded in a dedicated
member variable, `IsLaidOut`. `Entries != 0` can no longer be used to
reliably check if `layout` method was called because it may be zero for
a different reason: the terminator wasn't specified and all added
sequences (if any) were empty.

This reduces the size of `*LaneMaskLists` and `*SubRegIdxLists` a bit
and resolves the removed TODO.
2024-12-13 19:55:11 +03:00
Chandler Carruth
dd647e3e60
Rework the Option library to reduce dynamic relocations (#119198)
Apologies for the large change, I looked for ways to break this up and
all of the ones I saw added real complexity. This change focuses on the
option's prefixed names and the array of prefixes. These are present in
every option and the dominant source of dynamic relocations for PIE or
PIC users of LLVM and Clang tooling. In some cases, 100s or 1000s of
them for the Clang driver which has a huge number of options.

This PR addresses this by building a string table and a prefixes table
that can be referenced with indices rather than pointers that require
dynamic relocations. This removes almost 7k dynmaic relocations from the
`clang` binary, roughly 8% of the remaining dynmaic relocations outside
of vtables. For busy-boxing use cases where many different option tables
are linked into the same binary, the savings add up a bit more.

The string table is a straightforward mechanism, but the prefixes
required some subtlety. They are encoded in a Pascal-string fashion with
a size followed by a sequence of offsets. This works relatively well for
the small realistic prefixes arrays in use.

Lots of code has to change in order to land this though: both all the
option library code has to be updated to use the string table and
prefixes table, and all the users of the options library have to be
updated to correctly instantiate the objects.

Some follow-up patches in the works to provide an abstraction for this
style of code, and to start using the same technique for some of the
other strings here now that the infrastructure is in place.
2024-12-11 15:44:44 -08:00
Sergei Barannikov
6b2232606d
[TableGen] Replace WantRoot/WantParent SDNode properties with flags (#119599)
These properties are only valid on ComplexPatterns. Having them as flags
is more convenient because one can now use "let = ... in" syntax to set
these flags on several patterns at a time. This is also less error-prone
as it makes it impossible to specify these properties on records derived
from SDPatternOperator.

Pull Request: https://github.com/llvm/llvm-project/pull/119599
2024-12-12 00:41:44 +03:00
Owen Anderson
6f3f08abdc
CodeGen: Eliminate dynamic relocations in the register superclass tables. (#119487)
This reapplies #119122 with a fix for UBSAN errors in the X86 backend
related
to incrementing a nullptr.
2024-12-12 10:17:32 +13:00
Owen Anderson
e940353fd2 Revert "CodeGen: Eliminate dynamic relocations in the register superclass tables. (#119122)"
Reverting due to UBSan failures in X86RegisterInfo::getLargestLegalSuperClass

This reverts commit c4873819a98f59ce4e2664f94c73c2dfec3393f8.
2024-12-11 13:45:17 +13:00
Owen Anderson
c4873819a9
CodeGen: Eliminate dynamic relocations in the register superclass tables. (#119122) 2024-12-11 12:36:51 +13:00
Jon Roelofs
b6c22a4e58
Add processor aliases back to -print-supported-cpus and -mcpu=help (#118581)
They were accidentally dropped in
https://github.com/llvm/llvm-project/pull/96249

rdar://140853882
2024-12-09 09:18:31 -08:00
Chandler Carruth
f0297ae552
Switch the intrinsic names to a string table (#118929)
This avoids the need to dynamically relocate each pointer in the table.

To make this work, this PR also moves the binary search of intrinsic
names to an internal function with an adjusted signature, and switches
the unittesting to test against actual intrinsics.
2024-12-07 17:53:59 -08:00
Sam Elliott
73731d6873
[llvm-tblgen] Increase Coverage Index Size (#118329) 2024-12-04 09:19:13 +00:00
Mason Remy
0c6457b781
[LLVM][TableGen] Refine overloaded intrinsic suffix check (#117957)
Previously the check comments indicated that [pi][0-9]+ would match as a
type suffix, however the check itself was looking for [pi][0-9]* and
hence an 'i' suffix in isolation was being considered as a type suffix
despite it not having a bitwidth.

This change makes the check consistent with the comment and looks for
[pi][0-9]+
2024-12-03 13:33:15 -05:00
Adam Yang
0a44b24d66
[DXIL] Add GroupMemoryBarrierWithGroupSync intrinsic (#114349)
fixes #112974
partially fixes #70103

An earlier version of this change was reverted so some issues could be fixed.

### Changes
- Added new tablegen based way of lowering dx intrinsics to DXIL ops.
- Added int_dx_group_memory_barrier_with_group_sync intrinsic in
IntrinsicsDirectX.td
- Added expansion for int_dx_group_memory_barrier_with_group_sync in
DXILIntrinsicExpansion.cpp`
- Added DXIL backend test case

### Related PRs
* [[clang][HLSL] Add GroupMemoryBarrierWithGroupSync intrinsic
#111883](https://github.com/llvm/llvm-project/pull/111883)
* [[SPIRV] Add GroupMemoryBarrierWithGroupSync intrinsic
#111888](https://github.com/llvm/llvm-project/pull/111888)
2024-12-01 22:31:40 -08:00
Jinsong Ji
2e30df740e
[TableGen] Fix validateOperandClass for non Phyical Reg (#118146)
https://github.com/llvm/llvm-project/commit/b71704436e61
Rewrote the register operands handling,
but the Table only contains physical regs, we will SEGV when there are
non physical regs.

---------

Co-authored-by: Sergei Barannikov <barannikov88@gmail.com>
2024-11-30 12:07:15 -05:00
Jay Foad
89b08c8ee7
[TableGen] Simplify generated code for isSubclass (#117351)
Implement isSubclass with direct lookup into some tables instead of
nested switches.

Part of the motivation for this is improving compile time when clang-18
is used as a host compiler, since it seems to have trouble with very
large switch statements.
2024-11-28 08:52:02 +00:00
Jay Foad
b71704436e
[TableGen] Simplify generated code for validateOperandClass (#117889)
Implement the register operand handling in validateOperandClass with a
table lookup instead of a potentially huge switch.

Part of the motivation for this is improving compile time when clang-18
is used as a host compiler, since it seems to have trouble with very
large switch statements.
2024-11-27 16:49:35 +00:00
Sander de Smalen
318c69de52 Reland "[AArch64] Define high bits of FPR and GPR registers (take 2) (#114827)"
The issue with slow compile-time was caused by an assert in
AArch64RegisterInfo.cpp. The assert invokes 'checkAllSuperRegsMarked'
after adding all the reserved registers. This call gets very expensive
after adding the _HI registers due to the way the function searches
in the 'Exception' list, which is expected to be a small list but isn't
(the patch added 190 _HI regs).

It was possible to rewrite the code in such a way that the _HI registers
are marked as reserved after the check. This makes the problem go away
entirely and restores compile-time to what it was before (tested for
`check-runtimes`, which previously showed a ~5x slowdown).

This reverts commits:
  1434d2ab215e3ea9c5f34689d056edd3d4423a78
  2704647fb7986673b89cef1def729e3b022e2607
2024-11-27 13:31:59 +00:00
Jay Foad
535247841d
[TableGen] Remove comments from generated validateOperandClass (#117352)
This generated comments like:

  // 'BoolReg' class
  case MCK_BoolReg: {

which seem redundant because the name is always repeated on the next
line as part of the MCK_ enumerator.
2024-11-25 12:11:01 +00:00
Vitaly Buka
1434d2ab21
Revert "[AArch64] Define high bits of FPR and GPR registers (take 2) (#114827)" (#117307)
Details in #114827

This reverts commit c1c68baf7e0fcaef1f4ee86b527210f1391b55f6.
2024-11-22 11:48:25 -08:00
Simon Pilgrim
29f11f0a32
[X86] Add missing reg/imm attributes to VRNDSCALES instruction names (#117203)
More canonicalization of the instruction names to make the predictable - more closely matches VRNDSCALEP / VROUND equivalent instructions
2024-11-22 17:45:30 +00:00
Jay Foad
285754d799 [TableGen] Fix closing brace indentation in validateOperandClass 2024-11-22 17:42:39 +00:00
Pengcheng Wang
4da960b898 [RISCV] Add mvendorid/marchid/mimpid to CPU definitions (#116202)
We can get these information via `sys_riscv_hwprobe`.

This can be used to implement `__builtin_cpu_is`.
2024-11-22 22:58:54 +08:00
Mikhail Goncharov
d1dae1e861 Revert "[RISCV] Add mvendorid/marchid/mimpid to CPU definitions (#116202)" chain
This reverts commit b36fcf4f493ad9d30455e178076d91be99f3a7d8.
This reverts commit c11b6b1b8af7454b35eef342162dc2cddf54b4de.
This reverts commit 775148f2367600f90d28684549865ee9ea2f11be.

multiple bot build breakages, e.g. https://lab.llvm.org/buildbot/#/builders/3/builds/8076
2024-11-22 14:09:13 +01:00
Pengcheng Wang
775148f236
[RISCV] Add mvendorid/marchid/mimpid to CPU definitions (#116202)
We can get these information via `sys_riscv_hwprobe`.

This can be used to implement `__builtin_cpu_is`.
2024-11-22 19:54:45 +08:00
abhishek-kaushik22
31ce47b5d6
[TableGen] Use std::move to avoid copy (#113061) 2024-11-21 11:48:46 -08:00
abhishek-kaushik22
cec52960fd
[TableGen] Use const auto& instead of auto to avoid copy (#113053) 2024-11-21 18:52:47 +01:00
Simon Pilgrim
3a5cf6d99b
[X86] Rename AVX512 VEXTRACT/INSERT??x? to VEXTRACT/INSERT??X? (#116826)
Use uppercase in the subvector description ("32x2" -> "32X4" etc.) - matches what we already do in VBROADCAST??X?, and we try to use uppercase for all x86 instruction mnemonics anyway (and lowercase just for the arg description suffix).
2024-11-20 08:25:01 +00:00
Yingwei Zheng
c727b48287
[SDAG][ISel][TableGen][LoongArch] Report error for trivial bitcasts when there are predicate calls (#116075)
On loongarch64 with lsx extension, we select `VBITREV_W` for `v4i32 (xor
X, (shl splat(1), Y))`:

8e66303916/llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td (L1583-L1584)

And `vsplat_imm_eq_1` is defined as:

8e66303916/llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td (L77-L87)

For the `(bitconvert (v4i32 (build_vector)))` case, the pattern is
expected to be:
```
PATTERN: (xor:{ *:[v4i32] } v4i32:{ *:[v4i32] }:$vj, (shl:{ *:[v4i32] } (bitconvert:{ *:[v4i32] } (build_vector:{ *:[v4i32] }))<<P:Predicate_vsplat_imm_eq_1>>, v4i32:{ *:[v4i32] }:$vk))
RESULT:  (VBITREV_W:{ *:[v4i32] } v4i32:{ *:[v4i32] }:$vj, v4i32:{ *:[v4i32] }:$vk)
```

However, `simplifyTree` drops the `bitconvert` node and its predicates:

8e66303916/llvm/utils/TableGen/Common/CodeGenDAGPatterns.cpp (L3036-L3062)

Then llvm will match `vsplat_imm_eq_1` for any v4i32 splats and cause a
miscompilation:
```
PATTERN: (xor:{ *:[v4i32] } v4i32:{ *:[v4i32] }:$vj, (shl:{ *:[v4i32] } (build_vector:{ *:[v4i32] }), v4i32:{ *:[v4i32] }:$vk))
RESULT:  (VBITREV_W:{ *:[v4i32] } v4i32:{ *:[v4i32] }:$vj, v4i32:{ *:[v4i32] }:$vk)
```

This patch adds additional checks for predicates associated with the
trivial bitconvert node. Unused patterns in the LoongArch target are
also removed.

Fixes https://github.com/llvm/llvm-project/issues/116008.
2024-11-19 21:24:40 +08:00
Craig Topper
92ffefe351
[Tablegen] Add more comments for result numbers to DAGISelEmitter.cpp (#116533)
Print what result number the Emit* nodes are storing their results in.
This makes it easy to track the inputs of later opcodes that consume
these results.
2024-11-17 19:11:28 -08:00
Sander de Smalen
c1c68baf7e
[AArch64] Define high bits of FPR and GPR registers (take 2) (#114827)
This is a step towards enabling subreg liveness tracking for AArch64,
which requires that registers are fully covered by their subregisters,
as covered here #109797.

There are several changes in this patch:

* AArch64RegisterInfo.td and tests: Define the high bits like B0_HI,
H0_HI, S0_HI, D0_HI, Q0_HI. Because the bits must be defined by some
register class, this added a register class which meant that we had to
update 'magic numbers' in several tests.

The use of ComposedSubRegIndex helped 'compress' the number of bits
required for the lanemask. The correctness of the masks is tested by an
explicit unit tests.

* LoadStoreOptimizer: previously 'HasDisjunctSubRegs' was only true for
register tuples, but with this change to describe the high bits, a
register like 'D0' will also have 'HasDisjunctSubRegs' set to true
(because it's fullly covered by S0 and S0_HI). The fix here is to
explicitly test if the register class is one of the known D/Q/Z tuples.
2024-11-14 09:09:13 +00:00
Rahul Joshi
39a8046b73
[NFC][TableGen] Use formatv automatic index in AsmWriterEmitter (#115966)
Use formatv automatic index assignment in AsmWriterEmitter.
2024-11-13 09:37:11 -08:00
Kazu Hirata
4048c64306
[llvm] Remove redundant control flow statements (NFC) (#115831)
Identified with readability-redundant-control-flow.
2024-11-12 10:09:42 -08:00
Alexandros Lamprineas
3cc852ece4
[FMV][AArch64] Expand feature dependencies using AArch64::ExtensionSet. (#113281)
Currently we maintain a hand written list of subtarget features which we
are implied for a given FMV feature. It is more robust to expand such
dependencies using ExtensionDependency from TargetParser, since that is
generated by tablegen. For this to work each FMV feature must have a
corresponding SubtargetFeature in place. FMV features which didn't
satisfy this criteria have been removed from the ACLE specification
(https://github.com/ARM-software/acle/pull/315). However, I deliberately
marked the ArchExtKind in FMVInfo structure as std::optional in case we
decide to break this rule in the future.

I have also added the missing dependencies:
 * FEAT_DPB2 -> FEAT_DPB
 * FEAT_FlagM2 -> FEAT_FlagM
2024-11-12 16:01:35 +00:00
Kazu Hirata
b9fb6b6cb5
[llvm] Migrate away from PointerUnion::{is,get,dyn_cast} (NFC) (#115681)
Note that PointerUnion::{is,get,dyn_cast} have been soft deprecated in
PointerUnion.h:

  // FIXME: Replace the uses of is(), get() and dyn_cast() with
  //        isa<T>, cast<T> and the llvm::dyn_cast<T>
2024-11-11 00:00:07 -08:00
Craig Topper
09b372aa60
[GISel][AArch64][RISCV] Allow G_SEXT_INREG patterns to be imported. (#115576)
SelectionDAG uses VTSDNode to store the extension type. GlobalISel uses
a literal constant operand.

For vectors, SelectionDAG uses a type with the same number of elements
as other operand of the sext_inreg. I assume for GISel we would just use
the scalar size.
2024-11-08 22:26:56 -08:00
Sergei Barannikov
501a583441
[TableGen][SelectionDAG] Remove the implicit DAG node (#115295)
The node was introduced in 59c39dc1 and was intended to allow writing
patterns like this:
`[(set AL, (mul AL, GR8:$src1)), (implicit EFLAGS)]`

However, it does not introduce new functionality because the same
pattern can be equivalently expressed as:
`[(set AL, EFLAGS, (mul AL, GR8:$src1))]`

The latter form is also more flexible as it allows reordering output
operands.

In most places uses of `implicit` were redundant -- removing them didn't
change anything in the generated DAG tables. The only three cases where
it did have effect are in X86InstrArithmetic.td and X86InstrSystem.td --
those were rewritten to use `set` node.

Removing `implicit` from some patterns made them importable by GISel,
hence the change in a test.
2024-11-09 07:25:40 +03:00
Sergei Barannikov
3bdd71137e
[TableGen][GISel] Extract helper function for constraining operands (#115148)
As a side effect, this fixes COPY_TO_REGCLASS not being constrained
if it is not top-level (the reason for changes in tests).
2024-11-07 07:16:54 +03:00
Sander de Smalen
ae0ab24862
[TableGen] Fix calculation of Lanemask for RCs with artificial subregs. (#114392)
TableGen builds up a map of "SubRegIdx -> Subclass" where Subclass is
the largest class where all registers have SubRegIdx as a sub-register.
When SubRegIdx (vis-a-vis the sub-register) is artificial it should
still include it in the map. This map is used in various places,
including in the calculation of the Lanemask of a register class, which
otherwise calculates an incorrect lanemask.
2024-11-04 16:10:50 +00:00
Sander de Smalen
9a211fe7e4
[TableGen] Fix concatenation of subreg and artificial subregs (#114391)
When CoveredBySubRegs is true and a sub-register consists of two
parts; a regular subreg and an artificial subreg, then TableGen
should consider only concatenating the non-artificial subregs. 
For example, S0_S1 is a concatenated subreg from D0_D1,
but S0_S1_HI should not be considered.
2024-11-04 15:51:19 +00:00
Jay Foad
f8559751fc
[llvm-project] Fix typo "propogate" (#114795) 2024-11-04 15:33:19 +00:00
abhishek-kaushik22
a58c3d3ac7
Use std::move to avoid copy (#113055) 2024-11-04 07:28:00 -08:00
Sergei Barannikov
eeb987f6f3
[MC] Make generated MCInstPrinter::getMnemonic const (NFC) (#114682)
The value returned from the function depends only on the instruction opcode.

As a drive-by, change the type of the argument to const-reference.
2024-11-03 20:37:26 +03:00
Phoebe Wang
c72a751dab
[X86][AMX] Support AMX-TRANSPOSE (#113532)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
2024-11-01 16:45:03 +08:00
Fangrui Song
9bb5af8a42 [TableGen] Replace StringRef::slice with substr. NFC 2024-10-30 22:27:12 -07:00