797 Commits

Author SHA1 Message Date
Mahesh-Attarde
e61a7dc256
[X86][AVX512] Use comx for compare (#113567)
We added AVX10.2 COMEF ISA in LLVM, This does not optimize correctly in
scenario mentioned below.
Summary
Input 
```
define i1 @oeq(float %x, float %y) {
    %1 = fcmp oeq float %x, %y
    ret i1 %1
}define i1 @une(float %x, float %y) {
    %1 = fcmp une float %x, %y
    ret i1 %1
}define i1 @ogt(float %x, float %y) {
    %1 = fcmp ogt float %x, %y
    ret i1 %1
}
// Prior AVX10.2, default code generation

oeq:                                    # @oeq
        cmpeqss xmm0, xmm1
        movd    eax, xmm0
        and     eax, 1
        ret
une:                                    # @une
        cmpneqss        xmm0, xmm1
        movd    eax, xmm0
        and     eax, 1
        ret
ogt:                                    # @ogt
        ucomiss xmm0, xmm1
        seta    al
        ret 
```

This patch will remove `cmpeqss` and `cmpneqss`. For complete transform
check unit test.

Continuing on what PR https://github.com/llvm/llvm-project/pull/113098
added

Earlier Legalization and combine expanded `setcc oeq:ch` node into `and`
and `setcc eq` , `setcc o`. From suggestions in community
new internal transform
```
Optimized type-legalized selection DAG: %bb.0 'hoeq:'
SelectionDAG has 11 nodes:
  t0: ch,glue = EntryToken
      t2: f16,ch = CopyFromReg t0, Register:f16 %0
      t4: f16,ch = CopyFromReg t0, Register:f16 %1
    t14: i8 = setcc t2, t4, setoeq:ch
  t10: ch,glue = CopyToReg t0, Register:i8 $al, t14
  t11: ch = X86ISD::RET_GLUE t10, TargetConstant:i32<0>, Register:i8 $al, t10:1

Optimized legalized selection DAG: %bb.0 'hoeq:'
SelectionDAG has 12 nodes:
  t0: ch,glue = EntryToken
        t2: f16,ch = CopyFromReg t0, Register:f16 %0
        t4: f16,ch = CopyFromReg t0, Register:f16 %1
      t15: i32 = X86ISD::UCOMX t2, t4
    t17: i8 = X86ISD::SETCC TargetConstant:i8<4>, t15
  t10: ch,glue = CopyToReg t0, Register:i8 $al, t17
  t11: ch = X86ISD::RET_GLUE t10, TargetConstant:i32<0>, Register:i8 $al, t10:1
```
Earlier transform is mentioned here
https://github.com/llvm/llvm-project/pull/113098#discussion_r1810307663

---------

Co-authored-by: mattarde <mattarde@intel.com>
2024-10-30 16:17:25 +08:00
Rahul Joshi
a18af41c20
[LLVM] Change error messages to start with lower case (#113748)
Change LLVM Asm and TableGen Lexer/Parser error messages to begin with
lower case.
2024-10-29 12:26:33 -07:00
Freddy Ye
5aa1275d03
[X86] Support SM4 EVEX version intrinsics/instructions. (#113402)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
2024-10-28 10:46:16 +08:00
Abhina Sree
9d88543301
[AIX] Use internal lit shell for TableGen instead of a global setting (#113627)
This is to address the latest lit regressions
https://lab.llvm.org/buildbot/#/builders/64/builds/1285 caused by using
the internal lit shell. This change will limit using the internal lit
shell to TableGen on AIX so we do not hit these regressions.
2024-10-25 13:06:02 -04:00
Daniel Paoliello
7eb8238a32
[TableGen] Handle Windows line endings in x86-fold-tables.td test (#112997)
The x86-fold-tables.td has been failing for me and [in
CI](https://buildkite.com/llvm-project/github-pull-requests/builds/111277#0192a122-c5c9-4e4e-bc5b-7532fec99ae4)
if Git happens to decide to check out the baseline file with Windows
line endings.

This fix for this is to add the `--strip-trailing-cr` option to diff to
normalize the line endings before comparing them.
2024-10-21 09:58:59 -07:00
JL2210
8f6d4913bb
[llvm][TableGen] Count implicit defs as well as explicit ones in the GlobalISel TableGen emitter (#112673)
`NumDefs` only counts the number of registers in `(outs)`, not any
implicit defs specified with `Defs = [...]`

This causes patterns with physical register defs to fail to import here
instead of later where implicit defs are rendered.

Add on `ImplicitDefs.size()` to count both and create `DstExpDefs` to
count only explicit defs, used later on.
2024-10-18 10:50:44 +01:00
Rahul Joshi
2a0073f6b5
[LLVM][TableGen] Check overloaded intrinsic mangling suffix conflicts (#110324)
Check name conflicts between intrinsics caused by mangling suffix.

If the base name of an overloaded intrinsic is a proper prefix of
another intrinsic, check if the other intrinsic name suffix after the
proper prefix can match a mangled type and issue an error if it can.
2024-10-15 08:15:57 -07:00
Ying Yi
e06e493252 Make a tablegen test match-table.td more robust.
Some organizations have added operators downstream, and the test match-table.td tends to fail with off-by-n errors (with n being the number of `added operators`) periodically. This patch will increase the test robust and reduce the impact of merge process.
2024-10-08 13:16:06 +01:00
Rahul Joshi
e31e6f259e
[TableGen] Print assert message inline with assert failure (#111184)
Print assert message after the "assertion failed" message instead of
printing it as a separate note. This makes the assert failure reporting
less verbose and also more useful to see the failure message inline with
the "assertion failed" message.
2024-10-04 13:21:50 -07:00
Rahul Joshi
04540fac5b
[TableGen] Print record location when record asserts fail (#111029)
When record assertions fail, print an error message with the record's
location, so it's easier to see where the record that caused the assert
to fail was instantiated. This is useful when the assert condition in a
class depends on a template parameter, so we need to know the context of
the definition to determine why the assert failed.

Also enhanced the assert.td test to check for these context messages,
and also add checks for some assert failures that were missing in the
test.
2024-10-04 05:45:45 -07:00
Rahul Joshi
876f661dbe
[LIT] Rename substitution %basename_s to %{s:basename} (#111062)
Also added `%{t:stem}` as an alias for `%basename_t` and modified unit
test to test these new substitutions.
2024-10-03 18:18:10 -07:00
Rahul Joshi
6f20c3099e
[LIT] Add support for %basename_s to get base name of source file (#110993)
Add support for `%basename_s` pattern in the RUN commands to get the
base name of the source file, and adopt it in a TableGen LIT test.
2024-10-03 12:29:11 -07:00
Rahul Joshi
241f93658a
[TableGen] Fix source location for anonymous records (#110935)
Fix source location for anonymous records to be the one of the locations
where that record is instantiated as opposed to the location of the
class that was anonymously instantiated.

Currently, when a record is anonymously instantiated (via
`VarDefInit::instantiate`), we use the location of the class for the
record, which is not correct. Instead, pass in the `SMLoc` for the
location where the anonymous instantiation happens and use that location
when the record is instantiated. If there are multiple anonymous
instantiations with the same parameters, the location for the (single)
record created will be one of these instantiation locations as opposed
to the class location.
2024-10-03 06:16:56 -07:00
Rahul Joshi
0de0354aa8
[LLVM][TableGen] Decrease code size of Intrinsic::getAttributes (#110573)
Decrease code size of `Intrinsic::getAttributes` function by uniquing
the function and argument attributes separately and using the
`IntrinsicsToAttributesMap` to store argument attribute ID in low 8 bits
and function attribute ID in upper 8 bits.

This reduces the number of cases to handle in the generated switch from
368 to 131, which is ~2.8x reduction in the number of switch cases.

Also eliminate the fixed size array `AS` and `NumAttrs` variable, and
instead call `AttributeList::get` directly from each case, with an
inline array of the <index, AttribueSet> pairs.
2024-10-01 09:08:47 -07:00
Stephen Chou
ec61311e77
[LLVM][TableGen] Support type casts of nodes with multiple results (#109728)
Currently, type casts can only be used to pattern match for intrinsics
with a single overloaded return value. For instance:
```
def int_foo : Intrinsic<[llvm_anyint_ty], []>;
def : Pat<(i32 (int_foo)), ...>;
```

This patch extends type casts to support matching intrinsics with
multiple overloaded return values. As an example, the following defines
a pattern that matches only if the overloaded intrinsic call returns an
`i16` for the first result and an `i32` for the second result:
```
def int_bar : Intrinsic<[llvm_anyint_ty, llvm_anyint_ty], []>;
def : Pat<([i16, i32] (int_bar)), ...>;
```
2024-10-01 11:17:00 +04:00
Rahul Joshi
2f43e65955
[LLVM][TableGen] Check name conflicts between target dep and independent intrinsics (#109826)
Validate that for target independent intrinsics the second dotted
component of their name (after the `llvm.`) does not match any existing
target names (for which atleast one intrinsic has been defined). Doing
so is invalid as LLVM will search for that intrinsic in that target's
intrinsic table and not find it, and conclude that its an unknown
intrinsic.
2024-09-25 12:01:17 -07:00
Rahul Joshi
ec31f76df1
[NFC] Fix line endings for OptionStrCmp.h and .td test files (#109806)
Fix line endings for these files to Unix style.
2024-09-24 12:24:17 -07:00
Rahul Joshi
66c8dce82e
[TableGen] Add a !listflatten operator to TableGen (#109346)
Add a !listflatten operator that will transform an input list of type
`list<list<X>>` to `list<X>` by concatenating elements of the
constituent lists of the input argument.
2024-09-24 06:01:34 -07:00
Thomas Fransham
283c2c8800
[TableGen] Add explicit symbol visibility macros to code generated (#107873)
Update llvm's TableGen to emit new explicit symbol visibility macros I
added in https://github.com/llvm/llvm-project/pull/96630 to the function
declarations it creates
The generated functions need to be exported from llvm's shared library
for Clang and some OpenMP tests. @compnerd
2024-09-19 12:42:46 -07:00
Rahul Joshi
e0458a24a1
[LLVM][TableGen] Add error check for duplicate intrinsic names (#109226)
Check for duplicate intrinsic names in the intrinsic emitter backend and
issue a fatal error if we find one.
2024-09-19 05:21:00 -07:00
Mahesh-Attarde
311e4e3245
[X86][AVX10.2] Support AVX10.2 MOVZXC new Instructions. (#108537)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965

Chapter 14 INTEL® AVX10 ZERO-EXTENDING PARTIAL VECTOR COPY INSTRUCTIONS

---------

Co-authored-by: mattarde <mattarde@intel.com>
2024-09-18 21:01:51 +08:00
Mahesh-Attarde
f5ad9e1ca5
[X86][AVX10.2] Support AVX10.2-COMEF new instructions. (#108063)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965

Chapter 8  AVX10 COMPARE SCALAR FP WITH ENHANCED EFLAGS INSTRUCTIONS

---------

Co-authored-by: mattarde <mattarde@intel.com>
2024-09-18 17:55:29 +08:00
David Green
feac761f37
[GlobalISel][AArch64] Add G_FPTOSI_SAT/G_FPTOUI_SAT (#96297)
This is an implementation of the saturating fp to int conversions for
GlobalISel. On AArch64 the converstion instrctions work this way,
producing saturating results. LegalizerHelper::lowerFPTOINT_SAT is
ported from SDAG.

AArch64 has a lot of existing tests for fptosi_sat, covering a wide
range of types. I have tried to make most of them work all at once, but
a few fall back due to other missing features such as f128 handling for
min/max.
2024-09-16 10:33:59 +01:00
Simon Pilgrim
1e33bd2031 [X86] Add missing immediate qualifier to the (V)PINSR/PEXTR instruction names
Makes it easier to algorithmically recreate the instruction name in various analysis scripts I'm working on
2024-09-15 14:12:55 +01:00
Simon Pilgrim
7048857f52 [X86] Add missing immediate qualifier to the (V)EXTRACTPS instruction names
Makes it easier to algorithmically recreate the instruction name in various analysis scripts I'm working on
2024-09-15 13:41:46 +01:00
Simon Pilgrim
614a064cac
[X86] Add missing immediate qualifier to the (V)INSERT/EXTRACT/PERM2 instruction names (#108593)
Makes it easier to algorithmically recreate the instruction name in various analysis scripts I'm working on
2024-09-15 11:42:13 +01:00
Malay Sanghi
a409ebc1fc
[X86][AVX10.2] Support AVX10.2-SATCVT-DS new instructions. (#102592)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2024-09-12 22:45:20 +08:00
Rahul Joshi
c1e3b990a9
[TableGen] Eliminate static CodeGenIntrinsicMap in PatternParser (#107339)
Instead, move it to CodeGenTarget class, and use it in both
PatternParser and SearchableTableEmitter.
2024-09-07 15:11:34 -07:00
Rahul Joshi
0ceffd362b
[TableGen] Add PrintError family overload that take a print function (#107333)
Add PrintError and family overload that accepts a print function. This
avoids constructing potentially long strings for passing into these
print functions.
2024-09-07 05:13:54 -07:00
anjenner
4af249fe6e
Add usub_cond and usub_sat operations to atomicrmw (#105568)
These both perform conditional subtraction, returning the minuend and
zero respectively, if the difference is negative.
2024-09-06 16:19:20 +01:00
Rahul Joshi
50be455ab8
[TableGen] Add check for number of intrinsic return values (#107326)
Fail if we see an intrinsic that returns more than the supported number
of return values.

Intrinsics can return only upto a certain nyumber of values, as defined
by the `IIT_RetNumbers` list in `Intrinsics.td`. Currently, if we define
an intrinsic that exceeds the limit, llvm-tblgen crashes. Instead, read
this limit and fail if it's exceeded with a proper error message.
2024-09-05 14:52:30 -07:00
Nikita Popov
f006246299
[CodeGen] Add generic INIT_UNDEF pseudo (#106744)
The InitUndef pass currently uses target-specific pseudo instructions,
with one pseudo per register class.

Instead, add a generic pseudo instruction, which can be used by all
targets and register classes.
2024-09-05 09:34:39 +02:00
Rahul Joshi
98c6bbfe1f
[TableGen] Refactor Intrinsics record (#106986)
Eliminate unused `isTarget` field in Intrinsic record.

Eliminate `isOverloaded`, `Types` and `TypeSig` fields from the record,
as they are already available through the `TypeInfo` field. Change
intrinsic emitter code to look for this info using fields of the
`TypeInfo` record attached to the `Intrinsic` record.

Fix several intrinsic related unit tests to source the `Intrinsic` class
def from Intrinsics.td as opposed to defining a skeleton in the test.

This eliminates some duplication of information in the Intrinsic class,
as well as reduces the memory allocated for record fields, resulting in
~2% reduction (though that's not the main goal).
2024-09-04 15:04:10 -07:00
Freddy Ye
83ad644afa
[X86][AVX10.2] Support AVX10.2-BF16 new instructions. (#101603)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2024-09-04 08:13:24 +08:00
Stephen Tozer
3d08ade7bd
[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149)
This patch is part of a set of patches that add an `-fextend-lifetimes`
flag to clang, which extends the lifetimes of local variables and
parameters for improved debuggability. In addition to that flag, the
patch series adds a pragma to selectively disable `-fextend-lifetimes`,
and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes`
for this pointers only. All changes and tests in these patches were
written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer)
has handled review and merging. The extend lifetimes flag is intended to
eventually be set on by `-Og`, as discussed in the RFC
here:

https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850

This patch implements a new intrinsic instruction in LLVM,
`llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand
and has no effect other than "using" its operand, to ensure that its
operand remains live until after the fake use. This patch does not emit
fake uses anywhere; the next patch in this sequence causes them to be
emitted from the clang frontend, such that for each variable (or this) a
fake.use operand is inserted at the end of that variable's scope, using
that variable's value. This patch covers everything post-frontend, which
is largely just the basic plumbing for a new intrinsic/instruction,
along with a few steps to preserve the fake uses through optimizations
(such as moving them ahead of a tail call or translating them through
SROA).

Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>
2024-08-29 17:53:32 +01:00
Rahul Joshi
032c3283ab
[NFC][TableGen] Refactor IntrinsicEmitter code (#106479)
- Use formatv() and raw string literals to simplify emission code.
- Use range based for loops and structured bindings to simplify loops.
- Use const Pointers to Records.
- Rename `ComputeFixedEncoding` to `ComputeTypeSignature` to reflect
  what the function actually does, cnd change it to return a vector.
- Use reverse() and range based for loop to pack 8 nibbles into 32-bits.
- Rename some variables to follow LLVM coding standards.
- For function memory effects, print human readable effects in comment.
2024-08-29 08:06:45 -07:00
Freddy Ye
7c4cadfc43
[X86][AVX10.2] Support AVX10.2-CONVERT new instructions. (#101600)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2024-08-21 15:44:06 +08:00
Rahul Joshi
340fb65971
[TableGen] Detect invalid -D arguments and fail (#102813)
- Detect invalid macro names specified on command line and fail if one
found.
- Specifically, -DXYZ=1 for example, will fail instead is being silently
accepted.
2024-08-19 10:15:52 -07:00
Akshat Oke
576aa3a509
[TableGen] Resolve References at top level (#104578)
Add a dummy resolver to resolve references outside records. This invokes
Fold() with isFinal to force resolution.

Fixes #102447

Co-authored-by: Akshat Oke <Akshat.Oke@amd.com>
2024-08-19 21:05:39 +05:30
David Green
10fe531d6c [GlobalISel] Add and use an Opcode variable and update match-table-cxx.td checks. NFC 2024-08-18 11:08:49 +01:00
Rahul Joshi
8ce5a32f02
[TableGen] Rework error reporting for duplicate Feature/Processor (#102257)
- Extract code for sorting and checking duplicate Records into a helper
  function and update `collectProcModels` to use the helper.
- Update `FeatureKeyValues` to:
  (a) Remove code for duplicate checks and use the helper.
  (b) Trim features with empty name explicitly to be able to use
      the helper.
- Make the sorting deterministic by using record name as a secondary
  key for sorting, and re-enable SubtargetFeatureUniqueNames.td test
  that was disabled due to the non-determinism of the error messages.
- Change wording of error message when duplicate records are found to
  be source code position agnostic, since `First` may not be before
  `Second` lexically.
2024-08-08 02:00:36 +03:00
Rahul Joshi
30a4e9e5be
[NFC] Temporarily disable failing TableGen unit test. (#102308)
- This test was reported as failing in ASAN builds, with
non-deterministic order of the error message and note expected.
- This is due to llvm::sort() being unstable.
- Temporarily disable the test while its been fixed.
2024-08-07 15:41:05 +03:00
Rahul Joshi
4c97c52fe0
[TableGen] Emit better error message for duplicate Subtarget features. (#102090)
- Keep track of last definition of a feature in a `DenseMap` and use 
  it to report a better error message when a duplicate feature is found.
- Use StringMap instead of a std::map in `EmitStageAndOperandCycleData`
- Add a unit test to check if duplicate names are flagged.
2024-08-06 23:39:04 +02:00
Freddy Ye
80721e0d6c
[X86][AVX10.2] Support AVX10.2-SATCVT new instructions. (#101599)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2024-08-06 19:37:49 +08:00
Phoebe Wang
b0329206db
[X86][AVX10.2] Support AVX10.2 VNNI FP16/INT8/INT16 new instructions (#101783)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2024-08-05 18:57:42 +08:00
Pierre van Houtryve
4cfbd494b0
[GlobalISel][TableGen] Fix match-table-variadics tests label checking (#101625)
We don't really care about the precise label values for specific tests
like this, so just match any number.

The comments are enough to say where we jump in every case, and we have
plenty of tests that check basic match table features (on top of being
just tested by the fact GlobalISel CodeGen isn't broken).
2024-08-05 09:13:31 +02:00
Freddy Ye
3d5cc7e1e6
[X86][AVX10.2] Support AVX10.2-MINMAX new instructions. (#101598)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2024-08-05 11:06:02 +08:00
Phoebe Wang
259ca9ee9c
Reland "[X86][AVX10.2] Support AVX10.2 option and VMPSADBW/VADDP[D,H,S] new instructions (#101452)" (#101616)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2024-08-03 09:26:07 +08:00
Phoebe Wang
2e0588d5e1
Revert "[X86][AVX10.2] Support AVX10.2 option and VMPSADBW/VADDP[D,H,S] new instructions" (#101612)
Reverts llvm/llvm-project#101452

There are several buildbot failed. Revert first.
2024-08-02 13:04:10 +08:00
Phoebe Wang
10bad2c8d7
[X86][AVX10.2] Support AVX10.2 option and VMPSADBW/VADDP[D,H,S] new instructions (#101452)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2024-08-02 12:10:50 +08:00