1609 Commits

Author SHA1 Message Date
Phoebe Wang
da1eb886c4
[X86] Do not check alignment for VINSERTPS (#65721)
We don't have alignment constraint in AVX instructions.
2023-09-08 19:23:43 +08:00
Daniel Hoekwater
ca72b0a709 [CodeGen] Use the TII hook for Noop insertion in BBSections (NFC)
Refactor BasicBlockSections to use the target-specific noop insertion
hook from TargetInstrInfo instead of building it ourselves. Using the
TII hook is both cleaner and makes it easier to extend BBSections to
non-X86 targets.

Differential Revision: https://reviews.llvm.org/D158303
2023-08-18 19:40:11 +00:00
Shengchen Kan
fda9a9c61e [X86][Codegen] Remove dead code for ADCX/ADOX
There is no pattern for ADCX/ADOX and they are never selected during
ISEL. So we remove the cases in some MIR optimizations in this patch.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D157717
2023-08-14 10:23:42 +08:00
Sander de Smalen
bbb95893de [TII] NFCI: Simplify the interface for isTriviallyReMaterializable
Currently `isTriviallyReMaterializable` calls
`isReallyTriviallyReMaterializable` and
`isReallyTriviallyReMaterializableGeneric`. The two interfaces
are confusing, but there are also some real issues with this.

The documentation of this function (see below) suggests that
`isReallyTriviallyRematerializable` allows the target to override the
default behaviour.

  /// For instructions with opcodes for which the M_REMATERIALIZABLE flag is
  /// set, this hook lets the target specify whether the instruction is actually
  /// trivially rematerializable, taking into consideration its operands.

It however implements something different. The default behaviour
is the analysis done in `isReallyTriviallyReMaterializableGeneric`,
which is testing if it is safe to rematerialize the MachineInstr.

The result of `isReallyTriviallyReMaterializable` is only considered if
`isReallyTriviallyReMaterializableGeneric` returns `false`.  That means
there is no way to override the default behaviour if
`isReallyTriviallyReMaterializableGeneric` returns true (i.e. it is safe to
rematerialize, but we'd rather not).

By making this a single interface, we can override the interface to do either.

Reviewed By: craig.topper, nemanjai

Differential Revision: https://reviews.llvm.org/D156520
2023-08-07 13:01:06 +00:00
Matt Arsenault
c26dfc81e2 [HACK] X86: Disable isCopyInstrImpl for undef subregister defs
This is a workaround for a coalescer bug where coalescing
SUBREG_TO_REG ends up losing the liveness of the high bits of the
source register. The result is an incorrect undef subregister def
instead of preserving the high values. Work around the observed
failure after the resulting mov is eliminated during allocation until
a proper fix is ready. I believe the proper fix is to make
SUBREG_TO_REG use a tied operand.

The test should catch a regression originally observed after
b7836d856206ec39509d42529f958c920368166b and should not show a
difference after a496c8be6e638ae58bb45f13113dbe3a4b7b23fd is reverted.

https://reviews.llvm.org/D156164
2023-07-28 13:33:28 -04:00
Freddy Ye
1c154bd755 [X86] Add AVX-VNNI-INT16 instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D155145
2023-07-20 14:31:16 +08:00
XinWang10
2d6a5ab5eb [X86]Recommit D154193 - Remove TEST in AND32ri+TEST16rr in peephole-opt
Previously we remove a pattern like:
  %reg = and32ri %in_reg, 5
  ...                         // EFLAGS not changed.
  %src_reg = subreg_to_reg 0, %reg, %subreg.sub_index
  test64rr %src_reg, %src_reg, implicit-def $eflags
We can remove test64rr since it has same functionality as and subreg_to_reg avoid the opt in previous code, so we handle this case specially.
And this case is also can be opted for the same reason, like:
  %reg = and32ri %in_reg, 5
  ...                         // EFLAGS not changed.
  %src_reg = copy %reg.sub_16bit:gr32
  test16rr %src_reg, %src_reg, implicit-def $eflags
The COPY from gr32 to gr16 prevent the opt in previous code too, just handle it specially as what we did for test64rr.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D154193
2023-07-14 03:42:42 -04:00
Wang, Xin10
284a059b33 Revert "[X86]Remove TEST in AND32ri+TEST16rr in peephole-opt"
This reverts commit 2c64226d84174dd1d9f93e1884c1b0bd432f89b5.

revert first due to buildbot fail https://lab.llvm.org/buildbot/#/builders/85/builds/17571
2023-07-10 03:20:11 -04:00
XinWang10
2c64226d84 [X86]Remove TEST in AND32ri+TEST16rr in peephole-opt
Previously we remove a pattern like:
  %reg = and32ri %in_reg, 5
  ...                         // EFLAGS not changed.
  %src_reg = subreg_to_reg 0, %reg, %subreg.sub_index
  test64rr %src_reg, %src_reg, implicit-def $eflags
We can remove test64rr since it has same functionality as and subreg_to_reg avoid the opt in previous code, so we handle this case specially.
And this case is also can be opted for the same reason, like:
  %reg = and32ri %in_reg, 5
  ...                         // EFLAGS not changed.
  %src_reg = copy %reg.sub_16bit:gr32
  test16rr %src_reg, %src_reg, implicit-def $eflags
The COPY from gr32 to gr16 prevent the opt in previous code too, just handle it specially as what we did for test64rr.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D154193
2023-07-09 23:21:32 -04:00
David Green
2802739dfd [NFC] Replace ;; with ; 2023-06-11 10:25:24 +01:00
Dávid Bolvanský
09515f2c20 [SDAG] Preserve unpredictable metadata, teach X86CmovConversion to respect this metadata
Sometimes an developer would like to have more control over cmov vs branch. We have unpredictable metadata in LLVM IR, but currently it is ignored by X86 backend. Propagate this metadata and avoid cmov->branch conversion in X86CmovConversion for cmov with this metadata.

Example:

```
int MaxIndex(int n, int *a) {
    int t = 0;
    for (int i = 1; i < n; i++) {
        // cmov is converted to branch by X86CmovConversion
        if (a[i] > a[t]) t = i;
    }
    return t;
}

int MaxIndex2(int n, int *a) {
    int t = 0;
    for (int i = 1; i < n; i++) {
        // cmov is preserved
        if (__builtin_unpredictable(a[i] > a[t])) t = i;
    }
    return t;
}
```

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D118118
2023-06-01 20:56:44 +02:00
Shengchen Kan
f603809637 [X86] Move encoding optimization for PUSH32i, PUSH64i to MC lowering, NFCI 2023-05-20 17:59:43 +08:00
Shengchen Kan
89ca4eb002 [X86][NFC] Correct the instruction names for PUSH16i, PUSH32i
Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D151012
2023-05-20 17:33:42 +08:00
Shengchen Kan
0d9b36ce7d [X86] Remove patterns for IMUL with immediate 8 and optimize during MC lowering, NFCI 2023-05-20 11:14:03 +08:00
Shengchen Kan
c81a121f3f Revert "Revert "[X86] Remove patterns for ADC/SBB with immediate 8 and optimize during MC lowering, NFCI""
This reverts commit cb16b33a03aff70b2499c3452f2f817f3f92d20d.

In fact, the test https://bugs.chromium.org/p/chromium/issues/detail?id=1446973#c2
already passed after 5586bc539acb26cb94e461438de01a5080513401
2023-05-19 22:21:56 +08:00
Hans Wennborg
cb16b33a03 Revert "[X86] Remove patterns for ADC/SBB with immediate 8 and optimize during MC lowering, NFCI"
This caused compiler assertions, see comment on
https://reviews.llvm.org/D150107.

This also reverts the dependent follow-up change:

> [X86] Remove patterns for ADD/AND/OR/SUB/XOR/CMP with immediate 8 and optimize during MC lowering, NFCI
>
> This is follow-up of D150107.
>
> In addition, the function `X86::optimizeToFixedRegisterOrShortImmediateForm` can be
> shared with project bolt and eliminates the code in X86InstrRelaxTables.cpp.
>
> Differential Revision: https://reviews.llvm.org/D150949

This reverts commit 2ef8ae134828876ab3ebda4a81bb2df7b095d030 and
5586bc539acb26cb94e461438de01a5080513401.
2023-05-19 14:43:33 +02:00
Shengchen Kan
5586bc539a [X86] Remove patterns for ADD/AND/OR/SUB/XOR/CMP with immediate 8 and optimize during MC lowering, NFCI
This is follow-up of D150107.

In addition, the function `X86::optimizeToFixedRegisterOrShortImmediateForm` can be
shared with project bolt and eliminates the code in X86InstrRelaxTables.cpp.

Differential Revision: https://reviews.llvm.org/D150949
2023-05-19 18:22:30 +08:00
Shengchen Kan
2ef8ae1348 [X86] Remove patterns for ADC/SBB with immediate 8 and optimize during MC lowering, NFCI
This is follow-up of D150107.
2023-05-19 10:33:52 +08:00
Shengchen Kan
77589e945f [X86] Remove patterns for shift/rotate with immediate 1 and optimize during MC lowering
It's first suggested by @craig.topper  in D150068. I think there are at least three pros

1. This can reduce the patterns during ISEL, as a result, reducing the bytes in X86GenDAGISel.inc
2. The patterns for shift/rotate with immediate 1 look quite similar to shift/rotate with immediate 8. So this can be seen as eliminating "duplicate" code.
3. Delay the optimization from imm8 to imm1, so that the previous optimization passes do not need to handle the version of imm1

It improves fast isel code and makes X86DomainReassignment work for shifts by 1, but regressed global isel, though no one should care.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D150107
2023-05-17 19:55:44 +08:00
Matthias Braun
b8817825b9 Support critical edge splitting for jump tables
Add support for splitting critical edges coming from an indirect jump
using a jump table ("switch jumps").

This introduces the `TargetInstrInfo::getJumpTableIndex` callback to
allows targets to return an index into `MachineJumpTableInfo` for a
given indirect jump. It also updates to
`MachineBasicBlock::SplitCriticalEdge` to allow splitting of critical
edges by rewriting jump table entries.

This is largely based on work done by Zhixuan Huan in D132202.

Differential Revision: https://reviews.llvm.org/D140975
2023-05-10 20:30:52 -07:00
Sami Tolvanen
e9569748de [CodeGen][KCFI] Move cfi-type lowering to TargetLowering
KCFI machine function passes transform indirect calls with a
cfi-type attribute into architecture-specific type checks bundled
together with the calls. Instead of having a separate pass for each
architecture, add a generic machine function pass for KCFI and
move the architecture-specific code that emits the actual check to
TargetLowering. This avoids unnecessary duplication and makes it
easier to add KCFI support to other architectures.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D149915
2023-05-09 18:38:54 +00:00
Akshay Khadse
5c7c3af1d0 Reapply [Coverity] Fix explicit null dereferences
This change fixes static code analysis errors

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D149506
2023-05-08 21:19:40 +08:00
Luo, Yuanke
40222ddcf8 [X86] Fix the vnni machine combine issue.
The previous patch (D148980) didn't set the InstrIdxForVirtReg correctly
in genAlternativeDpCodeSequence(). It causes vnni lit test failure when
LLVM_ENABLE_EXPENSIVE_CHECKS is on.
2023-04-29 13:51:08 +08:00
Jie Fu
563e3028c9 [X86] Fix -Wstring-conversion in X86InstrInfo.cpp (NFC)
/Users/jiefu/llvm-project/llvm/lib/Target/X86/X86InstrInfo.cpp:9794:12: error: implicit conversion turns string literal into bool: 'const char[25]' to 'bool' [-Werror,-Wstring-conversion]
    assert("It should not reach here");
    ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
/Applications/Xcode13.1/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX12.0.sdk/usr/include/assert.h:99:25: note: expanded from macro 'assert'
    (__builtin_expect(!(e), 0) ? __assert_rtn(__func__, __ASSERT_FILE_NAME, __LINE__, #e) : (void)0)
                      ~ ^
1 error generated.
2023-04-27 17:52:57 +08:00
Jie Fu
8de16131cb [X86] Fix -Wsometimes-uninitialized in X86InstrInfo.cpp (NFC)
/data/llvm-project/llvm/lib/Target/X86/X86InstrInfo.cpp:9793:3: error: variable 'MaddOpc' is used uninitialized whenever switch default is taken [-Werror,-Wsometimes-uninitialized]
  default:
  ^~~~~~~
/data/llvm-project/llvm/lib/Target/X86/X86InstrInfo.cpp:9854:25: note: uninitialized use occurs here
  Madd->setDesc(TII.get(MaddOpc));
                        ^~~~~~~
/data/llvm-project/llvm/lib/Target/X86/X86InstrInfo.cpp:9791:19: note: initialize the variable 'MaddOpc' to silence this warning
  unsigned MaddOpc;
                  ^
                   = 0
/data/llvm-project/llvm/lib/Target/X86/X86InstrInfo.cpp:9793:3: error: variable 'AddOpc' is used uninitialized whenever switch default is taken [-Werror,-Wsometimes-uninitialized]
  default:
  ^~~~~~~
/data/llvm-project/llvm/lib/Target/X86/X86InstrInfo.cpp:9862:46: note: uninitialized use occurs here
      BuildMI(*MF, MIMetadata(Root), TII.get(AddOpc), DstReg)
                                             ^~~~~~
/data/llvm-project/llvm/lib/Target/X86/X86InstrInfo.cpp:9790:18: note: initialize the variable 'AddOpc' to silence this warning
  unsigned AddOpc;
                 ^
                  = 0
2 errors generated.
2023-04-27 17:08:24 +08:00
Luo, Yuanke
8f7f9d86a7 [X86] Machine combine vnni instruction.
"vpmaddwd + vpaddd" can be combined to vpdpwssd and the latency is
reduced after combination. However when vpdpwssd is in a critical path
the combination get less ILP. It happens when vpdpwssd is in a loop, the
vpmaddwd can be executed in parallel in multi-iterations while vpdpwssd
has data dependency for each iterations. If vpaddd is in a critical path
while vpmaddwd is not, it is profitable to split vpdpwssd into "vpmaddwd
+ vpaddd".
This patch is based on the machine combiner framework to acheive decision
on "vpmaddwd + vpaddd" combination. The typical example code is as
below.
```
__m256i foo(int cnt, __m256i c, __m256i b, __m256i *p) {

    for (int i = 0; i < cnt; ++i) {
        __m256i a = p[i];
        __m256i m = _mm256_madd_epi16 (b, a);
        c = _mm256_add_epi32(m, c);
    }

    return c;
}
```

Differential Revision: https://reviews.llvm.org/D148980
2023-04-27 16:42:04 +08:00
Tom Weaver
b63c08c773 Revert "[Coverity] Fix explicit null dereferences"
This reverts commit 22b23a5213b57ce1834f5b50fbbf8a50297efc8a.

This commit caused the following two build bots to start failing:
https://lab.llvm.org/buildbot/#/builders/216/builds/20322
https://lab.llvm.org/buildbot/#/builders/123/builds/18511
2023-04-24 11:14:10 +01:00
Akshay Khadse
22b23a5213 [Coverity] Fix explicit null dereferences
This change fixes static code analysis errors

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D148912
2023-04-23 12:07:11 +08:00
Shengchen Kan
92af50f41c [X86][NFC] Fix for warning C4334: '<<': result of 32-bit shift implicitly converted to 64 bits 2023-04-06 17:13:10 +08:00
Shengchen Kan
ea91acda05 [X86][mem-fold] Simplify the logic and correct the comments for TB_ALIGN, NFCI 2023-04-06 16:38:30 +08:00
Amara Emerson
41e9c4b88c [NFC][Outliner] Delete default ctors for Candidate & OutlinedFunction.
I think it's good practice to avoid having default ctors unless they're really
valid/useful. For OutlinedFunction the default ctor was used to represent a
bail-out value for getOutliningCandidateInfo(), so I changed the API to return
an optional<getOutliningCandidateInfo> instead which seems a tad cleaner.

Differential Revision: https://reviews.llvm.org/D146375
2023-03-20 11:17:10 -07:00
duk
d61d591411
[MachineOutliner] Make getOutliningType partially target-independent
The motivation behind this patch is to unify some of the outliner logic across architectures. This looks nicer in general and makes fixing [issues like this](https://reviews.llvm.org/D124707#3483805) easier.
There are some notable changes here:
    1. `isMetaInstruction()` is used directly instead of checking for specific meta-instructions like `IMPLICIT_DEF` or `KILL`. This was already done in the RISC-V implementation, but other architectures still did hardcoded checks.
        - As an exception to this, CFI instructions are explicitly delegated to the target because RISC-V has different handling for those.

    2. `isTargetIndex()` checks are replaced with an assert; none of the architectures supported actually use `MO_TargetIndex` at this point in time.

    3. `isCFIIndex()` and `isFI()` checks are also replaced with asserts, since these operands should not exist in [any context](https://reviews.llvm.org/D122635#3447214) at this stage in the pipeline.

Reviewed by: paquette

Differential Revision: https://reviews.llvm.org/D125072
2023-02-09 14:35:00 -05:00
Shengchen Kan
011e4abb49 [X86][MC][bugfix] Report error for mismatched modifier in inline asm and remove function getX86SubSuperRegisterOrZero
```
MCRegister getX86SubSuperRegister*(MCRegister Reg, unsigned Size,
                                  bool High = false);
```
A strange behavior of the functions `getX86SubSuperRegister*` was
introduced by llvm-svn:145579: The returned register may not
match the parameters when a 8-bit high register is required.

And llvm-svn: 175762 refined the code and dropped the comments, then we
knew nothing happened there from the code :-(

These two functions are only called with `Size=8` and `High=true` in two places.
One is in `X86FixupBWInsts.cpp` for liveness of registers and the other is in
`X86AsmPrinter.cpp` for inline asm.

For the first one, we provide an alternative in this patch.
For the second one, the strange behaviour caused a bug that an erorr was not reported for mismatched modifier.

```
void f() {
  char x;
  asm volatile ("mov %%ah, %h0" :"=r"(x)::"%eax", "%ebx", "%ecx", "%edx", "edi", "esi");
}
```

```
$ gcc -S test.c

error: extended registers have no high halves
```

```
$ clang -S test.c

no error
```

so we fix the bug in this patch.

`getX86SubSuperRegister` is just a wrapper of `getX86SubSuperRegisterOrZero` with a `assert`.
I belive we should remove the latter.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D142834
2023-02-02 10:08:56 +08:00
Bill Wendling
7d626e7cbb [X86] Move RDFLAGS/WRFLAGS expansion until after RA
The register allocator may introduce reloads in the middle of reading
and writing the EFLAGS register, due to the RDFLAGS & WRFLAGS pseudos
being expanded before RA. This may cause an issue where the stack
pointer was adjusted but the stack offset for the reload wasn't
accounted for (see [1]).

To avoid this, expand these pseudos after register allocation.

[1] https://github.com/llvm/llvm-project/issues/59102

Reviewed By: craig.topper, nickdesaulniers, pengfei

Differential Revision: https://reviews.llvm.org/D140045
2023-01-30 15:32:16 -08:00
Kazu Hirata
5d3462162e [X86] Use llvm::countr_zero instead of findFirstSet (NFC)
At the call site of findFirstSet, ZMask | (1 << DstIdx) always have
exactly 3 bits set, and they are all among the 4 least significant
bits, so (ZMask | (1 << DstIdx)) ^ 15 has exactly one bit set.  Since
the argument to findFirstSet is nonzero, we can safely switch to
llvm::countr_zero.
2023-01-24 23:26:08 -08:00
Kazu Hirata
caa99a01f5 Use llvm::popcount instead of llvm::countPopulation(NFC) 2023-01-22 12:48:51 -08:00
Craig Topper
79858d1908 [CodeGen][Target] Remove uses of Register::isPhysicalRegister/isVirtualRegister. NFC
Use isPhysical/isVirtual methods.
2023-01-13 23:12:48 -08:00
serge-sans-paille
38818b60c5
Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ part
Use deduction guides instead of helper functions.

The only non-automatic changes have been:

1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t*), (uint8_t*))
2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There was a few similar situation across the codebase.
3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated.
4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that).

Per reviewers' comment, some useless makeArrayRef have been removed in the process.

This is a follow-up to https://reviews.llvm.org/D140896 that introduced
the deduction guides.

Differential Revision: https://reviews.llvm.org/D140955
2023-01-05 14:11:08 +01:00
Christudasan Devadasan
b5efec4b27 [CodeGen] Additional Register argument to storeRegToStackSlot/loadRegFromStackSlot
With D134950, targets get notified when a virtual register is created and/or
cloned. Targets can do the needful with the delegate callback. AMDGPU propagates
the virtual register flags maintained in the target file itself. They are useful
to identify a certain type of machine operands while inserting spill stores and
reloads. Since RegAllocFast spills the physical register itself, there is no way
its virtual register can be mapped back to retrieve the flags. It can be solved
by passing the virtual register as an additional argument. This argument has no
use when the spill interfaces are called during the greedy allocator or even the
PrologEpilogInserter and can pass a null register in such cases.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D138656
2022-12-17 11:55:34 +05:30
Anton Sidorenko
f8ed709345 [MachineCombiner] Extend reassociation logic to handle inverse instructions
Machine combiner supports generic reassociation only of associative and
commutative instructions, for example (A + X) + Y => (X + Y) + A. However, we
can extend this generic support to handle patterns like
(X + A) - Y => (X - Y) + A), where `-` is the inverse of `+`.
This patch adds interface functions to process reassociation patterns of
associative/commutative instructions and their inverse variants with minimal
changes in backends.

Differential Revision: https://reviews.llvm.org/D136754
2022-12-07 13:50:28 +03:00
Fangrui Song
b0df70403d [Target] llvm::Optional => std::optional
The updated functions are mostly internal with a few exceptions (virtual functions in
TargetInstrInfo.h, TargetRegisterInfo.h).
To minimize changes to LLVMCodeGen, GlobalISel files are skipped.

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 22:43:14 +00:00
Kazu Hirata
20cde15415 [Target] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 20:36:06 -08:00
Shengchen Kan
861f5dd688 [X86][NFC] Minor improvement in X86InstrInfo::optimizeCompareInstr
Before this patch, the code enumerated `getCondFromBranch`, `getCondFromSETCC` and `getCondFromFromCMov` to get the condition code of a `MachineInstr`, and assigned the result to variable `OldCC` when `MI || IsSwapped || ImmDelta != 0` was satisfiled.

After this patch, the `if-else` structure is eliminated by using `getCondFromMI`. Since `OldCC` is only used when  `MI || IsSwapped || ImmDelta != 0`  is true, it is initialized with `getCondFromMI` directly outside the scope of `if` now.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D138349
2022-11-21 21:00:07 +08:00
Sami Tolvanen
7c96f61aaa [X86][KCFI] Don't fold loads into indirect calls that need a KCFI check
Avoid unnecessary folding as X86KCFIPass would have to unfold these
anyway when emitting the KCFI_CHECK.
2022-11-18 21:55:41 +00:00
Bing1 Yu
5bc36c8cb4 [X86] Add necessary check isReg() when updating LiveVariables in convertToThreeAddress
Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D137388
2022-11-10 21:12:00 +08:00
Freddy Ye
23f02693ec [X86] Add AVX-VNNI-INT8 instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D135938
2022-10-28 10:39:54 +08:00
Freddy Ye
0e720e6ada [X86] Add AVX-IFMA instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D135932
2022-10-28 09:42:30 +08:00
Guozhi Wei
d24c93cc41 [X86] Enable reassociation for ADD instructions
ADD is an associative and commutative operation, so we can do reassociation for it.

Differential Revision: https://reviews.llvm.org/D136396
2022-10-26 00:46:13 +00:00
Jay Foad
325927ffb9 [X86] Update LiveVariables in more cases in convertToThreeAddress
Following on from D129634, this patch fixes more X86 CodeGen test
failures with D129213 applied, which adds verification of LiveIntervals
after the TwoAddressInstruction pass runs. These failures only showed up
with LLVM_ENABLE_EXPENSIVE_CHECKS=ON which adds the equivalent of an
implicit -verify-machineinstrs on all tests.

Differential Revision: https://reviews.llvm.org/D136596
2022-10-25 09:21:51 +01:00
Joao Moreira
eac3e5c3fb [X86] Do not emit JCC to __x86_indirect_thunk
Clang may optimize conditional tailcall blocks with the following layout:

cmp <condition>
je  tailcall_target
ret

When retpoline is in place, indirect calls are converted into direct calls to a retpoline thunk. When these indirect calls are tail calls, they may be subject to the above described optimization (there is no indirect JCC, but since now the jump is direct it can be made conditional). The above layout is non-ideal for the Linux kernel scenario because the branches into thunks may be patched back into indirect branches during runtime depending on the underlying CPU features, what would not be feasible if the binary is emitted with the optimized layout above.

Thus, prevent clang from emitting this it if CodeModel is Kernel.

Feature request from the respective kernel mailing list: https://lore.kernel.org/llvm/Yv3uI%2FMoJVctmBCh@worktop.programming.kicks-ass.net/

Reviewed By: nickdesaulniers, pengfei

Differential Revision: https://reviews.llvm.org/D134915
2022-10-06 11:09:24 -07:00