33969 Commits

Author SHA1 Message Date
Simon Pilgrim
c7fce3f98b [DAG] Rename computeOverflowKind -> computeOverflowForUnsignedAdd. NFC.
Matches the naming convention for the equivalent ValueTracking helpers - further SelectionDAG computeOverflowFor*() helpers will be added soon.
2023-05-05 19:38:54 +01:00
Simon Pilgrim
051918c71e [DAG] expandIntMINMAX - add umax(x,1) --> sub(x,cmpeq(x,0)) fold
Move the fold from X86 to generic expansion

(We also have several existing expansions that are missing freezes on repeated operands - I've added a TODO for now).
2023-05-05 19:27:52 +01:00
Simon Pilgrim
04e809ab90 [DAG] Add TargetLowering::expandABD and convert X86 lowering to use it
Scalar widening cases are still custom lowered in the X86 backend - we still need to add promotion/legalization support to handle these
2023-05-05 15:13:23 +01:00
Luo, Yuanke
ae1ca47bb4 [Coverity] Big parameter passed by value. 2023-05-05 09:50:38 +08:00
Luo, Yuanke
b0fb98227c [Coverity] Big parameter passed by value. 2023-05-05 09:15:22 +08:00
Craig Topper
fe9f557578 [DAGCombiner][RISCV] Enable reassociation for VP_FMA in visitFADDForFMACombine.
Reviewed By: fakepaper56

Differential Revision: https://reviews.llvm.org/D149911
2023-05-04 17:20:58 -07:00
Ilya Kuklin
c395a84600 [MSP430] Get the DWARF pointer size from MCAsmInfo instead of DataLayout.
This change will allow to put code pointers in DWARF info fields that are larger than actual pointer size, e.g. 16-bit pointers into 32-bit fields.

The need for this came up while creating support for MSP430 in LLDB. MSP430-GCC already generates DWARF info with 32-bit fields, so this change is necessary for LLDB to maintain compatibility with both GCC and LLVM binaries. Moreover, right now in LLDB there is no support for having DWARF pointer size different from ELF header type, e.g. 16-bit DWARF info within ELF32, and it seems there is no such thing as ELF16.

Since other mainline targets are made to have the same pointer size in both MCAsmInfo and DataLayout, there is no need to change anything there.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D148042
2023-05-04 12:37:30 -07:00
Felipe de Azevedo Piovezan
ae39de91b8 [MIRParser][nfc] Factor out code parsing debug MD nodes
This commit splits a function that both parses MD nodes from YAML into
DI{Expr,Loc,Variable} objects AND adds an entry to the MF variable table, so
that each of those jobs is done separately.

It will enable subsequent patches to reuse the MD node parsing code.

Differential Revision: https://reviews.llvm.org/D149870
2023-05-04 14:17:08 -04:00
Yeting Kuo
287aa6c453 [DAGCombiner] Use generalized pattern match for visitFSUBForFMACombine.
The patch makes visitFSUBForFMACombine serve vp.fsub too. It helps DAGCombiner
to fuse vp.fsub and vp.fmul patterns to vp.fma.

Reviewed By: luke

Differential Revision: https://reviews.llvm.org/D149821
2023-05-04 22:02:32 +08:00
Luo, Yuanke
d9b92c4d55 [Coverity] Improper use of negtive value.
The `Iteration` value may be -1 which would cause incorrect loop count
when pass the value to buildSqrtNROneConst or buildSqrtNRTwoConst.
2023-05-04 21:11:49 +08:00
Evgenii Kudriashov
a82d27a9a6 [X86] Support llvm.{min,max}imum.f{16,32,64}
Addresses https://github.com/llvm/llvm-project/issues/53353

Reviewed By: RKSimon, pengfei

Differential Revision: https://reviews.llvm.org/D145634
2023-05-04 21:04:48 +08:00
NAKAMURA Takumi
342a3ce27e Move LLT::dump()'s impl to LowLevelType.cpp
Suggested by @jobnoorman
https://reviews.llvm.org/D148767#4317848
2023-05-04 21:29:59 +09:00
Tom Weaver
1d8ab713ad Revert "[DebugLine] save one debug line entry for empty prologue"
This reverts commit b48a8233f5e230e46182bf5c523ceb6a04cec8f5.

This change caused https://lab.llvm.org/buildbot/#/builders/247/builds/4125
to start failing, please address the failures before resubmitting.
2023-05-04 11:08:58 +01:00
Simon Pilgrim
3928589314 [DAG] computeKnownBits - remove old ashr TODO comment
KnownBits::ashr now uses the minimum shift amount to try and extend the sign bit
2023-05-04 10:26:30 +01:00
Chen Zheng
b48a8233f5 [DebugLine] save one debug line entry for empty prologue
Some debuggers like DBX on AIX assume the address in debug line
entries is always incremental. But clang generates two entries (entry
for file scope line and entry for prologue end) with same address if
prologue is empty

And if the prologue is empty, seems the first debug line entry for the
function is unnecessary(i.e. removing the first entry won't impact the
behavior in GDB on Linux), so I implement this for all debuggers.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D147506
2023-05-04 04:37:34 +00:00
Daniel Paoliello
e48826e016 Emit the correct flags for the PROC CodeView Debug Symbol
The S_LPROC32_ID and S_GPROC32_ID CodeView Debug Symbols have a flags
field which LLVM has had the values for (in the ProcSymFlags enum) but
has never actually set.

These flags are used by Microsoft-internal tooling that leverages debug
information to do binary analysis.

Modified LLVM to set the correct flags:

- ProcSymFlags::HasOptimizedDebugInfo - always set, as this indicates that
debug info is present for optimized builds (if debug info is not emitted
for optimized builds, then LLVM won't emit a debug symbol at all).
- ProcSymFlags::IsNoReturn and ProcSymFlags::IsNoInline - set if the
function has the NoReturn or NoInline attributes respectively.
- ProcSymFlags::HasFP - set if the function requires a frame pointer (per
TargetFrameLowering::hasFP).

Differential Revision: https://reviews.llvm.org/D148761
2023-05-03 18:20:16 -07:00
Mateja Marjanovic
cf76074a36 [AMDGPU][GlobalISel] Check exact width in get*ClassForBitWidth and widen if necessary
Instead of checking if the given bitwidth is less or equal to a bitwidth of an existing RegClass,
check if it has the exact same value.

For LLVM vector types that don't have a corresponding Register Class, widen them during legalization.
That goes for G_EXTRACT_VECTOR_ELT, G_INSERT_VECTOR_ELT and G_BUILD_VECTOR.

Differential revision: https://reviews.llvm.org/D148096
Reviewers: foad, arsenm
2023-05-03 17:32:24 +02:00
Mateja Marjanovic
6175ec0bb6 Revert "[AMDGPU][GlobalISel] Widen the vector operand in G_BUILD/INSERT/EXTRACT_VECTOR"
This reverts commit b25c7cafcbe1b52ea2d1ff5e5c2f13674b5f297d.
2023-05-03 17:28:01 +02:00
Mateja Marjanovic
b25c7cafcb [AMDGPU][GlobalISel] Widen the vector operand in G_BUILD/INSERT/EXTRACT_VECTOR
Widen the vector operand type in G_BUILD_VECTOR, G_INSERT_VECTOR_ELT,
G_EXTRACT_VECTOR_ELT to the nearest larger RegClass.
2023-05-03 17:14:38 +02:00
Felipe de Azevedo Piovezan
a524f84780 [SelectionDAG][NFCI] Use common logic for identifying MMI vars
After function argument lowering, but prior to instruction selection,
dbg declares pointing to function arguments are lowered using special
logic.

Later, during instruction selection (both "fast" and regular ISel), this
logic is "repeated" in order to identify which intrinsics have already
been lowered. This is bad for two reasons:

1. The logic is not _really_ repeated, the code is different, which
could lead to duplicate lowering of the intrinsic.
2. Even if the logic were repeated properly, this is still code
duplication.

This patch addresses these issues by storing all preprocessed
dbg.declare intrinsics in a set inside FuncInfo; the set is queried upon
instruction selection.

Differential Revision: https://reviews.llvm.org/D149682
2023-05-03 10:58:31 -04:00
Florian Hahn
4e2b4f97a0
[ShrinkWrap] Use underlying object to rule out stack access.
Allow shrink-wrapping past memory accesses that only access globals or
function arguments. This patch uses getUnderlyingObject to try to
identify the accessed object by a given memory operand. If it is a
global or an argument, it does not access the stack of the current
function and should not block shrink wrapping.

Note that the caller's stack may get accessed when passing an argument
via the stack, but not the stack of the current function.

This addresses part of the TODO from D63152.

Reviewed By: thegameg

Differential Revision: https://reviews.llvm.org/D149668
2023-05-03 09:28:07 +01:00
Shengchen Kan
3910a9fcb2 Revert part of D149033 b/c original code is correct
This reverts part of D149033 and  rG8f966cedea594d9a91e585e88a80a42c04049e6c. The added test case
is kept to avoid future regression.

Reviewed By: vzakhari, vdonaldson

Differential Revision: https://reviews.llvm.org/D149639
2023-05-03 12:20:19 +08:00
Dávid Bolvanský
20831c3c23 [MachineInst] Switch NumOperands to 16bits
Decrease NumOperands from 32 to 16bits (matches MCInstrDesc) so we can use saved bits to extend Flags (https://reviews.llvm.org/D118118).

Reviewed By: barannikov88

Differential Revision: https://reviews.llvm.org/D149445
2023-05-02 22:48:22 +02:00
NAKAMURA Takumi
631bfdbee5 Switch llvm/CodeGen/MachineValueType.h to the generated one
Prune `SupportTests/MVTTest` since it is no longer needed.

Depends on D148769

Differential Revision: https://reviews.llvm.org/D148770
2023-05-03 00:13:20 +09:00
NAKAMURA Takumi
5d71ec6e44 Split out CodeGenTypes from CodeGen for LLT/MVT
This reduces dependencies on `llvm-tblgen` so much.

`CodeGenTypes` depends on `Support` at the moment.
Be careful to append deps on this, since Targets' tablegens
depend on this.

Depends on D149024

Differential Revision: https://reviews.llvm.org/D148769
2023-05-03 00:13:20 +09:00
NAKAMURA Takumi
c1221251fb Restore CodeGen/MachineValueType.h from Support
This is rework of;

  - rG13e77db2df94 (r328395; MVT)

Since `LowLevelType.h` has been restored to `CodeGen`, `MachinveValueType.h`
can be restored as well.

Depends on D148767

Differential Revision: https://reviews.llvm.org/D149024
2023-05-03 00:13:20 +09:00
NAKAMURA Takumi
9cfeba5b12 Restore CodeGen/LowLevelType from Support
This is rework of;
  - D30046 (LLT)

Since I have introduced `llvm-min-tblgen` as D146352, `llvm-tblgen`
may depend on `CodeGen`.

`LowLevlType.h` originally belonged to `CodeGen`. Almost all userse are
still under `CodeGen` or `Target`. I think `CodeGen` is the right place
to put `LowLevelType.h`.

`MachineValueType.h` may be moved as well. (later, D149024)

I have made many modules depend on `CodeGen`. It is consistent but
inefficient. It will be split out later, D148769

Besides, I had to isolate MVT and LLT in modmap, since
`llvm::PredicateInfo` clashes between `TableGen/CodeGenSchedule.h`
and `Transforms/Utils/PredicateInfo.h`.
(I think better to introduce namespace llvm::TableGen)

Depends on D145937, D146352, and D148768.

Differential Revision: https://reviews.llvm.org/D148767
2023-05-03 00:13:19 +09:00
Jay Foad
55678b43b5 [CodeGen] One more use of MachineBasicBlock::phis. NFC. 2023-05-02 14:55:24 +01:00
Jay Foad
4b2381a5f0 [CodeGen] Make use of MachineBasicBlock::phis. NFC. 2023-05-02 13:39:01 +01:00
Shengchen Kan
8f966cedea [SelectionDAG] Use int64_t to store the integer power of llvm.powi
https://llvm.org/docs/LangRef.html#llvm-powi-intrinsic
The max length of the integer power of `llvm.powi` intrinsic is 32, and
the value can be negative. If we use `int32_t` to store this value, `-Val`
will underflow when it is `INT32_MIN`

The issue was reported in D149033.
2023-05-02 14:08:42 +08:00
Phoebe Wang
c2dd36cd39 [Coverity] Fix unchecked return value, NFC
The `ReversePredicate` should have made sure the reverse predicate is
supported by target, but the check comes from early function and might
be invalid by any mistake. So it's better to double confirm it here.

Differential Revision: https://reviews.llvm.org/D149586
2023-05-02 13:43:28 +08:00
Shengchen Kan
4e4db6f6c6 Revert "[SelectionDAG] Use logic right shift to avoid loop hang"
This reverts commit b73229e55543b4ba2b293adcb8b7d6025f01f7d9.

It caused LIT failure on non-X86 targets.
2023-05-02 13:14:47 +08:00
Shengchen Kan
b73229e555 [SelectionDAG] Use logic right shift to avoid loop hang
Issue was reported in D149033, `Val` can be negative value and
arithmetic right shift always keeps the sign bit.

BTW, the redundant code `Val = -Val` is removed by this patch.
2023-05-02 12:47:28 +08:00
Craig Topper
2d58925362 [LegalizeVectorOps][RISCV] Support condition code legalization for ISD::STRICT_FSETCC/FSETCCS during LegalizeVectorOps.
Switch RISC-V to legalize during LegalizeVectorOps instead of
LegalizeDAG. LegalizeDAG uses the OpVT for legalize action while
LegalizeVectorOps uses the result VT. We really should fix that.
2023-04-29 22:55:41 -07:00
Craig Topper
344368fb98 [TargetLowering] Stop passing an ISD::CondCode to isOperationLegalOrCustom.
ISD::CondCode is a separate num space from opcodes. isOperationLegalOrCustom
should take an opcode.

Reviewed By: barannikov88

Differential Revision: https://reviews.llvm.org/D149528
2023-04-29 15:23:09 -07:00
Sergei Barannikov
e744e51b12 [SelectionDAG] Rename ADDCARRY/SUBCARRY to UADDO_CARRY/USUBO_CARRY (NFC)
This will make them consistent with other overflow-aware nodes.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D148196
2023-04-29 21:59:58 +03:00
Craig Topper
df017ba9d3 [TargetLowering] Don't use ISD::SELECT_CC in expandFP_TO_INT_SAT.
This function gets called for vectors and ISD::SELECT_CC was never
intended to support vectors. Some updates were made to support
it when this function started getting used for vectors.

Overall, using separate ISD::SETCC and ISD::SELECT looks like an
improvement even for scalar.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D149481
2023-04-29 10:23:08 -07:00
Matt Arsenault
bc37be1855 LangRef: Add "dynamic" option to "denormal-fp-math"
This is stricter than the default "ieee", and should probably be the
default. This patch leaves the default alone. I can change this in a
future patch.

There are non-reversible transforms I would like to perform which are
legal under IEEE denormal handling, but illegal with flushing zero
behavior. Namely, conversions between llvm.is.fpclass and fcmp with
zeroes.

Under "ieee" handling, it is legal to translate between
llvm.is.fpclass(x, fcZero) and fcmp x, 0.

Under "preserve-sign" handling, it is legal to translate between
llvm.is.fpclass(x, fcSubnormal|fcZero) and fcmp x, 0.

I would like to compile and distribute some math library functions in
a mode where it's callable from code with and without denormals
enabled, which requires not changing the compares with denormals or
zeroes.

If an IEEE function transforms an llvm.is.fpclass call into an fcmp 0,
it is no longer possible to call the function from code with denormals
enabled, or write an optimization to move the function into a denormal
flushing mode. For the original function, if x was a denormal, the
class would evaluate to false. If the function compiled with denormal
handling was converted to or called from a preserve-sign function, the
fcmp now evaluates to true.

This could also be of use for strictfp handling, where code may be
changing the denormal mode.

Alternative name could be "unknown".

Replaces the old AMDGPU custom inlining logic with more conservative
logic which tries to permit inlining for callees with dynamic handling
and avoids inlining other mismatched modes.
2023-04-29 08:44:59 -04:00
Luo, Yuanke
40222ddcf8 [X86] Fix the vnni machine combine issue.
The previous patch (D148980) didn't set the InstrIdxForVirtReg correctly
in genAlternativeDpCodeSequence(). It causes vnni lit test failure when
LLVM_ENABLE_EXPENSIVE_CHECKS is on.
2023-04-29 13:51:08 +08:00
Wang, Xin10
9c1e4ee690 [NFC]Fix 2 logic dead code
First, in CodeGenPrepare.cpp, line 6891, the VectorCond will always be false
because if not function will return at 6888.
Second, in SelectionDAGBuilder.cpp, line 5443, getSExtValue() will return
value as int type, but now we use unsigned Val to maintain it, which make the
if condition at 5452 meaningless.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D149033
2023-04-28 03:02:59 -04:00
Jordan Rupprecht
fbf42f1fe2 Revert "[CodeGenPrepare] Estimate liveness of loop invariants when checking for address folding profitability"
This reverts commit 5344d8e10bb7d8672d4bfae8adb010465470d51b.

It causes non-determinism when building clang. See the review thread on D143897.
2023-04-27 19:16:32 -07:00
Nick Desaulniers
012ea747ed [CodeGen][MachineLastInstrsCleanup] fix INLINEASM_BR hazard
If the removable definition resides in an INLINEASM_BR target, the
reuseable candidate might not dominate the INLINEASM_BR.

   bb0:
      INLINEASM_BR &"" %bb.1
      renamable $x8 = MOVi64imm 29273397577910035
      B %bb.2
      ...
    bb1:
      renamable $x8 = MOVi64imm 29273397577910035
      renamable $x8 = ADDXri killed renamable $x8, 2048, 0
    bb2:

Removing the second mov is a hazard when the inline asm branches to bb1.

Skip such replacements when the to be removed instruction is in the
target of such an INLINEASM_BR instruction.

We could get more aggressive about this in the future, but for now
simply abort.

This is causing a boot failure on linux-4.19.y branches of the LTS Linux
kernel for ARCH=arm64 with CONFIG_RANDOMIZE_BASE=y (KASLR) and
CONFIG_UNMAP_KERNEL_AT_EL0=y (KPTI).

Link: https://reviews.llvm.org/D123394
Link: https://github.com/ClangBuiltLinux/linux/issues/1837

Thanks to @nathanchance for the report, and @ardb for debugging.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D149191
2023-04-27 13:40:00 -07:00
ManuelJBrito
d22edb9794 [IR][NFC] Change UndefMaskElem to PoisonMaskElem
Following the change in shufflevector semantics,
poison will be used to represent undefined elements in shufflevector masks.

Differential Revision: https://reviews.llvm.org/D149256
2023-04-27 18:01:54 +01:00
Alexis Engelke
1e743732e7 [RegAllocFast] Use uint16_t SparseT for LiveRegMap
For functions with very large numbers of live variables, lookups into
LiveRegMap previously detoriated to linear searches.

This slightly increases memory usage, but that is barely measurable.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D149330
2023-04-27 18:58:49 +02:00
Craig Topper
0b5396b163 [LegalizeVectorOps] Use all ones mask when expanding i1 VP_SELECT.
We were previously using the condition as the mask. By the semantics
of VP operations, that means that anywhere the condition is false
returns poison and not the false operand.

Use an all ones mask instead.

No tests are affected because RISC-V drops the mask when lowering.

Reviewed By: fakepaper56

Differential Revision: https://reviews.llvm.org/D149310
2023-04-27 08:26:16 -07:00
Jay Foad
fdc0d5f399 [DAG] Do not call computeKnownBits from isKnownToBeAPowerOfTwo
The only way known bits could help identify a known power of two is if
it knows exactly which power of two it is, i.e. if it is a known
constant. But in that case the value should have been simplified to a
constant already. So save some compile time by not calling
computeKnownBits.

Differential Revision: https://reviews.llvm.org/D149325
2023-04-27 11:05:56 +01:00
Luo, Yuanke
8f7f9d86a7 [X86] Machine combine vnni instruction.
"vpmaddwd + vpaddd" can be combined to vpdpwssd and the latency is
reduced after combination. However when vpdpwssd is in a critical path
the combination get less ILP. It happens when vpdpwssd is in a loop, the
vpmaddwd can be executed in parallel in multi-iterations while vpdpwssd
has data dependency for each iterations. If vpaddd is in a critical path
while vpmaddwd is not, it is profitable to split vpdpwssd into "vpmaddwd
+ vpaddd".
This patch is based on the machine combiner framework to acheive decision
on "vpmaddwd + vpaddd" combination. The typical example code is as
below.
```
__m256i foo(int cnt, __m256i c, __m256i b, __m256i *p) {

    for (int i = 0; i < cnt; ++i) {
        __m256i a = p[i];
        __m256i m = _mm256_madd_epi16 (b, a);
        c = _mm256_add_epi32(m, c);
    }

    return c;
}
```

Differential Revision: https://reviews.llvm.org/D148980
2023-04-27 16:42:04 +08:00
Jay Foad
47d3cbcf84 [BranchFolder] Skip redundant IMPLICIT_DEFs of subregs
Differential Revision: https://reviews.llvm.org/D148509
2023-04-27 09:40:06 +01:00
Mingming Liu
9879e5865a [InlineAsm][AArch64]Add backend support for flag output parameters
- The set of flag is from https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Flag-Output-Operands

Before:
- ARM64 GCC supports flag output constraints, while Clang doesn't parse condition code, as shown in https://gcc.godbolt.org/z/7jzMEK796
- LLVM ISel won't lower them either (as shown in https://gcc.godbolt.org/z/Pv4PPf56c)

After:
- Given flag output constraints in LLVM IR, condition code is parsed and flag output is lowered to 'cset'.
- Clang parse is not added in this patch.

Differential Revision: https://reviews.llvm.org/D149032
2023-04-26 09:18:41 -07:00
Felipe de Azevedo Piovezan
815eab2d3c [DebugLocEntry][nfc] Remove redundant cast
A cast from DIExpression->DIExpression is not needed.

Differential Revision: https://reviews.llvm.org/D149178
2023-04-26 07:56:15 -04:00