33671 Commits

Author SHA1 Message Date
Kazu Hirata
cbde2124f1 Use APInt::popcount instead of APInt::countPopulation (NFC)
This is for consistency with the C++20-style bit manipulation
functions in <bit>.
2023-02-19 11:29:12 -08:00
Amara Emerson
ddf167c442 [GlobalISel] Fix G_ZEXTLOAD being converted to G_SEXTLOAD incorrectly.
The extending loads combine tries to prefer sign-extends folding into loads vs
zexts, and in cases where a G_ZEXTLOAD is first used by a G_ZEXT, and then used
by a G_SEXT, it would select the G_SEXT even though the load is already
zero-extending.

Fixes issue #59630
2023-02-18 10:05:08 -08:00
Paulo Matos
890146b192 [WebAssembly] Initial support for reference type externref in clang
This patch introduces a new type __externref_t that denotes a WebAssembly opaque
reference type. It also implements builtin __builtin_wasm_ref_null_extern(),
that returns a null value of __externref_t. This lays the ground work
for further builtins and reference types.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D122215
2023-02-17 18:48:48 -08:00
Amara Emerson
b309bc04ee [GlobalISel] Combine out-of-range shifts to undef.
Differential Revision: https://reviews.llvm.org/D144303
2023-02-17 15:05:00 -08:00
Anton Sidorenko
2693efa8a5 [MachineCombiner] Support local strategy for traces
For in-order cores MachineCombiner makes better decisions when the critical path
is calculated only for the current basic block and does not take into account
other blocks from the trace.

This patch adds a virtual method to TargetInstrInfo to allow each target decide
which strategy to use.

Depends on D140541

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D140542
2023-02-17 13:17:22 +03:00
Yeting Kuo
a96cbeb450 [DAGCombiner] Teach MatchContextClass classes to use TargetLowering::isOperationLegalOrCustom().
Some of TargetLowering functions needed opcodes are often used in DAGCombiner.
The patch make those MatchContextClass classes have TargetLowering members and
pass specific opcodes for those TargetLowering functions.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D144075
2023-02-17 15:58:47 +08:00
Kazu Hirata
93de5f13b9 [CodeGen] Fix warnings
This patch fixes:

  llvm/lib/CodeGen/CallBrPrepare.cpp:154:14: error: unused variable
  'IsDominated' [-Werror,-Wunused-variable]

  llvm/lib/CodeGen/CallBrPrepare.cpp:150:13: error: unused function
  'PrintDebugDomInfo' [-Werror,-Wunused-function]
2023-02-16 20:08:35 -08:00
Nick Desaulniers
a3a84c9e25 [llvm] add CallBrPrepare pass to pipelines
Capstone of
https://discourse.llvm.org/t/rfc-syncing-asm-goto-with-outputs-with-gcc/65453/8

Clang changes are still necessary to enable the use of outputs along
indirect edges of asm goto statements.

Link: https://github.com/llvm/llvm-project/issues/53562

Reviewed By: void

Differential Revision: https://reviews.llvm.org/D140180
2023-02-16 17:58:34 -08:00
Nick Desaulniers
5cc1016a57 [llvm][SelectionDAGBuilder] codegen callbr.landingpad intrinsic
Given a CallBrInst, retain its first virtual register in SelectionDagBuilder's
FunctionLoweringInfo if there's corresponding landingpad. Walk the list
of COPY MachineInstr to find the original virtual and physical registers
defined by the INLINEASM_BR MachineInst.

Test cases from https://reviews.llvm.org/D139565.
Link: https://github.com/llvm/llvm-project/issues/59538

Part 3 from
https://discourse.llvm.org/t/rfc-syncing-asm-goto-with-outputs-with-gcc/65453/8

Follow up patches still need to wire up CallBrPrepare into the pass
pipelines.

Reviewed By: efriedma, void

Differential Revision: https://reviews.llvm.org/D140160
2023-02-16 17:58:34 -08:00
Nick Desaulniers
28d45c843c [llvm][CallBrPrepare] use SSAUpdater to use intrinsic value
Now that we've inserted a call to an intrinsic, we need to update
certain previous uses of CallBrInst values to use the value of this
intrinsic instead.

There are 3 cases to handle:
1. The @llvm.callbr.landingpad.<type>() intrinsic call is in the same
   BasicBlock as the use of the callbr we're replacing.
2. The use is dominated by the direct destination.
3. The use is not dominated by the direct destination, and may or may
   not be dominated by the indirect destination.

Part 2c of
https://discourse.llvm.org/t/rfc-syncing-asm-goto-with-outputs-with-gcc/65453/8.

Reviewed By: efriedma, void, jyknight

Differential Revision: https://reviews.llvm.org/D139970
2023-02-16 17:58:34 -08:00
Nick Desaulniers
094190c2f5 [llvm][CallBrPrepare] add llvm.callbr.landingpad intrinsic
Insert a new intrinsic call after splitting critical edges, and verify
it. Later commits will update the SSA values to use this new value along
indirect branches rather than the callbr's value, and have SelectionDAG
consume this new value.

Part 2b of
https://discourse.llvm.org/t/rfc-syncing-asm-goto-with-outputs-with-gcc/65453/8.

Reviewed By: efriedma, jyknight

Differential Revision: https://reviews.llvm.org/D139883
2023-02-16 17:58:33 -08:00
Nick Desaulniers
0a39af0eb7 [llvm][CallBrPrepare] split critical edges
If we have a CallBrInst with output that's used, we need to split
critical edges so that we have some place to insert COPYs for physregs
to virtregs.

Part 2a of
https://discourse.llvm.org/t/rfc-syncing-asm-goto-with-outputs-with-gcc/65453/8.

Test cases and logic re-purposed from D138078.

Reviewed By: efriedma, void, jyknight

Differential Revision: https://reviews.llvm.org/D139872
2023-02-16 17:58:33 -08:00
Nick Desaulniers
fb471158aa [llvm] boilerplate for new callbrprepare codegen IR pass
Because this pass is to be a codegen pass, it must use the legacy pass
manager.

Link: https://discourse.llvm.org/t/rfc-syncing-asm-goto-with-outputs-with-gcc/65453/8

Reviewed By: aeubanks, void

Differential Revision: https://reviews.llvm.org/D139861
2023-02-16 17:58:33 -08:00
Anton Sidorenko
5bdd0beeee [MachineCombiner][NFC] Rename MinInstr to TraceEnsemble
We are about to allow different trace strategies for MachineCombiner. Make
the name of the ensemble strategy-neutral.

Depends on D140540

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D140541
2023-02-16 15:09:02 +03:00
Kazu Hirata
7e6e636fb6 Use llvm::has_single_bit<uint32_t> (NFC)
This patch replaces isPowerOf2_32 with llvm::has_single_bit<uint32_t>
where the argument is wider than uint32_t.
2023-02-15 22:17:27 -08:00
Anton Sidorenko
980aa8d740 [MachineTraceMetrics] Add local strategy
This strategy makes each trace local to the basic block. For in-order cores some
heuristics work better when we do local decisions. For example, MachineCombiner
may expect that instructions outside the current basic block do not lengthen
the critical path when we execute instructions in order or the core has a
small re-order buffer.

This patch only introduce the strategy, real use-case is added in the further
pathes.

Depends on D140539

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D140540
2023-02-15 15:53:14 +03:00
Samuel Parker
c7f9344d0f [DAGCombine] Fold redundant select
Recommit bbdf24357932b064f2aa18ea1356b474e0220dde.

Original commit message:

If a chain of two selects share a true/false value and are controlled
by two setcc nodes, that are never both true, we can fold away one of
the selects. So, the following:
(select (setcc X, const0, eq), Y,
  (select (setcc X, const1, eq), Z, Y))

Can be combined to:
  select (setcc X, const1, eq) Z, Y

Differential Revision: https://reviews.llvm.org/D142535
2023-02-15 10:32:16 +00:00
Noah Goldstein
42e11a6ea3 Add transform (and/or (icmp eq/ne (A, C)), (icmp eq/ne (A, -C))) -> (icmp eq/ne (ABS A), ABS(C))
This can be beneficial if there is a fast `ABS` (For example with X86
`vpabs`) or if there is a dominating ABS(A) in the `DAG`.

Note `C` is constant so `ABS(C)` is just a constant.

Alive2 Links:
EQ: https://alive2.llvm.org/ce/z/829F-c
NE: https://alive2.llvm.org/ce/z/tsS8bU

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D142601
2023-02-14 18:59:04 -06:00
Noah Goldstein
54a9e992c8 Add Transform for (and/or (eq/ne A,Pow2),(eq/ne A,-Pow2))->(eq/ne (and (and A,Pow2),~(Pow2*2)), 0)
In many instances this can be preferable if the `icmp` -> `i1` cannot be
done in one instruction (such as X86 for scalars).

At the moment guarded behind `TLI.isDesirableToCombineLogicOpOfSETCC`.

alive2 links:
https://alive2.llvm.org/ce/z/nLm5sN
https://alive2.llvm.org/ce/z/moEcyE

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D142344
2023-02-14 18:59:04 -06:00
Jake Egan
08533f8b86 Revert "[CGP] Add generic TargetLowering::shouldAlignPointerArgs() implementation"
These commits are causing a test-suite build failure on AIX. Revert for now for time to investigate.
https://lab.llvm.org/buildbot/#/builders/214/builds/5779/steps/9/logs/stdio

This reverts commit bd87a2449da0c82e63cebdf9c131c54a5472e3a7 and 4c72266830ffa332ebb7cf1d3bbd6c56d001fa0f.
2023-02-14 15:20:06 -05:00
Jay Foad
c5085c91cc [CodeGen] Trivial simplification of some getRegisterType calls. NFC. 2023-02-14 16:31:46 +00:00
Matt Arsenault
09dd4d870e DAG: Remove hasBitPreservingFPLogic
This doesn't make sense as an option. fneg and fabs are bit
preserving by definition. If a target has some fneg or fabs
instruction that are not bitpreserving it's incorrect to lower
fneg/fabs to use it.
2023-02-14 10:25:24 -04:00
Anton Sidorenko
77bd15ae2f [MachineTraceMetrics][NFC] Move Strategy enum out of the class
Make forward declaration possible to reduce amount of dependencies and reduce
re-compilation burden caused by further patches.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D140539
2023-02-14 16:38:47 +03:00
Matt Arsenault
f3c008ca77 DAG: Relax foldBitcastedFPLogic conditions
Requiring a bitcast to exist was unhelpful. The most basic cases
are always going to be a CopyFromReg or load, so they would need
a new cast inserted. Don't require a bitcast if it's a free
operation. I don't think this logic makes particularly much sense
(it seems to be imparting special interpretation of bitcast), but
this needs to be in sync with foldSignChangeInBitcast.

We should also get rid of this hasBitPreservingFPLogic hook. fabs/fneg
are bitpreserving or incorrectly implemented, so this should just be a
regular legality check.
2023-02-14 07:59:10 -04:00
Kazu Hirata
64dad4ba9a Use llvm::bit_cast (NFC) 2023-02-14 01:22:12 -08:00
Fangrui Song
1e6921131a Move global namespace cl::opt inside llvm:: 2023-02-14 00:09:44 -08:00
pvanhout
04f6934589 [DAG] Handle build_vector with all undefs in reduceBuildVecTruncToBitCast
While working on D143731 I hit a case where a build_vector with 2 undef operands could be generated (with one undef hidden behind a bitcast).
That made `reduceBuildVecTruncToBitCast` crash because it seems to assume there is at least one good operand.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D143886
2023-02-14 08:52:28 +01:00
Arthur Eubanks
7c6b46e87e Revert "[DAGCombiner] handle more store value forwarding"
This reverts commit f35a09daebd0a90daa536432e62a2476f708150d.

Causes miscompiles, see D138899
2023-02-13 19:07:28 -08:00
Arthur Eubanks
ac6219d0ae Revert "[DAGCombiner] fix comments for D138899; NFC"
This reverts commit 63854f91d3ee1056796a5ef27753648396cac6ec.

Dependent commit to be reverted.
2023-02-13 19:07:27 -08:00
Chen Zheng
6ee2f770ef [PowerPC][GISel] add support for fpconstant
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D133340
2023-02-14 02:39:22 +00:00
Dinar Temirbulatov
d44b31eca2 [DAGCombine] Allow DAGCombine to remove dead masked stores.
Remove a dead masked store if another one has the same base pointer and mask or
the following store has all true constant mask and size if equal or bigger to
the first store.

Differential Revision: https://reviews.llvm.org/D143069
2023-02-13 16:11:11 +00:00
Samuel Parker
2a58be4239 [HardwareLoops] NewPM support.
With the NPM, we're now defaulting to preserving LCSSA, so a couple
of tests have changed slightly.

Differential Revision: https://reviews.llvm.org/D140982
2023-02-13 09:46:31 +00:00
Darshan Bhat
19c42f672f [DFAPacketizer] Move DefaultVLIWScheduler class declaration to header file
This change moves "DefaultVLIWScheduler" class declaration from
DFAPacketizer.cpp to DFAPacketizer.h.
This is needed because there is a protected class member of
type "DefaultVLIWScheduler*" in "VLIWPacketizerList" class.
The derived classes cannot use this memeber unless declaration
is available to it. More specifically :

// Without this change

```
class HexagonPacketizerList : public VLIWPacketizerList {
  public :
	HexagonPacketizerList() {
	// Below line will cause incomplete class error since
	// declaration was not available through header.
	VLIWScheduler->schedule();
  }
}
```

Reviewed By: kparzysz

Differential Revision: https://reviews.llvm.org/D139767
2023-02-11 14:31:58 +05:30
Benjamin Maxwell
f1837c7074 [DebugInfo] Handle missed DW_FORM_addrx3 and DW_FORM_strx3 cases
This fixes a few places where the addrx3 and strx3 forms were missed.
Previously this meant if one of these forms appeared somewhere various
errors could occur. This now also adds an extra test case for the addrx3
form (which previously failed).

Differential Revision: https://reviews.llvm.org/D143488
2023-02-10 14:44:18 +00:00
OCHyams
25d0f3c4d0 [Assignment Tracking] Fix fragment index error in getDerefOffsetInBytes
Without this patch `getDerefOffsetInBytes` incorrectly always returns
`std::nullopt` for expressions with fragments due to an off-by-one error with
fragment element indices.

Reviewed By: StephenTozer

Differential Revision: https://reviews.llvm.org/D143567
2023-02-10 13:49:05 +00:00
Archibald Elliott
d768bf994f [NFC][TargetParser] Replace uses of llvm/Support/Host.h
The forwarding header is left in place because of its use in
`polly/lib/External/isl/interface/extract_interface.cc`, but I have
added a GCC warning about the fact it is deprecated, because it is used
in `isl` from where it is included by Polly.
2023-02-10 09:59:46 +00:00
Pierre van Houtryve
d9a6fc82f5 [AMDGPU] Run unmerge combines post regbankselect
RegBankSelect can insert G_UNMERGE_VALUES in a lot of places which
left us with a lot of unmerge/merge pairs that could be simplified.
These often got in the way of pattern matching and made codegen
worse.

This patch:
  - Makes the necessary changes to the merge/unmerge combines so they can run post RegBankSelect
  - Adds relevant unmerge combines to the list of RegBankSelect combines for AMDGPU
  - Updates some tablegen patterns that were missing explicit cross-regbank copies (V_BFI patterns were causing constant bus violations with this change).

This seems to be mostly beneficial for code quality.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D142192
2023-02-10 08:34:23 +01:00
Michael Buch
657672667f [llvm][DebugInfo] Add DW_TAG_imported_declaration to accelerator tables
**Summary**

After this patch, `DW_TAG_imported_declaration`s will be emitted into
the DWARF accelerator tables (under `.apple_namespaces`)

**Motivation**

Currently LLDB expression evaluation doesn't see through namespace
aliases. This is because LLDB only considers namespaces that are
part of `.apple_namespaces` when building a nested namespace
identifier for C++, which currently doesn't include import
declarations. The alternative to putting imports into accelerator
tables is to do a linear scan of a `DW_TAG_namespace` and look
for import declarations that look like they would satisfy the lookup
request, which is prohibitively expensive.

**Testing**

* Added unit-test

Differential Revision: https://reviews.llvm.org/D143397
2023-02-10 01:33:51 +00:00
duk
d61d591411
[MachineOutliner] Make getOutliningType partially target-independent
The motivation behind this patch is to unify some of the outliner logic across architectures. This looks nicer in general and makes fixing [issues like this](https://reviews.llvm.org/D124707#3483805) easier.
There are some notable changes here:
    1. `isMetaInstruction()` is used directly instead of checking for specific meta-instructions like `IMPLICIT_DEF` or `KILL`. This was already done in the RISC-V implementation, but other architectures still did hardcoded checks.
        - As an exception to this, CFI instructions are explicitly delegated to the target because RISC-V has different handling for those.

    2. `isTargetIndex()` checks are replaced with an assert; none of the architectures supported actually use `MO_TargetIndex` at this point in time.

    3. `isCFIIndex()` and `isFI()` checks are also replaced with asserts, since these operands should not exist in [any context](https://reviews.llvm.org/D122635#3447214) at this stage in the pipeline.

Reviewed by: paquette

Differential Revision: https://reviews.llvm.org/D125072
2023-02-09 14:35:00 -05:00
Amir Aupov
782045e727 Revert "HHVM calling conventions."
This reverts commit cce239c45d6ef3865a017b5b3f935964e0348734.

HHVM calling conventions are unused. Remove them by partially reverting the commit.

Reviewed By: MaskRay, MatzeB

Differential Revision: https://reviews.llvm.org/D124330
2023-02-09 10:53:11 -08:00
Andrew Savonichev
c65b4d64d4 [SelectionDAG] Do not second-guess alignment for alloca
Alignment of an alloca in IR can be lower than the preferred alignment
on purpose, but this override essentially treats the preferred
alignment as the minimum alignment.

The patch changes this behavior to always use the specified
alignment. If alignment is not set explicitly in LLVM IR, it is set to
DL.getPrefTypeAlign(Ty) in computeAllocaDefaultAlign.

Tests are changed as well: explicit alignment is increased to match
the preferred alignment if it changes output, or omitted when it is
hard to determine the right value (e.g. for pointers, some structs, or
weird types).

Differential Revision: https://reviews.llvm.org/D135462
2023-02-09 18:45:20 +03:00
Alex Richardson
4c72266830 Fix call to deprecated API in bd87a2449da0c82e63cebdf9c131c54a5472e3a7 2023-02-09 10:26:33 +00:00
Alex Richardson
bd87a2449d [CGP] Add generic TargetLowering::shouldAlignPointerArgs() implementation
This function was added for ARM targets, but aligning global/stack pointer
arguments passed to memcpy/memmove/memset can improve code size and
performance for all targets that don't have fast unaligned accesses.
This adds a generic implementation that adjusts the alignment to pointer
size if unaligned accesses are slow.
Review D134168 suggests that this significantly improves performance on
synthetic benchmarks such as Dhrystone on RV32 as it avoids memcpy() calls.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D134282
2023-02-09 10:11:40 +00:00
Aiden Grossman
a95aa86b53 [MLGO] Add BB Profile Dump in AsmPrinter
This patch adds a basic block profile dump option within the AsmPrinter
and dumps basic block profile information so that cost models can use
the data for downstream analysis.

Differential Revision: https://reviews.llvm.org/D143311
2023-02-08 23:13:42 +00:00
Simon Pilgrim
ce63cd3bf1 [DAG] Fold freeze(concat_vectors(x,y,...)) -> concat_vectors(freeze(x),freeze(y),...)
Another of the cleanups necessary for D136529
2023-02-08 20:26:43 +00:00
Simon Pilgrim
b7deb71ef5 [DAG] Fold freeze(build_pair(x,y)) -> build_pair(freeze(x),freeze(y))
One of the cleanups necessary for D136529 - another being how we're going to handle moving freeze through multiple result nodes (like uaddo and subcarry)
2023-02-08 17:54:03 +00:00
Marco Elver
bf9814b705 [SanitizerBinaryMetadata] Emit constants as ULEB128
Emit all constant integers produced by SanitizerBinaryMetadata as
ULEB128 to further reduce binary space used. Increasing the version is
not necessary given this change depends on (and will land) along with
the bump to v2.

To support this, the !pcsections metadata format is extended to allow
for per-section options, encoded in the first MD operator which must
always be a string and contain the section: "<section>!<options>".

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D143484
2023-02-08 13:12:34 +01:00
Marco Elver
3d53b52730 [SanitizerBinaryMetadata] Optimize used space for features and UAR stack args
Optimize the encoding of "covered" metadata by:

 1. Reducing feature mask from 4 bytes to 1 byte (needs increase once we
    reach more than 8 features).

 2. Only emitting UAR stack args size if it is non-zero, saving 4 bytes
    in the common case.

One caveat is that the emitted metadata for function PC (offset), size,
and UAR size (if enabled) are no longer aligned to 4 bytes.

SanitizerBinaryMetadata version base is increased to 2, since the change
is backwards incompatible.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D143482
2023-02-08 13:12:33 +01:00
David Green
1af3f596f6 [DAG] Fold Op(vecreduce(a), vecreduce(b)) into vecreduce(Op(a,b))
So long as the operation is reassociative, we can reassociate the double
vecreduce from for example fadd(vecreduce(a), vecreduce(b)) to
vecreduce(fadd(a,b)). This will in general save a few instructions, but some
architectures (MVE) require the opposite fold, so a shouldExpandReduction is
added to account for it. Only targets that use shouldExpandReduction will be
affected.

Differential Revision: https://reviews.llvm.org/D141870
2023-02-08 11:43:36 +00:00
Fangrui Song
a13645cf8c DAGCombiner: fix -Wunused-private-field. NFC 2023-02-07 22:33:56 -08:00