149308 Commits

Author SHA1 Message Date
Johannes Doerfert
d4bfce5521 [Attributor] Utilize the InstSimplify interface to simplify instructions
When we simplify at least one operand in the Attributor simplification
we can use the InstSimplify to work on the simplified operands. This
allows us to avoid duplication of the logic.

Depends on D106189

Differential Revision: https://reviews.llvm.org/D106190
2021-07-27 00:56:23 -05:00
Johannes Doerfert
75636868e2 [InstSimplify] Expose generic interface for replaced operand simplification
Users, especially the Attributor, might replace multiple operands at
once. The actual implementation of simplifyWithOpReplaced is able to
handle that just fine, the interface was simply not allowing to replace
more than one operand at a time. This is exposing a more generic
interface without intended changes for existing code.

Differential Revision: https://reviews.llvm.org/D106189
2021-07-27 00:56:12 -05:00
Chuanqi Xu
0237dbfdd3 [Coroutine] Record the elided coroutines
Reviewed By: lxfind

Differential Revision: https://reviews.llvm.org/D105606
2021-07-27 13:14:09 +08:00
wlei
f0d41b58da [CSSPGO] Tweak ICP threshold in top-down inliner
This change slightly relaxed the current ICP threshold in top-down inliner, specifically always allow one ICP for it. It shows some perf improvements on SPEC and our internal benchmarks. Also renamed the previous flag. We can also try to turn off PGO ICP in the future.

Reviewed By: wenlei, hoy, wmi

Differential Revision: https://reviews.llvm.org/D106588
2021-07-26 21:49:20 -07:00
Johannes Doerfert
25a3130d89 [Local] Do not introduce a new llvm.trap before unreachable
This is the second attempt to remove the `llvm.trap` insertion after
https://reviews.llvm.org/rGe14e7bc4b889dfaffb7180d176a03311df2d4ae6
reverted the first one. It is not clear what the exact issue was back
then and it might already be gone by now, it has been >5 years after
all.

Replaces D106299.

Differential Revision: https://reviews.llvm.org/D106308
2021-07-26 23:33:36 -05:00
Johannes Doerfert
41bd26dff9 [Attributor] Delete dead stores
D106185 allows us to determine if a store is needed easily. Using that
knowledge we can start to delete dead stores.

In AAIsDead we now track more state as an instruction can be dead (= the
old optimisitc state) or just "removable". A store instruction can be
removable while being very much alive, e.g., if it stores a constant
into an alloca or internal global. If we would pretend it was dead
instead of only removablewe we would ignore it when we determine what
values a load can see, so that is not what we want.

Differential Revision: https://reviews.llvm.org/D106188
2021-07-26 23:33:36 -05:00
Johannes Doerfert
adddd3dbda [Attributor] Introduce getPotentialCopiesOfStoredValue and use it
This patch introduces `getPotentialCopiesOfStoredValue` which uses
AAPointerInfo to determine all "aliases" or "potential copies" of a
value that is stored into memory. This operation can fail but if it
succeeds it means we can visit all "uses" of a value even if it is
temporarily stored in memory.

There are two users for the function:
  1) `Attributor::checkForAllUses` which will now ignore the value use
     in a store if all "potential copies" can be identified and instead
     be visited. This allows various AAs, including AAPointerInfo
     itself, to look through memory.
  2) `AANoCapture` which uses a custom use tracking through the
     CaptureTracker interface and therefore needs to be thought
     explicitly.

Differential Revision: https://reviews.llvm.org/D106185
2021-07-26 23:33:36 -05:00
Mehdi Amini
402461beb0 Build libSupport with -Werror=global-constructors (NFC)
Ensure that libSupport does not carry any static global initializer.
libSupport can be embedded in use cases where we don't want to load all
cl::opt unless we want to parse the command line.
ManagedStatic can be used to enable lazy-initialization of globals.

The -Werror=global-constructors is only added on platform that have
support for the flag and for which std::mutex does not have a global
destructor. This is ensured by having CMake trying to compile a file
with a global mutex before adding the flag to libSupport.
2021-07-27 04:27:18 +00:00
Craig Topper
2ea9db0c49 [AArch64] Fix -Wparentheses warning with gcc 5.4. NFC 2021-07-26 21:08:56 -07:00
Jun Ma
958dddf7df [NFC][InstCombine] Fix typo 2021-07-27 11:33:10 +08:00
Mitch Phillips
ae70b211eb Revert "[GlobalISel] Add scalar widening for G_MERGE_VALUES destination"
This reverts commit 0a37163d1d855a2db41e1f46ddbc3f4570bd7ca6.

Reason: Broke the sanitizer msan bots. More details are available in the
original Phabricator review: https://reviews.llvm.org/D106814.
2021-07-26 19:52:12 -07:00
Shilei Tian
e97e0a4fad [AbstractAttributor] Fold __kmpc_parallel_level if possible
Similar to D105787, this patch tries to fold `__kmpc_parallel_level` if possible.
Note that `__kmpc_parallel_level` doesn't take activeness into consideration,
based on current `deviceRTLs`, its return value can be such as 0, 1, 2, instead
of 0, 129, 130, etc. that also indicate activeness.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106154
2021-07-26 22:46:19 -04:00
Johannes Doerfert
be2b569646 [OpenMP] Run rewriteDeviceCodeStateMachine in the Module not CGSCC pass
While rewriteDeviceCodeStateMachine should probably be folded into
buildCustomStateMachine, we at least need the optimization to happen.
This was not reliably the case in the CGSCC pass but in the Module pass
it seems to work reliably.

This also ports a test to the new kernel encoding (target_init/deinit),
and makes sure we cannot run the kernel in SPMD mode.

Differential Revision: https://reviews.llvm.org/D106345
2021-07-26 21:26:07 -05:00
Johannes Doerfert
e6f3e648c9 [Attributor][FIX] Do not return CHANGED unconditionally
This caused us to rerun AAMemoryBehaviorFloating::updateImpl over and
over again. Unfortunately it turned out to be hard to reproduce the
behavior in a reasonable way.
2021-07-26 21:22:02 -05:00
Johannes Doerfert
8befd05aad [Attributor][FIX] Track change status for AAIsDead properly
If we add a new live edge we need to indicate a change or otherwise the
new live block is not shown to users. Similarly, new known dead ends and
a changed `ToBeExploredFrom` set need to cause us to return CHANGED.
2021-07-26 21:21:59 -05:00
Carl Ritson
fbaa35e169 [AMDGPU] Add SelectionDAG support for insert_subvector on v4f64
Enable custom insert_subvector for larger vector types.
This is necessary now that SelectionDAG can attempt v3f64 insert
to v4f64, etc.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D105385
2021-07-27 10:11:34 +09:00
Mehdi Amini
2f49eb4794 Revert "Build libSupport with -Werror=global-constructors (NFC)"
This reverts commit beff86e8ff429f11da6fe37efde86d22ea636ed5.

The sanitizer-x86_64-linux bot is still broken.
2021-07-27 01:08:18 +00:00
Nemanja Ivanovic
9654cfd5bb [PowerPC] Fix materialization of SP float values on Power10
All floating point values in registers are in double precision
representation. In order to materialize the correct single precision
value, we need to convert the APFloat that represents the value
to double precision first.

Reviewed By: amyk, NeHuang

Differential Revision: https://reviews.llvm.org/D106812
2021-07-26 19:43:10 -05:00
Jon Roelofs
f2e8e46d78 Revert "[AArch64][GlobalISel] Legalize ctpop s128"
This reverts commit 97e95fea53fc403c2a12e356dc835fc922123575.

It broke test/CodeGen/Mips/GlobalISel/llvm-ir/ctpop.ll. Not sure why I didn't see that.
2021-07-26 17:06:43 -07:00
Jessica Paquette
0a37163d1d [GlobalISel] Add scalar widening for G_MERGE_VALUES destination
This adds support for the case where

WideSize = DstSize + K * SrcSize

In this case, we can pad the G_MERGE_VALUES instruction with K extra undef
values with width SrcSize. Then the destination can be handled via
widenScalarDst.

Differential Revision: https://reviews.llvm.org/D106814
2021-07-26 17:00:00 -07:00
Philip Reames
f82f39b9cf [SCEV] Add a comment about invariant in howManyLessThans 2021-07-26 16:39:26 -07:00
Jon Roelofs
97e95fea53 [AArch64][GlobalISel] Legalize ctpop s128
Differential revision: https://reviews.llvm.org/D106494
2021-07-26 16:33:50 -07:00
Masoud Ataei
45951ad323 [PowerPC] Add pwr7 and pwr10 support to IBM MASSV pass on AIX
Before MASSV only supported P8 and P9 on AIX ans Linux . This patch proposes
MASSV to add support of P7 and P10 only on AIX too.

Differential: https://reviews.llvm.org/D106678
2021-07-26 23:21:38 +00:00
Mehdi Amini
beff86e8ff Build libSupport with -Werror=global-constructors (NFC)
Ensure that libSupport does not carry any static global initializer.
libSupport can be embedded in use cases where we don't want to load all
cl::opt unless we want to parse the command line.
ManagedStatic can be used to enable lazy-initialization of globals.

The -Werror=global-constructors is only added on platform that have
support for the flag and for which std::mutex does not have a global
destructor. This is ensured by having CMake trying to compile a file
with a global mutex before adding the flag to libSupport.
2021-07-26 23:06:15 +00:00
Amara Emerson
172051a1f4 [AArch64][GlobalISel] Add identity combines to post-legal combiner.
We see some shifts of zero emitted during legalization.

Differential Revision: https://reviews.llvm.org/D106816
2021-07-26 15:17:11 -07:00
Amara Emerson
c658b472f3 [GlobalISel] Add a constant folding combine.
Use it AArch64 post-legal combiner. These don't always get folded because when
the instructions are created the constants are obscured by artifacts.

Differential Revision: https://reviews.llvm.org/D106776
2021-07-26 14:53:33 -07:00
Heejin Ahn
a48ee9f255 [WebAssembly] Remove dominator dependency in WasmEHPrepare (NFC)
Dominator trees were previously used for an optimization related to
`wasm.lsda` but the optimization was removed in D97309. Currently
dominators are not doing anything in this pass. Also removes some
`include` lines without which it compiles.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D106811
2021-07-26 14:45:13 -07:00
Heejin Ahn
c285a11efd [WebAssembly] Make Emscripten EH work with Emscripten SjLj
When Emscripten EH mixes with Emscripten SjLj, we are not currently
handling some of them correctly. There are three cases:
1. The current function calls `setjmp` and there is an `invoke` to a
   function that can either throw or longjmp. In this case, we have to
   check both for exception and longjmp. We are currently handling this
   case correctly:
   0c0eb76782/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp (L1058-L1090)
   When inserting routines for functions that can longjmp, which we do
   only for setjmp-calling functions, we check if the function was
   previously an `invoke` and handle it correctly.

2. The current function does NOT call `setjmp` and there is an `invoke`
   to a function that can either throw or longjmp. Because there is no
   `setjmp` call, we haven't been doing any check for functions that can
   longjmp. But in that case, for `invoke`, we only check for an
   exception and if it is not an exception we reset `__THREW__` to 0,
   which can silently swallow the longjmp:
   0c0eb76782/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp (L70-L80)
   This CL fixes this.

3. The current function calls `setjmp` and there is no `invoke`. Because
   it is not an `invoke`, we haven't been doing any check for functions
   that can throw, and only insert longjmp-checking routines for
   functions that can longjmp. But in that case, if a longjmpable
   function throws, we only check for a longjmp so if it is not a
   longjmp we reset `__THREW__` to 0, which can silently swallow the
   exception:
   0c0eb76782/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp (L156-L169)
   This CL fixes this.

To do that, this moves around some code, so we register necessary
functions for both EH and SjLj and precompute some data (the set of
functions that contains `setjmp`) before doing actual EH or SjLj
transformation.

This CL makes 2nd and 3rd tests in
https://github.com/emscripten-core/emscripten/pull/14732 work.

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D106525
2021-07-26 13:48:31 -07:00
Roman Lebedev
1901c98dd8
[SimplifyCFG] SwitchToLookupTable(): don't increase ret count
The very next SimplifyCFG pass invocation will tail-merge these two ret's
anyways, there is not much point in creating more work for ourselves.
2021-07-26 23:29:55 +03:00
Roman Lebedev
08efc2e68d
[SimplifyCFG] Drop support for simplifying cond branch to two (different) ret's
Nowadays, simplifycfg pass already tail-merges all the ret blocks together
before doing anything, and it should not increase the count of ret's,
so this is dead code.
2021-07-26 23:29:52 +03:00
Roman Lebedev
7c5f104e45
[SimplifyCFG] Drop support for duplicating ret's into uncond predecessors
This functionality existed only under a default-off flag,
and simplifycfg nowadays prefers to not increase the count of ret's.
2021-07-26 23:29:21 +03:00
Matheus Izvekov
f84c70a379 [CodeView] Saturate values bigger than supported by APInt.
This fixes an assert firing when compiling code which involves 128 bit
integrals.

This would trigger runtime checks similar to this:
```
Assertion failed: getMinSignedBits() <= 64 && "Too many bits for int64_t", file llvm/include/llvm/ADT/APInt.h, line 1646
```

To get around this, we just saturate those big values.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D105320
2021-07-26 22:15:26 +02:00
Lei Huang
64a15817a0 [PowerPC]Add addex instruction definition and MC tests
Add td definitions and asm/disasm tests for the addex instruction introduced in
ISA 3.0.

Reviewed By: nemanjai, amyk, NeHuang

Differential Revision: https://reviews.llvm.org/D106666
2021-07-26 14:55:38 -05:00
Sander de Smalen
13ccb09725 [LV] Don't let ForceTargetInstructionCost override Invalid cost.
Invalid costs can be used to avoid vectorization with a given VF, which is
used for scalable vectors to avoid things that the code-generator cannot
handle. If we override the cost using the -force-target-instruction-cost
option of the LV, we would override this mechanism, rendering the flag useless.

This change ensures the cost is only overriden when the original cost that
was calculated is valid. That allows the flag to be used in combination
with the -scalable-vectorization option.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D106677
2021-07-26 20:27:49 +01:00
Reid Kleckner
d56e698552 [SimplifyCFG] Remove stale comment after d7378259aa, NFC 2021-07-26 12:25:29 -07:00
Reid Kleckner
3230493299 Fix clang debug info irgen of i128 enums
DIEnumerator stores an APInt as of April 2020, so now we don't need to
truncate the enumerator value to 64 bits. Fixes assertions during IRGen.

Split from D105320, thanks to Matheus Izvekov for the test case and
report.

Differential Revision: https://reviews.llvm.org/D106585
2021-07-26 12:25:29 -07:00
Lei Huang
2d788959ed [PowerPC] Add implicit-def RM to instructions mtfsb[01]
This is a followup patch for D105930 to add implicit-def of RM for
mtfsb[01] instructions as per review comments.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D106603
2021-07-26 14:07:08 -05:00
Joseph Huber
e757a3b05f [OpenMP][NFC] Remove unncessary capture in RAII struct
Summary:
There was an unnecessary variable assigned to the information cache when we
only need it in the constructor to extract the function declaration.
2021-07-26 15:05:55 -04:00
Michael Liao
b0402a35fc [amdgpu] Add 64-bit PC support when expanding unconditional branches.
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D106445
2021-07-26 14:50:30 -04:00
Craig Topper
14e356d121 [TypePromotion] Remove redundant if. NFC
The same condition was checked in the previous if. Maybe this was
a bad merge resolution?
2021-07-26 11:47:25 -07:00
Amara Emerson
6af8d36054 [AArch4][GlobalISel] Post-legalize combine s64 = G_MERGE s32, 0 -> G_ZEXT.
These are generated as a byproduce of legalization.

Differential Revision: https://reviews.llvm.org/D106768
2021-07-26 10:58:04 -07:00
Eli Friedman
5c486ce04d [LLVM IR] Allow volatile stores to trap.
Proposed alternative to D105338.

This is ugly, but short-term I think it's the best way forward: first,
let's formalize the hacks into a coherent model. Then we can consider
extensions of that model (we could have different flavors of volatile
with different rules).

Differential Revision: https://reviews.llvm.org/D106309
2021-07-26 10:51:00 -07:00
Amara Emerson
0d41d21929 [AArch64][GlobalISel] Enable some select combines after legalization.
The legalizer generates selects for some operations, which can have constant
condition values, resulting in lots of dead code if it's not folded away.

Differential Revision: https://reviews.llvm.org/D106762
2021-07-26 10:40:32 -07:00
Amara Emerson
dec34104bf [GlobalISel] Add combine for merge(unmerge) and use AArch64 postlegal-combiner.
Differential Revision: https://reviews.llvm.org/D106761
2021-07-26 10:37:31 -07:00
Heejin Ahn
6b9aba43a2 [WebAssembly] Improve pseudocode in LowerEmscriptenEHSjLj
Both `__THREW__` and `__threwValue` are global variables, and we have
been distinguishing the global variable `__THREW__` and the loaded value
`%__THREW__.val` in comments but not doing it for `__threwValue`. Made
the pseudocode comments consistent for both variables.

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D106524
2021-07-26 10:13:28 -07:00
Florian Hahn
6d753b0751
[LAA] Remove RuntimeCheckingPtrGroup::RtCheck member (NFC).
This patch removes RtCheck from RuntimeCheckingPtrGroup to make it
possible to construct RuntimeCheckingPtrGroup objects without a
RuntimePointerChecking object. This should make it easier to
re-use the code to generate runtime checks, e.g. in D102834.

RtCheck was only used to access the pointer info for a given index.
Instead, the start and end expressions can be passed directly.

For code-gen, we also need to know the address space to use. This can
also be explicitly passed at construction.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D105481
2021-07-26 17:38:10 +01:00
Stephen Tozer
31e7551217 [DebugInfo] Correctly update debug users of SSA values in tail duplication
During tail duplication, SSA values may be updated and have their uses
replaced with a virtual register, and any debug instructions that use
that value are deleted. This patch fixes the implementation of the debug
instruction deletion to work correctly for debug instructions that use
the SSA value multiple times, by batching deletions so that we don't
attempt to delete the same instruction twice.

Differential Revision: https://reviews.llvm.org/D106557
2021-07-26 17:27:57 +01:00
Sander de Smalen
b9051ba848 [LV] Remove assert that VF cannot be scalable in setCostBasedWideningDecision.
Scalarization for scalable vectors is not (yet) supported, so the
LV discards a VF when scalarization is chosen as the widening
decision. It should therefore not assert that the VF is not scalable
when it computes the decision to scalarize.

The code can get here when both the interleave-cost, gather/scatter cost
and scalarization-cost are all illegal. This may e.g. happen for SVE
when the VF=1, to avoid generating `<vscale x 1 x eltty>` types that
the code-generator cannot yet handle.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D106656
2021-07-26 17:11:45 +01:00
Nikita Popov
f921bf6049 [MergeICmps] Collect block instructions once (NFC)
Collect the relevant instructions for a given BCECmpBlock once
on construction, rather than repeating this logic in multiple
places.
2021-07-26 18:07:20 +02:00
Fangrui Song
c0da287c30 [yaml2obj][MachO] Rename PayloadString to Content
The new name is conciser and matches yaml2obj ELF & DWARF.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D106759
2021-07-26 09:04:51 -07:00