861 Commits

Author SHA1 Message Date
Florian Hahn
292077072e
[Local] Treat calls that may not return as being alive.
With the addition of the `willreturn` attribute, functions that may
not return (e.g. due to an infinite loop) are well defined, if they are
not marked as `willreturn`.

This patch updates `wouldInstructionBeTriviallyDead` to not consider
calls that may not return as dead.

This patch still provides an escape hatch for intrinsics, which are
still assumed as willreturn unconditionally. It will be removed once
all intrinsics definitions have been reviewed and updated.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D94106
2021-01-23 16:05:14 +00:00
Roman Lebedev
32fc32317a
[SimplifyCFG] markAliveBlocks(): catchswitch: preserve PostDomTree
When removing catchpad's from catchswitch, if that removes a successor,
we need to record that in DomTreeUpdater.

This fixes PostDomTree preservation failure in an existing test.
This appears to be the single issue that i see in my current test coverage.
2021-01-17 01:21:05 +03:00
Roman Lebedev
90a92f8b4d
[NFCI][Utils/Local] removeUnreachableBlocks(): cleanup support for lazy DomTreeUpdater
When DomTreeUpdater is in lazy update mode, the blocks
that were scheduled to be removed, won't be removed
until the updates are flushed, e.g. by asking
DomTreeUpdater for a up-to-date DomTree.

From the function's current code, it is pretty evident
that the support for the lazy mode is an afterthought,
see e.g. how we roll-back NumRemoved statistic..

So instead of considering all the unreachable blocks
as the blocks-to-be-removed, simply additionally skip
all the blocks that are already scheduled to be removed
2021-01-12 02:09:47 +03:00
Roman Lebedev
8e8d214c4a
[NFCI][SimplifyCFG] Prefer to add Insert edges before Delete edges into DomTreeUpdater, if reasonable
This has a measurable impact on the number of DomTree recalculations.
While this doesn't handle all the cases,
it deals with the most obvious ones.
2021-01-11 00:30:44 +03:00
Roman Lebedev
f2f81c554b
[SimplifyCFG] markAliveBlocks(): switch to non-permissive DomTree updates
No actual changes needed, invoke can't have the same block as an unwind
destination and a normal destination.
2021-01-08 02:15:27 +03:00
Roman Lebedev
d59f97bb3a
[SimplifyCFG] removeUnwindEdge(): switch to non-permissive DomTree updates
No actual changes needed, Catchswitch cannot unwind to one of its catchpads.
2021-01-08 02:15:27 +03:00
Roman Lebedev
f0eba8ce2d
[SimplifyCFG] changeToCall(): switch to non-permissive DomTree updates
No actual changes needed, normal and unwind destinations of an invoke
can never be identical.
2021-01-08 02:15:27 +03:00
Roman Lebedev
05adc73db0
[SimplifyCFG] changeToUnreachable(): switch to non-permissive DomTree updates
... which requires not deleting edges that were just deleted already,
    by not processing the same predecessor more than once.
2021-01-08 02:15:26 +03:00
Roman Lebedev
7600d7c7be
[SimplifyCFG] removeUnreachableBlocks(): switch to non-permissive DomTree updates
... which requires not deleting edges that were just deleted already,
    by not processing the same predecessor more than once.
2021-01-08 02:15:26 +03:00
Roman Lebedev
1f9b591ee6
[SimplifyCFG] TryToSimplifyUncondBranchFromEmptyBlock(): switch to non-permissive DomTree updates
... which requires not deleting edges that were just deleted already,
    by not processing the same predecessor more than once.
2021-01-08 02:15:25 +03:00
Roman Lebedev
b3822728fa
[SimplifyCFG] ConstantFoldTerminator(): switch to non-permissive DomTree updates in indirectbr handling
... which requires not deleting edges that were just deleted already.
2021-01-08 02:15:25 +03:00
Roman Lebedev
36593a30a4
[SimplifyCFG] ConstantFoldTerminator(): switch to non-permissive DomTree updates in SwitchInst handling
... which requires not deleting edges that will still be present.
2021-01-08 02:15:24 +03:00
Roman Lebedev
16ab8e5f6d
[SimplifyCFG] ConstantFoldTerminator(): handle matching destinations of condbr earlier
We need to handle this case before dealing with the case of constant
branch condition, because if the destinations match, latter fold
would try to remove the DomTree edge that would still be present.

This allows to make that particular DomTree update non-permissive
2021-01-08 02:15:24 +03:00
Francesco Petrogalli
dfd3384fee [InstCombine] Update valueCoversEntireFragment to use TypeSize
* Update valueCoversEntireFragment to use TypeSize.
* Add a regression test.
* Assertions have been added to protect untested codepaths.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D91806
2021-01-06 17:14:59 +00:00
Kazu Hirata
16d20e2554 [Transforms/Utils] Construct SmallVector with iterator ranges (NFC) 2020-12-29 19:23:23 -08:00
Kazu Hirata
46bea9b297 [Local] Remove unused function RemovePredecessorAndSimplify (NFC)
The last use of the function was removed on Sep 29, 2010 in commit
99c985c37dd45dd0fbd03863037d8e93153783e6.
2020-12-25 09:35:20 -08:00
Fangrui Song
b5ad32ef5c Migrate deprecated DebugLoc::get to DILocation::get
This migrates all LLVM (except Kaleidoscope and
CodeGen/StackProtector.cpp) DebugLoc::get to DILocation::get.

The CodeGen/StackProtector.cpp usage may have a nullptr Scope
and can trigger an assertion failure, so I don't migrate it.

Reviewed By: #debug-info, dblaikie

Differential Revision: https://reviews.llvm.org/D93087
2020-12-11 12:45:22 -08:00
Kazu Hirata
43c0e4f665 [Transforms] Use llvm::is_contained (NFC) 2020-11-18 20:42:22 -08:00
Arthur Eubanks
5c31b8b94f Revert "Use uint64_t for branch weights instead of uint32_t"
This reverts commit 10f2a0d662d8d72eaac48d3e9b31ca8dc90df5a4.

More uint64_t overflows.
2020-10-31 00:25:32 -07:00
Arthur Eubanks
10f2a0d662 Use uint64_t for branch weights instead of uint32_t
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32.
To be more consistent everywhere and remove lots of casts from uint64_t
to uint32_t, use i64 for branch_weights.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D88609
2020-10-30 10:03:46 -07:00
Simon Pilgrim
bce770ffa6 Revert rG0905bd5c2fa42bd4c "[InstCombine] collectBitParts - add trunc support."
This reverts commit 0905bd5c2fa42bd4c0e6e0aaa08b966f165b9dfa.

Causing failures in multistage buildbots that I need to investigate
2020-10-27 13:43:54 +00:00
Nico Weber
2a4e704c92 Revert "Use uint64_t for branch weights instead of uint32_t"
This reverts commit e5766f25c62c185632e3a75bf45b313eadab774b.
Makes clang assert when building Chromium, see https://crbug.com/1142813
for a repro.
2020-10-27 09:26:21 -04:00
Simon Pilgrim
0905bd5c2f [InstCombine] collectBitParts - add trunc support.
This should allow us to remove the rather limited matchOrConcat fold and just use recognizeBSwapOrBitReverseIdiom.
2020-10-27 13:14:54 +00:00
Arthur Eubanks
e5766f25c6 Use uint64_t for branch weights instead of uint32_t
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32.
To be more consistent everywhere and remove lots of casts from uint64_t
to uint32_t, use i64 for branch_weights.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D88609
2020-10-26 20:24:04 -07:00
Simon Pilgrim
532f3bec3e [InstCombine] collectBitParts - add bitreverse intrinsic support. 2020-10-26 14:36:36 +00:00
Artur Pilipenko
6ec2c5e402 GC-parseable element atomic memcpy/memmove
This change introduces a GC parseable lowering for element atomic
memcpy/memmove intrinsics. This way runtime can provide an
implementation which can take a safepoint during copy operation.

See "GC-parseable element atomic memcpy/memmove" thread on llvm-dev
for the background and details:
https://groups.google.com/g/llvm-dev/c/NnENHzmX-b8/m/3PyN8Y2pCAAJ

Differential Revision: https://reviews.llvm.org/D88861
2020-10-23 14:06:09 -07:00
Simon Pilgrim
aacfe2be53 [InstCombine] recognizeBSwapOrBitReverseIdiom - add vector support
Add basic vector handling to recognizeBSwapOrBitReverseIdiom/collectBitParts - this works at the element level, all vector element operations must match (splat constants etc.) and there is no cross-element support (insert/extract/shuffle etc.).
2020-10-03 16:26:46 +01:00
Simon Pilgrim
347fd9955a [InstCombine] recognizeBSwapOrBitReverseIdiom - use generic CreateIntegerCast
Try to appease buildbots breakages due to D88578
2020-10-03 15:29:22 +01:00
Simon Pilgrim
3aa93f690b [InstCombine] recognizeBSwapOrBitReverseIdiom - support for 'partial' bswap patterns (PR47191) (Reapplied)
If we're bswap'ing some bytes and zero'ing the remainder we can perform this as a bswap+mask which helps us match 'partial' bswaps as a first step towards folding into a more complex bswap pattern.

Reapplied with early-out if recognizeBSwapOrBitReverseIdiom collects a source wider than the result type.

Differential Revision: https://reviews.llvm.org/D88578
2020-10-03 14:52:42 +01:00
Simon Pilgrim
0364721e3e Revert rG3d14a1e982ad27 - "[InstCombine] recognizeBSwapOrBitReverseIdiom - support for 'partial' bswap patterns (PR47191)"
This reverts commit 3d14a1e982ad27111346471564d575ad5efc6419.

This is breaking on some 2stage clang buildbots
2020-10-02 18:17:14 +01:00
Simon Pilgrim
3d14a1e982 [InstCombine] recognizeBSwapOrBitReverseIdiom - support for 'partial' bswap patterns (PR47191)
If we're bswap'ing some bytes and zero'ing the remainder we can perform this as a bswap+mask which helps us match 'partial' bswaps as a first step towards folding into a more complex bswap pattern.

Differential Revision: https://reviews.llvm.org/D88578
2020-10-02 17:25:12 +01:00
Simon Pilgrim
29ac9fae54 [InstCombine] collectBitParts - convert to use PatterMatch matchers and avoid IntegerType casts.
Make sure we're using getScalarSizeInBits instead of cast<IntegerType> to get Type bit widths.

This is preliminary cleanup before we can start adding vector support to the bswap/bitreverse (element level) matching.
2020-10-01 16:44:14 +01:00
Simon Pilgrim
bc730b5e43 [InstCombine] collectBitParts - use APInt directly to check for out of range bit shifts. NFCI. 2020-10-01 12:50:36 +01:00
Simon Pilgrim
c722b32596 [InstCombine] recognizeBSwapOrBitReverseIdiom - merge the regular/trunc+zext paths. NFCI.
There doesn't seem to be any good reason for having a separate path for when we bswap/bitreverse at a smaller size than the destination size - so merge these to make the instruction generation a lot clearer.
2020-09-30 14:54:04 +01:00
Simon Pilgrim
d5545a8993 [InstCombine] recognizeBSwapOrBitReverseIdiom - remove unnecessary cast. NFCI. 2020-09-30 14:44:15 +01:00
Simon Pilgrim
621c6c8962 [InstCombine] recognizeBSwapOrBitReverseIdiom - cleanup bswap/bitreverse detection loop. NFCI.
Early out if both pattern matches have failed (or we don't want them). Fix case of bit index iterator (and avoid Wshadow issue).
2020-09-30 14:19:18 +01:00
Simon Pilgrim
413b4998bd [InstCombine] recognizeBSwapOrBitReverseIdiom - use ArrayRef::back() helper. NFCI.
Post-commit feedback on D88316
2020-09-30 13:39:18 +01:00
Simon Pilgrim
05290eead3 InstCombine] collectBitParts - cleanup variable names. NFCI.
Fix a number of WShadow warnings (I was used as the instruction and index......) and fix cases to match style.

Also, replaced the Bit APInt mask check in AND instructions with a direct APInt[] bit check.
2020-09-30 13:25:32 +01:00
Simon Pilgrim
af47d40b9c [InstCombine] recognizeBSwapOrBitReverseIdiom - recognise zext(bswap(trunc(x))) patterns (PR39793)
PR39793 demonstrated an issue where we fail to recognize 'partial' bswap patterns of the lower bytes of an integer source.

In fact, most of this is already in place collectBitParts suitably tags zero bits, so we just need to correctly handle this case by finding the zero'd upper bits and reducing the bswap pattern just to the active demanded bits.

Differential Revision: https://reviews.llvm.org/D88316
2020-09-30 12:07:19 +01:00
Simon Pilgrim
ec3f24d453 [InstCombine] recognizeBSwapOrBitReverseIdiom - assert for correct bit providence indices. NFCI.
As suggested by @spatel on D88316
2020-09-30 11:16:33 +01:00
Simon Pilgrim
2a0ca17f66 [InstCombine] collectBitParts - add fshl/fshr handling
Pulled from D87452, this is a fixed version of the collectBitParts fshl/fshr handling which as @nikic noticed wasn't checking for different providers or had correct bit ordering (which was hid by only testing shift amounts of bitwidth/2).

Differential Revision: https://reviews.llvm.org/D88292
2020-09-25 20:34:59 +01:00
Nikita Popov
f4e5541809 [Local] Clean up enforceKnownAlignment() (NFC)
I want to export this function, and the current API was a bit
weird: It took an additional Alignment argument that didn't really
have anything to do with what the function does. Drop it, and
perform a max at the callsite.

Also rename it to tryEnforceAlignment().
2020-09-19 22:29:40 +02:00
Roman Lebedev
aadf55d1ce
[NFC] EliminateDuplicatePHINodes(): small-size optimization: if there are <= 32 PHI's, O(n^2) algo is faster (geomean -0.08%)
This is functionally equivalent to the old implementation.

As per https://llvm-compile-time-tracker.com/compare.php?from=5f4e9bf6416e45eba483a4e5e263749989fdb3b3&to=4739e6e4eb54d3736e6457249c0919b30f6c855a&stat=instructions
this is a clear geomean compile-time regression-free win with overall geomean of `-0.08%`

32 PHI's appears to be the sweet spot; both the 16 and 64 performed worse:
https://llvm-compile-time-tracker.com/compare.php?from=5f4e9bf6416e45eba483a4e5e263749989fdb3b3&to=c4efe1fbbfdf0305ac26cd19eacb0c7774cdf60e&stat=instructions
https://llvm-compile-time-tracker.com/compare.php?from=5f4e9bf6416e45eba483a4e5e263749989fdb3b3&to=e4989d1c67010d3339d1a40ff5286a31f10cfe82&stat=instructions

If we have more PHI's than that, we fall-back to the original DenseSet-based implementation,
so the not-so-fast cases will still be handled.

However compile-time isn't the main motivation here.
I can name at least 3 limitations of this CSE:
1. Assumes that all PHI nodes have incoming basic blocks in the same order (can be fixed while keeping the DenseMap)
2. Does not special-handle `undef` incoming values (i don't see how we can do this with hashing)
3. Does not special-handle backedge incoming values (maybe can be fixed by hashing backedge as some magical value)

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D87408
2020-09-17 11:29:03 +03:00
Simon Pilgrim
4ff4708d39 collectBitParts - use const references. NFCI.
Fixes clang-tidy warnings first noticed on D87452.
2020-09-14 18:23:00 +01:00
serge-sans-paille
3a6f3fc160 Fix return status of SimplifyCFG
When a switch case is folded into default's case, that's an IR change that
should be reported, update ConstantFoldTerminator accordingly.

Differential Revision: https://reviews.llvm.org/D87142
2020-09-05 07:54:15 +02:00
Roman Lebedev
1dcb936cf6
[NFC][Local] EliminateDuplicatePHINodes(): add STATISTIC() 2020-08-29 22:03:18 +03:00
Roman Lebedev
961483a5ea
[NFCI][Local] Rewrite EliminateDuplicatePHINodes to optionally check hashing invariants
EarlyCSE has a mode to verify the invariant that hash equality equals
key equality, but EliminateDuplicatePHINodes() doesn't.

I've verified that this would have caught the stage2-stage3 mismatches
5ec2b757cc7d37ff0d03b36ee863b0962fe78108 revert has fixed,
that were introduced last time in 3e69871ab5a66fb55913a2a2f5e7f5b42899a4c9.
2020-08-29 22:03:10 +03:00
Roman Lebedev
5ec2b757cc
[Instruction] Speculatively undo isIdenticalToWhenDefined() PHI handling changes
The stage2-stage3 differences persist even without instcombine-based
PHI CSE, so this is the only possible reason.
2020-08-29 19:38:57 +03:00
David Stenberg
e8ebebb0bd [InstCombine] Fix incorrect Modified status
When removing instructions from unreachable blocks, and only debug info
intrinsics were removed, InstCombine could incorrectly return a false
Modified status.

This is fixed by making removeAllNonTerminatorAndEHPadInstructions()
also return how many debug info intrinsics that were removed, and take
that into account.

This was caught using the check introduced by D80916.

Reviewed By: majnemer

Differential Revision: https://reviews.llvm.org/D85839
2020-08-13 15:10:41 +02:00
Vitaly Buka
61cab352e3 [NFC] Move findAllocaForValue into ValueTracking.h
Differential Revision: https://reviews.llvm.org/D84616
2020-07-30 18:22:59 -07:00