33865 Commits

Author SHA1 Message Date
Nikita Popov
282324aa4a [GVN] Fix verifyRemoved() verification
Fix the verification failure reported in
https://reviews.llvm.org/D141712#4413647. We need to remove the
load from the VN table as well, not just the leader table.

Also make sure that this verification always runs when assertions
are enabled, rather than only when -debug is passed.
2023-06-12 15:14:27 +02:00
luxufan
cf79773a90 [SCCP] Replace new value's value state with removed value's
In replaceSignedInst, if a signed instruction can be replaced with an
unsigned instruction, we create a new instruction and remove the old
instruction's value state. If a following instruction uses this new
instruction as an operand, transformations like replaceSignedInst and
refineInstruction are blocked, because there is no value
state for the new instruction.

This patch sets the new instruction's value state to the removed
instruction's value state. I believe this is correct because replacing
a signed instruction with an unsigned instruction does not change the
value state.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D152337
2023-06-12 11:40:47 +08:00
Florian Hahn
df357a71dd [VPlan] Use step from induction recipe directly. (NFC)
Directly use the step of the transformed induction instead of creating
a new step. This allows replacing all uses of strides in D147783.
2023-06-11 21:40:06 +01:00
Kazu Hirata
17e0369892 [Scalar] Remove RewriteStatepointsForGCLegacyPass
Differential Revision: https://reviews.llvm.org/D152638
2023-06-11 13:19:19 -07:00
Kazu Hirata
ef09abfcf4 [InstCombine] Remove unused function createInstructionCombiningPass
The last use was removed by:

  commit 934c82d31801e65aa3bbe99a0e64f903621c2e04
  Author: Florian Hahn <flo@fhahn.com>
  Date:   Fri Feb 24 13:39:32 2023 +0100

Once I remove createInstructionCombiningPass, then:

InstructionCombiningPass::InstructionCombiningPass(unsigned MaxIterations)

becomes unused.  Once I remove that:

InstructionCombiningPass::MaxIterations is always initialized with
InstCombineDefaultMaxIterations, so this patch does the constant
propagation and removes InstructionCombiningPass::MaxIterations as
well.

Differential Revision: https://reviews.llvm.org/D152641
2023-06-11 07:56:37 -07:00
Antonio Frighetto
1774c14816 [ConstraintElimination] Handle ICMP_EQ predicates
Simplification of equality predicates is now supported by
translating equalities into inequalities. This is achieved
by separately checking that both `isConditionImplied(A >= B)`
and `isConditionImplied(A <= B)` hold.
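
A minimal sketch of the two-sided check described above. The `ToyConstraintInfo` type, the `Pred` enum, and the verbatim-lookup oracle are hypothetical stand-ins for illustration only; they are not the pass's actual data structures.

```cpp
#include <set>
#include <tuple>

// Hypothetical model: variables are small integer ids, and a fact is a
// (predicate, lhs, rhs) triple that is already known to hold.
enum class Pred { UGE, ULE };
using Fact = std::tuple<Pred, int, int>;

struct ToyConstraintInfo {
  std::set<Fact> Known;

  // Toy oracle: a condition counts as implied only if it was recorded
  // verbatim. A real solver would reason over a constraint system.
  bool isConditionImplied(Pred P, int A, int B) const {
    return Known.count(Fact{P, A, B}) != 0;
  }

  // A == B is implied when both A >= B and A <= B are implied.
  bool isEqualityImplied(int A, int B) const {
    return isConditionImplied(Pred::UGE, A, B) &&
           isConditionImplied(Pred::ULE, A, B);
  }
};
```

With only `A >= B` recorded, `isEqualityImplied` returns false; once `A <= B` is recorded as well, it returns true.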

Differential Revision: https://reviews.llvm.org/D152067
2023-06-11 15:22:31 +02:00
David Green
2802739dfd [NFC] Replace ;; with ;
2023-06-11 10:25:24 +01:00
Kazu Hirata
c7cf942de3 [Scalar] Remove unused function createLICMPass
The last use was removed by:

  commit d623b2f95fd559901f008a0588dddd0949a8db01
  Author: Arthur Eubanks <aeubanks@google.com>
  Date:   Fri Mar 10 17:24:19 2023 -0800
2023-06-10 21:52:50 -07:00
Kazu Hirata
2509c93edd [Transforms] Remove AddDiscriminatorsLegacyPass
The last use was removed by:

  commit ae0987d242e266847f21f5fa1bffa97ce3eff586
  Author: Kazu Hirata <kazu@google.com>
  Date:   Sat Jun 10 13:51:35 2023 -0700

Differential Revision: https://reviews.llvm.org/D152636
2023-06-10 15:32:47 -07:00
Kazu Hirata
76294935d3 [Scalar] Remove CallSiteSplittingLegacyPass
The last use was removed by:

  commit fd48d0a0adaa5fcdd24d02a58ba8a6210adafc28
  Author: Kazu Hirata <kazu@google.com>
  Date:   Sat Jun 10 13:51:37 2023 -0700

Differential Revision: https://reviews.llvm.org/D152635
2023-06-10 15:20:40 -07:00
Kazu Hirata
e35cfc03d3 [Transforms] Remove unused function createSimpleLoopUnrollPass
The last use was removed by:

  commit d623b2f95fd559901f008a0588dddd0949a8db01
  Author: Arthur Eubanks <aeubanks@google.com>
  Date:   Fri Mar 10 17:24:19 2023 -0800
2023-06-10 13:51:38 -07:00
Kazu Hirata
fd48d0a0ad [Transforms] Remove unused function createCallSiteSplittingPass
The last use was removed by:

  commit d623b2f95fd559901f008a0588dddd0949a8db01
  Author: Arthur Eubanks <aeubanks@google.com>
  Date:   Fri Mar 10 17:24:19 2023 -0800
2023-06-10 13:51:37 -07:00
Kazu Hirata
ae0987d242 [Transforms] Remove unused function createAddDiscriminatorsPass
commit f7ca01333214f934c580c162afdee933e7430b6c
  Author: Nikita Popov <npopov@redhat.com>
  Date:   Tue Feb 28 16:38:45 2023 +0100
2023-06-10 13:51:35 -07:00
Noah Goldstein
5189eff345 [InstCombine] Canonicalize (icmp eq/ne X, rotate(X)) to always use rotate-left
We canonicalize rotate-right -> rotate-left in other places. Makes
sense to do so here as well.
Proof: https://alive2.llvm.org/ce/z/HL3TpK
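
The commit message does not spell out the rewrite, so the following is only a sketch of one reason the rotate direction does not matter for this compare: on 8-bit values, `X == rotr(X, N)` holds exactly when `X == rotl(X, N)`. The helper names `rotl8`, `rotr8`, and `rotateDirectionIrrelevant` are made up for this example.

```cpp
#include <cstdint>

// Rotate an 8-bit value left by N bits (N is taken modulo 8).
uint8_t rotl8(uint8_t X, unsigned N) {
  N &= 7;
  return static_cast<uint8_t>((X << N) | (X >> ((8 - N) & 7)));
}

// Rotate an 8-bit value right by N bits (N is taken modulo 8).
uint8_t rotr8(uint8_t X, unsigned N) {
  N &= 7;
  return static_cast<uint8_t>((X >> N) | (X << ((8 - N) & 7)));
}

// Exhaustively check that comparing X against its own rotation gives the
// same answer for rotate-right and rotate-left by the same amount.
bool rotateDirectionIrrelevant() {
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned N = 0; N < 8; ++N) {
      uint8_t V = static_cast<uint8_t>(X);
      if ((V == rotr8(V, N)) != (V == rotl8(V, N)))
        return false;
    }
  return true;
}
```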

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D152349
2023-06-10 14:38:46 -05:00
Noah Goldstein
2df6309555 [InstCombine] Transform (icmp eq/ne rotate(X,AX),rotate(Y,AY)) -> (icmp eq/ne rotate(Y,AX-AY))
Only do so if we don't create more instructions, so either both
rotates have one use or one of the rotates has one use and both `AX`
and `AY` are constant.
Proof: https://alive2.llvm.org/ce/z/rVmJgz
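
A brute-force sketch of the underlying identity on 8-bit values. It reads the fold as `rotl(X, AX) == rotl(Y, AY)` being equivalent to `rotl(X, AX - AY) == Y`, which is an assumption about the operand naming in the title; `rotl8` and `rotateCompareFoldHolds` are names invented for this example.

```cpp
#include <cstdint>

// Rotate an 8-bit value left by N bits (N is taken modulo 8).
uint8_t rotl8(uint8_t X, unsigned N) {
  N &= 7;
  return static_cast<uint8_t>((X << N) | (X >> ((8 - N) & 7)));
}

// Check, for every X, Y and every pair of rotate amounts, that comparing
// two rotated values equals comparing one value rotated by the difference
// of the amounts (taken modulo the bit width) against the other value.
bool rotateCompareFoldHolds() {
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned Y = 0; Y < 256; ++Y)
      for (unsigned AX = 0; AX < 8; ++AX)
        for (unsigned AY = 0; AY < 8; ++AY) {
          uint8_t VX = static_cast<uint8_t>(X);
          uint8_t VY = static_cast<uint8_t>(Y);
          bool Before = rotl8(VX, AX) == rotl8(VY, AY);
          bool After = rotl8(VX, (AX + 8 - AY) & 7) == VY;
          if (Before != After)
            return false;
        }
  return true;
}
```

The rewritten form needs one rotate instead of two, which matches the commit's condition that the fold must not create more instructions.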

Differential Revision: https://reviews.llvm.org/D152348
2023-06-10 14:38:46 -05:00
Kazu Hirata
98183da637 [Transforms] Fix an unused variable warning
This patch fixes:

  llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:415:18: error: unused
  variable 'StBuff' [-Werror,-Wunused-variable]
2023-06-10 11:57:48 -07:00
Matt Arsenault
6d2e5c3445 LowerMemIntrinsics: Skip memmove with different address spaces
This is a quick fix for an assert when the source and dest have
different address spaces. The pointer compare needs to have matching
types, but we can't generically introduce addrspacecast and we don't
know if the address spaces alias.
2023-06-10 12:28:05 -04:00
Kazu Hirata
c963892a45 [llvm] Use DenseMapBase::lookup (NFC)
2023-06-10 09:02:25 -07:00
Vikram
631c965483 [AMDGPU] Non hostcall printf support for HIP
This is an alternative to the existing hostcall implementation and uses a printf buffer similar to OpenCL.
The data stored in the buffer (i.e. the data frame) for each printf call is as follows:
1. Control DWord - contains info regarding stream, format string constness and size of data frame
2. Hash of the format string (if constant) else the format string itself
3. Printf arguments (each aligned to 8 byte boundary)

The format string hash is generated using LLVM's MD5 message-digest implementation, and only the low 64 bits are used.
The implementation still uses amdhsa metadata, and the hash is stored as part of the format string itself to ensure
minimal changes in the runtime.

Differential Revision: https://reviews.llvm.org/D150427
2023-06-10 09:55:00 -04:00
Nuno Lopes
7ffeb8efe8 PromoteMem2Reg: use poison instead of undef as placeholder in phi entries from unreachable predecessors [NFC]
2023-06-10 11:19:03 +01:00
luxufan
25d9fde22e [SCCP] Skip computing intrinsics if one of its args is unknownOrUndef
For intrinsics that support constant ranges, we computed a constant range
from the args regardless of whether they are unknown or undef. The constant
range computed from an unknown or undef value state is the full range.

I think computing with the full constant range is harmful: although we
can call mergeIn after the value states of these args change, merging
two constant ranges takes their union, so the result stays the full range.
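
A toy illustration of why an early full range is sticky under a union-based merge. `ToyRange` is a hypothetical closed interval over 8-bit values written for this sketch; it is not LLVM's `ConstantRange`, and its `unionWith` simply returns the smallest covering interval.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical closed interval [Lo, Hi] over unsigned 8-bit values.
struct ToyRange {
  uint8_t Lo;
  uint8_t Hi;

  bool isFull() const { return Lo == 0 && Hi == 255; }

  // Merging takes the union, approximated here by the smallest interval
  // that covers both inputs. The result can only grow, never shrink.
  ToyRange unionWith(ToyRange Other) const {
    return {std::min(Lo, Other.Lo), std::max(Hi, Other.Hi)};
  }
};
```

Merging `{3, 7}` with `{5, 9}` gives `{3, 9}`, but merging the full range `{0, 255}` with `{3, 7}` is still the full range. In this model, a result computed too early from an unknown or undef argument cannot be refined by later merges.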

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D152499
2023-06-10 15:48:46 +08:00
khei4
361464c027 [MemCpyOpt] Use memcpy source directly if dest is known to be immutable from attributes
Differential Revision: https://reviews.llvm.org/D150970
2023-06-10 15:46:32 +09:00
John McIver
1001f9031f [InstCombine] Optimize and of icmps with power-of-2 and contiguous masks
Add an instance combine optimization for expressions of the form:

(%arg u< C1) & ((%arg & C2) != C2) -> %arg u< C2

Where C1 is a power-of-2 and C2 is a contiguous mask starting 1 bit below
C1. This commit resolves GitHub missed-optimization issue #54856.
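
The fold above can be checked exhaustively on 8-bit values. This is a sketch rather than the commit's own verification (the Alive2 links below are the proofs), and the function name `maskedCompareFoldHolds` is invented for this example.

```cpp
// For every power-of-2 C1 and every contiguous mask C2 whose top bit sits
// one bit below C1, check that
//   (Arg u< C1) & ((Arg & C2) != C2)
// gives the same result as
//   Arg u< C2
// for all 8-bit values of Arg.
bool maskedCompareFoldHolds() {
  for (unsigned C1 = 2; C1 <= 128; C1 <<= 1)
    for (unsigned Low = 1; Low < C1; Low <<= 1) {
      // C1 - Low sets the bits from log2(Low) up to log2(C1) - 1,
      // e.g. C1 = 16 yields C2 in {15, 14, 12, 8}.
      unsigned C2 = C1 - Low;
      for (unsigned Arg = 0; Arg < 256; ++Arg) {
        bool Before = (Arg < C1) && ((Arg & C2) != C2);
        bool After = Arg < C2;
        if (Before != After)
          return false;
      }
    }
  return true;
}
```

For `Arg` below `C1`, all bits of `C2` are set in `Arg` exactly when `Arg >= C2`, which is why the two compares collapse into one.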

Validation of scalar tests:
  - https://alive2.llvm.org/ce/z/JfKjiU
  - https://alive2.llvm.org/ce/z/AruHY_
  - https://alive2.llvm.org/ce/z/JAiR6t
  - https://alive2.llvm.org/ce/z/S2X2e5
  - https://alive2.llvm.org/ce/z/4cycdE
  - https://alive2.llvm.org/ce/z/NcDiLP

Validation of vector tests:
  - https://alive2.llvm.org/ce/z/ABY6tE
  - https://alive2.llvm.org/ce/z/BTJi3s
  - https://alive2.llvm.org/ce/z/3BKWpu
  - https://alive2.llvm.org/ce/z/RrAbkj
  - https://alive2.llvm.org/ce/z/nM6fsN

Reviewed By: goldstein.w.n

Differential Revision: https://reviews.llvm.org/D125717
2023-06-09 16:07:01 -06:00
Jay Foad
c0ad1b4597 [NewGVN] Fold equivalent freeze instructions
Differential Revision: https://reviews.llvm.org/D152529
2023-06-09 20:57:36 +01:00
Bjorn Pettersson
d2223221e7 [LoadStoreVectorizer] Optimize check for IsAllocaAccess. NFC
Swap the order of the address space check and the pointer cast
stripping when analyzing whether a load/store is accessing an alloca.
This is to make sure we do the cheaper check first.

This is done as a follow up to D152386.
2023-06-09 21:43:42 +02:00
Bjorn Pettersson
263bc7f905 [LoadStoreVectorizer] Only upgrade align for alloca
In commit 2be0abb7fe72ed453 (D149893) the load store vectorizer was
reimplemented. One thing that can happen with the new LSV is that
it can increase the align of alloca and global objects. However,
the code comments indicate that the intention was only to increase
the alignment of alloca.
Now we will use stripPointerCasts to analyse if the load/store really
is accessing an alloca (same as getOrEnforceKnownAlignment is using).
And then we only try to change the align if we find an alloca
instruction. This way the code will match better with code comments,
and we won't change alignment of non-stack variables to use the
"StackAdjustedAlignment".

Differential Revision: https://reviews.llvm.org/D152386
2023-06-09 15:33:35 +02:00
Jay Foad
63901cb082 [Scalarizer] Scalarize freeze instruction
Differential Revision: https://reviews.llvm.org/D152518
2023-06-09 13:54:24 +01:00
Graham Hunter
95bfb1902d [LV][AArch64] Allow (limited) interleaving for scalable vectors
This patch uses the (de)interleaving intrinsics introduced in
D141924 to handle vectorization of interleaving groups with a
factor of 2 for scalable vectors.

Reviewed By: fhahn, reames

Differential Revision: https://reviews.llvm.org/D145163
2023-06-09 11:42:10 +01:00
Nikita Popov
cde681c865 [InstCombine] Replace phi operands in successors of unreachable block
Set these operands to poison, which might allow folding the phi node,
or reduce the use count of an instruction.
2023-06-09 12:31:07 +02:00
Dmitry Makogon
995a26d2c7 [SimpleLoopUnswitch] Verify LoopInfo in turnGuardIntoBranch under a flag
A follow-up for 64397d8. Only do verification if VerifyLoopInfo is
set.
2023-06-09 13:44:55 +07:00
Teresa Johnson
e5479f27f2 [MemProf] Remove stale comment (NFC)
We already do the simplification described in the FIXME comment.
2023-06-08 12:30:23 -07:00
Alexandros Lamprineas
4d13896d8a Reland "[FuncSpec] Improve the accuracy of the cost model"
Instead of blindly traversing the use-def chain of constant arguments,
compute known constants along the way. Stop as soon as a user cannot
be replaced by a constant. Keep it light-weight by handling some basic
instruction types.

Differential Revision: https://reviews.llvm.org/D150464
2023-06-08 17:44:48 +01:00
Alexandros Lamprineas
475ddca56e Reland "[FuncSpec] Replace LoopInfo with BlockFrequencyInfo"
Using AvgLoopIters on any loop is too imprecise, making the cost model
favor users inside loop nests regardless of the actual trip count.

Differential Revision: https://reviews.llvm.org/D150375
2023-06-08 17:44:47 +01:00
Zain Jaffal
d65c0527ab Change checking for auto-init metadata to use equalsStr instead of casting MDOperand nodes.
Since `MD_annotation` metadata now supports multiple strings in the annotation node, casting the operand to a string directly will cause a crash. To check whether an `MDOperand` equals a string, use the `equalsStr` method.

Reviewed By: serge-sans-paille

Differential Revision: https://reviews.llvm.org/D152372
2023-06-08 15:58:01 +01:00
Teresa Johnson
4638eb2660 [ThinLTO] Ignore callee edge to global variable
Since the symbols in the ThinLTO summary are indexed by GUID we can end
up in corner cases where a callee edge in the combined index goes to a
summary for a global variable. This could happen in the case of hash
collisions, and in the case of SamplePGO profiles could potentially happen
due to code changes (since we synthesize call edges to GUIDs that were
inlined callees in the profiled code).

Handle this by simply ignoring any non-FunctionSummary callees.
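
A minimal sketch of the filtering idea, using made-up types. `ToySummary`, `SummaryKind`, and `functionCallees` are hypothetical and only model the "skip anything that is not a function summary" rule; they do not mirror the real ThinLTO index classes.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical summary record: a GUID plus the kind of symbol it describes.
enum class SummaryKind { Function, GlobalVar };

struct ToySummary {
  uint64_t GUID;
  SummaryKind Kind;
};

// Return the GUIDs of callee edges that really point at functions,
// silently dropping edges that resolve to anything else (for example a
// global variable reached through a GUID collision).
std::vector<uint64_t> functionCallees(const std::vector<ToySummary> &Callees) {
  std::vector<uint64_t> Result;
  for (const ToySummary &S : Callees)
    if (S.Kind == SummaryKind::Function)
      Result.push_back(S.GUID);
  return Result;
}
```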

Differential Revision: https://reviews.llvm.org/D152406
2023-06-08 06:44:06 -07:00
Dmitry Makogon
64397d8f25 [SimpleLoopUnswitch] Verify LoopInfo after turning guards to branches
SplitBlockAndInsertIfThen doesn't correctly update LoopInfo when called
with Unreachable=true, which is the case when we turn guards to branches
in SimpleLoopUnswitch.

This adds LoopInfo verification before returning from turnGuardIntoBranch.
2023-06-08 18:29:19 +07:00
Shivam Gupta
46aba711ab [InstCombine] (icmp eq A, -1) & (icmp eq B, -1) --> (icmp eq (A&B), -1)
This patch adds another icmp fold for the -1 case.

This fixes https://github.com/llvm/llvm-project/issues/62311,
where we want instcombine to merge all compare instructions together so
later passes like simplifycfg and slpvectorize can better optimize this
chained comparison.
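
The fold in the title can be checked exhaustively on 8-bit values, where -1 is the all-ones pattern 0xFF. This is an illustrative sketch; the function name `allOnesFoldHolds` is invented for this example.

```cpp
#include <cstdint>

// Check that (A == -1) & (B == -1) is equivalent to (A & B) == -1 for
// every pair of 8-bit values.
bool allOnesFoldHolds() {
  for (unsigned A = 0; A < 256; ++A)
    for (unsigned B = 0; B < 256; ++B) {
      uint8_t VA = static_cast<uint8_t>(A);
      uint8_t VB = static_cast<uint8_t>(B);
      bool Before = (VA == 0xFF) && (VB == 0xFF);
      bool After = static_cast<uint8_t>(VA & VB) == 0xFF;
      if (Before != After)
        return false;
    }
  return true;
}
```

The AND of two values has every bit set only if both inputs do, so a chain of such compares can be merged into a single compare of the combined value.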

Reviewed By: goldstein.w.n

Differential Revision: https://reviews.llvm.org/D151660
2023-06-08 09:00:05 +05:30
Florian Hahn
c10a7772bd [Matrix] Convert binop operand of dot product to a row vector.
The dot product lowering will use the left operand as row vector.
If the operand is a binary op, convert it to operate on a row vector
instead of a column vector.

Depends on D148428.

Reviewed By: thegameg

Differential Revision: https://reviews.llvm.org/D148429
2023-06-07 20:45:08 +01:00
luxufan
e9ddb584e8 [LoopIdiom] Freeze BitPos if !isGuaranteedNotToBeUndefOrPoison
Fixes: https://github.com/llvm/llvm-project/issues/62873

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D151690
2023-06-07 14:50:22 +08:00
Chuanqi Xu
84c033d9ba [LICM] [Coroutines] Don't hoist threadlocals within presplit coroutines
Close https://github.com/llvm/llvm-project/issues/63022

This is a follow-up to https://reviews.llvm.org/D135550, which is
discussed in
https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579.
As I see it, we could fix the issue fundamentally once we
introduce a new memory kind for the thread id. But I am not sure we can
do that in time.

Besides that, I think correctness is the most important. So it
should be fine to land this given that it is harmless.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D151774
2023-06-07 10:25:47 +08:00
Amir Ayupov
b244a4c4c9 [profi][NFC] Get rid of afdo_detail::TypeMap
Parametrize SampleProfileInference and SampleProfileLoaderBaseImpl by function
type (Function/MachineFunction) instead of block type
(BasicBlock/MachineBasicBlock). Move out specializations to appropriate
locations.

This change makes it possible to use GraphTraits instead of a custom TypeMap and
make SampleProfileInference not dependent on LLVM types, paving the way for
generalizing SampleProfileInference interfaces to BOLT IR types
(BinaryFunction/BinaryBasicBlock) in stale profile matching (D144500).

Reviewed By: hoy

Differential Revision: https://reviews.llvm.org/D152187
2023-06-06 13:48:37 -07:00
Noah Goldstein
e387f49d13 [InstCombine] Remove deadcode in (icmp SignTest(shl/shr X)); NFC
This is dead as of: D145341

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D152181
2023-06-06 15:14:10 -05:00
Guozhi Wei
84bcfa0e1b [GVN] Improve PRE on load instructions
This patch implements the enhancement proposed by
https://github.com/llvm/llvm-project/issues/59312.

Suppose we have the following code:

   v0 = load %addr
   br %LoadBB

LoadBB:
   v1 = load %addr
   ...

PredBB:
   ...
   br %cond, label %LoadBB, label %SuccBB

SuccBB:
   v2 = load %addr
   ...

Instruction v1 in LoadBB is partially redundant, and edge (PredBB, LoadBB) is a
critical edge. SuccBB is another successor of PredBB, and it contains another
load v2 which is identical to v1. Current GVN splits the critical edge
(PredBB, LoadBB) and inserts a new load in it. A better method is to move the
load of v2 into PredBB, so that v1 can be changed to a PHI instruction.

If there are two or more similar predecessors, like the test case in the bug
entry, current GVN simply gives up because otherwise it needs to split multiple
critical edges. But we can move all loads in successor blocks into predecessors.

Differential Revision: https://reviews.llvm.org/D141712
2023-06-06 19:45:34 +00:00
Mikhail Goncharov
df3a8f3760 Revert "Reland [MergeICmps] Adapt to non-eq comparisons, bugfix"
Causes miscompile. See https://reviews.llvm.org/D141188.

This reverts commit fb2c98a929aa65603e9d984307a41325e577e9d3
2023-06-06 16:26:52 +02:00
Nikita Popov
e4a589ba5d [InstCombine] Add stats for number of iterations (NFC)
2023-06-06 15:17:47 +02:00
Florian Hahn
8f781b96e2 Revert "[VPlan] Mark recurrence recipes as not having side-effects."
This reverts commit 02369b75fdd7b5fc5d9b47f1b60587c225918511.

At the moment, live-outs used *only* for the resume values in the scalar
loop are not modeled in VPlan yet. This means first-order recurrence
recipes could be removed, when a scalar epilogue is required and the
only use of a FOR is outside the loop.

Keep treating recurrence recipes as having side-effects for now, to
avoid them being removed.

Fixes #62954.
2023-06-06 11:35:26 +02:00
Christian Ulmann
2544d91956 llvm-extract: Replace IFuncs with declarations
This commit ensures that llvm-extract does not copy all IFuncs into the
resulting modules. Before this change, ifuncs were not modified, which
could cause the emission of unexpected IR files.

Reviewed By: darthscsi

Differential Revision: https://reviews.llvm.org/D152148
2023-06-06 07:18:33 +00:00
khei4
116670d192 [InstCombine] add overflow checking on Add ~X + C --> (C-1) - X
Differential Revision: https://reviews.llvm.org/D152088
2023-06-06 12:24:45 +09:00
Johannes Doerfert
cb17c48fdd [Attributor] Identify and remove no-op fences
The logic and implementation follow the removal of no-op barriers. If
the fence is not making updates visible, either to the world or the
current thread, it is not needed. Said differently, the fences we remove
do not establish synchronization (happens-before) edges.
This allows us to eliminate some of the regression caused by:
  https://reviews.llvm.org/D145290
2023-06-05 17:14:00 -07:00
Johannes Doerfert
8f4fadd1b4 [OpenMP] Use "kernel" attribute consistently
2023-06-05 16:33:53 -07:00