427 Commits

Author SHA1 Message Date
Antonio Frighetto
d79c4c1119 [CGP] Regenerate revert-constant-ptr-propagation-on-calls.ll test (NFC)
Multiple buildbots were previously failing.
2024-09-02 09:55:43 +02:00
Antonio Frighetto
e4e0dfb0c2 [CGP] Undo constant propagation of pointers across calls
It may be profitable to revert SCCP propagation of C++ static values,
if such constants are pointers, in order to avoid redundant pointer
computation, since the method returning the constant is non-removable.
2024-09-02 09:33:23 +02:00
Antonio Frighetto
ed6d9f6d2a [CGP] Introduce test for PR102926 (NFC) 2024-09-02 09:33:23 +02:00
Stephen Tozer
3d08ade7bd
[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149)
This patch is part of a set of patches that add an `-fextend-lifetimes`
flag to clang, which extends the lifetimes of local variables and
parameters for improved debuggability. In addition to that flag, the
patch series adds a pragma to selectively disable `-fextend-lifetimes`,
and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes`
for this pointers only. All changes and tests in these patches were
written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer)
has handled review and merging. The extend lifetimes flag is intended to
eventually be set on by `-Og`, as discussed in the RFC
here:

https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850

This patch implements a new intrinsic instruction in LLVM,
`llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand
and has no effect other than "using" its operand, to ensure that its
operand remains live until after the fake use. This patch does not emit
fake uses anywhere; the next patch in this sequence causes them to be
emitted from the clang frontend, such that for each variable (or this) a
fake.use operand is inserted at the end of that variable's scope, using
that variable's value. This patch covers everything post-frontend, which
is largely just the basic plumbing for a new intrinsic/instruction,
along with a few steps to preserve the fake uses through optimizations
(such as moving them ahead of a tail call or translating them through
SROA).

Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>
2024-08-29 17:53:32 +01:00
Simon Pilgrim
f673882323
[X86] Allow speculative BSR/BSF instructions on targets with CMOV (#102885)
Currently targets without LZCNT/TZCNT won't speculate with BSR/BSF instructions in case they have a zero value input, meaning we always insert a test+branch for the zero-input case.

This patch proposes we allow speculation if the target has CMOV, and perform a branchless select instead to handle the zero input case. This will predominately help x86-64 targets where we haven't set any particular cpu target. We already always perform BSR/BSF instructions if we were lowering a CTLZ/CTTZ_ZERO_UNDEF instruction.
2024-08-22 11:11:00 +01:00
Sjoerd Meijer
70e8c982d0
[AArch64] Bail out for scalable vecs in areExtractShuffleVectors (#105484)
The added test triggers the following assert in `areExtractShuffleVectors`
that is called from `shouldSinkOperands`:

Assertion `(!isScalable() || isZero()) && "Request for a fixed element count on a scalable object"' failed.

I don't think scalable types can be extract shuffles, so bail early if
this is the case.
2024-08-21 15:27:09 +01:00
Noah Goldstein
e4c67ba67e Recommit "[CodeGenPrepare] Folding urem with loop invariant value"
Was missing remainder on `Start` value.

Also changed logic as as nikic suggested (getting loop from `PN`
instead of `Rem`). The prior impl increased the complexity of the code
and made debugging it more difficult.

Closes #104877
2024-08-20 09:17:49 -07:00
Noah Goldstein
9b25ad818c [CodeGenPrepare][X86] Add tests for fixing urem transform; NFC 2024-08-20 09:17:49 -07:00
Noah Goldstein
731ae694a3 Revert "[CodeGenPrepare] Folding urem with loop invariant value"
This reverts commit c64ce8bf283120fd145a57d0e61f9697f719139d.

Seems to be causing stage2 failures on buildbots. Reverting while I
investigate.
2024-08-18 20:36:35 -07:00
Noah Goldstein
c64ce8bf28 [CodeGenPrepare] Folding urem with loop invariant value
```
for(i = Start; i < End; ++i)
   Rem = (i nuw+ IncrLoopInvariant) u% RemAmtLoopInvariant;
```
 ->
```
Rem = (Start nuw+ IncrLoopInvariant) % RemAmtLoopInvariant;
for(i = Start; i < End; ++i, ++rem)
   Rem = rem == RemAmtLoopInvariant ? 0 : Rem;
```

In its current state, only if `IncrLoopInvariant` and `Start` both
being zero.

Alive2 seemed unable to prove this (see:
https://alive2.llvm.org/ce/z/ATGDp3 which is clearly wrong but still
checks out...) so wrote an exhaustive test here:
https://godbolt.org/z/WYa561388

Closes #96625
2024-08-18 15:58:24 -07:00
Noah Goldstein
f16125a13c [CodeGenPrepare][X86] Add tests for folding urem with loop invariant value; NFC 2024-08-18 15:58:24 -07:00
David Green
0e124537aa
[AArch64] Sink operands to fmuladd. (#102297)
A fmuladd can be treated as a fma when sinking operands to the
intrinsic, similar to D126234.

Addresses a small part of #102195
2024-08-09 11:48:37 +01:00
Fangrui Song
8ea31db272 [CodeGenPrepare] Use MapVector to stabilize iteration order
DenseMap iteration order is not guaranteed to be deterministic.

Without the change,
llvm/test/Transforms/CodeGenPrepare/X86/statepoint-relocate.ll would
fail when `combineHashValue` changes (#95970).

Fixes: dba7329ebb0dbe1fabb3faaedfd31da3b8bd611d
2024-06-18 17:19:51 -07:00
Stephen Tozer
094572701d
[RemoveDIs] Print IR with debug records by default (#91724)
This patch makes the final major change of the RemoveDIs project, changing the
default IR output from debug intrinsics to debug records. This is expected to
break a large number of tests: every single one that tests for uses or
declarations of debug intrinsics and does not explicitly disable writing
records. 

If this patch has broken your downstream tests (or upstream tests on a
configuration I wasn't able to run):
1. If you need to immediately unblock a build, pass
`--write-experimental-debuginfo=false` to LLVM's option processing for all
failing tests (remember to use `-mllvm` for clang/flang to forward arguments to
LLVM).
2. For most test failures, the changes are trivial and mechanical, enough that
they can be done by script; see the migration guide for a guide on how to do
this: https://llvm.org/docs/RemoveDIsDebugInfo.html#test-updates
3. If any tests fail for reasons other than FileCheck check lines that need
updating, such as assertion failures, that is most likely a real bug with this
patch and should be reported as such.

For more information, see the recent PSA:
https://discourse.llvm.org/t/psa-ir-output-changing-from-debug-intrinsics-to-debug-records/79578
2024-06-14 15:07:27 +01:00
Nikita Popov
d10b76552f
[ConstantFold] Remove notional over-indexing fold (#93697)
The data-layout independent constant folding currently has some rather
gnarly code for canonicalizing GEP indices to reduce "notional
overindexing", and then infers inbounds based on that canonicalization.

Now that we canonicalize to i8 GEPs, this canonicalization is
essentially useless, as we'll discard it as soon as the GEP hits the
data-layout aware constant folder anyway. As such, I'd like to remove
this code entirely.

This shouldn't have any impact on optimization capabilities.
2024-05-30 08:36:44 +02:00
wanglei
9d4f7f44b6 [test][LoongArch] Add -mattr=+d option. NFC
Because most of tests assume target-abi=`lp64d`, adding the
corresponding feature is reasonable.

rg -l loongarch -g '!*.s' | xargs sed -i '/mtriple=loongarch/ {/-mattr=/!{/target-abi/! s/mtriple=loongarch.. /&-mattr=+d /}}'
2024-05-14 20:23:04 +08:00
Yingwei Zheng
ab12bba0aa
[CGP] Drop poison-generating flags after hoisting (#90382)
See the following case:
```
define i8 @src1(i8 %x) {
entry:
  %cmp = icmp eq i8 %x, -1
  br i1 %cmp, label %exit, label %if.then

if.then:
  %inc = add nuw nsw i8 %x, 1
  br label %exit

exit:
  %retval = phi i8 [ %inc, %if.then ], [ -1, %entry ]
  ret i8 %retval
}

define i8 @tgt1(i8 %x) {
entry:
  %inc = add nuw nsw i8 %x, 1
  %0 = icmp eq i8 %inc, 0
  br i1 %0, label %exit, label %if.then

if.then:                                          ; preds = %entry
  br label %exit

exit:                                             ; preds = %if.then, %entry
  %retval = phi i8 [ %inc, %if.then ], [ -1, %entry ]
  ret i8 %retval
}
```
`optimizeBranch` converts `icmp eq X, -1` into cmp to zero on RISC-V and
hoists the add into the entry block. Poison-generating flags should be
dropped as they don't still hold.

Proof: https://alive2.llvm.org/ce/z/sP7mvK
Fixes https://github.com/llvm/llvm-project/issues/90380
2024-04-29 15:51:49 +08:00
Alex Bradbury
1c8410a67d
[CodeGenPrepare] Preserve flags (such as nsw/nuw) in SinkCast (#89904)
As demonstrated in the test change, when deciding to sink a trunc we
were losing its flags. This patch moves to cloning the original
instruction instead.
2024-04-25 15:05:07 +01:00
Alex Bradbury
2554a85c03 [CodeGenPrepare][test] Add test for sinking of truncs demonstrating nsw/nuw are dropped
SinkCast creates a new cast with the same type and inputs, which drops
the nsw/nuw flags.

Reviewed as part of <https://github.com/llvm/llvm-project/pull/89904>
but split out so I can land the test separately.
2024-04-25 15:01:55 +01:00
Matthias Braun
652bcf685c
CodeGenPrepare: Add support for llvm.threadlocal.address address-mode sinking (#87844)
Depending on the TLSMode many thread-local accesses on x86 can be
expressed by adding a %fs: segment register to an addressing mode. Even
if there are mutliple users of a `llvm.threadlocal.address` intrinsic it
is generally not worth sharing the value in a register but instead fold
the %fs access into multiple addressing modes.

Hence this changes CodeGenPrepare to duplicate the
`llvm.threadlocal.address` intrinsic as necessary.

Introduces a new `TargetLowering::addressingModeSupportsTLS` callback
that allows targets to indicate whether TLS accesses can be part of an
addressing mode.

This is fixing a performance problem, as this folding of TLS-accesses
into multiple addressing modes happened naturally before the
introduction of the `llvm.threadlocal.address` intrinsic, but regressed
due to `SelectionDAG` keeping things in registers when accessed across
basic blocks, so CodeGenPrepare needs to duplicate to mitigate this. We
see a ~0.5% recovery in a codebase with heavy TLS usage (HHVM).

This fixes most of #87437
2024-04-17 12:48:02 -07:00
wanglei
8e4b0890a6
[LoongArch] Return true from shouldConsiderGEPOffsetSplit (#88371)
If not performing gep splits can prevent important optimizations, such
as preventing the element indices / member offsets from being         
(partially) folded into load/store instruction immediates.
2024-04-15 09:01:04 +08:00
wanglei
5fc8a190b3 [LoongArch] Pre commit test for #88371. NFC 2024-04-12 17:57:28 +08:00
Yingwei Zheng
38a44bdc93
[CodeGenPrepare] Reverse the canonicalization of isInf/isNanOrInf (#81572)
In commit
2b582440c1,
we canonicalize the isInf/isNanOrInf idiom into fabs+fcmp for better
analysis/codegen (See also the discussion in
https://github.com/llvm/llvm-project/pull/76338).

This patch reverses the fabs+fcmp to `is.fpclass`. If the `is.fpclass`
is not supported by the target, it will be expanded by TLI.

Fixes the regression introduced by
2b582440c1
and
https://github.com/llvm/llvm-project/pull/80414#issuecomment-1936374206.
2024-03-18 18:27:45 +08:00
Stephen Tozer
d128448efd Revert "Reapply "[RemoveDIs] Print non-intrinsic debug info in textual IR output (#79281)""
Reverted due to some test failures on some buildbots.

https://lab.llvm.org/buildbot/#/builders/67/builds/14669

This reverts commit aa436493ab7ad4cf323b0189c15c59ac9dc293c7.
2024-02-27 10:17:24 +00:00
Stephen Tozer
aa436493ab Reapply "[RemoveDIs] Print non-intrinsic debug info in textual IR output (#79281)"
Fixes the prior issue in which the symbol for a cl-arg was unavailable to
some binaries.

This reverts commit dc06d75ab27b4dcae2940fc386fadd06f70faffe.
2024-02-27 09:59:08 +00:00
Stephen Tozer
dc06d75ab2 Revert "[RemoveDIs] Print non-intrinsic debug info in textual IR output (#79281)"
Reverted due to failures on buildbots, where a new cl flag was placed
in the wrong file, resulting in link errors.

https://lab.llvm.org/buildbot/#/builders/198/builds/8548

This reverts commit 0b398256b3f72204ad1f7c625efe4990204e898a.
2024-02-26 18:49:18 +00:00
Stephen Tozer
0b398256b3
[RemoveDIs] Print non-intrinsic debug info in textual IR output (#79281)
This patch adds support for printing the proposed non-instruction debug
info ("RemoveDIs") out to textual IR. This patch does not add any
bitcode support, parsing support, or documentation.

Printing of the new format is controlled by a flag added in this patch,
`--write-experimental-debuginfo`, which defaults to false. The new
format will be printed *iff* this flag is true, so whether we use the IR
format is completely independent of whether we use non-instruction debug
info during LLVM passes (which is controlled by the
`--try-experimental-debuginfo-iterators` flag).

Even with the flag disabled, some existing tests need to be updated, as this
patch causes debug intrinsic declarations to be changed in a round trip,
such that they always appear at the end of a module and have no attributes
(this has no functional change on the module).

The design of this new IR format was proposed previously on
Discourse, and any further discussion about the design can still be
contributed there:

https://discourse.llvm.org/t/rfc-debuginfo-proposed-changes-to-the-textual-ir-representation-for-debug-values/73491
2024-02-26 18:22:05 +00:00
Nikita Popov
2d69827c5c [Transforms] Convert tests to opaque pointers (NFC) 2024-02-05 11:57:34 +01:00
Nick Anderson
f1ec0d12bb
Port CodeGenPrepare to new pass manager (and BasicBlockSectionsProfil… (#77182)
Port CodeGenPrepare to new pass manager and dependency
BasicBlockSectionsProfileReader
Fixes: #75380

Co-authored-by: Krishna-13-cyber <84722531+Krishna-13-cyber@users.noreply.github.com>
2024-01-09 13:32:59 +07:00
Simon Pilgrim
7648371c25 Revert 4d7c5ad58467502fcbc433591edff40d8a4d697d "[NewPM] Update CodeGenPreparePass reference in CodeGenPassBuilder (#77054)"
Revert e0c554ad87d18dcbfcb9b6485d0da800ae1338d1 "Port CodeGenPrepare to new pass manager (and BasicBlockSectionsProfil… (#75380)"

Revert #75380 and #77054 as they were breaking EXPENSIVE_CHECKS buildbots: https://lab.llvm.org/buildbot/#/builders/104
2024-01-05 12:28:10 +00:00
Nick Anderson
e0c554ad87
Port CodeGenPrepare to new pass manager (and BasicBlockSectionsProfil… (#75380)
Port CodeGenPrepare to new pass manager and dependency
BasicBlockSectionsProfileReader
Fixes: #64560

Co-authored-by: Krishna-13-cyber <84722531+Krishna-13-cyber@users.noreply.github.com>
2024-01-05 13:47:56 +07:00
Jeremy Morse
f85a38e21c Follow up to d0858bffa11, add missing REQUIRES x86 2023-12-06 17:48:50 +00:00
Jeremy Morse
d0858bffa1
[DebugInfo][RemoveDIs] Maintain DPValues on skipped instrs in CGP (#74602)
It turns out that CodeGenPrepare will skip over consecutive select
instructions as it knows it can optimise them all at the same time. This
is unfortunate for the RemoveDIs project to remove intrinsic-based
debug-info, because that means debug-info attached to those skipped
instructions doesn't get seen by optimizeInst and so updated. Add code
to handle debug-info on those skipped instructions manually.

This code will also have been slower when it had dbg.values stuffed in
between instructions, but with RemoveDIs it'll go faster because the
dbg.values won't break up the select sequence.
2023-12-06 17:25:33 +00:00
zhongyunde 00443407
d6f4d5209f [CGP][AArch64] Rebase the common base offset for better ISel
When all the large const offsets masked with the same value from bit-12 to bit-23.
Fold
  add     x8, x0, #2031, lsl #12
  add     x8, x8, #960
  ldr     x9, [x8, x8]
  ldr     x8, [x8, #2056]

into
  add     x8, x0, #2031, lsl #12
  ldr     x9, [x8, #960]
  ldr     x8, [x8, #3016]
2023-12-05 09:01:41 +08:00
Jeremy Morse
3ef98bcd46 [DebugInfo][RemoveDIs] Support maintaining DPValues in CodeGenPrepare (#73660)
CodeGenPrepare needs to support the maintenence of DPValues, the
non-instruction replacement for dbg.value intrinsics. This means there are
a few functions we need to duplicate or replicate the functionality of:
 * fixupDbgValue for setting users of sunk addr GEPs,
 * The remains of placeDbgValues needs a DPValue implementation for sinking
 * Rollback of RAUWs needs to update DPValues
 * Rollback of instruction removal needs supporting (see github #73350)
 * A few places where we have to use iterators rather than instructions.

There are three places where we have to use the setHeadBit call on
iterators to indicate which portion of debug-info records we're about to
splice around. This is because CodeGenPrepare, unlike other optimisation
passes, is very much concerned with which block an operation occurs in and
where in the block instructions are because it's preparing things to be in
a format that's good for SelectionDAG.

There isn't a large amount of test coverage for debuginfo behaviours in
this pass, hence I've added some more.
2023-11-30 15:29:05 +00:00
Nikita Popov
c9832da350
[CGP] Drop nneg flag when moving zext past instruction (#72103)
Fix the issue by not reusing the zext at all. The code already handles
creation of new zexts if more than one is needed. Always use that
code-path instead of trying to reuse the old zext in some case.
(Alternatively we could also drop poison-generating flags on the old
zext, but it seems cleaner to not reuse it at all, especially if it's
not always possible anyway.)

Fixes https://github.com/llvm/llvm-project/issues/72046.
2023-11-14 09:03:06 +01:00
Alex Richardson
e39f6c1844 [opt] Infer DataLayout from triple if not specified
There are many tests that specify a target triple/CPU flags but no
DataLayout which can lead to IR being generated that has unusual
behaviour. This commit attempts to use the default DataLayout based
on the relevant flags if there is no explicit override on the command
line or in the IR file.

One thing that is not currently possible to differentiate from a missing
datalayout `target datalayout = ""` in the IR file since the current
APIs don't allow detecting this case. If it is considered useful to
support this case (instead of passing "-data-layout=" on the command
line), I can change IR parsers to track whether they have seen such a
directive and change the callback type.

Differential Revision: https://reviews.llvm.org/D141060
2023-10-26 12:07:37 -07:00
Paul Walker
6383785bad
[SVE][CodeGenPrepare] Sink address calculations that match SVE gather/scatter addressing modes. (#66996)
SVE supports scalar+vector and scalar+extw(vector) addressing modes.
However, the masked gather/scatter intrinsics take a vector of
addresses, which means address computations can be hoisted out of
loops.  The is especially true for things like offsets where the
true size of offsets is lost by the time you get to code generation.

This is problematic because it forces the code generator to legalise
towards `<vscale x 2 x ty>` vectors that will not maximise bandwidth
if the main block datatypes is in fact i32 or smaller.

This patch sinks GEPs and extends for cases where one of the above
addressing modes can be used.

NOTE: There are cases where it would be better to split the extend
in two with one half hoisted out of a loop and the other within the
loop.  Whilst true I think this switch of default is still better
than before because the extra extends are an improvement over being
forced to split a gather/scatter.
2023-10-11 13:20:08 +01:00
Jay Foad
7b3bbd83c0 Revert "[CodeGen] Really renumber slot indexes before register allocation (#67038)"
This reverts commit 2501ae58e3bb9a70d279a56d7b3a0ed70a8a852c.

Reverted due to various buildbot failures.
2023-10-09 12:31:32 +01:00
Jay Foad
2501ae58e3
[CodeGen] Really renumber slot indexes before register allocation (#67038)
PR #66334 tried to renumber slot indexes before register allocation, but
the numbering was still affected by list entries for instructions which
had been erased. Fix this to make the register allocator's live range
length heuristics even less dependent on the history of how instructions
have been added to and removed from SlotIndexes's maps.
2023-10-09 11:44:41 +01:00
Alex Richardson
83c4227ab7 Auto-generate test checks for tests affected by D141060
These files had manual CHECK lines which make the diff from D141060
very difficult to review.
2023-10-04 10:51:35 -07:00
Jay Foad
e0919b189b [CodeGen] Renumber slot indexes before register allocation (#66334)
RegAllocGreedy uses SlotIndexes::getApproxInstrDistance to approximate
the length of a live range for its heuristics. Renumbering all slot
indexes with the default instruction distance ensures that this estimate
will be as accurate as possible, and will not depend on the history of
how instructions have been added to and removed from SlotIndexes's maps.

This also means that enabling -early-live-intervals, which runs the
SlotIndexes analysis earlier, will not cause large amounts of churn due
to different register allocator decisions.
2023-09-19 11:18:12 +01:00
Serguei Katkov
a701b7e368 [CGP] Remove dead PHI nodes before elimination of mostly empty blocks
Before elimination of mostly empty block it makes sense to remove dead PHI nodes.
It open more opportunity for elimination plus eliminates dead code itself.

It appeared that change results in failing many unit tests and some of
them I've updated and for another one I disable this optimization.
The pattern I observed in the tests is that there is a infinite loop
without side effects. As a result after elimination of dead phi node all other
related instruction are also removed and tests stops to check what it is expected.

Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D158503
2023-08-29 04:35:06 +00:00
Harvin Iriawan
db158c7c83 [AArch64] Update generic sched model to A510
Refresh of the generic scheduling model to use A510 instead of A55.
  Main benefits are to the little core, and introducing SVE scheduling information.
  Changes tested on various OoO cores, no performance degradation is seen.

  Differential Revision: https://reviews.llvm.org/D156799
2023-08-21 12:25:15 +01:00
Jordan Rupprecht
f5b5a30858 Revert "[CodeGenPrepare][NFC] Update the dominator tree instead of rebuilding it"
This reverts commit 0b1d1cdb89322c277baf5221218a830195fef9d4. It causes a clang crash. Details will be posted to D153638.
2023-08-01 23:08:55 -07:00
Momchil Velikov
0b1d1cdb89 [CodeGenPrepare][NFC] Update the dominator tree instead of rebuilding it
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D153638
2023-08-01 18:07:03 +01:00
Matthias Braun
02ba5b8c6b Ignore load/store until stack address computation
No longer conservatively assume a load/store accesses the stack when we
can prove that we did not compute any stack-relative address up to this
point in the program.

We do this in a cheap not-quite-a-dataflow-analysis: Assume
`NoStackAddressUsed` when all predecessors of a block already guarantee
it. Process blocks in reverse post order to guarantee that except for
loop headers we have processed all predecessors of a block before
processing the block itself. For loops we accept the conservative answer
as they are unlikely to be shrink-wrappable anyway.

Differential Revision: https://reviews.llvm.org/D152213
2023-06-26 13:50:36 -07:00
Nikita Popov
b7bd3a734c [CGP] Fix infinite loop in icmp operand swapping
Don't swap the operands if they're the same. Fixes the issue reported
at https://reviews.llvm.org/D152541#4427017.
2023-06-16 15:50:12 +02:00
Serguei Katkov
d119c386cd [CGP] Additional tests for removing operand of assume. NFC. 2023-06-16 11:52:46 +07:00
Serguei Katkov
d57ed844fe [CGP] Add test to show the missed case in remove llvm.assume 2023-06-07 17:20:57 +07:00