132 Commits

Author SHA1 Message Date
David Spickett
9c5ca6b0ce Revert "Enable JumpTableToSwitch pass by default (#82546)"
This reverts commit 1069823ce7d154aa8ef87ae5a0fd34b527eca2a0.

This has caused second stage timeouts when building Flang on
AArch64:
https://lab.llvm.org/buildbot/#/builders/179/builds/9442
2024-02-26 13:35:59 +00:00
Alexander Shaposhnikov
1069823ce7
Enable JumpTableToSwitch pass by default (#82546)
Enable JumpTableToSwitch pass by default.

Test plan: ninja check-all
2024-02-22 11:02:47 -08:00
dewen
3b82336188
Revert "[PM] Execute IndVarSimplifyPass precede RessociatePass" (#71617)
Reverts llvm/llvm-project#71054
2023-11-08 09:22:55 +08:00
dewen
e4d27d7f32
[PM] Execute IndVarSimplifyPass precede RessociatePass (#71054)
ReassociatePass may clear nsw/nuw flags of some instructions, which may
have side effects on optimizations in IndVarSimplifyPass.
2023-11-08 09:21:17 +08:00
Amara Emerson
1a2e77cf9e Revert "Revert "Inlining: Run the legacy AlwaysInliner before the regular inliner.""
This reverts commit 86bfeb906e3a95ae428f3e97d78d3d22a7c839f3.

This is a long time coming re-application that was originally reverted due to
regressions, unrelated to the actual inlining change. These regressions have since
been fixed due to another long-in-the-making change of a66051c6 landing.

Original commit message for reference:
---
    We have several situations where it's beneficial for code size to ensure that every
    call to always-inline functions are inlined before normal inlining decisions are
    made. While the normal inliner runs in a "MandatoryOnly" mode to try to do this,
    it only does it on a per-SCC basis, rather than the whole module. Ensuring that
    all mandatory inlinings are done before any heuristic based decisions are made
    just makes sense.

    Despite being referred to the "legacy" AlwaysInliner pass, it's already necessary
    for -O0 because the CGSCC inliner is too expensive in compile time to run at -O0.

    This also fixes an exponential compile time blow up in
    https://github.com/llvm/llvm-project/issues/59126

    Differential Revision: https://reviews.llvm.org/D143624
---
2023-10-28 23:21:11 -07:00
Florian Hahn
e6a1657fa3
[ConstraintElim] Add A < B if A is an increasing phi for A != B.
This patch adds additional logic to add additional facts for A != B, if
A is a monotonically increasing induction phi. The motivating use case
for this is removing checks when using iterators with hardened libc++,
e.g. https://godbolt.org/z/zhKEP37vG.

The patch pulls in SCEV to detect AddRecs. If possible, the patch adds
the following facts for a AddRec phi PN with StartValue as incoming
value from the loo preheader and B being an upper bound for PN from a
condition in the loop header.

 * (ICMP_UGE, PN, StartValue)
 * (ICMP_ULT, PN, B) [if (ICMP_ULE, StartValue, B)]

The patch also adds an optional precondition to FactOrCheck (the new
DoesHold field) , which can be used to only add a fact if the
precondition holds at the point the fact is added to the constraint
system.

Depends on D151799.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D152730
2023-09-27 11:00:28 +01:00
Florian Hahn
04f9a8a7d6
[ConstraintElim] Move just before loop simplification pipeline.
Adjust the pipeline slightly to move ConstraintElim just before the loop
simplification pipeline. This increases the number of cases where SCEV
should can preserved in the future.

This also enables slightly more opportunities, by benefiting from
earlier CFG simplifications, which allow more conditions to be added.

Reviewed By: nikic, antoniofrighetto

Differential Revision: https://reviews.llvm.org/D158843
2023-09-22 14:31:08 +01:00
Dhruv Chawla
3e992d81af
[InferAlignment] Enable InferAlignment pass by default
This gives an improvement of 0.6%:
https://llvm-compile-time-tracker.com/compare.php?from=7d35fe6d08e2b9b786e1c8454cd2391463832167&to=0456c8e8a42be06b62ad4c3e3cf34b21f2633d1e&stat=instructions:u

Differential Revision: https://reviews.llvm.org/D158600
2023-09-20 12:08:52 +05:30
Alexandros Lamprineas
475ddca56e Reland "[FuncSpec] Replace LoopInfo with BlockFrequencyInfo"
Using AvgLoopIters on any loop is too imprecise making the cost model
favor users inside loop nests regardless of the actual tripcount.

Differential Revision: https://reviews.llvm.org/D150375
2023-06-08 17:44:47 +01:00
Nikita Popov
96a14f388b Revert "[FuncSpec] Replace LoopInfo with BlockFrequencyInfo"
As reported on https://reviews.llvm.org/D150375#4367861 and
following, this change causes PDT invalidation issues. Revert
it and dependent commits.

This reverts commit 0524534d5220da5ecb2cd424a46520184d2be366.
This reverts commit ced90d1ff64a89a13479a37a3b17a411a3259f9f.
This reverts commit 9f992cc9350a7f7072a6dbf018ea07142ea7a7ed.
This reverts commit 1b1232047e83b69561fd64b9547cb0a0d374473a.
2023-05-30 14:49:03 +02:00
Arthur Eubanks
13e3d4aa5a [Pipeline] Don't run EarlyFPM in LTO post link
EarlyFPM cleans up the output of the frontend. This isn't necessary in post link pipelines as the pre link pipeline already ran this.

~0.4% savings in ThinLTO builds:
https://llvm-compile-time-tracker.com/compare.php?from=8a5d4eb775c644d8683f24817d44c510d2b853b7&to=3580252a2162eadca0da99f1eeaa112f74a0353d&stat=instructions:u

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D145403
2023-05-25 09:32:54 -07:00
Alexandros Lamprineas
1b1232047e [FuncSpec] Replace LoopInfo with BlockFrequencyInfo.
Using AvgLoopIters on any loop is too imprecise making the cost model
favor users inside loop nests regardless of the actual tripcount.

Differential Revision: https://reviews.llvm.org/D150375
2023-05-22 17:49:52 +01:00
Shoaib Meenai
141be5c062 Revert "Reland [Pipeline] Don't limit ArgumentPromotion to -O3"
This reverts commit 6f29d1adf29820daae9ea7a01ae2588b67735b9e.

https://reviews.llvm.org/D149768 is causing size regressions for -Oz
with FullLTO, and I'm reverting that one while investigating. This
commit depends on that one, so it needs to be reverted as well.
2023-05-05 14:26:57 -07:00
Arthur Eubanks
6f29d1adf2 Reland [Pipeline] Don't limit ArgumentPromotion to -O3
This is a cheap pass so there's no need to limit to -O3.
This removes some differences between various pipelines.

Code size regressions should be addressed with
https://reviews.llvm.org/D149768.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D148269
2023-05-03 13:17:30 -07:00
Arthur Eubanks
09d27bdb86 Revert "[Pipeline] Don't limit ArgumentPromotion to -O3"
This reverts commit 5b386b864c7619897c51a1da97d78f1cf6f3eff6.

Causes noticeable size increases under -Oz.
2023-05-03 08:57:47 -07:00
serge-sans-paille
afa13ba18d
Reapply Move "auto-init" instructions to the dominator of their users
Original patch (50b2a113db197a97f60ad2aace8b7382dc9b8c31) ignored the
fact that -ftrivial-auto-var-init could affect function parameters with
the sret attribute.
Just do not move instruction that don't affect alloca.
Also add missing test case for volatile instruction.

Differential Revision: https://reviews.llvm.org/D148507
2023-04-24 18:10:10 +02:00
Nikita Popov
e7e4c76320 [Pipelines] Don't run ForceFunctionAttrs post-link
This is effectively a debugging pass to adjust function attributes.
I don't think it makes sense to run it in the post-link pipeline.

Differential Revision: https://reviews.llvm.org/D148904
2023-04-24 09:58:06 +02:00
Nikita Popov
22a408ae51 [Pipelines] Don't explicitly require ORE
LICM does not use ORE from the pass manager, it constructs its
own instance. As such, explicitly requiring the analysis in the
pipeline is unnecessary.
2023-04-21 13:22:04 +02:00
Arthur Eubanks
5b386b864c [Pipeline] Don't limit ArgumentPromotion to -O3
This is a cheap pass so there's no need to limit to -O3.
This removes some differences between various pipelines.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D148269
2023-04-14 10:00:41 -07:00
Arthur Eubanks
4bf9ca5eec [Pipeline] Remove Annotation2Metadata pass in post-link pipelines
The pre-link pipeline already ran the pass and it only needs to be run once.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D145978
2023-04-12 20:34:09 -07:00
Hans Wennborg
a6d9730f40 Revert "Move "auto-init" instructions to the dominator of their users"
This could also move initialization of sret args, causing actually
initialized parts of such return values to be uninitialized. See
discussion on the code review.

> As a result of -ftrivial-auto-var-init, clang generates instructions to
> set alloca'd memory to a given pattern, right after the allocation site.
> In some cases, this (somehow costly) operation could be delayed, leading
> to conditional execution in some cases.
>
> This is not an uncommon situation: it happens ~500 times on the cPython
> code base, and much more on the LLVM codebase. The benefit greatly
> varies on the execution path, but it should not regress on performance.
>
> This is a recommit of cca01008cc31a891d0ec70aff2201b25d05d8f1b with
> MemorySSA update fixes.
>
> Differential Revision: https://reviews.llvm.org/D137707

This reverts commit 50b2a113db197a97f60ad2aace8b7382dc9b8c31
and follow-up commit ad9ad3735c4821ff4651fab7537a75b8f0bb60f8.
2023-04-12 13:37:21 +02:00
Dávid Bolvanský
05a2f4290e [AggressiveInstCombine] Enable also for -O2
Next step after https://reviews.llvm.org/D113179

Recently a set of patches by @anton-afanasyev improved many cases (better and cleaner vectorized code) thanks to improvements to AIC's TruncInstCombine (IC cannot handle it) motivated by real examples in bug reports. There was a discussion that -O2 could benefit from AIC as well, but discussion then stalled, so I would like restart it, with new numbers from LLVM compile time tracker.

As -O2 pipeline is not tracked by LLVM compile time tracker, I disabled AIC for -O3 to get an idea how expensive is it. Without AIC, I observed that geomean was cca -0.10%. Given that it seems like AIC is quite cheap, heavily tested by -O3 pipeline, I am proposing to enable it also with -O2 and similar to improve quality to vectorized code.

https://llvm-compile-time-tracker.com/compare.php?from=a1df5abef5f27646c809c7b85cf6170eb68f7735&to=e1ba6068f58c6ca862b920b8750faccb42a5843c&stat=instructions:u

Differential Revision: https://reviews.llvm.org/D147604
Reviewed-By: nikic
2023-04-05 16:51:21 +02:00
serge-sans-paille
50b2a113db
Move "auto-init" instructions to the dominator of their users
As a result of -ftrivial-auto-var-init, clang generates instructions to
set alloca'd memory to a given pattern, right after the allocation site.
In some cases, this (somehow costly) operation could be delayed, leading
to conditional execution in some cases.

This is not an uncommon situation: it happens ~500 times on the cPython
code base, and much more on the LLVM codebase. The benefit greatly
varies on the execution path, but it should not regress on performance.

This is a recommit of cca01008cc31a891d0ec70aff2201b25d05d8f1b with
MemorySSA update fixes.

Differential Revision: https://reviews.llvm.org/D137707
2023-04-04 07:30:03 +02:00
serge-sans-paille
11ae47dfc6
Revert "Move "auto-init" instructions to the dominator of their users"
This reverts commit cca01008cc31a891d0ec70aff2201b25d05d8f1b.

This change breaks memory ssa checks, see https://lab.llvm.org/buildbot#builders/109/builds/60970
2023-04-03 15:46:18 +02:00
serge-sans-paille
cca01008cc
Move "auto-init" instructions to the dominator of their users
As a result of -ftrivial-auto-var-init, clang generates instructions to
set alloca'd memory to a given pattern, right after the allocation site.
In some cases, this (somehow costly) operation could be delayed, leading
to conditional execution in some cases.

This is not an uncommon situation: it happens ~500 times on the cPython
code base, and much more on the LLVM codebase. The benefit greatly
varies on the execution path, but it should not regress on performance.

Differential Revision: https://reviews.llvm.org/D137707
2023-04-03 15:27:27 +02:00
Arthur Eubanks
2b34d59858 [test] Change DAG to NEXT in pipeline tests
These were made consistent in 951a980dc7aa6.
2023-03-21 10:44:33 -07:00
Arthur Eubanks
eecb8c5f06 [SampleProfile] Use LazyCallGraph instead of CallGraph
The function order in some tests had to be changed because they relied on ordering of functions returned in an SCC which is consistent but unspecified.
2023-03-20 13:43:54 -07:00
Arthur Eubanks
361cba22b2 [StandardInstrumentations] Rename -verify-cfg-preserved -> -verify-analysis-invalidation
In preparation for adding more checks under this flag.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D146069
2023-03-15 13:07:55 -07:00
Arthur Eubanks
20ed9cebb6 [Pipeline] Remove early InstCombine in ThinLTO post link sample profile pipeline
With opaque pointers, all function pointer types are the same, meaning there should be no bitcasts.

Internal benchmarks with SampleFDO look neutral.

This was added in D36333.

Reviewed By: tejohnson, davidxl

Differential Revision: https://reviews.llvm.org/D146099
2023-03-14 19:48:31 -07:00
Arthur Eubanks
0d4a709bb8 [Pipeline] Adjust PostOrderFunctionAttrs placement in simplification pipeline
We can infer more attribute information once functions are fully
simplified, so move the PostOrderFunctionAttrs pass after the function
simplification pipeline. However, just doing this can impact
simplification of recursive functions since function simplification
takes advantage of function attributes of callees (some LLVM tests are
actually impacted by this), so keep a copy of PostOrderFunctionAttrs
before the function simplification pipeline that only runs on recursive
functions.

For example, this fixes the small regression noticed in https://reviews.llvm.org/D128830.

This requires some restructuring of the CGSCC NoRerun feature. We need
to cache the ShouldNotRunFunctionPassesAnalysis analysis after the
simplification is done, which now is after the second
PostOrderFunctionAttrs run, rather than after the function
simplification pipeline.

Compile time impact:
https://llvm-compile-time-tracker.com/compare.php?from=33cf40122279342b50f92a3a53f5c185390b6018&to=1bb2a07875634e508a6bdf2ca1b130f55510f060&stat=instructions:u

Compile time increase from unconditionally running the first PostOrderFunctionAttrs:
https://llvm-compile-time-tracker.com/compare.php?from=1bb2a07875634e508a6bdf2ca1b130f55510f060&to=f4f87e89cc7a35c64e3a103a8036192a84ae002b&stat=instructions:u

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D145210
2023-03-06 09:01:45 -08:00
Rong Xu
666731660c [Pass][CHR] Move ControlHeightReduction to module optimization pipeline
This is a modified version of commit b374423304a8 by
Arthur (https://reviews.llvm.org/D143424).

Here we invoke to the pass independent of PGOOPT. We now check if the
profile is available through the program summary. This ensures CHR is
called in distributed ThinLTO BE compilation (where PGOOPT might not
be created).

Differential Revision: https://reviews.llvm.org/D144769
2023-02-27 11:47:54 -08:00
Arthur Eubanks
a628ca4925 Revert "[Pipeline] Move ControlHeightReduction to module optimization pipeline"
This reverts commit b374423304a8d91d590d0ce5ab1b381296d6dfb2.

Causes regressions on some benchmarks.
2023-02-23 10:17:12 -08:00
Arthur Eubanks
b374423304 [Pipeline] Move ControlHeightReduction to module optimization pipeline
This pass isn't a simplification, it's a non-canonical optimization.

This makes it only run once in a (Thin)LTO pipeline during postlink, just like all the other optimization pipeline passes.

Reviewed By: xur

Differential Revision: https://reviews.llvm.org/D143424
2023-02-16 15:23:38 -08:00
David Green
86bfeb906e Revert "Inlining: Run the legacy AlwaysInliner before the regular inliner."
This seems to cause large regressions in existing code, as much as 75% slower
(4x the time taken). Small always inline functions seem to be used a lot in the
cmsis-dsp library.

I would add a phase ordering test to show the problems, but one already exists!
The llvm/test/Transforms/PhaseOrdering/ARM/arm_mult_q15.ll was just changed by
removing alwaysinline to hide the problems that existed.

This reverts commit cae033dcf227aeecf58fca5af6fc7fde1fd2fb4f.
This reverts commit 8e33c41e72ad42e4c27f8cbc3ad2e02b169637a1.
2023-02-10 15:01:49 +00:00
Amara Emerson
cae033dcf2 Inlining: Run the legacy AlwaysInliner before the regular inliner.
We have several situations where it's beneficial for code size to ensure that every
call to always-inline functions are inlined before normal inlining decisions are
made. While the normal inliner runs in a "MandatoryOnly" mode to try to do this,
it only does it on a per-SCC basis, rather than the whole module. Ensuring that
all mandatory inlinings are done before any heuristic based decisions are made
just makes sense.

Despite being referred to the "legacy" AlwaysInliner pass, it's already necessary
for -O0 because the CGSCC inliner is too expensive in compile time to run at -O0.

This also fixes an exponential compile time blow up in
https://github.com/llvm/llvm-project/issues/59126

Differential Revision: https://reviews.llvm.org/D143624
2023-02-09 16:49:29 -08:00
Florian Hahn
8028263c41
Recommit "[ConstraintElim] Enable pass by default."
This reverts commit 695ce48c63ec582a46bfbda9b066f4d3bcde143f.

The compile-time regression causing the revert has been fixed. Recommit
the original patch.

Original commit message:

   The pass should help to close a functional gap when it comes to
    reasoning about related conditions in a relatively general way.

    It addresses multiple existing issues (linked below) and the need for a
    more powerful reasoning system was also discussed recently in
    https://discourse.llvm.org/t/rfc-alternative-approach-of-dealing-with-implications-from-comparisons-through-pos-analysis/65601/7

    On AArch64, the new pass performs ~2000 simplifications on
    MultiSource,SPEC2006,SPEC2017 with -O3.

    Compile-time impact:

    NewPM-O3: +0.20%
    NewPM-ReleaseThinLTO: +0.32%
    NewPM-ReleaseLTO-g: +0.28%

    https://llvm-compile-time-tracker.com/compare.php?from=f01a3a893c147c1594b9a3fbd817456b209dabbf&to=577688758ef64fb044215ec3e497ea901bb2db28&stat=instructions:u

    Fixes #49344.
    Fixes #47888.
    Fixes #48253.
    Fixes #49229.
    Fixes #58074.

    Reviewed By: asbirlea

    Differential Revision: https://reviews.llvm.org/D135915
2023-02-06 18:09:43 +00:00
Florian Hahn
695ce48c63
Revert "[ConstraintElim] Enable pass by default."
This reverts commit fb13dcf3431cd83911fe56899d2fade808dc5b8d.

A large compile-time regression for code generated by sanitizers has
been reported. Revert while I investigate the issue. Details and
reproducers are available here: https://reviews.llvm.org/D135915
2023-01-18 14:25:00 +00:00
Alexandros Lamprineas
572a757fa7 [IPSCCP] Enable specialization of functions.
Re-enable the optimization after having fixed the compilation error
found in SPEC/CINT2017rate/502.gcc_r when both LTO and PGO are in use
(see https://reviews.llvm.org/D141474).

Differential Revision: https://reviews.llvm.org/D140210
2023-01-13 14:04:17 +00:00
Florian Hahn
fb13dcf343
[ConstraintElim] Enable pass by default.
The pass should help to close a functional gap when it comes to
reasoning about related conditions in a relatively general way.

It addresses multiple existing issues (linked below) and the need for a
more powerful reasoning system was also discussed recently in
https://discourse.llvm.org/t/rfc-alternative-approach-of-dealing-with-implications-from-comparisons-through-pos-analysis/65601/7

On AArch64, the new pass performs ~2000 simplifications on
MultiSource,SPEC2006,SPEC2017 with -O3.

Compile-time impact:

NewPM-O3: +0.20%
NewPM-ReleaseThinLTO: +0.32%
NewPM-ReleaseLTO-g: +0.28%

https://llvm-compile-time-tracker.com/compare.php?from=f01a3a893c147c1594b9a3fbd817456b209dabbf&to=577688758ef64fb044215ec3e497ea901bb2db28&stat=instructions:u

Fixes #49344.
Fixes #47888.
Fixes #48253.
Fixes #49229.
Fixes #58074.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D135915
2023-01-04 18:00:37 +00:00
Florian Hahn
60359f56aa
Revert "[IPSCCP] Enable specialization of functions."
This reverts commit 2656572d485127cc30b8fe9752024d2a0f1c50db.

It looks like CINT2017rate/502.gcc_r gets mis-compiled with LTO + PGO on
AArch64 with function specialization.
2022-12-26 16:02:59 +00:00
Alexandros Lamprineas
2656572d48 [IPSCCP] Enable specialization of functions.
This patch enables Function Specialization by default at all
optimization levels except Os, Oz.

Compilation Time Overhead:
--------------------------
Measured the Instruction Count increase (Geomean) for CTMark from
the llvm-testsuite as in https://llvm-compile-time-tracker.com.
 * {-O3, Non-LTO}: +0.136% Instruction Count
 * {-O3, LTO}: +0.346% Instruction Count

Performance Uplift:
-------------------
Measured +9.121% score increase for 505.mcf_r from SPEC Int 2017
(Tested on Neoverse N1 with -O3 + LTO)

Correctness Testing:
--------------------
 * Passes bootstrap Clang with ASAN + LTO + FuncSpec aggressive options:
   { MaxClonesThreshold=10,
     SmallFunctionThreshold=10,
     AvgLoopIterationCount=30,
     SpecializeOnAddresses=true,
     EnableSpecializationForLiteralConstant=true,
     FuncSpecializationMaxIters=10 }
 * Builds Chromium and passes its unittests with the above options + ThinLTO.

For more info please refer to
https://discourse.llvm.org/t/rfc-should-we-enable-function-specialization/61518

Differential Revision: https://reviews.llvm.org/D140210
2022-12-25 10:05:21 +02:00
Nikita Popov
243acd5dcb [BasicAA] Remove support for PhiValues analysis
BasicAA currently has an optional dependency on the PhiValues
analysis. However, at least with our current pipeline setup, we
never actually make use of it. It's possible that this used to work
with the legacy pass manager, but I'm not sure of that either.

Given that this analysis has not actually been in use for a long
time, and nobody noticed or complained, I think we should drop
support for it and focus on one code path. It is worth noting that
analysis quality for the non-PhiValues case has significantly
improved in the meantime.

If we really wanted to make use of PhiValues, the right way would
probably be to pass it in via AAQI in places we want to use it,
rather than using an optional pass manager dependency (which are
an unpredictable PITA and should really only ever be used for
analyses that are only preserved and not used).

Differential Revision: https://reviews.llvm.org/D139719
2022-12-12 09:47:30 +01:00
Sjoerd Meijer
8250180238 Revert "Recommit "[LoopFlatten] Enable it by default""
This reverts commit 3ea6a9a469fde168c527b1c34c09f6d684ec86af because of the
reported miscompilation in: https://github.com/llvm/llvm-project/issues/59339
2022-12-05 15:14:12 +00:00
Sjoerd Meijer
3ea6a9a469 Recommit "[LoopFlatten] Enable it by default"
The problem in 58441 that was reported after enabling this last time was fixed
in 8e9e22f07bcbe2ee95478684cf31948370e4e51e.
2022-11-29 10:45:13 +00:00
Arthur Eubanks
4b3202e639 [opt] Remove "new-pm" from some cl::opt names 2022-11-28 11:00:45 -08:00
Sanjay Patel
163bb6d64e [Passes][VectorCombine] enable early run generally and try load folds
An early run of VectorCombine was added with D102496 specifically to
deal with unnecessary vector ops produced with the C matrix extension.
This patch is proposing to try those folds in general and add a pair
of load folds to the menu.

The load transform will partly solve (see PhaseOrdering diffs) a
longstanding vectorization perf bug by removing redundant loads via GVN:
issue #17113

The main reason for not enabling the extra pass generally in the initial
patch was compile-time cost. The cost of VectorCombine was significantly
(surprisingly) improved with:
87debdadaf18
https://llvm-compile-time-tracker.com/compare.php?from=ffe05b8f57d97bc4340f791cb386c8d00e0739f2&to=87debdadaf18f8a5c7e5d563889e10731dc3554d&stat=instructions:u

...so the extra run is going to cost very little now - the total cost of
the 2 runs should be less than the 1 run before that micro-optimization:
https://llvm-compile-time-tracker.com/compare.php?from=5e8c2026d10e8e2c93c038c776853bed0e7c8fc1&to=2c4b68eab5ae969811f422714e0eba44c5f7eefb&stat=instructions:u

It may be possible to reduce the cost slightly more with a few more
earlier-exits like that, but it's probably in the noise based on timing
experiments.

Differential Revision: https://reviews.llvm.org/D138353
2022-11-21 13:57:55 -05:00
Roman Lebedev
8adfa29706
[Pipelines] Introduce SROA after (final, run-time) loop unrolling
Now that we are done with loop unrolling, be it either by LoopVectorizer,
or LoopUnroll passes, some variable-offset GEP's into alloca's could have
become constant-offset, thus enabling SROA and alloca promotion,
yet we don't capitalize on that, which is surprizing.

While it would be good to not introduce one more SROA invocation,
but instead move the one from `PassBuilder::buildFunctionSimplificationPipeline()`,
the existing test coverage says that is a bad idea,
though it would be fine compile-time wise: https://llvm-compile-time-tracker.com/compare.php?from=b150d34c47efbd8fa09604bce805c0920360f8d7&to=5a9a5c855158b482552be8c7af3e73d67fa44805&stat=instructions

So instead, i add yet another SROA run.
I have checked, and it needs to be at least after said final loop unrolling.
This is still fine compile-time wise: https://llvm-compile-time-tracker.com/compare.php?from=70324cd88328c0924e605fa81b696572560aa5c9&to=fb489bbef687ad821c3173a931709f9cad9aee8a&stat=instructions

I've encountered this in a real code, `SROA-after-final-loop-unrolling.ll` has been reduced from https://godbolt.org/z/fsdMhETh3

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D136806
2022-11-17 21:31:30 +03:00
Sanjay Patel
f44e846402 [Passes] reduce duplicated CHECK lines in tests; NFC 2022-11-17 13:10:57 -05:00
Sjoerd Meijer
f7c42a278b Revert "Recommit "[LoopFlatten] Enable it by default""
This reverts commit 5b9597f59a445523bd59b5251ab1c2865e74919f.

A miscompilation was reported:
https://github.com/llvm/llvm-project/issues/58441

Reverting this while I look at that.
2022-10-18 23:36:36 +05:30
Sjoerd Meijer
5b9597f59a Recommit "[LoopFlatten] Enable it by default"
The sanitizer bots turned green again after another change went in, i.e.
revert 26dd64ba9cfabe5474bb207f3b7099965f81fed7, so I don't think this
patch was causing the problems.
2022-10-17 23:27:19 +05:30