50 Commits

Author SHA1 Message Date
Kazu Hirata
735ab61ac8
[CodeGen] Remove unused includes (NFC) (#115996)
Identified with misc-include-cleaner.
2024-11-12 23:15:06 -08:00
Kazu Hirata
186dc9a4df
[CodeGen] Avoid repeated hash lookups (NFC) (#113414) 2024-11-04 09:41:09 -08:00
Ellis Hoag
6ab26eab4f
Check hasOptSize() in shouldOptimizeForSize() (#112626) 2024-10-28 09:45:03 -07:00
Kazu Hirata
66cd2e0f9a
[CodeGen] Use range-based for loops (NFC) (#98706) 2024-07-13 13:29:47 -07:00
David Green
0a62a99aa6
[SelectOpt] Add handling for not conditions. (#92517)
This patch attempts to help the SelectOpt pass detect select groups made
up of conditions and not(conditions). Usually these are canonicalized in
instcombine to remove the not and invert the true/false values, but this
will not happen for Loginal operations, which can be beneficial to
convert if they are part of a larger select group. The handling for
not's are mostly handled in the SelectLike, which can be marked as
Inverted in order to reverse the TrueValue and FalseValue.

This helps fix a regression in fortran minloc constructs, after #84628
helped simplify a loop with branches into a loop with selects.
2024-05-22 19:28:24 +01:00
Stephen Tozer
ffd08c7759
[RemoveDIs][NFC] Rename DPValue -> DbgVariableRecord (#85216)
This is the major rename patch that prior patches have built towards.
The DPValue class is being renamed to DbgVariableRecord, which reflects
the updated terminology for the "final" implementation of the RemoveDI
feature. This is a pure string substitution + clang-format patch. The
only manual component of this patch was determining where to perform
these string substitutions: `DPValue` and `DPV` are almost exclusively
used for DbgRecords, *except* for:

- llvm/lib/target, where 'DP' is used to mean double-precision, and so
appears as part of .td files and in variable names. NB: There is a
single existing use of `DPValue` here that refers to debug info, which
I've manually updated.
- llvm/tools/gold, where 'LDPV' is used as a prefix for symbol
visibility enums.

Outside of these places, I've applied several basic string
substitutions, with the intent that they only affect DbgRecord-related
identifiers; I've checked them as I went through to verify this, with
reasonable confidence that there are no unintended changes that slipped
through the cracks. The substitutions applied are all case-sensitive,
and are applied in the order shown:

```
  DPValue -> DbgVariableRecord
  DPVal -> DbgVarRec
  DPV -> DVR
```

Following the previous rename patches, it should be the case that there
are no instances of any of these strings that are meant to refer to the
general case of DbgRecords, or anything other than the DPValue class.
The idea behind this patch is therefore that pure string substitution is
correct in all cases as long as these assumptions hold.
2024-03-19 20:07:07 +00:00
Stephen Tozer
360da83858
[RemoveDI][NFC] Rename DPValue->DbgRecord in comments and varnames (#84939)
This patch continues the ongoing rename work, replacing DPValue with
DbgRecord in comments and the names of variables, both members and
fn-local. This is the most labour-intensive part of the rename, as it is
where the most decisions have to be made about whether a given comment
or variable is referring to DPValues (equivalent to debug variable
intrinsics) or DbgRecords (a catch-all for all debug intrinsics); these
decisions are not individually difficult, but comprise a fairly large
amount of text to review.

This patch still largely performs basic string substitutions followed by
clang-format; there are almost* no places where, for example, a comment
has been expanded or modified to reflect the semantic difference between
DPValues and DbgRecords. I don't believe such a change is generally
necessary in LLVM, but it may be useful in the docs, and so I'll be
submitting docs changes as a separate patch.

*In a few places, `dbg.values` was replaced with `debug intrinsics`.
2024-03-13 16:39:35 +00:00
Stephen Tozer
15f3f446c5
[RemoveDIs][NFC] Rename common interface functions for DPValues->DbgRecords (#84793)
As part of the effort to rename the DbgRecord classes, this patch
renames the widely-used functions that operate on DbgRecords but refer
to DbgValues or DPValues in their names to refer to DbgRecords instead;
all such functions are defined in one of `BasicBlock.h`,
`Instruction.h`, and `DebugProgramInstruction.h`.

This patch explicitly does not change the names of any comments or
variables, except for where they use the exact name of one of the
renamed functions. The reason for this is reviewability; this patch can
be trivially examined to determine that the only changes are direct
string substitutions and any results from clang-format responding to the
changed line lengths. Future patches will cover renaming variables and
comments, and then renaming the classes themselves.
2024-03-12 14:53:13 +00:00
Orlando Cazalet-Hyams
a50bd0d799
[RemoveDIs] Replicate dbg intrinsic movement pattern in SelectOptimize (#81737)
Fix crash mentioned in comments on
d759618df76361a8e490eeae5c5399e0738cbfd0.

The assertion being hit was complaining that we had dangling DPValues;
the DPValues attached to the terminator of StartBlock become dangling
after the terminator is erased, and they're never "flushed" back onto
the new terminator once it's added. Doing that makes the crash go away,
but doesn't replicate existing dbg.* behaviour. See the comment in the
patch.

This change both fixes the crash (because there are now no DPValues left
on the terminator to dangle) and replicates existing behaviour (moves
those DPValues down to the new block).
2024-02-14 14:48:16 +00:00
wangpc
995d21bc6f [SelectOpt] Print instruction instead of pointer
Pull Request: https://github.com/llvm/llvm-project/pull/80125
2024-02-01 13:10:52 +08:00
David Green
a2d68b4bec
[SelectOpt] Add handling for Select-like operations. (#77284)
Some operations behave like selects. For example `or(zext(c), y)` is the
same as select(c, y|1, y)` and instcombine can canonicalize the select
to the or form. These operations can still be worthwhile converting to
branch as opposed to keeping as a select or or instruction.

This patch attempts to add some basic handling for them, creating a
SelectLike abstraction in the select optimization pass. The backend can
opt into handling `or(zext(c),x)` as a select if it could be profitable,
and the select optimization pass attempts to handle them in much the
same way as a `select(c, x|1, x)`. The Or(x, 1) may need to be added as
a new instruction, generated as the or is converted to branches.

This helps fix a regression from selects being converted to or's
recently.
2024-01-22 23:46:58 +00:00
Jeremy Morse
d7fb9eb818
[DebugInfo][RemoveDIs] Handle DPValues in SelectOptimize (#79005)
When there are debug intrinsics in-between groups of select
instructions, select-optimise sinks them into the "end" block. This
needs to be replicated for DPValues, the non-instruction variable
assignment object. Implement that and add a RUN line to a test that was
sensitive to this to ensure it gets tested.

(The exact range of instructions being transformed here is a little
fiddly, hence I've gone with a helper lambda).
2024-01-22 18:12:24 +00:00
paperchalice
ce08c7ee1e
[CodeGen] Port SelectOptimize to new pass manager (#74920)
- Use `BlockFrequencyInfoWrapperPass` in legacy pass so member
`std::unique_ptr<BranchProbabilityInfo> BPI` could be removed.
- Member `DominatorTree *DT = nullptr` is unused, remove it.
2023-12-12 12:09:30 +08:00
Kazu Hirata
57eb4826e5 [llvm] Stop including string (NFC)
Identified with clangd.
2023-12-03 16:24:43 -08:00
Matthias Braun
5181156b37
Use BlockFrequency type in more places (NFC) (#68266)
The `BlockFrequency` class abstracts `uint64_t` frequency values. Use it
more consistently in various APIs and disable implicit conversion to
make usage more consistent and explicit.

- Use `BlockFrequency Freq` parameter for `setBlockFreq`,
`getProfileCountFromFreq` and `setBlockFreqAndScale` functions.
- Return `BlockFrequency` in `getEntryFreq()` functions.
- While on it change some `const BlockFrequency& Freq` parameters to
plain `BlockFreqency Freq`.
- Mark `BlockFrequency(uint64_t)` constructor as explicit.
- Add missing `BlockFrequency::operator!=`.
- Remove `uint64_t BlockFreqency::getMaxFrequency()`.
- Add `BlockFrequency BlockFrequency::max()` function.
2023-10-05 11:40:17 -07:00
Jeremy Morse
6942c64e81 [NFC][RemoveDIs] Prefer iterator-insertion over instructions
Continuing the patch series to get rid of debug intrinsics [0], instruction
insertion needs to be done with iterators rather than instruction pointers,
so that we can communicate information in the iterator class. This patch
adds an iterator-taking insertBefore method and converts various call sites
to take iterators. These are all sites where such debug-info needs to be
preserved so that a stage2 clang can be built identically; it's likely that
many more will need to be changed in the future.

At this stage, this is just changing the spelling of a few operations,
which will eventually become signifiant once the debug-info bearing
iterator is used.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Differential Revision: https://reviews.llvm.org/D152537
2023-09-11 11:48:45 +01:00
Jeremy Morse
4427407a29 [NFC][RemoveDIs] Create a new spelling of the moveBefore method
As outlined in my proposal of how to get rid of debug intrinsics, this
patch adds a moveBefore method that signals the caller /intends/ the order
of moved instructions is to stay the same. This semantic difference has an
effect on debug-info, as it signals whether debug-info needs to move with
instructions or not.

The patch just replaces a few calls to moveBefore with calls to
moveBeforePreserving -- and the latter just calls the former, so it's all
NFC right now. A future patch will add an implementation of
moveBeforePreserving that takes action to correctly preserve debug-info,
but that's tightly coupled with our non-instruction debug-info
representation that's still being reviewed.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Differential Revision: https://reviews.llvm.org/D156369
2023-09-07 18:37:57 +01:00
Elliot Goodrich
ac73c48e09 [llvm] Reduce ComplexDeinterleavingPass.h includes
Remove the unnecessary `"llvm/IR/PatternMatch.h"` include directive from
`ComplexDeinterleavingPass.h` and move it to the corresponding source
file.

Add missing includes that were transitively included by this header to 3
other source files.

This reduces the total number of preprocessing tokens across the LLVM
source files in `lib` from (roughly) 1,964,876,961 to 1,935,091,611 - a
reduction of ~1.52%. This should result in a small improvement in
compilation time.
2023-05-20 17:49:18 +01:00
Elliot Goodrich
b7fb2a3fec Revert "[llvm] Reduce ComplexDeinterleavingPass.h includes"
This reverts commit 058ca5c07106d38ad66e3ec4972a613a64e88151.
2023-05-20 14:21:07 +01:00
Elliot Goodrich
058ca5c071 [llvm] Reduce ComplexDeinterleavingPass.h includes
Remove the unnecessary `"llvm/IR/PatternMatch.h"` include directive from
`ComplexDeinterleavingPass.h` and move it to the corresponding source
file.

Add missing includes that were transitively included by this header to 2
other source files.

This reduces the total number of preprocessing tokens across the LLVM
source files in `lib` from (roughly) 1,964,876,961 to 1,935,091,611 - a
reduction of ~1.52%. This should result in a small improvement in
compilation time.

Differential Revision: https://reviews.llvm.org/D150514
2023-05-20 13:36:50 +01:00
Akshay Khadse
8bf7f86d79 Fix uninitialized pointer members in CodeGen
This change initializes the members TSI, LI, DT, PSI, and ORE pointer feilds of the SelectOptimize class to nullptr.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D148303
2023-04-17 16:32:46 +08:00
Fangrui Song
51b685734b [Transforms,CodeGen] std::optional::value => operator*/operator->
value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).
2022-12-16 23:21:27 +00:00
Kazu Hirata
6eb0b0a045 Don't include Optional.h
These files no longer use llvm::Optional.
2022-12-14 21:16:22 -08:00
Phoebe Wang
7168e501e4 [NFC] Add checks for potential null returns 2022-12-13 22:30:31 +08:00
Fangrui Song
67819a72c6 [CodeGen] llvm::Optional => std::optional 2022-12-13 09:06:36 +00:00
David Green
16a72a0f87 [AArch64] Enable the select optimize pass for AArch64
This enabled the select optimize patch for ARM Out of order AArch64
cores. It is trying to solve a problem that is difficult for the
compiler to fix. The criteria for when a csel is better or worse than a
branch depends heavily on whether the branch is well predicted and the
amount of ILP in the loop (as well as other criteria like the core in
question and the relative performance of the branch predictor).  The
pass seems to do a decent job though, with the inner loop heuristics
being well implemented and doing a better job than I had expected in
general, even without PGO information.

I've been doing quite a bit of benchmarking. The headline numbers are
these for SPEC2017 on a Neoverse N1:
  500.perlbench_r   -0.12%
  502.gcc_r         0.02%
  505.mcf_r         6.02%
  520.omnetpp_r     0.32%
  523.xalancbmk_r   0.20%
  525.x264_r        0.02%
  531.deepsjeng_r   0.00%
  541.leela_r       -0.09%
  548.exchange2_r   0.00%
  557.xz_r          -0.20%

Running benchmarks with a combination of the llvm-test-suite plus
several versions of SPEC gave between a 0.2% and 0.4% geomean
improvement depending on the core/run. The instruction count went down
by 0.1% too, which is a good sign, but the results can be a little
noisy.  Some issues from other benchmarks I had ran were improved in
rGca78b5601466f8515f5f958ef8e63d787d9d812e. In summary well predicted
branches will see in improvement, badly predicted branches may get
worse, and on average performance seems to be a little better overall.

This patch enables the pass for AArch64 under -O3 for cores that will
benefit for it. i.e. not in-order cores that do not fit into the "Assume
infinite resources that allow to fully exploit the available
instruction-level parallelism" cost model. It uses a subtarget feature
for specifying when the pass will be enabled, which I have enabled under
cpu=generic as the performance increases for out of order cores seems
larger than any decreases for inorder, which were minor.

Differential Revision: https://reviews.llvm.org/D138990
2022-12-03 16:08:58 +00:00
Kazu Hirata
998960ee1f [CodeGen] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 20:36:08 -08:00
David Green
ca78b56014 [SelectOpt] Don't treat LogicalAnd/LogicalOr as selects
A `select i1 %c, i1 true, i1 %d` is just an or and a `select i1 %c, i1 %d, i1 false`
is just an and. There are better treated as such in the logic of SelectOpt, allowing
the backend to optimize them to and/or directly.

Differential Revision: https://reviews.llvm.org/D138490
2022-11-24 14:29:57 +00:00
David Green
7d098988bc [SelectOptimize] Add some debug logging. NFC
This is some quick debug messages for the SelectOptimize pass, adding
some information for the costs that are measured from getInstructionCost
calls, and re-using the existing optimization remarks to print some
information about if transforms were performed or not.

Differential Revision: https://reviews.llvm.org/D138108
2022-11-22 13:47:56 +00:00
Kazu Hirata
6ba4b62af8 Return None instead of Optional<T>() (NFC)
This patch replaces:

  return Optional<T>();

with:

  return None;

to make the migration from llvm::Optional to std::optional easier.
Specifically, I can deprecate None (in my source tree, that is) to
identify all the instances of None that should be replaced with
std::nullopt.

Note that "return None" far outnumbers "return Optional<T>();".  There
are more than 2000 instances of "return None" in our source tree.

All of the instances in this patch come from functions that return
Optional<T> except Archive::findSym and ASTNodeImporter::import, where
we return Expected<Optional<T>>.  Note that we can construct
Expected<Optional<T>> from any parameter convertible to Optional<T>,
which None certainly is.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716

Differential Revision: https://reviews.llvm.org/D138464
2022-11-21 19:06:42 -08:00
Sotiris Apostolakis
b827e7c600 [SelectOpti] Restrict load sinking
This is a follow-up to D133777, which resolved a use-after-free case but
did not cover all possible memory bugs due to misplacement of loads.

In short, the overall problem was that sinked loads could be moved after
state-modifying instructions leading to memory bugs.

The solution is to restrict load sinking unless it is found to be sound.
i) Within a basic block (to-be-sinked load and select-user are in the same BB),
loads can be sinked only if there is no intervening state-modifying instruction.
This is a conservative approach to avoid resorting to alias analysis to detect
potential memory overlap.
ii) Across basic blocks, sinking of loads is avoided. This is because going over
multiple basic blocks looking for memory conflicts could be computationally
expensive and also unlikely to allow loads to sink. Further, experiments showed
that not sinking these loads has a slight positive performance effect.
Maybe for some of these loads, having some separation allows enough time
for the load to be executed in time for its user. This is not the case for
floating point operations that benefit more from sinking.

The solution in D133777 was essentially undone in this patch,
since the latter is a complete solution to the observed problem.

Overall, the performance impact of this patch is minimal.
Tested on two internal Google workloads with instrPGO.
Search application showed <0.05% perf difference,
while the database one showed a slight improvement,
but not statistically significant.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D133999
2022-09-16 20:50:46 +00:00
Sotiris Apostolakis
eda61fb656 [SelectOpti] Fix lifetime intrinsic bug
When a select is converted to a branch and load instructions are sinked to the true/false blocks,
lifetime intrinsics (if present) could be made unsound if not moved.

This conservatively moves all lifetime intrinsics in a transformed BB to the end block to ensure
preserved lifetime semantics.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D133777
2022-09-13 19:00:18 +00:00
Kazu Hirata
258531b7ac Remove redundant initialization of Optional (NFC) 2022-08-20 21:18:28 -07:00
Paul Kirth
d434e40f39 [llvm][NFC] Refactor code to use ProfDataUtils
In this patch we replace common code patterns with the use of utility
functions for dealing with profiling metadata. There should be no change
in functionality, as the existing checks should be preserved in all
cases.

Reviewed By: bogner, davidxl

Differential Revision: https://reviews.llvm.org/D128860
2022-08-03 00:09:45 +00:00
Paul Kirth
6e9bab71b6 Revert "[llvm][NFC] Refactor code to use ProfDataUtils"
This reverts commit 300c9a78819b4608b96bb26f9320bea6b8a0c4d0.

We will reland once these issues are ironed out.
2022-07-27 21:38:11 +00:00
Paul Kirth
300c9a7881 [llvm][NFC] Refactor code to use ProfDataUtils
In this patch we replace common code patterns with the use of utility
functions for dealing with profiling metadata. There should be no change
in functionality, as the existing checks should be preserved in all
cases.

Reviewed By: bogner, davidxl

Differential Revision: https://reviews.llvm.org/D128860
2022-07-27 21:13:54 +00:00
Kazu Hirata
9e6d1f4b5d [CodeGen] Qualify auto variables in for loops (NFC) 2022-07-17 01:33:28 -07:00
Kazu Hirata
611ffcf4e4 [llvm] Use value instead of getValue (NFC) 2022-07-13 23:11:56 -07:00
Kazu Hirata
3b7c3a654c Revert "Don't use Optional::hasValue (NFC)"
This reverts commit aa8feeefd3ac6c78ee8f67bf033976fc7d68bc6d.
2022-06-25 11:56:50 -07:00
Kazu Hirata
aa8feeefd3 Don't use Optional::hasValue (NFC) 2022-06-25 11:55:57 -07:00
Kazu Hirata
7a47ee51a1 [llvm] Don't use Optional::getValue (NFC) 2022-06-20 22:45:45 -07:00
Kazu Hirata
e0e687a615 [llvm] Don't use Optional::hasValue (NFC) 2022-06-20 10:38:12 -07:00
Kazu Hirata
b254d67160 [llvm] Call *set::insert without checking membership first (NFC) 2022-06-18 08:32:54 -07:00
Sotiris Apostolakis
67be40df6e Recommit "[SelectOpti][5/5] Optimize select-to-branch transformation"
Use container::size_type directly to avoid type mismatch causing build failures in Windows.

Original commit message:
This patch optimizes the transformation of selects to a branch when the heuristics deemed it profitable.
It aggressively sinks eligible instructions to the newly created true/false blocks to prevent their
execution on the common path and interleaves dependence slices to maximize ILP.

Depends on D120232

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D120233
2022-05-24 14:08:09 -04:00
Sotiris Apostolakis
1786e70bd8 Revert "[SelectOpti][5/5] Optimize select-to-branch transformation"
This reverts commit a111fb960108df910a864500f3b98d75d37f083c.
2022-05-24 00:02:00 -04:00
Sotiris Apostolakis
a111fb9601 [SelectOpti][5/5] Optimize select-to-branch transformation
This patch optimizes the transformation of selects to a branch when the heuristics deemed it profitable.
It aggressively sinks eligible instructions to the newly created true/false blocks to prevent their
execution on the common path and interleaves dependence slices to maximize ILP.

Depends on D120232

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D120233
2022-05-23 23:31:27 -04:00
Sotiris Apostolakis
d7ebb74611 [SelectOpti][4/5] Loop Heuristics
This patch adds the loop-level heuristics for determining whether branches are more profitable than conditional moves.
These heuristics apply to only inner-most loops.

Depends on D120231

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D120232
2022-05-23 22:05:41 -04:00
Sotiris Apostolakis
8b42bc5662 [SelectOpti][3/5] Base Heuristics
This patch adds the base heuristics for determining whether branches are more profitable than conditional moves.
Base heuristics apply to all code apart from inner-most loops.

Depends on D122259

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D120231
2022-05-23 22:01:12 -04:00
Sotiris Apostolakis
97c3ef5c8a [SelectOpti][2/5] Select-to-branch base transformation
This patch implements the actual transformation of selects to branches.
It includes only the base transformation without any sinking.

Depends on D120230

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D122259
2022-05-23 16:11:40 -04:00
Sotiris Apostolakis
ca7c307d18 [SelectOpti][1/5] Setup new select-optimize pass
This is the first commit for the cmov-vs-branch optimization pass.
The goal is to develop a new profile-guided and target-independent cost/benefit analysis
for selecting conditional moves over branches when optimizing for performance.

Initially, this new pass is expected to be enabled only for instrumentation-based PGO.

RFC: https://discourse.llvm.org/t/rfc-cmov-vs-branch-optimization/6040

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D120230
2022-05-19 16:31:10 +00:00