437 Commits

Author SHA1 Message Date
Yingwei Zheng
b6c0ce0bb6
[IR][NFC] Use SwitchInst::defaultDestUnreachable (#134199) 2025-04-03 14:47:47 +08:00
Matt Arsenault
9a913a3944
InlineCostAnnotationPrinter: Fix constructing random TargetTransformInfo (#133637)
Query the correct TTI for the current target instead of constructing
some random default one. Also query the pass manager for
ProfileSummaryInfo.
This should only change the printing, not the actual result.
2025-03-30 22:04:59 +07:00
Kazu Hirata
1617f90a61
[Analysis] Use *Set::insert_range (NFC) (#132878)
We can use *Set::insert_range to collapse:

  for (auto Elem : Range)
    Set.insert(E);

down to:

  Set.insert_range(Range);

In some cases, we can further fold that into the set declaration.
2025-03-25 07:51:39 -07:00
Vasileios Porpodas
973ea045aa Revert "[Analysis][EphemeralValuesAnalysis][NFCI] Remove EphemeralValuesCache class (#132454)"
This reverts commit 4adefcfb856aa304b7b0a9de1eec1814f3820e83.
2025-03-22 10:13:39 -07:00
vporpo
4adefcfb85
[Analysis][EphemeralValuesAnalysis][NFCI] Remove EphemeralValuesCache class (#132454)
This is a follow-up to https://github.com/llvm/llvm-project/pull/130210.
The EphemeralValuesAnalysis pass used to return an EphemeralValuesCache
object which used to hold the ephemeral values and used to provide a
lazy collection of the ephemeral values, and an invalidation using the
`clear()` function.

This patch removes the EphemeralValuesCache class completely and instead
returns the SmallVector containing the ephemeral values.
2025-03-21 18:18:03 -07:00
vporpo
08dda4dcbf
[Analysis][EphemeralValuesCache][InlineCost] Ephemeral values caching for the CallAnalyzer (#130210)
This patch does two things:

1. It implements an ephemeral values cache analysis pass that collects the ephemeral values of a function and caches them for fast lookups. The collection of the ephemeral values is done lazily when the user calls `EphemeralValuesCache::ephValues()`.

2. It adds caching of ephemeral values using the `EphemeralValuesCache` to speed up `CallAnalyzer::analyze()`. Without caching this can take a long time to run in cases where the function contains a large number of `@llvm.assume()` calls and a large number of callsites. The time is spent in `collectEphemeralvalues()`.
2025-03-19 18:18:45 -07:00
Kazu Hirata
e0ed5e8db9
[Analysis] Avoid repeated hash lookups (NFC) (#127574) 2025-02-18 08:38:50 -08:00
Kazu Hirata
8e85b77f6a
[Analysis] Avoid repeated hash lookups (NFC) (#123446)
Co-authored-by: Nikita Popov <github@npopov.com>
2025-01-18 10:32:14 -08:00
Marina Taylor
e6bd00c0f7
[Inliner] Add a helper around SimplifiedValues.lookup. NFCI (#118646) 2024-12-04 21:27:02 +00:00
Marina Taylor
8fb748b4a7
[Inliner] Don't count a call penalty for foldable __memcpy_chk and similar (#117876)
When the size is an appropriate constant, __memcpy_chk will turn into a
memcpy that gets folded away by InstCombine. Therefore this patch avoids
counting these as calls for purposes of inlining costs.

This is only really relevant on platforms whose headers redirect memcpy
to __memcpy_chk (such as Darwin). On platforms that use intrinsics,
memcpy and similar functions are already exempt from call penalties.
2024-11-29 18:28:39 +00:00
Min-Yih Hsu
64314dedeb
[InlineCost] Print inline cost for invoke call sites as well (#114476)
Previously InlineCostAnnotationPrinter only prints inline cost for call
instructions. I don't think there is any reason not to analyze invoke
and its callee, and this patch adds such support.
2024-11-01 09:55:17 -07:00
Shilei Tian
e34e27f198
[TTI][AMDGPU] Allow targets to adjust LastCallToStaticBonus via getInliningLastCallToStaticBonus (#111311)
Currently we will not be able to inline a large function even if it only
has one live use because the inline cost is still very high after
applying `LastCallToStaticBonus`, which is a constant. This could
significantly impact the performance because CSR spill is very
expensive.

This PR adds a new function `getInliningLastCallToStaticBonus` to TTI to
allow targets to customize this value.

Fixes SWDEV-471398.
2024-10-11 10:19:54 -04:00
Kazu Hirata
8a9e9a89f0
[Analysis] Avoid repeated hash lookups (NFC) (#110397) 2024-09-29 08:50:37 -07:00
Sergei Barannikov
75c7bca740
[DataLayout] Remove constructor accepting a pointer to Module (#102841)
The constructor initializes `*this` with `M->getDataLayout()`, which
is effectively the same as calling the copy constructor.
There does not seem to be a case where a copy would be necessary.

Pull Request: https://github.com/llvm/llvm-project/pull/102841
2024-08-13 04:00:19 +03:00
Yingwei Zheng
be7239e5a6
[Inline] Remove bitcast handling in CallAnalyzer::stripAndComputeInBoundsConstantOffsets (#97988)
As we are now using opaque pointers, bitcast handling is no longer
needed.

Closes https://github.com/llvm/llvm-project/issues/97590.
2024-07-09 15:08:04 +08:00
Nikita Popov
9df71d7673
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
2024-06-28 08:36:49 +02:00
DianQK
d48bf8aef2
Reapply "[InlineCost] Correct the default branch cost for the switch statement (#85160)"
This reverts commit c6e4f6309184814dfc4bb855ddbdb5375cc971e0.
2024-05-10 21:18:53 +08:00
Eli Friedman
f893dccbba
Replace uses of ConstantExpr::getCompare. (#91558)
Use ICmpInst::compare() where possible, ConstantFoldCompareInstOperands
in other places. This only changes places where the either the fold is
guaranteed to succeed, or the code doesn't use the resulting compare if
we fail to fold.
2024-05-09 16:50:01 -07:00
DianQK
c6e4f63091
Revert "[InlineCost] Correct the default branch cost for the switch statement (#85160)"
This reverts commit 882814edd33cab853859f07b1dd4c4fa1393e0ea.
2024-05-05 21:54:30 +08:00
Quentin Dian
882814edd3
[InlineCost] Correct the default branch cost for the switch statement (#85160)
Fixes #81723.

The earliest commit of the related code is:
919f9e8d65.
I tried to understand the following code with
https://github.com/llvm/llvm-project/pull/77856#issuecomment-1993499085.

5932fcc478/llvm/lib/Analysis/InlineCost.cpp (L709-L720)

I think only scenarios where there is a default branch were considered.
2024-05-05 21:28:31 +08:00
Xiangyang (Mark) Guo
1607e8212c
[InlineCost] Disable cost-benefit when sample based PGO is used (#86626)
#66457 makes InlineCost to use cost-benefit by default, which causes
0.4-0.5% performance regression on multiple internal workloads. See
discussions https://github.com/llvm/llvm-project/pull/66457. This pull
request reverts it.

Co-authored-by: helloguo <helloguo@meta.com>
2024-03-28 10:11:57 -07:00
Quentin Dian
5932fcc478
[InlineCost] Consider the default branch when calculating cost (#77856)
First step in fixing #76772.

This PR considers the default branch as a case branch. This will give
the unreachable default branch fair consideration.
2024-02-11 18:24:59 +08:00
Kazu Hirata
eeb0e9f82e [Analysis] Use llvm::successors (NFC) 2024-01-25 18:17:22 -08:00
Jannik Silvanus
7954c57124
[IR] Fix GEP offset computations for vector GEPs (#75448)
Vectors are always bit-packed and don't respect the elements' alignment
requirements. This is different from arrays. This means offsets of
vector GEPs need to be computed differently than offsets of array GEPs.

This PR fixes many places that rely on an incorrect pattern
that always relies on `DL.getTypeAllocSize(GTI.getIndexedType())`.
We replace these by usages of  `GTI.getSequentialElementStride(DL)`, 
which is a new helper function added in this PR.

This changes behavior for GEPs into vectors with element types for which
the (bit) size and alloc size is different. This includes two cases:

* Types with a bit size that is not a multiple of a byte, e.g. i1.
GEPs into such vectors are questionable to begin with, as some elements
  are not even addressable.
* Overaligned types, e.g. i16 with 32-bit alignment.

Existing tests are unaffected, but a miscompilation of a new test is fixed.

---------

Co-authored-by: Nikita Popov <github@npopov.com>
2024-01-04 10:08:21 +01:00
Sander de Smalen
00a831421f
[AArch64][SME] Extend Inliner cost-model with custom penalty for calls. (#68416)
This is a stacked PR following on from #68415 

This patch has two purposes:
(1) It tries to make inlining more likely when it can avoid a
streaming-mode change.
(2) It avoids inlining when inlining causes more streaming-mode changes.

An example of (1) is:
```
  void streaming_compatible_bar(void);

  void foo(void) __arm_streaming {
    /* other code */
    streaming_compatible_bar();
    /* other code */
  }

  void f(void) {
    foo();            // expensive streaming mode change
  }

  ->

  void f(void) {
    /* other code */
    streaming_compatible_bar();
    /* other code */
  }
```
where it wouldn't have inlined the function when foo would be a
non-streaming function.

An example of (2) is:
```
  void streaming_bar(void) __arm_streaming;

  void foo(void) __arm_streaming {
    streaming_bar();
    streaming_bar();
  }

  void f(void) {
    foo();            // expensive streaming mode change
  }

  -> (do not inline into)

  void f(void) {
    streaming_bar();  // these are now two expensive streaming mode changes
    streaming_bar();
  }```
2023-10-31 10:28:40 +00:00
Mingming Liu
aa6ee03709 [NFC][Inliner] Introduce another multiplier for cost benefit analysis and make multipliers overriddable in TargetTransformInfo.
- The motivation is to expose tunable knobs to control the aggressiveness of inlines for different backend (e.g., machines with different icache size, and workload with different icache/itlb PMU counters). Tuning inline aggressiveness shows a small (~+0.3%) but stable improvement on workload/hardware that is more frontend bound.
- Both multipliers could be overridden from command line.

Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D153154
2023-10-02 21:27:07 -07:00
Kazu Hirata
ea91ae5bc0 [Analysis] Use std::clamp (NFC) 2023-09-22 00:21:06 -07:00
Kazu Hirata
b4301df61f Revert "[InlineCost] Check for conflicting target attributes early"
This reverts commit d6f994acb3d545b80161e24ab742c9c69d4bbf33.

Several people have reported breakage resulting from this patch:

- https://github.com/llvm/llvm-project/issues/65152
- https://github.com/llvm/llvm-project/issues/65205
2023-09-21 10:29:46 -07:00
HaohaiWen
954979d681 Reland [InlineCost] Enable the cost benefit analysis for Sample PGO (#66457)
Enables the cost-benefit-analysis-based inliner by default if we have
sample profile.

No extra fix is required.
2023-09-21 12:44:24 +08:00
Haohai Wen
486fc81583 Revert "[InlineCost] Enable the cost benefit analysis for Sample PGO (#66457)"
This reverts commit 2f2319cf2406d9830a331cbf015881c55ae78806.
2023-09-21 10:30:39 +08:00
HaohaiWen
2f2319cf24
[InlineCost] Enable the cost benefit analysis for Sample PGO (#66457)
Enables the cost-benefit-analysis-based inliner by default if we have
sample profile.
2023-09-21 09:21:55 +08:00
Mingming Liu
2639fdc5dd [InlineCost]Account for switch instructons when the switch condition could be simplified as a result of inlines.
Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D152053
2023-09-20 14:49:06 -07:00
Matthias Braun
b0c8c45423
Avoid BlockFrequency overflow problems (#66280)
Multiplying raw block frequency with an integer carries a high risk
of overflow.

- Add `BlockFrequency::mul` return an std::optional with the product
  or `nullopt` to indicate an overflow.
- Fix two instances where overflow was likely.
2023-09-14 11:11:27 -07:00
Kazu Hirata
53445db170 [Analysis] Fix a comment typo (NFC) 2023-08-25 13:18:57 -07:00
Juan Manuel MARTINEZ CAAMAÑO
cc8a346e3f [InlineCost][TargetTransformInfo][AMDGPU] Consider cost of alloca instructions in the caller (1/2)
On AMDGPU, alloca instructions have penalty that can
be avoided when SROA is applied after inlining.

This patch introduces the default implementation of
TargetTransformInfo::getCallerAllocaCost.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D149740
2023-06-29 09:49:16 +02:00
Dhruv Chawla
8e580b7edd
[NFC][SetVector] Update some usages of SetVector to SmallSetVector
This patch is a continuation of D152497. It updates usages of SetVector
that were found in llvm/ and clang/ which were originally specifying either
SmallPtrSet or SmallVector to just using SmallSetVector, as the overhead
of SetVector is reduced with D152497.

This also helps clean up the code a fair bit, and gives a decent speed
boost at -O0 (~0.2%):
https://llvm-compile-time-tracker.com/compare.php?from=9ffdabecabcddde298ff313f5353f9e06590af62&to=97f1c0cde42ba85eaa67cbe89bec8fe45b801f21&stat=instructions%3Au

Differential Revision: https://reviews.llvm.org/D152522
2023-06-10 12:36:43 +05:30
Kazu Hirata
d6f994acb3 [InlineCost] Check for conflicting target attributes early
When we inline a callee into a caller, the compiler needs to make sure
that the caller supports a superset of instruction sets that the
callee is allowed to use.  Normally, we check for the compatibility of
target features via functionsHaveCompatibleAttributes, but that
happens after we decide to honor call site attribute
Attribute::AlwaysInline.  If the caller contains a call marked with
Attribute::AlwaysInline, which can happen with
__attribute__((flatten)) placed on the caller, the caller could end up
with code that cannot be lowered to assembly code.

This patch fixes the problem by checking the target feature
compatibility before we honor Attribute::AlwaysInline.

Fixes https://github.com/llvm/llvm-project/issues/62664

Differential Revision: https://reviews.llvm.org/D150396
2023-06-02 16:00:47 -07:00
Denis Antrushin
291223409c [InlineCost] Consider branches with !make.implicit metadata as free.
!make.implicit metadata attached to branch means it will very likely
be eliminated (together with associated cmp instruction).

Reviewed By: apilipenko

Differential Revision: https://reviews.llvm.org/D149747
2023-05-25 18:43:16 +03:00
Jacob Hegna
f9b3e3411c Adjust macros which define the ML inlining features.
This aligns the inlining macros more closely with how the regalloc
macros are defined.

 - Explicitly specify the dtype/shape
 - Remove separate names for python/C++
 - Add docstring for inline cost features

Differential Revision: https://reviews.llvm.org/D149384
2023-04-27 22:47:12 +00:00
Dávid Bolvanský
e1f94336e9 Revert "[InlineCost] isKnownNonNullInCallee - handle also dereferenceable attribute"
This reverts commit 3b5ff3a67c1f0450a100dca34d899ecd3744cb36.
2023-04-06 16:54:26 +02:00
Dávid Bolvanský
3b5ff3a67c [InlineCost] isKnownNonNullInCallee - handle also dereferenceable attribute 2023-04-06 16:51:28 +02:00
Kazu Hirata
96118f1b0a [Inliner] clang-format InlineCost.cpp and Inliner.cpp (NFC) 2023-03-16 14:55:28 -07:00
Kazu Hirata
11efd1cb04 [Analysis] Use *{Set,Map}::contains (NFC) 2023-03-14 00:32:40 -07:00
Nikita Popov
86d1ed9e48 [InlineCost] Avoid ConstantExpr::getSelect()
Instead use ConstantFoldSelectInstruction(), which will return
nullptr if it cannot be folded and a constant expression would
be produced instead.

In preparation for removing select constant expressions.
2023-02-27 16:47:16 +01:00
Nick Desaulniers
f1764d5b59 [InlineCost] model calls to llvm.objectsize.*
Very similar to https://reviews.llvm.org/D111272. We very often can
evaluate calls to llvm.objectsize.* regardless of inlining. Don't count
calls to llvm.objectsize.* against the InlineCost when we can evaluate
the call to a constant.

Link: https://github.com/ClangBuiltLinux/linux/issues/1302

Reviewed By: manojgupta

Differential Revision: https://reviews.llvm.org/D111456
2023-01-24 15:09:57 -08:00
Guillaume Chatelet
48f5d77eee [NFC] Use TypeSize::getKnownMinValue() instead of TypeSize::getKnownMinSize()
This change is one of a series to implement the discussion from
https://reviews.llvm.org/D141134.
2023-01-11 16:36:39 +00:00
Kazu Hirata
c08fad8193 [llvm] Remove redundant initialization of std::optional (NFC) 2022-12-20 15:53:38 -08:00
Fangrui Song
2fa744e631 std::optional::value => operator*/operator->
value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).

This commit fixes LLVMAnalysis and its dependencies.
2022-12-16 22:44:08 +00:00
Fangrui Song
d4b6fcb32e [Analysis] llvm::Optional => std::optional 2022-12-14 07:32:24 +00:00
Kazu Hirata
9f252e5567 [llvm] Use std::nullopt instead of None in comments (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 17:31:17 -08:00