951 Commits

Author SHA1 Message Date
Jeremy Morse
c672ba7dde
[DebugInfo][RemoveDIs] Instrument inliner for non-instr debug-info (#72884)
With intrinsics representing debug-info, we just clone all the
intrinsics when inlining a function and don't think about it any
further. With non-instruction debug-info however we need to be a bit
more careful and manually move the debug-info from one place to another.
For the most part, this means keeping a "cursor" during block cloning of
where we last copied debug-info from, and performing debug-info copying
whenever we successfully clone another instruction.

There are several utilities in LLVM for doing this, all of which now
need to manually call cloneDebugInfo. The testing story for this is not
well covered as we could rely on normal instruction-cloning mechanisms
to do all the hard stuff. Thus, I've added a few tests to explicitly
test dbg.value behaviours, ahead of them becoming not-instructions.
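
The cursor idea described above can be illustrated with a toy model. This is a minimal sketch in Python, not LLVM code: `Inst`, `clone_block`, and `should_clone` are hypothetical names, and debug records are modelled as plain strings attached ahead of each instruction. When an instruction is skipped during cloning, its records are carried forward to the next instruction that is successfully cloned.

```python
from dataclasses import dataclass, field


@dataclass
class Inst:
    name: str
    # Debug records that sit immediately before this instruction (toy model).
    dbg: list = field(default_factory=list)


def clone_block(block, should_clone):
    """Clone `block`, moving debug records across skipped instructions."""
    cloned = []
    cursor = 0  # first source index whose debug records are not yet copied
    for i, inst in enumerate(block):
        if not should_clone(inst):
            continue  # leave the cursor behind; records are picked up later
        # Copy every record between the cursor and this instruction, inclusive.
        pending = [rec for src in block[cursor:i + 1] for rec in src.dbg]
        cloned.append(Inst(inst.name, pending))
        cursor = i + 1
    return cloned
```

In this sketch, records attached to skipped instructions after the last cloned instruction are simply dropped; real code would have to decide what to do with them.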
2023-11-26 21:24:29 +00:00
Anna Thomas
4ba50a783b Update test to consider incompatible align attribute 2023-11-10 10:50:35 -05:00
Sander de Smalen
00a831421f
[AArch64][SME] Extend Inliner cost-model with custom penalty for calls. (#68416)
This is a stacked PR following on from #68415 

This patch has two purposes:
(1) It tries to make inlining more likely when it can avoid a
streaming-mode change.
(2) It avoids inlining when inlining causes more streaming-mode changes.

An example of (1) is:
```
  void streaming_compatible_bar(void);

  void foo(void) __arm_streaming {
    /* other code */
    streaming_compatible_bar();
    /* other code */
  }

  void f(void) {
    foo();            // expensive streaming mode change
  }

  ->

  void f(void) {
    /* other code */
    streaming_compatible_bar();
    /* other code */
  }
```
where it would not have inlined the function if foo were a
non-streaming function.

An example of (2) is:
```
  void streaming_bar(void) __arm_streaming;

  void foo(void) __arm_streaming {
    streaming_bar();
    streaming_bar();
  }

  void f(void) {
    foo();            // expensive streaming mode change
  }

  -> (do not inline into)

  void f(void) {
    streaming_bar();  // these are now two expensive streaming mode changes
    streaming_bar();
  }
```
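
The two examples can be summarised as counting streaming-mode changes before and after inlining. The following is a rough Python sketch under stated assumptions, not the actual cost model: the mode names and both helper functions are hypothetical, and only calls are considered (other code in the callee that might itself require a particular mode is ignored).

```python
STREAMING = "streaming"
NON_STREAMING = "non-streaming"
COMPATIBLE = "streaming-compatible"


def needs_mode_change(caller_mode, callee_mode):
    # A streaming-compatible callee runs in whatever mode the caller is in.
    return callee_mode != COMPATIBLE and callee_mode != caller_mode


def mode_change_delta(caller_mode, callee_mode, calls_in_callee):
    """Mode changes after inlining minus mode changes before inlining."""
    before = 1 if needs_mode_change(caller_mode, callee_mode) else 0
    after = sum(needs_mode_change(caller_mode, m) for m in calls_in_callee)
    return after - before
```

A negative result corresponds to case (1), where inlining removes a mode change; a positive result corresponds to case (2), where inlining adds mode changes and a penalty would apply.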
2023-10-31 10:28:40 +00:00
Sander de Smalen
6d30bc0085
[AArch64][SME] Allow inlining when streaming-mode attributes don't match up. (#68415)
The use-case here is to support things like:

  int foo(int x, int y) __arm_streaming { return std::max<int>(x, y); }

where the call to non-streaming `std::max<int>(x, y)` can be safely
inlined into the streaming function.

This is a first step and will need further work to allow more cases
(e.g. more fine-grained analysis of the function calls to ensure they
don't result in any incompatible instructions for the requested mode).
2023-10-30 10:47:07 +00:00
Aiden Grossman
f39c38584e [MLGO] Fix tests post 1a2e77c
This patch switched the default value of the mandatory-inlining-first
flag from true to false. This broke one of the MLGO tests that relied on
the default value of this flag. This patch explicitly sets the value to
fix the test and avoid future breakages.
2023-10-29 08:41:11 +00:00
Amara Emerson
1a2e77cf9e Revert "Revert "Inlining: Run the legacy AlwaysInliner before the regular inliner.""
This reverts commit 86bfeb906e3a95ae428f3e97d78d3d22a7c839f3.

This is a long time coming re-application that was originally reverted due to
regressions, unrelated to the actual inlining change. These regressions have since
been fixed due to another long-in-the-making change of a66051c6 landing.

Original commit message for reference:
---
    We have several situations where it's beneficial for code size to ensure that every
    call to always-inline functions is inlined before normal inlining decisions are
    made. While the normal inliner runs in a "MandatoryOnly" mode to try to do this,
    it only does it on a per-SCC basis, rather than the whole module. Ensuring that
    all mandatory inlinings are done before any heuristic based decisions are made
    just makes sense.

    Despite being referred to as the "legacy" AlwaysInliner pass, it's already necessary
    for -O0 because the CGSCC inliner is too expensive in compile time to run at -O0.

    This also fixes an exponential compile time blow up in
    https://github.com/llvm/llvm-project/issues/59126

    Differential Revision: https://reviews.llvm.org/D143624
---
2023-10-28 23:21:11 -07:00
Sander de Smalen
0e099faff1 [AArch64][SME] NFC: use update_test_checks.py for sme-pstate(sm|za)-attrs.ll 2023-10-06 12:46:20 +00:00
Mircea Trofin
a4765c6a02 [mlgo] Fix state-tracking-coro.ll test
Post #68263, the inline advisor printer tries to print SCC Nodes' names,
but if we perform a full pipeline (like O1), there'll be some DCE-ing
happening and the Node pointers kept in the advisor for this (printing)
purpose are dangling. Using the more eager printer post each scc inline
pass is sufficient.
2023-10-04 22:07:44 -07:00
Mircea Trofin
1b3fc40586
[mlgo][coro] Assign coro split-ed functions a FunctionLevel (#68263) 2023-10-04 21:20:00 -07:00
Noah Goldstein
2da4960f20 [Inliner] Also propagate noundef and align ret attributes during inlining
Both of these can potentially be lost otherwise.
2023-10-03 16:12:19 -05:00
Noah Goldstein
2d037f5aed [Inliner] Use "best" ret attribute when propagating attributes during inlining
For attributes associated with a value (like `dereferenceable(N)`)
instead of always using the attribute from the to-be inlined caller,
it should keep using the value at existing callsites that have the
attribute if the value is higher (provides more information).
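
A small sketch of the "keep the more informative value" rule, written in Python as an illustration only. The function name, the dictionary representation of attributes, and the choice of which attributes count as value-carrying are assumptions, not the patch's actual code.

```python
# Attributes whose integer payload gets stronger as it grows (assumed set).
VALUE_ATTRS = ("dereferenceable", "dereferenceable_or_null")


def merge_ret_attrs(existing, incoming):
    """Merge attributes being propagated into those already on a callsite."""
    merged = dict(existing)
    for name, value in incoming.items():
        if name in VALUE_ATTRS and name in merged:
            # Keep whichever value provides more information.
            merged[name] = max(merged[name], value)
        else:
            merged[name] = value
    return merged
```

For example, a callsite that already carries `dereferenceable(16)` keeps 16 even if the propagated attribute is only `dereferenceable(8)`.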
2023-10-03 16:12:16 -05:00
Noah Goldstein
733f373ebe [Inliner] Regen checks for old test; NFC 2023-10-03 16:12:06 -05:00
Mingming Liu
aa6ee03709 [NFC][Inliner] Introduce another multiplier for cost benefit analysis and make multipliers overridable in TargetTransformInfo.
- The motivation is to expose tunable knobs to control the aggressiveness of inlines for different backend (e.g., machines with different icache size, and workload with different icache/itlb PMU counters). Tuning inline aggressiveness shows a small (~+0.3%) but stable improvement on workload/hardware that is more frontend bound.
- Both multipliers could be overridden from command line.

Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D153154
2023-10-02 21:27:07 -07:00
Noah Goldstein
2f3b7d33f4 [Inliner] Fix bug when propagating poison generating return attributes
Poison generating return attributes can't be propagated the same as
others, as they can change the behavior of other uses and/or create UB
where it otherwise wouldn't have occurred.

For example:
```
define nonnull ptr @foo() {
    %p = call ptr @bar()
    call void @use(ptr %p)
    ret ptr %p
}
```

If we inline `@foo` and propagate `nonnull` to `@bar`, it could change
the behavior of `@use` as instead of taking `null`, `@use` will
now be passed `poison`.

This can be even worse in a case like:
```
define nonnull ptr @foo() {
    %p = call noundef ptr @bar()
    ret ptr %p
}
```

Where propagating `nonnull` to `@bar` will cause UB on `null` return
of `@bar` (`noundef` + `poison`) where it previously wouldn't
have occurred.

To fix this, we only propagate poison generating return attributes if
either 1) The only use of the callsite to propagate to is return and
the callsite to propagate to doesn't have `noundef`. Or 2) the
callsite to be inlined has `noundef`.

The former case ensures no new UB or `poison` values will be
added. The latter is UB anyways if the value is `poison` so we can go
ahead without worrying about behavior changes.
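
The rule can be written as a single predicate. This is a sketch in Python with hypothetical parameter names; "inner" refers to the callsite inside the function being inlined (the call to `@bar` in the examples) and "outer" to the callsite being inlined (the call to `@foo`).

```python
def can_propagate_poison_attr(inner_only_used_by_ret,
                              inner_has_noundef,
                              outer_has_noundef):
    """Decide whether a poison-generating return attribute may be copied
    from the outer callsite onto the inner callsite."""
    # Case 2: the outer call is already UB on poison, so nothing changes.
    if outer_has_noundef:
        return True
    # Case 1: no other user can observe the new poison value, and the
    # inner call does not turn poison into immediate UB.
    return inner_only_used_by_ret and not inner_has_noundef
```

Applied to the two examples above, the first is rejected because `%p` is also passed to `@use`, and the second is rejected because the inner call is `noundef`.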
2023-09-28 17:27:42 -05:00
Noah Goldstein
bf8d03921d [Inliner] Add some additional tests for propagating attributes before inlining; NFC 2023-09-28 17:27:41 -05:00
Kazu Hirata
b4301df61f Revert "[InlineCost] Check for conflicting target attributes early"
This reverts commit d6f994acb3d545b80161e24ab742c9c69d4bbf33.

Several people have reported breakage resulting from this patch:

- https://github.com/llvm/llvm-project/issues/65152
- https://github.com/llvm/llvm-project/issues/65205
2023-09-21 10:29:46 -07:00
Anna Thomas
23f08af2be [Inline] Avoid incompatible return attributes on deoptimize
When updating the return type of deoptimize call during inline, we need
to drop incompatible return attributes.  This bug was exposed once we
relaxed the constraint of adding the attributes through D156844. With
that change deoptimize calls (which are not willreturn) will start having
return attributes added to them.

Fixes https://github.com/llvm/llvm-project/issues/64804.

Differential Revision: https://reviews.llvm.org/D158286
2023-08-18 12:55:51 -04:00
Sameer Sahasrabuddhe
8dce4c56dd [Inliner] Handle convergence control when inlining a call
When a convergencectrl token is passed to a convergent call, and the called
function in turn calls the entry intrinsic, the intrinsic is now replaced
with the convergencectrl token.

The spec requires the following check:
  A call from function F to function G can be inlined only if:
  - at least one of F or G does not make any convergent calls, or,
  - both F and G make the same kind of convergent calls: controlled or
    uncontrolled.

But this change does not implement this complete check. A proper implementation
requires a whole new analysis that identifies convergence in every function. For
now, we skip that and just do a cursory check for the entry intrinsic. The
underlying assumption is that in a compiler flow that fully implements
convergence control tokens, there is no mixing of controlled and uncontrolled
convergent operations in the whole program.
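
For reference, the full check quoted from the spec (which this change does not implement) could be expressed roughly as follows. This is a Python sketch under assumptions: each function is summarised by the set of convergent-call kinds it makes, which is exactly the per-function analysis the message says does not exist yet, and treating a function that mixes both kinds as not inlinable is one reading of the spec text.

```python
def spec_allows_inlining(f_kinds, g_kinds):
    """f_kinds / g_kinds: sets drawn from {"controlled", "uncontrolled"}
    describing the convergent calls made by F and G."""
    # At least one of F or G makes no convergent calls.
    if not f_kinds or not g_kinds:
        return True
    # Both make the same single kind of convergent call (assumed reading).
    return len(f_kinds) == 1 and f_kinds == g_kinds
```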

This is a reboot of the original change D85606 by
Nicolai Haehnle <nicolai.haehnle@amd.com>.

Reviewed By: arsenm, nhaehnle

Differential Revision: https://reviews.llvm.org/D152431
2023-08-17 09:56:25 +05:30
Noah Goldstein
4d51c6258e [Inliner] Add return attributes to callsites not marked willreturn/nounwind
The actual callsite we are adding to doesn't need to be
`willreturn`/`nounwind`, only every instruction between the callsite
and the return.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D156844
2023-08-16 22:43:04 -05:00
Noah Goldstein
612a7f0b15 [Inliner] Add the callsites called function return attributes to set addable attributes
We can do this by just querying attribute in the callsite itself. This
is both cleaner code and produces better results.

Differential Revision: https://reviews.llvm.org/D156843
2023-08-16 22:43:04 -05:00
Noah Goldstein
74c4d1e422 [Inliner] Add more tests for deducing return attributes of callsites when inlining; NFC
Differential Revision: https://reviews.llvm.org/D156842
2023-08-16 22:43:04 -05:00
Nikita Popov
6f3f600b2a [Inline] Add test for simplification in loop (NFC)
This would have been miscompiled by D157816.
2023-08-16 09:27:01 +02:00
Matt Arsenault
25bc999d1f Intrinsics: Add type overload to stacksave and stackstore
This allows use with non-0 address space stacks. llvm_ptr_ty should
never be used. This could use some more percolation up through mlir,
but this is enough to fix existing tests.

https://reviews.llvm.org/D156666
2023-08-09 18:33:11 -04:00
Matt Arsenault
acc163d4ab Inliner: Regenerate test
Test claims to be autogenerated but some functions are inexplicably
missing checks.
2023-07-31 08:05:12 -04:00
Matt Arsenault
d873a14e93 ValueTracking: Implement computeKnownFPClass for frexp
Work around the lack of proper multiple return values by looking
at the extractvalue.

https://reviews.llvm.org/D150982
2023-07-21 16:04:13 -04:00
Matt Arsenault
e1ac984a10 ValueTracking: Implement computeKnownFPClass for ldexp
https://reviews.llvm.org/D149590
2023-07-11 09:26:41 -04:00
Juan Manuel MARTINEZ CAAMAÑO
dd1df099ae [InlineCost][TargetTransformInfo][AMDGPU] Consider cost of alloca instructions in the caller (2/2)
Before this patch, the compiler gave a bump to the inline-threshold
when the total size of the allocas passed as arguments to the
callee was below 256 bytes.
This heuristic ignores that some of these allocas could have been removed
by SROA if inlining was applied.

Ideally, this bonus would be attributed to the threshold once the
size of all the allocas that could not be handled by SROA is known:
at the end of the InlineCost analysis.
However, we may never reach this point if the inline-cost analysis exits
early when the inline cost goes over the threshold mid-analysis.

This patch proposes:
* Attribute the bonus in the inline-threshold when allocas are passed
  as arguments (regardless of their total size).
* Assign a cost to each alloca proportional to its size,
  such that the cost of all the allocas cancels the bonus.
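
The arithmetic behind the proposal can be sketched as follows. This is an illustrative Python model only: the function names, the bonus value used in any call, and the exact proportional formula are assumptions, since the message states only that each cost is proportional to size and that the costs sum to the bonus.

```python
def alloca_costs(sizes, bonus):
    """Split `bonus` across allocas in proportion to their sizes."""
    total = sum(sizes)
    if total == 0:
        return [0.0 for _ in sizes]
    return [bonus * size / total for size in sizes]


def net_threshold_gain(sizes, removed_by_sroa, bonus):
    """Bonus left over after charging for allocas that SROA did not remove."""
    costs = alloca_costs(sizes, bonus)
    remaining = sum(cost for cost, removed in zip(costs, removed_by_sroa)
                    if not removed)
    return bonus - remaining
```

If no alloca is removed, the costs cancel the bonus entirely; each alloca that SROA does remove gives back its share.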

Potential problems:
* This patch assumes that removing alloca instructions with SROA is
  always profitable. This may not be the case if the total size of the
  allocas is still too big to be promoted to registers/LDS.
* Redundant calls to getTotalAllocaSize
* Awkwardly, the threshold attributed contributes to the single-bb and
  vector bonus.

Reviewed By: scchan

Differential Revision: https://reviews.llvm.org/D149741
2023-06-29 09:49:16 +02:00
Arthur Eubanks
ff4fcbb5f4 [test] Add test for null_pointer_is_valid and Inliner instsimplify interaction
As requested in D151254

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D153435
2023-06-21 14:00:53 -07:00
Nikita Popov
650041a7f1 [Inline] Convert tests to opaque pointers (NFC) 2023-06-21 11:32:45 +02:00
Nikita Popov
4c51f0dee5 [Inline] Regenerate test checks (NFC) 2023-06-21 11:32:45 +02:00
Arthur Eubanks
f4f826bcd4 Revert "Revert "ValueTracking: Fix nan result handling for fmul""
This reverts commit 464dcab8a6c823c9cb462bf4107797b8173de088.

Going to fix forward size regression instead due to more dependent patches needing to be reverted otherwise.
2023-06-16 13:53:32 -07:00
Arthur Eubanks
3e39cfe5b4 Revert "Revert "InstSimplify: Require instruction be parented""
This reverts commit 0c03f48480f69b854f86d31235425b5cb71ac921.

Going to fix forward size regression instead due to more dependent patches needing to be reverted otherwise.
2023-06-16 13:53:31 -07:00
Arthur Eubanks
0c03f48480 Revert "InstSimplify: Require instruction be parented"
This reverts commit 1536e299e63d7788f38117b0212ca50eb76d7a3b.

Causes large binary size regressions, see comments on https://reviews.llvm.org/rG1536e299e63d7788f38117b0212ca50eb76d7a3b.
2023-06-16 11:24:29 -07:00
Arthur Eubanks
464dcab8a6 Revert "ValueTracking: Fix nan result handling for fmul"
This reverts commit a632ca4b00279baf18e72a171ec0ce526e9d80aa.

Dependent commit to be reverted
2023-06-16 11:24:28 -07:00
Alan Zhao
d6b4f6786b Revert "Revert "InstSimplify: Require instruction be parented""
This reverts commit 00264eac4d0938ae8a0826da38e4777be269124c.

Reason: caused a bunch of bots to break
2023-06-16 10:58:54 -07:00
Alan Zhao
00264eac4d Revert "InstSimplify: Require instruction be parented"
This reverts commit 1536e299e63d7788f38117b0212ca50eb76d7a3b.

Reason: causes a regression in the inliner (see https://crbug.com/1454531 and https://reviews.llvm.org/rG1536e299e63d7788f38117b0212ca50eb76d7a3b#1217141)
2023-06-16 10:36:49 -07:00
Matt Arsenault
a632ca4b00 ValueTracking: Fix nan result handling for fmul
This was mishandling maybe 0 * inf.

Fixes issue #63316
2023-06-15 09:35:12 -04:00
Matt Arsenault
19293b82c1 Inline: Fix case of not inlining with denormal-fp-math-f32
This was failing to inline the opencl libraries with daz enabled. As a
modifier to the base mode, denormal-fp-mode-f32 is weird and has no
meaning if it's missing.
2023-06-09 19:09:48 -04:00
Matt Arsenault
d0b9cb1f65 AMDGPU: Add inlining testcases for denormal-fp-math
Somehow missed this one and it's not working correctly
2023-06-09 19:09:48 -04:00
Kazu Hirata
d6f994acb3 [InlineCost] Check for conflicting target attributes early
When we inline a callee into a caller, the compiler needs to make sure
that the caller supports a superset of instruction sets that the
callee is allowed to use.  Normally, we check for the compatibility of
target features via functionsHaveCompatibleAttributes, but that
happens after we decide to honor call site attribute
Attribute::AlwaysInline.  If the caller contains a call marked with
Attribute::AlwaysInline, which can happen with
__attribute__((flatten)) placed on the caller, the caller could end up
with code that cannot be lowered to assembly code.

This patch fixes the problem by checking the target feature
compatibility before we honor Attribute::AlwaysInline.
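
The ordering described here can be sketched in Python. This is a simplified model, not the InlineCost code: target features are represented as plain sets of strings, and the function name and return values are hypothetical.

```python
def inline_decision(caller_features, callee_features, always_inline):
    """Check target-feature compatibility first, then honour always-inline."""
    # The caller must support a superset of the callee's instruction sets.
    if not set(callee_features) <= set(caller_features):
        return "never"
    if always_inline:
        return "always"
    return "cost-based"
```

With the check placed first, an always-inline call whose callee needs a feature the caller lacks is rejected rather than inlined.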

Fixes https://github.com/llvm/llvm-project/issues/62664

Differential Revision: https://reviews.llvm.org/D150396
2023-06-02 16:00:47 -07:00
Matt Arsenault
1536e299e6 InstSimplify: Require instruction be parented
Unlike every other analysis and transform, simplifyInstruction
permitted operating on instructions which are not inserted
into a function. This created an edge case no other code needs
to really worry about, and limited transforms in cases that
can make use of the context function. Only the inliner and a handful
of other utilities were making use of this, so just fix up these
edge cases. Results in some IR ordering differences since
cloned blocks are inserted eagerly now. Plus some additional
simplifications trigger (e.g. some add 0s now folded out that
previously didn't).
2023-06-02 18:14:28 -04:00
Arthur Eubanks
aceaea6784 [Inliner] Mark inlinings stopped with inlining history as noinline
The inline history makes sure that we don't keep inlining due to mutual devirtualization. But this gets forgotten between inliner invocations.

So mark the inlined calls as noinline so we respect previous inline history decisions.
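
A toy model of the idea, in Python. The call representation, `visit_call`, and its return values are hypothetical; the point is only that the `noinline` mark persists on the call after the in-memory history has been discarded.

```python
def visit_call(call, history):
    """call: dict with 'callee' and 'noinline' keys.
    history: names of functions already inlined to produce this call."""
    if call["noinline"]:
        return "skipped"
    if call["callee"] in history:
        # Record the decision on the call itself so that a later inliner
        # invocation, which starts with an empty history, still respects it.
        call["noinline"] = True
        return "blocked"
    return "candidate"
```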

This overlaps with D121084, but they're not redundant since we may not inline completely through a child SCC, but we still want a cost multiplier when that happens.

See discussions in D145516.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D150989
2023-05-25 09:55:53 -07:00
Denis Antrushin
291223409c [InlineCost] Consider branches with !make.implicit metadata as free.
!make.implicit metadata attached to branch means it will very likely
be eliminated (together with associated cmp instruction).

Reviewed By: apilipenko

Differential Revision: https://reviews.llvm.org/D149747
2023-05-25 18:43:16 +03:00
Matt Arsenault
ca6aa47585 Inline: Convert test to generated checks 2023-05-24 15:40:56 +01:00
Matt Arsenault
abf1abbfbe Inline: Convert test to generated checks 2023-05-24 08:49:04 +01:00
Arthur Eubanks
94063cac47 [test] Make mut-rec-scc.ll a bit more robust
By adding noinline

Also make the SCC have 3 functions to prevent test changes with an upcoming change.
2023-05-19 12:25:44 -07:00
Matt Arsenault
4130ccc8be ValueTracking: Check context instruction is in a function 2023-05-18 14:40:13 +01:00
Matt Arsenault
f42136d4d6 ValueTracking: Check instruction is in a parent in computeKnownFPClass
For some reason the inliner calls simplifyInstruction with disembodied
instructions. I consider this to be an API defect. Either the instruction
should always be inserted prior to simplification, or we at least
should pass in the new function for the context.
2023-05-18 12:21:47 +01:00
Tobias Hieta
f84bac329b
[NFC][Py Reformat] Reformat lit.local.cfg python files in llvm
This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0
since I forgot the lit.local.cfg files in that one.

Reformatting is done with `black`.

If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.

If you run into any problems, post to discourse about it and
we will try to help.

RFC Thread below:

https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style

Reviewed By: barannikov88, kwk

Differential Revision: https://reviews.llvm.org/D150762
2023-05-17 17:03:15 +02:00
Tobias Hieta
b71edfaa4e
[NFC][Py Reformat] Reformat python files in llvm
This is the first commit in a series that will reformat
all the python files in the LLVM repository.

Reformatting is done with `black`.

See more information here:

https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style

Reviewed By: jhenderson, JDevlieghere, MatzeB

Differential Revision: https://reviews.llvm.org/D150545
2023-05-17 10:48:52 +02:00