672 Commits

Author SHA1 Message Date
Matt Arsenault
287c69b469
Attributor: Add denormal-fp-math to attributor-light (#79576) 2026-02-03 08:57:03 +01:00
Shilei Tian
09eea2256e
[Attributor] Check range size before constant fold load (#151359)
If the range size doesn't match the type size, it might read wrong data.
2025-10-25 10:36:31 -04:00
Jeremy Morse
57a5f9c47e
[DebugInfo][RemoveDIs] Suppress getNextNonDebugInfoInstruction (#144383)
There are no longer debug-info instructions, thus we don't need this
skipping. Hooray!
2025-07-15 15:34:10 +01:00
Andreas Jonson
0a067dc107
[Attributor] Swap range metadata to attribute for calls. (#108835) 2025-07-05 16:47:03 +02:00
zGoldthorpe
f393211454
[Reland][IPO] Added attributor for identifying invariant loads (#146584)
Patched and tested the `AAInvariantLoadPointer` attributor from #141800,
which identifies pointers whose loads are eligible to be marked as
`!invariant.load`.

The bug in the attributor was due to `AAMemoryBehavior` always
identifying pointers obtained from `alloca`s as having no writes. I'm
not entirely sure why `AAMemoryBehavior` behaves this way, but it seems
to be because it identifies the scope of an `alloca` to be limited to
only that instruction (and, certainly, no memory writes occur within the
`alloca` instruction). This patch just adds a check to disallow all loads
from `alloca` pointers from being marked `!invariant.load` (since any
well-defined program will have to write to stack pointers at some
point).
2025-07-01 17:46:19 -04:00
zGoldthorpe
00ae89a1cb
Revert "[IPO] Added attributor for identifying invariant loads" (#144808)
Reverts llvm/llvm-project#141800

The implementation critically misunderstands the `AAMemoryBehavior`
attributor, which it relies on heavily.

@shiltian, since I do not have commit permissions.
2025-06-18 18:35:01 -04:00
zGoldthorpe
25dcd231bf
[IPO] Added attributor for identifying invariant loads (#141800)
The attributor conservatively marks pointers whose loads are eligible to
be marked as `!invariant.load`.
It does so by identifying:
1. Pointers marked `noalias` and `readonly`
2. Pointers whose underlying objects are all eligible for invariant
loads.

The attributor then manifests this attribute at non-atomic non-volatile
load instructions.
2025-06-16 11:16:47 -05:00
Nikita Popov
3824a2dbce [MemoryBuiltins] Support allocas in getInitialValueOfAllocation (NFC) 2025-06-16 11:52:16 +02:00
Jeremy Morse
97ac6483aa
[DebugInfo][RemoveDIs] Delete debug-info-format flag (#143746)
This flag was used to let us incrementally introduce debug records
into LLVM, however everything is now using records. It serves no
purpose now, so delete it.
2025-06-12 11:51:58 +01:00
Kazu Hirata
92cebab210
[IPO] Teach AbstractAttribute::getName to return StringRef (NFC) (#141313)
This patch addresses clang-tidy's readability-const-return-type by
dropping const from the return type while switching to StringRef at
the same time because these functions just return string constants.
2025-05-23 23:58:49 -07:00
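A minimal sketch of the return-type change described above, assuming `std::string_view` as a standard-library stand-in for `llvm::StringRef`. The struct `ExampleAttribute` is hypothetical and is not LLVM's `AbstractAttribute`.

```cpp
#include <string_view>

// Hypothetical stand-in for an abstract attribute; not LLVM's actual class.
struct ExampleAttribute {
  // A `const`-qualified by-value return type is what clang-tidy's
  // readability-const-return-type check flags. Because the function only
  // returns a string constant, a non-owning view is enough.
  std::string_view getName() const { return "ExampleAttribute"; }
};
```

The view points at a string literal with static storage duration, so it stays valid after the call returns.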
Shilei Tian
d2992423e3
[Attributor] Don't replace addrspacecast (ptr null to ptr addrspace(x)) with ptr addrspace(x) null (#126779)
`ConstantPointerNull` represents a pointer with value 0, but it doesn’t
necessarily mean a `nullptr`. `ptr addrspace(x) null` is not the same as
`addrspacecast (ptr null to ptr addrspace(x))` if the `nullptr` in AS X
is not zero. Therefore, we can't simply replace it.

Fixes #115083.
2025-05-20 18:08:42 -04:00
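The distinction can be sketched with plain integers. This assumes a hypothetical target whose address space X uses an all-ones bit pattern for its null pointer; the constant and function names below are illustrative only and are not LLVM APIs.

```cpp
#include <cstdint>

// Assumption: in this hypothetical address space X, null is all-ones.
constexpr std::uint32_t NullInAddrSpaceX = 0xFFFFFFFFu;

// Models `ptr addrspace(x) null`: the pointer whose value is literally 0.
constexpr std::uint32_t zeroPointerInX() { return 0u; }

// Models `addrspacecast (ptr null to ptr addrspace(x))`: the cast maps the
// generic null pointer to the null representation of address space X.
constexpr std::uint32_t castGenericToX(std::uint64_t GenericPtr) {
  return GenericPtr == 0 ? NullInAddrSpaceX
                         : static_cast<std::uint32_t>(GenericPtr);
}
```

Under this assumption `castGenericToX(0)` and `zeroPointerInX()` differ, which is why folding one into the other would change the pointer value.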
Kazu Hirata
806a79abd0
[llvm] Drop "const" from "const ArrayRef" (NFC) (#138818) 2025-05-07 09:53:20 -07:00
Nikita Popov
4109bac330
[IR] Do not store Function inside BlockAddress (#137958)
Currently BlockAddresses store both the Function and the BasicBlock they
reference, and the BlockAddress is part of the use list of both the
Function and BasicBlock.

This is quite awkward, because this is not really a use of the function
itself (and walks of function uses generally skip block addresses for
that reason). This also has weird implications on function RAUW (as that
will replace the function in block addresses in a way that generally
doesn't make sense), and causes other peculiar issues, like the ability
to have multiple block addresses for one block (with different
functions).

Instead, I believe it makes more sense to specify only the basic block
and let the function be implied by the BB parent. This does mean that we
may have block addresses without a function (if the BB is not inserted),
but this should only happen during IR construction.
2025-05-02 09:40:50 +02:00
Matt Arsenault
cfc035a2b1
Attributor: Use use_empty instead of getNumUses == 0 (#136339) 2025-04-18 21:14:13 +02:00
Matt Arsenault
783201b184
Attributor: Don't follow uses of ConstantData (#134573)
These should not really have uselists, and it's not worth the compile
time of looking at all uses of trivial constants. The main observable
change of this is it no longer adds align attributes on constant null
uses, but those are not useful. Some of these cases should potentially
be more aggressive and not look at any Constant users.
2025-04-07 23:59:53 +07:00
Nick Sarnie
48b7530273
[clang][flang][Triple][llvm] Add isOffload function to LangOpts and isGPU function to Triple (#126956)
I'm adding support for SPIR-V, so let's consolidate these checks.

---------

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-03-28 14:19:20 +00:00
Kazu Hirata
0dcc201ac4
[Transforms] Use *Set::insert_range (NFC) (#132056)
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range.  This patch replaces:

  Dest.insert(Src.begin(), Src.end());

with:

  Dest.insert_range(Src);

This patch does not touch custom begin like succ_begin for now.
2025-03-19 15:35:01 -07:00
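The two insertion forms can be compared in a small sketch. `ExampleSet` is a hypothetical wrapper around `std::set`, used here so the example builds without LLVM headers; it only mirrors the shape of the two calls and is not one of the LLVM set types.

```cpp
#include <iterator>
#include <set>
#include <vector>

// Hypothetical stand-in for an LLVM set type; only the two insertion forms
// matter here.
template <typename T> struct ExampleSet {
  std::set<T> Storage;

  // Old form: the caller spells out the iterator pair.
  template <typename It> void insert(It Begin, It End) {
    Storage.insert(Begin, End);
  }

  // New form: the caller passes the whole range.
  template <typename Range> void insert_range(const Range &R) {
    Storage.insert(std::begin(R), std::end(R));
  }
};

// Returns true if both forms produce the same set contents.
inline bool bothFormsAgree(const std::vector<int> &Src) {
  ExampleSet<int> A, B;
  A.insert(Src.begin(), Src.end());
  B.insert_range(Src);
  return A.Storage == B.Storage;
}
```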
Pierre van Houtryve
5470dffda2
[Attributor] Do not optimize away externally_initialized loads. (#128170)
Fixes SWDEV-515029
2025-03-03 14:58:47 +01:00
Johannes Doerfert
9f28621fae
[Attributor][NFC] Clang format (#129163) 2025-02-27 23:59:08 -05:00
Nikita Popov
29441e4f5f
[IR] Convert from nocapture to captures(none) (#123181)
This PR removes the old `nocapture` attribute, replacing it with the new
`captures` attribute introduced in #116990. This change is
intended to be essentially NFC, replacing existing uses of `nocapture`
with `captures(none)` without adding any new analysis capabilities.
Making use of non-`none` values is left for a followup.

Some notes:
* `nocapture` will be upgraded to `captures(none)` by the bitcode
   reader.
* `nocapture` will also be upgraded by the textual IR reader. This is to
   make it easier to use old IR files and somewhat reduce the test churn in
   this PR.
* Helper APIs like `doesNotCapture()` will check for `captures(none)`.
* MLIR import will convert `captures(none)` into an `llvm.nocapture`
   attribute. The representation in the LLVM IR dialect should be updated
   separately.
2025-01-29 16:56:47 +01:00
macurtis-amd
d1a6eaa478
[Attributor][NFC] Performance improvements (#122923)
`forallInterferingAccesses` is a hotspot, and for large modules these
changes make a measurable improvement in compilation time.

For LTO kernel compilation of 519.clvleaf (SPEChpc 2021) I measured the
following:
```
                    |   Measured times (s)   | Average | speedup
--------------------+------------------------+---------+---------
Baseline            | 33.268  33.332  33.275 |  33.292 |      0%
Cache "kernel"      | 30.543  30.339  30.607 |  30.496 |    9.2%
templatize callback | 30.981  30.97   30.964 |  30.972 |    7.5%
Both changes        | 29.284  29.201  29.053 |  29.179 |   14.1%
```
2025-01-14 12:51:25 -06:00
Kazu Hirata
98ea1a81a2
[IPO] Remove unused includes (NFC) (#114716)
Identified with misc-include-cleaner.
2024-11-03 13:48:55 -08:00
Shilei Tian
0b7a18bd4a
[Attributor] Use more appropriate approach to check flat address space (#108713) 2024-09-27 18:26:55 -04:00
Johannes Doerfert
56a033462e
[Attributor] Keep track of reached returns in AAPointerInfo (#107479)
Instead of visiting call sites in Attributor::checkForAllUses, we now
keep track of returns in AAPointerInfo and use the call site return
information as required. This way, the user of
AAPointerInfo(CallSite)Argument can determine if the call return should
be visited. We do not collect them as "may accesses" in the
AAPointerInfo(CallSite)Argument itself in case a return user is found.
2024-09-10 08:13:21 -07:00
Shilei Tian
1ca9fe6db3 Reapply "[Attributor][AMDGPU] Enable AAIndirectCallInfo for AMDAttributor (#100952)"
This reverts commit 36467bfe89f231458eafda3edb916c028f1f0619.
2024-08-14 17:16:47 -04:00
Shilei Tian
36467bfe89 Revert "Reapply "[Attributor][AMDGPU] Enable AAIndirectCallInfo for AMDAttributor (#100952)""
This reverts commit 7a68449a82ab1c1ab005caa72c1d986ca5deca36.

https://lab.llvm.org/buildbot/#/builders/123/builds/3205
2024-08-07 09:22:48 -04:00
Shilei Tian
7a68449a82 Reapply "[Attributor][AMDGPU] Enable AAIndirectCallInfo for AMDAttributor (#100952)"
This reverts commit 874cd100a076f3b98aaae09f90ef224682501538.
2024-08-06 22:46:32 -04:00
Shilei Tian
874cd100a0 Revert "[Attributor][AMDGPU] Enable AAIndirectCallInfo for AMDAttributor (#100952)"
This reverts commit ab819d7cf86932e4a47b5bf6aadea9d714a313a9.
2024-08-02 18:31:21 -04:00
Shilei Tian
ab819d7cf8
[Attributor][AMDGPU] Enable AAIndirectCallInfo for AMDAttributor (#100952) 2024-08-02 17:23:18 -04:00
NAKAMURA Takumi
c8f2ee77d2 Fix a warning in #98362 [-Wunused-but-set-variable] 2024-07-13 20:57:09 +09:00
Arthur Eubanks
58bc98cd3a
[CallGraphUpdater] Remove some legacy pass manager support (#98362)
We don't have any legacy pass manager CGSCC passes that modify the call
graph (we only use it in the codegen pipeline to run function passes in
call graph order). This is the beginning of removing CallGraphUpdater
and making all the relevant CGSCC passes directly use the new pass
manager APIs.
2024-07-12 10:02:50 -07:00
Kazu Hirata
4b28b3fae4
[Transforms] Use range-based for loops (NFC) (#97195) 2024-07-02 16:20:44 -07:00
Ethan Luis McDonough
b629d4b912
[Attributor] Prevent infinite loop in AAGlobalValueInfoFloating (#94941)
Global variables that reference themselves alongside a function that is
called indirectly can cause an infinite loop in
`AAGlobalValueInfoFloating`. The recursive reference is continually
pushed back into the workload, causing the attributor to hang
indefinitely.
2024-06-18 09:36:42 -07:00
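The commit message does not say how the loop was broken, so the following is only a generic sketch of one common technique: a worklist guarded by a visited set, so that a self-referencing node is never pushed twice. The `Graph` type and `countReachable` function are hypothetical and unrelated to the actual Attributor code.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <vector>

// Hypothetical reference graph: each node lists the nodes it references.
using Graph = std::map<int, std::vector<int>>;

// Counts the nodes reachable from Start. The Visited set ensures a node is
// pushed at most once, so a self-reference (or any other cycle) cannot keep
// refilling the worklist.
inline std::size_t countReachable(const Graph &G, int Start) {
  std::set<int> Visited{Start};
  std::vector<int> Worklist{Start};
  while (!Worklist.empty()) {
    int Node = Worklist.back();
    Worklist.pop_back();
    auto It = G.find(Node);
    if (It == G.end())
      continue;
    for (int Next : It->second)
      if (Visited.insert(Next).second) // true only on first insertion
        Worklist.push_back(Next);
  }
  return Visited.size();
}
```

Without the `Visited` check, a node that references itself would be pushed again on every iteration and the loop would never terminate.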
Jay Foad
d4a0154902
[llvm-project] Fix typo "seperate" (#95373) 2024-06-13 20:20:27 +01:00
Johannes Doerfert
5ec91b392d
[AttributorLight] Without liveness checks, look at all functions (#91004) 2024-05-23 07:28:07 +02:00
Jeremy Morse
2fe81edef6 [NFC][RemoveDIs] Insert instruction using iterators in Transforms/
As part of the RemoveDIs project we need LLVM to insert instructions using
iterators wherever possible, so that the iterators can carry a bit of
debug-info. This commit implements some of that by updating the contents of
llvm/lib/Transforms/Utils to always use iterator-versions of instruction
constructors.

There are two general flavours of update:
 * Almost all call-sites just call getIterator on an instruction
 * Several make use of an existing iterator (scenarios where the code is
   actually significant for debug-info)
The underlying logic is that any call to getFirstInsertionPt or similar
APIs that identify the start of a block need to have that iterator passed
directly to the insertion function, without being converted to a bare
Instruction pointer along the way.

Noteworthy changes:
 * FindInsertedValue now takes an optional iterator rather than an
   instruction pointer, as we need to always insert with iterators,
 * I've added a few iterator-taking versions of some value-tracking and
   DomTree methods -- they just unwrap the iterator. These are purely
   convenience methods to avoid extra syntax in some passes.
 * A few calls to getNextNode become std::next instead (to keep in the
   theme of using iterators for positions),
 * SeparateConstOffsetFromGEP has its insertion-position field changed.
   Noteworthy because it's not a purely localised spelling change.

All this should be NFC.
2024-03-05 15:12:22 +00:00
Björn Pettersson
7677453886
[ConstantFolding] Do not consider padded-in-memory types as uniform (#81854)
Teaching ConstantFoldLoadFromUniformValue that types that are padded in
memory can't be considered as uniform.

Using the big hammer to prevent optimizations when loading from a
constant for which DataLayout::typeSizeEqualsStoreSize would return
false.

Main problem solved would be something like this:
  store i17 -1, ptr %p, align 4
  %v = load i8, ptr %p, align 1
If for example the i17 occupies 32 bits in memory, then LLVM IR doesn't
really tell where the padding goes. And even if we assume that the 15
most significant bits are padding, then they should be considered as
undefined (even if LLVM backend typically would pad with zeroes).
Anyway, for a big-endian target the load would read those most
significant bits, which aren't guaranteed to be ones. So it would be
wrong to constant fold the load as returning -1.

If LLVM IR had been more explicit about the placement of padding, then
we could allow the constant fold of the load in the example, but only
for little-endian.

Fixes: https://github.com/llvm/llvm-project/issues/81793
2024-02-15 15:40:21 +01:00
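The endianness argument can be checked with a byte-level sketch. It assumes, for illustration only, that the `i17` occupies a 4-byte slot, that the 15 padding bits are the most significant ones, and that they are filled with zeroes; as the message notes, LLVM IR guarantees none of this. The function names are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Models `store i17 -1, ptr %p` into a 4-byte slot under the assumptions
// above: 17 one bits, with the upper 15 bits being zero-filled padding.
inline std::array<std::uint8_t, 4> storeI17AllOnes(bool BigEndian) {
  const std::uint32_t Bits = 0x1FFFFu;
  std::array<std::uint8_t, 4> Mem{};
  for (std::size_t I = 0; I < 4; ++I) {
    std::uint8_t Byte = static_cast<std::uint8_t>(Bits >> (8 * I));
    // Little-endian puts the least significant byte at offset 0,
    // big-endian puts it at offset 3.
    Mem[BigEndian ? 3 - I : I] = Byte;
  }
  return Mem;
}

// Models `load i8, ptr %p`: read the byte at offset 0.
inline std::uint8_t loadI8(const std::array<std::uint8_t, 4> &Mem) {
  return Mem[0];
}
```

With these assumptions the little-endian load sees `0xFF` (that is, -1 as an `i8`), while the big-endian load sees a padding byte, so folding the load to -1 would be wrong there.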
Jeremy Morse
0065d06760
[NFC][DebugInfo] Maintain RemoveDIs flag when attributor creates functions (#79143)
We're using this flag (IsNewDbgInfoFormat) to detect the boundaries in
LLVM of what's treating debug-info as intrinsics (i.e. dbg.value), and
what's using DPValue objects (the non-intrinsic replacement). The
attributor tends to create new wrapper functions and doesn't insert them
into Modules in the usual way, thus we have to manually update that flag
to signal what debug-info mode it's using.

I've added some --try-experimental-debuginfo-iterators RUN lines to
tests that would otherwise crash because of this, so that they're
exercised by our new-debuginfo-iterators buildbot.

NB: there's an attributor test with a dbg.value in it, however
attributes re-order themselves in RemoveDIs mode for various reasons, so
we're going to address that in a different patch.
2024-01-24 15:20:05 +00:00
Vidhush Singhal
754b93e466
[Attributor] New attribute to identify what byte ranges are alive for an allocation (#66148)
Changes the size of allocations automatically.
For now, implements the case when a single range from start of the
allocation is alive and the allocation can be reduced.
2023-11-10 16:26:37 -08:00
Nikita Popov
16a595e398 [Attributor] Avoid use of ConstantExpr::getFPTrunc() (NFC)
Use the constant folding API instead. For simplicity I'm using
the DL-independent API here.
2023-11-06 15:27:01 +01:00
Johannes Doerfert
499fb1b8d8 [Attributor][FIX] Interposable constants cannot be propagated 2023-10-20 19:28:09 -07:00
Johannes Doerfert
73a836a464
[Attributor] Look through indirect calls (#65197)
Through the new `Attributor::checkForAllCallees` we can look through
indirect calls and visit all potential callees if they are known. Most
AAs will do that implicitly now via `AACalleeToCallSite`, thus, most AAs
are able to deal with missing callees for call site IR positions.

Differential Revision: https://reviews.llvm.org/D112290
2023-09-08 12:14:38 -07:00
Johannes Doerfert
209496b766 [Core] Allow hasAddressTaken to ignore "casted direct calls"
A direct call to a function casted to a different type is still not
really an address taken event. We allow the user to opt out of these
now.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D159149
2023-08-31 19:32:52 -07:00
Matt Arsenault
9536bbe464 Attributor: Don't pass ArrayRef by const reference 2023-08-31 08:41:08 -04:00
Matt Arsenault
850ec7bbb1 Attributor: Try to propagate concrete denormal-fp-math{-f32}
Allow specialization of functions with "dynamic" denormal modes to a
known IEEE or DAZ mode based on callers. This should make it possible
to implement a is-denormal-flushing-enabled test using
llvm.canonicalize and have it be free after LTO.

https://reviews.llvm.org/D156129
2023-08-31 08:26:32 -04:00
Johannes Doerfert
498887ae8a [Attributor] Introduce the closed world flag
The Attributor user can now set the closed world flag
(`AttributorConfig.IsClosedWorldModule` or
`-attributor-assume-closed-world`) in order to specialize call edges
based only on available callees. That means, we assume all functions are
known and hence all potential callees must be declared/defined in the
module. We will use this for GPUs and LTO cases, but for now the user
has to set it via a flag.
2023-08-29 22:35:17 -07:00
Johannes Doerfert
936661084c [Attributor][NFC] Add querying AA to shouldSpecializeCallSiteForCallee
The callback might require an AA, e.g., to ask other AAs for information
in a way that will enforce dependences.
2023-08-29 22:35:16 -07:00
Johannes Doerfert
d0b5523632 [Attributor] Introduce limit for indirect call specialization
The user can now limit the number of indirect calls specialized for a
given call site with `-attributor-max-specializations-per-call-base=N`
or the AttributorConfig callback. We further attach the `!callee`
metadata if all remaining callees are known.
2023-08-25 14:36:42 -07:00
Johannes Doerfert
9c08e76f3e [Attributor] Introduce AAIndirectCallInfo
AAIndirectCallInfo will collect information and specialize indirect call
sites. It is similar to our IndirectCallPromotion but runs as part of
the Attributor (so with assumed callee information). It also expands
more calls and lets the rest of the pipeline figure out what is UB, for
now. We use existing call promotion logic to improve the result,
otherwise we rely on the (implicit) function pointer cast.

This effectively "fixes" #60327 as it will undo the type punning early
enough for the inliner to work with the (now specialized, thus direct)
call.

Fixes: https://github.com/llvm/llvm-project/issues/60327
2023-08-18 16:44:05 -07:00
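At the source level, specializing an indirect call site looks roughly like the sketch below. It assumes an analysis has found two potential callees; `addOne`, `timesTwo`, and both wrapper functions are hypothetical, and the real transformation works on LLVM IR rather than C++.

```cpp
// Hypothetical callees; assume analysis found these as potential targets.
inline int addOne(int X) { return X + 1; }
inline int timesTwo(int X) { return X * 2; }

using Callee = int (*)(int);

// Before: an opaque indirect call the inliner cannot see through.
inline int callIndirect(Callee FP, int X) { return FP(X); }

// After specialization: compare the pointer against each known callee and
// call it directly, keeping the indirect call as the fallback.
inline int callSpecialized(Callee FP, int X) {
  if (FP == &addOne)
    return addOne(X); // direct call, now visible to the inliner
  if (FP == &timesTwo)
    return timesTwo(X);
  return FP(X); // remaining, still-indirect case
}
```

Both wrappers return the same value for any callee; the specialized form only exposes the direct calls to later passes.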
Johannes Doerfert
dfc821ae89 [OpenMPOpt][FIX] Ensure a dependence for KernelEnvC queries
When other AAs query the current value of KernelEnvC via the callback
KernelConfigurationSimplifyCB we need to ensure they are now dependent
on the AAKernelInfo that is in charge of the KernelEnvC.
2023-08-10 23:16:25 -07:00