776 Commits

Author SHA1 Message Date
Owen Anderson
69e4514978
[GlobalOpt] Do not fold away addrspacecasts which may be runtime operations (#153753)
Specifically in the context of the once-stored transformation, GlobalOpt
would strip
all pointer casts unconditionally, even though addrspacecasts might be
runtime operations.
This manifested particularly on CHERI targets.

This patch was inspired by an existing change in CHERI LLVM
(91afa60f17),
but has been reimplemented with updated conventions, and a testcase
constructed from scratch.
2025-08-18 02:11:51 +00:00
Alexandros Lamprineas
3ab64c5b29
[NFC][Clang][FMV] Make FMV priority data type future proof. (#150079)
FMV priority is the returned value of a polymorphic function. On RISC-V
and X86 targets a 32-bit value is enough. On AArch64 we currently need
64 bits and we will soon exceed that. APInt seems to be a suitable
replacement for uint64_t, presumably with minimal compile time overhead.
It allows bit manipulation, comparison and variable bit width.
2025-07-23 10:37:29 +01:00
Jeremy Morse
57a5f9c47e
[DebugInfo][RemoveDIs] Suppress getNextNonDebugInfoInstruction (#144383)
There are no longer debug-info instructions, thus we don't need this
skipping. Horray!
2025-07-15 15:34:10 +01:00
Nikita Popov
183acdd279
[GlobalOpt] Revert global widening transform (#144652)
Partially reverts e37d736def5b95a2710f92881b5fc8b0494d8a05.

The transform has a number of correctness and code quality issues, and
will benefit from a from-scratch re-review more than incremental fixes.

The correctness issues are hinted at in
https://github.com/llvm/llvm-project/pull/144641, but I think it needs a
larger rework to stop working on ArrayTypes and the implementation could
use some other improvements (like callInstIsMemcpy should just be
`dyn_cast<MemCpyInst>`). I can comment in more detail on a resubmission
of the patch.
2025-06-30 14:48:37 +02:00
Kazu Hirata
2a35414e98
[Transforms] Use range-based for loops (NFC) (#145252)
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2025-06-25 10:08:26 -07:00
Craig Topper
255b55c602
[GlobalOpt] Use cast instead of dyn_cast. NFC (#144634)
The dyn_cast was not checked for null, and the cast is guaranteed to
succeed by an earlier check.
2025-06-18 01:35:56 -07:00
Stephen Tozer
aa8a1fa6f5
[DLCov][NFC] Annotate intentionally-blank DebugLocs in existing code (#136192)
Following the work in PR #107279, this patch applies the annotative
DebugLocs, which indicate that a particular instruction is intentionally
missing a location for a given reason, to existing sites in the compiler
where their conditions apply. This is NFC in ordinary LLVM builds (each
function `DebugLoc::getFoo()` is inlined as `DebugLoc()`), but marks the
instruction in coverage-tracking builds so that it will be ignored by
Debugify, allowing only real errors to be reported. From a developer
standpoint, it also communicates the intentionality and reason for a
missing DebugLoc.

Some notes for reviewers:

- The difference between `I->dropLocation()` and
`I->setDebugLoc(DebugLoc::getDropped())` is that the former _may_ decide
to keep some debug info alive, while the latter will always be empty; in
this patch, I always used the latter (even if the former could
technically be correct), because the former could result in some
(barely) different output, and I'd prefer to keep this patch purely NFC.
- I've generally documented the uses of `DebugLoc::getUnknown()`, with
the exception of the vectorizers - in summary, they are a huge cause of
dropped source locations, and I don't have the time or the domain
knowledge currently to solve that, so I've plastered it all over them as
a form of "fixme".
2025-06-11 17:42:10 +01:00
Nikita Popov
4109bac330
[IR] Do not store Function inside BlockAddress (#137958)
Currently BlockAddresses store both the Function and the BasicBlock they
reference, and the BlockAddress is part of the use list of both the
Function and BasicBlock.

This is quite awkward, because this is not really a use of the function
itself (and walks of function uses generally skip block addresses for
that reason). This also has weird implications on function RAUW (as that
will replace the function in block addresses in a way that generally
doesn't make sense), and causes other peculiar issues, like the ability
to have multiple block addresses for one block (with different
functions).

Instead, I believe it makes more sense to specify only the basic block
and let the function be implied by the BB parent. This does mean that we
may have block addresses without a function (if the BB is not inserted),
but this should only happen during IR construction.
2025-05-02 09:40:50 +02:00
Nikita Popov
57530c23a5
[GlobalOpt] Do not promote malloc if there are atomic loads/stores (#137158)
When converting a malloc stored to a global into a global, we will
introduce an i1 flag to track whether the global has been initialized.

In case of atomic loads/stores, this will result in verifier failures,
because atomic ops on i1 are illegal. Even if we changed this to i8, I
don't think it is a good idea to change atomic types in that way.

Instead, bail out of the transform is we encounter any atomic
loads/stores of the global.

Fixes https://github.com/llvm/llvm-project/issues/137152.
2025-04-24 15:15:47 +02:00
Stephen Tozer
224cd50e00
[DebugInfo][GlobalOpt] Preserve source locs for optimized loads (#134828)
Some optimizations in globalopt simplify uses of a global value to uses
of a generated global bool value; in some cases where this happens, the
newly-generated instructions would not have the original source
location(s) of the instructions they replaced propagated to them; this
patch properly preserves those source locations.

Found using https://github.com/llvm/llvm-project/pull/107279.
2025-04-24 14:09:53 +01:00
Kazu Hirata
cde58bfc16
[Transforms] Use range constructors of *Set (NFC) (#133203) 2025-03-27 07:51:58 -07:00
Eli Friedman
e16e93a4c7
Revert "[GlobalOpt] Handle operators separately when removing GV users" (#132971)
Reverts llvm/llvm-project#84694 .  Review was incomplete.
2025-03-25 11:20:14 -07:00
Anshil Gandhi
51dad714e8
[GlobalOpt] Handle operators separately when removing GV users (#84694)
Refactor globalopt by eliminating redundant code. Fix
https://github.com/llvm/llvm-project/issues/64680.
2025-03-25 11:14:33 -07:00
Craig Topper
1761066fc6
[GlobalOpt] Remove Function* argument from tryWidenGlobalArrayAndDests. NFC (#127848)
This is only used to get the Module and the LLVMContext. We can get both
of those from the GlobalVariable*.
2025-02-19 12:37:54 -08:00
Craig Topper
2bf473bd54
[GlobalOpt] Don't query TTI on a llvm.memcpy declaration. (#127760)
Querying TTI creates a Subtarget object, but an llvm.memcpy declaration
doesn't have target-cpu and target-feature attributes like functions
with definitions. This can cause a warning to be printed on RISC-V
because the target-abi in the Module requires floating point, but the
subtarget features don't enable floating point. So far we've only seen
this in LTO when an -mcpu is not supplied for the TargetMachine.

To fix this, get TTI for the calling function instead.

Fixes the issue reported here
https://github.com/llvm/llvm-project/issues/69780#issuecomment-2665273161
2025-02-19 10:17:07 -08:00
Alexandros Lamprineas
1b1270f30b
[FMV][GlobalOpt] Enable static resolution of non-FMV callers. (#124314)
The undetectable FMV features predres and ls64 have been removed,
therefore the optimization is now re-enabled. The llvm testsuite
Graviton4 bots are expected to remain green.
2025-01-24 19:48:40 +00:00
Alexandros Lamprineas
cf6d79ad6e
[FMV][GlobalOpt] Add an option for static resolution of non-FMV callers. (#123757)
Adds `optimize-non-fmv-callers` (disabled by default) as a short term
workaround to keep the llvm testsuite bots green, until we decide what
is the right solution for the problem which was previously addressed
with https://github.com/llvm/llvm-project/pull/123383.
2025-01-21 17:02:52 +00:00
Alexandros Lamprineas
dce5d1fcb9
[FMV][GlobalOpt] Do not statically resolve non-FMV callers. (#123383)
This fixes a runtime regression in the llvm testsuite:

https://lab.llvm.org/buildbot/#/builders/198/builds/1237

On clang-aarch64-sve2-vla:

predres
        FAIL

A 'predres' version is unexpectedly trapping on GravitonG4. My
explanation is that when the caller in not a versioned function, the
compiler exclusively relies on the command line option, or target
attribute to deduce whether a feature is available. However, there is no
guarantee that in reality the host supports those implied features.

This is a quickfix. We may rather change the mcpu option in the llvm
testsuite build instead.
2025-01-17 20:33:11 +00:00
Alexandros Lamprineas
831527a5ef
[FMV][GlobalOpt] Statically resolve calls to versioned functions. (#87939)
To deduce whether the optimization is legal we need to compare the target
features between caller and callee versions. The criteria for bypassing
the resolver are the following:

 * If the callee's feature set is a subset of the caller's feature set,
   then the callee is a candidate for direct call.

 * Among such candidates the one of highest priority is the best match
   and it shall be picked, unless there is a version of the callee with
   higher priority than the best match which cannot be picked from a
   higher priority caller (directly or through the resolver).

 * For every higher priority callee version than the best match, there
   is a higher priority caller version whose feature set availability
   is implied by the callee's feature set.

Example:

Callers and Callees are ordered in decreasing priority.
The arrows indicate successful call redirections.

  Caller        Callee      Explanation
=========================================================================
mops+sve2 --+--> mops       all the callee versions are subsets of the
            |               caller but mops has the highest priority
            |
     mops --+    sve2       between mops and default callees, mops wins

      sve        sve        between sve and default callees, sve wins
                            but sve2 does not have a high priority caller

  default -----> default    sve (callee) implies sve (caller),
                            sve2(callee) implies sve (caller),
                            mops(callee) implies mops(caller)
2025-01-17 10:49:43 +00:00
Ryan Mansfield
67efbd0bf1
[LLVM] Fix various cl::desc typos and whitespace issues (NFC) (#121955) 2025-01-08 11:07:23 +01:00
Owen Anderson
14a259f85b
GlobalOpt: Use the correct address space when creating a "*.init" global. (#118562) 2024-12-04 14:01:16 +13:00
Bruno De Fraine
20d8f8ca1a
[GlobalOpt] Fix global SRA incorrect alignment on some elements (#115328)
The logic had a flaw where the alignment from the original aggregate is
unintentionally retained for elements when the calculated known
alignment is not higher than the element's ABI type alignment.

Fixes #115282.
2024-11-18 10:49:50 +01:00
Nashe Mncube
e37d736def
Recommit: [llvm][ARM][GlobalOpt]Add widen global arrays pass (#113289)
This is a recommit of #107120 . The original PR was approved but failed
buildbot. The newly added tests should only be run for compilers that
support the ARM target. This has been resolved by adding a config file
for these tests.

- Pass optimizes memcpy's by padding out destinations and sources to a
  full word to make ARM backend generate full word loads instead of
  loading a single byte (ldrb) and/or half word (ldrh). Only pads
  destination when it's a stack allocated constant size array and source
  when it's constant string. Heuristic to decide whether to pad or not
  is very basic and could be improved to allow more examples to be
  padded.
- Pass works at the midend level
2024-10-24 10:12:01 +01:00
Nashe Mncube
370fd74361
Revert "[llvm][ARM]Add widen global arrays pass" (#112701)
Reverts llvm/llvm-project#107120 

Unexpected build failures in post-commit pipelines. Needs investigation
2024-10-17 13:38:01 +01:00
Nashe Mncube
ab90d2793c
[llvm][ARM]Add widen global arrays pass (#107120)
- Pass optimizes memcpy's by padding out destinations and sources to a
full word to make backend generate full word loads instead of loading a
single byte (ldrb) and/or half word (ldrh). Only pads destination when
it's a stack allocated constant size array and source when it's constant
array. Heuristic to decide whether to pad or not is very basic and could
be improved to allow more examples to be padded.
- Pass works within GlobalOpt but is disabled by default on all targets
except ARM.
2024-10-17 11:56:00 +01:00
Yingwei Zheng
f364b2ee22
[LLVM] Don't peek through bitcast on pointers and gep with zero indices. NFC. (#102889)
Since we are using opaque pointers now, we don't need to peek through
bitcast on pointers and gep with zero indices.
2024-08-13 22:38:50 +08:00
ykhatav
46bf5d547f
Globalopt pass produces invalid debug info (#100654)
This patch fixes an issue in the GlobalOpt pass where deleting a global
variable fails to update the corresponding dbg.value and it references
an empty metadata entry. The SalvageDebugInfo() function has been
updated to handle dbg.value intrinsic when globals are deleted.
2024-08-08 09:07:39 -04:00
Nikita Popov
9df71d7673
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
2024-06-28 08:36:49 +02:00
Alexandros Lamprineas
7c946f04cf
[GlobalOpt] Don't resolve aliased ifuncs with undefined resolvees. (#96220)
Fixes https://github.com/llvm/llvm-project/issues/96197.

A global alias should always point to a definition. Ifuncs are
definitions, so far so good. However an ifunc may be statically resolved
to a function that is declared but not defined in the translation unit.

With this patch we perform static resolution if:
 * the resolvee is defined, else if
 * none of the ifunc users is a global alias
2024-06-21 09:33:41 +01:00
DianQK
c79690040a
[GlobalOpt] Don't replace aliasee with alias that has weak linkage (#91483)
Fixes #91312.

Don't perform the transform if the alias may be replaced at link time.
2024-05-17 05:51:49 +08:00
Max Winkler
ce3485a0cd
[llvm][GlobalOpt] Remove empty atexit destructors/handlers (#88836)
https://godbolt.org/z/frjhqMKqc for an example.

Removal of allocations due to empty `__cxa_atexit` destructor calls is
done by the following globalopt pass.
This pass currently does not look for `atexit` handlers generated for
platforms that do not use `__cxa_atexit`.
By default Win32 and AIX use `atexit`.

I don't see an easy way to only remove `atexit` calls that the compiler
generated without looking at the generated mangled name of the atexit
handler that is being registered.
However we can easily remove all `atexit` calls that register empty
handlers since it is trivial to ensure the removed call still returns
`0` which is the value for success.
2024-04-29 20:29:34 -04:00
Matthias Braun
dede19caf0
GlobalOpt: Handle threadlocal.address intrinsic (#88454)
This changes `GlobalOpt` to skip/look-through `threadlocal.address`
intrinsic where apropriate.

This fixes issue #73314
2024-04-29 09:21:50 -07:00
Jeremy Morse
2fe81edef6 [NFC][RemoveDIs] Insert instruction using iterators in Transforms/
As part of the RemoveDIs project we need LLVM to insert instructions using
iterators wherever possible, so that the iterators can carry a bit of
debug-info. This commit implements some of that by updating the contents of
llvm/lib/Transforms/Utils to always use iterator-versions of instruction
constructors.

There are two general flavours of update:
 * Almost all call-sites just call getIterator on an instruction
 * Several make use of an existing iterator (scenarios where the code is
   actually significant for debug-info)
The underlying logic is that any call to getFirstInsertionPt or similar
APIs that identify the start of a block need to have that iterator passed
directly to the insertion function, without being converted to a bare
Instruction pointer along the way.

Noteworthy changes:
 * FindInsertedValue now takes an optional iterator rather than an
   instruction pointer, as we need to always insert with iterators,
 * I've added a few iterator-taking versions of some value-tracking and
   DomTree methods -- they just unwrap the iterator. These are purely
   convenience methods to avoid extra syntax in some passes.
 * A few calls to getNextNode become std::next instead (to keep in the
   theme of using iterators for positions),
 * SeparateConstOffsetFromGEP has it's insertion-position field changed.
   Noteworthy because it's not a purely localised spelling change.

All this should be NFC.
2024-03-05 15:12:22 +00:00
Björn Pettersson
7677453886
[ConstantFolding] Do not consider padded-in-memory types as uniform (#81854)
Teaching ConstantFoldLoadFromUniformValue that types that are padded in
memory can't be considered as uniform.

Using the big hammer to prevent optimizations when loading from a
constant for which DataLayout::typeSizeEqualsStoreSize would return
false.

Main problem solved would be something like this:
  store i17 -1, ptr %p, align 4
  %v = load i8, ptr %p, align 1
If for example the i17 occupies 32 bits in memory, then LLVM IR doesn't
really tell where the padding goes. And even if we assume that the 15
most significant bits are padding, then they should be considered as
undefined (even if LLVM backend typically would pad with zeroes).
Anyway, for a big-endian target the load would read those most
significant bits, which aren't guaranteed to be one's. So it would be
wrong to constant fold the load as returning -1.

If LLVM IR had been more explicit about the placement of padding, then
we could allow the constant fold of the load in the example, but only
for little-endian.

Fixes: https://github.com/llvm/llvm-project/issues/81793
2024-02-15 15:40:21 +01:00
Jon Roelofs
e976385415
[llvm][GlobalOpt] Optimize statically resolvable IFuncs (#80606) 2024-02-06 13:58:58 -08:00
Kazu Hirata
84a48ee9fb [llvm] Stop including llvm/ADT/SetVector.h (NFC)
Identified with clangd.
2023-11-10 23:50:23 -08:00
Nikita Popov
e360a16fee
[GlobalOpt] Cache whether CC is changeable (#71381)
The hasAddressTaken() call in hasOnlyColdCalls() has quadratic
complexity if there are many cold calls to a function: We're going to
visit each call of the function, and then for each of them iterate all
the users of the function.

We've recently encountered a case where GlobalOpt spends more than an
hour in these hasAddressTaken() checks when full LTO is used.

Avoid this by moving the hasAddressTaken() check into hasChangeableCC()
and caching its result, so it is only computed once per function.
2023-11-07 10:36:45 +01:00
Simon Pilgrim
3ca4fe80d4 [Transforms] Use StringRef::starts_with/ends_with instead of startswith/endswith. NFC.
startswith/endswith wrap starts_with/ends_with and will eventually go away (to more closely match string_view)
2023-11-06 16:50:18 +00:00
Paul Walker
c7d65e4466 [IR] Enable load/store/alloca for arrays of scalable vectors.
Differential Revision: https://reviews.llvm.org/D158517
2023-09-14 13:49:01 +00:00
Tyler Lanphear
52f6f418c7
[GlobalOpt] Handle DL.getAllocaAddrSpace() != 0 (#65847)
Fix crash on RAUW due to locals and globals having different address
spaces. This is the intent of the original code, but it assumes the
alloca address space is 0. This patch fixes the code to check that the
global's address space matches `DL.getAllocaAddrSpace()` instead.

Fixes #65155
2023-09-09 10:12:42 -07:00
Nikita Popov
869d231bc4 [GlobalOpt] Remove unnecessary bitcasts (NFC) 2023-08-23 13:03:28 +02:00
Nikita Popov
1de568d129 [GlobalOpt] Simplify replacement of call with global
With opaque pointers, we no longer need to insert the bitcast.
2023-08-22 09:50:04 +02:00
Matt Arsenault
25bc999d1f Intrinsics: Add type overload to stacksave and stackstore
This allows use with non-0 address space stacks. llvm_ptr_ty should
never be used. This could use some more percolation up through mlir,
but this is enough to fix existing tests.

https://reviews.llvm.org/D156666
2023-08-09 18:33:11 -04:00
Bjorn Pettersson
e6e9a87534 Drop some typed pointer handling
Differential Revision: https://reviews.llvm.org/D156739
2023-08-02 12:08:37 +02:00
Matt Arsenault
f3059e223d GlobalOpt: Drop code for handling pointer bitcasts
I got confused and must have closed the wrong revision for
https://reviews.llvm.org/D149291
2023-07-11 21:05:18 -04:00
Matt Arsenault
6923a67db8 GlobalOpt: Improve addrspacecast handling
Handle addrspacecast when looking at uses.
2023-05-16 16:32:30 +01:00
Matt Arsenault
b52db60cbb GlobalOpt: Drop code to handle typed pointers
Fixes assert with pointers with different address spaces. We
could keep looking through addrspacecast, but it would require
checking for null handling of the access address space.

Fixes #62384
2023-04-29 09:48:21 -04:00
DianQK
533b7c1f6c
[GlobalOpt] Don't replace the aliasee if it has other references.
As long as aliasee has `@llvm.used` or `@llvm.compiler.used` references, we cannot do the related replace or delete operations. Even if it is a Local Linkage, we cannot infer if there is no other use for it, such as asm or other future added cases.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D145293
2023-04-27 09:53:47 +08:00
Nawal Copty
6e54a57c61 Preserve the address space for llvm.used and llvm.compiler.used global variables in GlobalOpt pass.
The llvm.used (or llvm.compiler.used) global variable is an array that contains a list of pointers to global variables and functions.

The GlobalOpt (Global Variable Optimizer) pass is not preserving the address space for llvm.used and llvm.compiler.used global variables.This patch updates the setUsedInitializer() function in GlobalOpt.cpp, so the address space is preserved.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D144518
2023-04-25 13:07:01 -07:00
Bjorn Pettersson
a20f7efbc5 Remove several no longer needed includes. NFCI
Mostly removing includes of InitializePasses.h and Pass.h in
passes that no longer has support for the legacy PM.
2023-04-17 13:54:19 +02:00