389 Commits

Author SHA1 Message Date
Jameson Nash
f4b77e6750
[InstCombine] Replace getAllocatedType() with getAllocationSize() (#177435)
Replace uses of getAllocatedType() with the more semantic
getAllocationSize() method in the alloca dereferenceability check
and zero-size alloca merging logic.

This simplifies the code by:
- Eliminating manual isArrayAllocation() checks (handled by
getAllocationSize)
- Eliminating superfluous isSized() checks (the verifier rejects them
already)
- Using TypeSize::isScalable() for scalable vector handling (before
casting to uint64_t)
- Using TypeSize::isZero() for zero-size checks

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 22:31:29 -05:00
Jameson Nash
d10b2b566a
[NFCI] replace getValueType with new getGlobalSize query (#177186)
Returns uint64_t to simplify callers. The goal is eventually replace
getValueType with this query, which should return the known minimum
reference-able size, as provided (instead of a Type) during create.
Additionally the common isSized query would be replaced with an
isExactKnownSize query to test if that size is an exact definition.
2026-01-22 13:55:53 -05:00
Aiden Grossman
a81d2bf933
[InstCombine] Propagate profiles when folding addrscast through loads (#177214)
#176352 introduced a new fold and a new test for this functionality.
Given the select condition is the same before and after, we can
propagate any profile information that may be attached to the select
instruction. We should not need to explicitly drop any metadata off the
select.
2026-01-22 06:07:03 -08:00
Theodoros Theodoridis
9d7317fe8b
[InstCombine] fold addrcast+load through selects (#176352)
Add support for:
    
    load(addrspacecast(select(Cond, &V1, &V2))) =>
    select(Cond, load(addrspacecast(&V1)), load(addrspacecast(&V2)))
    
Note: alive does not support addrspacecasts and thus proofs are omitted.
2026-01-21 09:33:42 +00:00
Jameson Nash
ba2bd3fbba
Use AllocaInst::getAllocationSize instead of manual size calculations (#176486)
Replace patterns that manually compute allocation sizes by multiplying
getTypeAllocSize(getAllocatedType()) by the array size with calls to the
getAllocationSize(DL) API, which handles this correctly and concisely,
returning nullopt for VLAs.

This fixes several places that were not accounting for array allocations
when computing sizes, simplifies code that was doing this manually, and
adds some explicit isFixed checks where implied convert was being used.

This PR is because now that we have opaque pointers, I hate that some
AllocaInst still has type information being consumed by some passes
instead of just using the size, since passes rarely handle that type
information well or correctly. I hope this will grow into a sequence of
commits to slowly eliminate uses of getAllocatedType from AllocaInst.
And similarly later to remove type information from GlobalValue too (it
can be replaced with just dereferenceable bytes, similar to arguments).

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 09:55:52 -05:00
Miloš Poletanović
44a52ea8be
[InstCombine] Fix unsafe PHINode cast and simplify logic in PointerReplacer (#172332)
Fixes #171883.

Basically, if the operand of the phi is an Instruction but it's not
available, the [condition
](1847a4efae/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp (L300))would
just break, and when we reach the[ deferral
check](1847a4efae/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp (L313)),
execution would continue even though there is a non-Instruction operand,
leading to a crash in the [subsequent processing
loop](1847a4efae/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp (L320)).
2025-12-17 12:07:40 +00:00
Drew Kersnar
9c78bc5de4
Revert "[LSV] Merge contiguous chains across scalar types" (#170381)
Reverts llvm/llvm-project#154069. I pointed out a number of issues
post-merge, most importantly examples of miscompiles:
https://github.com/llvm/llvm-project/pull/154069#issuecomment-3603854626.

While the motivation of the change is clear, I think the implementation
approach is flawed. It seems like the goal is to allow elements like
`load <2xi16>` and `load i32` to be vectorized together despite the
current algorithm not grouping them into the same equivalence classes. I
personally think that if we want to attempt this it should be a more
wholistic approach, maybe even redefining the concept of an equivalence
class. This current solution seems like it would be really hard to do
bug-free, and even if the bugs were not present, it is only able to
merge chains that happen to be adjacent to each other after
`splitChainByContiguity`, which seems like it is leaving things up to
chance whether this optimization kicks in. But we can discuss more in
the re-land. Maybe the broader approach I'm proposing is too difficult,
and a narrow optimization is worthwhile. Regardless, this should be
reverted, it needs more iteration before it is correct.
2025-12-02 18:27:58 -05:00
Anshil Gandhi
fbdf8ab590
[LSV] Merge contiguous chains across scalar types (#154069)
This change enables the LoadStoreVectorizer to merge and vectorize
contiguous chains even when their scalar element types differ, as long
as the total bitwidth matches. To do so, we rebase offsets between
chains, normalize value types to a common integer type, and insert the
necessary casts around loads and stores. This uncovers more
vectorization opportunities and explains the expected codegen updates
across AMDGPU tests.

Key changes:
- Chain merging
  - Build contiguous subchains and then merge adjacent ones when:
- They refer to the same underlying pointer object and address space.
    - They are either all loads or all stores.
    - A constant leader-to-leader delta exists.
- Rebasing one chain into the other's coordinate space does not overlap.
    - All elements have equal total bit width.
- Rebase the second chain by the computed delta and append it to the
first.

- Type normalization and casting
- Normalize merged chains to a common integer type sized to the total
bits.
- For loads: create a new load of the normalized type, copy metadata,
and cast back to the original type for uses if needed.
  - For stores: bitcast the value to the normalized type and store that.
- Insert zext/trunc for integer size changes; use bit-or-pointer casts
when sizes match.

- Cleanups
  - Erase replaced instructions and DCE pointer operands when safe.
- New helpers: computeLeaderDelta, chainsOverlapAfterRebase,
rebaseChain, normalizeChainToType, and allElemsMatchTotalBits.

Impact:
- Increases vectorization opportunities across mixed-typed but
size-compatible access chains.
- Large set of expected AMDGPU codegen diffs due to more/changed
vectorization.

This PR resolves #97715.
2025-12-01 23:05:17 -05:00
Matt Arsenault
072cf57a6a
InstCombine: Check GEP operand is available (#160438)
Logic copied from the select case.

Fixes #160302
2025-09-25 17:20:20 +09:00
Nikita Popov
3371375131
[InstCombine] Read-only call without return can capture (#157878)
The copied from constant memory analysis had a special case where
nocapture was not required for read-only calls without (or unused)
return. This is not correct, as the address can still be captured though
means other than memory and the return value, for example using
divergence. This code should not be trying to do its own nocapture
inference.
2025-09-15 09:34:04 +02:00
Vedant Paranjape
44df9826f3
[InstCombine] Propagate invariant.load metadata across unpacked loads (#152186)
For loads that operate on aggregate type, instcombine unpacks the loads.
It does not preserve the invariant.load metadata. This patch fixes that,
it looks for the metadata in the parent load and attaches the metadata
to the unpacked loads.

```
%struct.double2 = type { double, double }
%struct.double1 = type { double }

define %struct.double2 @func1(ptr %a) {
  %1 = load %struct.double2, ptr %a, align 16, !invariant.load !1
  ret %struct.double2 %1
}

!1 = !{}
```
Reproducer: https://godbolt.org/z/hcY8MMvYh
2025-08-14 10:08:26 -07:00
Nikita Popov
be6bed4dc6 [InstCombine] Remove instructions before+after unreachable at same time
There is no need to first remove the instructions before and then
the ones after in two different worklist iterations. We don't need
to worry about change reporting here, as the functions do that
themselves.

This avoids the issue in #150338, but not really in a principled
way. It's possible that we will have to allow poison arguments
to lifetime.start/lifetime.end again if this turns out to be a
recurring problem.
2025-07-24 11:10:22 +02:00
Kazu Hirata
3e53d4d386
[llvm] Remove unused includes (NFC) (#150265)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-07-23 15:18:46 -07:00
Pierre van Houtryve
f223411e2e
[InstCombine]PtrReplacer: Correctly handle select with unavailable operands (#148829)
The testcase I added previously failed because a SelectInst with invalid
operands was created (one side `addrspace(4)`, the other
`addrspace(5)`).
PointerReplacer needs to dig deeper if the true and/or false
instructions of the select are not available.

Fixes SWDEV-542957
2025-07-16 09:32:05 +02:00
Anshil Gandhi
a314ac4d22
[Reland][InstCombine] Iterative replacement in PtrReplacer (#145410)
This patch enhances the PtrReplacer as follows:

1. Users are now collected iteratively to be generous on the stack. In
the case of PHIs with incoming values which have not yet been visited,
they are pushed back into the stack for reconsideration.
2. Replace users of the pointer root in a reverse-postorder traversal,
instead of a simple traversal over the collected users. This reordering
ensures that the uses of an instruction are replaced before replacing
the instruction itself.
3. During the replacement of PHI, use the same incoming value if it does
not have a replacement.

This patch specifically fixes the case when an incoming value of a PHI
is addrspacecasted.

This reland PR includes a fix for an assertion failure caused by
https://github.com/llvm/llvm-project/pull/137215, which was reverted.
The failing test involved a phi and gep depending on each other, in
which case the PtrReplacer did not order them correctly for replacement.
This patch fixes it by adding a check during the definition of
`PostOrderWorklist`.
2025-06-23 20:35:40 -04:00
Anshil Gandhi
72979093e7
Revert "[Reland][InstCombine] Iterative replacement in PtrReplacer" (#145137)
Reverts llvm/llvm-project#144626
2025-06-20 22:51:23 -04:00
Anshil Gandhi
94865edfa8
[Reland][InstCombine] Iterative replacement in PtrReplacer (#144626)
This patch enhances the PtrReplacer as follows:
1. Users are now collected iteratively to be generous on the stack. In the case of PHIs with incoming values which have not yet been visited, they are pushed back into the stack for reconsideration.
2. Replace users of the pointer root in a reverse-postorder traversal, instead of a simpletraversal over the collected users. This reordering ensures that the uses of an instruction are replaced before replacing the instruction itself.
3. During the replacement of PHI, use the same incoming value if it does not have a replacement.

This patch specifically fixes the case when an incoming value of a PHI
is addrspacecasted.

This is a reland of https://github.com/llvm/llvm-project/pull/137215.
2025-06-20 18:03:54 -04:00
Anshil Gandhi
c62a6138d9
Revert "[InstCombine] Iterative replacement in PtrReplacer" (#144394)
Reverts llvm/llvm-project#137215

This commit caused a failure in the LLVM CI:
https://lab.llvm.org/buildbot/#/builders/10/builds/7442
2025-06-16 13:05:31 -04:00
Anshil Gandhi
8bbef3d1c9
[InstCombine] Iterative replacement in PtrReplacer (#137215)
This patch enhances the PtrReplacer as follows:
1. Users are now collected iteratively to be generous on the stack. In
the case of PHIs with incoming values which have not yet been visited,
they are pushed back into the stack for reconsideration.
2. Replace users of the pointer root in a reverse-postorder traversal,
instead of a simple traversal over the collected users. This reordering
ensures that the operands of an instruction are replaced before
replacing the instruction itself.
3. During the replacement of PHI, use the same incoming value if it does
not have a replacement.

This patch specifically fixes the case when an incoming value of a PHI
is addrspacecasted.
2025-06-16 12:46:54 -04:00
Kazu Hirata
05cd32adb7
[llvm] Remove unused includes (NFC) (#144293)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-06-16 08:59:18 -07:00
Stephen Tozer
a08a831515
[DLCov][NFC] Propagate annotated DebugLocs through transformations (#138047)
Part of the coverage-tracking feature, following #107279.

In order for DebugLoc coverage testing to work, we firstly have to set
annotations for intentionally-empty DebugLocs, and secondly we have to
ensure that we do not drop these annotations as we propagate DebugLocs
throughout compilation. As the annotations exist as part of the DebugLoc
class, and not the underlying DILocation, they will not survive a
DebugLoc->DILocation->DebugLoc roundtrip. Therefore this patch modifies
a number of places in the compiler to propagate DebugLocs directly
rather than via the underlying DILocation. This has no effect on the
output of normal builds; it only ensures that during coverage builds, we
do not drop incorrectly annotations and therefore create false
positives.

The bulk of these changes are in replacing
DILocation::getMergedLocation(s) with a DebugLoc equivalent, and in
changing the IRBuilder to store a DebugLoc directly rather than storing
DILocations in its general Metadata array. We also use a new function,
`DebugLoc::orElse`, which selects the "best" DebugLoc out of a pair
(valid location > annotated > empty), preferring the current DebugLoc on
a tie - this encapsulates the existing behaviour at a few sites where we
_may_ assign a DebugLoc to an existing instruction, while extending the
logic to handle annotation DebugLocs at the same time.
2025-06-12 14:06:27 +01:00
Ramkumar Ramachandra
b40e4ceaa6
[ValueTracking] Make Depth last default arg (NFC) (#142384)
Having a finite Depth (or recursion limit) for computeKnownBits is very
limiting, but is currently a load-bearing necessity, as all KnownBits
are recomputed on each call and there is no caching. As a prerequisite
for an effort to remove the recursion limit altogether, either using a
clever caching technique, or writing a easily-invalidable KnownBits
analysis, make the Depth argument in APIs in ValueTracking uniformly the
last argument with a default value. This would aid in removing the
argument when the time comes, as many callers that currently pass 0
explicitly are now updated to omit the argument altogether.
2025-06-03 17:12:24 +01:00
Changpeng Fang
fa45bf4300
InstCombine: Fix a crash in PointerReplacer when constructing a new PHI (#130256)
When constructing a PHI node in `PointerReplacer::replace`, the incoming
operands are expected to have already been replaced and in the
replacement map. However, when one of the incoming operands is a load,
the search of the map is unsuccessful, and a nullptr is returned from
`getReplacement`. The reason is that, when a load is replaced, all the
uses of the load has been actually replaced by the new load. It is
useless to insert the original load into the map. Instead, we should
place the new load into the map to meet the expectation of the later map
search.

Fixes: SWDEV-516420
2025-03-09 20:21:36 -07:00
Yingwei Zheng
37374fbcd3
[InstCombine] Simplify nonnull phi nodes (#128466)
Fix some regressions caused by
https://github.com/llvm/llvm-project/pull/128111.
Compile-time impact:
https://llvm-compile-time-tracker.com/compare.php?from=1e0e4169dd00bf8a37cef8d74d0add7861982c4e&to=3a27268e264826ef9cf493f645507e490f05e7f3&stat=instructions%3Au
2025-03-01 23:11:55 +08:00
Yingwei Zheng
d23da7d630
[InstCombine] Increase recursion limit to 3 in simplifyNonNullOperand (#128695)
Address review comment
https://github.com/llvm/llvm-project/pull/128466#discussion_r1967228790

Compile-time impact:
https://llvm-compile-time-tracker.com/compare.php?from=72781f58efddecee19feb07fec4e6104ef4c4812&to=3853aee61626b0eda06671b4cbbc4cdd1344440c&stat=instructions:u
2025-02-25 22:19:25 +08:00
Yingwei Zheng
2ebc69a521
[InstCombine] Add support for GEPs in simplifyNonNullOperand (#128365)
Alive2: https://alive2.llvm.org/ce/z/2KE8zG
2025-02-23 17:19:31 +08:00
Yingwei Zheng
126016b662
[InstCombine] Simplify nonnull pointers (#128111)
This patch is the follow-up of
https://github.com/llvm/llvm-project/pull/127979. It introduces a helper
`simplifyNonNullOperand` to avoid duplicate logic. It also addresses the
one-use issue in `visitLoadInst`, as discussed in
https://github.com/llvm/llvm-project/pull/127979#issuecomment-2671013972.
The `nonnull` attribute is also supported. Proof:
https://alive2.llvm.org/ce/z/MCKgT9
2025-02-22 15:30:04 +08:00
Yingwei Zheng
1b78ff6972
[InstCombine] Simplify the pointer operand of store if writing to null is UB (#127979)
Proof: https://alive2.llvm.org/ce/z/mzVj-u
I will add some follow-up patches to avoid duplicate code, support more
memory instructions, and bypass gep instructions.
2025-02-20 23:53:45 +08:00
Jeremy Morse
81d18ad864
[NFC][DebugInfo] Make some block-start-position methods return iterators (#124287)
As part of the "RemoveDIs" work to eliminate debug intrinsics, we're
replacing methods that use Instruction*'s as positions with iterators. A
number of these (such as getFirstNonPHIOrDbg) are sufficiently
infrequently used that we can just replace the pointer-returning version
with an iterator-returning version, hopefully without much/any
disruption.

Thus this patch has getFirstNonPHIOrDbg and
getFirstNonPHIOrDbgOrLifetime return an iterator, and updates all
call-sites. There are no concerns about the iterators returned being
converted to Instruction*'s and losing the debug-info bit: because the
methods skip debug intrinsics, the iterator head bit is always false
anyway.
2025-01-27 16:27:54 +00:00
David Green
bec4c7f5f7
[InstCombine] Unpack scalable struct loads/stores. (#123986)
This teaches unpackLoadToAggregate and unpackStoreToAggregate to unpack
scalable structs to individual loads/stores with insertvalues /
extractvalues. The gep used for the offsets uses an i8 ptradd as opposed
to a struct gep, as the geps for scalable structs are not supported and
we canonicalize to i8.
2025-01-23 18:04:27 +00:00
Florian Hahn
edd1360208
[InstCombine] Preserve metadata from orig load in select fold. (#115605)
When replacing load with a select on the address with a select and 2
loads of the values, copy poison-generating metadata from the original
load to the newly created loads, which are placed at the same place as
the original loads. We cannot copy metadata that may trigger UB.

PR: https://github.com/llvm/llvm-project/pull/115605
2025-01-16 22:44:40 +00:00
Alex MacLean
1a56360cc6
[IR] Treat calls with byval ptrs as read-only (#122961) 2025-01-15 10:25:55 -08:00
Arthur Eubanks
e34d614e7d
[Passes] Remove -enable-infer-alignment-pass flag (#111873)
This flag has been on for a while without any complaints.
2024-10-10 12:28:46 -07:00
Kazu Hirata
4ac42afbcc
[InstCombine] Use llvm::set_is_subset (NFC) (#102778) 2024-08-10 22:46:03 -07:00
Shilei Tian
f38baad3e7
[InstCombine] Fix a crash in PointerReplacer (#98987)
A crash could happen in `PointerReplacer::replace` when constructing a
new
select instruction and there is no replacement for one of its operand.
This can
happen when the operand is a load instruction that has been replaced
earlier
such that the operand itself is already the new value. In this case, it
is not
in the replacement map and `getReplacement` simply returns nullptr.

Fix SWDEV-472192.
2024-07-16 13:17:24 -04:00
Nikita Popov
434a8a08a2 [InstCombine] Preserve all gep nowrap flags in PointerReplacer 2024-06-04 09:30:43 +02:00
Matt Arsenault
8cb19ebd21
InstCombine: Stop handling bitcast in PointerReplacer (#92937)
These should be irrelevant since opaque pointers.
2024-05-21 20:49:21 +02:00
Matt Arsenault
847c83f7cc
InstCombine: Process addrspacecast uses in PointerReplacer (#91953)
This was looking through an addrspacecast, and not finding a later
unfoldable cast to another address space. Fixes improperly deleting
a required alloca + memcpy and introducing an illegal addrspacecast.
    
This also required fixing some worklist management issues with
addrspacecast, and assuming that only memcpy sources could need
replacement.
    
Regresses one test function, but this looks like it optimized
before by accident. It never saw the pointer use by the call
to readonly_callee, which should require insertion of a new cast.
    
Fixes #68120
2024-05-15 07:02:31 +02:00
Matt Arsenault
8823abea6f InstCombine: Simplify vector initialization 2024-05-13 13:59:45 +02:00
Matt Arsenault
c5b0da9d83
InstCombine: Preserve inbounds in PointerReplacer (#91735)
This avoids spurious test changes in a future commit.
2024-05-13 13:49:09 +02:00
Jeremy Morse
2fe81edef6 [NFC][RemoveDIs] Insert instruction using iterators in Transforms/
As part of the RemoveDIs project we need LLVM to insert instructions using
iterators wherever possible, so that the iterators can carry a bit of
debug-info. This commit implements some of that by updating the contents of
llvm/lib/Transforms/Utils to always use iterator-versions of instruction
constructors.

There are two general flavours of update:
 * Almost all call-sites just call getIterator on an instruction
 * Several make use of an existing iterator (scenarios where the code is
   actually significant for debug-info)
The underlying logic is that any call to getFirstInsertionPt or similar
APIs that identify the start of a block need to have that iterator passed
directly to the insertion function, without being converted to a bare
Instruction pointer along the way.

Noteworthy changes:
 * FindInsertedValue now takes an optional iterator rather than an
   instruction pointer, as we need to always insert with iterators,
 * I've added a few iterator-taking versions of some value-tracking and
   DomTree methods -- they just unwrap the iterator. These are purely
   convenience methods to avoid extra syntax in some passes.
 * A few calls to getNextNode become std::next instead (to keep in the
   theme of using iterators for positions),
 * SeparateConstOffsetFromGEP has it's insertion-position field changed.
   Noteworthy because it's not a purely localised spelling change.

All this should be NFC.
2024-03-05 15:12:22 +00:00
Paul Walker
28fb2b33c2
[LLVM][SelectionDAG] Reduce number of ComputeValueVTs variants. (#75614)
This is another step in the direction of fixing the `Fixed(0) !=
Scalable(0)` bugbear, although whilst weird I don't believe it's causing
us any real issues.
2024-02-21 13:03:24 +00:00
Nikita Popov
89dae798cc [Loads] Use BatchAAResults for available value APIs (NFCI)
This allows caching AA queries both within and across the calls,
and enables us to use a custom AAQI configuration.
2024-01-24 14:04:21 +01:00
Nikita Popov
97efd8aa43 [InstCombine] Preserve inalloca tag when transforming alloca
This is not meaningful in any practical sense, and just makes sure
we don't cause verifier failures.
2023-12-11 14:27:00 +01:00
Nikita Popov
ae7bffd71c [InstCombine] Don't create unnecessary zero-index GEP (NFCI)
Note needed with opaque pointers.
2023-12-11 13:09:09 +01:00
Nikita Popov
6e3e21d203 [InstCombine] Remove unnecessary removeBitcastsFromLoadStoreOnMinMax() fold (NFCI)
This optimizes a very specific pointer bitcast pattern, and as
such is no longer relevant with opaque pointers.
2023-10-24 17:31:35 +02:00
Fangrui Song
2d854dd3e7 Move global namespace cl::opt inside llvm:: or internalize them 2023-10-10 19:58:03 -07:00
Dhruv Chawla
515a826326
[NFC][InferAlignment] Swap extern declaration and definition of EnableInferAlignmentPass
This prevents a linker issue when only InstCombine is linked without
PassBuilder, like in the case of bugpoint.
2023-09-20 13:07:13 +05:30
Dhruv Chawla
0104f37f16
[InstCombine] Use a cl::opt to control calls to getOrEnforceKnownAlignment in LoadInst and StoreInst
This is in preparation for the InferAlignment pass which handles
inferring alignment for instructions separately. It is better to handle
this as a separate pass as inferring alignment is quite costly, and
InstCombine running multiple times in the pass pipeline makes it even
more so.

Differential Revision: https://reviews.llvm.org/D158527
2023-09-20 12:08:14 +05:30
Paul Walker
c7d65e4466 [IR] Enable load/store/alloca for arrays of scalable vectors.
Differential Revision: https://reviews.llvm.org/D158517
2023-09-14 13:49:01 +00:00