38986 Commits

Author SHA1 Message Date
Veera
9d1fbbd2b9
[SROA][NFC] Remove Unused Parameter in promoteAllocas() (#128382)
Removing it because `Function &F` is not used by `promoteAllocas()`.
2025-02-23 11:17:43 -05:00
Florian Hahn
b72bbfc293 [VPlan] Remove fixHeaderPhis (NFC).
Removes unneeded code after https://github.com/llvm/llvm-project/pull/124432.
2025-02-23 10:51:20 +00:00
Yingwei Zheng
2ebc69a521
[InstCombine] Add support for GEPs in simplifyNonNullOperand (#128365)
Alive2: https://alive2.llvm.org/ce/z/2KE8zG
2025-02-23 17:19:31 +08:00
Florian Hahn
0859df4e42 [VPlan] Use operands from initial VPInstructions directly (NFC).
Use operands from VPInstructions directly during recipe creation.

Follow-up as discussed and planned after
https://github.com/llvm/llvm-project/pull/124432.
2025-02-22 22:34:35 +00:00
Florian Hahn
30f44c9627 [VPlan] Set values for non-header phis at construction. (NFC)
Update HCFG builder to set the incoming values directly at construction
for non-header phis.

Simplification/clarification as suggested independently in
https://github.com/llvm/llvm-project/pull/126388.
2025-02-22 17:27:10 +00:00
Teresa Johnson
eb92157399
[MemProf] Add ability to export or highlight only a portion of graph (#128255)
To simplify debugging and analysis, particularly for very large
applications with large graphs, this patch adds support for either
highlighting a single context id or allocation's context ids, and/or
only exporting the nodes/edges for a single context id or allocation's
context ids. When highlighting, the specified nodes and edges are a
brighter color and larger.

This can be controlled by the new -memprof-dot-scope={all,alloc,context}
flag which controls how much to export, along with two companion flags:
	-memprof-dot-alloc-id=ID
	-memprof-dot-context-id=ID
These two are interpreted differently depending on the value of
-memprof-dot-scope (where "all" is the default).

If exporting all, one of the above flags can optionally be passed to
highlight the nodes/edges for the given context id or allocation's
context ids.

If exporting alloc scope, an alloc id must be provided. A context id can
optionally be provided to highlight that context.

If exporting context scope, a context id must be provided.

The ids to use can be obtained either by looking at the full graph, or a
context id can be identified from the -memprof-report-hinted-sizes
output after PR128188 is merged.
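
The flag combinations above can be summarized as a small validation routine. This is a Python sketch of the rules as described in this message, not the MemProf implementation; `resolve_dot_scope` and its return shape are hypothetical. Rejecting both ids under "all" and ignoring an alloc id under "context" are assumptions, since the message does not cover those cases.

```python
def resolve_dot_scope(scope="all", alloc_id=None, context_id=None):
    """Return what to export and what to highlight, or raise ValueError."""
    if scope not in ("all", "alloc", "context"):
        raise ValueError(f"unknown scope: {scope}")

    if scope == "all":
        # Assumption: the message says "one of" the flags may be passed.
        if alloc_id is not None and context_id is not None:
            raise ValueError("pass at most one id when exporting all")
        if alloc_id is not None:
            return {"export": "all", "highlight": ("alloc", alloc_id)}
        if context_id is not None:
            return {"export": "all", "highlight": ("context", context_id)}
        return {"export": "all", "highlight": None}

    if scope == "alloc":
        # An alloc id is mandatory; a context id only selects the highlight.
        if alloc_id is None:
            raise ValueError("alloc scope requires an alloc id")
        highlight = ("context", context_id) if context_id is not None else None
        return {"export": ("alloc", alloc_id), "highlight": highlight}

    # scope == "context": a context id is mandatory.
    # Assumption: an alloc id passed here is simply ignored.
    if context_id is None:
        raise ValueError("context scope requires a context id")
    return {"export": ("context", context_id), "highlight": None}
```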
2025-02-22 05:42:46 -08:00
Teresa Johnson
9d6f2647de
[MemProf] Print internal context id when reporting bytes hinted (#128188)
During the whole program reporting of contexts when hinted byte
reporting is enabled via -memprof-report-hinted-sizes, also print the
internal context id. This is useful for debugging, as well as for
guiding the dot file dumping with some upcoming changes that will
accept a context id to focus the graph on a context of interest.
2025-02-22 05:42:28 -08:00
Luke Lau
e23ab73335
[VPlan] Don't convert widen recipes to VP intrinsics in EVL transform (#127180)
This is a copy of #126177, since it was automatically and permanently
closed because I messed up the source branch on my remote.

This patch proposes to avoid converting widening recipes to VP
intrinsics during the EVL transform.

IIUC we initially did this to avoid `vl` toggles on RISC-V. However we
now have the RISCVVLOptimizer pass which mostly makes this redundant.

Emitting regular IR instead of VP intrinsics allows more generic
optimisations, both in the middle end and DAGCombiner, and we generally
have better patterns in the RISC-V backend for non-VP nodes. Sticking to
regular IR instructions is likely a lot less work than reimplementing
all of these optimisations for VP intrinsics, and on SPEC CPU 2017 we get
noticeably better code generation.
2025-02-22 19:38:11 +08:00
Florian Hahn
b74413bf91 [VPlan] Use VPSingleDef instead of VPValue in HCFG builder (NFC).
Use VPSingleDef to remove unneeded casts to a recipe type.
2025-02-22 11:15:37 +00:00
Mikhail Gudim
f5d153ef26
[VectorCombine] Fold binary op of reductions. (#121567)
Replace a binary op of two reductions with one reduction of the binary op
applied to vectors. For example:

```
%v0_red = tail call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> %v0)
%v1_red = tail call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> %v1)
%res = add i32 %v0_red, %v1_red
```
gets transformed to:

```
%1 = add <16 x i32> %v0, %v1
%res = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> %1)
```
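
For the `add` case shown above, the fold is value-preserving because integer addition is associative and commutative, including under wraparound. Below is a small Python check of that identity on 16 x i32 vectors. It models the arithmetic only and is not the VectorCombine code; the helper names are made up for this sketch.

```python
import random

MASK32 = 0xFFFFFFFF  # model i32 wraparound arithmetic

def reduce_add(v):
    """Model of llvm.vector.reduce.add over a list of i32 lanes."""
    total = 0
    for lane in v:
        total = (total + lane) & MASK32
    return total

def vec_add(a, b):
    """Lane-wise i32 add of two equally sized vectors."""
    return [(x + y) & MASK32 for x, y in zip(a, b)]

def before(v0, v1):
    # Two reductions followed by a scalar add.
    return (reduce_add(v0) + reduce_add(v1)) & MASK32

def after(v0, v1):
    # One vector add followed by a single reduction.
    return reduce_add(vec_add(v0, v1))

def agree_on_random_inputs(trials=100, seed=0):
    """Compare both forms on random 16-lane inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        v0 = [rng.getrandbits(32) for _ in range(16)]
        v1 = [rng.getrandbits(32) for _ in range(16)]
        if before(v0, v1) != after(v0, v1):
            return False
    return True
```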
2025-02-22 06:11:33 -05:00
Florian Hahn
26afa2deea [VPlan] Create VPInstructions after setting preds in HCFG builder (NFC)
Set VPBBs predecessors before creating VPInstructions, as setting
incoming values for non-header phis directly there will require
predecessors to be available.
2025-02-22 10:33:17 +00:00
Antonio Frighetto
93b263a01c [SimplifyCFG] Drop unused LockstepReverseIterator class (NFC)
Unmaintained code has been removed.
2025-02-22 11:26:13 +01:00
Antonio Frighetto
48a6df3604 Reapply "[Utils] Consolidate LockstepReverseIterator into own header (NFC)"
Common code has been unified and generalized.

Original commit: 123dca9b56e1359d8ec7771ea3bd0afd4b1ea6af

Previously reverted because it was accidentally merged incomplete. The
issue has been addressed by restoring the missing code.
2025-02-22 11:21:36 +01:00
cooperp
f4e8f6da41
[Reassociate] Use a reference to DataLayout instead of copying the underlying string data (NFC) (#128269)
I noticed this when looking at all allocations by clang. For a medium
sized file this was around 6000 calls to operator new, although I
suspect there were more allocations in total as the SmallVectors in
DataLayout may have their own allocations in some cases.

In a follow-up I'm tempted to make the DataLayout copy constructor
private, to avoid this in future. There are a few tests which copy the
DataLayout, and perhaps need to (I didn't check yet), but we could
provide a clone() method for them if needed. It's only accidental copying
I think we should consider avoiding, not copies made by people who
really do need them.
2025-02-22 10:37:24 +01:00
Yingwei Zheng
126016b662
[InstCombine] Simplify nonnull pointers (#128111)
This patch is the follow-up of
https://github.com/llvm/llvm-project/pull/127979. It introduces a helper
`simplifyNonNullOperand` to avoid duplicate logic. It also addresses the
one-use issue in `visitLoadInst`, as discussed in
https://github.com/llvm/llvm-project/pull/127979#issuecomment-2671013972.
The `nonnull` attribute is also supported. Proof:
https://alive2.llvm.org/ce/z/MCKgT9
2025-02-22 15:30:04 +08:00
Alexey Bataev
8ffdc3b207 [SLP]Fix a crash when checking a scalar in a reordered buildvector node
Need to check the reordered scalars, not the original ones, so that the
correct scalar is checked.
2025-02-21 14:59:43 -08:00
Teresa Johnson
92e02ad9dc
[MemProf] Display backedges with dotted line in dot graphs (#128235)
Add checking of this behavior in the postbuild dot graphs, facilitated
by PR128226 which marked these edges at the end of the graph building.
2025-02-21 14:49:28 -08:00
Florian Hahn
236fa506d4 Revert "[Utils] Consolidate LockstepReverseIterator into own header (NFC) (#116657)"
This reverts commit 123dca9b56e1359d8ec7771ea3bd0afd4b1ea6af.

This breaks building on macOS with clang and multiple build bots,
including https://lab.llvm.org/buildbot/#/builders/175/builds/13585

    llvm-project/llvm/lib/Transforms/Utils/SimplifyCFG.cpp: In function ‘bool sinkCommonCodeFromPredecessors(llvm::BasicBlock*, llvm::DomTreeUpdater*)’:
    /b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:2503:3: error: reference to ‘LockstepReverseIterator’ is ambiguous
     2503 |   LockstepReverseIterator<true> LRI(UnconditionalPreds);
          |   ^~~~~~~~~~~~~~~~~~~~~~~
2025-02-21 21:00:28 +00:00
Teresa Johnson
c3d5070086
[MemProf] Refactor backedge computation and invoke earlier (#128226)
Invoke the backedge computation (refactored as a new method) at the end
of the graph construction, instead of at the start of cloning. That
makes more logical sense, and it also makes it easier to look at the
results in the postbuild dot graph with a follow on change to display
those differently.
2025-02-21 12:57:40 -08:00
Antonio Frighetto
123dca9b56
[Utils] Consolidate LockstepReverseIterator into own header (NFC) (#116657)
Common code has been unified and generalized. Not sure if it is worth
generalizing this further, since it looks closely tied to Blocks (it
might make sense to rename it to `LockstepReverseInstructionIterator`).
2025-02-21 12:21:33 -08:00
Teresa Johnson
741f923fac
[MemProf] Minor fixes to dot graph printing (#128217)
Two misc cleanup/improvements to the dot printing.

Remove a redundant "style=filled" in the Node attributes. No effect on
resulting graph.

Add a "color" attribute to the Edge, with the same color name as
"fillcolor". The latter only fills in the arrowhead, and the former is
what affects the line. This makes the edge colors more visible;
previously it was a black edge with a colored-in arrowhead.

For the second change, I added the new Edge color attributes to the
checking in the two "basic.ll" tests, so we get some testing coverage of
the full printing. For the other affected tests I removed the final "]'"
after the fillcolor so it matches up through that attribute and ignores
the rest of the line.
2025-02-21 12:02:06 -08:00
Kazu Hirata
34cebaf73a
[Instrumentation] Avoid repeated hash lookups (NFC) (#128128)
2025-02-21 11:08:12 -08:00
Ramkumar Ramachandra
2d38be5fd4
[LV] Strip redundant casts (NFC) (#128177)
2025-02-21 17:37:39 +00:00
Alexey Bataev
894935cb51
[SLP]Represent SLP graph as a tree
We can stop using a graph representation of the SLP structure and switch
directly to tree by relying on a single user of each tree node. If the
node has multiple uses, other uses must be represented as a separate
gather/buildvector node, which then will be combined with the existing
vectorized node(s) upon cost estimation/codegen.
This allows simplifying the inner structure and turning on some extra
optimizations, which could not be turned on for nodes with multiple
users (reordering, minbitwidth analysis).

AVX512, -O3+LTO
Metric: size..text
                                                                               results     results0    diff
         test-suite :: MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2000.test   253453.00   254253.00  0.3%
                    test-suite :: External/SPEC/CFP2006/444.namd/444.namd.test   251411.00   252051.00  0.3%
                      test-suite :: SingleSource/Benchmarks/Misc/oourafft.test    19114.00    19146.00  0.2%
     test-suite :: External/SPEC/CFP2017speed/638.imagick_s/638.imagick_s.test  1399200.00  1399520.00  0.0%
      test-suite :: External/SPEC/CFP2017rate/538.imagick_r/538.imagick_r.test  1399200.00  1399520.00  0.0%
      test-suite :: MicroBenchmarks/LCALS/SubsetALambdaLoops/lcalsALambda.test   304310.00   304326.00  0.0%
            test-suite :: MicroBenchmarks/LCALS/SubsetARawLoops/lcalsARaw.test   304662.00   304678.00  0.0%
      test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test 12566919.00 12567511.00  0.0%
                test-suite :: External/SPEC/CFP2006/453.povray/453.povray.test  1146300.00  1146316.00  0.0%
        test-suite :: External/SPEC/CFP2017rate/511.povray_r/511.povray_r.test  1159864.00  1159880.00  0.0%
             test-suite :: External/SPEC/CINT2017rate/502.gcc_r/502.gcc_r.test  9407880.00  9407864.00 -0.0%
            test-suite :: External/SPEC/CINT2017speed/602.gcc_s/602.gcc_s.test  9407880.00  9407864.00 -0.0%
               test-suite :: MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4.test  1011612.00  1011596.00 -0.0%
test-suite :: MultiSource/Benchmarks/Prolangs-C/TimberWolfMC/timberwolfmc.test   280584.00   280536.00 -0.0%
     test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/miniFE/miniFE.test    93016.00    93000.00 -0.0%

ASCI_Purple/SMG2000 - extra code vectorized, small variations
CFP2006/444.namd - small variations, fewer shuffles
Benchmarks/Misc/oourafft - small variations
CFP2017rate/538.imagick_r
CFP2017speed/638.imagick_s - small variations, fewer shuffles
LCALS/SubsetALambdaLoops - fewer shuffles
LCALS/SubsetARawLoops - fewer shuffles
CFP2017rate/526.blender_r - small variations, extra vector code
CFP2006/453.povray - small variations
CFP2017rate/511.povray_r - small variations
CINT2017rate/502.gcc_r
CINT2017speed/602.gcc_s - small variations
Benchmarks/tramp3d-v4 - small variations
Prolangs-C/TimberWolfMC - small variations
DOE-ProxyApps-C++/miniFE - extra code vectorized, small variations
DOE-ProxyApps-C++/CLAMR - extra code vectorized, small variations
ASCI_Purple/SMG2000 - no significant changes

RISCV, -O3+LTO
Metric: size..text
                                                                                          results    results0   diff
test-suite :: SingleSource/Regression/C/gcc-c-torture/execute/GCC-C-execute-pr28982b.test    1812.00    1866.00  3.0%
                            test-suite :: MultiSource/Benchmarks/Olden/health/health.test    3946.00    4016.00  1.8%
                     test-suite :: External/SPEC/CINT2017speed/625.x264_s/625.x264_s.test  513180.00  513550.00  0.1%
                      test-suite :: External/SPEC/CINT2017rate/525.x264_r/525.x264_r.test  513180.00  513550.00  0.1%
                        test-suite :: External/SPEC/CINT2017rate/502.gcc_r/502.gcc_r.test 7672198.00 7672202.00  0.0%
                       test-suite :: External/SPEC/CINT2017speed/602.gcc_s/602.gcc_s.test 7672198.00 7672202.00  0.0%
                       test-suite :: External/SPEC/CFP2017rate/508.namd_r/508.namd_r.test  746060.00  746044.00 -0.0%
                 test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test 9497716.00 9497364.00 -0.0%
                           test-suite :: External/SPEC/CFP2006/453.povray/453.povray.test  948266.00  948214.00 -0.0%
                               test-suite :: External/SPEC/CFP2006/433.milc/433.milc.test   89874.00   89862.00 -0.0%
                            test-suite :: MultiSource/Benchmarks/7zip/7zip-benchmark.test  835492.00  835346.00 -0.0%
                test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/miniFE/miniFE.test   66230.00   66202.00 -0.0%
                   test-suite :: External/SPEC/CFP2017rate/511.povray_r/511.povray_r.test  946090.00  944206.00 -0.2%
                test-suite :: External/SPEC/CFP2017speed/638.imagick_s/638.imagick_s.test 1136404.00 1131854.00 -0.4%
                 test-suite :: External/SPEC/CFP2017rate/538.imagick_r/538.imagick_r.test 1136404.00 1131854.00 -0.4%

gcc-c-torture/execute/GCC-C-execute-pr28982b - better vector code
Olden/health - extra vector code
CINT2017speed/625.x264_s
CINT2017rate/525.x264_r - small variation + improvements in reordering, @pixel_hadamard_ac stopped
being vectorized because of some non-effective shuffle recognition by
the compiler
CINT2017rate/502.gcc_r
CINT2017speed/602.gcc_s - small variations
CFP2017rate/508.namd_r - small variations
CFP2017rate/526.blender_r - small variations
CFP2006/453.povray - extra vector code
Benchmarks/7zip - extra vector code
DOE-ProxyApps-C++/miniFE - small variations
CFP2017rate/511.povray_r - extra vector code
CFP2017speed/638.imagick_s
CFP2017rate/538.imagick_r - extra vector code

Reviewers: RKSimon, hiraditya

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/126771
2025-02-21 07:15:02 -05:00
Aleksandr Popov
41437a6067
[LoopSimplifyCFG] Fix SCEV invalidation after removing dead exit (#127536)
Fixes #127534
2025-02-21 12:26:39 +01:00
vporpo
4d92975b5c
[SandboxVec][Scheduler] Don't allow rescheduling of already scheduled (#128050)
This patch implements the check for not allowing re-scheduling of
instructions that have already been scheduled in a scheduling bundle.
Rescheduling should only happen if the instructions were temporarily
scheduled in singleton bundles during a previous call to
`trySchedule()`.
2025-02-20 16:16:34 -08:00
Vasileios Porpodas
2ff80d2448 [SandboxVec][Scheduler] Fix reassignment of SchedBundle to DGNode
When assigning a bundle to a DAG Node that is already assigned to a
SchedBundle we need to remove the node from the old bundle.
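
A toy Python model of the invariant this fix restores: a node belongs to at most one bundle, so reassignment detaches it from the old bundle first. The classes below only borrow the names from this message; their fields and the `set_bundle` method are hypothetical and not the Sandbox Vectorizer API.

```python
class SchedBundle:
    def __init__(self, name):
        self.name = name
        self.nodes = []  # nodes currently owned by this bundle

class DGNode:
    def __init__(self, name):
        self.name = name
        self.bundle = None  # the bundle this node is assigned to, if any

    def set_bundle(self, new_bundle):
        # Detach from the previous bundle first so that no stale
        # membership is left behind in the old bundle's node list.
        if self.bundle is not None:
            self.bundle.nodes.remove(self)
        self.bundle = new_bundle
        new_bundle.nodes.append(self)
```

Without the removal step, the old bundle would still list a node that now points at a different bundle.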
2025-02-20 15:28:16 -08:00
vporpo
10b99e97ff
[SandboxVec][BottomUpVec] Separate vectorization decisions from code generation (#127727)
Up until now the generation of vector instructions was taking place
during the top-down post-order traversal of vectorizeRec(). The issue
with this approach is that the vector instructions emitted during the
traversal can be reordered by the scheduler, making it challenging to
place them without breaking the def-before-uses rule.

With this patch we separate the vectorization decisions (done in
`vectorizeRec()`) from the code generation phase (`emitVectors()`). The
vectorization decisions are stored in the `Actions` vector and are used
by `emitVectors()` to drive code generation.
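
The split can be pictured as "record first, emit later". The Python sketch below illustrates that pattern only: the `Action` fields, the "widen"/"pack" kinds and the pseudo-instruction format are hypothetical and are not taken from the Sandbox Vectorizer sources.

```python
from dataclasses import dataclass, field

@dataclass
class Action:
    kind: str                 # "widen" or "pack"; hypothetical action kinds
    bundle: tuple             # scalar instructions covered by this action
    operands: list = field(default_factory=list)  # indices of earlier actions

def vectorize_rec(bundle, operands_of, actions):
    """Record the decision for `bundle` after its operands (post-order)."""
    operand_ids = [vectorize_rec(op, operands_of, actions)
                   for op in operands_of.get(bundle, [])]
    # Bundles with known operands are widened; leaves are packed.
    kind = "widen" if bundle in operands_of else "pack"
    actions.append(Action(kind, bundle, operand_ids))
    return len(actions) - 1

def emit_vectors(actions):
    """Emit one pseudo-instruction per recorded action, in recorded order."""
    lines = []
    for idx, act in enumerate(actions):
        args = ", ".join(f"%v{i}" for i in act.operands)
        lines.append(f"%v{idx} = {act.kind}({args}) ; {' '.join(act.bundle)}")
    return lines
```

Because no code is produced while decisions are being made, nothing emitted can be moved by scheduling that happens during the traversal; in this sketch every action only refers to actions recorded before it.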
2025-02-20 10:21:25 -08:00
Simon Pilgrim
2fab6db728
[VectorCombine] foldSelectShuffle - remove extra adds of old shuffles to worklist (#127999)
We already push the old shuffles to the worklist as part of the replaceValue calls, so we shouldn't need to add them to the deferred list as well - my guess is this was to ensure that the instructions got erased first to help cleanup unused instructions, but eraseInstruction should handle this now.
2025-02-20 18:02:34 +00:00
Kazu Hirata
4a8f414565
[Utils] Avoid repeated hash lookups (NFC) (#127959)
2025-02-20 08:56:56 -08:00
Kazu Hirata
506b31ec36
[IPO] Avoid repeated hash lookups (NFC) (#127957)
2025-02-20 08:55:52 -08:00
Florian Hahn
404af37175 [VPlan] Remove stale assertion in HCFG builder.
The assertion was left over from a time when VPBBs still had an
associated condition bit. This is not the case any more (comment was
stale). In case a branch on condition is needed, a BranchOnCond
VPInstruction is added when constructing recipes. That's also where it
is checked if the condition is available.

Exposed by 38376dee9.
2025-02-20 17:01:49 +01:00
Yingwei Zheng
1b78ff6972
[InstCombine] Simplify the pointer operand of store if writing to null is UB (#127979)
Proof: https://alive2.llvm.org/ce/z/mzVj-u
I will add some follow-up patches to avoid duplicate code, support more
memory instructions, and bypass gep instructions.
2025-02-20 23:53:45 +08:00
Kazu Hirata
2130b9cea4
[Coroutines] Avoid repeated hash lookups (NFC) (#127956)
2025-02-19 23:29:46 -08:00
Kazu Hirata
6342095bce [memprof] Fix a warning
This patch fixes:

  llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:3409:8:
  error: unused variable 'I' [-Werror,-Wunused-variable]
2025-02-19 14:28:13 -08:00
Teresa Johnson
92b07520bc
[MemProf] Support cloning through recursive cycles (#127429)
In order to facilitate cloning of recursive cycles, we first identify
backedges using a standard DFS search from the root callers, then
initially defer recursively invoking the cloning function via those
edges. This is because the cloning opportunity along the backedge may
not be exposed until the current node is cloned for other non-backedge
callers that are cold after the earlier recursive cloning, resulting
in a cold predecessor of the backedge. So we recursively invoke the
cloning function for the backedges during the cloning of the current
node for its caller edges (which were sorted to enable handling cold
callers first).

There was no significant time or memory overhead measured for several
large applications.
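
For reference, the "standard DFS search" for backedges can be sketched as follows: an edge is a backedge when it reaches a node that is still on the DFS stack. This is a generic Python illustration, not the MemProf code; the `callees` adjacency map and the caller-to-callee edge direction are assumptions of the sketch.

```python
def find_backedges(callees, roots):
    """Return the set of (src, dst) edges that close a cycle in a DFS."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    color = {}
    backedges = set()

    def visit(node):
        color[node] = GRAY
        for succ in callees.get(node, ()):
            state = color.get(succ, WHITE)
            if state == GRAY:
                # succ is an ancestor on the current DFS path.
                backedges.add((node, succ))
            elif state == WHITE:
                visit(succ)
        color[node] = BLACK

    for root in roots:
        if color.get(root, WHITE) == WHITE:
            visit(root)
    return backedges
```

For example, with `main -> A -> B -> A` the only backedge is `(B, A)`, and a diamond-shaped graph without recursion has none.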
2025-02-19 12:44:33 -08:00
Craig Topper
1761066fc6
[GlobalOpt] Remove Function* argument from tryWidenGlobalArrayAndDests. NFC (#127848)
This is only used to get the Module and the LLVMContext. We can get both
of those from the GlobalVariable*.
2025-02-19 12:37:54 -08:00
Björn Pettersson
c833746c6c
[DSE] Make iter order deterministic in removePartiallyOverlappedStores. NFC (#127678)
In removePartiallyOverlappedStores we iterate over
InstOverlapIntervalsTy which is a DenseMap. Change that map into using
MapVector to ensure that we apply the transforms in a deterministic
order. So far I've only seen the order matter when names are given to
the instructions created by the transforms, but such nondeterminism is
a bit annoying when debugging etc.
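
To illustrate why this container gives a stable order: a MapVector pairs a key-to-index map with a vector of entries, so iteration follows insertion order rather than hash or pointer values. The class below is a toy Python model of that idea, not LLVM's implementation.

```python
class MapVector:
    """Toy insertion-ordered map: a key->index map plus a vector of entries."""

    def __init__(self):
        self._index = {}    # key -> position in self._entries
        self._entries = []  # (key, value) pairs in insertion order

    def __setitem__(self, key, value):
        pos = self._index.get(key)
        if pos is None:
            # New key: append, so it iterates after all earlier keys.
            self._index[key] = len(self._entries)
            self._entries.append((key, value))
        else:
            # Existing key: update in place, keeping its position.
            self._entries[pos] = (key, value)

    def __getitem__(self, key):
        return self._entries[self._index[key]][1]

    def __iter__(self):
        # Iterate (key, value) pairs in insertion order.
        return iter(self._entries)
```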
2025-02-19 21:24:49 +01:00
Craig Topper
2bf473bd54
[GlobalOpt] Don't query TTI on a llvm.memcpy declaration. (#127760)
Querying TTI creates a Subtarget object, but an llvm.memcpy declaration
doesn't have target-cpu and target-feature attributes like functions
with definitions. This can cause a warning to be printed on RISC-V
because the target-abi in the Module requires floating point, but the
subtarget features don't enable floating point. So far we've only seen
this in LTO when an -mcpu is not supplied for the TargetMachine.

To fix this, get TTI for the calling function instead.

Fixes the issue reported here
https://github.com/llvm/llvm-project/issues/69780#issuecomment-2665273161
2025-02-19 10:17:07 -08:00
Florian Hahn
a96444af44 [VPlan] Remove dead exit block handling code in HCFGBuilder.
The mapping of IR ExitBB to a VPBB isn't used. It also sets an incorrect
VPBB for the ExitBB; the region's successor is the middle block, not the
exit block.

It also unnecessarily triggers an assertion after 38376dee922.
2025-02-19 18:51:45 +01:00
Andreas Jonson
aa847ced07
[InstCombine] handle trunc to i1 in foldSelectICmpAndBinOp (#127390)
For `trunc nuw` this saves an instruction; otherwise it only produces other
instructions without the select, the same behavior as for the bit test before.

proof: https://alive2.llvm.org/ce/z/a6QmyV
2025-02-19 18:29:47 +01:00
Andreas Jonson
8fc03e4ff1
[InstCombine] avoid extra instructions in foldSelectICmpAnd (#127398)
Disable fold when it will result in more instructions.
2025-02-19 18:09:24 +01:00
Nico Weber
e2ba1b6ffd Revert "Reapply [CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880)"
This reverts commit 0fab404ee874bc5b0c442d1841c7d2005c3f8729.
Seems to break LTO builds of clang on Windows, see comments on
https://github.com/llvm/llvm-project/pull/125880
2025-02-19 11:32:57 -05:00
Yingwei Zheng
b2659ca44b
[InstCombine] Propagate flags in foldSelectICmpAndBinOp (#127437)
It is always safe to add poison-generating flags for `BinOp Y,
Identity`.
Proof: https://alive2.llvm.org/ce/z/8BLEpq
and https://alive2.llvm.org/ce/z/584Bb4

Then we can propagate flags from one of the arms:
```
select Cond, Y, (BinOp flags Y, Z) ->
select Cond, (BinOp flags Y, Identity), (BinOp flags Y, Z) ->
BinOp flags Y, (select Cond, Identity, Z)
```
This patch is proposed to avoid information loss caused by
https://github.com/llvm/llvm-project/pull/127390.
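
The value-level part of the rewrite above can be checked exhaustively on a small bit width. The Python sketch below does that for 4-bit integers. It models values only, not poison-generating flags (the Alive2 links above cover those), and the set of operations is chosen for illustration rather than taken from the InstCombine source.

```python
import itertools

BITS = 4
MASK = (1 << BITS) - 1

# (operation, right-hand identity) pairs on BITS-wide integers.
BINOPS = {
    "add": (lambda y, z: (y + z) & MASK, 0),
    "sub": (lambda y, z: (y - z) & MASK, 0),
    "or":  (lambda y, z: y | z, 0),
    "xor": (lambda y, z: y ^ z, 0),
    "and": (lambda y, z: y & z, MASK),
    "mul": (lambda y, z: (y * z) & MASK, 1),
}

def select(cond, a, b):
    return a if cond else b

def original(op, cond, y, z):
    # select Cond, Y, (BinOp Y, Z)
    fn, _ = BINOPS[op]
    return select(cond, y, fn(y, z))

def folded(op, cond, y, z):
    # BinOp Y, (select Cond, Identity, Z)
    fn, identity = BINOPS[op]
    return fn(y, select(cond, identity, z))

def fold_is_value_preserving(op):
    """Compare both forms for every condition and every pair of values."""
    values = range(MASK + 1)
    return all(
        original(op, c, y, z) == folded(op, c, y, z)
        for c, y, z in itertools.product((False, True), values, values)
    )
```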
2025-02-19 09:22:15 +08:00
vporpo
0cc7381543
[SandboxVec][Scheduler] Don't insert scheduled instrs into the ready list (#127688)
In a particular scenario (see test) we used to insert scheduled
instructions into the ready list. This patch fixes this by fixing the
trimSchedule() function.
2025-02-18 16:17:46 -08:00
vporpo
0f6c18e8c6
[SandboxVec] Replace hard-coded context save() with transaction-save pass (#127690)
This patch implements a small region pass that saves the context's
state. The pass is now used in the default pipeline to save the context
state instead of the hard-coded call to Context::save().

The concept behind this is that the passes themselves should not have to
do the actual saving/restoring of the IR state, because that would make
it challenging to reorder them in the pipeline. Having separate
save/restore passes makes the transformation passes more composable as
parts of arbitrary pipelines.
2025-02-18 13:34:51 -08:00
vporpo
5ecce45ea2
[SandboxVec] Move seed collection into its own separate pass (#127132)
This patch moves the seed collection logic from the BottomUpVec pass
into a new Sandbox IR Function pass. The new "seed-collection" pass
collects the seeds, builds a region and runs the region pass pipeline.
2025-02-18 11:11:07 -08:00
vporpo
426148b269
[SandboxVec][DAG] Implement DAG maintainance on Instruction removal (#127361)
This patch implements dependency maintenance upon receiving the
notification that an instruction gets deleted.
2025-02-18 10:59:31 -08:00
Kazu Hirata
5d4eb08379
[Analysis] Remove skipSCC (#127412)
The last use was removed in:

  commit fa6ea7a419f37befbed04368bcb8af4c718facbb
  Author: Arthur Eubanks <aeubanks@google.com>
  Date:   Mon Mar 20 11:18:35 2023 -0700
2025-02-18 09:59:12 -08:00
Björn Pettersson
74016728e3
[DSE] Update dereferenceable attributes when adjusting memintrinsic ptr (#125073)
Consider IR like this:
  call void @llvm.memset.p0.i64(ptr dereferenceable(28) %p, i8 0, i64 28, i1 false)
  store i32 1, ptr %p

In the past it has been optimized like this:
  %p2 = getelementptr inbounds i8, ptr %p, i64 4
  call void @llvm.memset.p0.i64(ptr dereferenceable(28) %p2, i8 0, i64 24, i1 false)
  store i32 1, ptr %p

As the input IR doesn't guarantee that it is OK to deref 28 bytes
starting at the adjusted pointer %p2 the transformation has been a bit
flawed.

With this patch we make sure to drop any
dereferenceable/dereferenceable_or_null attributes when doing such
transforms. An alternative would have been to adjust the amount of
dereferenceable bytes, but since a memset with a constant length already
implies dereferenceability by itself it is simpler to just drop the
attributes.

The new filtering of attributes is done using a helper that only keeps
attributes that we explicitly handle. For the adjusted mem intrinsic
pointers that currently involves "NonNull", "NoUndef" and "Alignment"
(when the alignment is known to be fulfilled also after offsetting the
pointer).
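
The allow-list idea can be sketched in a few lines of Python. This is a toy model, not the DSE helper: the function name, the dictionary representation of attributes and the "offset is a multiple of the alignment" test are assumptions made for illustration.

```python
def filter_adjusted_ptr_attrs(attrs, offset):
    """Keep only explicitly handled attributes for a pointer moved by `offset`.

    attrs maps an attribute name to its value (True for flag-like ones).
    """
    kept = {}
    for name, value in attrs.items():
        if name in ("nonnull", "noundef"):
            kept[name] = value
        elif name == "align" and offset % value == 0:
            # Assumption: the alignment still holds when the byte offset
            # is a multiple of it.
            kept[name] = value
        # Everything else, including dereferenceable and
        # dereferenceable_or_null, is dropped.
    return kept
```

With the memset example above (offset 4), `dereferenceable(28)` is dropped, while `nonnull` and an `align 4` would be kept.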

Fixes #115976
2025-02-18 17:51:14 +01:00