1178 Commits

Author SHA1 Message Date
Joseph Huber
4c9b7ff04c
[LLVM] Introduce 'llvm-offload-wrapper' tool (#153504)
Summary:
This is a standalone tool that does the wrapper stage of the
`clang-linker-wrapper`. We want this to be an external tool because
currently there's no easy way to split apart what the
clang-linker-wrapper is doing under the hood. With this tool, users can
manually extract files with `clang-offload-packager`, feed them through
`clang --target=<triple>` and then use this tool to generate a `.bc`
file they can give to the linker. The goal here is to make reproducing
the linker wrapper steps easier.
2025-08-19 11:05:48 -05:00
Nikita Popov
35bad229c1
[PredicateInfo] Use bitcast instead of ssa.copy (#151174)
PredicateInfo needs some no-op to which the predicate can be attached.
Currently this is an ssa.copy intrinsic. This PR replaces it with a
no-op bitcast.
    
Using a bitcast is more efficient because we don't have the overhead of
an overloaded intrinsic. It also makes things slightly simpler overall.
2025-08-11 09:25:01 +02:00
Nathan Gauër
d64371b819
[tools] Cleanup spirv-sim (#151705)
spirv-sim was supposed to be used to test cross-lane interactions. This
utility was in the end never used for testing, and as we move to proper
end-to-end testing through the llvm/offload-test-suite project, this
becomes obsolete.

Cleaning this up.
2025-08-04 10:42:33 +02:00
Joel E. Denny
37e03b56b8
Revert "[PGO] Add llvm.loop.estimated_trip_count metadata" (#151585)
Reverts llvm/llvm-project#148758

[As
requested.](https://github.com/llvm/llvm-project/pull/148758#pullrequestreview-3076627201)
2025-07-31 15:56:31 -04:00
Joel E. Denny
f7b65011de
[PGO] Add llvm.loop.estimated_trip_count metadata (#148758)
This patch implements the `llvm.loop.estimated_trip_count` metadata
discussed in [[RFC] Fix Loop Transformations to Preserve Block
Frequencies](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785).
As [suggested in the RFC
comments](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785/4),
it adds the new metadata to all loops at the time of profile ingestion
and estimates each trip count from the loop's `branch_weights` metadata.
As [suggested in the PR #128785
review](https://github.com/llvm/llvm-project/pull/128785#discussion_r2151091036),
it does so via a new `PGOEstimateTripCountsPass` pass, which creates the
new metadata for each loop but omits the value if it cannot estimate a
trip count due to the loop's form.

An important observation not previously discussed is that
`PGOEstimateTripCountsPass` *often* cannot estimate a loop's trip count,
but later passes can sometimes transform the loop in a way that makes it
possible. Currently, such passes do not necessarily update the metadata,
but eventually that should be fixed. Until then, if the new metadata has
no value, `llvm::getLoopEstimatedTripCount` disregards it and tries
again to estimate the trip count from the loop's current
`branch_weights` metadata.
2025-07-31 12:28:25 -04:00
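The estimate-then-fall-back behavior described in the commit above can be sketched in a few lines. This is a minimal illustration, not LLVM's implementation: the names `estimateTripCount` and `getEstimatedTripCount` are hypothetical, the two weights are assumed to be the 32-bit `branch_weights` on the loop's latch branch, and the rounding rule is an assumption.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper, not LLVM's API: estimate a trip count from the two
// weights on a loop's latch branch. Returns no value when the exit weight is
// zero, modeling a loop whose form does not allow an estimate.
std::optional<uint64_t> estimateTripCount(uint32_t BackedgeWeight,
                                          uint32_t ExitWeight) {
  if (ExitWeight == 0)
    return std::nullopt;
  // Backedges taken per loop entry, rounded to nearest, in 64-bit arithmetic.
  uint64_t Backedges =
      (uint64_t{BackedgeWeight} + ExitWeight / 2) / ExitWeight;
  // The body runs once more than the backedge is taken.
  return Backedges + 1;
}

// Hypothetical lookup: prefer a stored metadata value; if the metadata has no
// value, try again from the loop's current branch weights.
std::optional<uint64_t> getEstimatedTripCount(std::optional<uint64_t> Stored,
                                              uint32_t BackedgeWeight,
                                              uint32_t ExitWeight) {
  if (Stored)
    return Stored;
  return estimateTripCount(BackedgeWeight, ExitWeight);
}
```

With weights 99 (backedge) and 1 (exit), the sketch yields 100; with an exit weight of 0 it yields no value, which is the case the fallback path exists for.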
Madhur Amilkanthwar
2320cddfc2
Reapply "[GVN] memoryssa implies no-memdep (#149473)" (#149767)
Enabling one of MemorySSA or MD implies the other is off.

Already approved in https://github.com/llvm/llvm-project/pull/149473 but
I had to revert as I missed updating one test.
2025-07-21 14:05:29 +05:30
Cristian Assaiante
81eb7defa2
[OptBisect][IR] Adding a new OptPassGate for disabling passes via name (#145059)
This commit adds a new pass gate that allows selective disabling
of one or more passes via the clang command line using the
`-opt-disable` option. Passes to be disabled should be specified as a
comma-separated list of their names.
The implementation resides in the same file as the bisection tool. The
`getGlobalPassGate()` function returns the currently enabled gate.

Example: `-opt-disable="PassA,PassB"`

Pass names are matched using case-insensitive comparisons. However, note
that special characters, including spaces, must be included exactly as
they appear in the pass names.

Additionally, a `-opt-disable-enable-verbosity` flag has been introduced to
enable verbose output when this functionality is in use. When enabled,
it prints the status of all passes (either running or NOT running),
similar to the default behavior of `-opt-bisect-limit`. This flag is
disabled by default, which is the opposite of the `-opt-bisect-verbose`
flag (which defaults to enabled).

To validate this functionality, a test file has also been provided. It reuses
the same infrastructure as the opt-bisect test, but disables three
specific passes and checks the output to ensure the expected behavior.

---------

Co-authored-by: Nikita Popov <github@npopov.com>
2025-07-16 16:51:58 -07:00
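The matching rule described in the commit above (comma-separated names, case-insensitive comparison, spaces and special characters significant) can be illustrated with a small standalone sketch. The function names below are hypothetical and are not the ones used in LLVM's pass gate.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: split a comma-separated list without trimming,
// because spaces are significant in pass names.
std::vector<std::string> splitPassList(const std::string &List) {
  std::vector<std::string> Names;
  std::string::size_type Start = 0;
  while (Start <= List.size()) {
    std::string::size_type Comma = List.find(',', Start);
    if (Comma == std::string::npos)
      Comma = List.size();
    if (Comma > Start) // skip empty entries
      Names.push_back(List.substr(Start, Comma - Start));
    Start = Comma + 1;
  }
  return Names;
}

// Case-insensitive equality; every other character must match exactly.
bool equalsIgnoreCase(const std::string &A, const std::string &B) {
  return A.size() == B.size() &&
         std::equal(A.begin(), A.end(), B.begin(),
                    [](unsigned char X, unsigned char Y) {
                      return std::tolower(X) == std::tolower(Y);
                    });
}

// Hypothetical gate: a pass runs unless its name is in the disabled list.
bool shouldRunPass(const std::string &PassName,
                   const std::vector<std::string> &Disabled) {
  return std::none_of(Disabled.begin(), Disabled.end(),
                      [&](const std::string &Name) {
                        return equalsIgnoreCase(Name, PassName);
                      });
}
```

Under these assumptions, a list of `PassA,PassB` disables a pass named `passa` but not one named `Pass A`, since the space is part of the name.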
Tcc100
7daa1defd2
Reland "[CodeGen] Expose the extensibility of PassConfig to plugins (#139059)"
Add missing dependencies to unittest target
Original patch broke BUILD_SHARED bots and required revert #147947
2025-07-10 15:26:48 +02:00
Jan Patrick Lehr
0481d2a161
Revert "[CodeGen] Expose the extensibility of PassConfig to plugins" (#147947)
Reverts llvm/llvm-project#139059

This broke
https://lab.llvm.org/buildbot/#/builders/10/builds/9125/steps/8/logs/stdio

The bot does a SHARED_LIBS=ON build. I can reproduce locally with the
CMake cache file in offload/cmake/caches/AMDGPUBot.cmake as the build
config.
2025-07-10 14:00:55 +02:00
Tcc100
56a8655f4a
[CodeGen] Expose the extensibility of PassConfig to plugins (#139059)
This PR exposes the backend pass config to plugins via a callback.
Plugin authors can register a callback that is triggered before
the target backend adds its passes to the pipeline. In the callback
they then get access to the `TargetMachine`, the `PassManager`, and the
`TargetPassConfig`. This allows plugins to call
`TargetPassConfig::insertPass`, which is honored in the subsequent
`addPass` of the main backend. We implemented this using the legacy pass
manager since backends still use it as the default.
2025-07-10 12:43:09 +02:00
Nikita Popov
102c22cb2c
[FatLTO] Disable analysis verification in pipeline test (NFC)
To fix the test failure with expensive checks reported at:
https://github.com/llvm/llvm-project/pull/146048#issuecomment-3022421122
2025-07-01 10:47:23 +02:00
Nikita Popov
3a7d60860d
[FatLTO] Relax checks for fatlto pipeline test
EmbedBitcodePass now reports that it modified the IR, so there
are more analysis invalidations in between. Convert CHECK-NEXT
to CHECK.
2025-06-30 12:59:25 +02:00
Nikita Popov
d7a3bdffb9
[PassBuilder][FatLTO] Expose FatLTO pipeline via pipeline string (#146048)
Expose the FatLTO pipeline via `-passes="fatlto-pre-link<Ox>"`, similar
to all the other optimization pipelines. This is to allow reproducing it
outside clang. (Possibly also useful for C API users.)
2025-06-30 12:04:42 +02:00
Nikita Popov
7f223d121d
[PassBuilder] Treat pipeline aliases as normal passes (#146038)
Pipelines like `-passes="default<O3>"` are currently parsed in a special
way. Switch them to work like normal, parameterized module passes.
2025-06-27 12:07:09 +02:00
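Both `default<O3>` and `fatlto-pre-link<Ox>` in the two commits above follow the same `name<params>` shape as other parameterized passes. As a rough illustration of that shape only, here is a standalone sketch of splitting such an element; `ParsedPass` and `parsePassElement` are hypothetical names, and this is not how `PassBuilder` is implemented.

```cpp
#include <optional>
#include <string>

// Hypothetical result type: the pass name plus its raw parameter text.
struct ParsedPass {
  std::string Name;
  std::string Params;
};

// Split "name<params>" into its two parts. A plain "name" has empty Params.
// Returns no value for malformed text such as "<O3>" or "default<O3".
std::optional<ParsedPass> parsePassElement(const std::string &Text) {
  const std::string::size_type Open = Text.find('<');
  if (Open == std::string::npos) {
    if (Text.empty())
      return std::nullopt;
    return ParsedPass{Text, ""};
  }
  // A '<' was found, so Text is non-empty and back() is safe.
  if (Open == 0 || Text.back() != '>')
    return std::nullopt;
  return ParsedPass{Text.substr(0, Open),
                    Text.substr(Open + 1, Text.size() - Open - 2)};
}
```

Treating a pipeline alias this way means `default<O3>` needs no special case: it is a name (`default`) with one parameter string (`O3`).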
Peter Collingbourne
3fa231f47c
Add SimplifyTypeTests pass.
This pass figures out whether inlining has exposed a constant address to
a lowered type test, and removes the test if so and the address is known
to pass the test. Unfortunately this pass ends up needing to reverse
engineer what LowerTypeTests did; this is currently inherent to the design
of ThinLTO importing where LowerTypeTests needs to run at the start.

Reviewers: teresajohnson

Reviewed By: teresajohnson

Pull Request: https://github.com/llvm/llvm-project/pull/141327
2025-06-05 11:09:20 -07:00
Tianle Liu
e038c5401c
[LTO][Pipelines] Add 0 hot-caller threshold for SamplePGO + FullLTO (#135152)
If a hot callsite function is not inlined in the 1st build, inlining the
hot callsite in the pre-link stage of the SPGO 2nd build may lead to the
Function Sample not being found in the profile file in the link stage, so
some profile info will be missed.
ThinLTO has already dealt with this by setting
HotCallSiteThreshold to 0 to stop the inlining. This patch adds the
same handling for FullLTO.
2025-04-14 11:21:08 +08:00
Shilei Tian
a45b133d40
[AMDGPU][Verifier] Mark calls to entry functions as invalid in the IR verifier (#134910)
2025-04-11 15:32:37 -04:00
Alan Zhao
50ea777e40
[opt][timers] Fix time-passes.ll test failing on reversed iterators (#131941)
After https://github.com/llvm/llvm-project/pull/131217 was submitted,
time-passes.ll fails because `opt` prints `-time-report` when
`ManagedTimerGlobals` is destroyed. `ManagedTimerGlobals` stores
`TimerGroup`s in an unordered map, so the ordering of the output
`TimerGroup`s depends on the underlying iterator.

To fix this, we do what Clang does and use
`llvm::TimerGroup::printAll(...)`, which *is* deterministic. This does
move the analysis section before the pass section for `-time-report`,
but again, this is also what Clang currently does.
2025-03-27 15:31:53 -07:00
Matt Arsenault
da3ee97632
StandardInstrumentation: Fix -ir-dump-directory with -print-before-pass-number (#130983)
2025-03-14 22:55:56 +07:00
Guy David
9820248e0a
AddressSanitizer: Add use-after-scope to pass options (#130924)
2025-03-12 17:17:51 +02:00
Mircea Trofin
11b1f154be
Optionally print !prof metadata inline (#130303)
Inspired by PR #127944, this patch adds an option to print profile metadata inline with the instruction (or function) it annotates. This saves one from having to search up and down large textual modules to find this info.
2025-03-07 12:22:13 -08:00
Vitaly Buka
3ccacc4e44
Revert "[LTO][Pipelines][Coro] De-duplicate Coro passes" (#129977)
Reverts llvm/llvm-project#128654

Breaks FatLTO
https://github.com/llvm/llvm-project/pull/128654#issuecomment-2700053700
2025-03-06 07:57:30 -08:00
Peter Collingbourne
0ebf7b473a
IR, CodeGen: Add command line flags for dumping instruction addresses and debug locations.
As previously discussed [1], it is sometimes useful to be able to see
instruction addresses and debug locations as part of IR dumps. The
same applies to MachineInstrs which already dump debug locations but
not addresses. Therefore add some flags that can be used to enable
dumping of this information.

[1] https://discourse.llvm.org/t/small-improvement-to-llvm-debugging-experience/79914

Reviewers: rnk

Reviewed By: rnk

Pull Request: https://github.com/llvm/llvm-project/pull/127944
2025-02-27 15:45:55 -08:00
Vitaly Buka
31897e651a
[LTO][Pipelines][Coro] De-duplicate Coro passes (#128654)
```
if (!isLTOPostLink(Phase))
    CoroPM.addPass(CoroEarlyPass());
if (!isLTOPreLink(Phase))
    // Other Coro passes
```

Followup to #126168.
2025-02-25 20:14:19 -08:00
Arthur Eubanks
ab098a7ebf
[CGSCC] Add statistic on largest SCC visited (#128073)
To help debugging long compile times.
2025-02-21 09:13:11 -08:00
Nikita Popov
d8b2e432d6
[IR] Remove mul constant expression (#127046)
Remove support for the mul constant expression, which has previously
already been marked as undesirable. This removes the APIs to create mul
expressions and updates tests to stop using mul expressions.

Part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179
2025-02-14 09:28:57 +01:00
Vitaly Buka
1032df6f60
[LTO][Pipelines][Coro] Handle coroutines in LTO pipeline (#126168)
ThinLTO delays handling of coroutines to ThinLTO backend.
However it's usually possible to use ThinLTO prelink objects for FullLTO.

In this case we have left-over coroutines which crash in codegen.

Issue #104525.
2025-02-12 21:39:32 -08:00
Axel Sorenson
d3161defd6
[PassBuilder] VectorizerEnd Extension Points (#123494)
Added an extension point after vectorizer passes in the PassBuilder.
Additionally, added extension points before and after vectorizer passes
in `buildLTODefaultPipeline`. Credit goes to @mshockwave for guiding me
through my first LLVM contribution (and my first open source
contribution in general!) :)
- Implemented `registerVectorizerEndEPCallback`
- Implemented `invokeVectorizerEndEPCallbacks`
- Added `VectorizerEndEPCallbacks` SmallVector
- Added a command line option `passes-ep-vectorizer-end` to
`NewPMDriver.cpp`
- `buildModuleOptimizationPipeline` now calls
`invokeVectorizerEndEPCallbacks`
- `buildO0DefaultPipeline` now calls `invokeVectorizerEndEPCallbacks`
- `buildLTODefaultPipeline` now calls BOTH
`invokeVectorizerStartEPCallbacks` and `invokeVectorizerEndEPCallbacks`
- Added LIT tests to `new-pm-defaults.ll`, `new-pm-lto-defaults.ll`,
`new-pm-O0-ep-callbacks.ll`, and `pass-pipeline-parsing.ll`
- Renamed `CHECK-EP-Peephole` to `CHECK-EP-PEEPHOLE` in
`new-pm-lto-defaults.ll` for consistency.

This code is intended for developers that wish to implement and run
custom passes after the vectorizer passes in the PassBuilder pipeline.
For example, in #91796, a pass was created that changed the induction
variables of vectorized code. This is right after the vectorization
passes.
2025-01-29 11:24:03 -08:00
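The register/invoke pattern listed in the commit above can be modeled without any LLVM headers. In the sketch below, `MiniPassBuilder` and `Pipeline` are hypothetical stand-ins, the callback signature is simplified, and the surrounding pass order is illustrative only; just the callback and container names are taken from the commit message.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a pass manager: just an ordered list of pass names.
using Pipeline = std::vector<std::string>;

// Hypothetical, heavily simplified model of an extension point.
class MiniPassBuilder {
  std::vector<std::function<void(Pipeline &)>> VectorizerEndEPCallbacks;

public:
  // Plugins register a callback that may append passes to the pipeline.
  void registerVectorizerEndEPCallback(std::function<void(Pipeline &)> CB) {
    VectorizerEndEPCallbacks.push_back(std::move(CB));
  }

  // Run every registered callback, in registration order.
  void invokeVectorizerEndEPCallbacks(Pipeline &P) const {
    for (const auto &CB : VectorizerEndEPCallbacks)
      CB(P);
  }

  // Illustrative pipeline: the extension point fires right after the
  // vectorizer passes and before whatever comes next.
  Pipeline buildPipeline() const {
    Pipeline P;
    P.push_back("loop-vectorize");
    P.push_back("slp-vectorizer");
    invokeVectorizerEndEPCallbacks(P);
    P.push_back("instcombine");
    return P;
  }
};
```

A callback that appends `"my-pass"` ends up third in the resulting pipeline, directly after the two vectorizer entries.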
Nikita Popov
29441e4f5f
[IR] Convert from nocapture to captures(none) (#123181)
This PR removes the old `nocapture` attribute, replacing it with the new
`captures` attribute introduced in #116990. This change is
intended to be essentially NFC, replacing existing uses of `nocapture`
with `captures(none)` without adding any new analysis capabilities.
Making use of non-`none` values is left for a followup.

Some notes:
* `nocapture` will be upgraded to `captures(none)` by the bitcode
   reader.
* `nocapture` will also be upgraded by the textual IR reader. This is to
   make it easier to use old IR files and somewhat reduce the test churn in
   this PR.
* Helper APIs like `doesNotCapture()` will check for `captures(none)`.
* MLIR import will convert `captures(none)` into an `llvm.nocapture`
   attribute. The representation in the LLVM IR dialect should be updated
   separately.
2025-01-29 16:56:47 +01:00
gulfemsavrun
38902153fe
[PassBuilder] Add RelLookupTableConverterPass to LTO (#124053)
This patch adds RelLookupTableConverterPass into the LTO
post-link optimization pass pipeline. This optimization
converts lookup tables to relative lookup tables to make
them PIC-friendly, which is already included in the non-LTO
pass pipeline. This patch adds this optimization to the
post-link optimization pipeline to discover more
opportunities in the LTO context.
2025-01-28 15:08:03 -08:00
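For readers unfamiliar with relative lookup tables, the idea behind the commit above can be shown with a hand-written analogue. A table of pointers needs one relocation per entry in position-independent code, whereas a table of 32-bit offsets from a base address does not. The names and data below are made up for illustration and do not show what the pass itself emits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// All strings live in one blob; the table stores 32-bit offsets into it
// instead of absolute pointers.
static const char ColorBlob[] = "red\0green\0blue";
static const int32_t ColorOffsets[] = {0, 4, 10};

// Look up an entry by adding its offset to the base address at run time.
const char *colorName(std::size_t Index) {
  assert(Index < sizeof(ColorOffsets) / sizeof(ColorOffsets[0]));
  return ColorBlob + ColorOffsets[Index];
}
```

The lookup costs one extra addition, in exchange for a table that holds no absolute addresses.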
Jon Roelofs
ec15b24250
[llvm][Support] Only enable backtrace test when it's enabled (#123852)
rdar://138554797
2025-01-22 10:37:56 -08:00
Akshay Deodhar
5b6a26ccdd
Add option to print entire function instead of just the loops for loo… (#123229)
print-after-all is useful for diffing IR between two passes. When one of
the two is a function pass, and the other is a loop pass, the diff
becomes useless. Add an option which prints the entire function for loop
passes.
2025-01-17 17:55:54 -08:00
Antonio Frighetto
7a0f75c738
Reapply "[GVN] MemorySSA for GVN: add optional AllowMemorySSA"
Original commit: eb63cd62a4a1907dbd58f12660efd8244e7d81e9

Previously reverted due to non-negligible compile-time impact in
stage1-ReleaseLTO-g scenario. The issue has been addressed by
always reusing previously computed MemorySSA results, and requesting
new ones only when `isMemorySSAEnabled` is set.

Co-authored-by: Momchil Velikov <momchil.velikov@arm.com>
2025-01-14 10:03:47 +01:00
Teresa Johnson
799955eb17
[ThinLTO] Skip opt pipeline and summary wrapper pass on empty modules (#120143)
Follow up to PR118508, to avoid unnecessary compile time for an empty
combined regular LTO module if all modules end up being ThinLTO only.

This required minor changes to a few tests to ensure they weren't empty.
2025-01-10 19:33:20 -08:00
Nikita Popov
c39500f88c
Revert "[GVN] MemorySSA for GVN: add optional AllowMemorySSA"
This reverts commit eb63cd62a4a1907dbd58f12660efd8244e7d81e9.

This changes the preservation behavior for MSSA when the new flag
is not enabled.
2025-01-10 12:57:00 +01:00
Momchil Velikov
eb63cd62a4
[GVN] MemorySSA for GVN: add optional AllowMemorySSA
Preparatory work to migrate from MemoryDependenceAnalysis
towards MemorySSA in GVN.

Co-authored-by: Antonio Frighetto <me@antoniofrighetto.com>
2025-01-10 10:43:12 +01:00
GrumpyPigSkin
f7ba2bdf86
[LLVM][SLSR] Add a debug counter (#119981)
Added debug counter and test for SLSR.

Fixes: https://github.com/llvm/llvm-project/issues/119770
2024-12-21 12:37:44 -05:00
Nikita Popov
10f315dc9c
[ConstantFolding] Infer getelementptr nuw flag (#119214)
Infer nuw from nusw and nneg. This is the constant expression variant of
https://github.com/llvm/llvm-project/pull/111144.

Proof: https://alive2.llvm.org/ce/z/ihztLy
2024-12-09 16:44:05 +01:00
Haopeng Liu
4d6e69143d
Add the initializes attribute inference (#117104)
reland https://github.com/llvm/llvm-project/pull/97373 after fixing
clang tests.

Confirmed with "ninja check-llvm" and "ninja check-clang"
2024-11-20 19:15:23 -08:00
Mikhail Goncharov
f77126c549
Revert "[FunctionAttrs] Add the "initializes" attribute inference (#97373)"
This reverts commit 661c593850715881d2805a59e90e6d87d8b9fbb8.

Multiple buildbot failures, e.g. https://lab.llvm.org/buildbot/#/builders/108/builds/6096
2024-11-19 10:29:36 +01:00
Haopeng Liu
661c593850
[FunctionAttrs] Add the "initializes" attribute inference (#97373)
Add the "initializes" attribute inference.

This change is expected to have ~0.09% compile time regression, which
seems acceptable for interprocedural DSE.

https://llvm-compile-time-tracker.com/compare.php?from=9f10252c4ad7cffbbcf692fa9c953698f82ac4f5&to=56345c1cee4375eb5c28b8e7abf4803d20216b3b&stat=instructions%3Au
2024-11-18 21:36:05 -08:00
Lei Wang
bc1aa2863b
[SampleFDO] Support enabling sample loader pass in O0 mode (#113985)
Add support for enabling sample loader pass in O0 mode (under
`-fsample-profile-use`). This can help verify PGO raw profile count
quality or provide a more accurate performance proxy (predictor), as O0
mode has minimal or no compiler optimizations that might otherwise
impact profile count accuracy.
- Explicitly disable the sample loader inlining to ensure it only emits
sampling annotation.
- Use flattened profile for O0 mode.
- Add the pass after `AddDiscriminatorsPass` pass to work with
`-fdebug-info-for-profiling`.
2024-11-08 15:29:44 -08:00
Lee Wei
1469d82e1c
Remove br i1 undef from some regression tests [NFC] (#115130)
As defined in LangRef, branching on `undef` is undefined behavior.
This PR aims to remove undefined behavior from tests, as UB tests break
Alive2 and may be the root cause of breaking future optimizations.

Here's an Alive2 proof for one of the examples:
https://alive2.llvm.org/ce/z/TncxhP
2024-11-07 08:11:15 +00:00
Yingwei Zheng
cacbe71af7
[Analysis] Avoid running transform passes that have just been run (#112092)
This patch adds a new analysis pass to track a set of passes and their
parameters to see if we can avoid running transform passes that have
just been run. The current implementation only skips redundant
InstCombine runs. I will add support for other passes in follow-up
patches.

RFC link:
https://discourse.llvm.org/t/rfc-pipeline-avoid-running-transform-passes-that-have-just-been-run/82467

Compile time improvement:
http://llvm-compile-time-tracker.com/compare.php?from=76007138f4ffd4e0f510d12b5e8cad529c21f24d&to=64134cf07ea7eb39c60320087c0c5afdc16c3a2b&stat=instructions%3Au
2024-11-07 07:52:14 +08:00
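The skipping idea in the commit above can be pictured as a small record of "passes that ran since the IR last changed". The sketch below is a toy model under that assumption: `RunTracker`, `shouldSkip`, and `noteRun` are hypothetical names, the key is taken to be the pass name together with its parameters, and the real analysis may track and invalidate state differently.

```cpp
#include <set>
#include <string>

// Hypothetical tracker: remembers which passes (keyed by name plus
// parameters) have run since the IR was last modified.
class RunTracker {
  std::set<std::string> RanSinceLastChange;

public:
  // A pass can be skipped if it already ran on the current, unchanged IR.
  bool shouldSkip(const std::string &PassKey) const {
    return RanSinceLastChange.count(PassKey) != 0;
  }

  // Record a run. If the pass changed the IR, earlier records are stale.
  // Assumption: a pass leaves the IR in a state where rerunning that same
  // pass immediately would do nothing.
  void noteRun(const std::string &PassKey, bool ChangedIR) {
    if (ChangedIR)
      RanSinceLastChange.clear();
    RanSinceLastChange.insert(PassKey);
  }
};
```

In this model, a second `instcombine` run is skipped if only non-modifying passes ran in between, and is no longer skipped once some other pass reports a change.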
Hari Limaye
fbd89bcc66
Reland "[LTO] Run Argument Promotion before IPSCCP" (#111853)
Run ArgumentPromotion before IPSCCP in the LTO pipeline, to expose more
constants to be propagated. We also run PostOrderFunctionAttrs to
improve the information available to ArgumentPromotion's alias analysis,
and SROA to clean up allocas.

Relands #111163.
2024-11-06 13:54:48 +00:00
Shubham Sandeep Rastogi
b8930cd13d
Add a pass to collect dropped variable statistics (#102233)
This patch is inspired by @Snowy1803's excellent work in Swift and the
patch: https://github.com/swiftlang/swift/pull/73334/files

Add an instrumentation pass to LLVM to collect dropped debug information
variable statistics for every Function-level and Module-level IR pass.

This patch adds the class DroppedVariableStats, which iterates
over every DbgRecord in a function or module before and after an
optimization pass and counts the number of variables whose debug
information has been dropped due to that pass, then prints that output
to stdout in a CSV format.

I ran this patch on optdriver.cpp and can see:

Pass Name, Dropped Variables
'InstCombinePass', 1
'SimplifyCFGPass', 6
'JumpThreadingPass', 25
2024-10-21 18:13:49 -07:00
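The before/after counting described in the commit above amounts to a set difference per pass. The following sketch models it with plain strings standing in for debug variables; `countDroppedVariables` and `formatCsvRow` are hypothetical names, and the real `DroppedVariableStats` class works on `DbgRecord`s rather than names.

```cpp
#include <set>
#include <string>

// Count variables that had debug info before a pass but not after it.
unsigned countDroppedVariables(const std::set<std::string> &Before,
                               const std::set<std::string> &After) {
  unsigned Dropped = 0;
  for (const std::string &Var : Before)
    if (After.count(Var) == 0)
      ++Dropped;
  return Dropped;
}

// Format one row in the same shape as the sample output above.
std::string formatCsvRow(const std::string &PassName, unsigned Dropped) {
  return "'" + PassName + "', " + std::to_string(Dropped);
}
```

Variables that appear only after the pass do not affect the count, since the statistic is about information that was lost.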
Hari Limaye
0a0f100a70
Revert "[LTO] Run Argument Promotion before IPSCCP" (#111839)
Reverts llvm/llvm-project#111163, as this was merged prematurely.
2024-10-10 15:03:01 +01:00
Hari Limaye
b9754e9d28
[LTO] Run Argument Promotion before IPSCCP (#111163)
Run ArgumentPromotion before IPSCCP in the LTO pipeline, to expose more
constants to be propagated. We also run PostOrderFunctionAttrs to
improve the information available to ArgumentPromotion's alias analysis,
and SROA to clean up allocas.
2024-10-10 06:08:27 -04:00
duk
0004fba079
[StandardInstrumentations] Ensure non-null module pointer when getting display name for IR file (#110779)
Fixes a crash when using `-filter-print-funcs` with
`-ir-dump-directory`. A quick reproducer on trunk (also included as a
test):

```ll
; opt -passes=no-op-function -print-after=no-op-function -filter-print-funcs=nope -ir-dump-directory=somewhere test.ll

define void @test() {
    ret void
}
```

[Compiler Explorer](https://godbolt.org/z/sPErz44h4)
2024-10-01 22:01:13 -07:00
Mircea Trofin
e64a1c00c1
Fix unintended extra commit in PR #107499
2024-09-09 18:26:44 -07:00