This avoid pulling in the entire Passes library with all passes as
dependencies when just referring to PassPlugin, which is in fact
independent of the Passes themselves.
Pull Request: https://github.com/llvm/llvm-project/pull/173279
This avoid pulling in the entire Passes library with all passes as
dependencies when just referring to PassPlugin, which is in fact
independent of the Passes themselves.
Pull Request: https://github.com/llvm/llvm-project/pull/172478
Functions listed in `llvm.global_ctors` or `llvm.global_dtors` aren't too interesting for performance. Passes like LowerTypeTests synthesize some of these, and, while we could add a "0" entry count explicitly, we can also just not bother verifying in the first place.
The libcall lowering decisions should be program dependent,
depending on the current module's RuntimeLibcallInfo. We need
another related analysis derived from that plus the current
function's subtarget to provide concrete lowering decisions.
This takes on a somewhat unusual form. It's a Module analysis,
with a lookup keyed on the subtarget. This is a separate module
analysis from RuntimeLibraryAnalysis to avoid that depending on
codegen. It's not a function pass to avoid depending on any
particular function, to avoid repeated subtarget map lookups in
most of the use passes, and to avoid any recomputation in the
common case of one subtarget (and keeps it reusable across
repeated compilations).
This also switches ExpandFp and PreISelIntrinsicLowering as
a sample function and module pass. Note this is not yet wired
up to SelectionDAG, which is still using the LibcallLoweringInfo
constructed inside of TargetLowering.
- Use namespace qualifiers to define variables declared in `llvm`
namespace.
- move file local `TimeTracerRAII` struct into anonymous namespace.
- Use explicit types in a few places.
- Convert the loop over `PassList` to a range for loop.
Some LLVM passes need access to the filesystem to read configuration
files and similar. In some places, this is achieved by grabbing the VFS
from `PGOOptions`, but some passes don't have access to these and resort
to just calling `vfs::getRealFileSystem()`. This PR allows setting the
VFS directly on `PassBuilder` that's able to pass it down to all passes
that need it.
This reverts commit 895cda70a95529fd22aac05eee7c34f7624996af.
And fix attempt: 06f671e57a574ba1c5127038eff8e8773273790e.
Performance regressions and broken sanitizers, see #142686.
This patch adds the flag -fexperimental-loop-fuse to the clang and flang
drivers. This is primarily useful for experiments as we envision to
enable the pass one day.
The options are based on the same principles and reason on which we have
`floop-interchange`.
---------
Co-authored-by: Madhur Amilkanthwar <madhura@nvidia.com>
Add option to `opt` to run the `ProfileInjectorPass` before the passes opt would run, and then `ProfileVerifierPass` after. This will then be a mode in which we run tests on a specialized buildbot, with the goal of finding passes that drop (and, later, corrupt) profile information.
This is needed in llvm-reduce to avoid perturbing the uselistorder in
intermediate steps. Really llvm-reduce wants pure serialization with
no dependency on the pass manager. There are other optimizations mixed
in to the serialization here depending on metadata in the module, which
is also bad.
Part of #63621
After https://github.com/llvm/llvm-project/pull/131217 was submitted,
time-passes.ll fails because `opt` prints `-time-report` when
`ManagedTimerGlobals` is destroyed. `ManagedTimerGlobals` stores
`TimerGroup`s in an unordered map, so the ordering of the output
`TimerGroup`s depends on the underlying iterator.
To fix this, we do what Clang does and use
`llvm::TimerGroup::printAll(...)`, which *is* deterministic. This is
also what Clang does. This does put move analysis section before the
pass section for `-time-report`, but again, this is also what Clang
currently does.
Added an extension point after vectorizer passes in the PassBuilder.
Additionally, added extension points before and after vectorizer passes
in `buildLTODefaultPipeline`. Credit goes to @mshockwave for guiding me
through my first LLVM contribution (and my first open source
contribution in general!) :)
- Implemented `registerVectorizerEndEPCallback`
- Implemented `invokeVectorizerEndEPCallbacks`
- Added `VectorizerEndEPCallbacks` SmallVector
- Added a command line option `passes-ep-vectorizer-end` to
`NewPMDriver.cpp`
- `buildModuleOptimizationPipeline` now calls
`invokeVectorizerEndEPCallbacks`
- `buildO0DefaultPipeline` now calls `invokeVectorizerEndEPCallbacks`
- `buildLTODefaultPipeline` now calls BOTH
`invokeVectorizerStartEPCallbacks` and `invokeVectorizerEndEPCallbacks`
- Added LIT tests to `new-pm-defaults.ll`, `new-pm-lto-defaults.ll`,
`new-pm-O0-ep-callbacks.ll`, and `pass-pipeline-parsing.ll`
- Renamed `CHECK-EP-Peephole` to `CHECK-EP-PEEPHOLE` in
`new-pm-lto-defaults.ll` for consistency.
This code is intended for developers that wish to implement and run
custom passes after the vectorizer passes in the PassBuilder pipeline.
For example, in #91796, a pass was created that changed the induction
variables of vectorized code. This is right after the vectorization
passes.
The early simplication pipeline is used in non-LTO and (Thin/Full)LTO
pre-link
stage. There are some passes that we want them in non-LTO mode, but not
at LTO
pre-link stage. The control is missing currently. This PR adds the
support. To
demonstrate the use, we only enable the internalization pass in non-LTO
mode for
AMDGPU because having it run in pre-link stage causes some issues.
The performance of cold functions shouldn't matter too much, so if we
care about binary sizes, add an option to mark cold functions as
optsize/minsize for binary size, or optnone for compile times [1]. Clang
patch will be in a future patch.
This is intended to replace `shouldOptimizeForSize(Function&, ...)`.
We've seen multiple cases where calls to this expensive function, if not
careful, can blow up compile times. I will clean up users of that
function in a followup patch.
Initial version: https://reviews.llvm.org/D149800
[1]
https://discourse.llvm.org/t/rfc-new-feature-proposal-de-optimizing-cold-functions-using-pgo-info/56388
In Bazel, Clang current separates the clang executable into a
clang-driver library, and the actual clang executable. This allows
downstream users to make their own variations of clang, without having
to redo/maintain separate build pipelines.
This adds the same for opt for both CMake and Bazel.
This restores commit b4a82b62258c5f650a1cccf5b179933e6bae4867, reverted
in 3ab7ef28eebf9019eb3d3c4efd7ebfd160106bb1 because it was thought to
cause a bot failure, which ended up being unrelated to this patch set.
Differential Revision: https://reviews.llvm.org/D154856
Previously the MemProf profile was expected to be in the same profile
file as a normal PGO profile, passed via the usual -fprofile-use=
option, and was matched in the same pass. To simplify profile
preparation, since the raw MemProf profile requires the binary for
symbolization and may be simpler to index separately from the raw PGO
profile, and also to enable providing a MemProf profile for a SamplePGO
build, separate out the MemProf feedback option and matching pass.
This patch adds the -fmemory-profile-use=${file} option, and the
provided file is passed down to LLVM and ultimately used in a new
MemProfUsePass which performs the matching of just the memory profile
contents of that file.
Note that a single profile file containing both normal PGO and MemProf
profile data is still supported, and the relevant profile data is
matched by the appropriate matching pass(es) based on which option(s)
the profile is provided with (the same profile file can be supplied to
both feedback options).
Differential Revision: https://reviews.llvm.org/D154856
Here's a high level summary of the changes in this patch. For more
information on rational, see the RFC.
(https://discourse.llvm.org/t/rfc-a-unified-lto-bitcode-frontend/61774).
- Add config parameter to LTO backend, specifying which LTO mode is
desired when using unified LTO.
- Add unified LTO flag to the summary index for efficiency. Unified
LTO modules can be detected without parsing the module.
- Make sure that the ModuleID is generated by incorporating more types
of symbols.
Differential Revision: https://reviews.llvm.org/D123803
Make the access to profile data going through virtual file system so the
inputs can be remapped. In the context of the caching, it can make sure
we capture the inputs and provided an immutable input as profile data.
Reviewed By: akyrtzi, benlangmuir
Differential Revision: https://reviews.llvm.org/D139052
This is an issue reported inside the NewPMDriver module. Static analyzer reported that Null pointer 'P' may be dereferenced at line 371 and two more sites. Proposed change guards this use.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D142047
We've already verified the input module manually in opt so this is redundant.
Reviewed By: bjope
Differential Revision: https://reviews.llvm.org/D139899
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
Enable using -module-summary with -S
(similarly to what currently can be achieved with opt <input> -o - | llvm-dis).
This is a recommit of ef9e62469.
Test plan: ninja check-all
Differential revision: https://reviews.llvm.org/D137768
This diff splits out (from LLVMCore) IR printing passes into IRPrinter.
This structure is similar to what we already have for IRReader and
enables us to avoid circular dependencies between LLVMCore and Analysis
(this is a preparation for https://reviews.llvm.org/D137768).
The legacy interface is left unchanged, once the legacy pass manager
is removed (in the future) we will be able to clean it up further.
The bazel build configuration has been updated as well.
Test plan:
1/ Tested the following cmake configurations: static/dynamic linking * lld/gold * clang/gcc
2/ bazel build --config=generic_clang @llvm-project//...
Differential revision: https://reviews.llvm.org/D138081
This reverts commit bf8381a8bce28fc69857645cc7e84a72317e693e.
There is a layering violation: LLVMAnalysis depends on LLVMCore, so
LLVMCore should not include LLVMAnalysis header
llvm/Analysis/ModuleSummaryAnalysis.h
Enable using -module-summary with -S
(similarly to what currently can be achieved with opt <input> -o - | llvm-dis).
This is a recommit of ef9e62469.
Test plan: ninja check-all
Differential revision: https://reviews.llvm.org/D137768
This reverts commit ef9e624694c0f125c53f7d0d3472fd486bada57d
for further investigation offline.
It appears to break the buildbot
llvm-clang-x86_64-sie-ubuntu-fast.