756 Commits

Author SHA1 Message Date
paperchalice
72c75501ec
[CodeGen] Port LowerEmuTLS to new pass manager (#75171)
In fact, this pass need `llc` to test. `TargetMachine` seems redundant,
because before adding this pass `CodeGenPassBuilder` already checks it:

ed4194bb8d/llvm/include/llvm/CodeGen/CodeGenPassBuilder.h (L590-L592)
2023-12-19 14:44:35 +08:00
paperchalice
60eca674b1
[CodeGen] Port ExpandMemCmp to new pass manager (#74050) 2023-12-13 16:18:24 +08:00
paperchalice
80bb994d2b
[CodeGen] Port IndirectBrExpand to new pass manager (#75287) 2023-12-13 16:13:17 +08:00
paperchalice
06aa8b189a
[CodeGen] Add analyses to help for porting GC passes (#74972)
- `CollectorMetadataAnalysis` provides `GCStrategyMap`.
- `GCFunctionAnalysis` provides `GCFunctionInfo`.

`GCStrategyMap` owns `GCStrategy` pointers and this
pass is used by `AsmPrinter` to iterate all GC strategies.

Most passes that require `GCModuleInfo` actually require the
`GCFunctionInfo`,
so add `GCFunctionAnalysis` for convenience.
2023-12-13 15:56:12 +08:00
paperchalice
a930fec033
[CodeGen] Port InterleavedLoadCombine to new pass manager (#75164) 2023-12-13 12:46:22 +08:00
paperchalice
27259f17e9
[CodeGen] Port CFGuard to new pass manager (#75146)
Port `CFGuard` to new pass manager, add a pass parameter to choose guard
mechanism.
2023-12-13 08:50:22 +08:00
paperchalice
b0cc42ae0f
[CodeGen] Port SjLjEHPrepare to new pass manager (#75023)
`doInitialization` in `SjLjEHPrepare` is trivial.

This is the last pass suffix with `ehprepare`.
2023-12-12 16:07:26 +08:00
Kazu Hirata
586ecdf205
[llvm] Use StringRef::{starts,ends}_with (NFC) (#74956)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-11 21:01:36 -08:00
paperchalice
ce08c7ee1e
[CodeGen] Port SelectOptimize to new pass manager (#74920)
- Use `BlockFrequencyInfoWrapperPass` in legacy pass so member
`std::unique_ptr<BranchProbabilityInfo> BPI` could be removed.
- Member `DominatorTree *DT = nullptr` is unused, remove it.
2023-12-12 12:09:30 +08:00
paperchalice
62b21c6ced
[CodeGen] Port JMCInstrumenter to new pass manager (#75049) 2023-12-12 09:00:44 +08:00
paperchalice
cd6e462d01
[CodeGen] Port InterleavedAccess to new pass manager (#74904) 2023-12-10 19:15:51 +08:00
paperchalice
5baf66f3c2
[CodeGen] Port WasmEHPrepare to new pass manager (#74435)
Port `WasmEHPrepare` to new pass manager, also rename `wasmehprepare` to
`wasm-eh-prepare`.
2023-12-06 11:11:00 -08:00
paperchalice
8a9bbac662
[CodeGen] Port WinEHPrepare to new pass manager (#74233) 2023-12-04 20:46:51 +07:00
chrulski-intel
ff0d8a9a6c
Report pass name when -llvm-verify-each reports breakage (#71447)
Update the string reported to include the pass name of last pass when
running verifier after each pass.
2023-12-01 10:36:25 -08:00
paperchalice
3bd5172057
Reland "[CodeGen] Port SafeStack to new pass manager (#74027)
Forgot to update related code in `CodeGenPassBuilder.h`, also update it
for `CallBrPreparePass`.
Fix build when `LLVM_ENABLE_MODULES:BOOL=ON`.
2023-12-01 13:55:05 +09:00
Paul Kirth
cfe1ece833
[clang][llvm][fatlto] Avoid cloning modules in FatLTO (#72180)
https://github.com/llvm/llvm-project/issues/70703 pointed out that
cloning LLVM modules could lead to miscompiles when using FatLTO.

This is due to an existing issue when cloning modules with labels (see
#55991 and #47769). Since this can lead to miscompilation, we can avoid
cloning the LLVM modules, which was desirable anyway.

This patch modifies the EmbedBitcodePass to no longer clone the module
or run an input pipeline over it. Further, it make FatLTO always perform
UnifiedLTO, so we can still defer the Thin/Full LTO decision to
link-time. Lastly, it removes dead/obsolete code related to now defunct
options that do not work with the EmbedBitcodePass implementation any
longer.
2023-11-30 17:09:34 -08:00
Shubham Sandeep Rastogi
2eff36b7d3
Revert "[CodeGen] Port SafeStack to new pass manager (#73747)" (#73965)
This reverts commit a4d5fd4d2ee9470e55345a9540f6b6fb6faf66e1.

The above commit breaks greendragon lldb bots:
Link to failing builds:
https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/63300/
https://green.lab.llvm.org/green/view/LLDB/job/as-lldb-cmake/10345/

I found this PR to be the offending one after using git bisect with the
cmake invocation:

cmake -G Ninja ../llvm -DCMAKE_BUILD_TYPE=Release
-DCMAKE_EXPORT_COMPILE_COMMANDS=ON
'-DLLVM_TARGETS_TO_BUILD=X86;ARM;AArch64'
-DLLVM_ENABLE_ASSERTIONS:BOOL=TRUE
-DLLVM_ENABLE_MODULES=On
-DLLVM_ENABLE_PROJECTS='clang;lld;lldb;cross-project-tests'
-DLLVM_VERSION_PATCH=99
'-DLLVM_ENABLE_RUNTIMES=libcxx;libcxxabi;compiler-rt'

and running
ninja lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/CodeGenPassBuilder.cpp.o
2023-11-30 09:53:15 -08:00
paperchalice
a4d5fd4d2e
[CodeGen] Port SafeStack to new pass manager (#73747)
Just copy the `runOnFunction` method from `SafeStackLegacyPass` and
remove the workaround for computing analysis lazily, the analysis result
in new pass manager is computed lazily by default.
2023-11-30 13:26:49 +09:00
paperchalice
1debbae96b
[CodeGen] Port CallBrPrepare to new pass manager (#73630)
IIUC in the new pass manager infrastructure, the analysis result is
always computed lazily. So just use `getResult` here.
2023-11-29 10:33:14 +09:00
paperchalice
61e58c4dc1
[CodeGen] Port DwarfEHPrepare to new pass manager (#72500)
Co-authored-by: PaperChalice <example@example.com>
2023-11-28 17:53:25 +09:00
Nikita Popov
6e56c35d19 [SpeculativeExecution] Add only-if-divergent-target pass option
The optimization pipeline enables this option, but it was not
preserved in -print-pipeline-passes output.
2023-11-07 11:49:37 +01:00
Nikita Popov
a682a9cfd0 Revert "Port Swift's merge function pass to llvm: merging functions that differ in constants (#68235)"
This reverts commit 19b5495b653a00da7a250f48b4f739fcf2bbe82f.

PR landed without approval, with severe quality issues.
2023-11-03 21:15:46 +01:00
Manman Ren
19b5495b65
Port Swift's merge function pass to llvm: merging functions that differ in constants (#68235)
See RFC for details:
https://discourse.llvm.org/t/rfc-for-moving-swift-s-merge-function-pass-to-llvm/73778

We will need to refactor extension to FunctionComparator/FunctionHash to
StructuralHash. This patch adds a new pass which is ported from Swift,
and will need to discuss on how to migrate Swift’s pass over after we
land this in llvm.

Create this PR to get some early review on the patch.

---------

Co-authored-by: Manman Ren <mren@meta.com>
2023-11-03 11:13:58 -07:00
Matt Arsenault
3cef582ae4 CodeGen: Port ExpandLargeFpConvert to new PM (#71027) 2023-11-03 14:23:30 +09:00
Matt Arsenault
94202e7b17
CodeGen: Port ExpandLargeDivRem to new pass manager (#71022) 2023-11-03 08:34:15 +09:00
Alex Voicu
0ce6255a50 [HIP][LLVM][Opt] Add LLVM support for hipstdpar
This patch adds the LLVM changes needed for enabling HIP parallel algorithm offload on AMDGPU targets. What we do here is add two passes, one mandatory and one optional:

1. HipStdParAcceleratorCodeSelectionPass is mandatory, depends on CallGraphAnalysis, and implements the following transform:

    - Traverse the call-graph, and check for functions that are roots for accelerator execution (at the moment, these are GPU kernels exclusively, and would originate in the accelerator specific algorithm library the toolchain uses as an implementation detail);
    - Starting from a root, do a BFS to find all functions that are reachable (called directly or indirectly via a call- chain) and record them;
    - After having done the above for all roots in the Module, we have the computed the set of reachable functions, which is the union of roots and functions reachable from roots;
    - All functions that are not in the reachable set are removed; for the special case where the reachable set is empty we completely clear the module;

2. HipStdParAllocationInterpositionPass is optional, is meant as a fallback with restricted functionality for cases where on-demand paging is unavailable on a platform, and implements the following transform:
    - Iterate all functions in a Module;
    - If a function's name is in a predefined set of allocation / deallocation that the runtime implementation is allowed and expected to interpose, replace all its uses with the equivalent accelerator aware function, iff the latter is available;
        - If the accelerator aware equivalent is unavailable we warn, but compilation will go ahead, which means that it is possible to get issues around the accelerator trying to access inaccessible memory at run time;
    - We rely on direct name matching as opposed to using the new alloc-kind family of attributes and / or the LibCall analysis pass because some of the legacy functions that need replacing would not carry the former or be identified by the latter.

Reviewed by: JonChesterfield, yaxunl

Differential Revision: https://reviews.llvm.org/D155856
2023-10-12 11:26:48 +01:00
Alex Voicu
25935c384d Revert "[HIP][LLVM][Opt] Add LLVM support for hipstdpar"
This reverts commit c5bba7ea5a05f540948f76a189c880eb24a5e8c6.
2023-10-11 12:27:03 +01:00
Alex Voicu
c5bba7ea5a [HIP][LLVM][Opt] Add LLVM support for hipstdpar
This patch adds the LLVM changes needed for enabling HIP parallel algorithm offload on AMDGPU targets. What we do here is add two passes, one mandatory and one optional:

1. HipStdParAcceleratorCodeSelectionPass is mandatory, depends on CallGraphAnalysis, and implements the following transform:

    - Traverse the call-graph, and check for functions that are roots for accelerator execution (at the moment, these are GPU kernels exclusively, and would originate in the accelerator specific algorithm library the toolchain uses as an implementation detail);
    - Starting from a root, do a BFS to find all functions that are reachable (called directly or indirectly via a call- chain) and record them;
    - After having done the above for all roots in the Module, we have the computed the set of reachable functions, which is the union of roots and functions reachable from roots;
    - All functions that are not in the reachable set are removed; for the special case where the reachable set is empty we completely clear the module;

2. HipStdParAllocationInterpositionPass is optional, is meant as a fallback with restricted functionality for cases where on-demand paging is unavailable on a platform, and implements the following transform:
    - Iterate all functions in a Module;
    - If a function's name is in a predefined set of allocation / deallocation that the runtime implementation is allowed and expected to interpose, replace all its uses with the equivalent accelerator aware function, iff the latter is available;
        - If the accelerator aware equivalent is unavailable we warn, but compilation will go ahead, which means that it is possible to get issues around the accelerator trying to access inaccessible memory at run time;
    - We rely on direct name matching as opposed to using the new alloc-kind family of attributes and / or the LibCall analysis pass because some of the legacy functions that need replacing would not carry the former or be identified by the latter.

Reviewed by: JonChesterfield, yaxunl

Differential Revision: https://reviews.llvm.org/D155856
2023-10-11 12:22:00 +01:00
Alex Voicu
98eda5dda7 Revert "[HIP][LLVM][Opt] Add LLVM support for hipstdpar" in order to address build breakage.
This reverts commit 9b98ebb0eb43b005921926a622177f10e13b1ac6.
2023-10-10 12:16:10 +01:00
Alex Voicu
9b98ebb0eb [HIP][LLVM][Opt] Add LLVM support for hipstdpar
This patch adds the LLVM changes needed for enabling HIP parallel algorithm offload on AMDGPU targets. What we do here is add two passes, one mandatory and one optional:

1. HipStdParAcceleratorCodeSelectionPass is mandatory, depends on CallGraphAnalysis, and implements the following transform:

    - Traverse the call-graph, and check for functions that are roots for accelerator execution (at the moment, these are GPU kernels exclusively, and would originate in the accelerator specific algorithm library the toolchain uses as an implementation detail);
    - Starting from a root, do a BFS to find all functions that are reachable (called directly or indirectly via a call- chain) and record them;
    - After having done the above for all roots in the Module, we have the computed the set of reachable functions, which is the union of roots and functions reachable from roots;
    - All functions that are not in the reachable set are removed; for the special case where the reachable set is empty we completely clear the module;

2. HipStdParAllocationInterpositionPass is optional, is meant as a fallback with restricted functionality for cases where on-demand paging is unavailable on a platform, and implements the following transform:
    - Iterate all functions in a Module;
    - If a function's name is in a predefined set of allocation / deallocation that the runtime implementation is allowed and expected to interpose, replace all its uses with the equivalent accelerator aware function, iff the latter is available;
        - If the accelerator aware equivalent is unavailable we warn, but compilation will go ahead, which means that it is possible to get issues around the accelerator trying to access inaccessible memory at run time;
    - We rely on direct name matching as opposed to using the new alloc-kind family of attributes and / or the LibCall analysis pass because some of the legacy functions that need replacing would not carry the former or be identified by the latter.

Reviewed by: JonChesterfield, yaxunl

Differential Revision: https://reviews.llvm.org/D155856
2023-10-10 12:02:05 +01:00
Dhruv Chawla
0f152a55d3
[InferAlignment] Implement InferAlignmentPass
This pass aims to infer alignment for instructions as a separate pass,
to reduce redundant work done by InstCombine running multiple times. It
runs late in the pipeline, just before the back-end passes where this
information is most useful.

Differential Revision: https://reviews.llvm.org/D158529
2023-09-20 12:03:36 +05:30
Justin Bogner
71e3642619
[Transforms][DXIL] Wire up a basic DXILUpgrade pass (#66275)
This pass will upgrade DXIL-style llvm constructs (which are mostly
metadata) into the representations we use in LLVM for the same concepts.

For now we just strip the valver metadata, which we don't need. Later
changes will make this pass more useful, and then we should be able to
wire it into clang and possibly the DirectX backend's AsmParser.
2023-09-14 11:02:31 -07:00
Aiden Grossman
3a42b1fd3e [IR] Add SturcturalHash printer pass
This patch adds in a StructuralHash printer pass that prints out the
hexadeicmal representation of the hash of a module and all of the
functions within it.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D158317
2023-08-29 18:59:52 -07:00
Changpeng Fang
c1803d5366 [FunctionAttrs] Unconditionally perform argument attribute inference in the first function-attrs pass
Summary:
  Argument attributes like NoAlias and ReadOnly could affect memoryssa and thus earlyCSE in the function simplification pipeline.
https://reviews.llvm.org/D145210 adjusted PostOrderFunctionAttrs placement and caused the argument attributes not referred for the use
in the pipeline. This work (initiated by @nikic) unconditionally performs argument attribute inference in the first function-attrs pass.

Reviewers:
  aeubanks and nikic

Differential Revision:
  https://reviews.llvm.org/D156397
2023-08-09 17:49:14 -07:00
Nikita Popov
41895843b5 [InstCombine] Only perform one iteration
InstCombine is a worklist-driven algorithm, which works roughly
as follows:

* All instructions are initially pushed to the worklist.
  The initial order is in RPO program order.
* All newly inserted instructions get added to the worklist.
* When an instruction is folded, its users get added back to the
  worklist.
* When the use-count of an instruction decreases, it gets added
  back to the worklist.
* And a few of other heuristics on when we should revisit
  instructions.

On top of the worklist algorithm, InstCombine layers an additional
fix-point iteration: If any fold was performed in the previous
iteration, then InstCombine will re-populate the worklist from
scratch and fold the entire function again. This continues until
a fix-point is reached.

In the vast majority of cases, InstCombine will reach a fix-point
within a single iteration: However, a second iteration is performed
to verify that this is indeed the fixpoint. We can see this in the
statistics for llvm-test-suite:

    "instcombine.NumOneIteration": 411380,
    "instcombine.NumTwoIterations": 117921,
    "instcombine.NumThreeIterations": 236,
    "instcombine.NumFourOrMoreIterations": 2,

The way to read these numbers is that in 411380 cases, InstCombine
performs no folds. In 117921 cases it performs a fold and reaches
the fix-point within one iteration (the second iteration verifies
the fixpoint). In the remaining 238 cases, more than one iteration
is needed to reach the fixpoint.

In other words, only in 0.04% of cases are additional iterations
needed to reach a fixpoint. Conversely, in 22.3% of cases InstCombine
performs a completely useless extra iteration to verify the fix point.

This patch removes the fixpoint iteration from InstCombine, and always
only perform a single iteration. This results in a major compile-time
improvement of around 4% at negligible codegen impact.

This explicitly does accept that we will not reach a fixpoint in all
cases. However, this is mitigated by two factors: First, the data
suggests that this happens very rarely in practice. Second,
InstCombine runs many times during the optimization pipeline
(8 times even without LTO), so there are many chances to recover
such cases.

In order to prevent accidental optimization regressions in the
future, this implements a verify-fixpoint option, which is enabled
by default when instcombine is specified in -passes and disabled
when InstCombinePass() is constructed from C++. This means that
test cases need to explicitly use the no-verify-fixpoint option
if they fail to reach a fixed point (for a well understand reason
we cannot / do not want to avoid).

Differential Revision: https://reviews.llvm.org/D154579
2023-07-31 10:56:49 +02:00
Teresa Johnson
546ec641b4 Restore "[MemProf] Use new option/pass for profile feedback and matching"
This restores commit b4a82b62258c5f650a1cccf5b179933e6bae4867, reverted
in 3ab7ef28eebf9019eb3d3c4efd7ebfd160106bb1 because it was thought to
cause a bot failure, which ended up being unrelated to this patch set.

Differential Revision: https://reviews.llvm.org/D154856
2023-07-11 13:16:20 -07:00
JP Lehr
3ab7ef28ee Revert "[MemProf] Use new option/pass for profile feedback and matching"
This reverts commit b4a82b62258c5f650a1cccf5b179933e6bae4867.

Broke AMDGPU OpenMP Offload buildbot
2023-07-11 05:44:42 -04:00
Teresa Johnson
b4a82b6225 [MemProf] Use new option/pass for profile feedback and matching
Previously the MemProf profile was expected to be in the same profile
file as a normal PGO profile, passed via the usual -fprofile-use=
option, and was matched in the same pass. To simplify profile
preparation, since the raw MemProf profile requires the binary for
symbolization and may be simpler to index separately from the raw PGO
profile, and also to enable providing a MemProf profile for a SamplePGO
build, separate out the MemProf feedback option and matching pass.

This patch adds the -fmemory-profile-use=${file} option, and the
provided file is passed down to LLVM and ultimately used in a new
MemProfUsePass which performs the matching of just the memory profile
contents of that file.

Note that a single profile file containing both normal PGO and MemProf
profile data is still supported, and the relevant profile data is
matched by the appropriate matching pass(es) based on which option(s)
the profile is provided with (the same profile file can be supplied to
both feedback options).

Differential Revision: https://reviews.llvm.org/D154856
2023-07-10 16:42:56 -07:00
Arthur Eubanks
72e7e5851f [MemorySSA] Always perform MemoryUses liveOnEntry optimization on MSSA construction
Fixes invariant memory regressions in future DSE patches.

Also add a flag to print<memoryssa> to not ensure optimized uses to test this.

Noticeable compile time regression [1], but a future DSE change that depends on this more than makes up for it.

[1] https://llvm-compile-time-tracker.com/compare.php?from=9d5466849a770eeab222d5a5890376d3596e8ad6&to=95682dbe11d76a3342870437377216e96b167504&stat=instructions:u

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D152859
2023-07-06 14:09:47 -07:00
Matthew Voss
a1ca3af31e [llvm] A Unified LTO Bitcode Frontend
Here's a high level summary of the changes in this patch. For more
information on rational, see the RFC.
(https://discourse.llvm.org/t/rfc-a-unified-lto-bitcode-frontend/61774).

  - Add config parameter to LTO backend, specifying which LTO mode is
    desired when using unified LTO.
  - Add unified LTO flag to the summary index for efficiency. Unified
    LTO modules can be detected without parsing the module.
  - Make sure that the ModuleID is generated by incorporating more types
    of symbols.

Differential Revision: https://reviews.llvm.org/D123803
2023-07-05 14:53:14 -07:00
Arthur Eubanks
ff79eb3af6 [PassBuilder] Add textual representation for function simplification pipeline
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D153784
2023-06-29 09:39:04 -07:00
Paul Kirth
75a1797044 Reland [llvm] Preliminary fat-lto-objects support
Fat LTO objects contain both LTO compatible IR, as well as generated
object code. This allows users to defer the choice of whether to use LTO
or not to link-time. This is a feature available in GCC for some time,
and makes the existing -ffat-lto-objects flag functional in the same
way as GCC's.

Within LLVM, we add a new EmbedBitcodePass that serializes the module to
the object file, and expose a new pass pipeline for compiling fat
objects. The new pipeline initially clones the module and runs the
selected (Thin)LTOPrelink pipeline, after which it will serialize the
module into a `.llvm.lto` section of an ELF file. When compiling for
(Thin)LTO, this normally the point at which the compiler would emit a
object file containing the bitcode and metadata.

After that point we compile the original module using the
PerModuleDefaultPipeline used for non-LTO compilation. We generate
standard object files at the end of this pipeline, which contain machine
code and the new `.llvm.lto` section containing bitcode.

Since the two pipelines operate on different copies of the module, we
can be sure that the bitcode in the `.llvm.lto` section and object code
in  `.text` are congruent with the existing output produced by the
default and LTO pipelines.

Original RFC: https://discourse.llvm.org/t/rfc-ffat-lto-objects-support/63977

Earlier versions of this patch were missing REQUIRES lines for llc
related tests in Transforms/EmbedBitcode. Those tests are now under
CodeGen/X86, which should avoid running the check on unsupported
platforms.

The EmbedbBitcodePass also returned PreservedAnalyses::all when adding a
metadata section, which failed expensive checks, since it modified the
module. This is now corrected.

Reviewed By: tejohnson, MaskRay, nikic

Differential Revision: https://reviews.llvm.org/D146776
2023-06-28 21:37:50 +00:00
Alex Brachet
6085eb3084 Revert "Reland [llvm] Preliminary fat-lto-objects support"
This reverts commit 44265dc3554ef40920b587eeb787a400663af6c7.
2023-06-24 01:15:50 +00:00
Teresa Johnson
200cc952a2 [LTO][GlobalDCE] Use pass parameter instead of module flag for LTO phase
D63932 added a module flag to indicate that we are executing the regular
LTO post merge pipeline, so that GlobalDCE could perform more aggressive
optimization for Dead Virtual Function Elimination. This caused issues
trying to reuse bitcode that had already been through the LTO pipeline
(see context in D139816).

Instead support this by passing down a parameter flag to the
GlobalDCEPass constructor, which is the more usual way for indicating
this information.

Most test changes are to remove incidental uses of this flag. Of the 2
real uses, llvm/test/LTO/ARM/lto-linking-metadata.ll is now obsolete and
removed in this patch, and the virtual-functions-visibility-post-lto.ll
test is updated to use the regular LTO default pipeline where this
parameter is set to true.

Differential Revision: https://reviews.llvm.org/D153655
2023-06-23 17:05:07 -07:00
Paul Kirth
44265dc355 Reland [llvm] Preliminary fat-lto-objects support
Fat LTO objects contain both LTO compatible IR, as well as generated
object code. This allows users to defer the choice of whether to use LTO
or not to link-time. This is a feature available in GCC for some time,
and makes the existing -ffat-lto-objects flag functional in the same
way as GCC's.

Within LLVM, we add a new EmbedBitcodePass that serializes the module to
the object file, and expose a new pass pipeline for compiling fat
objects. The new pipeline initially clones the module and runs the
selected (Thin)LTOPrelink pipeline, after which it will serialize the
module into a `.llvm.lto` section of an ELF file. When compiling for
(Thin)LTO, this normally the point at which the compiler would emit a
object file containing the bitcode and metadata.

After that point we compile the original module using the
PerModuleDefaultPipeline used for non-LTO compilation. We generate
standard object files at the end of this pipeline, which contain machine
code and the new `.llvm.lto` section containing bitcode.

Since the two pipelines operate on different copies of the module, we
can be sure that the bitcode in the `.llvm.lto` section and object code
in  `.text` are congruent with the existing output produced by the
default and LTO pipelines.

Original RFC: https://discourse.llvm.org/t/rfc-ffat-lto-objects-support/63977

Earlier versions of this patch were missing REQUIRES lines for llc
related tests in Transforms/EmbedBitcode. Those tests are now under
CodeGen/X86, which should avoid running the check on unsupported
platforms.

Reviewed By: tejohnson, MaskRay, nikic

Differential Revision: https://reviews.llvm.org/D146776
2023-06-23 23:23:58 +00:00
Paul Kirth
a3800ad9d8 Revert "[llvm] Preliminary fat-lto-objects support"
There seems to be a problem on arm buildbots. Reverting until I can
investigate.  https://lab.llvm.org/buildbot#builders/245/builds/10184

This reverts commit a67208e1c697649ce432e6497f56a93675273dd8
and dependent commit e54a3112cee5ae0a9117359ecbea878e1388f51e.
2023-06-23 18:43:41 +00:00
Paul Kirth
a67208e1c6 [llvm] Preliminary fat-lto-objects support
Fat LTO objects contain both LTO compatible IR, as well as generated
object code. This allows users to defer the choice of whether to use LTO
or not to link-time. This is a feature available in GCC for some time,
and makes the existing -ffat-lto-objects flag functional in the same
way as GCC's.

Within LLVM, we add a new EmbedBitcodePass that serializes the module to
the object file, and expose a new pass pipeline for compiling fat
objects. The new pipeline initially clones the module and runs the
selected (Thin)LTOPrelink pipeline, after which it will serialize the
module into a `.llvm.lto` section of an ELF file. When compiling for
(Thin)LTO, this normally the point at which the compiler would emit a
object file containing the bitcode and metadata.

After that point we compile the original module using the
PerModuleDefaultPipeline used for non-LTO compilation. We generate
standard object files at the end of this pipeline, which contain machine
code and the new `.llvm.lto` section containing bitcode.

Since the two pipelines operate on different copies of the module, we
can be sure that the bitcode in the `.llvm.lto` section and object code
in  `.text` are congruent with the existing output produced by the
default and LTO pipelines.

Original RFC: https://discourse.llvm.org/t/rfc-ffat-lto-objects-support/63977

Reviewed By: tejohnson, MaskRay, nikic

Differential Revision: https://reviews.llvm.org/D146776
2023-06-23 17:51:30 +00:00
Yann Girsberger
1d5651060e [opt] Exposing the parameters of LoopRotate to the -passes interface
There is a gap between running opt -Oz and running opt -passes="OZ_PASSES" where OZ_PASSES is taken from running opt -Oz -print-pipeline-passes.

One of the reasons causing this is that -Oz uses non-default setting for LoopRotate but LoopRotate does not expose its settings when printing the pipeline.

This commit fixes this by exposing LoopRotates parameters.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D153437
2023-06-22 11:09:23 -07:00
Arthur Eubanks
d49984fa4f [SimplifyCFG] Add option to not speculate blocks
Required for phase ordering changes to not regress Rust code with D145265.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D153391
2023-06-22 08:51:40 -07:00
Arthur Eubanks
278d65b2cf [SimplifyCFG] Add textual pass params for FoldTwoEntryPHINode and SimplifyCondBranch 2023-06-15 14:21:24 -07:00