270 Commits

Author SHA1 Message Date
Justin Bogner
3f066f5fcf
[HLSL][DirectX] Extract HLSLBinding out of DXILResource. NFC (#150633)
We extract the binding logic out of the DXILResource analysis passes into the
FrontendHLSL library. This will allow us to use this logic for resource and
root signature bindings in both the DirectX backend and the HLSL frontend.
2025-07-31 08:35:47 -07:00
Ramkumar Ramachandra
af2f8a8c14
[HashRecognize] Introduce new analysis (#139120)
Introduce a fresh analysis for recognizing polynomial hashes, with the
rationale that several targets have specific instructions to optimize
things like CRC and GHASH (eg. X86 and RISC-V crypto extension). We
limit the scope to polynomial hashes computed in a Galois field of
characteristic 2, since this class of operations can also be optimized
in the absence of target-specific instructions to use a lookup table.

At the moment, we only recognize the CRC algorithm.

RFC:
https://discourse.llvm.org/t/rfc-new-analysis-for-polynomial-hash-recognition/86268
2025-06-02 08:25:50 +01:00
S. VenkataKeerthy
58ab005d8d
Adding IR2Vec as an analysis pass (#134004)
This PR introduces IR2Vec as an analysis pass. The changes include:
- Logic for generating Symbolic encodings.
- 75D learned vocabulary.
- lit tests.

Here is the link to the RFC -
https://discourse.llvm.org/t/rfc-enhancing-mlgo-inlining-with-ir2vec-embeddings

Acknowledgements: contributors -
https://github.com/IITH-Compilers/IR2Vec/graphs/contributors

---------

Co-authored-by: svkeerthy <venkatakeerthy@google.com>
Co-authored-by: Mircea Trofin <mtrofin@google.com>
2025-05-22 09:50:21 -07:00
Tim Gymnich
571a24c314
Reland [llvm] add GenericFloatingPointPredicateUtils #140254 (#141065)
#140254 was previously missing 2 files in the bazel build config.
2025-05-22 17:17:02 +02:00
Kewen12
c47a5fbb22
Revert "[llvm] add GenericFloatingPointPredicateUtils (#140254)" (#140968)
This reverts commit d00d74bb2564103ae3cb5ac6b6ffecf7e1cc2238. 

The PR breaks our buildbots and blocks downstream merge.
2025-05-21 19:31:14 -04:00
Tim Gymnich
d00d74bb25
[llvm] add GenericFloatingPointPredicateUtils (#140254)
add `GenericFloatingPointPredicateUtils` in order to generalize
effects of floating point comparisons on `KnownFPClass` for both IR and
MIR.

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2025-05-21 23:45:31 +02:00
Lucas Duarte Prates
67783eb166
Re-land: [Analysis] Ensure use of strict fp exceptions in ConstantFolding (#137652)
To perform constant folding in math operations, the implementation of
the ConstantFolding Analysis relies on the use of the math functions
from the host's libm. In particular, it relies on checking the value of
errno and IEEE exceptions to determine when an operation is safe to be
constant-folded.

On some platforms, such as BSD or Darwin, math library functions don't
set errno, so the ConstantFolding check depends only on the value of
IEEE exceptions. As the FP exception behaviour is set to `ignore` by
default, the compiler can perform optimisations that would get in the
way of such checks being performed correctly.

This patch sets the FP exception behaviour to `strict` when compiling
the `ConstantFolding.cpp` source file, ensuring the value of IEEE
exceptions can be reliably used by its implementation.

This re-lands the changes from #136139, but using the `-ftrapping-math`
compile option instead of `-ffp-exception-behavior` for GCC support.
2025-04-29 15:24:52 +01:00
Antonio Frighetto
6ae4030d4c Revert "[Analysis] Ensure use of strict fp exceptions in ConstantFolding (#136139)"
This reverts commit 8506980d30fd2faf41518f24e985f820609a7bd0, multiple buildbot failures reported.
2025-04-28 11:26:12 +02:00
Lucas Duarte Prates
8506980d30
[Analysis] Ensure use of strict fp exceptions in ConstantFolding (#136139)
To perform constant folding in math operations, the implementation of
the ConstantFolding Analysis relies on the use of the math functions
from the host's libm. In particular, it relies on checking the value of
errno and IEEE exceptions to determine when an operation is safe to be
constant-folded.

On some platforms, such as BSD or Darwin, math library functions don't
set errno, so the ConstantFolding check depends only on the value of
IEEE exceptions. As the FP exception behaviour is set to `ignore` by
default, the compiler can perform optimisations that would get in the
way of such checks being performed correctly.

This patch sets the FP exception behaviour to `strict` when compiling
the `ConstantFolding.cpp` source file, ensuring the value of IEEE
exceptions can be reliably used by its implementation.
2025-04-28 10:01:54 +01:00
Mingming Liu
c8a70f4c6e
[CodeGen][StaticDataPartitioning]Place local-linkage global variables in hot or unlikely prefixed sections based on profile information (#125756)
In this PR, static-data-splitter pass finds out the local-linkage global
variables in {`.rodata`, `.data.rel.ro`, `bss`, `.data`} sections by
analyzing machine instruction operands, and aggregates their accesses
from code across functions.

A follow-up item is to analyze global variable initializers and count
for access from data.
* This limitation is demonstrated by `bss2` and `data3` in
`llvm/test/CodeGen/X86/global-variable-partition.ll`.

Some stats of static-data-splitter with this patch:

**section**|**bss**|**rodata**|**data**
:-----:|:-----:|:-----:|:-----:
hot-prefixed section coverage|99.75%|97.71%|91.30%
unlikely-prefixed section size percentage|67.94%|39.37%|63.10%

1. The coverage is defined as `#perf-sample-in-hot-prefixed <data>
section / #perf-sample in <data.*> section` for each <data> section.
* The perf command samples
`MEM_INST_RETIRED.ALL_LOADS:u:pinned:precise=2` events at a high
frequency (`perf -c 2251`) for 30 seconds. The profiled binary is built
as non-PIE so `data.rel.ro` coverage data is not available.
2. The unlikely-prefixed `<data>` section size percentage is defined as
`unlikely <data> section size / the sum size of <data>.* sections` for
each `<data>` section
2025-03-28 16:31:46 -07:00
vporpo
08dda4dcbf
[Analysis][EphemeralValuesCache][InlineCost] Ephemeral values caching for the CallAnalyzer (#130210)
This patch does two things:

1. It implements an ephemeral values cache analysis pass that collects the ephemeral values of a function and caches them for fast lookups. The collection of the ephemeral values is done lazily when the user calls `EphemeralValuesCache::ephValues()`.

2. It adds caching of ephemeral values using the `EphemeralValuesCache` to speed up `CallAnalyzer::analyze()`. Without caching this can take a long time to run in cases where the function contains a large number of `@llvm.assume()` calls and a large number of callsites. The time is spent in `collectEphemeralvalues()`.
2025-03-19 18:18:45 -07:00
Joel E. Denny
18f8106f31
[KernelInfo] Implement new LLVM IR pass for GPU code analysis (#102944)
This patch implements an LLVM IR pass, named kernel-info, that reports
various statistics for codes compiled for GPUs. The ultimate goal of
these statistics to help identify bad code patterns and ways to mitigate
them. The pass operates at the LLVM IR level so that it can, in theory,
support any LLVM-based compiler for programming languages supporting
GPUs. It has been tested so far with LLVM IR generated by Clang for
OpenMP offload codes targeting NVIDIA GPUs and AMD GPUs.

By default, the pass runs at the end of LTO, and options like
``-Rpass=kernel-info`` enable its remarks. Example `opt` and `clang`
command lines appear in `llvm/docs/KernelInfo.rst`. Remarks include
summary statistics (e.g., total size of static allocas) and individual
occurrences (e.g., source location of each alloca). Examples of its
output appear in tests in `llvm/test/Analysis/KernelInfo`.
2025-01-29 12:40:19 -05:00
Mikhail Goncharov
cffe22a937 Revert "[NFC] Move DroppedVariableStats code to Analysis (#120502)"
that introduces a circular dependency of analysis -> codegen -> target

This reverts commit e389492d6a00e1c49a034e13343098541ebd03c6.
2024-12-19 10:56:02 +01:00
Shubham Sandeep Rastogi
16d952898f Revert "Add a pass to collect dropped var stats for MIR (#120501)"
This reverts commit 223c7648468cd4f649a578d3f9cbc27a63523192.

Reverted due to vuildbot failure:

flang-aarch64-libcxx

Linking CXX shared library lib/libLLVMAnalysis.so.20.0git
FAILED: lib/libLLVMAnalysis.so.20.0git
2024-12-19 00:48:40 -08:00
Shubham Sandeep Rastogi
223c764846
Add a pass to collect dropped var stats for MIR (#120501)
Reland "Add a pass to collect dropped var stats for MIR" (#117044)

I am trying to reland https://github.com/llvm/llvm-project/pull/115566

I also moved the DroppedVariableStats code to the Analysis lib

This is part of a stack of patches with
https://github.com/llvm/llvm-project/pull/120502 being the first one in
the stack
2024-12-19 00:41:48 -08:00
Shubham Sandeep Rastogi
e389492d6a
[NFC] Move DroppedVariableStats code to Analysis (#120502)
This is done because the CodeGen library and Passes library both link
against Analysis, to avoid adding a dependency between CodeGen and
Passes if we want to extend the DroppedVariableStats code for MIR stats
as well, as seen in https://github.com/llvm/llvm-project/pull/120501
2024-12-18 23:42:24 -08:00
Yingwei Zheng
cacbe71af7
[Analysis] Avoid running transform passes that have just been run (#112092)
This patch adds a new analysis pass to track a set of passes and their
parameters to see if we can avoid running transform passes that have
just been run. The current implementation only skips redundant
InstCombine runs. I will add support for other passes in follow-up
patches.

RFC link:
https://discourse.llvm.org/t/rfc-pipeline-avoid-running-transform-passes-that-have-just-been-run/82467

Compile time improvement:
http://llvm-compile-time-tracker.com/compare.php?from=76007138f4ffd4e0f510d12b5e8cad529c21f24d&to=64134cf07ea7eb39c60320087c0c5afdc16c3a2b&stat=instructions%3Au
2024-11-07 07:52:14 +08:00
Thomas Preud'homme
eed135fea7 Revert "[Analysis] Guard logf128 cst folding"
This reverts commit 42d3cccffd203ff6dc967d4243588ca466c0faf7 which
caused a test failure.
2024-08-29 17:58:10 +01:00
Thomas Preud'homme
56152fa377
[Analysis] Guard logf128 cst folding (#106543)
LLVM has a CMake variable to control whether to consider logf128
constant folding which libAnalysis ignores. This patch changes the
logf128 check to rely on the global LLVM_HAS_LOGF128 setting made in
config-ix.cmake.
2024-08-29 14:34:16 +01:00
NAKAMURA Takumi
3ef64f7ab5 Revert "Enable logf128 constant folding for hosts with 128bit long double (#104929)"
ConstantFolding behaves differently depending on host's `HAS_IEE754_FLOAT128`.
LLVM should not change the behavior depending on host configurations.

This reverts commit 14c7e4a1844904f3db9b2dc93b722925a8c66b27.
(llvmorg-20-init-3262-g14c7e4a18449 and llvmorg-20-init-3498-g001e423ac626)
2024-08-25 08:30:23 +09:00
Matthew Devereau
14c7e4a184
Enable logf128 constant folding for hosts with 128bit long double (#104929)
This is a reland of (#96287). This patch attempts to reduce the reverted
patch's clang compile time by removing #includes of float128.h and
inlining convertToQuad functions instead.
2024-08-22 10:12:59 +01:00
Nikita Popov
6300233de1 Revert "Reland logf128 constant folding (#103217)"
This reverts commit 3cab7c555ad6451f2b1b4dc918a4b4f4e4a3e45d.

The modified test fails on ppc64le buildbots.
2024-08-14 12:30:33 +02:00
Matthew Devereau
3cab7c555a
Reland logf128 constant folding (#103217)
This is a reland of #96287. This change makes tests in logf128.ll ignore
the sign of NaNs for negative value tests and moves an #include <cmath>
to be blocked behind #ifndef _GLIBCXX_MATH_H.
2024-08-14 08:55:52 +01:00
S. Bharadwaj Yadavalli
03e6675fc7
[DXIL][Analysis] Add DXILMetadataAnalysis pass (#102079)
DXIL Metadata Analysis passes (one for legacy PM and one for new PM)
that collect following DXIL module metadata information in a structure
are added.
1. Shader Model version
2. DXIL version 
3. Shader Stage

Information collected using the legacy pass is verified by adding
additional test commands to existing metadata test sources.
2024-08-12 13:51:09 -04:00
Nikita Popov
a15de17772 Revert "Enable logf128 constant folding for hosts with 128bit floats (#96287)"
This reverts commit ccb2b011e577e861254f61df9c59494e9e122b38.

Causes buildbot failures, e.g. on ppc64le builders.
2024-08-09 15:12:11 +02:00
Matthew Devereau
ccb2b011e5
Enable logf128 constant folding for hosts with 128bit floats (#96287)
Hosts which support a float size of 128 bits can benefit from constant
fp128 folding.
2024-08-09 11:12:43 +01:00
Mircea Trofin
dbbf0762b6
[ctx_prof] CtxProfAnalysis (#102084)
This is an immutable analysis that loads and makes the contextual profile available to other passes. This patch introduces the analysis and an analysis printer pass. Subsequent patches will introduce the APIs that IPO passes will call to modify the profile as result of their changes.
2024-08-07 14:39:48 -04:00
Justin Bogner
b365dbbd8d
[DXIL][Analysis] Move dxil::ResourceInfo to the Analysis library. NFC
I had put this in Transforms/Utils, but that doesn't actually make
sense if we want to populate these structures via an analysis pass.

Pull Request: https://github.com/llvm/llvm-project/pull/100621
2024-07-25 11:22:04 -07:00
Matthew Devereau
3613b26831
Constant Fold logf128 calls (#90611)
This is a second attempt to land #84501 which failed on several targets.

This patch adds the HAS_IEE754_FLOAT128 define which makes the check for
typedef'ing float128 more precise by checking whether __uint128_t is
available and checking if the host does not use __ibm128 which is
prevalent on power pc targets and replaces IEEE754 float128s.
2024-05-29 06:13:02 +01:00
Matt Devereau
efce8a05aa Revert "Constant Fold logf128 calls"
This reverts commit 088aa81a545421933254f19cd3c8914a0373b493.
2024-05-01 12:18:39 +00:00
Matt Devereau
088aa81a54 Constant Fold logf128 calls
This is a second attempt to land #84501 which failed on several targets.

This patch adds the HAS_IEE754_FLOAT128 define which makes the check for
typedef'ing float128 more precise by checking whether __uint128_t is available
and checking if the host does not use __ibm128 which is prevalent on power pc
targets and replaces IEEE754 float128s.
2024-05-01 11:55:54 +00:00
Matt Devereau
c26e9bf8fa Revert "Constant Fold Logf128 calls (#84501)"
This reverts commit e90bc9cfd4d22c89dd993f62ede700ae25df49c5.
2024-04-18 11:20:54 +00:00
Matthew Devereau
e90bc9cfd4
Constant Fold Logf128 calls (#84501)
This patch enables constant folding for 128 bit floating-point logf
calls. This is achieved by querying if the host system has the logf128()
symbol available with a CMake test. If so, replace the runtime call with
the compile time value returned from logf128.
2024-04-18 10:19:01 +01:00
Björn Pettersson
5d9d740c39
Remove the unused IntervalPartition analysis pass (#88133)
This removes the old legacy PM "intervals" analysis pass (aka
IntervalPartition). It also removes the associated Interval and
IntervalIterator help classes.

Reasons for removal:
1) The pass is not used by llvm-project (not even being tested by
   any regression tests).
2) Pass has not been ported to new pass manager, which at least
   indicates that it isn't used by the middle-end.
3) ASan reports heap-use-after-free on
      ++I;  // After the first one...
   even if false is passed to intervals_begin. Not sure if that is
   a false positive, but it makes the code a bit less trustworthy.
2024-04-09 20:12:26 +02:00
Alexandros Lamprineas
92289db82f
[VFABI] Move the Vector ABI demangling utility to LLVMCore. (#77513)
This fixes #71892 allowing us to check magled names in the IR verifier.
2024-01-17 09:55:30 +00:00
Nikita Popov
d77067d08a
[ValueTracking] Add dominating condition support in computeKnownBits() (#73662)
This adds support for using dominating conditions in computeKnownBits()
when called from InstCombine. The implementation uses a
DomConditionCache, which stores which branches may provide information
that is relevant for a given value.

DomConditionCache is similar to AssumptionCache, but does not try to do
any kind of automatic tracking. Relevant branches have to be explicitly
registered and invalidated values explicitly removed. The necessary
tracking is done inside InstCombine.

The reason why this doesn't just do exactly the same thing as
AssumptionCache is that a lot more transforms touch branches and branch
conditions than assumptions. AssumptionCache is an immutable analysis
and mostly gets away with this because only a handful of places have to
register additional assumptions (mostly as a result of cloning). This is
very much not the case for branches.

This change regresses compile-time by about ~0.2%. It also improves
stage2-O0-g builds by about ~0.2%, which indicates that this change results
in additional optimizations inside clang itself.

Fixes https://github.com/llvm/llvm-project/issues/74242.
2023-12-06 14:17:18 +01:00
Aiden Grossman
3a42b1fd3e [IR] Add SturcturalHash printer pass
This patch adds in a StructuralHash printer pass that prints out the
hexadeicmal representation of the hash of a module and all of the
functions within it.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D158317
2023-08-29 18:59:52 -07:00
Arthur Eubanks
eb3d21be37 [Passes] Remove some legacy printer passes
MemDepPrinter doesn't have a new PM equivalent, but MemDep is soft deprecated anyway and adding one should be easy if somebody wants to.
2023-06-13 13:18:48 -07:00
Kazu Hirata
7f3047219c [Analysis] Remove AliasAnalysisSummary.cpp
The last use was removed by:

  commit 8005332835246c54a4a6b026eede930ed559deb4
  Author: Nikita Popov <npopov@redhat.com>
  Date:   Fri Dec 9 11:57:50 2022 +0100

Differential Revision: https://reviews.llvm.org/D150748
2023-05-17 08:39:48 -07:00
pvanhout
ae77aceba5 [Analysis] Remove DA & LegacyDA
UniformityAnalysis offers all of the same features and much more, there is no reason left to use the legacy DAs.
See RFC: https://discourse.llvm.org/t/rfc-deprecate-divergenceanalysis-legacydivergenceanalysis/69538

- Remove LegacyDivergenceAnalysis.h/.cpp
- Remove DivergenceAnalysis.h/.cpp + Unit tests
- Remove SyncDependenceAnalysis - it was not a real registered analysis and was only used by DAs
- Remove/adjust references to the passes in the docs where applicable
- Remove TTI hook associated with those passes.
- Move tests to UniformityAnalysis folder.
  - Remove RUN lines for the DA, leave only the UA ones.
- Some tests had to be adjusted/removed depending on how they used the legacy DAs.

Reviewed By: foad, sameerds

Differential Revision: https://reviews.llvm.org/D148116
2023-04-17 09:01:22 +02:00
chenglin.bi
76df706bca Revert "[LogicCombine 1/?] Implement a general way to simplify logical operations."
This reverts commit 97dcbea63e11d566cff0cd3a758cf1114cf1f633.
2023-03-14 09:00:06 +08:00
chenglin.bi
97dcbea63e [LogicCombine 1/?] Implement a general way to simplify logical operations.
This patch involves boolean ring to simplify logical operations. We can treat `&` as ring multiplication and `^` as ring addition.
So we need to canonicalize all other operations to `*` `+`. Like:
```
a & b -> a * b
a ^ b -> a + b
~a -> a + 1
a | b -> a * b + a + b
c ? a : b -> c * a + (c + 1) * b
```
In the code, we use a mask set to represent an expression. Every value that is not comes from logical operations could be a bit in the mask.
The mask itself is a multiplication chain. The mask set is an addiction chain.
We can calculate two expressions based on boolean algebras.

For now, the initial patch only enabled on and/or/xor,  Later we can enhance the code step by step.

Reference: https://en.wikipedia.org/wiki/Boolean_ring

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D142803
2023-03-02 20:46:16 +08:00
Mircea Trofin
5b8dc7c8a5 [mlgo] Introduce an "InteractiveModelRunner"
This is a model runner for ML researchers using environments like
CompilerGym. In such environments, researchers host the compiler and
want to be able to observe the problem space (features) at each decision
step of some optimization pass, at which point the compiler is stopped,
waiting for the host makes a decision and provide an advice back to
the compiler, which then continues its normal operation, and so on.

The InteractiveModelRunner supports this scenario for the feature set
exposed by the compiler at a given time. It uses 2 files - ideally FIFO
pipes - one to pass data to the host, the other to get advices back from
the host. This means this scenario is supported with no special
dependencies. The file creation and deletion is the responsibility of
the host. Hooking up this model evaluator to a MLGO-ed pass is the
responsibilty of the pass author, and subsequent patches will do so for
the current set of mlgo passes, and offer an API to easily "just opt in"
by default when mlgo-ing a new pass.

The data protocol is that of the training logger: the host sees a training
log doled out observation by observation by reading from one of the
files, and passes back its advice as a serialized tensor (i.e. tensor value
memory dump) via the other file.

There are some differences wrt the log seen during training: the
interactive model doesn't currently include the outcome (because it should be
identical to the decision, and it's also not present in the "release"
mode); and partial rewards aren't currently communicated back.

The assumption - just like with the training logger - is that the host
is co-located, thus avoiding any endianness concerns. In a distributed
environment, it is up to the hosting infrastructure to intermediate
that.

Differential Revision: https://reviews.llvm.org/D142642
2023-01-27 17:03:28 -08:00
Stefan Gränitz
3b387d1070 Lift EHPersonalities from Analysis to IR (NFC)
Computing EH-related information was only relevant for analysis passes so far. Lifting it to IR will allow the IR Verifier to calculate EH funclet coloring and validate funclet operand bundles in a follow-up step.

Reviewed By: rnk, compnerd

Differential Revision: https://reviews.llvm.org/D138122
2023-01-27 18:05:13 +01:00
Mircea Trofin
5898be19e6 [mlgo] Remove the protobuf dependency
The dependency was due to the log format. This change switches to the
previously-introduced (D139370) "dependency-free" logger instead of the
protobuf-based one.

A subsequent change will clean out the unnecessary abstraction left
behind.

This change drops the logger unittest, we have sufficient test coverage
via lit tests, and a unit test would require adding, unnecesarily, a log
reader (the reader is expected to be python, for the ML side, and there
is a reader for that under Analysis/models, used for tests).

Differential Revision: https://reviews.llvm.org/D141720
2023-01-17 13:12:27 -08:00
Benjamin Kramer
a3d58bbaff Detemplate llvm::EmitGEPOffset and move it into a cpp file. NFC. 2022-12-29 16:24:21 +01:00
Archibald Elliott
f09cf34d00 [Support] Move TargetParsers to new component
This is a fairly large changeset, but it can be broken into a few
pieces:
- `llvm/Support/*TargetParser*` are all moved from the LLVM Support
  component into a new LLVM Component called "TargetParser". This
  potentially enables using tablegen to maintain this information, as
  is shown in https://reviews.llvm.org/D137517. This cannot currently
  be done, as llvm-tblgen relies on LLVM's Support component.
- This also moves two files from Support which use and depend on
  information in the TargetParser:
  - `llvm/Support/Host.{h,cpp}` which contains functions for inspecting
    the current Host machine for info about it, primarily to support
    getting the host triple, but also for `-mcpu=native` support in e.g.
    Clang. This is fairly tightly intertwined with the information in
    `X86TargetParser.h`, so keeping them in the same component makes
    sense.
  - `llvm/ADT/Triple.h` and `llvm/Support/Triple.cpp`, which contains
    the target triple parser and representation. This is very intertwined
    with the Arm target parser, because the arm architecture version
    appears in canonical triples on arm platforms.
- I moved the relevant unittests to their own directory.

And so, we end up with a single component that has all the information
about the following, which to me seems like a unified component:
- Triples that LLVM Knows about
- Architecture names and CPUs that LLVM knows about
- CPU detection logic for LLVM

Given this, I have also moved `RISCVISAInfo.h` into this component, as
it seems to me to be part of that same set of functionality.

If you get link errors in your components after this patch, you likely
need to add TargetParser into LLVM_LINK_COMPONENTS in CMake.

Differential Revision: https://reviews.llvm.org/D137838
2022-12-20 11:05:50 +00:00
Sameer Sahasrabuddhe
475ce4c200 RFC: Uniformity Analysis for Irreducible Control Flow
Uniformity analysis is a generalization of divergence analysis to
include irreducible control flow:

  1. The proposed spec presents a notion of "maximal convergence" that
     captures the existing convention of converging threads at the
     headers of natual loops.

  2. Maximal convergence is then extended to irreducible cycles. The
     identity of irreducible cycles is determined by the choices made
     in a depth-first traversal of the control flow graph. Uniformity
     analysis uses criteria that depend only on closed paths and not
     cycles, to determine maximal convergence. This makes it a
     conservative analysis that is independent of the effect of DFS on
     CycleInfo.

  3. The analysis is implemented as a template that can be
     instantiated for both LLVM IR and Machine IR.

Validation:
  - passes existing tests for divergence analysis
  - passes new tests with irreducible control flow
  - passes equivalent tests in MIR and GMIR

Based on concepts originally outlined by
Nicolai Haehnle <nicolai.haehnle@amd.com>

With contributions from Ruiling Song <ruiling.song@amd.com> and
Jay Foad <jay.foad@amd.com>.

Support for GMIR and lit tests for GMIR/MIR added by
Yashwant Singh <yashwant.singh@amd.com>.

Differential Revision: https://reviews.llvm.org/D130746
2022-12-20 07:22:24 +05:30
Kazu Hirata
9112ec6ad0 [mlgo] Use LLVM_HAVE_TFLITE instead of LLVM_HAVE_TF_API
This patch replaces uses of LLVM_HAVE_TF_API with LLVM_HAVE_TFLITE in
a couple of CMakeLists.txt.

Now that 842b0d0fe2dd142305a9461e50cdce9aff7f86bc has landed,
we now have:

  LLVM_HAVE_TF_API is defined if and only if LLVM_HAVE_TFLITE
  evaluates to true

in the CMake variable world (assuming that you do not set
LLVM_HAVE_TF_API on the cmake invocation).

FWIW, the story is a little different in the C++ macro world, where:

  LLVM_HAVE_TF_API is defined if and only if LLVM_HAVE_TFLITE is
  defined

This is why edc83a15b45e6b91fce3f35622a6b0a6d34e5211 consisted only of
mechanical replacements.

Differential Revision: https://reviews.llvm.org/D140061
2022-12-15 11:11:24 -08:00
Kazu Hirata
9883ee6816 [Analysis] Remove TFUtils.cpp
Differential Revision: https://reviews.llvm.org/D139773
2022-12-12 08:38:55 -08:00