307 Commits

Author SHA1 Message Date
David Green
4875553f4c [AArch64][GlobalISel] Port unmerge KnownBits tests to print<gisel-value-tracking>. NFC
This takes the known-bits tests added in #112172 and ports them over to be a
new print<gisel-value-tracking> test.
2025-08-20 20:57:14 +01:00
jyli0116
961a4aabf8
[GlobalISel] Add constant matcher for APInt (#151357)
Changed m_SpecificICst, m_SpecificICstSplat and m_SpecificICstorSplat to
match against APInt as well.
2025-08-04 09:47:21 +01:00
Fabian Ritter
ef6eaa045a
[GISel] Introduce MIFlags::InBounds (#150900)
This flag applies to G_PTR_ADD instructions and indicates that the operation
implements an inbounds getelementptr operation, i.e., the pointer operand is in
bounds wrt. the allocated object it is based on, and the arithmetic does not
change that.

It is set when the IRTranslator lowers inbounds GEPs (currently only in some
cases, to be extended with a future PR), and in the
(build|materialize)ObjectPtrOffset functions.

Inbounds information is useful in ISel when we have instructions that perform
address computations whose intermediate steps must be in the same memory region
as the final result. A follow-up patch will start using it for AMDGPU's flat
memory instructions, where the immediate offset must not affect the memory
aperture of the address.

This is analogous to a concurrent effort in SDAG: #131862
(related: #140017, #141725).

For SWDEV-516125.
2025-07-30 13:01:23 +02:00
Fabian Ritter
d64240b5c6
[GISel] Introduce MachineIRBuilder::(build|materialize)ObjectPtrOffset (#150392)
These functions are for building G_PTR_ADDs when we know that the base
pointer and the result are both valid pointers into (or just after) the
same object. They are similar to SelectionDAG::getObjectPtrOffset.

This PR also changes call sites of the generic (build|materialize)PtrAdd
functions that implement pointer arithmetic to split large memory
accesses to the new functions. Since memory accesses have to fit into an
object in memory, pointer arithmetic to an offset into a large memory
access also yields an address in that object.

Currently, these (build|materialize)ObjectPtrOffset functions only add
"nuw" to the generated G_PTR_ADD, but I intend to introduce an
"inbounds" MIFlag in a later PR (analogous to a concurrent effort in
SDAG: #131862, related: #140017, #141725) that will also be set in the
(build|materialize)ObjectPtrOffset functions.

Most test changes just add "nuw" to G_PTR_ADDs. Exceptions are AMDGPU's
call-outgoing-stack-args.ll, flat-scratch.ll, and freeze.ll tests, where
offsets are now folded into scratch instructions, and cases where the
behavior of the check regeneration script changed, resulting, e.g., in
better checks for "nusw G_PTR_ADD" instructions, matched empty lines,
and the use of "CHECK-NEXT" in MIPS tests.

For SWDEV-516125.
2025-07-29 13:04:04 +02:00
Tim Gymnich
5aed4800f3
[GISel] KnownFPClass ValueTracking fix handling of vectors (#143372) 2025-06-12 14:43:40 +02:00
Tim Gymnich
760bf4f116
[GISel] Add KnownFPClass Analysis to GISelValueTrackingPass (#134611)
- add KnownFPClass analysis to GISelValueTrackingPass
- add MI pattern for `m_GIsFPClass`
2025-05-23 14:38:51 +02:00
David Green
ec406e8674
[GlobalISel] Add a GISelValueTracker printing pass (#139687)
This adds a GISelValueTrackingPrinterPass that can print the known bits
and sign bit of each def in a function. It is built on the new pass
manager and so adds a NPM GISelValueTrackingAnalysis, renaming the older
class to GISelValueTrackingAnalysisLegacy.

The first 2 functions from the AArch64GISelMITest are ported over to an
mir test to show it working. It also runs successfully on all files in
llvm/test/CodeGen/AArch64/GlobalISel/*.mir that are not invalid. It can
hopefully be used to test GlobalISel known bits analysis more directly
in common cases, without jumping through the hoops that the C++ tests
requires.
2025-05-14 11:05:04 +01:00
David Green
137aa573ca
[GlobalISel] Add computeNumSignBits for G_BUILD_VECTOR. (#139506)
The code is similar to SelectionDAG::ComputeNumSignBits, but does not
deal with truncating buildvectors.
2025-05-13 09:36:14 +01:00
Kazu Hirata
fc0f074d0d [CodeGen] Fix warnings
This patch fixes:

  third-party/unittest/googletest/include/gtest/gtest.h:1379:11:
  error: comparison of integers of different signs: 'const unsigned
  long' and 'const int' [-Werror,-Wsign-compare]
2025-05-05 10:18:46 -07:00
KRM7
0926d94453
[GlobalISel] Take the result size into account when const folding icmp (#134365)
The current implementation always creates a 1 bit constant for the
result of the `G_ICMP`, which will cause issues if the destination
register size is larger than that. With asserts enabled, it will cause a
crash in `buildConstant`:
```
llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp:322: virtual MachineInstrBuilder llvm::MachineIRBuilder::buildConstant(const DstOp &, const ConstantInt &): Assertion `EltTy.getScalarSizeInBits() == Val.getBitWidth() && "creating constant with the wrong size"' failed. 
```
2025-05-05 19:01:53 +02:00
Tim Gymnich
1d0005a69a
[GlobalISel][NFC] Rename GISelKnownBits to GISelValueTracking (#133466)
- rename `GISelKnownBits` to `GISelValueTracking` to analyze more than
just `KnownBits` in the future
2025-03-29 11:51:29 +01:00
Nikita Popov
f137c3d592
[TargetRegistry] Accept Triple in createTargetMachine() (NFC) (#130940)
This avoids doing a Triple -> std::string -> Triple round trip in lots
of places, now that the Module stores a Triple.
2025-03-12 17:35:09 +01:00
David Green
9bac1b63ac
[GlobalISel] Add and use a m_GAddLike pattern matcher. NFC (#125435)
This adds a Flags parameter to the BinaryOp_match, allowing it to detect
different flags like Disjoint. A m_GDisjointOr is added to detect Or's
with disjoint flags, and G_AddLike is then either a m_GADD or
m_GDisjointOr.

The rest is trying to allow matching `const MachineInstr&`, as opposed
to non-const references.
2025-03-10 22:03:36 +00:00
Nikita Popov
cc539138ac
[CodeGen] Use __extendhfsf2 and __truncsfhf2 by default (#126880)
The standard libcalls for half to float and float to half conversion are
__extendhfsf2 and __truncsfhf2. However, LLVM currently uses
__gnu_h2f_ieee and __gnu_f2h_ieee instead. As far as I can tell, these
libcalls are an ARM-ism and only provided by libgcc on that platform.
compiler-rt always provides both libcalls.

Use the standard libcalls by default, and only use the __gnu libcalls on
ARM.
2025-02-19 10:16:57 +01:00
Alan Li
220004d2f8
[GISel] Add more FP opcodes to CSE (#123949)
Resubmit, previously PR has compilation issues.
2025-01-22 23:00:08 -08:00
Danial Klimkin
c938436f71
Revert "[GISel] Add more FP opcodes to CSE (#123624)" (#123954)
This reverts commit 43177b524ee06dfc09cbc357ff277d4f53f5dc15.
2025-01-22 16:21:05 +01:00
lialan
43177b524e
[GISel] Add more FP opcodes to CSE (#123624)
This fixes #122724
2025-01-22 06:20:42 -08:00
Min-Yih Hsu
a74f825a7a
[MIPatternMatch] Add m_DeferredReg/Type (#121218)
This pattern does the same thing as m_SpecificReg/Type except the value
it matches against origniated from an earlier pattern in the same
mi_match expression.

This patch also changes how commutative patterns are handled: in order
to support m_DefferedReg/Type, we always have to run the LHS-pattern
before the RHS one.
2024-12-30 09:23:51 -08:00
Min-Yih Hsu
831e1ac12e
[MIPatternMatch] Add m_GUMin and m_GUMax (#121068)
And make all unsigned and signed versions of min/max matchers
commutative, since we already made a precedent of m_GAdd that is
commutative by default.
2024-12-26 09:28:17 -08:00
Min-Yih Hsu
d21f300f06
[MIPatternMatch] Fix incorrect argument type of m_Type (#121074)
m_Type is supposed to extract the underlying value type (equality type
comparison is covered by m_SpecificType), therefore it should take a LLT
reference as its argument rather than passing by value.

This was originated from de256478e61d6488db751689af82d280ba114a6f, which
refactored out a good chunk of LLT reference usages. And it's just so
happen that (for some reasons) no one is using m_Type and no test was
covering it.
2024-12-26 09:09:02 -08:00
Vikash Gupta
c21a3776c9
[GlobalIsel] [Utility] [NFC] Added isConstantOrConstantSplatVectorFP to handle float constants. (#120935)
Needed for #120104
2024-12-26 18:57:19 +05:30
Matin Raayai
bb3f5e1fed
Overhaul the TargetMachine and LLVMTargetMachine Classes (#111234)
Following discussions in #110443, and the following earlier discussions
in https://lists.llvm.org/pipermail/llvm-dev/2017-October/117907.html,
https://reviews.llvm.org/D38482, https://reviews.llvm.org/D38489, this
PR attempts to overhaul the `TargetMachine` and `LLVMTargetMachine`
interface classes. More specifically:
1. Makes `TargetMachine` the only class implemented under
`TargetMachine.h` in the `Target` library.
2. `TargetMachine` contains target-specific interface functions that
relate to IR/CodeGen/MC constructs, whereas before (at least on paper)
it was supposed to have only IR/MC constructs. Any Target that doesn't
want to use the independent code generator simply does not implement
them, and returns either `false` or `nullptr`.
3. Renames `LLVMTargetMachine` to `CodeGenCommonTMImpl`. This renaming
aims to make the purpose of `LLVMTargetMachine` clearer. Its interface
was moved under the CodeGen library, to further emphasis its usage in
Targets that use CodeGen directly.
4. Makes `TargetMachine` the only interface used across LLVM and its
projects. With these changes, `CodeGenCommonTMImpl` is simply a set of
shared function implementations of `TargetMachine`, and CodeGen users
don't need to static cast to `LLVMTargetMachine` every time they need a
CodeGen-specific feature of the `TargetMachine`.
5. More importantly, does not change any requirements regarding library
linking.

cc @arsenm @aeubanks
2024-11-14 13:30:05 -08:00
David Green
04546a0dd6
[GlobalISel] Support vector G_UNMERGE_VALUES in computeKnownBits. (#112172)
This adds computeKnownBits support for vector->vector G_UNMERGE_VALUES,
grabbing the known bits with an adjusted DemandedElts mask.
2024-10-15 08:23:05 +01:00
Michael Maitland
ee2add0683
[GISEL] Fix bugs and clarify spec of G_EXTRACT_SUBVECTOR (#108848)
The implementation was missing the fact that `G_EXTRACT_SUBVECTOR`
destination and source vector can be different types.

Also fix a bug in the MIR builder for `G_EXTRACT_SUBVECTOR` to generate
the correct opcode.

Clarify the G_EXTRACT_SUBVECTOR specification.
2024-09-17 10:08:39 -04:00
JOE1994
387bee91f0 [llvm][unittests] Strip unneeded uses of raw_string_ostream::str() (NFC)
Avoid excess layer of indirection.
2024-09-13 09:42:32 -04:00
Kyungwoo Lee
38c3855c9f
[NFC] Remove unused argument (FuncName) for parseMIR (#106144)
While working on a MIR unittest, I noticed that parseMIR includes an
unused argument that sets a function name. This is not only redundant
but also irrelevant, as parseMIR is designed to parse entire module, not
specific functions, even though most unittests contain a single function
per module. To streamline the API, I have removed this unnecessary
argument from parseMIR. However, if this argument was originally
included to enhance readability or for any other purpose, please let me
know.
2024-08-26 19:19:02 -07:00
Tobias Stadler
d2336fd75c
[RFC][GlobalISel] InstructionSelect: Allow arbitrary instruction erasure (#97670)
See https://discourse.llvm.org/t/rfc-globalisel-instructionselect-allow-arbitrary-instruction-erasure
2024-08-11 17:26:43 +02:00
Manish Kausik H
69192e0193
[LegalizeDAG] Optimize CodeGen for ISD::CTLZ_ZERO_UNDEF (#83039)
Previously we had the same instructions being generated for `ISD::CTLZ` and `ISD::CTLZ_ZERO_UNDEF` which did not take advantage of the fact that zero is an invalid input for `ISD::CTLZ_ZERO_UNDEF`. This commit separates codegen for the two cases to allow for the optimization for the latter case.

The details of the optimization are outlined in #82075

Fixes #82075

Co-authored-by: Manish Kausik H <hmamishkausik@gmail.com>
2024-07-08 14:01:32 +01:00
Kazu Hirata
75bc20ff89
[llvm] Remove redundant calls to std::unique_ptr<T>::get (NFC) (#97914) 2024-07-07 08:23:41 +09:00
darkbuck
1ff05876fb
[GlobalISel] Add unit tests for call lowering on byref support
Reviewers: tschuett, spaits, aemerson, arsenm

Reviewed By: spaits, arsenm

Pull Request: https://github.com/llvm/llvm-project/pull/96805
2024-06-27 13:19:44 -04:00
Serge Pavlov
f9795f34a6
[GlobalISel] Add build methods for FP environment intrinsics (#96607)
This change adds methods like buildGetFPEnv and similar for opcodes that
represent manipulation on floating-point state.
2024-06-25 16:13:52 +07:00
Pierre van Houtryve
ed299b3efd
[GlobalISel] Optimize ULEB128 usage (#90565)
- Remove some cases where ULEB128 isn't needed
- Add a fastDecodeULEB128 tailored for GlobalISel which does unchecked
decoding optimized for the common case, which is 1 byte values. We
rarely have >1 byte Inst IDs, OpIdx, etc. and those are the most common
ULEB users by far.

This specific LEB128 decode function generates almost 2x less
instructions than the generic one.
2024-05-03 10:26:54 +02:00
chuongg3
bf57d2e57c
[AArch64][GlobalISel] Enable computeNumSignBits for G_XOR, G_AND, G_OR (#89896) 2024-04-29 10:53:30 +01:00
Dávid Ferenc Szabó
2347020e4c
[GlobalISel] Fix fewerElementsVectorPhi to insert after G_PHIs (#87927)
Currently the inserted mergelike instructions will be inserted at the
location of the G_PHI. Seems like the behaviour was correct before, but
the rework done in https://reviews.llvm.org/D114198 forgot to include
the part which makes sure the instructions will be inserted after all
the G_PHIs.
2024-04-15 11:01:55 +02:00
Shilei Tian
3a106e5b2c
[GlobalISel] Fold G_ICMP if possible (#86357)
This patch tries to fold `G_ICMP` if possible.
2024-03-29 15:59:50 -04:00
David Green
47f4a07a2f
[GlobalISel] Add Knownbits for G_LOAD/ZEXTLOAD/SEXTLOAD with range metadata (#86431)
Similar to #80829 for GlobalISel.
2024-03-26 13:42:08 +00:00
Shilei Tian
0a4299403e
[GlobalISel] Fold G_CTTZ if possible (#86224)
This patch tries to fold `G_CTTZ` if possible.
2024-03-25 16:55:37 -04:00
Michael Maitland
96049fcf4e [GISEL] Add IRTranslation for shufflevector on scalable vector types (#80378)
Recommits llvm/llvm-project#80378 which was reverted in
llvm/llvm-project#84330. The problem was that the change in
llvm/test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir used
217 as an opcode instead of a regex.
2024-03-07 09:10:03 -08:00
Michael Maitland
552da24843
Revert "[GISEL] Add IRTranslation for shufflevector on scalable vector types" (#84330)
Reverts llvm/llvm-project#80378

causing Buildbot failures that did not show up with check-llvm or CI.
2024-03-07 10:16:31 -05:00
Michael Maitland
2b8aaef09e
[GISEL] Add IRTranslation for shufflevector on scalable vector types (#80378)
This patch is stacked on
https://github.com/llvm/llvm-project/pull/80372,
https://github.com/llvm/llvm-project/pull/80307, and
https://github.com/llvm/llvm-project/pull/80306.

ShuffleVector on scalable vector types gets IRTranslate'd to
G_SPLAT_VECTOR since a ShuffleVector that has operates on scalable
vectors is a splat vector where the value of the splat vector is the 0th
element of the first operand, because the index mask operand is the
zeroinitializer (undef and poison are treated as zeroinitializer here).
This is analogous to what happens in SelectionDAG for ShuffleVector.

`buildSplatVector` is renamed to`buildBuildVectorSplatVector`. I did not
make this a separate patch because it would cause problems to revert
that change without reverting this change too.
2024-03-07 09:50:29 -05:00
Michael Maitland
c954986fec
[GISel] Add support for scalable vectors in getGCDType (#80307)
This function can be called from buildCopyToRegs where at least one of
the types is a scalable vector type. This function crashed because it
did not know how to handle scalable vector types.

This patch extends the functionality of getGCDType to handle when at
least one of the types is a scalable vector. getGCDType between a fixed
and scalable vector is not implemented since the docstring of the
function explains that getGCDType is used to build MERGE/UNMERGE
instructions and we will never build a MERGE/UNMERGE between fixed and
scalable vectors.

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2024-02-07 10:32:12 -05:00
Michael Maitland
055ac72ecc
[GISel] Add support for scalable vectors in getLCMType (#80306)
This function can be called from buildCopyToRegs where at least one of
the types is a scalable vector type. This function crashed because it
did not know how to handle scalable vector types.

This patch extends the functionality of getLCMType to handle when at
least one of the types is a scalable vector. getLCMType between a fixed
and scalable vector is not implemented since the docstring of the
function explains that getLCMType is used to build MERGE/UNMERGE
instructions and we will never build a MERGE/UNMERGE between fixed and
scalable vectors.
2024-02-06 20:23:07 -05:00
Kai Nacke
f2d0bba874
[GISel] Lower scalar G_SELECT in LegalizerHelper (#79342)
The LegalizerHelper only has support to lower G_SELECT with
vector operands. The approach is the same for scalar arguments,
which this PR adds.
2024-01-26 09:11:29 -05:00
David Green
d659bd1635
[GlobalISel][AArch64] Tail call libcalls. (#74929)
This tries to allow libcalls to be tail called, using a similar method
to DAG where the type is checked to make sure they match, and if so the
backend, through lowerCall checks that the tailcall is valid for all
arguments.
2024-01-03 07:59:36 +00:00
Nikita Popov
261b471015
[FileCheck] Don't use regex to find prefixes (#72237)
FileCheck currently compiles a regular expression of the form
`Prefix1|Prefix2|...` and uses it to find the next prefix in the input.

If we had a fast regex implementation, this would be a useful thing to
do, as the regex implementation would be able to match multiple prefixes
more efficiently than a naive approach. However, with our actual regex
implementation, finding the prefixes basically becomes O(InputLen *
RegexLen * LargeConstantFactor), which is a lot worse than a simple
string search.

Replace the regex with StringRef::find(), and keeping track of the next
position of each prefix. There are various ways this could be improved
on, but it's already significantly faster that the previous approach.

For me, this improves check-llvm time from 138.5s to 132.5s, so by
around 4-5%.

For vector-interleaved-load-i16-stride-7.ll in particular, test time
drops from 5s to 2.5s.
2023-11-15 09:34:52 +01:00
Tobias Stadler
373c343a77 Reland: [GlobalISel] LegalizationArtifactCombiner: Elide redundant G_AND
Reland 3686a0b after fixing an exposed miscompile in #68840

Differential Revision: https://reviews.llvm.org/D159140
2023-11-02 00:18:19 +01:00
Tobias Stadler
305fbc1b32 Revert "[GlobalISel] LegalizationArtifactCombiner: Elide redundant G_AND"
This reverts commit 3686a0b611c65f0d7190345b8e3e73cdca9fa657.
This seems to have broken some sanitizer tests:
https://lab.llvm.org/buildbot/#/builders/184/builds/7721
2023-09-29 03:35:40 +02:00
Tobias Stadler
3686a0b611 [GlobalISel] LegalizationArtifactCombiner: Elide redundant G_AND
The legalizer currently generates lots of G_AND artifacts.
For example between boolean uses and defs there is always a G_AND with a mask of 1, but when the target uses ZeroOrOneBooleanContents, this is unnecessary.
Currently these artifacts have to be removed using post-legalize combines.
Omitting these artifacts at their source in the artifact combiner has a few advantages:
- We know that the emitted G_AND is very likely to be useless, so our KnownBits call is likely worth it.
- The G_AND and G_CONSTANT can interrupt e.g. G_UADDE/... sequences generated during legalization of wide adds which makes it harder to detect these sequences in the instruction selector (e.g. useful to prevent unnecessary reloading of AArch64 NZCV register).
- This cleans up a lot of legalizer output and even improves compilation-times.
AArch64 CTMark geomean: `O0` -5.6% size..text; `O0` and `O3` ~-0.9% compilation-time (instruction count).

Since this introduces KnownBits into code-paths used by `O0`, I reduced the default recursion depth.
This doesn't seem to make a difference in CTMark, but should prevent excessive recursive calls in the worst case.

Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D159140
2023-09-29 02:11:57 +02:00
Arthur Eubanks
0a1aa6cda2
[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295)
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.

This matches other nearby enums.

For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
2023-09-14 14:10:14 -07:00
Sameer Sahasrabuddhe
d9847cde48 [GlobalISel] convergent intrinsics
Introduced the convergent equivalent of the existing G_INTRINSIC opcodes:

- G_INTRINSIC_CONVERGENT
- G_INTRINSIC_CONVERGENT_W_SIDE_EFFECTS

Out of the targets that currently have some support for GlobalISel, the patch
assumes that the convergent intrinsics only relevant to SPIRV and AMDGPU.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D154766
2023-07-31 12:15:39 +05:30