35598 Commits

Author SHA1 Message Date
Noah Goldstein
7013638978 [DAG] Add support for nneg flag with uitofp
Copy `nneg` flag when building `UINT_TO_FP` from `uitofp` and use
`nneg` flag in the one place we transform `UINT_TO_FP` -> `SINT_TO_FP`
if the operand is non-negative.
2024-04-09 23:06:55 -05:00
Björn Pettersson
5d9d740c39
Remove the unused IntervalPartition analysis pass (#88133)
This removes the old legacy PM "intervals" analysis pass (aka
IntervalPartition). It also removes the associated Interval and
IntervalIterator help classes.

Reasons for removal:
1) The pass is not used by llvm-project (not even being tested by
   any regression tests).
2) Pass has not been ported to new pass manager, which at least
   indicates that it isn't used by the middle-end.
3) ASan reports heap-use-after-free on
      ++I;  // After the first one...
   even if false is passed to intervals_begin. Not sure if that is
   a false positive, but it makes the code a bit less trustworthy.
2024-04-09 20:12:26 +02:00
Fangrui Song
d3016aa889
[DWARF] Refactor .debug_names bucket count computation (#88087)
`getDebugNamesBucketAndHashCount` lures users to provide an array to
compute the bucket count using an O(n log n) sort. This is inefficient
as hash table based uniquifying is faster.

The performance issue matters less for Clang as the number of names is
relatively small. For `ld.lld --debug-names`, I plan to compute the
unique hash count as a side product of parallel entry pool computation,
and I just need a function to suggest a bucket count.
2024-04-09 11:02:39 -07:00
Qiu Chaofan
71eda17a06
[Legalizer] Soften EXTRACT_ELEMENT on ppcf128 (#77412)
ppc_fp128 values are always split into two f64. Implement soften
operation in soft-float mode to handle output f64 correctly.
2024-04-09 10:26:24 +08:00
Leonard Grey
c23135c548
-fsanitize=function: fix .subsections_via_symbols (#87527)
-fsanitize=function emits a signature and function hash before a
function. Similar to 7f6e2c9, these can be sheared off when
`.subsections_via_symbols` is used.

This change uses the same technique 7f6e2c9 introduced for prefixes:
emitting a symbol for the metadata, then marking the actual function
entry as an .alt_entry symbol.
2024-04-08 16:05:52 -04:00
Malay Sanghi
38f996bb2b
Replace copy with a reference. (#87975) 2024-04-08 20:31:51 +08:00
David Green
9fd2e2c2fd
[DAG][AArch64] Support masked loads/stores with nontemporal flags (#87608)
SVE has some non-temporal masked loads and stores. The metadata coming
from the nodes is not copied to the MMO at the moment though, meaning it
will generate a normal instruction. This patch ensures that the right
flags are set if the instruction has non-temporal metadata.
2024-04-08 08:53:27 +01:00
David Green
ac321cbb03
[AArch64][GlobalISel] Legalize Insert vector element (#81453)
This attempts to standardize and extend some of the insert vector
element lowering. Most notably:
- More types are handled by splitting illegal vectors.
- The index type for G_INSERT_VECTOR_ELT is canonicalized to
  TLI.getVectorIdxTy(), similar to extact_vector_element.
- Some of the existing patterns now have the index type specified to
  make sure they can apply to GISel too.
- The C++ selection code has been removed, relying on tablegen patterns.
- G_INSERT_VECTOR_ELT with small GPR input elements are pre-selected to
  use a i32 type, allowing the existing patterns to apply.
- Variable index inserts are lowered in post-legalizer lowering,
  expanding into a stack store and reload.
2024-04-08 08:44:13 +01:00
Bevin Hansson
110c22fe12
[ExpandLargeFpConvert] Support bfloat. (#87619)
The conversion expansions did not properly handle bfloat types.

I'm not certain that these expansions are completely correct;
I don't have any experience with AMDGPU or the ability to run
anything to test it.

Note that it doesn't seem like AMDGPU with GlobalISel can
handle fptrunc of float to bfloat, which is needed for itofp.
I've omitted the GISEL run for the bfloat case.

This fixes #85379.
2024-04-08 09:07:55 +02:00
Haohai Wen
cebf77fb93
[CodeGen][DebugInfo] Add missing DebugLoc for SplitCriticalEdge (#72192)
In SplitCriticalEdge, DebugLoc of the branch instruction in new created
MBB was set to empty. It should be set and we can find proper DebugLoc
for it in most cases. This patch set it to non empty merged DebugLoc of
current MBB branches.
2024-04-08 09:44:34 +08:00
darkbuck
8e98435ae9
[GISel][Combine] Enhance combining on G_BUILD_VECTOR
Reviewers: aemerson, arsenm

Reviewed By: arsenm

Pull Request: https://github.com/llvm/llvm-project/pull/87831
2024-04-06 18:33:01 -04:00
Matt Arsenault
4cb110a84f
[RFC] IR: Support atomicrmw FP ops with vector types (#86796)
Allow using atomicrmw fadd, fsub, fmin, and fmax with vectors of
floating-point type. AMDGPU supports atomic fadd for <2 x half> and <2 x
bfloat> on some targets and address spaces.

Note this only supports the proper floating-point operations; float
vector typed xchg is still not supported. cmpxchg still only supports
integers, so this inserts bitcasts for the loop expansion.

I have support for fp vector typed xchg, and vector of int/ptr
separately implemented but I don't have an immediate need for those
beyond feature consistency.
2024-04-06 15:27:45 -04:00
Amara Emerson
60fc4ac67a [GlobalISel] Don't form anyextending atomic loads.
Until we can reliably check the legality and improve our selection of these,
don't form them at all.
2024-04-05 13:34:59 -07:00
Michael Liao
a1b2f0cc44 Reland "[GlobalISel] Fix the infinite loop issue in commute_int_constant_to_rhs"
- That test needs to disable combine rules by name and hence requires `asserts`.
2024-04-05 10:34:12 -04:00
Gulfem Savrun Yeniceri
be8fd86f6a Revert "[GlobalISel] Fix the infinite loop issue in commute_int_constant_to_rhs"
This reverts commit 1f01c580444ea2daef67f95ffc5fde2de5a37cec
because combine-commute-int-const-lhs.mir test failed in
multiple builders.
https://lab.llvm.org/buildbot/#/builders/124/builds/10375
https://luci-milo.appspot.com/ui/p/fuchsia/builders/prod/clang-linux-x64/b8751607530180046481/overview
2024-04-04 16:39:31 +00:00
Jay Foad
1b761205f2
[APInt] Add a simpler overload of multiplicativeInverse (#87610)
The current APInt::multiplicativeInverse takes a modulus which can be
any value, but all in-tree callers use a power of two. Moreover, most
callers want to use two to the power of the width of an existing APInt,
which is awkward because 2^N is not representable as an N-bit APInt.

Add a new overload of multiplicativeInverse which implicitly uses
2^BitWidth as the modulus.
2024-04-04 16:11:06 +01:00
Piotr Sobczak
5b59ae423a
[DAG] Preserve NUW when reassociating (#87621)
Similarly to the generic case below, preserve the NUW flag when
reassociating adds with constants.
2024-04-04 16:47:25 +02:00
Simon Pilgrim
2d0087424f
[DAG] Remove extract_vector_elt(freeze(x)), idx -> freeze(extract_vector_elt(x), idx) fold (#87480)
Reverse the fold with handling inside canCreateUndefOrPoison for cases where we know that the extract index is in bounds.

This exposed a number or regressions, and required some initial freeze handling of SCALAR_TO_VECTOR, which will require us to properly improve demandedelts support to handle its undef upper elements.

There is still one outstanding regression to be addressed in the future - how do we want to handle folds involving frozen loads?

Fixes #86968
2024-04-04 11:10:55 +01:00
Simon Pilgrim
a9d963fdf8 [DAG] SoftenFloatResult - add clang-format off/on tags around switch statement. NFC.
Stop clang-format from trying to put all the case on separate lines.
2024-04-04 11:02:02 +01:00
Stephen Tozer
708ce85690
[RemoveDIs][NFC] Use ScopedDbgInfoFormatSetter in more places (#87380)
The class `ScopedDbgInfoFormatSetter` was added as a convenient way to
temporarily change the debug info format of a function or module, as
part of IR printing; since this process is repeated in a number of other
places, this patch uses the format-setter class in those places as well.
2024-04-04 10:20:14 +01:00
Luke Lau
3a7b5223a6
[DAGCombiner][RISCV] Handle truncating splats in isNeutralConstant (#87338)
On RV64, we legalize zexts of i1s to (vselect m, (splat_vector i64 1),
(splat_vector i64 0)), where the splat_vectors are implicitly
truncating.

When the vselect is used by a binop we want to pull the vselect out via
foldSelectWithIdentityConstant. But because vectors with an element size
< i64 will truncate, isNeutralConstant will return false.

This patch handles truncating splats by getting the APInt value and
truncating it. We almost don't need to do this since most of the neutral
elements are either one/zero/all ones, but it will make a difference for
smax and smin.

I wasn't able to figure out a way to write the tests in terms of select,
since we need the i1 zext legalization to create a truncating
splat_vector.

This supercedes #87236. Fixed vectors are unfortunately not handled by
this patch (since they get legalized to _VL nodes), but they don't seem
to appear in the wild.
2024-04-04 12:36:15 +08:00
darkbuck
1f01c58044
[GlobalISel] Fix the infinite loop issue in commute_int_constant_to_rhs
- When both operands are constant, the matcher runs into an infinite
  loop as the commutation should be applied only when LHS is a constant
  and RHS is not.

Reviewers: arsenm

Reviewed By: arsenm

Pull Request: https://github.com/llvm/llvm-project/pull/87426
2024-04-03 20:52:21 -04:00
Michael Maitland
8aa3a77eaf [RISCV][GISEL] Legalize G_ZEXT, G_SEXT, and G_ANYEXT, G_SPLAT_VECTOR, and G_ICMP for scalable vector types
This patch legalizes G_ZEXT, G_SEXT, and G_ANYEXT. If the type is a
legal mask type, then the instruction is legalized as the element-wise
select, where the condition on the select is the mask typed source
operand, and the true and false values are 1 or -1 (for
zero/any-extension and sign extension) and zero. If the type is a legal integer
or vector integer type, then the instruction is marked as legal.

The legalization of the extends may introduce a G_SPLAT_VECTOR, which
needs to be legalized in this patch for the extend test cases to pass.

A G_SPLAT_VECTOR is legal if the vector type is a legal integer or
floating point vector type and the source operand is sXLen type. This is
because the SelectionDAG patterns only support sXLen typed
ISD::SPLAT_VECTORS, and we'd like to reuse those patterns. A
G_SPLAT_VECTOR is cutom legalized if it has a legal s1 element vector
type and s1 scalar operand. It is legalized to G_VMSET_VL or G_VMCLR_VL
if the splat is all ones or all zeros respectivley. In the case of a
non-constant mask splat, we legalize by promoting the scalar value to
s8.

In order to get the s8 element vector back into s1 vector, we use a
G_ICMP. In order for the splat vector and extend tests to pass, we also
need to legalize G_ICMP in this patch.

A G_ICMP is legal if the destination type is a legal bool vector and the LHS and
RHS are legal integer vector types.
2024-04-03 15:27:15 -07:00
Jay Foad
a6170d5b7e
[SelectionDAG] Dump convergencectrl_glue DAG node (#87487) 2024-04-03 16:21:57 +01:00
Simon Pilgrim
39eedfded4 [DAG] visitADDLikeCommutative - convert (add x, shl(0 - y, n)) fold to SDPatternMatch. NFC. 2024-04-03 15:37:38 +01:00
aniplcc
d650fcd6bf
[DAG] SimplifyDemandedVectorElts - add ISD::AVGCEILS/AVGCEILU/AVGFLOORS/AVGFLOORU nodes (#86284)
Fixes #84768
2024-04-03 15:00:50 +01:00
AinsleySnow
52b18430ae
[VP][DAGCombine] Use simplifySelect when combining vp.select. (#87342)
Hi all,

This patch is a follow-up of #79101. It migrates logic from
`visitVSELECT` to `visitVP_SELECT` to simplify `vp.select`. With this
patch we can do the following combinations:

```
vp.select undef, T, F --> T (if T is a constant), F otherwise
vp.select <condition>, undef, F --> F
vp.select <condition>, T, undef --> T
vp.select false, T, F --> F
vp.select <condition>, T, T --> T
```

I'm a total newbie to llvm and I'm sure there's room for improvements in
this patch. Please let me know if you have any advice. Thank you in
advance!
2024-04-03 07:45:50 -04:00
Gleb Popov
0356d0cfdc
Print more descriptive error message when trying to link a global with appending linkage (#69613)
This is a proper fix for https://github.com/llvm/llvm-project/issues/40308
2024-04-03 12:26:12 +01:00
Elizaveta Noskova
4dd103e9c6
[CodeGen][ShrinkWrap] Clarify StackAddressUsedBlockInfo meaning (#80679) 2024-04-03 11:22:43 +03:00
Bevin Hansson
7edddee2aa
[ExpandLargeFpConvert] Scalarize vector types. (#86954)
expand-large-fp-convert cannot handle vector types.
If overly large vector element types survive into
isel, they will likely be scalarized there, but since
isel cannot handle scalar integer types of that size,
it will assert.

Handle vector types in expand-large-fp-convert by
scalarizing them and then expanding the scalar type
operation. For large vectors, this results in a
*massive* code expansion, but it's better than
asserting.
2024-04-03 08:45:59 +02:00
Ryotaro KASUGA
ea4a11926b
Reapply "[CodeGen] Fix register pressure computation in MachinePipeli… (#87312)
…ner (#87030)"

Fix broken test.

This reverts commit b8ead2198f27924f91b90b6c104c1234ccc8972e.
2024-04-03 09:28:09 +09:00
Matt Arsenault
7f2a41b643 MachineScheduler: Simplify usage of TargetInstrInfo 2024-04-02 16:24:47 -04:00
Prabhuk
212b1a84a6
[CallSiteInfo][NFC] CallSiteInfo -> CallSiteInfo.ArgRegPairs (#86842)
CallSiteInfo is originally used only for argument - register pairs. Make
it struct, in which we can store additional data for call sites.

Also, the variables/methods used for CallSiteInfo are named for its
original use case, e.g., CallFwdRegsInfo. Refactor these for the
upcoming
use, e.g. addCallArgsForwardingRegs() -> addCallSiteInfo().

An upcoming patch will add type ids for indirect calls to propogate them
from
middle-end to the back-end. The type ids will be then used to emit the
call
graph section.

Original RFC:
https://lists.llvm.org/pipermail/llvm-dev/2021-June/151044.html
Updated RFC:
https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Differential Revision: https://reviews.llvm.org/D107109?id=362888

Co-authored-by: Necip Fazil Yildiran <necip@google.com>
2024-04-02 13:05:16 -07:00
Atousa Duprat
4aba595f09
[ADT] Add signed and unsigned mulh to APInt (#84719)
Fixes #84207
2024-04-02 17:07:56 +01:00
Bevin Hansson
cd6434f9ec
[ExpandLargeDivRem] Scalarize vector types. (#86959)
expand-large-divrem cannot handle vector types.
If overly large vector element types survive into
isel, they will likely be scalarized there, but since
isel cannot handle scalar integer types of that size,
it will assert.

Handle vector types in expand-large-divrem by
scalarizing them and then expanding the scalar type
operation. For large vectors, this results in a
*massive* code expansion, but it's better than
asserting.
2024-04-02 16:37:36 +02:00
Il-Capitano
0ef7437780
[SelectionDAG][Statepoint] Fix truncation of gc.statepoint ID argument (#85908)
The ID argument of `gc.statepoint` gets incorrectly truncated to 32 bits
during code generation.
This is fixed by using `uint64_t` instead of `unsigned` for the `ID`
member in `SelectionDAGBuilder::StatepointLoweringInfo`, and a
`patchpoint` test case is extended to check for 64 bit ID generation in
stackmaps.
2024-04-02 09:28:19 -04:00
Sizov Nikita
6654235594
[SelectionDAG] implement computeKnownBits for add AVG* instructions (#86754)
knownBits calculation for **AVGFLOORU** / **AVGFLOORS** / **AVGCEILU** / **AVGCEILS** instructions

Prerequisite for #76644
2024-04-02 10:39:49 +01:00
Thorsten Schütt
8bb9443333
[GlobalIsel] Combine G_EXTRACT_VECTOR_ELT (#85321)
preliminary steps
2024-04-02 09:01:24 +02:00
Gulfem Savrun Yeniceri
b8ead2198f Revert "[CodeGen] Fix register pressure computation in MachinePipeliner (#87030)"
This reverts commit a4dec9d6bc67c4d8fbd4a4f54ffaa0399def9627
because the test failed in the following builder:
https://luci-milo.appspot.com/ui/p/fuchsia/builders/prod/clang-linux-x64/b8751864477467126481/overview
2024-04-01 18:27:41 +00:00
Michael Maitland
da9f06c9b1
[GISEL] G_SPLAT_VECTOR can take a splat that is larger than the vector element (#86974)
This is what SelectionDAG does. We'd like to reuse SelectionDAG
patterns.
2024-04-01 08:46:22 -04:00
Ryotaro KASUGA
a4dec9d6bc
[CodeGen] Fix register pressure computation in MachinePipeliner (#87030)
`RegisterClassInfo::getRegPressureSetLimit` has been changed to return a
smaller value than before so the limit may become negative in later
calculations. As a workaround, change to use
`TargetRegisterInfo::getRegPressureSetLimit`.
Also improve tests.
2024-04-01 17:04:44 +09:00
Vitaly Buka
c4df57da1d [CodeGen] llvm.allow.{runtime,ubsan}.check() in FastISel
Follow up to #86049.
clang-armv8-quick build bot can trigger this branch.
2024-03-31 23:39:33 -07:00
Sameer Sahasrabuddhe
421557974a
[AMDGPU] Use glue for convergence tokens at call-like operations (#86766)
The earlier implementation on AMDGPU used explicit token operands at
SI_CALL and SI_CALL_ISEL. This is now replaced with CONVERGENCECTRL_GLUE
operands, with the following effects:

- The treatment of tokens at call-like operations is now consistent with
the treatment at intrinsics.
- Support for tail calls using implicit tokens at SI_TCRETURN "just
works".
- The extra parameter at call-like instructions is eliminated, thus
restoring those instructions and their handling to the original state.

The new glue node is placed after the existing glue node for the
outgoing call parameters, which seems to not interfere with selection of
the call-like nodes.
2024-04-01 10:51:13 +05:30
Vitaly Buka
20f56e1f8e
[CodeGen] Add default lowering for llvm.allow.{runtime,ubsan}.check() (#86049)
RFC:
https://discourse.llvm.org/t/rfc-add-llvm-experimental-hot-intrinsic-or-llvm-hot/77641
2024-03-31 22:19:33 -07:00
Shilei Tian
3a106e5b2c
[GlobalISel] Fold G_ICMP if possible (#86357)
This patch tries to fold `G_ICMP` if possible.
2024-03-29 15:59:50 -04:00
Shilei Tian
360f7f5674
[GlobalISel] Call setInstrAndDebugLoc before tryCombineAll (#86993)
This can remove all unnecessary redundant calls in each combiner.
2024-03-29 15:27:28 -04:00
Kevin P. Neal
fe893c93b7
[FPEnv][AtomicExpand] Correct strictfp attribute handling in AtomicExpandPass (#87082)
The AtomicExpand pass was lowering function calls with the strictfp
attribute to sequences that included function calls incorrectly lacking
the attribute. This patch corrects that.

The pass now also emits the correct constrained fp call instead of
normal FP instructions when in a function with the strictfp attribute.

Test changes verified with D146845.
2024-03-29 14:54:51 -04:00
Shilei Tian
661bb9daae
[GlobalISel] Handle div-by-pow2 (#83155)
This patch adds similar handling of div-by-pow2 as in `SelectionDAG`.
2024-03-29 12:41:47 -04:00
Thorsten Schütt
84299df301
[GlobalIsel] add trunc flags (#87045)
https://github.com/llvm/llvm-project/pull/85592
2024-03-29 13:38:08 +01:00
Wang Pengcheng
610b9e23c5
[SDAG] Use shifts if ISD::MUL is illegal when lowering ISD::CTPOP (#86505)
We can avoid libcalls.

Fixes #86205
2024-03-29 15:38:39 +08:00