56432 Commits

Author SHA1 Message Date
Jorge Gorbe Moya
71e434d302
[SandboxVec] Reapply "Add barebones Region class. (#108899)" (#109059)
A `#ifndef NDEBUG` in the wrong place caused an error in release builds.
2024-09-18 11:36:45 -07:00
Craig Topper
292ee93a87
[CodeGen] Use Register in SwitchLoweringUtils. NFC (#109092)
Use an empty Register() instead of -1U.
2024-09-18 09:43:21 -07:00
Rahul Joshi
2731be7ac5
[Support] Add helper struct indent for adding indentation (#108966)
Add helper struct indent() for adding indentation to raw_ostream.
2024-09-18 08:54:11 -07:00
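The indentation helper described in the commit above can be pictured with a small standalone sketch. It targets `std::ostream` rather than `raw_ostream`, and the names `Indent` and `indentedLine` are hypothetical stand-ins; the real helper's interface may differ.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the helper; the real one works with
// llvm::raw_ostream and may expose a different interface.
struct Indent {
  int NumSpaces;
  explicit Indent(int NumSpaces) : NumSpaces(NumSpaces) {
    assert(NumSpaces >= 0 && "indentation cannot be negative");
  }
};

// Streaming an Indent writes that many spaces and nothing else.
inline std::ostream &operator<<(std::ostream &OS, const Indent &I) {
  for (int N = 0; N < I.NumSpaces; ++N)
    OS << ' ';
  return OS;
}

// Renders one line of text at the given nesting level (two spaces per level).
inline std::string indentedLine(int Level, const std::string &Text) {
  std::ostringstream OS;
  OS << Indent(Level * 2) << Text << '\n';
  return OS.str();
}
```

Wrapping the space count in a type keeps call sites readable (`OS << Indent(4) << "text"`) compared with building a temporary string of spaces.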
Lei Huang
4b524088a8
[NFC] Update function names in MCTargetAsmParser.h (#108643)
Update function names to adhere to LLVM coding standard.
2024-09-18 11:43:49 -04:00
Rahul Joshi
47c3df2a7f
[LLVM][TableGen] Change CallingConvEmitter to use const RecordKeeper (#108955)
Change CallingConvEmitter to use const RecordKeeper.

This is part of an effort to improve const correctness in TableGen
backends:


https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
2024-09-18 07:19:40 -07:00
Phoebe Wang
a10c9f994b
Revert "[X86][BF16] Add libcall for F80 -> BF16" (#109140)
Reverts llvm/llvm-project#109116
2024-09-18 21:35:38 +08:00
Phoebe Wang
76eda76f9f
[X86][BF16] Add libcall for F80 -> BF16 (#109116)
This fixes #108936, but the calling convention doesn't match with GCC. I
doubt we have such a lib function for now, so leave the calling
convention as is.
2024-09-18 21:23:10 +08:00
Rahul Joshi
ef71226fcd
[LLVM][TableGen] Change WebAsm Emitter to use const RecordKeeper (#109051)
Change WebAssemblyDisassemblerEmitter to use const RecordKeeper.

This is part of an effort to improve const correctness in TableGen
backends:


https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
2024-09-18 05:37:54 -07:00
Benjamin Maxwell
43c9203d49
[TLI] Support inferring function attributes for sincos[f|l] (#108554) 2024-09-18 09:40:29 +01:00
David Green
112aac4e89
[InstCombine] Fold fmod to frem if we know it does not set errno. (#107912)
fmod will be folded to frem in clang under -fno-math-errno and can be constant
folded in llvm if the operands are known. It can be relatively common to have
fp code that handles special values before doing some calculation:
```
if (isnan(f))
  return handlenan;
if (isinf(f))
  return handleinf;
..
fmod(f, 2.0)
```

This patch enables the folding of fmod to frem in instcombine if the first
parameter is not inf and the second is not zero. Other combinations do not set
errno.

The same transform is performed for fmod with the nnan flag, which implies the
input is known to not be inf/zero.
2024-09-18 09:38:28 +01:00
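The condition in the fmod commit above can be written out as a small predicate. This is a sketch of the reasoning only, not the InstCombine code; `fmodMaySetErrno` and `fmodNoErrno` are hypothetical names, and the predicate is deliberately conservative.

```cpp
#include <cassert>
#include <cmath>

// Conservative check: fmod(X, Y) can report a domain error only when the
// first operand is infinite or the second operand is zero.
inline bool fmodMaySetErrno(double X, double Y) {
  return std::isinf(X) || Y == 0.0;
}

// When the errno-setting cases are ruled out, fmod is a pure remainder
// computation, which is what makes a fold to frem safe.
inline double fmodNoErrno(double X, double Y) {
  assert(!fmodMaySetErrno(X, Y) && "caller must rule out the errno cases");
  return std::fmod(X, Y);
}
```

For the `fmod(f, 2.0)` pattern in the commit message, the second operand is a nonzero constant, so only the "first operand is not inf" half has to be proven from the preceding `isinf` check.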
Craig Topper
fe012bd52d [SelectionDAG] Use Register around RegisterSDNode related functions. NFC
RegisterSDNode itself already stored a Register.
2024-09-17 23:26:56 -07:00
Fangrui Song
125635eb68 [CMake] Remove unused HAVE_SYS_PARAM_H/HAVE_SYS_TYPES_H 2024-09-17 22:55:53 -07:00
Max Winkler
8280651ad5
[llvm] [Demangle] Fix MSVC demangling for placeholder return types (#106178)
Properly demangle `_T` and `_P` return type manglings for MSVC 1920+.
Also added a unit test for `@` return type that is used when mangling
non-template auto placeholder return type function.

Tested the output against the undname shipped with MSVC 19.40.
2024-09-17 20:05:44 -07:00
vporpo
42c5a301f5
[SandboxVec] Legality boilerplate (#108650)
This patch adds the basic API for the Legality component of the
vectorizer. It also adds some very basic code in the bottom-up
vectorizer that uses the API.
2024-09-17 17:06:29 -07:00
Jorge Gorbe Moya
aa2e6b8734
Revert "[SandboxVec] Add barebones Region class." (#109058)
Reverts llvm/llvm-project#108899

It broke the llvm-clang-x86_64-win-fast buildbot.
2024-09-17 15:47:30 -07:00
Jorge Gorbe Moya
3aecf41c2b
[SandboxVec] Add barebones Region class. (#108899)
A region identifies a set of vector instructions generated by
vectorization passes. The vectorizer can then run a series of
RegionPasses on the region, evaluate the cost, and commit/reject the
transforms on a region-by-region basis, instead of an entire basic
block.

This is heavily based on @vporpo's prototype. In particular, the doc
comment for the Region class is all his. The rest of this commit is
mostly boilerplate around a SetVector: getters, iterators, and some
debug helpers.
2024-09-17 15:40:24 -07:00
vporpo
b846638548
[SandboxIR] Implement BBIterator::getNodeParent() (#109039)
This patch implements sandboxir::BasicBlock::iterator::getNodeParent()
which returns the parent basic block of an iterator.
2024-09-17 15:20:09 -07:00
Kazu Hirata
9c9a627190
[ThinLTO] Add lookup to ImportListsTy (#109036)
This is primarily to unblock Rust, which could potentially use
ImportListsTy::operator[] on a module that's not in ListsImpl and
cause concurrency problems.

This patch fixes a regression in the sense that it restores
ImportListsTy::lookup, which was available when ImportListsTy was just
a plain DenseMap.
2024-09-17 15:15:49 -07:00
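The difference between `operator[]` and `lookup` that the ThinLTO commit above relies on can be illustrated with a standard container. This is a sketch using `std::unordered_map` as an assumed stand-in for the real map type; `ImportMap` and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

using ImportMap = std::unordered_map<std::string, int>;

// Read-only lookup: returns the mapped value, or a default-constructed value
// when the key is absent. It never inserts, so the map is not mutated.
inline int lookup(const ImportMap &Map, const std::string &Key) {
  auto It = Map.find(Key);
  return It == Map.end() ? int() : It->second;
}

// Counts how many entries a subscript access adds to a copy of the map.
// For a missing key, operator[] inserts a default-constructed entry.
inline std::size_t entriesAddedBySubscript(ImportMap Map,
                                           const std::string &Key) {
  std::size_t Before = Map.size();
  (void)Map[Key];
  return Map.size() - Before;
}

// Usage sketch: a missing key leaves the map untouched under lookup().
inline void lookupLeavesMapUntouched() {
  ImportMap Map{{"a", 1}};
  assert(lookup(Map, "b") == 0 && Map.size() == 1);
}
```

Because the subscript operator can insert, calling it from several threads on a shared map is a write; a non-inserting lookup avoids that mutation.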
Ellis Hoag
9e709dcb70
[NFC][Glob] Escape backslash to fix doxygen rendering (#109055)
The docs for Glob weren't rendered correctly, I believe because the `\`
was not properly escaped. I haven't built these docs locally, so I'll
follow up to see if this is fixed after it lands.

https://llvm.org/doxygen/classllvm_1_1GlobPattern.html
2024-09-17 16:34:57 -05:00
Craig Topper
783d323da3
[VirtRegMap] Replace a single value enum with a static constexpr member variable. NFC (#109010)
Change the constant to INT_MAX instead of our own large number. Any
value larger than a valid frame index should work.

I'm a bit puzzled why it was using a shift of 30. A long time ago when
it was first created, the value was INT_MAX. Then it was changed in
e2b77d57c0c13 to (~0 >> 1) which I guess was trying to be INT_MAX
without using the constant. But ~0 is an `int` so that produced -1.

I'm not sure what the 'l' suffix was for. Unless that was an attempt to
avoid undefined behavior had the shift been 31 instead of 30. But 'long'
is 32 bits on some targets so that wouldn't have worked for all
platforms.

Using INT_MAX is straightforward and avoids any mysteries.
2024-09-17 13:53:50 -07:00
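The `(~0 >> 1)` observation in the commit above is easy to check in isolation. The sketch below uses hypothetical function names and assumes a mainstream compiler, where right-shifting a negative `int` is an arithmetic shift (this behavior is guaranteed from C++20 on).

```cpp
#include <cassert>
#include <climits>

// ~0 has type int and value -1. An arithmetic right shift copies the sign
// bit back in, so the result is still -1 rather than INT_MAX.
inline int shiftSignedAllOnes() { return ~0 >> 1; }

// Starting from the unsigned all-ones pattern, the shift clears the top bit,
// which yields the same value as INT_MAX.
inline int intMaxViaUnsignedShift() {
  unsigned V = ~0u >> 1;
  assert(V == static_cast<unsigned>(INT_MAX));
  return static_cast<int>(V);
}
```

Spelling the constant as `INT_MAX` sidesteps the signedness question entirely, which matches the commit's closing remark.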
vporpo
9a312d47f3
[SandboxIR] Implement GlobalAlias (#109019)
This patch implements sandboxir::GlobalAlias, mirroring
llvm::GlobalAlias.
2024-09-17 13:45:36 -07:00
vporpo
318d2f5e5d
[SandboxVec][DAG] Boilerplate (#108862)
This patch adds a very basic implementation of the Dependency Graph to
be used by the vectorizer.
2024-09-17 12:03:52 -07:00
vporpo
b9bf831e8d
[SandboxIR] Implement GlobalVariable (#108642)
This patch implements sandboxir::GlobalVariable mirroring
llvm::GlobalVariable.
2024-09-17 10:26:34 -07:00
Farzon Lotfi
0f97b4824a
[Scalarizer][DirectX] Add support for scalarization of Target intrinsics (#108776)
Since we are using the Scalarizer pass in the backend we needed a way to
allow this pass to operate on Target intrinsics.
We achieved this by adding `TargetTransformInfo` to the Scalarizer
pass. This allowed us to call a function available to the DirectX
backend to know if an intrinsic is a target intrinsic that should be
scalarized.
2024-09-17 11:35:42 -04:00
Craig Topper
78f7aae895
[VirtRegMap] Remove unused MAX_STACK_SLOT. NFC (#108781)
I think this has been unused since
92255f27f1c1884585cbcb3fcbd72bd4b0b533f7 in 2011.
2024-09-17 08:34:49 -07:00
Michael Maitland
ee2add0683
[GISEL] Fix bugs and clarify spec of G_EXTRACT_SUBVECTOR (#108848)
The implementation was missing the fact that `G_EXTRACT_SUBVECTOR`
destination and source vector can be different types.

Also fix a bug in the MIR builder for `G_EXTRACT_SUBVECTOR` to generate
the correct opcode.

Clarify the G_EXTRACT_SUBVECTOR specification.
2024-09-17 10:08:39 -04:00
Stanislav Mekhanoshin
ce73407015
Fix MachineInstr::uses() doc. NFC. (#108950)
Uses was documented as register uses, which is not true.
2024-09-17 03:51:45 -07:00
Csanád Hajdú
bc8a5d104c
[Patchpoint] Add immarg attributes to patchpoint arguments (#97276) 2024-09-17 14:00:24 +04:00
Thorsten Schütt
acfa294b5e
[GlobalIsel] Canonicalize G_FCMP (#108891)
As a side-effect, we start constant folding fcmps.
2024-09-17 09:42:04 +02:00
Farzon Lotfi
8ee685e601
[NFC][DirectX] fix intrinsics that need IntrNoMem and test typo (#108852)
In the process of adding scalarization support for DirectX target
intrinsics I found that intrinsics that weren't marked with `IntrNoMem`
did not get removed by
`RecursivelyDeleteTriviallyDeadInstructionsPermissive`. So this change
is to make it more clear that our intrinsics don't have side effects.

I only added `IntrNoMem` to the intrinsics in `IntrinsicsDirectX.td` I
was involved with. There are potentially a few other cases that might
warrant this attribute, but will need input on the others.
2024-09-16 14:19:29 -04:00
David Green
960c975acd
[AArch64] Expand scmp/ucmp vector operations with sub (#108830)
Unlike scalar, where AArch64 prefers expanding scmp/ucmp with select,
under Neon we can use the arithmetic expansion to generate fewer
instructions. Notably it also prevents the scalarization of vselect
during vector-legalization.
2024-09-16 18:44:52 +01:00
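The two expansion styles mentioned in the scmp/ucmp commit above can be compared in plain C++. This is a sketch of the general idea, not the AArch64 lowering; all names are hypothetical, and the lane version models vector compares as all-ones (-1) or all-zeros masks.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Select-style expansion: two compares feeding nested selects.
inline int32_t scmpSelect(int32_t A, int32_t B) {
  return A < B ? -1 : (A > B ? 1 : 0);
}

// Arithmetic expansion: the difference of two comparison results.
inline int32_t scmpSub(int32_t A, int32_t B) { return (A > B) - (A < B); }

using V4 = std::array<int32_t, 4>;

// Lane-wise variant: each compare yields a -1 or 0 mask, and subtracting
// the "greater" mask from the "less" mask gives -1, 0, or 1 per lane.
inline V4 scmpLanes(const V4 &A, const V4 &B) {
  V4 R{};
  for (std::size_t I = 0; I < R.size(); ++I) {
    int32_t Lt = A[I] < B[I] ? -1 : 0;
    int32_t Gt = A[I] > B[I] ? -1 : 0;
    R[I] = Lt - Gt;
    assert(R[I] == scmpSub(A[I], B[I]) && "both expansions must agree");
  }
  return R;
}
```

In the mask form no select is needed at all, which is consistent with the commit's note that the arithmetic expansion avoids scalarizing `vselect`.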
nebulark
f5ba3e1fa6
[CodeView] Flatten cmd args in frontend for LF_BUILDINFO (#106369) 2024-09-16 19:29:42 +02:00
Thorsten Schütt
5c348f692a
[GlobalIsel] Canonicalize G_ICMP (#108755)
As a side-effect, we start constant folding icmps.

Split out from https://github.com/llvm/llvm-project/pull/105991.
2024-09-16 19:25:34 +02:00
Sergio Afonso
e0e93c3f76
[Frontend][OpenMP] Follow compound construct clause restrictions (#107853)
This patch removes from the list of allowed clauses for a handful of
compound constructs those that are specifically disallowed by the OpenMP
spec. In particular, the following restrictions are followed:
- (regarding combined constructs) If _directive-name-A_ is `target`, the
`copyin` clause must not be specified.
- (regarding composite constructs) If _directive-name-A_ is
`distribute`, the `ordered` clause must not be specified.

These restrictions are listed in the OpenMP Specification version 5.2,
sections 17.4 and 17.5. Since it's a similar case as PR #90754, I'm
adding people involved in that decision as reviewers here.
2024-09-16 15:02:11 +01:00
David Green
feac761f37
[GlobalISel][AArch64] Add G_FPTOSI_SAT/G_FPTOUI_SAT (#96297)
This is an implementation of the saturating fp to int conversions for
GlobalISel. On AArch64 the conversion instructions work this way,
producing saturating results. LegalizerHelper::lowerFPTOINT_SAT is
ported from SDAG.

AArch64 has a lot of existing tests for fptosi_sat, covering a wide
range of types. I have tried to make most of them work all at once, but
a few fall back due to other missing features such as f128 handling for
min/max.
2024-09-16 10:33:59 +01:00
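The saturating behavior the commit above refers to can be sketched for one type pair. This follows the documented semantics of the saturating fp-to-int conversions (NaN maps to zero, out-of-range values clamp); it is not the code of `LegalizerHelper::lowerFPTOINT_SAT`, and `fpToSI32Sat` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Saturating conversion from double to a signed 32-bit integer.
inline int32_t fpToSI32Sat(double X) {
  constexpr int32_t Min = std::numeric_limits<int32_t>::min();
  constexpr int32_t Max = std::numeric_limits<int32_t>::max();
  if (std::isnan(X))
    return 0; // NaN converts to zero
  if (X <= static_cast<double>(Min))
    return Min; // clamp at the low end, including -inf
  if (X >= static_cast<double>(Max))
    return Max; // clamp at the high end, including +inf
  assert(X > Min && X < Max && "value is in range by now");
  return static_cast<int32_t>(X); // in range: truncate toward zero
}
```

A `double` source is used here because both 32-bit bounds are exactly representable in it, which keeps the clamp comparisons exact.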
Andrea Di Biagio
6784202b6b
[MCA][ResourceManager] Fix a bug in the instruction issue logic. (#108386)
Before this patch, the pipeline selection logic in
ResourceManager::issueInstruction() didn't know how to correctly handle
instructions which consume multiple partially overlapping resource
groups. In some cases (like the test case from #108157), the inability
to correctly allocate resources on instruction issue was leading to
crashes.

The presence of multiple partially overlapping groups complicates the
selection process by introducing extra constraints. For those cases, the
issue logic now prioritizes groups which are more constrained than
others.

Fixes #108157
2024-09-16 09:48:42 +01:00
Nikita Popov
b7e51b4f13
[IPSCCP] Infer attributes on arguments (#107114)
During inter-procedural SCCP, also infer attributes on arguments, not
just return values. This allows other non-interprocedural passes to make
use of the information later.
2024-09-16 10:23:41 +02:00
Nikita Popov
dfa54298ff
[InitUndef] Enable the InitUndef pass on non-AMDGPU targets (#108353)
The InitUndef pass works around a register allocation issue, where undef
operands can be allocated to the same register as early-clobber result
operands. This may lead to ISA constraint violations, where certain
input and output registers are not allowed to overlap.

Originally this pass was implemented for RISCV, and then extended to ARM
in #77770. I've since removed the target-specific parts of the pass in
#106744 and #107885. This PR reduces the pass to use a single
requiresDisjointEarlyClobberAndUndef() target hook and enables it by
default. The hook is disabled for AMDGPU, because overlapping
early-clobber and undef operands are known to be safe for that target,
and we get significant codegen diffs otherwise.

The motivating case is the one in arm64-ldxr-stxr.ll, where we were
previously incorrectly allocating a stxp input and output to the same
register.
2024-09-16 09:48:25 +02:00
Antonio Frighetto
2ae968a0d9
[Instrumentation] Move out to Utils (NFC) (#108532)
Utility functions have been moved out to Utils. Minor opportunity to
drop the header where not needed.
2024-09-15 21:07:40 -07:00
Craig Topper
46f7cb3e84 [CodeGen] Use Register::id() instead of implicit cast to unsigned in Register.h. NFC 2024-09-15 19:01:38 -07:00
Craig Topper
a5b63b5cb7
[VirtRegMap] Store MCRegister in Virt2PhysMap. (#108775)
Remove NO_PHYS_REG in favor of MCRegister() and converting MCRegister to
bool.
2024-09-15 14:04:59 -07:00
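The pattern in the commit above, replacing a sentinel constant with a default-constructed wrapper that converts to bool, can be sketched as follows. `PhysReg` and `hasAssignment` are hypothetical miniatures; the real `MCRegister` has a richer interface, and reserving id 0 for "no register" is this sketch's own choice.

```cpp
#include <cassert>

// Hypothetical miniature of a register wrapper.
class PhysReg {
  unsigned Id = 0; // 0 is reserved to mean "no register" in this sketch

public:
  PhysReg() = default;
  explicit PhysReg(unsigned Id) : Id(Id) {
    assert(Id != 0 && "use PhysReg() to express 'no register'");
  }
  unsigned id() const { return Id; }
  // A default-constructed value tests false, so no named sentinel is needed.
  explicit operator bool() const { return Id != 0; }
};

// Reads more directly than comparing against a NO_PHYS_REG-style constant.
inline bool hasAssignment(PhysReg R) { return static_cast<bool>(R); }
```

The `explicit` conversion still permits `if (R)` while preventing a register from silently decaying to an integer in arithmetic.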
Kazu Hirata
00e4575c67
[Instrumentation] Remove extraneous std::move (NFC) (#108764) 2024-09-15 10:35:31 -07:00
Craig Topper
e523f4e2c3 [VirtRegMap] Store Register in Virt2SplitMap. NFC 2024-09-15 10:30:38 -07:00
Craig Topper
23953798f3 [VirtRegMap] Remove unnecessary calls to Register::id() accessing IndexMaps.
VirtReg2IndexFunctor already takes a Register.
2024-09-15 09:59:34 -07:00
Craig Topper
2f48178825 [VirtRegMap] Use Register for Virt2ShapeMap key. NFC 2024-09-15 09:59:34 -07:00
Craig Topper
508e734e33 [CodeGen] Use DenseMapInfo<Register> to implement DenseMapInfo<TargetInstrInfo::RegSubRegPair>. NFC
Instead of casting Register to unsigned to use DenseMapInfo<unsigned>.
2024-09-15 09:59:34 -07:00
Rahul Joshi
3ae71d154e
[LLVM][TableGen] Change CodeGenSchedule to use const RecordKeeper (#108617)
Change CodeGenSchedule to use const RecordKeeper.

This is part of an effort to improve const correctness in TableGen
backends:


https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
2024-09-15 04:55:29 -07:00
Robert Dazi
8837898b8d
[DAGCombine] Count leading ones: refine post DAG/Type Legalisation if promotion (#102877)
This PR is related to #99591. In this PR, instead of modifying how the
legalisation occurs depending on surrounding instructions, we refine
after legalisation.

This PR has two parts:

* `SDPatternMatch/MatchContext`: Slightly modify the code that matches
Operands (used by `m_Node(...)`) and Unary/Binary/Ternary Patterns to
make it compatible with `VPMatchContext`, instead of supporting only
`m_Opc`. Some tests were added to ensure no regressions.
* `DAGCombiner`: Add a `foldSubCtlzNot` which detects and rewrites the
patterns using a matching context.

Remaining Tasks:

- [ ] GlobalISel
- [ ] Currently the pattern matching will occur even before
legalisation. Should I restrict it to specific stages instead ?
- [ ] Style: Add a visitVP_SUB ?? Move `foldSubCtlzNot` in another
location for style consistency purpose ?

@topperc

---------

Co-authored-by: v01dxyz <v01dxyz@v01d.xyz>
2024-09-15 15:48:36 +04:00
Craig Topper
a9e05a36db [ARM] Use MCRegister for ARMTargetStreamer::emitRegSave. NFC 2024-09-14 17:25:56 -07:00
Kazu Hirata
390b82dd4c
[ADT] Remove DenseMap::{getOrInsertDefault,FindAndConstruct} (#108678)
These functions have been deprecated since:

  commit 59a3b4156836c3ea8589d7a39e7b4712fc8698ec
  Author: Kazu Hirata <kazu@google.com>
  Date:   Tue Sep 3 08:19:45 2024 -0700

  commit 7732d8e51819416b9d28b1815bdf81d0e0642b04
  Author: Kazu Hirata <kazu@google.com>
  Date:   Wed Sep 4 06:51:30 2024 -0700
2024-09-13 23:34:23 -07:00