615 Commits

Author SHA1 Message Date
antangelo
b9ac390cc7
[GISel] Add generic implementation for @llvm.expect.with.probability when optimizations are disabled (#117835)
Handle @llvm.expect.with.probability in GlobalISel in the same way
@llvm.expect is handled, passing the value through as-is. This can be
encountered if the intrinsic is used without optimizations, which would
otherwise transform it out.

Fixes #115411 for GlobalISel
2024-11-29 22:30:13 -05:00
Kazu Hirata
4048c64306
[llvm] Remove redundant control flow statements (NFC) (#115831)
Identified with readability-redundant-control-flow.
2024-11-12 10:09:42 -08:00
Thorsten Schütt
e399322d5e
[GlobalISel] Import llvm.stepvector (#115721) 2024-11-11 21:35:22 +01:00
David Green
a4e507df7a
[AArch64][GlobalISel] Do not create LIFETIME instructions in functions. (#115669)
For the same reason that we do not translate lifetime markers in a -O0,
we should not translate them for optnone functions too.
2024-11-11 09:27:40 +00:00
Kazu Hirata
b83399eab6
[GlobalISel] Remove unused includes (NFC) (#115429)
Identified with misc-include-cleaner.
2024-11-08 22:28:47 -08:00
Thorsten Schütt
9061e6e58a
[GlobalISel][AArch64] Legalize G_EXTRACT_VECTOR_ELT for SVE (#115161)
AArch64InstrGISel.td defines:
def : GINodeEquiv<G_EXTRACT_VECTOR_ELT, vector_extract>;

There are many patterns for SVE. Let's exploit that fact.
2024-11-08 07:58:17 +01:00
Thorsten Schütt
b3bb6f18bb
[GlobalISel] Import samesign flag (#114267)
Credits: https://github.com/llvm/llvm-project/pull/111419

Fixes icmp-flags.mir

First attempt: https://github.com/llvm/llvm-project/pull/113090

Revert: https://github.com/llvm/llvm-project/pull/114256
2024-10-30 19:56:25 +01:00
Thorsten Schütt
4b028773b2
Revert "[GlobalISel] Import samesign flag" (#114256)
Reverts llvm/llvm-project#113090
2024-10-30 17:03:17 +01:00
Thorsten Schütt
72b115301d
[GlobalISel] Import samesign flag (#113090)
Credits: https://github.com/llvm/llvm-project/pull/111419
2024-10-30 16:34:01 +01:00
Benjamin Maxwell
c3260c65e8
[IR] Add llvm.sincos intrinsic (#109825)
This adds the `llvm.sincos` intrinsic, legalization, and lowering.

The `llvm.sincos` intrinsic takes a floating-point value and returns
both the sine and cosine (as a struct).

```
declare { float, float }          @llvm.sincos.f32(float  %Val)
declare { double, double }        @llvm.sincos.f64(double %Val)
declare { x86_fp80, x86_fp80 }    @llvm.sincos.f80(x86_fp80  %Val)
declare { fp128, fp128 }          @llvm.sincos.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 }  @llvm.sincos.ppcf128(ppc_fp128  %Val)
declare { <4 x float>, <4 x float> } @llvm.sincos.v4f32(<4 x float>  %Val)
```

The lowering is built on top of the existing FSINCOS ISD node, with
additional type legalization to allow for f16, f128, and vector values.
2024-10-29 10:52:20 +00:00
Keith Packard
44b020a381
[PowerPC][ISelLowering] Support -mstack-protector-guard=tls (#110928)
Add support for using a thread-local variable with a specified offset
for holding the stack guard canary value. This supports both 32- and 64-
bit PowerPC targets.

This mirrors changes from #108942 but targeting PowerPC instead of
RISCV. Because both of these PRs modify the same driver functions, this
series is stack on top of the RISC-V one.

---------

Signed-off-by: Keith Packard <keithp@keithp.com>
2024-10-17 19:06:47 -07:00
duk
464a7ee79e
[CodeGen] Generalize trap emission after SP check fail (#109744)
Generalize and improve some target-specific code that emits traps after
stack protector failure in SelectionDAG & GlobalIsel.
2024-10-12 20:01:22 -04:00
Stephen Tozer
d826b0c90f
[LLVM] Add HasFakeUses to MachineFunction (#110097)
Following the addition of the llvm.fake.use intrinsic and corresponding
MIR instruction, two further changes are planned: to add an
-fextend-lifetimes flag to Clang that emits these intrinsics, and to
have -Og enable this flag by default. Currently, some logic for handling
fake uses is gated by the optdebug attribute, which is intended to be
switched on by -fextend-lifetimes (and by extension -Og later on).
However, the decision was made that a general optdebug attribute should
be incompatible with other opt_ attributes (e.g. optsize, optnone),
since they all express different intents for how to optimize the
program. We would still like to allow -fextend-lifetimes with optsize
however (i.e. -Os -fextend-lifetimes should be legal), since it may be a
useful configuration and there is no technical reason to not allow it.

This patch resolves this by tracking MachineFunctions that have fake
uses, allowing us to run passes that interact with them and skip passes
that clash with them.
2024-10-04 13:13:30 +01:00
Thorsten Schütt
53943de73a
[GlobalISel] Import extract/insert subvector (#110287)
Test: AArch64/GlobalISel/irtranslator-subvector.ll

Reference:

https://llvm.org/docs/LangRef.html#llvm-vector-extract-intrinsic 
https://llvm.org/docs/LangRef.html#llvm-vector-insert-intrinsic
2024-09-30 22:12:06 +02:00
Tex Riddell
139688a699
[SPIRV] Add atan2 function lowering (p2) (#110037)
This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

- Add generic opcode for atan2
- Add SPIRV lowering for atan2

Part 2 for Implement the atan2 HLSL Function #70096.
2024-09-26 15:00:59 -07:00
Jay Foad
e03f427196
[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133)
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor
could be deprecated or removed.
2024-09-19 16:16:38 +01:00
Craig Topper
292ee93a87
[CodeGen] Use Register in SwitchLoweringUtils. NFC (#109092)
Use an empty Register() instead of -1U.
2024-09-18 09:43:21 -07:00
David Green
feac761f37
[GlobalISel][AArch64] Add G_FPTOSI_SAT/G_FPTOUI_SAT (#96297)
This is an implementation of the saturating fp to int conversions for
GlobalISel. On AArch64 the converstion instrctions work this way,
producing saturating results. LegalizerHelper::lowerFPTOINT_SAT is
ported from SDAG.

AArch64 has a lot of existing tests for fptosi_sat, covering a wide
range of types. I have tried to make most of them work all at once, but
a few fall back due to other missing features such as f128 handling for
min/max.
2024-09-16 10:33:59 +01:00
Craig Topper
367c145e5f
[IRTranslator][RISCV] Support scalable vector zeroinitializer. (#108666) 2024-09-14 15:46:18 -07:00
Craig Topper
947374c393
[IRTranslator] Simplify fixed vector ConstantAggregateZero handling. NFC (#108667)
We don't need to loop through the elements, they're all the same zero.
We can get the first element and create a splat build_vector.
2024-09-13 22:02:29 -07:00
Craig Topper
e6e857cdf9
[GISel] Use Function::getFunctionType() instead of getType() in some remarks. (#107651)
getType() on a Function is always 'ptr'. We should use getFunctionType()
so we get the function signature.
2024-09-06 19:59:44 -07:00
anjenner
4af249fe6e
Add usub_cond and usub_sat operations to atomicrmw (#105568)
These both perform conditional subtraction, returning the minuend and
zero respectively, if the difference is negative.
2024-09-06 16:19:20 +01:00
Craig Topper
d5c292d8ef
[GISel][RISCV] Correctly handle scalable vector shuffles of pointer vectors in IRTranslator. (#106580) 2024-08-29 12:35:50 -07:00
Stephen Tozer
3d08ade7bd
[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149)
This patch is part of a set of patches that add an `-fextend-lifetimes`
flag to clang, which extends the lifetimes of local variables and
parameters for improved debuggability. In addition to that flag, the
patch series adds a pragma to selectively disable `-fextend-lifetimes`,
and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes`
for this pointers only. All changes and tests in these patches were
written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer)
has handled review and merging. The extend lifetimes flag is intended to
eventually be set on by `-Og`, as discussed in the RFC
here:

https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850

This patch implements a new intrinsic instruction in LLVM,
`llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand
and has no effect other than "using" its operand, to ensure that its
operand remains live until after the fake use. This patch does not emit
fake uses anywhere; the next patch in this sequence causes them to be
emitted from the clang frontend, such that for each variable (or this) a
fake.use operand is inserted at the end of that variable's scope, using
that variable's value. This patch covers everything post-frontend, which
is largely just the basic plumbing for a new intrinsic/instruction,
along with a few steps to preserve the fake uses through optimizations
(such as moving them ahead of a tail call or translating them through
SROA).

Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>
2024-08-29 17:53:32 +01:00
David Green
05d17a1c70
[GlobalISel] Bail out early for big-endian (#103310)
If we continue through the function we can currently hit crashes. We can
bail out early and fall back to SDAG.

Fixes #103032
2024-08-19 18:50:47 +01:00
Kazu Hirata
7aa4726f44
[CodeGen] Use range-based for loops (NFC) (#104536) 2024-08-16 08:35:43 -07:00
Tobias Stadler
5c3a0fa7e5
[GlobalISel] IRTranslator: Use RAIIMFObsDelInstaller (#102379)
Similar to #102156.

runOnMachineFunction() installs the Observer correctly manually.

finishPendingPhis() doesn't install into the MF, but this currently
can't cause issues because DILocationVerifier doesn't care about changed
instructions.

Switch to RAIIMFObsDelInstaller in both places for consistency.
2024-08-13 15:10:59 +02:00
Alexis Engelke
37e75cdf9f
[CodeGen] Use BasicBlock numbers to map to MBBs (#101883)
Now that basic blocks have numbers, we can replace the BB-to-MBB maps
and the visited set during ISel with vectors for faster lookup.
2024-08-06 10:22:31 +02:00
Joseph Huber
615b7eeaa9 Reapply "[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)"
This reverts commit 740161a9b98c9920dedf1852b5f1c94d0a683af5.

I moved the `ISD` dependencies into the CodeGen portion of the handling,
it's a little awkward but it's the easiest solution I can think of for
now.
2024-07-20 09:29:31 -05:00
NAKAMURA Takumi
740161a9b9 Revert "[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)"
This reverts commit c05126bdfc3b02daa37d11056fa43db1a6cdef69.
(llvmorg-19-init-17714-gc05126bdfc3b)
See #99610
2024-07-20 12:36:57 +09:00
Matt Arsenault
0f0cfcff2c
CodeGen: Avoid some references to MachineFunction's getMMI (#99652)
MachineFunction's probably should not include a backreference to
the owning MachineModuleInfo. Most of these references were used
just to query the MCContext, which MachineFunction already directly
stores. Other contexts are using it to query the LLVMContext, which
can already be accessed through the IR function reference.
2024-07-19 22:09:05 +04:00
Thorsten Schütt
f554dd7e76
[GlobalIsel] import G_SCMP and G_UCMP (#99518)
See https://github.com/llvm/llvm-project/pull/98894
2024-07-19 06:16:12 +02:00
Lawrence Benson
177ce1900f
[LLVM] Add llvm.experimental.vector.compress intrinsic (#92289)
This PR adds a new vector intrinsic `@llvm.experimental.vector.compress`
to "compress" data within a vector based on a selection mask, i.e., it
moves all selected values (i.e., where `mask[i] == 1`) to consecutive
lanes in the result vector. A `passthru` vector can be provided, from
which remaining lanes are filled.

The main reason for this is that the existing
`@llvm.masked.compressstore` has very strong constraints in that it can
only write values that were selected, resulting in guard branches for
all targets except AVX-512 (and even there the AMD implementation is
_very_ slow). More instruction sets support "compress" logic, but only
within registers. So to store the values, an additional store is needed.
But this combination is likely significantly faster on many target as it
avoids branches.

In follow up PRs, my plan is to add target-specific lowerings for x86,
SVE, and possibly RISCV. I also want to combine this with a store
instruction, as this is probably a common case and we can avoid some
memory writes in that case.

See [discussion in
forum](https://discourse.llvm.org/t/new-intrinsic-for-masked-vector-compress-without-store/78663)
for initial discussion on the design.
2024-07-17 14:24:24 +02:00
Joseph Huber
c05126bdfc
[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512)
Summary:
The LTO pass and LLD linker have logic in them that forces extraction
and prevent internalization of needed runtime calls. However, these
currently take all RTLibcalls into account, even if the target does not
support them. The target opts-out of a libcall if it sets its name to
nullptr. This patch pulls this logic out into a class in the header so
that LTO / lld can use it to determine if a symbol actually needs to be
kept.

This is important for targets like AMDGPU that want to be able to use
`lld` to perform the final link step, but does not want the overhead of
uncalled functions. (This adds like a second to the link time trivially)
2024-07-16 06:22:09 -05:00
Ahmed Bougacha
d286efeb1d
[AArch64][PAC] Lower direct authenticated calls to ptrauth constants. (#97664)
This tries to turn indirect ptrauth calls into direct calls, using
`ConstantPtrAuth::isKnownEquivalent` to compare the `ConstantPtrAuth`
target with the ptrauth call bundle.

This should be straightforward, other than the somewhat awkward GISel
handling, which has a handshake between CallLowering and IRTranslator to
elide the ptrauth when possible.
2024-07-15 16:14:43 -07:00
Kazu Hirata
75bc20ff89
[llvm] Remove redundant calls to std::unique_ptr<T>::get (NFC) (#97914) 2024-07-07 08:23:41 +09:00
Igor Kudrin
23db37c51c
[CodeGen] Do not emit TRAP for unreachable after @llvm.trap (#94570)
With `--trap-unreachable`, `clang` can emit double `TRAP` instructions
for code that contains a call to `__builtin_trap()`:

```
> cat test.c
void test() { __builtin_trap(); }
> clang test.c --target=x86_64 -mllvm --trap-unreachable -O1 -S -o -
...
test:
...
  ud2
  ud2
...
```

`SimplifyCFGPass` inserts `unreachable` after a call to a `noreturn`
function, and later this instruction causes `TRAP/G_TRAP` to be emitted
in `SelectionDAGBuilder::visitUnreachable()` or
`IRTranslator::translateUnreachable()` if
`TargetOptions.TrapUnreachable` is set.

The patch checks the instruction before `unreachable` and avoids
inserting an additional trap.
2024-07-02 15:36:02 -07:00
Youngsuk Kim
a95c85fba5
[llvm][CodeGen] Avoid 'raw_string_ostream::str' (NFC) (#97318)
Since `raw_string_ostream` doesn't own the string buffer, it is
desirable (in terms of memory safety) for users to directly reference
the string buffer rather than use `raw_string_ostream::str()`.

Work towards TODO comment to remove `raw_string_ostream::str()`.
2024-07-01 21:52:37 -04:00
Nikita Popov
9df71d7673
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
2024-06-28 08:36:49 +02:00
Daniil Kovalev
1488fb4153
[PAC][AArch64] Lower ptrauth constants in code (#96879)
This re-applies #94241 after fixing buildbot failure, see
https://lab.llvm.org/buildbot/#/builders/51/builds/570

According to standard, `constexpr` variables and `const` variables
initialized with constant expressions can be used in lambdas w/o
capturing - see https://en.cppreference.com/w/cpp/language/lambda.
However, MSVC used on buildkite seems to ignore that rule and does not
allow using such uncaptured variables in lambdas: we have "error C3493:
'Mask16' cannot be implicitly captured because no default capture mode
has been specified" - see
https://buildkite.com/llvm-project/github-pull-requests/builds/73238

Explicitly capturing such a variable, however, makes buildbot fail with
"error: lambda capture 'Mask16' is not required to be captured for this
use [-Werror,-Wunused-lambda-capture]" - see
https://lab.llvm.org/buildbot/#/builders/51/builds/570.

Fix both cases by using `0xffff` value directly instead of giving a name
to it.

Original PR description below.

Depends on #94240.

Define the following pseudos for lowering ptrauth constants in code:

- non-`extern_weak`:
  - no GOT load needed: `MOVaddrPAC` - similar to `MOVaddr`, with added
PAC;
  - GOT load needed: `LOADgotPAC` - similar to `LOADgot`, with added PAC;
- `extern_weak`: `LOADauthptrstatic` - similar to `LOADgot`, but use a
special stub slot named `sym$auth_ptr$key$disc` filled by dynamic linker
during relocation resolving instead of a GOT slot.

---------

Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
2024-06-28 07:29:38 +03:00
Daniil Kovalev
99251f5a11
Revert "[PAC][AArch64] Lower ptrauth constants in code (#94241)" (#96865)
This reverts #94241.

See buildbot failure
https://lab.llvm.org/buildbot/#/builders/51/builds/570
2024-06-27 11:10:38 +03:00
Daniil Kovalev
b5cc19e572
[PAC][AArch64] Lower ptrauth constants in code (#94241)
Depends on #94240.

Define the following pseudos for lowering ptrauth constants in code:

- non-`extern_weak`:
  - no GOT load needed: `MOVaddrPAC` - similar to `MOVaddr`, with added
    PAC;
  - GOT load needed: `LOADgotPAC` - similar to `LOADgot`, with added PAC;
- `extern_weak`: `LOADauthptrstatic` - similar to `LOADgot`, but use a
  special stub slot named `sym$auth_ptr$key$disc` filled by dynamic linker
  during relocation resolving instead of a GOT slot.

---------

Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
2024-06-27 10:02:17 +03:00
Serge Pavlov
f9795f34a6
[GlobalISel] Add build methods for FP environment intrinsics (#96607)
This change adds methods like buildGetFPEnv and similar for opcodes that
represent manipulation on floating-point state.
2024-06-25 16:13:52 +07:00
Nikita Popov
f2f18459d4 Revert "Intrinsic: introduce minimumnum and maximumnum (#93841)"
As far as I can tell, this pull request was not approved, and
did not go through an RFC on discourse.

This reverts commit 89881480030f48f83af668175b70a9798edca2fb.
This reverts commit 225d8fc8eb24fb797154c1ef6dcbe5ba033142da.
2024-06-21 08:34:04 +02:00
YunQiang Su
8988148003
Intrinsic: introduce minimumnum and maximumnum (#93841)
Currently, on different platform, the behaivor of llvm.minnum is
different if one operand is sNaN:

When we compare sNaN vs NUM:

ARM/AArch64/PowerPC: follow the IEEE754-2008's minNUM: return qNaN.
RISC-V/Hexagon follow the IEEE754-2019's minimumNumber: return NUM. X86:
Returns NUM but not same with IEEE754-2019's minimumNumber as
     +0.0 is not always greater than -0.0.
MIPS/LoongArch/Generic: return NUM.
LIBCALL: returns qNaN.

So, let's introduce llvm.minmumnum/llvm.maximumnum, which always follow
IEEE754-2019's minimumNumber/maximumNumber.

Half-fix: #93033
2024-06-21 11:53:08 +08:00
Farzon Lotfi
2ae6889d3f
[SPIRV] Add trig function lowering (#95973)
This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This is part 2 of 4 PRs. It sets the ground work for adding the
intrinsics.

Add SPIRV  Lower for `acos`, `asin`, `atan`, `cosh`, `sinh`, and `tanh`
https://github.com/llvm/llvm-project/issues/70079
https://github.com/llvm/llvm-project/issues/70080
https://github.com/llvm/llvm-project/issues/70081
https://github.com/llvm/llvm-project/issues/70083
https://github.com/llvm/llvm-project/issues/70084
https://github.com/llvm/llvm-project/issues/95966


There isn't any aarch64 change in this pr, but when you add a target
opcode it is visible in there validaiton tests.
2024-06-20 10:34:23 -04:00
Thorsten Schütt
b1f9440fa9
[GlobalIsel] Import GEP flags (#93850)
https://github.com/llvm/llvm-project/pull/90824
2024-06-14 20:56:43 +02:00
Nikita Popov
deab451e7a
[IR] Remove support for icmp and fcmp constant expressions (#93038)
Remove support for the icmp and fcmp constant expressions.

This is part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179

As usual, many of the updated tests will no longer test what they were
originally intended to -- this is hard to preserve when constant
expressions get removed, and in many cases just impossible as the
existence of a specific kind of constant expression was the cause of the
issue in the first place.
2024-06-04 08:31:03 +02:00
Ahmed Bougacha
cc548ec47c
[AArch64][PAC] Lower authenticated calls with ptrauth bundles. (#85736)
This adds codegen support for the "ptrauth" operand bundles, which can
be used to augment indirect calls with the equivalent of an
`@llvm.ptrauth.auth` intrinsic call on the call target (possibly
preceded by an `@llvm.ptrauth.blend` on the auth discriminator if
applicable.)

This allows the generation of combined authenticating calls
on AArch64 (in the BLRA* PAuth instructions), while avoiding
the raw just-authenticated function pointer from being
exposed to attackers.

This is done by threading a PtrAuthInfo descriptor through
the call lowering infrastructure, eventually selecting a BLRA
pseudo.  The pseudo encapsulates the safe discriminator
computation, which together with the real BLRA* call get emitted
in late pseudo expansion in AsmPrinter.

Note that this also applies to the other forms of indirect calls,
notably invokes, rvmarker, and tail calls.  Tail-calls in particular
bring some additional complexity, with the intersecting register
constraints of BTI and PAC discriminator computation.
However this doesn't currently support PAuth_LR tail-call variants.

This also adopts an x8+ allocation order for GPR64noip, matching
GPR64.
2024-05-31 14:08:10 -07:00
Him188
8bce40b1eb
[AArch64][GISel] Support SVE with 128-bit min-size for G_LOAD and G_STORE (#92130)
This patch adds basic support for scalable vector types in load & store
instructions for AArch64 with GISel.

Only scalable vector types with a 128-bit base size are supported, e.g.
`<vscale x 4 x i32>`, `<vscale x 16 x i8>`.

This patch adapted some ideas from a similar abandoned patch
[https://github.com/llvm/llvm-project/pull/72976](https://github.com/llvm/llvm-project/pull/72976).
2024-05-30 09:10:43 +01:00