2168 Commits

Author SHA1 Message Date
Ahmed Bougacha
3575d23ca8
[clang][CodeGen] Remove unused LValue::getAddress CGF arg. (#92465)
This is in effect a revert of f139ae3d93797, as we have since gained a
more sophisticated way of doing extra IRGen with the addition of
RawAddress in #86923.
2024-05-20 10:23:04 -07:00
Nathan Gauër
e08f1fda75
[clang][SPIR-V] Always add convergence intrinsics (#88918)
PR #80680 added bits in the codegen to lazily add convergence intrinsics
when required. This logic relied on the LoopStack. The issue is when
parsing the condition, the loopstack doesn't yet reflect the correct
values, as expected since we are not yet in the loop.

However, convergence tokens should sometimes already be available. The
solution which seemed the simplest is to greedily generate the tokens
when we generate SPIR-V.

Fixes #88144

---------

Signed-off-by: Nathan Gauër <brioche@google.com>
2024-05-14 17:00:40 +02:00
Brendan Dahl
8a3277acbc
[WebAssembly] Implement prototype f32.store_f16 instruction. (#91545)
Adds a builtin and intrinsic for the f32.store_f16 instruction.

The instruction stores an f32 value to memory as an f16. Specified at:

29a9b9462c/proposals/half-precision/Overview.md

Note: the current spec has f32.store_f16 as opcode 0xFD0121, but this is
incorrect and will be changed to 0xFC31 soon.
2024-05-09 15:38:13 -07:00
Farzon Lotfi
31b45a9d0d
[clang][hlsl] Add tan intrinsic part 1 (#90276)
This change is an implementation of #87367's investigation on supporting
IEEE math operations as intrinsics.
Which was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

If you want an overarching view of how this will all connect see:
https://github.com/llvm/llvm-project/pull/90088

Changes:
- `clang/docs/LanguageExtensions.rst` - Document the new elementwise tan
builtin.
-  `clang/include/clang/Basic/Builtins.td` - Implement the tan builtin.
- `clang/lib/CodeGen/CGBuiltin.cpp` - invoke the tan intrinsic on uses
of the builtin
- `clang/lib/Headers/hlsl/hlsl_intrinsics.h` - Associate the tan builtin
with the equivalent hlsl apis
- `clang/lib/Sema/SemaChecking.cpp` - Add generic sema checks as well as
HLSL specific sema checks to the tan builtin
-  `llvm/include/llvm/IR/Intrinsics.td` - Create the tan intrinsic
-  `llvm/docs/LangRef.rst` - Document the tan intrinsic
2024-05-07 22:54:15 -04:00
Brendan Dahl
1a2a1fbd7c
[WebAssembly] Implement prototype f32.load_f16 instruction. (#90906)
Adds a builtin and intrinsic for the f32.load_f16 instruction.

The instruction loads an f16 value from memory and puts it in an f32.
Specified at:

29a9b9462c/proposals/half-precision/Overview.md

Note: the current spec has f32.load_f16 as opcode 0xFD0120, but this is
incorrect and will be changed to 0xFC30 soon.
2024-05-07 11:33:10 -07:00
Karl-Johan Karlsson
cb015b9ec9
[clang][CodeGen] Propagate pragma set fast-math flags to floating point builtins (#90377)
This is a fix for issue #87758, where fast-math flags are not
propagated to all builtins.

It seems that pragmas with fast-math flags were only propagated to calls
of unary floating point builtins. This patch also propagates them to
binary and ternary floating point builtins.
2024-05-04 17:47:48 +02:00
Björn Pettersson
7298ae3b6d
[clang][CodeGen] Fix in codegen for __builtin_popcountg/ctzg/clzg (#90845)
Make sure that the result from the popcnt/ctlz/cttz intrinsics is
zero-extended to int, rather than sign-extended, when
expanding the __builtin_popcountg/__builtin_ctzg/__builtin_clzg
builtins.

An example would be
  unsigned _BitInt(1) x = ...;
  int y = __builtin_popcountg(x);
which previously was incorrectly expanded to
  %1 = call i1 @llvm.ctpop.i1(i1 %0)
  %cast = sext i1 %1 to i32

Since the input type is generic for those "g" versions of the builtins,
the intrinsic call may return a value for which the sign bit is set
(this typically happens for _BitInt of size 1 or 2). So we need to emit a
zext rather than a sext to avoid negative results.
2024-05-02 22:49:39 +02:00
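The sext/zext distinction fixed in 7298ae3b6d can be reproduced outside the compiler. The following is a minimal C++ sketch, not the Clang implementation: `low_bits`, `zero_extend`, and `sign_extend` are hypothetical helpers that model an N-bit intrinsic result (N assumed to be between 1 and 16 here) being widened to a 32-bit int.

```cpp
#include <cstdint>

// Keep only the low `bits` bits of `raw`, modelling an iN value (1 <= bits <= 16).
constexpr std::uint32_t low_bits(std::uint32_t raw, unsigned bits) {
  return raw & ((std::uint32_t{1} << bits) - 1u);
}

// Models `zext iN -> i32`: the upper bits are filled with zeros.
constexpr std::int32_t zero_extend(std::uint32_t raw, unsigned bits) {
  return static_cast<std::int32_t>(low_bits(raw, bits));
}

// Models `sext iN -> i32`: the top bit of the iN value is treated as a sign bit.
constexpr std::int32_t sign_extend(std::uint32_t raw, unsigned bits) {
  return ((low_bits(raw, bits) >> (bits - 1)) & 1u) != 0
             ? static_cast<std::int32_t>(low_bits(raw, bits)) -
                   (std::int32_t{1} << bits)
             : static_cast<std::int32_t>(low_bits(raw, bits));
}
```

For a 1-bit input holding 1, the population count is 1, and its i1 bit pattern is also 1. `sign_extend(1, 1)` turns that into -1, while `zero_extend(1, 1)` keeps the intended 1.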
zhijian lin
d4a25976df
Implement a subset of builtin_cpu_supports() features (#82809)
This PR implements a subset of the features of
`__builtin_cpu_supports()` for AIX, based on the information that the AIX
kernel runtime variable `_system_configuration` and the function `getsystemcfg()` from
/usr/include/sys/systemcfg.h can provide.

The following subset of features is supported in this PR:

"arch_3_00", "arch_3_1", "booke", "cellbe", "darn", "dfp", "dscr", "ebb", "efpsingle", "efpdouble", "fpu", "htm", "isel",
"mma", "mmu", "pa6t", "power4", "power5", "power5+", "power6x", "ppc32", "ppc601", "ppc64", "ppcle", "smt",
"spe", "tar", "true_le", "ucache", "vsx"
2024-05-02 14:59:33 -04:00
Lawrence Benson
bd07c22e53
[Clang] Add support for scalable vectors in __builtin_reduce_* functions (#87750)
Currently, many `__builtin_reduce_*` functions do not support
scalable vectors, i.e., ARM SVE and RISC-V V. This PR adds support for
them. The main code change is to use a different path to extract the
type from the vectors; the rest is the same, and LLVM supports the reduce
functions for `vscale` vectors.

This PR adds scalable vector support for:
- `__builtin_reduce_add`
- `__builtin_reduce_mul`
- `__builtin_reduce_xor`
- `__builtin_reduce_or`
- `__builtin_reduce_and`
- `__builtin_reduce_min`
- `__builtin_reduce_max`

Note: For all except `min/max`, the element type must still be an
integer value. Adding floating point support for `add` and `mul` is
still an open TODO.
2024-04-29 16:45:33 +02:00
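As a reminder of what the reductions listed in bd07c22e53 compute, here is a scalar reference sketch in portable C++. It is an illustration only: `reduce_add`, `reduce_xor`, and `reduce_max` are hypothetical stand-ins, and a fixed-size `std::array` replaces the vector operand (a real scalable vector has a lane count that is only known at run time).

```cpp
#include <array>
#include <cstddef>

// Sum of all lanes, the scalar meaning of an "add" reduction.
template <typename T, std::size_t N>
constexpr T reduce_add(const std::array<T, N>& v) {
  T acc = T{0};
  for (std::size_t i = 0; i < N; ++i) acc += v[i];
  return acc;
}

// Bitwise XOR of all lanes (integer element types only).
template <typename T, std::size_t N>
constexpr T reduce_xor(const std::array<T, N>& v) {
  T acc = T{0};
  for (std::size_t i = 0; i < N; ++i) acc ^= v[i];
  return acc;
}

// Largest lane; requires at least one lane.
template <typename T, std::size_t N>
constexpr T reduce_max(const std::array<T, N>& v) {
  static_assert(N > 0, "reduce_max needs at least one lane");
  T best = v[0];
  for (std::size_t i = 1; i < N; ++i) {
    if (best < v[i]) best = v[i];
  }
  return best;
}
```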
Fangrui Song
76739d1256
[clang] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
Remove unneeded LLVM_FALLTHROUGH added after https://reviews.llvm.org/D131346
2024-04-25 13:25:08 -07:00
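For readers unfamiliar with the attribute that 76739d1256 standardizes on, the sketch below shows the C++17 `[[fallthrough]]` statement in a switch. The function `steps_for` and its case values are hypothetical and unrelated to the Clang sources.

```cpp
// Returns how many steps run for a given kind: kind 2 runs both steps,
// kind 1 runs only the second one, anything else runs none.
constexpr int steps_for(int kind) {
  int steps = 0;
  switch (kind) {
  case 2:
    steps += 1;
    [[fallthrough]];  // deliberate: case 2 continues into case 1
  case 1:
    steps += 1;
    break;
  default:
    break;
  }
  return steps;
}
```

The attribute documents that the missing `break` is intentional, so compilers that warn about implicit fallthrough stay quiet for this case only.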
Bill Wendling
712d7dba4f
[Clang] Improve testing for the flexible array member (#89462)
Testing for the name of the flexible array member isn't as robust as
testing the FieldDecl pointers.
2024-04-24 19:39:33 +00:00
Farzon Lotfi
c4c54af569
[SPIRV][HLSL] map lerp to Fmix (#88976)
- `clang/lib/CodeGen/CGBuiltin.cpp` - switch to using
`getLerpIntrinsic()` to abstract backend intrinsic
- `clang/lib/CodeGen/CGHLSLRuntime.h` - add `getLerpIntrinsic()` 
- `llvm/include/llvm/IR/IntrinsicsSPIRV.td` - add SPIRV intrinsic for
lerp
- `llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp` - add mapping of
HLSL's lerp to GLSL's Fmix.

resolves #88940
2024-04-22 12:40:21 -04:00
Farzon Lotfi
5a1a5226b5
[SPIRV][HLSL] Add mad intrinsic lowering for spirv (#89130)
- `clang/lib/CodeGen/CGBuiltin.cpp` - Add a generic multiply-add
implementation. Make DXIL implementation tied to target.

resolves #88944
2024-04-20 11:13:53 -04:00
Bill Wendling
5bcf31ebfa
[Clang] Loop over FieldDecls instead of all Decls (#89453)
Only FieldDecls are of importance here. A struct defined within another
struct has the same semantics as if it were defined outside of the
struct. So there's no need to look into RecordDecls that aren't a field.
2024-04-19 21:38:17 +00:00
Bill Wendling
c32712d176
[Clang] Handle structs with inner structs and no fields (#89126)
A struct that declares an inner struct, but no fields, won't have a
field count. So getting the offset of the inner struct fails. This
happens in both C and C++:

  struct foo {
    struct bar {
      int Quantizermatrix[];
    };
  };

Here 'struct foo' has no fields.

Closes: https://github.com/llvm/llvm-project/issues/88931
2024-04-19 19:48:33 +00:00
Vitaly Buka
1f35e72271
[clang][builtin] Implement __builtin_allow_runtime_check (#87568)
RFC:
https://discourse.llvm.org/t/rfc-introduce-new-clang-builtin-builtin-allow-runtime-check/78281

---------

Co-authored-by: Noah Goldstein <goldstein.w.n@gmail.com>
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
2024-04-16 17:50:16 -07:00
Farzon Lotfi
105dcc882c
[HLSL][SPIRV] Add any intrinsic lowering (#88325)
- `CGBuiltin.cpp` - Switch to using
`CGM.getHLSLRuntime().get##NAME##Intrinsic()`
- `CGHLSLRuntime.h` - Add any to backend intrinsic abstraction
-  `IntrinsicsSPIRV.td` - Add any intrinsic to SPIR-V.
- `SPIRVInstructionSelector.cpp` - Add means of selecting any intrinsic.
Any and All share the same behavior up to the opCode. They are only
different in vector cases.

Completes #88045
2024-04-15 09:52:47 -04:00
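The remark in 105dcc882c that `any` and `all` only differ in vector cases can be seen from a scalar reference. This is a hedged sketch with hypothetical helpers `any_ref` and `all_ref`, using `std::array` in place of an HLSL vector; it is not the backend lowering.

```cpp
#include <array>
#include <cstddef>

// True if at least one lane is non-zero.
template <typename T, std::size_t N>
constexpr bool any_ref(const std::array<T, N>& v) {
  for (std::size_t i = 0; i < N; ++i) {
    if (v[i] != T{}) return true;
  }
  return false;
}

// True only if every lane is non-zero.
template <typename T, std::size_t N>
constexpr bool all_ref(const std::array<T, N>& v) {
  for (std::size_t i = 0; i < N; ++i) {
    if (v[i] == T{}) return false;
  }
  return true;
}
```

With a single lane both functions reduce to the same "is it non-zero" test; with several lanes they diverge as soon as the lanes disagree.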
Farzon Lotfi
4036a6946e
[HLSL] move rcp to cgbuiltins (#88401)
Removing the intrinsic because there are no opcodes for rcp in DXIL or
SPIR-V.
Moving means we don't have to re-implement this feature for each
backend.

fixes #87784

Co-authored-by: Farzon Lotfi <farzon@farzon.com>
2024-04-11 18:26:25 -04:00
Qiu Chaofan
a4558a4a53
[PowerPC] Implement 32-bit expansion for rldimi (#86783)
rldimi is a 64-bit instruction. For backward compatibility, it needs to
be expanded into a series of rotate and mask operations in 32-bit
environments. In the future, we may improve the bit permutation selector
and remove such direct codegen.
2024-04-09 16:43:49 +08:00
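The "rotate and masking" form mentioned in a4558a4a53 can be written down as a reference computation. The sketch below assumes the commonly documented meaning of the rldimi builtin (rotate the source left, then insert it into a second value under a mask); `rotl64` and `rldimi_ref` are hypothetical names, and this is not the code the patch emits.

```cpp
#include <cstdint>

// Rotate a 64-bit value left by `sh` bits (sh is taken modulo 64).
constexpr std::uint64_t rotl64(std::uint64_t x, unsigned sh) {
  return (sh & 63u) == 0 ? x
                         : (x << (sh & 63u)) | (x >> (64u - (sh & 63u)));
}

// Assumed semantics: rotate `rs` left by `shift`, keep the rotated bits where
// `mask` is set, and keep the bits of `is` everywhere else.
constexpr std::uint64_t rldimi_ref(std::uint64_t rs, std::uint64_t is,
                                   unsigned shift, std::uint64_t mask) {
  return (rotl64(rs, shift) & mask) | (is & ~mask);
}
```

For example, inserting the value 1 rotated by 4 into an all-ones word under the mask 0xF0 replaces only bits 4 to 7, giving 0xFFFFFFFFFFFFFF1F.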
Farzon Lotfi
1cb64d75b2
[HLSL][DXIL][SPIRV] Implementation of an abstraction for intrinsic selection of HLSL backends (#87171)
Start of #83882
- `Builtins.td` - add the `hlsl` `all` elementwise builtin.
- `CGBuiltin.cpp` - Show a use case for CGHLSLUtils via an `all`
intrinsic codegen.
- `CGHLSLRuntime.cpp` - move `thread_id` to use CGHLSLUtils.
- `CGHLSLRuntime.h` - Create a macro to help pick the right intrinsic
for the backend.
- `hlsl_intrinsics.h` - Add the `all` api.
- `SemaChecking.cpp` - Add `all` builtin type checking
- `IntrinsicsDirectX.td` - Add the `all` `dx` intrinsic
- `IntrinsicsSPIRV.td` - Add the `all` `spv` intrinsic

Work still needed:
- `SPIRVInstructionSelector.cpp` - Add an implementation of `OpAll` for
`spv_all` intrinsic
2024-04-04 21:41:55 -04:00
David Green
42c7bc04c3
[AArch64][ARM] Make neon fp16 generic intrinsics always available. (#87467)
By generic intrinsics this means things like dup, ext, zip and bsl that
can always be executed with integer s16 operations and do not require
fullfp16. This makes them always available, and brings them in line with
GCC.
https://godbolt.org/z/azs8eMv54

The relevant test cases have been moved into their own files, to allow
them to be tested with armv8-a and armv8.2-a+fp16.
2024-04-03 19:10:14 +01:00
Sven van Haastregt
e47a81c1d2
[OpenCL] Fix BIenqueue_kernel fallthrough (#83238)
Handling of the `BIenqueue_kernel` builtin must not fall through to the
`BIget_kernel_work_group_size` builtin, as these builtins have no common
functionality.
2024-04-02 09:31:38 +02:00
Marc Auberer
3c8ede9f45
[HLSL][clang] Move hlsl_wave_get_lane_index to EmitHLSLBuiltinExpr (#87131)
Resolves #87109
2024-03-30 21:33:56 +01:00
Nathan Gauër
0f61051f54
[clang][HLSL][SPRI-V] Add convergence intrinsics (#80680)
HLSL has wave operations and other kinds of functions which require the
control flow to either be converged, or to respect certain constraints on
where and how to re-converge.

At the HLSL level, the convergence points are mostly obvious: the control flow
is expected to re-converge at the end of a scope.
Once translated to IR, HLSL scopes disappear. This means we need a way to
communicate convergence restrictions down to the backend.

For this, the SPIR-V backend uses convergence intrinsics. So this commit
adds some code to generate convergence intrinsics when required.

---------

Signed-off-by: Nathan Gauër <brioche@google.com>
2024-03-28 17:18:05 +01:00
Akira Hatanaka
84780af4b0
[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86923)
To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.

This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.

In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.

This reapplies d9a685a9dd589486e882b722e513ee7b8c84870c, which was
reverted because it broke ubsan bots. There seems to be a bug in
coroutine code-gen, which is causing EmitTypeCheck to use the wrong
alignment. For now, pass alignment zero to EmitTypeCheck so that it can
compute the correct alignment based on the passed type (see function
EmitCXXMemberOrOperatorMemberCallExpr).
2024-03-28 06:54:36 -07:00
Akira Hatanaka
f75eebab88
Revert "[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86721)" (#86898)
This reverts commit d9a685a9dd589486e882b722e513ee7b8c84870c.

The commit broke ubsan bots.
2024-03-27 18:14:04 -07:00
Akira Hatanaka
d9a685a9dd
[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86721)
To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.

This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.

In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.

This reapplies 8bd1f9116aab879183f34707e6d21c7051d083b6. The commit
broke msan bots because LValue::IsKnownNonNull was uninitialized.
2024-03-27 12:24:49 -07:00
Alex Voicu
ab7dba233a
[CodeGen][LLVM] Make the va_list related intrinsics generic. (#85460)
Currently, the builtins used for implementing `va_list` handling
unconditionally take their arguments as unqualified `ptr`s i.e. pointers
to AS 0. This does not work for targets where the default AS is not 0 or
AS 0 is not a viable AS (for example, a target might choose 0 to
represent the constant address space). This patch changes the builtins'
signature to take generic `anyptr` args, which corrects this issue. It
is noisy due to the number of tests affected. A test for an upstream
target which does not use 0 as its default AS (SPIRV for HIP device
compilations) is added as well.
2024-03-27 11:41:34 +00:00
Changpeng Fang
d023995ae2
AMDGPU: Simplify EmitAMDGPUBuiltinExpr for load transposes, NFC (#86707)
We should not manually get the types of the loaded data.
Instead, we can get the types from the intrinsics directly.
2024-03-26 17:51:03 -07:00
Akira Hatanaka
b311756450
Revert "[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#67454)" (#86674)
This reverts commit 8bd1f9116aab879183f34707e6d21c7051d083b6.

It appears that the commit broke msan bots.
2024-03-26 07:37:57 -07:00
Akira Hatanaka
8bd1f9116a
[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#67454)
To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.

This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.

In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.
2024-03-25 18:05:42 -07:00
Changpeng Fang
350bda4419
AMDGPU: Rename intrinsics and remove f16/bf16 versions for load transpose (#86313)
Rename the intrinsics to be close to the instruction mnemonic names:
Use global_load_tr_b64 and global_load_tr_b128 instead of
global_load_tr.

This patch also removes f16/bf16 versions of builtins/intrinsics. To
simplify the design, we should avoid enumerating all possible types in
implementing builtins. We can always use bitcast.
2024-03-25 16:55:22 -07:00
Farzon Lotfi
060df78cdb
[DXIL] Add Float Dot Intrinsic Lowering (#86071)
Completes #83626
- `CGBuiltin.cpp` - modify `getDotProductIntrinsic` to be able to emit
`dot2`, `dot3`, and `dot4` intrinsics based on element count
- `IntrinsicsDirectX.td` - for floating point add `dot2`, `dot3`, and
`dot4` intrinsics
- `DXIL.td` - add dxilop intrinsic lowering for `dot2`, `dot3`, & `dot4`.
- `DXILOpLowering.cpp` - add vector arg flattening for dot product. 
- `DXILOpBuilder.h` - modify `createDXILOpCall` to take a smallVector
instead of an iterator
- `DXILOpBuilder.cpp` - modify `createDXILOpCall` by moving the small
vector up to the calling function in `DXILOpLowering.cpp`.
- Moving one function up gives us access to the `CallInst` and
`Function` which were needed to distinguish the dot product intrinsics
and get the operands without using the iterator.
2024-03-25 18:01:46 -04:00
Changpeng Fang
3054d0dae7
AMDGPU: Rename and add bf16 support for global_load_tr builtins (#86202)
Make the name of a clang builtin as close to the mnemonic instruction
name as possible. The data type suffix may not be enough to tell what
instruction the builtin is going to produce.
This patch also adds bf16 support for the global_load_tr_b128 builtins.
2024-03-22 08:51:53 -07:00
OverMighty
c1c2551a28
[clang] Implement __builtin_{clzg,ctzg} (#83431)
Fixes #83075, fixes #83076.
2024-03-21 09:33:16 -07:00
Yeoul Na
3eb9ff3095
Turn 'counted_by' into a type attribute and parse it into 'CountAttributedType' (#78000)
In `-fbounds-safety`, bounds annotations are considered type attributes
rather than declaration attributes. Constructing them as type attributes
allows us to extend the attribute to apply to nested pointers, which is
essential to annotate functions that involve out parameters: `void
foo(int *__counted_by(*out_count) *out_buf, int *out_count)`.

We introduce a new sugar type to support bounds annotated types,
`CountAttributedType`. In order to maintain extra data (the bounds
expression and the dependent declaration information) that is not
trackable in `AttributedType`, we create a new type dedicated to this
functionality.

This patch also extends the parsing logic to parse the `counted_by`
argument as an expression, which will allow us to extend the model to
support arguments beyond an identifier, e.g., `__counted_by(n + m)` in
the future as specified by `-fbounds-safety`.

This also adjusts `__bdos` and array-bounds sanitizer code that already
uses `CountedByAttr` to check `CountAttributedType` instead to get the
field referred to by the attribute.
2024-03-20 13:36:56 +09:00
Farzon Lotfi
081a66ffac
[DXIL] implement dot intrinsic lowering for integers (#85662)
this implements part 1 of 2 for #83626
- `CGBuiltin.cpp` - modified to have separate cases for signed and
unsigned integers.
- `SemaChecking.cpp` - modified to prevent the generation of a double
dot product intrinsic if the builtin were to be called directly.
- `IntrinsicsDirectX.td` - creation of the signed and unsigned dot
intrinsics needed for instruction expansion.
- `DXILIntrinsicExpansion.cpp` - handle instruction expansion cases for
integer dot product.
2024-03-19 12:03:43 -04:00
Farzon Lotfi
8386a388bd
[HLSL] implement clamp intrinsic (#85424)
closes #70071
- `CGBuiltin.cpp` - Add the unsigned\generic clamp intrinsic emitter.
- `IntrinsicsDirectX.td` - add the `dx.clamp` & `dx.uclamp` intrinsics
- `DXILIntrinsicExpansion.cpp` - add the `clamp` instruction expansion
while maintaining vector form.
- `SemaChecking.cpp` -  Add `clamp`  builtin Sema Checks.
- `Builtins.td` - add a `clamp` builtin
- `hlsl_intrinsics.h` - add the `clamp` api

Why `clamp` as instruction expansion for DXIL?
1. SPIR-V has a GLSL `clamp` extension via:
- [FClamp](https://registry.khronos.org/SPIR-V/specs/1.0/GLSL.std.450.html#FClamp)
- [UClamp](https://registry.khronos.org/SPIR-V/specs/1.0/GLSL.std.450.html#UClamp)
- [SClamp](https://registry.khronos.org/SPIR-V/specs/1.0/GLSL.std.450.html#SClamp)
2. Further, clamp lowers to `min(max(x, min_range), max_range)`, for which
we have float, signed, and unsigned DXIL ops.
2024-03-15 20:57:08 -04:00
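The `min(max(x, min_range), max_range)` form quoted in 8386a388bd can be checked with a small reference in plain C++. This is only a sketch: `min_ref`, `max_ref`, and `clamp_ref` are hypothetical helpers, and one template stands in for the separate float, signed, and unsigned operations.

```cpp
// Smaller of two values.
template <typename T>
constexpr T min_ref(T a, T b) {
  return b < a ? b : a;
}

// Larger of two values.
template <typename T>
constexpr T max_ref(T a, T b) {
  return a < b ? b : a;
}

// clamp expressed exactly as min(max(x, min_range), max_range).
template <typename T>
constexpr T clamp_ref(T x, T min_range, T max_range) {
  return min_ref(max_ref(x, min_range), max_range);
}
```

Values below the range come back as `min_range`, values above it as `max_range`, and values inside it are returned unchanged.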
Ahmed Bougacha
0481f049c3
[AArch64][PAC] Support ptrauth builtins and -fptrauth-intrinsics. (#65996)
This defines the basic set of pointer authentication clang builtins
(provided in a new header, ptrauth.h), with diagnostics and IRGen
support.  The availability of the builtins is gated on a new flag,
`-fptrauth-intrinsics`.

Note that this only includes the basic intrinsics, and notably excludes
`ptrauth_sign_constant`, `ptrauth_type_discriminator`, and
`ptrauth_string_discriminator`, which need extra logic to be fully
supported.

This also introduces clang/docs/PointerAuthentication.rst, which
describes the ptrauth model in general, in addition to these builtins.

Co-Authored-By: Akira Hatanaka <ahatanaka@apple.com>
Co-Authored-By: John McCall <rjmccall@apple.com>
2024-03-15 14:17:21 -07:00
Farzon Lotfi
de1a97db39
[DXIL] exp, any, lerp, & rcp Intrinsic Lowering (#84526)
This change implements lowering for #70076, #70100, #70072, & #70102 
`CGBuiltin.cpp` - simplify `lerp` intrinsic
`IntrinsicsDirectX.td` - simplify `lerp` intrinsic
`SemaChecking.cpp` - remove unnecessary check
`DXILIntrinsicExpansion.*` - add intrinsic to instruction expansion
cases
`DXILOpLowering.cpp` - make sure `DXILIntrinsicExpansion` happens first
`DirectX.h` - changes to support new pass
`DirectXTargetMachine.cpp` - changes to support new pass

Why `any`, and `lerp` as instruction expansion just for DXIL?
- In SPIR-V there is an
[OpAny](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#OpAny)
- SPIR-V has a GLSL lerp extension via
[Fmix](https://registry.khronos.org/SPIR-V/specs/1.0/GLSL.std.450.html#FMix)

Why `exp` instruction expansion?
- We have an `exp2` opcode and `exp` reuses that opcode. So instruction
expansion is a convenient way to do preprocessing.
- Further SPIR-V has a GLSL exp extension via
[Exp](https://registry.khronos.org/SPIR-V/specs/1.0/GLSL.std.450.html#Exp)
and
[Exp2](https://registry.khronos.org/SPIR-V/specs/1.0/GLSL.std.450.html#Exp2)

Why `rcp` as instruction expansion?
This one is a bit of the odd man out and might have to move to
`cgbuiltins` when we better understand SPIRV requirements. However I
included it because it seems like [fast math mode has an AllowRecip
flag](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#_fp_fast_math_mode)
which lets you compute the reciprocal without performing the division.
We don't have that in DXIL so thought to include it.
2024-03-14 20:25:57 -04:00
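The note in de1a97db39 that `exp` reuses the `exp2` opcode rests on the identity exp(x) = exp2(x * log2(e)). The sketch below checks that identity in portable C++; `kLog2E`, `exp_via_exp2`, and `check_exp_via_exp2` are hypothetical names and the tolerance is an assumption, so treat it as an illustration rather than the pass's code.

```cpp
#include <cassert>
#include <cmath>

// log2(e), the factor that converts a natural exponent into a base-2 one.
constexpr float kLog2E = 1.44269504088896340736f;

// exp(x) rewritten so that only a base-2 exponential is needed.
inline float exp_via_exp2(float x) {
  return std::exp2(x * kLog2E);
}

// Compares the rewritten form against the standard library.
inline void check_exp_via_exp2() {
  assert(exp_via_exp2(0.0f) == 1.0f);
  assert(std::fabs(exp_via_exp2(1.0f) - std::exp(1.0f)) < 1e-5f);
}
```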
Farzon Lotfi
d192b64370
[HLSL] implement the isinf intrinsic (#84927)
This change implements part 1 of 2 for #70095
- `hlsl_intrinsics.h` - add the `isinf` api
- `Builtins.td` - add an hlsl builtin for `isinf`.
- `CGBuiltin.cpp` add the ir generation for `isinf` intrinsic.
- `SemaChecking.cpp` - add non-math elementwise checks because this
returns a bool.
- `IntrinsicsDirectX.td` - add an `isinf` intrinsic.

`DXIL.td` lowering is left, but changes need to be made there before we
can support this case.
2024-03-14 18:07:48 -04:00
Farzon Lotfi
8f9ee39c58
[HLSL] Implement rsqrt intrinsic (#84820)
This change implements #70074
- `hlsl_intrinsics.h` - add the `rsqrt` api
- `DXIL.td` add the llvm intrinsic to DXIL op lowering map.
- `Builtins.td` - add an hlsl builtin for rsqrt.
- `CGBuiltin.cpp` add the ir generation for the rsqrt intrinsic.
- `SemaChecking.cpp` - reuse the one arg float only checks.
- `IntrinsicsDirectX.td` - add an `rsqrt` intrinsic.
2024-03-14 16:49:33 -04:00
Tim Northover
4299c727e4
AArch64: add __builtin_arm_trap
It's useful to provide an indicator code with the trap, which the generic
__builtin_trap can't do. asm("brk #N") is an option, but following that with a
__builtin_unreachable() leads to two traps when the compiler doesn't know the
block can't return. So compiler support like this is useful.
2024-03-14 11:32:44 +00:00
Sven van Haastregt
c7f1a987a6
[OpenCL] Elaborate about BIenqueue_kernel expansion; NFC
2024-03-12 12:53:22 +00:00
Joseph Huber
1fc5e50ceb
[AMDGPU] Implement 'llvm.get.fpenv' and 'llvm.set.fpenv' (#83906)
Summary:
This patch implements the LLVM floating point environment control
intrinsics and also exposes it through clang. We encode the floating
point environment as a 64-bit value that simply concatenates the values
of the mode registers and the current trap status. We only fetch the
bits relevant for floating point instructions. That is, rounding mode,
denormalization mode, ieee, dx10 clamp, debug, enabled traps, f16
overflow, and active exceptions.
2024-03-06 08:11:54 -06:00
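The 64-bit encoding described in 1fc5e50ceb, a concatenation of mode register bits and trap status, can be pictured as two 32-bit halves. The sketch below is an assumption-laden illustration: the commit message does not say which half holds which value, so placing the mode bits in the low half is a guess, and `pack_fpenv`, `fpenv_mode`, and `fpenv_trap_status` are hypothetical helpers.

```cpp
#include <cstdint>

// Assumed layout: mode bits in the low 32 bits, trap status in the high 32 bits.
constexpr std::uint64_t pack_fpenv(std::uint32_t mode, std::uint32_t trap_status) {
  return (static_cast<std::uint64_t>(trap_status) << 32) | mode;
}

// Recover the (assumed) low half.
constexpr std::uint32_t fpenv_mode(std::uint64_t env) {
  return static_cast<std::uint32_t>(env & 0xFFFFFFFFu);
}

// Recover the (assumed) high half.
constexpr std::uint32_t fpenv_trap_status(std::uint64_t env) {
  return static_cast<std::uint32_t>(env >> 32);
}
```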
Farzon Lotfi
5a5266248d
[HLSL] implement the rcp intrinsic (#83857)
This PR implements the frontend for llvm#70100
This PR is part 1 of 2.
Part 2 requires an intrinsic to instructions lowering.


- `Builtins.td` - add an `rcp` builtin
- `CGBuiltin.cpp` - add the builtin to intrinsic lowering
-  `hlsl_intrinsics.h` - add the `rcp`  api
- `SemaChecking.cpp` - reuse frac's sema checks
- `IntrinsicsDirectX.td` - add the llvm intrinsic
2024-03-05 16:11:13 -05:00
Farzon Lotfi
2807ea6b80
[HLSL] implement the any intrinsic (#83903)
This PR implements the frontend for #70076
This PR is part 1 of 2.
Part 2 requires an intrinsic to instructions lowering.

- `Builtins.td` - add an `any` builtin
- `CGBuiltin.cpp` add the builtin to intrinsic lowering
- `hlsl_basic_types.h` - add the `bool` vectors since that is an input
for any
- `hlsl_intrinsics.h` - add the `any` api
- `SemaChecking.cpp` - add `any` builtin checking
- `IntrinsicsDirectX.td` - add the llvm intrinsic
2024-03-05 12:46:01 -05:00
Farzon Lotfi
643b31dbe8
[HLSL] implement mad intrinsic (#83826)
This change implements #83736
The dot product lowering needs a ternary multiply-add operation. DXIL
has three mad opcodes: `fmad` (46), `imad` (48), and `umad` (49). Dot
product in DXIL only uses `imad` and `umad`, but `fmad` was also included
for completeness and because the HLSL `mad` intrinsic requires it.
Two new intrinsics had to be created to complete this change;
the `fmad` case is already supported by LLVM via the `fmuladd` intrinsic.

- `hlsl_intrinsics.h` - exposed mad api call.
- `Builtins.td` - exposed a `mad` builtin.
- `Sema.h` - make `tertiary` calls check for float types optional. 
- `CGBuiltin.cpp` - pick the intrinsic for signed\unsigned & float; also
reuse `int_fmuladd`.
- `SemaChecking.cpp` - type checks for `__builtin_hlsl_mad`.
- `IntrinsicsDirectX.td` - create the two new intrinsics for
`imad`\`umad`.
- `DXIL.td` - create the llvm intrinsic to  `DXIL` opcode mapping.

---------

Co-authored-by: Farzon Lotfi <farzon@farzon.com>
2024-03-05 12:23:26 -05:00
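To make the link between `mad` and the dot product in 643b31dbe8 concrete, the sketch below builds a three-element dot product from a chain of multiply-adds. `mad_ref` and `dot3_ref` are hypothetical helpers written in portable C++; the actual lowering and its operand order may differ.

```cpp
#include <array>

// Ternary multiply-add: a * b + c.
template <typename T>
constexpr T mad_ref(T a, T b, T c) {
  return a * b + c;
}

// Dot product of two 3-element vectors as one multiply followed by two mads.
template <typename T>
constexpr T dot3_ref(const std::array<T, 3>& a, const std::array<T, 3>& b) {
  return mad_ref(a[2], b[2], mad_ref(a[1], b[1], a[0] * b[0]));
}
```

The same template covers the signed and unsigned integer cases, which correspond to the `imad` and `umad` opcodes named in the commit message.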
Qiu Chaofan
906580bad3
[PowerPC] Add intrinsics for rldimi/rlwimi/rlwnm (#82968)
These builtins are already there in Clang, however current codegen may
produce suboptimal results due to their complex behavior. Implement them
as intrinsics to ensure expected instructions are emitted.
2024-03-04 21:13:59 +08:00
Pavel Iliin
185b1df1b1
[X86][AArch64][PowerPC] __builtin_cpu_supports accepts unknown options. (#83515)
The patch fixes https://github.com/llvm/llvm-project/issues/83407 by
modifying __builtin_cpu_supports behaviour so that it returns false and
issues a warning when an unsupported feature name is passed as its argument.
__builtin_cpu_supports is target independent, but currently supported by
X86, AArch64 and PowerPC only.
2024-03-01 10:12:19 +00:00