2044 Commits

Author SHA1 Message Date
Tex Riddell
5c2a133b13
Emit constrained atan2 intrinsic for clang builtin (#113636)
This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

- `Builtins.td` - Add f16 support for libm atan2 builtin
- `CGBuiltin.cpp` - Emit constrained atan2 intrinsic for clang builtin
- `clang/test/CodeGenCXX/builtin-calling-conv.cpp` - Use erff instead of
atan2 for clang builtin to lib call calling convention check, now that
atan2 maps to an intrinsic.
- add atan2 cases to llvm.experimental.constrained tests for more
backends: ARM, PowerPC, RISCV, SystemZ.
- LangRef.rst: add llvm.experimental.constrained.atan2, revise
llvm.atan2 description.

Last part of Implement the atan2 HLSL Function. Fixes #70096.
2024-11-12 13:34:29 -08:00
Malay Sanghi
f77101ea79
[X86][AMX] Support AMX-MOVRS (#115151)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
2024-11-12 15:05:43 +08:00
Finn Plummer
e520b28397
[DXIL][SPIRV] Lower WaveActiveCountBits intrinsic (#113382)
```
  - add codegen for llvm builtin to spirv/directx intrinsic in CGBuiltin.cpp
  - add lowering of spirv intrinsic to spirv backend in SPIRVInstructionSelector.cpp
  - add lowering of directx intrinsic to dxil op in DXIL.td

  - add test cases to illustrate passes
  - add test case for semantic analysis
```
  
Resolves #80176
2024-11-07 19:06:37 -08:00
Adam Yang
36d757f840
[HLSL][SPIRV] Added clamp intrinsic (#113394)
Fixes #88052

- Added the following intrinsics:
  - `int_spv_uclamp`
  - `int_spv_sclamp`
  - `int_spv_fclamp`
- Updated DirectX counterparts to have the same three clamp intrinsics.
- Update the clamp.hlsl unit tests to include SPIRV
- Added the SPIRV specific tests
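
A minimal sketch of why clamp is split into unsigned, signed, and floating-point variants: the min/max comparisons depend on how the bits are interpreted. The helper names below are hypothetical reference functions in plain C, not the intrinsics added by this patch.

```c
#include <stdint.h>

/* Illustrative reference helpers; the names are hypothetical and are
   not the intrinsics added by the patch. */

/* Unsigned clamp: comparisons treat the operands as unsigned. */
uint32_t clamp_u32(uint32_t x, uint32_t lo, uint32_t hi) {
    uint32_t t = x < lo ? lo : x;   /* max(x, lo) */
    return t > hi ? hi : t;         /* min(t, hi) */
}

/* Signed clamp: same shape, but with signed comparisons. */
int32_t clamp_i32(int32_t x, int32_t lo, int32_t hi) {
    int32_t t = x < lo ? lo : x;
    return t > hi ? hi : t;
}

/* Floating-point clamp (NaN handling is deliberately not modelled). */
float clamp_f32(float x, float lo, float hi) {
    float t = x < lo ? lo : x;
    return t > hi ? hi : t;
}
```

The bit pattern `0xFFFFFFFF` clamped to `[0, 10]` gives 10 when treated as unsigned, while the signed value -1 clamps to 0.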
2024-11-07 17:47:53 -08:00
Bill Wendling
7475156d49
[Clang] Add __builtin_counted_by_ref builtin (#114495)
The __builtin_counted_by_ref builtin is used on a flexible array
pointer and returns a pointer to the "counted_by" attribute's COUNT
argument, which is a field in the same non-anonymous struct as the
flexible array member. This is useful for automatically setting the
count field without needing the programmer's intervention. Otherwise
it's possible to get this anti-pattern:
    
      ptr = alloc(<ty>, ..., COUNT);
      ptr->FAM[9] = 42; /* <<< Sanitizer will complain */
      ptr->count = COUNT;
    
To prevent this anti-pattern, the user can create an allocator that
automatically performs the assignment:
    
      #define alloc(TY, FAM, COUNT) ({ \
          TY __p = alloc(get_size(TY, COUNT));             \
          if (__builtin_counted_by_ref(__p->FAM))          \
              *__builtin_counted_by_ref(__p->FAM) = COUNT; \
          __p;                                             \
      })

The builtin's behavior is heavily dependent upon the "counted_by"
attribute existing. Its main utility is during allocation to avoid
the above anti-pattern. If the flexible array member doesn't have that
attribute, the builtin becomes a no-op. Therefore, if the flexible
array member has a "count" field not referenced by "counted_by", it
must be set explicitly after the allocation as this builtin will
return a "nullptr" and the assignment will most likely be elided.

---------

Co-authored-by: Bill Wendling <isanbard@gmail.com>
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
2024-11-07 22:03:55 +00:00
Finn Plummer
bf30b6c33c
[HLSL][SPIRV][DXIL] Implement dot4add_u8packed intrinsic (#115068)
```
- create a clang built-in in Builtins.td
- link dot4add_u8packed in hlsl_intrinsics.h
- add lowering to spirv backend through expansion of operation as OpUDot is missing up to SPIRV 1.6 in SPIRVInstructionSelector.cpp
- add lowering to spirv backend using OpUDot if applicable SPIRV version or SPV_KHR_integer_dot_product is enabled
- add dot4add_u8packed intrinsic to IntrinsicsDirectX.td and mapping to DXIL.td op Dot4AddU8Packed

- add tests for HLSL intrinsic lowering to dx/spv intrinsic in dot4add_u8packed.hlsl
- add tests for sema checks in dot4add_u8packed-errors.hlsl
- add test of spir-v lowering in SPIRV/dot4add_u8packed.ll
- add test to dxil lowering in DirectX/dot4add_u8packed.ll
```

Resolves #99219
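
For reference, the arithmetic that the expanded lowering has to reproduce can be sketched in plain C. This is only an illustration of the expected semantics (four unsigned 8-bit lanes multiplied pairwise and added to an accumulator); the function name `dot4add_u8packed_ref` is hypothetical and not part of the patch.

```c
#include <stdint.h>

/* Hypothetical reference model: treat a and b as four packed unsigned
   8-bit lanes, multiply lane by lane, and add the products to acc.
   Unsigned arithmetic wraps modulo 2^32. */
uint32_t dot4add_u8packed_ref(uint32_t a, uint32_t b, uint32_t acc) {
    for (int i = 0; i < 4; ++i) {
        uint32_t ai = (a >> (8 * i)) & 0xFFu;  /* extract lane i of a */
        uint32_t bi = (b >> (8 * i)) & 0xFFu;  /* extract lane i of b */
        acc += ai * bi;
    }
    return acc;
}
```

For example, `dot4add_u8packed_ref(0x01020304, 0x01010101, 10)` sums 4 + 3 + 2 + 1 on top of the accumulator 10.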
2024-11-07 10:19:41 -08:00
Sarah Spall
fb90733e19
[HLSL] implement elementwise firstbithigh hlsl builtin (#111082)
Implements elementwise firstbithigh hlsl builtin.
Implements firstbituhigh intrinsic for spirv and directx, which handles
unsigned integers
Implements firstbitshigh intrinsic for spirv and directx, which handles
signed integers.
Fixes #113486
Closes #99115
2024-11-06 07:31:39 -08:00
Matt Arsenault
0c60573d1c
clang/AMDGPU: Emit grid size builtins with range metadata (#113038)
These cannot be 0.
2024-11-05 12:47:04 -08:00
Finn Plummer
3cdac06708
[HLSL][SPIRV][DXIL] Implement dot4add_i8packed intrinsic (#113623)
- create a clang built-in in Builtins.td
- link dot4add_i8packed in hlsl_intrinsics.h
- add lowering to spirv backend through expansion of operation as OpSDot
is missing up to SPIRV 1.6 in SPIRVInstructionSelector.cpp
- add lowering to spirv backend using OpSDot in applicable SPIRV version
or if SPV_KHR_integer_dot_product is enabled
- add dot4add_i8packed intrinsic to IntrinsicsDirectX.td and mapping to
DXIL.td op Dot4AddI8Packed

- add tests for HLSL intrinsic lowering to dx/spv intrinsic in
dot4add_i8packed.hlsl
- add tests for sema checks in dot4add_i8packed-errors.hlsl
- add test of spir-v lowering in SPIRV/dot4add_i8packed.ll
- add test to dxil lowering in DirectX/dot4add_i8packed.ll
    
 Resolves #99220
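
The signed variant differs from the unsigned one only in that each 8-bit lane is sign-extended before the multiply. A minimal sketch in plain C, assuming that reading of the semantics; `dot4add_i8packed_ref` is a hypothetical name, and accumulator overflow is not modelled.

```c
#include <stdint.h>

/* Sign-extend the low 8 bits of v without relying on
   implementation-defined narrowing conversions. */
static int32_t sext8(uint32_t v) {
    return (int32_t)((v & 0xFFu) ^ 0x80u) - 0x80;
}

/* Hypothetical reference model: four packed signed 8-bit lanes,
   multiplied pairwise and added to acc. The caller is assumed to keep
   acc small enough that the signed sum does not overflow. */
int32_t dot4add_i8packed_ref(uint32_t a, uint32_t b, int32_t acc) {
    for (int i = 0; i < 4; ++i) {
        acc += sext8(a >> (8 * i)) * sext8(b >> (8 * i));
    }
    return acc;
}
```

With this model the lane `0xFF` is read as -1, so `dot4add_i8packed_ref(0xFF, 0x02, 5)` yields 3 rather than 515.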
2024-11-05 10:29:08 -08:00
Phoebe Wang
c72a751dab
[X86][AMX] Support AMX-TRANSPOSE (#113532)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
2024-11-01 16:45:03 +08:00
Craig Topper
cd8d507b07 [RISCV] Pull __builtin_riscv_clz/ctz out of a nested switch. NFC
The nested switch exists to share setting IntrinsicsTypes to {ResultType}.
clz/ctz return before we reach that so they can just be in the top
level switch.
2024-10-31 11:01:58 -07:00
Simon Pilgrim
fcaa8c6e22 Fix MSVC "signed/unsigned mismatch" warning. NFC. 2024-10-31 11:50:19 +00:00
Stanislav Mekhanoshin
ba1a09da8d
[AMDGPU] Allow overload of __builtin_amdgcn_mov_dpp8 (#113610)
The same handling as for __builtin_amdgcn_mov_dpp.
2024-10-31 02:19:20 -07:00
joaosaffran
481bce018e
Adding splitdouble HLSL function (#109331)
- Adding hlsl `splitdouble` intrinsics
- Adding DXIL lowering
- Adding SPIRV lowering
- Adding test

Fixes: #108901
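
As an illustration of what splitting a double means here, the following plain C sketch extracts the low and high 32-bit halves of the bit pattern. It assumes an IEEE 754 binary64 `double`; the name `split_double_ref` is hypothetical and not taken from the patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical reference model: copy the 64-bit representation of d
   and return its low and high 32-bit halves through lo and hi.
   Assumes double is IEEE 754 binary64. */
void split_double_ref(double d, uint32_t *lo, uint32_t *hi) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);          /* well-defined type pun */
    *lo = (uint32_t)(bits & 0xFFFFFFFFu);    /* low 32 bits */
    *hi = (uint32_t)(bits >> 32);            /* high 32 bits */
}
```

For `1.0`, whose bit pattern is `0x3FF0000000000000`, the low half is 0 and the high half is `0x3FF00000`.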

---------

Co-authored-by: Joao Saffran <jderezende@microsoft.com>
2024-10-28 13:26:59 -07:00
Simon Pilgrim
d6d4569dd9 Fix MSVC "signed/unsigned mismatch" warnings. NFC. 2024-10-28 11:45:36 +00:00
Alex MacLean
fb33af08e4
[NVPTX] Remove nvvm.ldg.global.* intrinsics (#112834)
Remove these intrinsics which can be better represented by load
instructions with `!invariant.load` metadata:

- llvm.nvvm.ldg.global.i
- llvm.nvvm.ldg.global.f
- llvm.nvvm.ldg.global.p
2024-10-27 16:14:13 -07:00
Jay Foad
4dd55c567a
[clang] Use {} instead of std::nullopt to initialize empty ArrayRef (#109399)
Follow up to #109133.
2024-10-24 10:23:40 +01:00
Alex Voicu
6e0b0038cd
[clang][OpenCL][CodeGen][AMDGPU] Do not use private as the default AS for when generic is available (#112442)
Currently, for AMDGPU, when compiling for OpenCL, we unconditionally use
`private` as the default address space. This is wrong for cases where
the `generic` address space is available, and is corrected via this
patch. In general, this AS map abuse is a bad hack and we should re-work
it altogether, but at least after this patch we will stop being
incorrect for e.g. OpenCL 2.0.
2024-10-22 12:05:48 +01:00
Stanislav Mekhanoshin
622e398d88
[AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (#112447)
We need to support 64-bit data types (the intrinsics already support
them). We were also silently converting FP arguments to integer; that is
fixed as well.
2024-10-21 11:57:18 -07:00
Sven van Haastregt
5a09ce9e03
[OpenCL] Replace a CreatePointerCast call; NFC (#112676)
With opaque pointers, the only purpose of the cast here is to cast
between address spaces, similar to the 4-argument case below.
2024-10-18 09:10:05 +02:00
Bill Wendling
8c62bf54df
[Clang] Disable use of the counted_by attribute for whole struct pointers (#112636)
The whole struct is specified in the __bdos. The calculation of the
whole size of the structure can be done in two ways:

    1) sizeof(struct S) + count * sizeof(typeof(fam))
    2) offsetof(struct S, fam) + count * sizeof(typeof(fam))

The first includes any trailing padding at the end of the struct, while
the second is more precise but not quite what programmers expect.
See [1] for a discussion of the topic.
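
The difference between the two calculations can be sketched with a small struct. The struct layout and helper names below are hypothetical, chosen only so that the struct is likely to have trailing padding on common ABIs.

```c
#include <stddef.h>

/* Hypothetical example struct: on typical 64-bit ABIs there is padding
   after tag, so sizeof(struct S) is larger than offsetof(struct S, fam). */
struct S {
    size_t count;
    char   tag;
    char   fam[];   /* flexible array member */
};

/* Method 1: sizeof-based, includes any trailing padding. */
size_t size_via_sizeof(size_t count) {
    return sizeof(struct S) + count * sizeof(char);
}

/* Method 2: offsetof-based, counts only up to the start of fam. */
size_t size_via_offsetof(size_t count) {
    return offsetof(struct S, fam) + count * sizeof(char);
}
```

The offsetof-based result is never larger than the sizeof-based one; the gap between them is the padding discussed above.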

GCC isn't (currently) able to calculate __bdos on a pointer to the whole
structure. Therefore, because of the above issue, we'll choose to match
what GCC does for consistency's sake.

[1] https://lore.kernel.org/lkml/ZvV6X5FPBBW7CO1f@archlinux/

Co-authored-by: Eli Friedman <efriedma@quicinc.com>
2024-10-17 21:52:40 +00:00
Sven van Haastregt
caa7301bc8
[OpenCL] Restore addrspacecast for pipe builtins (#112514)
Commit 84ee629bc515 ("clang: Remove some pointer bitcasts (#112324)",
2024-10-15) triggered some "Call parameter type does not match function
signature!" errors when using the OpenCL pipe builtin functions under
the spir triple, due to a missing addrspacecast.

This would have been caught by the pipe_builtin.cl test if that had used
the `spir-unknown-unknown` triple, so extend the test to use that
triple too.
2024-10-16 13:58:12 +02:00
Finn Plummer
6d13cc9411
[HLSL] Implement WaveReadLaneAt intrinsic (#111010)
- create a clang built-in in Builtins.td
- add semantic checking in SemaHLSL.cpp
- link the WaveReadLaneAt api in hlsl_intrinsics.h
- add lowering to spirv backend op GroupNonUniformShuffle
  with Scope = 2 (Group) in SPIRVInstructionSelector.cpp
- add WaveReadLaneAt intrinsic to IntrinsicsDirectX.td and mapping
  to DXIL.td

- add tests for HLSL intrinsic lowering to spirv intrinsic in
  WaveReadLaneAt.hlsl
- add tests for sema checks in WaveReadLaneAt-errors.hlsl
- add spir-v backend tests in WaveReadLaneAt.ll
- add test to show scalar dxil lowering functionality

- note that this doesn't include support for the scalarizer to
  handle WaveReadLaneAt; that will be added in a future PR

This is the first part of #70104
2024-10-15 18:49:40 -07:00
Matt Arsenault
84ee629bc5
clang: Remove some pointer bitcasts (#112324)
Obsolete since opaque pointers.
2024-10-15 22:46:24 +04:00
YunQiang Su
5bf81e53db
Clang: Support minimumnum and maximumnum intrinsics (#96281)
We just introduced support for the llvm.minimumnum and llvm.maximumnum
intrinsics in LLVM. Let's support them in Clang.

See: #93033
2024-10-14 15:49:01 +08:00
Rahul Joshi
c8da2253f9
[Clang] Replace Intrinsic::getDeclaration with getOrInsertDeclaration (#111990)
Fix build failure from the rename change. Looks like one additional
reference sneaked in between pre-commit checks and the commit itself.
2024-10-11 05:45:09 -07:00
Rahul Joshi
fa789dffb1
[NFC] Rename Intrinsic::getDeclaration to getOrInsertDeclaration (#111752)
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
2024-10-11 05:26:03 -07:00
Finn Plummer
2647505027
[HLSL] Implement the degrees intrinsic (#111209)
- add degrees builtin
- link degrees api in hlsl_intrinsics.h
- add degrees intrinsic to IntrinsicsDirectX.td
- add degrees intrinsic to IntrinsicsSPIRV.td
- add lowering from clang builtin to dx/spv intrinsics in CGBuiltin.cpp
- add semantic checks to SemaHLSL.cpp
- add expansion of directx intrinsic to llvm fmul for DirectX in
  DXILIntrinsicExpansion.cpp
- add mapping to spir-v intrinsic in SPIRVInstructionSelector.cpp

- add test coverage:
  - degrees.hlsl -> check hlsl lowering to dx/spv degrees intrinsics
  - degrees-errors.hlsl/half-float-only-errors -> check semantic warnings
  - hlsl-intrinsics/degrees.ll -> check lowering of spir-v degrees
    intrinsic to SPIR-V backend
  - DirectX/degrees.ll -> check expansion and scalarization of directx
    degrees intrinsic to fmul
      
Resolves #99104
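
The expansion to a single fmul mentioned above corresponds to multiplying by the constant 180/pi. A minimal sketch in plain C, with `degrees_ref` as a hypothetical name and the constant written out as an assumption about the value used:

```c
/* Hypothetical reference model: radians to degrees as one multiply
   by 180/pi (approximately 57.29578). */
float degrees_ref(float radians) {
    const float k = 57.29577951308232f;  /* 180 / pi, rounded to float */
    return radians * k;
}
```

Passing a value close to pi should give a result close to 180, up to float rounding.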
2024-10-10 16:34:26 -07:00
Finn Plummer
d36cef0b17
[HLSL][DXIL] Implement WaveGetLaneIndex Intrinsic (#111576)
- add additional lowering for directx backend in CGBuiltin.cpp
- add directx intrinsic to IntrinsicsDirectX.td
- add semantic check of arguments in SemaHLSL.cpp
- add mapping to DXIL op in DXIL.td

- add testing of semantics in WaveGetLaneIndex-errors.hlsl
- add testing of dxil lowering in WaveGetLaneIndex.ll
  
Resolves #70105
2024-10-10 11:44:44 -07:00
Tim Gymnich
99608f114f
[clang][HLSL] Add sign intrinsic part 4 (#108396)
- Add handling for unsigned integers to hlsl_elementwise_sign
- Use `select` instead of adding dx and spirv intrinsics for unsigned
integers as [discussed previously](https://github.com/llvm/llvm-project/pull/101988#discussion_r1736779424)

fixes #70078
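
Why a plain `select` is enough for unsigned inputs can be sketched in C: an unsigned value is never negative, so its sign is 0 for zero and 1 otherwise. The helper name is hypothetical, and this models only the expected result, not the emitted IR.

```c
#include <stdint.h>

/* Hypothetical reference model of sign() for an unsigned input:
   the result can only be 0 or 1, so one compare-and-select suffices. */
int sign_u32_ref(uint32_t x) {
    return x == 0u ? 0 : 1;
}
```

Note that `0xFFFFFFFF` has sign 1 here, whereas the same bits read as a signed integer would have sign -1.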

### Related PRs
- https://github.com/llvm/llvm-project/pull/101987
- https://github.com/llvm/llvm-project/pull/101988
- https://github.com/llvm/llvm-project/pull/101989

cc @farzonl @pow2clk @bob80905 @bogner @llvm-beanz
2024-10-10 05:18:15 -04:00
Adam Yang
9df94e2791
[clang][HLSL] Add radians intrinsic (#110802)
partially fixes #99151

### Changes
* Implemented `radians` clang builtin
* Linked `radians` clang builtin with `hlsl_intrinsics.h`
* Added sema checks for `radians` to `CheckHLSLBuiltinFunctionCall` in
`SemaChecking.cpp`
* Add codegen for `radians` to `EmitHLSLBuiltinExpr` in `CGBuiltin.cpp`
* Add codegen tests to `clang/test/CodeGenHLSL/builtins/radians.hlsl`
* Add sema tests to `clang/test/SemaHLSL/BuiltIns/radians-errors.hlsl`

### Related PRs
* [[DXIL] Add radians intrinsic
#110616](https://github.com/llvm/llvm-project/pull/110616)
* [[SPIRV] Add radians intrinsic
#110800](https://github.com/llvm/llvm-project/pull/110800)
2024-10-04 18:34:46 -04:00
Kazu Hirata
36929955f5 [CodeGen] Fix warnings
This patch fixes:

  clang/lib/CodeGen/CGBuiltin.cpp:18677:11: error: unused variable
  'XVecTy1' [-Werror,-Wunused-variable]

  clang/lib/CodeGen/CGBuiltin.cpp:18678:11: error: unused variable
  'XVecTy2' [-Werror,-Wunused-variable]
2024-10-03 10:47:26 -07:00
Joshua Batista
c098435eaa
Add cross builtins and cross HLSL function to DirectX and SPIR-V backend (#109180)
This PR adds the cross intrinsic and an HLSL function that uses it.
The SPIRV backend is also implemented.

Used https://github.com/llvm/llvm-project/pull/106471 as a reference.
Fixes https://github.com/llvm/llvm-project/issues/99095
2024-10-03 10:24:09 -07:00
Francis Visoiu Mistrih
9440420f63
[Clang] Add __builtin_(elementwise|reduce)_(max|min)imum (#110198)
We have the LLVM intrinsics, and we're missing the clang builtins to be
used directly in code that needs to make the distinction in NaN
semantics.
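
The NaN distinction can be sketched in plain C. The two helpers below are hypothetical scalar models: `minnum_like` returns the non-NaN operand when exactly one input is NaN (the behaviour of `llvm.minnum`), while `minimum_like` propagates NaN and orders -0.0 below +0.0 (the behaviour of `llvm.minimum`). Signaling NaNs are not modelled.

```c
#include <math.h>

/* Hypothetical model of minnum-style semantics: a single NaN operand
   is ignored and the other operand is returned. */
double minnum_like(double a, double b) {
    if (isnan(a)) return b;
    if (isnan(b)) return a;
    return a < b ? a : b;
}

/* Hypothetical model of minimum-style semantics: any NaN operand makes
   the result NaN, and -0.0 is treated as less than +0.0. */
double minimum_like(double a, double b) {
    if (isnan(a) || isnan(b)) return NAN;
    if (a == 0.0 && b == 0.0) return signbit(a) ? a : b;
    return a < b ? a : b;
}
```

With one NaN input, `minnum_like(NAN, 1.0)` gives 1.0 while `minimum_like(NAN, 1.0)` gives NaN, which is the difference code may need to select explicitly.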
2024-10-01 15:39:23 -07:00
Tex Riddell
b70d32789c
[HLSL][clang] Add elementwise builtin for atan2 (p3) (#110187)
This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

- Add HLSL frontend for atan2
- Add clang Builtin, map to new llvm.atan2
- SemaChecking restrict to floating point and 2 args
- SemaHLSL restrict to float or half.
- Add to clang ReleaseNotes.rst and LanguageExtensions.rst
- Add half-float-only-errors2.hlsl for 2 arg intrinsics, and update half-float-only-errors.hlsl with scalar case for consistency
- Remove fmod-errors.hlsl and pow-errors.hlsl now covered in half-float-only-errors2.hlsl

Part 3 for Implement the atan2 HLSL Function #70096.
2024-10-01 14:41:43 -07:00
realqhc
00128a20ee
[RISCV] Implement Clang Builtins for XCValu Extension in CV32E40P (#100684)
This commit adds the Clang Builtins, C API header and relevant tests for
XCValu extension.

Spec:
https://github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md

Contributor: @melonedo, @PaoloS02
2024-10-01 11:22:02 +10:00
Zhengxing li
5d08f3256b
[HLSL] Implementation of the elementwise fmod builtin (#108849)
This change adds the elementwise fmod builtin to support HLSL function
'fmod' in clang for #99118

Builtins.td           - add the fmod builtin
CGBuiltin.cpp         - lower the builtin to llvm FRem instruction
hlsl_intrinsics.h     - add the fmod api
SemaChecking.cpp      - add type checks for builtin
SemaHLSL.cpp          - add HLSL type checks for builtin

clang/docs/LanguageExtensions.rst - add the builtin in *Elementwise
Builtins*
clang/docs/ReleaseNotes.rst        - announce the builtin
2024-09-27 17:26:06 -04:00
Lukacma
c511cc099a
[AArch64] Implement NEON vscale intrinsics (#100347)
This patch implements the following intrinsics:

```
float16x4_t vscale_f16(float16x4_t vn, int16x4_t vm)	
float16x8_t vscaleq_f16(float16x8_t vn, int16x8_t vm)
float32x2_t vscale_f32(float32x2_t vn, int32x2_t vm)
float32x4_t vscaleq_f32(float32x4_t vn, int32x4_t vm)
float64x2_t vscaleq_f64(float64x2_t vn, int64x2_t vm)
```

as defined in https://github.com/ARM-software/acle/pull/323

Co-authored-by: Hassnaa Hamdi <hassnaa.hamdi@arm.com>
2024-09-26 16:39:18 +01:00
Paul Walker
0c31ea5a09
[Clang][SME2] Use tuple result of SME builtins directly. (#109423)
I missed a codepath during PR108008 so SME2/SVE2p1 builtins are
converting their struct return type into a large vector, which is
causing unnecessary casting via memory.
2024-09-25 11:19:05 +01:00
Benjamin Maxwell
53907ed508
[clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args (#108853)
On some targets, an FP libcall with argument types such as long double
will be lowered to pass arguments indirectly via pointers. When this is
the case we should not mark the libcall with "int" TBAA as it may lead
to incorrect optimizations.

Currently, this can be seen for long doubles on x86_64-w64-mingw32. The
`load x86_fp80` after the call is (incorrectly) marked with "int" TBAA
(overwriting the previous metadata for "long double").

Nothing seems to break due to this currently as the metadata is being
incorrectly placed on the load and not the call. But if the metadata
is moved to the call (which this patch ensures), LLVM will optimize out
the setup for the arguments.
2024-09-25 09:50:55 +01:00
Yingwei Zheng
d8f555d625
[UBSan] Diagnose assumption violation (#104741)
This patch extends [D34590](https://reviews.llvm.org/D34590) to check
assumption violations.

---------

Co-authored-by: Vitaly Buka <vitalybuka@google.com>
2024-09-25 13:59:10 +08:00
Congcong Cai
eca5949031
[codegen][NFC] add static mark for internal usage variable and function (#109431)
Detected by clang-tidy misc-use-internal-linkage
2024-09-24 07:25:07 +08:00
Lei Huang
62f3eae466
[PowerPC] Fix incorrect store alignment for __builtin_vsx_build_pair() (#108606)
Fixes #107229
2024-09-23 13:30:59 -04:00
Nikita Popov
ecb98f9fed [IRBuilder] Remove uses of CreateGlobalStringPtr() (NFC)
Since the migration to opaque pointers, CreateGlobalStringPtr()
is the same as CreateGlobalString(). Normalize to the latter.
2024-09-23 16:30:50 +02:00
Simon Pilgrim
f8f0a266e0
[clang][wasm] Replace the target integer sub saturate intrinsics with the equivalent generic __builtin_elementwise_sub_sat intrinsics (#109405)
Remove the Intrinsic::wasm_sub_sat_signed/wasm_sub_sat_unsigned entries
and just use sub_sat_s/sub_sat_u directly
2024-09-22 10:12:41 +01:00
Simon Pilgrim
2c90eb990a
[clang][wasm] Replace the target integer add saturate intrinsics with the equivalent generic __builtin_elementwise_add_sat intrinsics (#109269)
Noticed while working on #109160

I've left out the sub_sat intrinsics for now - not sure about the history behind them using Intrinsic::wasm_sub_sat_* instead of Intrinsic::*sub_sat
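
For readers unfamiliar with saturating arithmetic, a per-lane model in plain C follows: the sum is computed in a wider type and clamped to the lane's range instead of wrapping. The helper names are hypothetical and the sketch covers 8-bit lanes only.

```c
#include <stdint.h>

/* Hypothetical model of a signed 8-bit saturating add for one lane. */
int8_t add_sat_s8(int8_t a, int8_t b) {
    int sum = (int)a + (int)b;              /* cannot overflow in int */
    if (sum > INT8_MAX) return INT8_MAX;    /* clamp to 127 */
    if (sum < INT8_MIN) return INT8_MIN;    /* clamp to -128 */
    return (int8_t)sum;
}

/* Hypothetical model of an unsigned 8-bit saturating add for one lane. */
uint8_t add_sat_u8(uint8_t a, uint8_t b) {
    unsigned sum = (unsigned)a + (unsigned)b;
    return sum > UINT8_MAX ? (uint8_t)UINT8_MAX : (uint8_t)sum;
}
```

An elementwise builtin applies the same operation independently to every lane of a vector.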
2024-09-20 11:49:31 +01:00
Simon Pilgrim
e5717fb61d
[clang][wasm] Replace the target iminmax intrinsics with the equivalent generic __builtin_elementwise_min/max intrinsics (#109259)
Noticed while working on #109160
2024-09-20 11:48:57 +01:00
Simon Pilgrim
0013f94b24
[clang][powerpc][wasm][systemz][x86] Replace target vector popcount intrinsics with __builtin_elementwise_popcount (#109160)
Now that we have the C/C++ `__builtin_elementwise_popcount` intrinsic (#108121) - remove custom target intrinsics that just immediately map to Intrinsic::ctpop and use the generic intrinsic directly.
2024-09-19 12:40:36 +01:00
Sarah Spall
67518a44fe
[HLSL] Implement elementwise popcount (#108121)
Add new elementwise popcount builtin to support HLSL function
'countbits'.
elementwise popcount only accepts integer types.
Add hlsl intrinsic 'countbits'
Closes #99094
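
A rough model of what elementwise popcount computes, in plain C: each element's set bits are counted independently. The function names are hypothetical, and the bit-clearing loop is just one portable way to count bits, not the lowering used by the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Count set bits by clearing the lowest set bit until none remain. */
unsigned popcount_u32(uint32_t x) {
    unsigned n = 0;
    while (x != 0u) {
        x &= x - 1u;   /* clear the lowest set bit */
        ++n;
    }
    return n;
}

/* Hypothetical elementwise form: apply popcount to each of n elements. */
void popcount_elementwise(const uint32_t *in, uint32_t *out, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        out[i] = popcount_u32(in[i]);
    }
}
```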
2024-09-18 08:19:52 -07:00
Martin Storsjö
f710612584 Revert "[clang][codegen] Fix possible crash when setting TBAA metadata on FP math libcalls (#108575)"
This reverts commit a56ca1a0fb248c6f38b5841323a74673748f43ea.

This commit broke code generation for x86 mingw targets, with regards
to long double math functions - see
https://github.com/llvm/llvm-project/pull/108575#issuecomment-2352574978
for details.
2024-09-16 13:51:16 +03:00