llvm-project

Author	SHA1	Message	Date
Matt Arsenault	90dc644d73	AMDGPU: Add v_smfmac_f32_32x32x64_fp8_bf8 for gfx950 (#117258 )	2024-11-22 12:08:15 -08:00
Matt Arsenault	8d3435f8a1	AMDGPU: Add v_smfmac_f32_32x32x64_bf8_fp8 for gfx950 (#117257 )	2024-11-22 12:02:18 -08:00
Matt Arsenault	8a5c24149d	AMDGPU: Add v_smfmac_f32_32x32x64_bf8_bf8 for gfx950 (#117256 )	2024-11-22 11:59:06 -08:00
Matt Arsenault	836d2dcf60	AMDGPU: Add v_smfmac_f32_16x16x128_fp8_fp8 for gfx950 (#117235 )	2024-11-21 17:06:06 -08:00
Matt Arsenault	33124910c9	AMDGPU: Add v_smfmac_f32_16x16x128_fp8_bf8 for gfx950 (#117234 )	2024-11-21 17:03:03 -08:00
Matt Arsenault	3678f8a8aa	AMDGPU: Add v_smfmac_f32_16x16x128_bf8_fp8 for gfx950 (#117233 )	2024-11-21 17:00:08 -08:00
Matt Arsenault	7baadb2a4e	AMDGPU: Add v_smfmac_f32_16x16x128_bf8_bf8 for gfx950 (#117232 )	2024-11-21 16:57:01 -08:00
Matt Arsenault	3e6f3508ad	AMDGPU: Add v_smfmac_i32_32x32x64_i8 for gfx950 (#117214 )	2024-11-21 15:01:03 -08:00
Matt Arsenault	8c53036146	AMDGPU: Add v_smfmac_i32_16x16x128_i8 for gfx950 (#117213 )	2024-11-21 14:58:11 -08:00
Matt Arsenault	42dd114a46	AMDGPU: Add v_smfmac_f32_32x32x32_bf16 for gfx950 (#117212 )	2024-11-21 14:52:11 -08:00
Matt Arsenault	95ddc1a63b	AMDGPU: Add v_smfmac_f32_16x16x64_bf16 for gfx950 (#117211 )	2024-11-21 14:46:43 -08:00
Matt Arsenault	e50eaa2cf1	AMDGPU: Add v_smfmac_f32_32x32x32_f16 for gfx950 (#117205 )	2024-11-21 14:43:33 -08:00
Matt Arsenault	2ab178820b	AMDGPU: Add v_smfmac_f32_16x16x64_f16 for gfx950 (#117202 )	2024-11-21 14:40:30 -08:00
Matt Arsenault	1c47d67abc	AMDGPU: Add v_mfma_f32_16x16x32_bf16 for gfx950 (#117053 )	2024-11-21 14:28:05 -08:00
Matt Arsenault	f4ed79b160	AMDGPU: Add v_mfma_i32_32x32x32_i8 for gfx950 (#117052 )	2024-11-21 09:08:15 -08:00
Matt Arsenault	76b24640e5	AMDGPU: Add v_mfma_i32_16x16x64_i8 for gfx950 (#116728 )	2024-11-21 09:02:12 -08:00
Matt Arsenault	01c9a14ccf	AMDGPU: Define v_mfma_f32_{16x16x128|32x32x64}_f8f6f4 instructions (#116723 ) These use a new VOP3PX encoding for the v_mfma_scale_* instructions, which bundles the pre-scale v_mfma_ld_scale_b32. None of the modifiers are supported yet (op_sel, neg or clamp). I'm not sure the intrinsic should really expose op_sel (or any of the others). If I'm reading the documentation correctly, we should be able to just have the raw scale operands and auto-match op_sel to byte extract patterns. The op_sel syntax also seems extra horrible in this usage, especially with the usual assumed op_sel_hi=-1 behavior.	2024-11-21 08:51:58 -08:00
Matt Arsenault	130a3150ec	AMDGPU: Define v_mfma_f32_32x32x16_bf16 for gfx950 (#116679 ) Unlike the existing gfx940 intrinsics using short/i16 in place of bfloat, this uses the natural bfloat type.	2024-11-18 21:53:56 -08:00
Matt Arsenault	0c421687f8	AMDGPU: Add first gfx950 mfma instructions (#116312 ) Scheduling info and hazards are wrong and TBD.	2024-11-18 13:38:07 -08:00
Krzysztof Parzyszek	e44c28f07e	[clang] Replace "can't" and "can not" in diagnostics with "cannot" (#116623 ) See https://discourse.llvm.org/t/cant-cannot-can-not-in-diagnostic-messages/83171	2024-11-18 15:28:17 -06:00
Stanislav Mekhanoshin	ba1a09da8d	[AMDGPU] Allow overload of __builtin_amdgcn_mov_dpp8 (#113610 ) The same handling as for __builtin_amdgcn_mov_dpp.	2024-10-31 02:19:20 -07:00
Gang Chen	4ac0e7e400	[AMDGPU] Add a type for the named barrier (#113614 )	2024-10-25 11:24:47 -07:00
Stanislav Mekhanoshin	03fef62b84	[AMDGPU] Relax __builtin_amdgcn_update_dpp sema check (#113341 ) Recent change applied too strict check for old and src operands match. These shall be compatible, but not necessarily exactly the same. Fixes: SWDEV-493072	2024-10-22 12:32:08 -07:00
Stanislav Mekhanoshin	622e398d88	[AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (#112447 ) We need to support 64-bit data types (intrinsics do support it). We are also silently converting FP to integer argument now, also fixed.	2024-10-21 11:57:18 -07:00
Aaron Ballman	1881f648e2	Remove ^^ as a token in OpenCL (#108224 ) OpenCL has a reserved operator (^^), the use of which was diagnosed as an error (735c6cdebdcd4292928079cb18a90f0dd5cd65fb). However, OpenCL also encourages working with the blocks language extension. This token has a parsing ambiguity as a result. Consider: unsigned x=0; unsigned y=x^^{return 0;}(); This should result in y holding the value zero (0^0) through an immediately invoked block call as the right-hand side of the xor operator. However, it causes errors instead because of this reserved token: https://godbolt.org/z/navf7jTv1 This token is still reserved in OpenCL 3.0, so we still wish to issue a diagnostic for its use. However, we do not need to create a token for an extension point that's been unused for about a decade. So this patch moves the diagnostic from a parsing diagnostic to a lexing diagnostic and no longer forms a single token. The diagnostic behavior is slightly worse as a result, but still seems acceptable. Part of the reason this is coming up is because WG21 is considering using ^^ as a token for reflection, so this token may come back in the future.	2024-09-16 07:46:58 -04:00
Shilei Tian	af5352fe8e	[Clang][AMDGPU] Use unsigned data type for `__builtin_amdgcn_raw_buffer_store_*` (#99546 )	2024-07-18 16:34:59 -04:00
Shilei Tian	892c58cf74	[Clang][AMDGPU] Add builtins for intrinsic `llvm.amdgcn.raw.ptr.buffer.load` (#99258 )	2024-07-18 15:33:03 -04:00
Stanislav Mekhanoshin	f363e30f15	[AMDGPU] Report error in clang if wave32 is requested where unsupported (#97633 )	2024-07-09 14:25:58 -07:00
Shilei Tian	c9f083a994	[Clang][AMDGPU] Add builtins for intrinsic `llvm.amdgcn.raw.ptr.buffer.store` (#94576 ) Depends on https://github.com/llvm/llvm-project/pull/96313.	2024-06-25 09:55:37 -04:00
Shilei Tian	ad599211a7	[Clang][AMDGPU] Add a new builtin type for buffer rsrc (#94830 ) This patch adds a new builtin type for AMDGPU's buffer rsrc data type, which is effectively an AS 8 pointer. This is needed because we'd like to expose certain intrinsics to users via builtins which take buffer rsrc as argument.	2024-06-18 20:46:53 -04:00
Alex Voicu	88e2bb4092	[clang][SPIR-V] Add support for AMDGCN flavoured SPIRV (#89796 ) This change seeks to add support for vendor flavoured SPIRV - more specifically, AMDGCN flavoured SPIRV. The aim is to generate SPIRV that carries some extra bits of information that are only usable by AMDGCN targets, forfeiting absolute genericity to obtain greater expressiveness for target features: - AMDGCN inline ASM is allowed/supported, under the assumption that the [SPV_INTEL_inline_assembly](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_inline_assembly.asciidoc) extension is enabled/used - AMDGCN target specific builtins are allowed/supported, under the assumption that e.g. the `--spirv-allow-unknown-intrinsics` option is enabled when using the downstream translator - the featureset matches the union of AMDGCN targets' features - the datalayout string is overspecified to affix both the program address space and the alloca address space, the latter under the assumption that the [SPV_INTEL_function_pointers](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_function_pointers.asciidoc) extension is enabled/used, case in which the extant SPIRV datalayout string would lead to pointers to function pointing to the private address space, which would be wrong. Existing AMDGCN tests are extended to cover this new target. It is currently dormant / will require some additional changes, but I thought I'd rather put it up for review to get feedback as early as possible. I will note that an alternative option is to place this under AMDGPU, but that seems slightly less natural, since this is still SPIRV, albeit relaxed in terms of preconditions & constrained in terms of postconditions, and only guaranteed to be usable on AMDGCN targets (it is still possible to obtain pristine portable SPIRV through usage of the flavoured target, though).	2024-06-07 11:50:23 +01:00
Shilei Tian	170d45c0eb	[Clang][AMDGPU] Use `I` to decorate imm argument for `__builtin_amdgcn_global_load_lds` (#94376 )	2024-06-06 09:29:59 -04:00
Pierre van Houtryve	c1ac6d2dd4	[AMDGPU] Add amdgpu-as MMRA for fences (#78572 ) Using MMRAs, allow `builtin_amdgcn_fence` to emit fences that only target one or more address spaces, instead of fencing all address spaces at once. This is done through a `amdgpu-as` MMRA. Currently focused on OpenCL fences, but can very easily support more AS names and codegen on more than just fences.	2024-05-27 12:17:04 +02:00
Aaron Ballman	b49ce9c304	Fix more diagnostic wording for style; NFC (#93190 ) This tries to fix all of the places where a diagnostic message starts with a capital letter (other than acronyms or proper nouns) or ends with punctuation (other than a question mark). This is in support of a planned change to tablegen to start diagnosing incorrect diagnostic message styles.	2024-05-23 14:50:29 -04:00
Shilei Tian	7e476eb11c	[AMDGPU][Clang] Add check of size for __builtin_amdgcn_global_load_lds (#93064 )	2024-05-23 10:36:03 -04:00
Fangrui Song	7c1d9b15ee	[test] %clang_cc1: remove redundant actions	2024-05-04 23:08:11 -07:00
Timm Bäder	3590ede848	[clang][Interp] Support vec_step	2024-04-29 18:21:24 +02:00
Mariya Podchishchaeva	8697bbe2d4	[clang] Use CPlusPlus language option instead of Bool (#80975 ) As was pointed out in https://github.com/llvm/llvm-project/pull/80724, we should not be checking `getLangOpts().Bool` when determining something related to logical operators, since it only indicates that the bool keyword is present, not what semantics logical operators have. As a side effect, a missing `-Wpointer-bool-conversion` in OpenCL C was restored, since, like C23, OpenCL C has the bool keyword but logical operators still return int.	2024-02-08 14:31:57 +01:00
Joseph Huber	d1722868d3	[Clang] Make AMDGPU OpenCL tests require AMD registered target Summary: These tests likely always failed, but the failures were hidden by the expected return value. Simply make them require AMDGPU as a registered target so they don't fail on other machines.	2024-02-05 09:08:31 -06:00
Joseph Huber	5249379d74	[AMDGPU] Allow w64 ballot to be used on w32 targets (#80183 ) Summary: Currently we cannot compile `__builtin_amdgcn_ballot_w64` on non-wave64 targets even though it is valid. This is relevant for making library code that can handle both without needing to check the wavefront size. This patch relaxes the semantic check for w64 so it can be used normally.	2024-02-05 08:42:28 -06:00
Mariusz Sikora	966416b9e8	[AMDGPU][GFX12] Add new v_permlane16 variants (#75475 )	2023-12-15 10:14:38 +01:00
Yaxun (Sam) Liu	00448a548c	[clang] Allow fp in atomic fetch max/min builtins LLVM IR already allows floating point type in atomicrmw. Update clang atomic fetch max/min builtins to accept floating point type like we did for fetch add/sub. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D150985 Fixes: SWDEV-401056	2023-05-31 15:19:31 -04:00
Tobias Hieta	dd3c26a045	[NFC][Py Reformat] Reformat python files in clang and clang-tools-extra This is an ongoing series of commits that are reformatting our Python code. Reformatting is done with `black`. If you end up having problems merging this commit because you have made changes to a python file, the best way to handle that is to run git checkout --ours <yourfile> and then reformat it with black. If you run into any problems, post to discourse about it and we will try to help. RFC Thread below: https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D150761	2023-05-23 08:29:52 +02:00
Aaron Ballman	5d8aaad445	[C2x] Implement support for empty brace initialization (WG14 N2900 and WG14 N3011) This implements support for allowing {} to consistently zero initialize objects. We already supported most of this work as a GNU extension, but the C2x feature goes beyond what the GNU extension allowed. The changes in this patch are: * Removed the -Wgnu-empty-initializer warning group. The extension is now a C2x extension warning instead. Note that use of `-Wno-gnu-empty-initializer` seems to be quite low in the wild (https://sourcegraph.com/search?q=context%3Aglobal+-file%3A.test.+%22-Wno-gnu-empty-initializer%22&patternType=standard&sm=1&groupBy=repo which currently only gives 8 hits total), so this is not expected to be an overly disruptive change. But I'm adding the clang vendors review group just in case this expectation is wrong. * Reworded the diagnostic wording to be about a C2x extension, added a pre-C2x compat warning. * Allow {} to zero initialize a VLA This functionality is exposed as an extension in all older C modes (same as the GNU extension was), but does not allow the extension for VLA initialization in C++ due to concern about handling non-trivially constructible types. Differential Revision: https://reviews.llvm.org/D147349	2023-04-03 15:22:52 -04:00
Ayal Zaks	eae70ccbf9	[Clang][OpenCL] Allow pointers in structs as kernel arguments from 2.0 Structs that contain global or local pointers can be passed as kernel arguments starting OpenCL v2.0 with the introduction of shared virtual memory. Differential Revision: https://reviews.llvm.org/D143849	2023-03-13 18:59:26 +02:00
Nikita Popov	8421307b6b	[Clang] Convert some tests to opaque pointers (NFC)	2023-02-16 15:48:10 +01:00
Sven van Haastregt	a60b8f4681	[OpenCL] Allow undefining header-only features `opencl-c-base.h` always defines 5 particular feature macros for SPIR-V, making it impossible to disable those features. To allow disabling any of those features, let the header recognize `__undef_<feature>` macros. The user can then pass the `-D__undef_<feature>` flag on the command line to disable a specific feature. The __undef macro could potentially also be set from `-cl-ext=-feature`, but for now only change the header and only provide __undef macros for the 5 features that are always enabled in `opencl-c-base.h`. Differential Revision: https://reviews.llvm.org/D141297	2023-01-16 11:32:12 +00:00
Matt Arsenault	e630d9b299	AMDGPU/clang: Remove target features from address space test builtins It turns out we can codegen these on targets without flat addressing, although the runtime probably didn't put anything useful there. The proper diagnostic would be to disallow flat pointer uses or languages with them, not this one edge case. Allows removing one of the special cases requiring subtarget support in the device libraries.	2022-12-29 18:46:41 -05:00
Matt Arsenault	f4bcd7f598	AMDGPU/clang: Add builtins for llvm.amdgcn.ballot Use explicit _w32/_w64 suffixes for the wave size to be consistent with the existing other wave dependent intrinsics. Also start diagnosing trying to use both wave32 and wave64. I would have preferred to avoid the +wavefrontsize64 spam on targets where that's the only option, but avoiding this seems to be more work than I expected.	2022-12-29 17:58:55 -05:00
Xiang Li	7e04c0ad63	[HLSL] Add groupshare address space. Added keyword, LangAS and TypeAttribute for groupshared. Translate it to LangAS with asHLSLLangAS. Make sure it is translated into address space 3 for DirectX target. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D135060	2022-10-20 09:29:09 -07:00

546 Commits