llvm-project

Author	SHA1	Message	Date
Shilei Tian	e997dca333	[OpenMP] Introduce the initial support for OpenMP kernel language (#66844 ) This patch starts the support for OpenMP kernel language, basically to write OpenMP target region in SIMT style, similar to kernel languages such as CUDA. What included in this first patch is the `ompx_bare` clause for `target teams` directive. When `ompx_bare` exists, globalization is disabled such that local variables will not be globalized. The runtime init/deinit function calls will not be emitted. That being said, almost all OpenMP executable directives are not supported in the region, such as parallel, task. This patch doesn't include the Sema checks for that, so the use of them is UB. Simple directives, such as atomic, can be used. We provide a set of APIs (for C, they are prefix with `ompx_`; for C++, they are in `ompx` namespace) to get thread id, block id, etc. For more details, you can refer to https://tianshilei.me/wp-content/uploads/llvm-hpc-2023.pdf.	2023-09-29 13:11:09 -04:00
Pavel Iliin	8ec50d6446	[AArch64] Fix FMV ifunc resolver usage on old Android APIs. Rename internal compiler-rt FMV functions. The patch fixes Function Multi Versioning features detection by ifunc resolver on Android API levels < 30. Ifunc hwcaps parameters are not supported on Android API levels 23-29, so all CPU features are set unsupported if they were not initialized before ifunc resolver call. There is no support for ifunc on Android API levels < 23, so Function Multi Versioning is disabled in this case. Also use two underscore prefix for FMV runtime support functions to avoid conflict with user program ones. Differential Revision: https://reviews.llvm.org/D158641	2023-09-29 17:10:48 +01:00
Jan Svoboda	8a2fb1391b	[clang] NFCI: Use `FileEntryRef` in `SourceManager::FileInfos` (#67742 )	2023-09-29 08:04:34 -07:00
Chuanqi Xu	cbbe555904	[C++20] [Modules] Generate init calls for the modules imported in GMF or PMF I just found that we didn't handle the imports in GMF of PMF when we're generating the init functions for the current module unit. This looks like a simple oversight and I'm going to fix that in this patch directly.	2023-09-29 22:16:31 +08:00
Chuanqi Xu	7e8a0e4bdc	[NFC] [C++20] [Modules] Rename NamedModuleHasInit to NamedModuleHasInit Address comments in https://github.com/llvm/llvm-project/pull/67638/files#r1340342453 to rename the field variable.	2023-09-29 21:49:10 +08:00
Jakub Chlanda	3f8d4a8ef2	Reland [NVPTX] Add support for maxclusterrank in launch_bounds (#66496 ) (#67667 ) This reverts commit 0afbcb20fd908f8bf9073697423da097be7db592.	2023-09-29 08:39:31 +02:00
Chuanqi Xu	989173c09c	[C++20] [Modules] Don't generate call to an imported module that dont init anything (#67638 ) Close https://github.com/llvm/llvm-project/issues/56794 And see https://github.com/llvm/llvm-project/issues/67582 for a detailed backgrond for the issue. As required by the Itanium ABI, the module units have to generate the initialization function. However, the importers are allowed to elide the call to the initialization function if they are sure the initialization function doesn't do anything. This patch implemented this semantics.	2023-09-28 23:29:24 +08:00
Nikita Popov	fb2bdbb83d	[CodeGen] Avoid use of ConstantExpr::getZExt() (NFC) Use the constant folding API instead. In preparation for dropping zext constant expressions.	2023-09-28 16:45:31 +02:00
Chuanqi Xu	9744909a12	[NFC] [C++20] [Modules] Refactor Module::getGlobalModuleFragment and Module::getPrivateModuleFragment The original implementation of `Module::getGlobalModuleFragment` and `Module::getPrivateModuleFragment` tried to find the global module fragment and the private module fragment by comparing strings, which smells bad. This patch tries to improve this.	2023-09-28 14:06:02 +08:00
Fangrui Song	0d8b864829	CGBuiltin: emit llvm.abs.* instead of neg+icmp+select for abs instcombine will combine neg+icmp+select to llvm.abs.. Let's just emit llvm.abs. in the first place.	2023-09-27 21:29:56 -07:00
Sam McCall	0afbcb20fd	Revert "[NVPTX] Add support for maxclusterrank in launch_bounds (#66496 )" This reverts commit dfab31b41b4988b6dc8129840eba68f0c36c0f13. SemaDeclAttr.cpp cannot depend on Basic's private headers (lib/Basic/Targets/NVPTX.h)	2023-09-27 10:59:04 +02:00
Jakub Chlanda	dfab31b41b	[NVPTX] Add support for maxclusterrank in launch_bounds (#66496 ) Since SM_90 CUDA supports specifying additional argument to the launch_bounds attribute: maxBlocksPerCluster, to express the maximum number of CTAs that can be part of the cluster. See: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#cluster-dimension-directives-maxclusterrank and https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#launch-bounds for details.	2023-09-27 08:51:26 +02:00
Zequan Wu	4d5d9a5390	Revert "[Coverage] Allow Clang coverage to be used with debug info correlation." This reverts commit 32db121b29f78e4c41116b2a8f1c730f9522b202 and subsequent commits. This causes time regression on llvm-cov even with debug info correlation off.	2023-09-26 20:57:09 -04:00
Arthur Eubanks	a42787d108	[clang] Add -mlarge-data-threshold for x86_64 medium code model (#66839 ) Error if not used with x86_64. Warn if not used with the medium code model (can update if other code models end up using this). Set TargetMachine option and add module flag.	2023-09-26 09:44:31 -07:00
Phoebe Wang	31631d307f	[X86][FP16] Add missing handling for FP16 constrained cmp intrinsics (#67400 )	2023-09-26 19:27:57 +08:00
Qiu Chaofan	3e97db89ae	[PowerPC] Emit IR module flag for current float abi This is part of the efforts adding .gnu_attribute support for PowerPC. In Clang, an extra metadata field will be added as float-abi to show current long double format. So backend can emit .gnu_attribute section data from this metadata. To avoid breaking existing behavior, the module metadata will only be emitted when this module makes use of long double. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D116016	2023-09-25 17:53:39 +08:00
Björn Pettersson	b4858c634e	[clang][CodeGen] Simplify code based on opaque pointers (#65624 ) - Update CodeGenTypeCache to use a single union for all pointers in address space zero. - Introduce a UnqualPtrTy in CodeGenTypeCache, and use that (for example instead of llvm::PointerType::getUnqual) in some places. - Drop some redundant bit/pointers casts from ptr to ptr.	2023-09-25 11:21:24 +02:00
Carlos Eduardo Seo	7523550853	[Clang][CodeGen] Add __builtin_bcopy (#67130 ) Add __builtin_bcopy to the list of GNU builtins. This was causing a series of test failures in glibc. Adjust the tests to reflect the changes in codegen. Fixes #51409. Fixes #63065.	2023-09-24 11:58:14 -03:00
Umesh Kalappa	2641d9b280	Propagate the volatile qualifier of exp to store /load operations . This changes to address the PR : 55207 We update the volatility on the LValue by looking at the LHS cast operation qualifier and propagate the RValue volatile-ness from the CGF data structure . Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D157890	2023-09-23 19:40:24 +05:30
Bruno Cardoso Lopes	34415fd611	[Clang][LLVM][Coroutines] Prevent __coro_gro from outliving __promise (#66706 ) When dealing with short-circuiting coroutines (e.g. expected), the deferred calls that resolve the get_return_object are currently being emitted after we delete the coroutine frame. This was caught by ASAN when using optimizations -O1 and above: optimizations after inlining would place the __coro_gro in the heap, and subsequent delete of the coroframe followed by the conversion -> BOOM. This patch forbids the GRO to be placed in the coroutine frame, by adding a new metadata node that can be attached to `alloca` instructions. Fix #49843	2023-09-21 22:52:05 -07:00
Amy Huang	03c698a431	[MSVC, ARM64] Add _Copy* and _Count* intrinsics (#66554 ) Implement the _Count* and _Copy* Windows ARM intrinsics: ``` double _CopyDoubleFromInt64(__int64) float _CopyFloatFromInt32(__int32) __int32 _CopyInt32FromFloat(float) __int64 _CopyInt64FromDouble(double) unsigned int _CountLeadingOnes(unsigned long) unsigned int _CountLeadingOnes64(unsigned __int64) unsigned int _CountLeadingSigns(long) unsigned int _CountLeadingSigns64(__int64) unsigned int _CountLeadingZeros(unsigned long) unsigned int _CountLeadingZeros64(unsigned __int64) unsigned int _CountOneBits(unsigned long) unsigned int _CountOneBits64(unsigned __int64) ``` Full list of intrinsics here: [https://learn.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics](https://learn.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics) Bug: [65405](https://github.com/llvm/llvm-project/issues/65405)	2023-09-21 14:34:59 -07:00
Fangrui Song	9ee65a7618	Revert "[Coverage] Fix -Wswitch after D138847" This reverts commit ca22d6e40508f6d24a9352835bda9c152e3eee1b. The base patch 618a22144db5e45da8c95dc22064103e1b5e5b71 has been reverted.	2023-09-20 14:45:19 -07:00
Fangrui Song	ca22d6e405	[Coverage] Fix -Wswitch after D138847	2023-09-20 14:20:58 -07:00
Alex Voicu	de018f5ca4	[clang][CodeGen] The `eh_typeid_for` intrinsic needs special care too (#65699 ) This change is symmetric with the one reviewed in <https://reviews.llvm.org/D157452> and handles the exception handling specific intrinsic, which slipped through the cracks, in the same way, by inserting an address-space cast iff RTTI is in a non-default AS.	2023-09-20 17:12:19 +01:00
Juan Manuel Martinez Caamaño	69183f8eb9	[NFC][Clang] Address reviews about overrideFunctionFeaturesWithTargetFeatures (#65938 ) Addressing remarks after merge of D159257 * Add comment * Remove irrelevant CHECKs from test * Simplify function * Use llvm::sort before setting target-features as it is done in CodeGenModeule	2023-09-20 13:37:13 +02:00
Zequan Wu	816144bfd2	[Coverage] Skip visiting ctor member initializers with invalid source locations.	2023-09-19 14:59:41 -04:00
Zahira Ammarguellat	a292e7edf8	Fix math-errno issue (#66381 ) Update handling of math errno. This change updates the logic for generation of math intrinics in place of math library function calls. The previous logic https://reviews.llvm.org/D151834 was incorrectly using intrinsics when math errno handling was needed at optimization levels above -O0. This also fixes issue mentioned in https://reviews.llvm.org/D151834 by @uabelho This is joint work with @andykaylor Andy.	2023-09-19 09:13:02 -04:00
Louis Dionne	a52560c8dd	[clang] Remove spurious trailing whitespace	2023-09-15 17:26:16 -04:00
Zequan Wu	0b8df841f9	[Coverage] Add coverage for constructor member initializers. (#66441 ) Before, constructor member initializers are shown as not covered. This adds coverage info for them.	2023-09-15 17:06:04 -04:00
Zequan Wu	32db121b29	[Coverage] Allow Clang coverage to be used with debug info correlation. Debug info correlation is an option in InstrProfiling pass, which is used by both IR instrumentation and front-end instrumentation. So, Clang coverage can also benefits the binary size saving from it. Reviewed By: ellis Differential Revision: https://reviews.llvm.org/D157913	2023-09-15 13:47:23 -04:00
Anton Korobeynikov	51d5d7bbae	Extend `retcon.once` coroutines lowering to optionally produce a normal result (#66333 ) One of the main user of these kind of coroutines is swift. There yield-once (`retcon.once`) coroutines are used to temporary "expose" pointers to internal fields of various objects creating borrow scopes. However, in some cases it might be useful also to allow these coroutines to produce a normal result, but there is no convenient way to represent this (as compared to switched-resume kind of coroutines where C++ `co_return` is transformed to a member / callback call on promise object). The extension is simple: we allow continuation function to have a non-void result and accept optional extra arguments via a special `llvm.coro.end.result` intrinsic that would essentially forward them as normal results.	2023-09-15 09:54:38 -07:00
Arthur Eubanks	0a1aa6cda2	[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295 ) This will make it easy for callers to see issues with and fix up calls to createTargetMachine after a future change to the params of TargetMachine. This matches other nearby enums. For downstream users, this should be a fairly straightforward replacement, e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive or s/CGFT_/CodeGenFileType::	2023-09-14 14:10:14 -07:00
Yaxun (Sam) Liu	d7e1932f85	[HIP] Fix comdat of template kernel handle (#66283 ) Currently, clang emits LLVM IR that fails verifier for the following code: ``` template<typename T> __global__ void foo(T x); void bar() { foo<<<1, 1>>>(0); } ``` This is due to clang putting the kernel handle for foo into comdat, which is not allowed, since the kernel handle is a declaration. The siutation is similar to calling a declaration-only template function. The callee will be a declaration in LLVM IR and won't be put into comdat. This is in contrast to calling a template function with body, which will be put into comdat. Fixes: SWDEV-419769	2023-09-14 15:56:02 -04:00
Matt Arsenault	ddc3346a6b	clang/AMDGPU: Fix accidental behavior change for __builtin_amdgcn_ldexph (#66340 )	2023-09-14 18:15:44 +03:00
Sergio Afonso	9058762789	[OpenMP][Flang][MLIR] Lowering of requires directive from MLIR to LLVM IR Default atomic ordering information is processed in the OpenMP dialect to LLVM IR lowering stage at every spot where an operation can be affected by it. The rest of clauses are stored globally in the OpenMPIRBuilderConfig object before starting that lowering stage, so that the OMPIRBuilder can conditionally modify code generation depending on these. At the end of the process, the omp.requires attribute is itself lowered into a global constructor that passes these clauses as flags to the OpenMP runtime. Depends on D147217, D147218 and D158278. Differential Revision: https://reviews.llvm.org/D147219	2023-09-14 10:35:44 +01:00
Sergio Afonso	094a63a20b	[OpenMP][OMPIRBuilder] OpenMPIRBuilder support for requires directive This patch updates the `OpenMPIRBuilderConfig` structure to hold all available 'requires' clauses, and it replicates part of the code generation for the 'requires' registration function from clang in the `OMPIRBuilder`, to be used with flang. Porting the rest of features of the clang implementation to the IRBuilder and sharing it between clang and flang remains for a future patch, due to the complexity of the logic selecting the attributes of the generated registration function. Differential Revision: https://reviews.llvm.org/D147217	2023-09-14 10:33:54 +01:00
Reid Kleckner	c8c075e876	[MS] Follow up fix to pass aligned args to variadic x86_32 functions (#65692 ) MSVC allows users to pass structures with required alignments greater than 4 to variadic functions. It does not pass them indirectly to correctly align them. Instead, it passes them directly with the usual 4 byte stack alignment. This change implements the same logic in clang on the passing side. The receiving side (va_arg) never implemented any of this indirect logic, so it doesn't need to be updated. This issue pre-existed, but @aaron.ballman noticed it when we started passing structs containing aligned fields indirectly in D152752.	2023-09-13 16:29:11 -07:00
Joshua Cranmer	bf49237103	[Clang] Enable -print-pipeline-passes in clang. Reviewed By: arsenm, aeubanks Differential Revision: https://reviews.llvm.org/D127221	2023-09-13 08:57:10 -07:00
CarolineConcatto	ee31ba0dd9	[AArch64][SME]Update intrinsic interface for ld1/st1 (#65582 ) The new ACLE PR#225[1] now combines the slice parameters for some builtins. Slice specifies the ZA slice number directly and needs to be explicity implemented by the "user" with the base register plus the immediate offset [1]https://github.com/ARM-software/acle/pull/225/files	2023-09-13 15:24:09 +01:00
Joseph Huber	1b7a095e27	[Clang][AMDGPU] Permit language address spaces for AMDGPU globals (#66205 ) Summary: Currently, there is an assertion that prevents us from emitting an AMDGPU global with a non-target specific address space (i.e. numerical attribute). I'm unsure what the original intentions of this assertion were, but we should be able to use OpenCL address spaces when compiling directly to AMDGPU from C++. This is permitted on NVPTX so I'm unsure what this assertion is guarding. The patch simply removes the assertion and adds a test to ensure that these emit the expected address spaces. Fixes https://github.com/llvm/llvm-project/issues/65069	2023-09-13 08:43:01 -05:00
Joseph Huber	49ff6a96a7	[Clang] Define AMDGPU ABI when referenced in CodeGen for ABI "none" (#66162 ) Summary: We use the `llvm.amgcn.abi.version` varaible to control code generation. This is emitted in every module now to indicate what should be used when compiling. Previously, the logic caused us to emit an external reference to this variable when creating the code for the `none` type. This would then cause us not to emit the actual definition. This patch refines the logic to create the external reference, and then update it if it is found unset by the time we emit the global. I had to remove the reference to `GetOrCreateLLVmGlobal` because it did not accept the proper address space.	2023-09-13 08:31:31 -05:00
Benjamin Kramer	88b7e06dcf	Revert "[clang][CodeGen] Emit annotations for function declarations." This reverts commit c6a33ff49dfb3498dae15c718820ea3d9c19f3cb. Makes clang segfault. // clang t.cc class a; class c { public: [[clang::annotate("")]] c(const c ) {} }; class d { d(const c , a , a ); c e; }; d::d(const c f, a , a *) : e(f) {}	2023-09-13 13:22:57 +02:00
Aaron Jarmusch	131ba0ae01	Revert "[Clang][OpenMP] Clang adding the addrSpace according to DataLayout fix (#65483 )" This reverts commit e831a32c93c1ab404785773cc7c08c01730d61e5.	2023-09-12 22:46:09 +00:00
Aaron Jarmusch	e3298bb275	fixup! [Clang][OpenMP] Clang adding the addrSpace according to DataLayout fix (#65483 )	2023-09-12 20:52:33 +00:00
Brendan Dahl	c6a33ff49d	[clang][CodeGen] Emit annotations for function declarations. Previously, annotations were only emitted for function definitions. With this change annotations are also emitted for declarations. Also, emitting function annotations is now deferred until the end so that the most up to date declaration is used which will have any inherited annotations. Differential Revision: https://reviews.llvm.org/D156172/new/	2023-09-12 13:07:55 -07:00
Aaron Jarmusch	e831a32c93	[Clang][OpenMP] Clang adding the addrSpace according to DataLayout fix (#65483 ) Fix for an issue where clang was not adding the address space according to the data layout, instead was using the default which resulted in a crash at times. The fix includes changes to the cases of LargeCapMemAlloc and CGroupMemAlloc where we are setting the AddrSpace according to the DataLayout.	2023-09-12 15:44:39 -04:00
CarolineConcatto	dc8d2ecc5e	[AArch64][SME]Update intrinsic interface for read/write (#65594 ) The new ACLE PR#225[1] now combines the slice parameters for some builtins. This patch is the #2 of 3 patches to update the interface. Slice specifies the ZA slice number directly and needs to be explicity implemented by the "user" with the base register plus the immediate offset [1]https://github.com/ARM-software/acle/pull/225/files	2023-09-12 18:08:57 +01:00
CarolineConcatto	7b8d4eff02	[AArch64][SME]Update intrinsic interface for ldr/str (#65593 ) The new ACLE PR#225[1] now combines the slice parameters for some builtins. [1]https://github.com/ARM-software/acle/pull/225/files	2023-09-12 17:31:51 +01:00
Adrian Prantl	167acac417	Propagate the DWARF version from the main compiler invocation to PCHC… (#66032 ) …ontainerGenerator Currently it remains uninitialized and thus always uses the LLVM default of 4.	2023-09-12 08:31:27 -07:00
Max Iyengar	dbeb3d029d	Add missing vrnd intrinsics This patch adds 8 missing intrinsics as specified in the Arm ACLE document section 2.12.1.1 : [[ https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#rounding-3 \| https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#rounding-3]] The intrinsics implemented are: - vrnd32z_f64 - vrnd32zq_f64 - vrnd64z_f64 - vrnd64zq_f64 - vrnd32x_f64 - vrnd32xq_f64 - vrnd64x_f64 - vrnd64xq_f64 Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D158626	2023-09-11 12:59:18 +01:00

1 2 3 4 5 ...

16346 Commits