llvm-project

Author	SHA1	Message	Date
Chuanqi Xu	c467245474	[C++20] [Coroutines] Mark await_suspend as noinline if the awaiter is not empty Close https://github.com/llvm/llvm-project/issues/56301 Close https://github.com/llvm/llvm-project/issues/64151 See the summary and the discussion of https://reviews.llvm.org/D157070 to get the full context. As @rjmccall pointed out, the root cause is that we currently don't implement the semantics of '@llvm.coro.save' well ("after the await-ready returns false, the coroutine is considered to be suspended"). These semantics imply that we (the compiler) shouldn't write the spills into the coroutine frame in await_suspend. But this is currently possible due to some combinations of optimizations, so the semantics are broken. Inlining is the root enabler of such optimizations. So in this patch, we add the `noinline` attribute to the await_suspend call. Also, as an optimization, we don't add the `noinline` attribute to the await_suspend call if the awaiter is an empty class. This should be correct since programmers can't access the local variables in await_suspend if the awaiter is empty. I think this is necessary for performance since it is pretty common. Another potential optimization is: call @llvm.coro.await_suspend(ptr %awaiter, ptr %handle, ptr @awaitSuspendFn) Then it is much easier to perform the safety analysis in the middle end. If it is safe to inline the call to awaitSuspend, we can replace it in the CoroEarly pass. Otherwise we could replace it in the CoroSplit pass. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D157833	2023-08-22 09:56:44 +08:00
Erik Pilkington	e698695fbb	Reapply: [IRGen] Emit lifetime intrinsics around temporary aggregate argument allocas This reverts commit e26c24b849211f35a988d001753e0cd15e4a9d7b. These temporaries are only used in the callee, and their memory can be reused after the call is complete. rdar://58552124 Link: https://github.com/llvm/llvm-project/issues/38157 Link: https://github.com/llvm/llvm-project/issues/41896 Link: https://github.com/llvm/llvm-project/issues/43598 Link: https://github.com/ClangBuiltLinux/linux/issues/39 Link: https://reviews.llvm.org/rGfafc6e4fdf3673dcf557d6c8ae0c0a4bb3184402 Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D74094	2023-08-16 15:21:46 -07:00
Martin Storsjö	d60c3d08e7	[clang] Skip stores in init for fields that are empty structs An empty struct is handled as a struct with a dummy i8, on all targets. Most targets treat an empty struct return value as essentially void - but some don't. (Currently, at least x86_64-windows-* and powerpc64le-* don't treat it as void.) When initializing a struct with such a no_unique_address member, make sure we don't write the dummy i8 into the struct where there's no space allocated for it. Previously it would clobber the actual valid data of the struct. Fixes https://github.com/llvm/llvm-project/issues/64253, and possibly https://github.com/llvm/llvm-project/issues/64077 and https://github.com/llvm/llvm-project/issues/64427 as well. We should omit the store for any empty record (not only ones declared with no_unique_address); we can have a situation where a class doesn't have the no_unique_address attribute, but is embedded in an outer struct with the no_unique_address attribute - like this: struct S {}; S f(); struct S2 : public S { S2();}; S2::S2() : S(f()) {} struct S3 { int x; [[no_unique_address]] S2 y; S3(); }; S3::S3() : x(1), y() {} Here, the problematic store (which this patch omits) is in the constructor of S2. In the case of S3, S2 has no valid storage and aliases x - thus the constructor of S2 should omit the dummy store. Differential Revision: https://reviews.llvm.org/D157332	2023-08-15 10:59:23 +03:00
Changpeng Fang	d77c62053c	[clang][AMDGPU]: Don't use byval for struct arguments in function ABI Summary: Byval requires allocating additional stack space, and always requires an implicit copy to be inserted in codegen, where it can be difficult to optimize. In this work, we use byref/IndirectAliased promotion method instead of byval with the implicit copy semantics. Reviewers: arsenm Differential Revision: https://reviews.llvm.org/D155986	2023-08-11 16:37:42 -07:00
Matt Arsenault	25bc999d1f	Intrinsics: Add type overload to stacksave and stackstore This allows use with non-0 address space stacks. llvm_ptr_ty should never be used. This could use some more percolation up through mlir, but this is enough to fix existing tests. https://reviews.llvm.org/D156666	2023-08-09 18:33:11 -04:00
Sander de Smalen	28b5f3087a	[Clang][AArch64] Add/implement ACLE keywords for SME. This patch adds all the language-level function keywords defined in: https://github.com/ARM-software/acle/pull/188 (merged) https://github.com/ARM-software/acle/pull/261 (update after D148700 landed) The keywords are used to control PSTATE.ZA and PSTATE.SM, which are respectively used for enabling the use of the ZA matrix array and Streaming mode. This information needs to be available on call sites, since the use of ZA or streaming mode may have to be enabled or disabled around the call-site (depending on the IR attributes set on the caller and the callee). For calls to functions from a function pointer, there is no IR declaration available, so the IR attributes must be added explicitly to the call-site. With the exception of '__arm_locally_streaming' and '__arm_new_za' the information is part of the function's interface, not just the function definition, and thus needs to be propagated through the FunctionProtoType::ExtProtoInfo. This patch adds the definitions of these keywords, as well as codegen and semantic analysis to ensure conversions between function pointers are valid and that no conflicting keywords are set. For example, '__arm_streaming' and '__arm_streaming_compatible' are mutually exclusive. Differential Revision: https://reviews.llvm.org/D127762	2023-08-08 07:00:59 +00:00
Jon Roelofs	ed83797f3c	[Intrinsics][ObjC] Mark objc_retain and friends as thisreturn. https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retain rdar://79869679 Differential revision: https://reviews.llvm.org/D105671	2023-08-01 18:02:00 -07:00
Yaxun (Sam) Liu	ac72531043	[Driver] Add `-f[no-]offload-uniform-block` By default, clang assumes HIP kernels are launched with uniform block size, which is the case for kernels launched through triple chevron or hipLaunchKernelGGL. Clang adds uniform-work-group-size function attribute to HIP kernels to allow the backend to do optimizations on that. However, in some rare cases, HIP kernels can be launched through hipExtModuleLaunchKernel where global work size is specified, which may result in non-uniform block size. To be able to support non-uniform block size for HIP kernels, an option `-f[no-]offload-uniform-block` is added. This option is generic for offloading languages. Its default value is on for CUDA/HIP and off otherwise. Make -cl-uniform-work-group-size an alias to -foffload-uniform-block. Reviewed by: Siu Chi Chan, Matt Arsenault, Fangrui Song, Johannes Doerfert Differential Revision: https://reviews.llvm.org/D155213 Fixes: SWDEV-406592	2023-07-27 16:36:02 -04:00
Amy Huang	27dab4d305	Reland "Try to implement lambdas with inalloca parameters by forwarding without use of inallocas." This reverts commit 8ed7aa59f489715d39d32e72a787b8e75cfda151. Differential Revision: https://reviews.llvm.org/D154007	2023-07-26 16:13:36 -07:00
Craig Topper	d53d842d12	[RISCV][AArch64][IRGen] Add a special case to CodeGenFunction::EmitCall for scalable vector return being coerced to fixed vector. Before falling back to CreateCoercedStore, detect a scalable vector return being coerced to fixed vector. Handle it using a vector.extract intrinsic without going through memory. Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D155495	2023-07-18 10:04:33 -07:00
Craig Topper	e8dc9dcd7d	[IRGen] Remove 'Sve' from the name of some IR names that are shared with RISC-V now. Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D155220	2023-07-17 08:43:43 -07:00
Youngsuk Kim	6f986bffc5	[clang] Remove CGBuilderTy::CreateElementBitCast `CGBuilderTy::CreateElementBitCast()` no longer does what its name suggests. Remove remaining in-tree uses by one of the following methods. * drop the call entirely * fold it to an `Address` construction * replace it with `Address::withElementType()` This is a NFC cleanup effort. Reviewed By: barannikov88, nikic, jrtc27 Differential Revision: https://reviews.llvm.org/D154285	2023-07-02 10:40:16 -04:00
Elliot Goodrich	f0fa2d7c29	[llvm] Move AttributeMask to a separate header Move `AttributeMask` out of `llvm/IR/Attributes.h` to a new file `llvm/IR/AttributeMask.h`. After doing this we can remove the `#include <bitset>` and `#include <set>` directives from `Attributes.h`. Since there are many headers including `Attributes.h`, but not needing the definition of `AttributeMask`, this causes unnecessary bloating of the translation units and slows down compilation. This commit adds in the include directive for `llvm/IR/AttributeMask.h` to the handful of source files that need to see the definition. This reduces the total number of preprocessing tokens across the LLVM source files in lib from (roughly) 1,917,509,187 to 1,902,982,273 - a reduction of ~0.76%. This should result in a small improvement in compilation time. Differential Revision: https://reviews.llvm.org/D153728	2023-06-27 15:26:17 +01:00
Eduard Zingerman	06eee734c1	[clang] Allow 'nomerge' attribute for function pointers Allow specifying 'nomerge' attribute for function pointers, e.g. like in the following C code: extern void (*foo)(void) __attribute__((nomerge)); void bar(long i) { if (i) foo(); else foo(); } With the goal to attach 'nomerge' to both calls done through 'foo': @foo = external local_unnamed_addr global ptr, align 8 define dso_local void @bar(i64 noundef %i) local_unnamed_addr #0 { ; ... %0 = load ptr, ptr @foo, align 8, !tbaa !5 ; ... if.then: tail call void %0() #1 br label %if.end if.else: tail call void %0() #1 br label %if.end if.end: ret void } ; ... attributes #1 = { nomerge ... } Report a warning if 'nomerge' is specified for a variable that is not a function pointer, e.g.: t.c:2:22: warning: 'nomerge' attribute is ignored because 'j' is not a function pointer [-Wignored-attributes] 2 | int j __attribute__((nomerge)); | ^ The intended use case is the BPF backend. BPF provides a sort of "standard library" of functions that are called helpers. BPF also verifies usage of these helpers before program execution. Because of limitations of the verification / runtime model it is important to keep calls to some of these helpers from merging. An example can be found at link [1], where the input C code: if (data_end - data > 1024) { bpf_for_each_map_elem(&map1, cb, &cb_data, 0); } else { bpf_for_each_map_elem(&map2, cb, &cb_data, 0); } Is converted to bytecode equivalent to: if (data_end - data > 1024) tmp = &map1; else tmp = &map2; bpf_for_each_map_elem(tmp, cb, &cb_data, 0); However, BPF verification/runtime requires the same map address to be used for each particular `bpf_for_each_map_elem()` call. The 'nomerge' attribute is a perfect match for this situation, but unfortunately BPF helpers are declared as pointers to functions: static long (*bpf_for_each_map_elem)(void *map, ...) = (void *) 164; Hence this commit, which allows 'nomerge' to be used for function pointers.
[1] https://lore.kernel.org/bpf/03bdf90f-f374-1e67-69d6-76dd9c8318a4@meta.com/ Differential Revision: https://reviews.llvm.org/D152986	2023-06-27 01:15:45 +03:00
Amy Huang	8ed7aa59f4	Revert "Try to implement lambdas with inalloca parameters by forwarding without use of inallocas." Causes a clang crash (see crbug.com/1457256). This reverts commit 015049338d7e8e0e81f2ad2f94e5a43e2e3f5220.	2023-06-22 11:42:33 -07:00
Amy Huang	015049338d	Try to implement lambdas with inalloca parameters by forwarding without use of inallocas. Differential Revision: https://reviews.llvm.org/D137872	2023-06-20 17:30:20 -07:00
Joseph Huber	8784b6a854	[Clang] Allow bitcode linking when the input is LLVM-IR Clang provides the `-mlink-bitcode-file` and `-mlink-builtin-bitcode` options to insert LLVM-IR into the current TU. These are useful primarily for including LLVM-IR files that require special handling to be correct and cannot be linked normally, such as GPU vendor libraries like `libdevice.10.bc`. Currently these options can only be used if the source input goes through the AST consumer path. This patch makes the changes necessary to also support this when the input is LLVM-IR. This will allow the following operation: ``` clang in.bc -Xclang -mlink-builtin-bitcode -Xclang libdevice.10.bc ``` Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D152391	2023-06-20 08:02:58 -05:00
NAKAMURA Takumi	0cbbfb8c2e	[CGCall] Prune ArgStruct [-Wunused-variable] It has been unused since b92ccc355acb	2023-06-16 08:00:57 +09:00
Nikita Popov	b92ccc355a	[CGCall] Directly create opaque pointers (NFCI)	2023-06-15 10:06:40 +02:00
Nikita Popov	8a19af513d	[Clang] Remove uses of PointerType::getWithSamePointeeType (NFC) No longer relevant with opaque pointers.	2023-06-12 12:18:28 +02:00
Nikita Popov	2c44168381	[Clang] Remove typed pointer consistency assertions (NFC) These are no-ops with opaque pointers.	2023-06-09 09:45:43 +02:00
pvanhout	23431b5246	[clang][CodeGen] Fix GPU-specific attributes being dropped by bitcode linking Device libs make use of patterns like this: ``` __attribute__((target("gfx11-insts"))) static unsigned do_intrin_stuff(void) { return __builtin_amdgcn_s_sendmsg_rtnl(0x0); } ``` For functions that are assumed to be eliminated if the current GPU target doesn't support them. At O0 such functions aren't eliminated by common optimizations but often by AMDGPURemoveIncompatibleFunctions instead, which sees the "+gfx11-insts" attribute on, say, GFX9 and knows it's not valid, so it removes the function. D142907 accidentally made it so such attributes were dropped during bitcode linking, making it impossible for RemoveIncompatibleFunctions to catch the functions and causing ISel to catch fire eventually. This fixes the issue and adds a new test to ensure we don't accidentally fall into this trap again. Fixes SWDEV-403642 Reviewed By: arsenm, yaxunl Differential Revision: https://reviews.llvm.org/D152251	2023-06-07 15:51:52 +02:00
Manna, Soumi	02ce49afb9	[NFC][CLANG] Fix bug with dereference null return value in GetFunctionTypeForVTable() This patch uses castAs instead of getAs which will assert if the type doesn't match in clang::CodeGen::CodeGenTypes::GetFunctionTypeForVTable(clang::GlobalDecl). Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D151957	2023-06-02 13:28:06 -07:00
Dmitri Gribenko	daa95c7de5	[clang][analyzer][NFC] Remove unnecessary FALLTHROUGH markers They are redundant with the [[fallthrough]]; attribute that follows. Reviewed By: steakhal Differential Revision: https://reviews.llvm.org/D151723	2023-05-30 18:16:35 +02:00
Florian Hahn	f0687b47a0	[IRGen] Handle infinite cycles in findDominatingStoreToReturnValue. If there is an infinite cycle in the IR, the loop will never exit. Keep track of visited basic blocks in a set and return nullptr if a block is visited again. Fixes #62830. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D151076	2023-05-24 20:16:42 +01:00
eopXD	5e92298f76	[2/11][POC][Clang][RISCV] Define RVV tuple types For the cover letter of this patch-set, please checkout D146872. Depends on D146872. This is the 2nd patch of the patch-set. This patch originates from D97264. This patch further allows local variable declaration and function parameter passing by adjustment in clang lowering. Test cases are provided to demonstrate the LLVM IR generated. Note: This patch is currently only a proof-of-concept with only a single RVV tuple type declared here, the rest will be added when the concept of this patch-set is accepted. Authored-by: eop Chen <eop.chen@sifive.com> Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com> Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D146873	2023-05-22 00:50:40 -07:00
Matt Arsenault	bc37be1855	LangRef: Add "dynamic" option to "denormal-fp-math" This is stricter than the default "ieee", and should probably be the default. This patch leaves the default alone. I can change this in a future patch. There are non-reversible transforms I would like to perform which are legal under IEEE denormal handling, but illegal with flushing zero behavior. Namely, conversions between llvm.is.fpclass and fcmp with zeroes. Under "ieee" handling, it is legal to translate between llvm.is.fpclass(x, fcZero) and fcmp x, 0. Under "preserve-sign" handling, it is legal to translate between llvm.is.fpclass(x, fcSubnormal|fcZero) and fcmp x, 0. I would like to compile and distribute some math library functions in a mode where it's callable from code with and without denormals enabled, which requires not changing the compares with denormals or zeroes. If an IEEE function transforms an llvm.is.fpclass call into an fcmp 0, it is no longer possible to call the function from code with denormals enabled, or write an optimization to move the function into a denormal flushing mode. For the original function, if x was a denormal, the class would evaluate to false. If the function compiled with denormal handling was converted to or called from a preserve-sign function, the fcmp now evaluates to true. This could also be of use for strictfp handling, where code may be changing the denormal mode. Alternative name could be "unknown". Replaces the old AMDGPU custom inlining logic with more conservative logic which tries to permit inlining for callees with dynamic handling and avoids inlining other mismatched modes.	2023-04-29 08:44:59 -04:00
Harald van Dijk	6b86813945	[SYCL] Always set NoUnwind attribute for SYCL. Like CUDA and OpenCL, the SYCL specification says that throwing and catching exceptions in device functions is not supported, so this change extends the logic for adding the NoUnwind attribute to SYCL. The existing convergent.cpp test, which tests that the convergent attribute is added to functions by default, is renamed and reused to test that the nounwind attribute is added by default. This test now has -fexceptions added to it, which the driver adds by default as well. The obvious question here is why not simply change the driver to remove -fexceptions. This change follows the direction given by the TODO comment because removing -fexceptions would also disable the __EXCEPTIONS macro, which should reflect whether exceptions are enabled on the host, rather than on the device, to avoid conflicts in types shared between host and device. Reviewed By: bader Differential Revision: https://reviews.llvm.org/D147097	2023-03-30 02:18:52 +01:00
Qiu Chaofan	608212a0ff	[Clang] Check feature requirement from inlined callee Currently clang emits error when both always_inline and target attributes are on callee, but caller doesn't have some feature. This patch makes clang emit error when caller cannot meet target feature requirements from an always-inlined callee. Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D143479	2023-03-15 14:21:52 +08:00
Matt Arsenault	dd81810554	clang: Emit nofpclass(nan inf) for -ffinite-math-only Set this on any source level floating-point type argument, return value, call return or outgoing parameter which is lowered to a valid IR type for the attribute. Currently this isn't applied to emitted intrinsics since those don't go through ABI code.	2023-03-15 01:13:08 -04:00
Jacob Young	6740991135	[Clang][CodeGen] Fix this argument type for certain destructors With the Microsoft ABI, some destructors need to offset a parameter to get the derived this pointer, in which case the type of that parameter should not be a pointer to the derived type. Fixes #60465	2023-02-28 16:43:03 -08:00
Akira Hatanaka	57865bc5ad	[CodeGen] Add a flag to `Address` and `Lvalue` that is used to keep track of whether the pointer is known not to be null The flag will be used for the arm64e work we plan to upstream in the future (see https://lists.llvm.org/pipermail/llvm-dev/2019-October/136091.html). Currently the flag has no effect on code generation. Differential Revision: https://reviews.llvm.org/D142584	2023-02-15 10:15:13 -08:00
Francesco Petrogalli	20f3ebd258	[clang][CGCall] Remove header file not used. [NFCI] Reviewed By: fpetrogalli Differential Revision: https://reviews.llvm.org/D142976	2023-01-31 16:12:46 +01:00
Sven van Haastregt	1495210914	[OpenCL] Always add nounwind attribute for OpenCL Neither OpenCL nor C++ for OpenCL support exceptions, so add the `nounwind` attribute unconditionally for those languages. Differential Revision: https://reviews.llvm.org/D142033	2023-01-20 12:01:22 +00:00
Guillaume Chatelet	bf5c17ed0f	[clang][NFC] Remove dependency on DataLayout::getPrefTypeAlignment	2023-01-13 15:01:29 +00:00
Guillaume Chatelet	eaa1f46f11	[clang][NFC] Remove dependency on DataLayout::getPrefTypeAlignment	2023-01-13 13:19:19 +00:00
Guillaume Chatelet	6916ebd026	[clang][NFC] Use the TypeSize::getXXXValue() instead of TypeSize::getXXXSize() This change is one of a series to implement the discussion from https://reviews.llvm.org/D141134.	2023-01-11 16:07:48 +00:00
Matt Arsenault	ce6ae0b2a2	clang: Don't emit "frame-pointer"="none" This is the default behavior and cuts down on attribute spam. Probably should also do something to consolidate the option spellings; printing and parsing it is repeated in at least 3 different places. In the OpenMP tests, I had to manually delete some metadata check lines update_cc_test_checks was inserting that included the local build revision.	2023-01-03 19:42:46 -05:00
Dani Ferreira Franco Moura	0da4cecfb6	[clang][dataflow] Remove unused argument in getNullability This change will allow users to call getNullability() without providing an ASTContext. Reviewed By: gribozavr2 Differential Revision: https://reviews.llvm.org/D140104	2022-12-16 12:22:23 +01:00
Vitaly Buka	166c8cccde	[msan][CodeGen] Set noundef for C return value Msan needs noundef consistency between interface and implementation. If we call C++ from C we can have noundef on the C++ side and no noundef on the caller C side: the noundef implementation will not set TLS for the return value, but the no-noundef caller will expect it. Then we have false reports in msan. The workaround could be to set TLS to zero even for noundef return values. However, if we always do that it will increase binary size by about 10%. If we do that selectively we need to handle "address is taken" functions, any non-local functions, and probably all functions which have musttail callers. Which is still a lot. The existing implementation of HasStrictReturn refers to the C standard as the reason for not enforcing noundef. I believe it applies only to the case when the return statement is omitted. Testing on the Google codebase I never see such cases; however, I've seen tens of cases where C code returns actual uninitialized variables, but we ignore them because of the "omitted return" case. So this patch will: 1. fix false-positives with TLS mismatch. 2. detect bugs returning uninitialized variables for C as well. 3. report "omitted return" cases stricter than C, which is already a warning and very likely a bug in the code anyway. Reviewed By: kda Differential Revision: https://reviews.llvm.org/D139296	2022-12-05 22:58:29 -08:00
Vitaly Buka	a3b48e0945	[NFC][CodeGen] Extract HasStrictReturn	2022-12-04 19:10:17 -08:00
Vitaly Buka	487edfa432	[NFC][CodeGen] Add const to a method	2022-12-04 19:10:17 -08:00
Kazu Hirata	bb666c6930	[CodeGen] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-03 11:13:43 -08:00
Krzysztof Parzyszek	26424c96c0	Attributes: convert Optional to std::optional	2022-12-02 08:15:45 -06:00
Alex Richardson	f3a17d0595	[clang] Avoid duplicating ProgramAddressSpace in TargetInfo. NFCI This value was added to clang/Basic in D111566, but is only used during codegen, where we can use the LLVM IR DataLayout instead. I noticed this because the downstream CHERI targets would have to also set this value for AArch64/RISC-V/MIPS. Instead of duplicating more information between LLVM IR and Clang, this patch moves getTargetAddressSpace(QualType T) to CodeGenTypes, where we can consult the DataLayout. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D138296	2022-12-01 20:40:58 +00:00
Alex Richardson	a602f76a24	[clang][TargetInfo] Use LangAS for getPointer{Width,Align}() Mixing LLVM and Clang address spaces can result in subtle bugs, and there is no need for this hook to use the LLVM IR level address spaces. Most of this change is just replacing zero with LangAS::Default, but it also allows us to remove a few calls to getTargetAddressSpace(). This also removes a stale comment+workaround in CGDebugInfo::CreatePointerLikeType(): ASTContext::getTypeSize() does return the expected size for ReferenceType (and handles address spaces). Differential Revision: https://reviews.llvm.org/D138295	2022-11-30 20:24:01 +00:00
Michele Scandale	b7d7c448df	Fix `unsafe-fp-math` attribute emission. The conditions for which Clang emits the `unsafe-fp-math` function attribute has been modified as part of `84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7`. In the backend code generators `"unsafe-fp-math"="true"` enable floating point contraction for the whole function. The intent of the change in `84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7` was to prevent backend code generators performing contractions when that is not expected. However the change is inaccurate and incomplete because it allows `unsafe-fp-math` to be set also when only in-statement contraction is allowed. Consider the following example ``` float foo(float a, float b, float c) { float tmp = a * b; return tmp + c; } ``` and compile it with the command line ``` clang -fno-math-errno -funsafe-math-optimizations -ffp-contract=on \ -O2 -mavx512f -S -o - ``` The resulting assembly has a `vfmadd213ss` instruction which corresponds to a fused multiply-add. From the user perspective there shouldn't be any contraction because the multiplication and the addition are not in the same statement. The optimized IR is: ``` define float @test(float noundef %a, float noundef %b, float noundef %c) #0 { %mul = fmul reassoc nsz arcp afn float %b, %a %add = fadd reassoc nsz arcp afn float %mul, %c ret float %add } attributes #0 = { [...] "no-signed-zeros-fp-math"="true" "no-trapping-math"="true" [...] "unsafe-fp-math"="true" } ``` The `"unsafe-fp-math"="true"` function attribute allows the backend code generator to perform `(fadd (fmul a, b), c) -> (fmadd a, b, c)`. In the current IR representation there is no way to determine the statement boundaries from the original source code. Because of this for in-statement only contraction the generated IR doesn't have instructions with the `contract` fast-math flag and `llvm.fmuladd` is being used to represent contractions opportunities that occur within a single statement. 
Therefore `"unsafe-fp-math"="true"` can only be emitted when contraction across statements is allowed. Moreover the change in `84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7` doesn't take into account that the floating point math function attributes can be refined during IR code generation of a function to handle the cases where the floating point math options are modified within a compound statement via pragmas (see `CGFPOptionsRAII`). For consistency `unsafe-fp-math` needs to be disabled if the contraction mode for any scope/operation is not `fast`. Similarly, for consistency, the initialization of `UnsafeFPMath` in `TargetOptions` for the backend code generation should take into account the contraction mode as well. Reviewed By: zahiraam Differential Revision: https://reviews.llvm.org/D136786	2022-11-14 20:40:57 -08:00
Nikita Popov	304f1d59ca	[IR] Switch everything to use memory attribute This switches everything to use the memory attribute proposed in https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579. The old argmemonly, inaccessiblememonly and inaccessiblemem_or_argmemonly attributes are dropped. The readnone, readonly and writeonly attributes are restricted to parameters only. The old attributes are auto-upgraded both in bitcode and IR. The bitcode upgrade is a policy requirement that has to be retained indefinitely. The IR upgrade is mainly there so it's not necessary to update all tests using memory attributes in this patch, which is already large enough. We could drop that part after migrating tests, or retain it longer term, to make it easier to import IR from older LLVM versions. High-level Function/CallBase APIs like doesNotAccessMemory() or setDoesNotAccessMemory() are mapped transparently to the memory attribute. Code that directly manipulates attributes (e.g. via AttributeList) on the other hand needs to switch to working with the memory attribute instead. Differential Revision: https://reviews.llvm.org/D135780	2022-11-04 10:21:38 +01:00
Ellis Hoag	970e1ea01a	[clang] Fix crash with -funique-internal-linkage-names Calling `getFunctionLinkage(CalleeInfo.getCalleeDecl())` will crash when the declaration does not have a body, e.g., `extern void foo();`. Instead, we can use `isExternallyVisible()` to see if the declaration has internal linkage. I believe using `!isExternallyVisible()` is correct because the clang linkage must be `InternalLinkage` or `UniqueExternalLinkage`, both of which are "internal linkage" in llvm. `9c26f51f5e/clang/include/clang/Basic/Linkage.h (L28-L40)` Fixes https://github.com/llvm/llvm-project/issues/54139 Reviewed By: tmsriram Differential Revision: https://reviews.llvm.org/D135926	2022-10-17 08:57:23 -07:00
Kazu Hirata	41ac5d258d	[clang] Fix a warning This patch fixes: clang/lib/CodeGen/CGCall.cpp:1867:64: error: '&&' within '||' [-Werror,-Wlogical-op-parentheses]	2022-10-14 08:36:59 -07:00


1112 Commits