llvm-project

Author	SHA1	Message	Date
Benjamin Maxwell	db6f627f3f	[clang][SME] Ignore flatten/clang::always_inline statements for callees with mismatched streaming attributes (#116391 ) If `__attribute__((flatten))` is used on a function, or `[[clang::always_inline]]` on a statement, don't inline any callees with incompatible streaming attributes. Without this check, clang may produce incorrect code when these attributes are used in code with streaming functions. Note: The docs for flatten say it can be ignored when inlining is impossible: "causes calls within the attributed function to be inlined unless it is impossible to do so". Similarly, the (clang-only) `[[clang::always_inline]]` statement attribute is more relaxed than the GNU `__attribute__((always_inline))` (which says it should error it if it can't inline), saying only "If a statement is marked [[clang::always_inline]] and contains calls, the compiler attempts to inline those calls.". The docs also go on to show an example of where `[[clang::always_inline]]` has no effect.	2024-11-26 14:26:34 +00:00
Matt Arsenault	51b4ada458	clang/AMDGPU: Set noalias.addrspace metadata on atomicrmw (#102462 )	2024-10-17 17:10:45 +04:00
Helena Kotas	becb03f3c6	[DirectX] Add DirectXTargetCodeGenInfo (#104856 ) Adds target codegen info class for DirectX. For now it always translates `__hlsl_resource_t` handle to `target("dx.TypedBuffer", i32, 1, 0, 1)` (`RWBuffer<int>`). More work is needed to determine the actual target exp type and parameters based on the resource handle attributes. Part 1/2 of #95952	2024-09-10 12:41:08 -07:00
Daniel Kiss	9e9fa00dcb	[Arm][AArch64][Clang] Respect function's branch protection attributes. (#101978 ) Default attributes assigned to all functions according to the command line parameters. Some functions might have their own attributes and we need to set or remove attributes accordingly. Tests are updated to test this scenarios too.	2024-08-09 17:51:38 +02:00
Helena Kotas	52956b0f70	[HLSL] Implement intangible AST type (#97362 ) HLSL has a set of intangible types which are described in in the [draft HLSL Specification ([Basic.types])](https://microsoft.github.io/hlsl-specs/specs/hlsl.pdf): There are special implementation-defined types such as handle types, which fall into a category of standard intangible types. Intangible types are types that have no defined object representation or value representation, as such the size is unknown at compile time. A class type T is an intangible class type if it contains an base classes or members of intangible class type, standard intangible type, or arrays of such types. Standard intangible types and intangible class types are collectively called intangible types([9](https://microsoft.github.io/hlsl-specs/specs/hlsl.html#Intangible)). This PR implements one standard intangible type `__hlsl_resource_t` and sets up the infrastructure that will make it easier to add more in the future, such as samplers or raytracing payload handles. The HLSL intangible types are declared in `clang/include/clang/Basic/HLSLIntangibleTypes.def` and this file is included with related macro definition in most places that require edits when a new type is added. The new types are added as keywords and not typedefs to make sure they cannot be redeclared, and they can only be declared in builtin implicit headers. The `__hlsl_resource_t` type represents a handle to a memory resource and it is going to be used in builtin HLSL buffer types like this: template <typename T> class RWBuffer { [[hlsl::contained_type(T)]] [[hlsl::is_rov(false)]] [[hlsl::resource_class(uav)]] __hlsl_resource_t Handle; }; Part 1/3 of llvm/llvm-project#90631. --------- Co-authored-by: Justin Bogner <mail@justinbogner.com>	2024-08-05 10:50:34 -07:00
Matt Arsenault	e108853ac8	clang: Allow targets to set custom metadata on atomics (#96906 ) Use this to replace the emission of the amdgpu-unsafe-fp-atomics attribute in favor of per-instruction metadata. In the future new fine grained controls should be introduced that also cover the integer cases. Add a wrapper around CreateAtomicRMW that appends the metadata, and update a few use contexts to use it.	2024-07-26 09:57:28 +04:00
Daniil Kovalev	146fd7cd45	[PAC][Driver] Support `pauthtest` ABI for AArch64 Linux triples (#97237 ) When `pauthtest` is either passed as environment part of AArch64 Linux triple or passed via `-mabi=`, enable the following ptrauth flags: - `intrinsics`; - `calls`; - `returns`; - `auth-traps`; - `vtable-pointer-address-discrimination`; - `vtable-pointer-type-discrimination`; - `init-fini`. Some related stuff is still subject to change, and the ABI itself might be changed, so end users are not expected to use this and the ABI name has 'test' suffix. If `-mabi=pauthtest` option is used, it's normalized to effective triple. When the environment part of the effective triple is `pauthtest`, try to use `aarch64-linux-pauthtest` as multilib directory. The following is not supported: - combination of `pauthtest` ABI with any branch protection scheme except BTI; - explicit set of environment part of the triple to a value different from `pauthtest` in combination with `-mabi=pauthtest`; - usage on non-Linux OS. --------- Co-authored-by: Anatoly Trosinenko <atrosinenko@accesssoftek.com>	2024-07-22 21:18:39 +03:00
Daniel Kiss	f34a1654d6	[NFC][Clang] Move set functions out BranchProtectionInfo. (#98451 ) To reduce build times move them to TargetCodeGenInfo. Refactor of #98329	2024-07-12 10:58:34 +02:00
ostannard	1fd196c8df	[AArch64] Diagnose more functions when FP not enabled (#90832 ) When using a hard-float ABI for a target without FP registers, it's not possible to correctly generate code for functions with arguments which must be passed in floating-point registers. This is diagnosed in CodeGen instead of Sema, to more closely match GCC's behaviour around inline functions, which is relied on by the Linux kernel. Previously, this only checked function signatures as they were code-generated, but this missed some cases: * Calls to functions not defined in this translation unit. * Calls through function pointers. * Calls to variadic functions, where the variadic arguments have a floating-point type. This adds checks to function calls, as well as definitions, so that these cases are correctly diagnosed.	2024-05-07 09:17:05 +01:00
Akira Hatanaka	84780af4b0	[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86923 ) To authenticate pointers, CodeGen needs access to the key and discriminators that were used to sign the pointer. That information is sometimes known from the context, but not always, which is why `Address` needs to hold that information. This patch adds methods and data members to `Address`, which will be needed in subsequent patches to authenticate signed pointers, and uses the newly added methods throughout CodeGen. Although this patch isn't strictly NFC as it causes CodeGen to use different code paths in some cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any changes in functionality as it doesn't add any information needed for authentication. In addition to the changes mentioned above, this patch introduces class `RawAddress`, which contains a pointer that we know is unsigned, and adds several new functions for creating `Address` and `LValue` objects. This reapplies d9a685a9dd589486e882b722e513ee7b8c84870c, which was reverted because it broke ubsan bots. There seems to be a bug in coroutine code-gen, which is causing EmitTypeCheck to use the wrong alignment. For now, pass alignment zero to EmitTypeCheck so that it can compute the correct alignment based on the passed type (see function EmitCXXMemberOrOperatorMemberCallExpr).	2024-03-28 06:54:36 -07:00
Akira Hatanaka	f75eebab88	Revert "[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86721 )" (#86898 ) This reverts commit d9a685a9dd589486e882b722e513ee7b8c84870c. The commit broke ubsan bots.	2024-03-27 18:14:04 -07:00
Akira Hatanaka	d9a685a9dd	[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86721 ) To authenticate pointers, CodeGen needs access to the key and discriminators that were used to sign the pointer. That information is sometimes known from the context, but not always, which is why `Address` needs to hold that information. This patch adds methods and data members to `Address`, which will be needed in subsequent patches to authenticate signed pointers, and uses the newly added methods throughout CodeGen. Although this patch isn't strictly NFC as it causes CodeGen to use different code paths in some cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any changes in functionality as it doesn't add any information needed for authentication. In addition to the changes mentioned above, this patch introduces class `RawAddress`, which contains a pointer that we know is unsigned, and adds several new functions for creating `Address` and `LValue` objects. This reapplies 8bd1f9116aab879183f34707e6d21c7051d083b6. The commit broke msan bots because LValue::IsKnownNonNull was uninitialized.	2024-03-27 12:24:49 -07:00
Akira Hatanaka	b311756450	Revert "[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#67454 )" (#86674 ) This reverts commit 8bd1f9116aab879183f34707e6d21c7051d083b6. It appears that the commit broke msan bots.	2024-03-26 07:37:57 -07:00
Akira Hatanaka	8bd1f9116a	[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#67454 ) To authenticate pointers, CodeGen needs access to the key and discriminators that were used to sign the pointer. That information is sometimes known from the context, but not always, which is why `Address` needs to hold that information. This patch adds methods and data members to `Address`, which will be needed in subsequent patches to authenticate signed pointers, and uses the newly added methods throughout CodeGen. Although this patch isn't strictly NFC as it causes CodeGen to use different code paths in some cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any changes in functionality as it doesn't add any information needed for authentication. In addition to the changes mentioned above, this patch introduces class `RawAddress`, which contains a pointer that we know is unsigned, and adds several new functions for creating `Address` and `LValue` objects.	2024-03-25 18:05:42 -07:00
ostannard	ef395a492a	[AArch64] Add soft-float ABI (#84146 ) This is re-working of #74460, which adds a soft-float ABI for AArch64. That was reverted because it causes errors when building the linux and fuchsia kernels. The problem is that GCC's implementation of the ABI compatibility checks when using the hard-float ABI on a target without FP registers does it's checks after optimisation. The previous version of this patch reported errors for all uses of floating-point types, which is stricter than what GCC does in practice. This changes two things compared to the first version: * Only check the types of function arguments and returns, not the types of other values. This is more relaxed than GCC, while still guaranteeing ABI compatibility. * Move the check from Sema to CodeGen, so that inline functions are only checked if they are actually used. There are some cases in the linux kernel which depend on this behaviour of GCC.	2024-03-19 13:58:51 +00:00
Prabhuk	ea9ec80b7a	Revert "[AArch64] Add soft-float ABI (#74460 )" (#82032 ) This reverts commit 9cc98e336980f00cbafcbed8841344e6ac472bdc. Issue: https://github.com/ClangBuiltLinux/linux/issues/1997	2024-02-16 16:43:50 -08:00
ostannard	9cc98e3369	[AArch64] Add soft-float ABI (#74460 ) This adds support for the AArch64 soft-float ABI. The specification for this ABI was added by https://github.com/ARM-software/abi-aa/pull/232. Because all existing AArch64 hardware has floating-point hardware, we expect this to be a niche option, only used for embedded systems on R-profile systems. We are going to document that SysV-like systems should only ever use the base (hard-float) PCS variant: https://github.com/ARM-software/abi-aa/pull/233. For that reason, I've not added an option to select the ABI independently of the FPU hardware, instead the new ABI is enabled iff the target architecture does not have an FPU. For testing, I have run this through an ABI fuzzer, but since this is the first implementation it can only test for internal consistency (callers and callees agree on the PCS), not for conformance to the ABI spec.	2024-02-15 12:39:16 +00:00
Wang Pengcheng	3ac9fe69f7	[RISCV] CodeGen of RVE and ilp32e/lp64e ABIs (#76777 ) This commit includes the necessary changes to clang and LLVM to support codegen of `RVE` and the `ilp32e`/`lp64e` ABIs. The differences between `RVE` and `RVI` are: * `RVE` reduces the integer register count to 16(x0-x16). * The ABI should be `ilp32e` for 32 bits and `lp64e` for 64 bits. `RVE` can be combined with all current standard extensions. The central changes in ilp32e/lp64e ABI, compared to ilp32/lp64 are: * Only 6 integer argument registers (rather than 8). * Only 2 callee-saved registers (rather than 12). * A Stack Alignment of 32bits (rather than 128bits). * ilp32e isn't compatible with D ISA extension. If `ilp32e` or `lp64` is used with an ISA that has any of the registers x16-x31 and f0-f31, then these registers are considered temporaries. To be compatible with the implementation of ilp32e in GCC, we don't use aligned registers to pass variadic arguments and set stack alignment\ to 4-bytes for types with length of 2*XLEN. FastCC is also supported on RVE, while GHC isn't since there is only one avaiable register. Differential Revision: https://reviews.llvm.org/D70401	2024-01-16 20:44:30 +08:00
Saiyedul Islam	f616c3eeb4	[OpenMP][DeviceRTL][AMDGPU] Support code object version 5 Update DeviceRTL and the AMDGPU plugin to support code object version 5. Default is code object version 4. CodeGen for __builtin_amdgpu_workgroup_size generates code for cov4 as well as cov5 if -mcode-object-version=none is specified. DeviceRTL compilation passes this argument via Xclang option to generate abi-agnostic code. Generated code for the above builtin uses a clang control constant "llvm.amdgcn.abi.version" to branch on the abi version, which is available during linking of user's OpenMP code. Load of this constant gets eliminated during linking. AMDGPU plugin queries the ELF for code object version and then prepares various implicitargs accordingly. Differential Revision: https://reviews.llvm.org/D139730 Reviewed By: jhuber6, yaxunl	2023-08-29 06:35:44 -05:00
Sergei Barannikov	63534779b4	[clang][CodeGen] Break up TargetInfo.cpp [7/8] Wrap calls to XXXTargetCodeGenInfo constructors into factory functions. This allows moving implementations of TargetCodeGenInfo to dedicated cpp files without a change. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D150215	2023-06-17 07:14:43 +03:00
Sergei Barannikov	e8bd2a5784	[clang][CodeGen] Break up TargetInfo.cpp [6/8] Make `qualifyWindowsLibrary` and `addStackProbeTargetAttributes` protected members of `TargetCodeGenInfo`. These are helper functions used by `getDependentLibraryOption` and `setTargetAttributes` methods when targeting Windows. The change will allow these functions to be reused after splitting `TargetInfo.cpp`. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D150178	2023-06-04 14:35:32 +03:00
Sergei Barannikov	0a86e05d1d	[clang][CodeGen] Break up TargetInfo.cpp [4/8] Remove `getABIInfo` overrides returning references to target-specific implementations of `ABIInfo`. The methods may be convenient, but they are only used in one place and prevent from `ABIInfo` implementations from being put into anonymous namespaces in different cpp files. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D148092	2023-05-19 23:55:15 +03:00
Fangrui Song	ad31a2dcad	Change -fsanitize=function to place two words before the function entry The current implementation of -fsanitize=function places two words (the prolog signature and the RTTI proxy) at the function entry, which makes the feature incompatible with Intel Indirect Branch Tracking (IBT) that needs an ENDBR instruction at the function entry. To allow the combination, move the two words before the function entry, similar to -fsanitize=kcfi. Armv8.5 Branch Target Identification (BTI) has a similar requirement. Note: for IBT and BTI, whether a function gets a marker instruction at the entry generally cannot be assumed (it can be disabled by a function attribute or stronger LTO optimizations). It is extremely unlikely for two words preceding a function entry to be inaccessible. One way to achieve this is by ensuring that a function is aligned at a page boundary and making the preceding page unmapped or unreadable. This is not reasonable for application or library code. (Think: the first text section has crt* code not instrumented by -fsanitize=function.) We use 0xc105cafe for all targets. .long 0xc105cafe disassembles to invalid instructions on all architectures I have tested, except Power where it is `lfs 8, -13570(5)` (Load Floating-Point with a weird offset, unlikely to be used in real code). --- For the removed function in AsmPrinter.cpp, remove an assert: `mdconst::extract` already asserts non-nullness. For compiler-rt/test/ubsan/TestCases/TypeCheck/Function/function.cpp, when the function doesn't have prolog/epilog (-O1 and above), after moving the two words, the address of the function equals the address of ret instruction, so symbolizing the function will additionally get a non-zero column number. Adjust the test to allow an optional column number. ``` .long 3238382334 .long .L__llvm_rtti_proxy-_Z1fv _Z1fv: // symbolizing here retrieves the line table entry from the second .loc .file 0 ... .loc 0 1 0 .cfi_startproc .loc 0 2 1 prologue_end retq ``` Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D148665	2023-05-19 07:50:29 -07:00
Juan Manuel MARTINEZ CAAMAÑO	488185cca3	[Clang][DebugInfo][AMDGPU] Emit zero size bitfields in the debug info to delimit bitfields in different allocation units. Consider the following sturctures when targetting: struct foo { int space[4]; char a : 8; char b : 8; char x : 8; char y : 8; }; struct bar { int space[4]; char a : 8; char b : 8; char : 0; char x : 8; char y : 8; }; Even if both structs have the same layout in memory, they are handled differenlty by the AMDGPU ABI. With the following code: // clang --target=amdgcn-amd-amdhsa -g -O1 example.c -S char use_foo(struct foo f) { return f.y; } char use_bar(struct bar b) { return b.y; } For use_foo, the 'y' field is passed in v4 ; v_ashrrev_i32_e32 v0, 24, v4 ; s_setpc_b64 s[30:31] For use_bar, the 'y' field is passed in v5 ; v_bfe_i32 v0, v5, 8, 8 ; s_setpc_b64 s[30:31] To make this distinction, we record a single 0-size bitfield for every member that is preceded by it. Reviewed By: probinson Differential Revision: https://reviews.llvm.org/D144870	2023-03-28 10:07:32 +02:00
Paulo Matos	8d0c889752	[clang][WebAssembly] Initial support for reference type funcref in clang This is the funcref counterpart to 890146b. We introduce a new attribute that marks a function pointer as a funcref. It also implements builtin __builtin_wasm_ref_null_func(), that returns a null funcref value. Differential Revision: https://reviews.llvm.org/D128440	2023-03-17 18:31:44 +01:00
Joshua Cranmer	bcad161db3	[Clang][SPIR-V] Emit target extension types for OpenCL types on SPIR-V. Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D141008	2023-03-13 14:20:24 -04:00
Paulo Matos	890146b192	[WebAssembly] Initial support for reference type externref in clang This patch introduces a new type __externref_t that denotes a WebAssembly opaque reference type. It also implements builtin __builtin_wasm_ref_null_extern(), that returns a null value of __externref_t. This lays the ground work for further builtins and reference types. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D122215	2023-02-17 18:48:48 -08:00
Vitaly Buka	bccf5999d3	Revert "[clang][WebAssembly] Initial support for reference type externref in clang" Very likely breaks stage 3 of msan build bot. Good: 764c88a50ac76a2df2d051a0eb5badc6867aabb6 https://lab.llvm.org/buildbot/#/builders/74/builds/17058 Looks unrelated: 48b5a06dfcab12cf093a1a3df42cb5b684e2be4c Bad: 48b5a06dfcab12cf093a1a3df42cb5b684e2be4c https://lab.llvm.org/buildbot/#/builders/74/builds/17059 This reverts commit eb66833d19573df97034a81279eda31b8d19815b.	2023-02-05 21:41:48 -08:00
Paulo Matos	eb66833d19	[clang][WebAssembly] Initial support for reference type externref in clang This patch introduces a new type __externref_t that denotes a WebAssembly opaque reference type. It also implements builtin __builtin_wasm_ref_null_extern(), that returns a null value of __externref_t. This lays the ground work for further builtins and reference types. Differential Revision: https://reviews.llvm.org/D122215	2023-01-31 17:34:01 +01:00
Matt Arsenault	52c28d7cf9	clang/OpenCL: Don't use a Function for the block type The AMDGPU value for this is not really a function. Currently we're emitting IR that isn't true to what will eventually be emitted.	2023-01-30 15:03:14 -04:00
Sergei Barannikov	87dc7d4b61	[clang][CodeGen] Factor out Swift ABI hooks (NFCI) Swift calling conventions stands out in the way that they are lowered in mostly target-independent manner, with very few customization points. As such, swift-related methods of ABIInfo do not reference the rest of ABIInfo and vice versa. This change follows interface segregation principle; it removes dependency of SwiftABIInfo on ABIInfo. Targets must now implement SwiftABIInfo separately if they support Swift calling conventions. Almost all targets implemented `shouldPassIndirectly` the same way. This de-facto default implementation has been moved into the base class. `isSwiftErrorInRegister` used to be virtual, now it is not. It didn't accept any arguments which could have an effect on the returned value. This is now a static property of the target ABI. Reviewed By: rusyaev-roman, inclyc Differential Revision: https://reviews.llvm.org/D130394	2022-08-08 00:23:23 +08:00
Sergei Barannikov	37502e042f	[clang][CodeGen] Only include ABIInfo.h where required (NFC) Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D130322	2022-07-22 10:45:02 -07:00
Nikita Popov	6e1e99dc07	[CodeGen] Avoid pointer element type access for blocks Pass the block struct type down to the TargetInfo hooks.	2022-03-17 16:56:31 +01:00
Kazu Hirata	d1b127b5b7	[clang] Remove unused forward declarations (NFC)	2022-01-08 11:56:40 -08:00
Alexandros Lamprineas	29b263a34f	[Clang][AArch64] Inline assembly support for the ACLE type 'data512_t' In LLVM IR terms the ACLE type 'data512_t' is essentially an aggregate type { [8 x i64] }. When emitting code for inline assembly operands, clang tries to scalarize aggregate types to an integer of the equivalent length, otherwise it passes them by-reference. This patch adds a target hook to tell whether a given inline assembly operand is scalarizable so that clang can emit code to pass/return it by-value. Differential Revision: https://reviews.llvm.org/D94098	2021-07-31 09:51:28 +01:00
Jonas Paulsson	e57bd1ff4f	[CFE, SystemZ] New target hook testFPKind() for checks of FP values. The recent commit 00a6254 "Stop traping on sNaN in builtin_isnan" changed the lowering in constrained FP mode of builtin_isnan from an FP comparison to integer operations to avoid trapping. SystemZ has a special instruction "Test Data Class" which is the preferred way to do this check. This patch adds a new target hook "testFPKind()" that lets SystemZ emit the s390_tdc intrinsic instead. testFPKind() takes the BuiltinID as an argument and is expected to soon handle more opcodes than just 'builtin_isnan'. Review: Thomas Preud'homme, Ulrich Weigand Differential Revision: https://reviews.llvm.org/D96568	2021-02-18 12:36:46 -06:00
Akira Hatanaka	41b1e97b12	[CodeGen][ObjC] Mark calls to objc_unsafeClaimAutoreleasedReturnValue as notail on x86-64 This is needed because the epilogue code inserted before tail calls on x86-64 breaks the handshake between the caller and callee. Calls to objc_retainAutoreleasedReturnValue used to have the same problem, which was fixed in https://reviews.llvm.org/D59656. rdar://problem/66029552 Differential Revision: https://reviews.llvm.org/D84540	2020-08-03 13:25:25 -07:00
Erich Keane	2831a317b6	Implement AVX ABI Warning/error The x86-64 "avx" feature changes how >128 bit vector types are passed, instead of being passed in separate 128 bit registers, they can be passed in 256 bit registers. "avx512f" does the same thing, except it switches from 256 bit registers to 512 bit registers. The result of both of these is an ABI incompatibility between functions compiled with and without these features. This patch implements a warning/error pair upon an attempt to call a function that would run afoul of this. First, if a function is called that would have its ABI changed, we issue a warning. Second, if said call is made in a situation where the caller and callee are known to have different calling conventions (such as the case of 'target'), we instead issue an error. Differential Revision: https://reviews.llvm.org/D82562	2020-07-01 07:14:31 -07:00
Nigel Perks	dc3f8913d2	Fix crash on XCore on unused inline in EmitTargetMetadata EmitTargetMetadata passed to emitTargetMD a null pointer as returned from GetGlobalValue, for an unused inline function which has been removed from the module at that point. A FIXME in CodeGenModule.cpp commented that the calling code in EmitTargetMetadata should be moved into the one target that needs it (XCore). A review comment agreed. So the calling loop has been moved into the XCore subclass. The check for null is done in that loop. Differential Revision: https://reviews.llvm.org/D77068	2020-06-24 12:48:17 -07:00
jasonliu	e0c356582d	[NFC][clang] Replace raw new/delete with unique_ptr to store ABIInfo in TargetCodeGenInfo Use unique_ptr to manage the lifetime of ABIInfo member inside TargetCodeGenInfo. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D79033	2020-04-30 12:31:50 +00:00
Michael Liao	5be9b8cbe2	[cuda][hip] Add CUDA builtin surface/texture reference support. Summary: - Re-commit after fix Sema checks on partial template specialization. Reviewers: tra, rjmccall, yaxunl, a.sidorin Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76365	2020-03-27 17:18:49 -04:00
Artem Belevich	fe8063e1a0	Revert "[cuda][hip] Add CUDA builtin surface/texture reference support." This reverts commit 6a9ad5f3f4ac66f0cae592e911f4baeb6ee5eca6. The patch breaks CUDA copmilation. Differential Revision: https://reviews.llvm.org/D76365	2020-03-27 10:01:38 -07:00
Michael Liao	6a9ad5f3f4	[cuda][hip] Add CUDA builtin surface/texture reference support. Summary: - Even though the bindless surface/texture interfaces are promoted, there are still code using surface/texture references. For example, [PR#26400](https://bugs.llvm.org/show_bug.cgi?id=26400) reports the compilation issue for code using `tex2D` with texture references. For better compatibility, this patch proposes the support of surface/texture references. - Due to the absent documentation and magic headers, it's believed that `nvcc` does use builtins for texture support. From the limited NVVM documentation[^nvvm] and NVPTX backend texture/surface related tests[^test], it's believed that surface/texture references are supported by replacing their reference types, which are annotated with `device_builtin_surface_type`/`device_builtin_texture_type`, with the corresponding handle-like object types, `cudaSurfaceObject_t` or `cudaTextureObject_t`, in the device-side compilation. On the host side, that global handle variables are registered and will be established and updated later when corresponding binding/unbinding APIs are called[^bind]. Surface/texture references are most like device global variables but represented in different types on the host and device sides. - In this patch, the following changes are proposed to support that behavior: + Refine `device_builtin_surface_type` and `device_builtin_texture_type` attributes to be applied on `Type` decl only to check whether a variable is of the surface/texture reference type. + Add hooks in code generation to replace that reference types with the correponding object types as well as all accesses to them. In particular, `nvvm.texsurf.handle.internal` should be used to load object handles from global reference variables[^texsurf] as well as metadata annotations. + Generate host-side registration with proper template argument parsing. --- [^nvvm]: https://docs.nvidia.com/cuda/pdf/NVVM_IR_Specification.pdf [^test]: https://raw.githubusercontent.com/llvm/llvm-project/master/llvm/test/CodeGen/NVPTX/tex-read-cuda.ll [^bind]: See section 3.2.11.1.2 ``Texture reference API` in [CUDA C Programming Guide](https://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf). [^texsurf]: According to NVVM IR, `nvvm.texsurf.handle` should be used. But, the current backend doesn't have that supported. We may revise that later. Reviewers: tra, rjmccall, yaxunl, a.sidorin Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76365	2020-03-26 14:44:52 -04:00
Anastasia Stulova	960ff0810d	[OpenCL][PR41727] Prevent ICE on global dtors Pass NULL to pointer arg of __cxa_atexit if addr space is not matching with its param. This doesn't align yet with how dtors are generated that should be changed too. Differential Revision: https://reviews.llvm.org/D62413 llvm-svn: 366059	2019-07-15 11:58:10 +00:00
Konstantin Zhuravlyov	ec28a1dcef	AMDGPU: Add support for cross address space synchronization scopes (clang) Differential Revision: https://reviews.llvm.org/D59494 llvm-svn: 356947	2019-03-25 20:54:00 +00:00
Akira Hatanaka	65bb3f92bd	[CodeGen][ObjC] Annotate calls to objc_retainAutoreleasedReturnValue with notail on x86-64. On x86-64, the epilogue code inserted before the tail jump blocks the autoreleased return optimization. rdar://problem/38675807 Differential Revision: https://reviews.llvm.org/D59656 llvm-svn: 356705	2019-03-21 19:59:49 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Yaxun Liu	6c10a66ec7	[CUDA][HIP] Set kernel calling convention before arrange function Currently clang set kernel calling convention for CUDA/HIP after arranging function, which causes incorrect kernel function type since it depends on calling convention. This patch moves setting kernel convention before arranging function. Differential Revision: https://reviews.llvm.org/D47733 llvm-svn: 334457	2018-06-12 00:16:33 +00:00
Yaxun Liu	4306f2086f	[CUDA] Set LLVM calling convention for CUDA kernel Some targets need special LLVM calling convention for CUDA kernel. This patch does that through a TargetCodeGenInfo hook. It only affects amdgcn target. Patch by Greg Rodgers. Revised and lit tests added by Yaxun Liu. Differential Revision: https://reviews.llvm.org/D45223 llvm-svn: 330447	2018-04-20 17:01:03 +00:00
Alexander Kornienko	2a8c18d991	Fix typos in clang Found via codespell -q 3 -I ../clang-whitelist.txt Where whitelist consists of: archtype cas classs checkk compres definit frome iff inteval ith lod methode nd optin ot pres statics te thru Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few files that have dubious fixes reverted.) Differential revision: https://reviews.llvm.org/D44188 llvm-svn: 329399	2018-04-06 15:14:32 +00:00

1 2 3

111 Commits