This patch migrates the emitOffloadingArrays and EmitNonContiguousDescriptor functions from Clang codegen to OpenMPIRBuilder.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D149872
Apply my post-commit comment on D81995. The negative name misguided commit
d8a8e5d6240a1db809cd95106910358e69bbf299 (`[clang][cli] Remove marshalling from
Opt{In,Out}FFlag`) to:
* accidentally flip the option to not emit the xray_fn_idx section.
* change -fno-xray-function-index (instead of -fxray-function-index) to emit xray_fn_idx
This patch renames XRayOmitFunctionIndex and makes -fxray-function-index emit
xray_fn_idx, but the default remains -fno-xray-function-index.
The corresponding definition was removed by:
commit 3cc1f1fc1d97952136185f4eafb827694875de17
Author: Joseph Huber <jhuber6@vols.utk.edu>
Date: Thu Oct 8 12:03:11 2020 -0400
This is an alternative to the existing hostcall implementation and uses a printf buffer similar to OpenCL.
The data stored in the buffer (i.e. the data frame) for each printf call is as follows:
1. Control DWord - contains info regarding stream, format string constness and size of data frame
2. Hash of the format string (if constant) else the format string itself
3. Printf arguments (each aligned to 8 byte boundary)
The format string hash is generated using LLVM's MD5 Message-Digest Algorithm implementation, and only the low 64 bits are used.
The implementation still uses amdhsa metadata, and the hash is stored as part of the format string itself to ensure
minimal changes in the runtime.
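The 8-byte alignment rule for printf arguments in item 3 can be illustrated with a minimal sketch. The helper names `alignTo8` and `argBytes` are hypothetical and not taken from the patch; the sketch only computes how many bytes the argument portion of a data frame would occupy, and deliberately does not model the control DWord or the hash.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: round a byte count up to the next multiple of 8.
constexpr std::size_t alignTo8(std::size_t n) {
  return (n + 7) & ~static_cast<std::size_t>(7);
}

// Hypothetical helper: bytes occupied by the printf arguments in a data
// frame, assuming each argument is padded to an 8-byte boundary.
inline std::size_t argBytes(const std::vector<std::size_t> &argSizes) {
  std::size_t total = 0;
  for (std::size_t size : argSizes)
    total += alignTo8(size);
  return total;
}
```

For example, arguments of 4, 8 and 1 bytes would each take one 8-byte slot under this assumption.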
Differential Revision: https://reviews.llvm.org/D150427
This commit implements support for WebAssembly table types and
respective builtins. Tables are WebAssembly objects to store
reference types. They have a large number of semantic restrictions
including, but not limited to, only being allowed to be declared
at the top-level as static arrays of zero-length, not being arguments
or results of functions, not being stored to memory, etc.
This commit introduces __attribute__((wasm_table)), which attaches to
arrays of WebAssembly reference types, and the following builtins to
manage tables:
* ref __builtin_wasm_table_get(table, idx)
* void __builtin_wasm_table_set(table, idx, ref)
* uint __builtin_wasm_table_size(table)
* uint __builtin_wasm_table_grow(table, ref, uint)
* void __builtin_wasm_table_fill(table, idx, ref, uint)
* void __builtin_wasm_table_copy(table, table, uint, uint, uint)
This commit also enables the reference-types feature at bleeding-edge.
This is joint work with Alex Bradbury (@asb).
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D139010
Migration of clang tests to opaque pointers is finished, so remove
the -no-opaque-pointers flag.
Differential Revision: https://reviews.llvm.org/D152447
Reasons for rolling forward:
- the crash reported from Chromium was fixed in D151824 (not related to this patch at all)
- since D152824 was committed, it should now be safe to roll this forward.
New change:
- add an additional _ in name check
This reverts commit 4980eead4d0b4666d53dad07afb091375b3a13a0.
Device libs make use of patterns like this:
```
__attribute__((target("gfx11-insts")))
static unsigned do_intrin_stuff(void)
{
  return __builtin_amdgcn_s_sendmsg_rtnl(0x0);
}
```
These are functions that are assumed to be eliminated if the current GPU target doesn't support them.
At O0 such functions aren't eliminated by common optimizations but often by AMDGPURemoveIncompatibleFunctions instead, which sees the "+gfx11-insts" attribute on, say, GFX9 and knows it's not valid, so it removes the function.
D142907 accidentally made it so such attributes were dropped during bitcode linking, making it impossible for RemoveIncompatibleFunctions to catch the functions and causing ISel to catch fire eventually.
This fixes the issue and adds a new test to ensure we don't accidentally fall into this trap again.
Fixes SWDEV-403642
Reviewed By: arsenm, yaxunl
Differential Revision: https://reviews.llvm.org/D152251
This change tries to move registerTargetGlobalVariable and
getAddrOfDeclareTargetVar out of Clang's CGOpenMPRuntime
and into the OMPIRBuilder for shared use with MLIR's OpenMPDialect
and Flang (or other languages that may want to utilise it).
This primarily does this by trying to hoist the Clang specific
types into arguments or callback functions in the form of
lambdas, replacing them with LLVM equivalents and
utilising shared OMPIRBuilder enumerators for
the clauses, rather than Clang's own variation.
Reviewers: jsjodin, jdoerfert
Differential Revision: https://reviews.llvm.org/D149162
AMDGPU has native instructions and target intrinsics for this, but
these really should be subject to legalization and generic
optimizations. This will enable legalization of f16->f32 on targets
without f16 support.
Implement a somewhat horrible inline expansion for targets without
libcall support. This could be better if we could introduce control
flow (GlobalISel version not yet implemented). Support for strictfp
legalization is less complete but works for the simple cases.
As suggested by @erichkeane in
https://reviews.llvm.org/D141451#inline-1429549
There's potential for a lot more cleanups around these APIs. This is
just a start.
Callers need to be more careful about sub-expressions producing strings
that don't outlast the expression using `llvm::demangle`. Add a
release note.
Differential Revision: https://reviews.llvm.org/D149104
Correctly account for the fact that certain targets do not use the generic address space for the implicit VTT argument. This entails adjusting `ItaniumCXXABI::buildStructorSignature`, `ItaniumCXXABI::addImplicitStructorParams` and `ItaniumCXXABI::getImplicitConstructorArgs` to use the target's global variable address space. The associated test is temporarily marked `XFAIL` as additional fixes are needed.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D150746
This patch replaces getAs with castAs, which asserts if the type doesn't match. This resolves a possible dereference of a null FPT when calling getThisType() in clang::CodeGen::CGDebugInfo::CreateType(clang::MemberPointerType const *, llvm::DIFile *).
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D151947
Make `qualifyWindowsLibrary` and `addStackProbeTargetAttributes`
protected members of `TargetCodeGenInfo`.
These are helper functions used by `getDependentLibraryOption` and
`setTargetAttributes` methods when targeting Windows. The change will
allow these functions to be reused after splitting `TargetInfo.cpp`.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D150178
This patch replaces getAs with castAs, which asserts if the type doesn't match, in clang::CodeGen::CodeGenTypes::GetFunctionTypeForVTable(clang::GlobalDecl).
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D151957
LLVM IR already allows floating point type in atomicrmw.
Update clang atomic fetch max/min builtins to accept
floating point type like we did for fetch add/sub.
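The semantics of a floating-point fetch-max can be sketched in portable C++ with a compare-exchange loop. This is an emulation for illustration only: `atomicFetchMax` is a hypothetical helper, not the Clang builtin touched by this patch, and NaN handling is ignored.

```cpp
#include <atomic>

// Hypothetical helper: atomically store max(target, value) and return the
// value that was observed before the update, like a fetch-and-op builtin.
inline float atomicFetchMax(std::atomic<float> &target, float value) {
  float observed = target.load(std::memory_order_relaxed);
  // Retry while the stored value is still smaller than the candidate.
  // On failure, compare_exchange_weak refreshes `observed`.
  while (observed < value &&
         !target.compare_exchange_weak(observed, value,
                                       std::memory_order_seq_cst,
                                       std::memory_order_relaxed)) {
  }
  return observed;
}
```

With the builtins accepting floating point types directly, such a loop does not need to be written by hand.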
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D150985
Fixes: SWDEV-401056
There are places in the runtime, like __kmp_init_indirect_csptr, which
assume these pointers are aligned to sizeof(void*), so make sure we emit
them with the correct alignment.
Fixes #62668
Reviewed By: jlpeyton
Differential Revision: https://reviews.llvm.org/D150723
Currently the compiler asserts when a variable "memspace" is passed to
omp_init_allocator:
omp_allocator_handle_t alloc=omp_init_allocator(memspace,1,traits)
The problem is that memspace is not mapped to the target region. During
the call to emitAllocatorInit, EmitVarDecl is called for "alloc", which
then emits the initialization of "alloc", and that causes the assert.
If I understand correctly, it is not necessary to emit the variable
initialization, since "allocator" is private to the target region.
To fix this, call CGF.EmitAutoVarAlloca(allocator) instead of
CGF.EmitVarDecl(allocator).
Differential Revision: https://reviews.llvm.org/D151743
The rest of the fetch/op intrinsics were added in e13246a2ec3 but sub
was conspicuous by its absence.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D151701
This patch adds support for the following SME ACLE intrinsics (as defined
in https://arm-software.github.io/acle/main/acle.html):
- svld1_hor_za8 // also for _za16, _za32, _za64 and _za128
- svld1_hor_vnum_za8 // also for _za16, _za32, _za64 and _za128
- svld1_ver_za8 // also for _za16, _za32, _za64 and _za128
- svld1_ver_vnum_za8 // also for _za16, _za32, _za64 and _za128
- svst1_hor_za8 // also for _za16, _za32, _za64 and _za128
- svst1_hor_vnum_za8 // also for _za16, _za32, _za64 and _za128
- svst1_ver_za8 // also for _za16, _za32, _za64 and _za128
- svst1_ver_vnum_za8 // also for _za16, _za32, _za64 and _za128
SveEmitter.cpp is extended to generate arm_sme.h (currently named
arm_sme_draft_spec_subject_to_change.h) and other SME definitions from
arm_sme.td, which is modeled after arm_sve.td. Common TableGen definitions
are moved into arm_sve_sme_incl.td.
Co-authored-by: Sagar Kulkarni <sagar.kulkarni1@huawei.com>
Reviewed By: sdesmalen, kmclaughlin
Differential Revision: https://reviews.llvm.org/D127910
The corresponding function definition was removed by:
commit cf8ff75bade763b054476321dcb82dcb2e7744c7
Author: Leonard Chan <leonardchan@google.com>
Date: Tue Jul 14 14:56:38 2020 -0700
CUDA support can be enabled in clang-repl with --cuda flag.
Device code linking is not yet supported. inline must be used with all
__device__ functions.
Differential Revision: https://reviews.llvm.org/D146389
Pursuant to discussions at
https://discourse.llvm.org/t/rfc-c-23-p1467r9-extended-floating-point-types-and-standard-names/70033/22,
this commit enhances the handling of the __bf16 type in Clang.
- Firstly, it upgrades __bf16 from a storage-only type to an arithmetic
type.
- Secondly, it changes the mangling of __bf16 to DF16b on all
architectures except ARM. This change has been made in
accordance with the finalization of the mangling for the
std::bfloat16_t type, as discussed at
https://github.com/itanium-cxx-abi/cxx-abi/pull/147.
- Finally, this commit extends the existing excess precision support to
the __bf16 type. This applies to hardware architectures that do not
natively support bfloat16 arithmetic.
Appropriate tests have been added to verify the effects of these
changes and ensure no regressions in other areas of the compiler.
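As a rough illustration of what excess precision means for __bf16, the sketch below emulates bfloat16 arithmetic on a target without native support: operands are widened to float, the whole expression is evaluated in float, and the result is rounded back once. The helper names are hypothetical, the rounding shown is round-to-nearest-even with NaN handling omitted, and none of this is code from the patch.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: widen bfloat16 bits to float (always exact, since
// bfloat16 is the upper 16 bits of an IEEE-754 binary32 value).
inline float bf16ToFloat(std::uint16_t bits) {
  std::uint32_t wide = static_cast<std::uint32_t>(bits) << 16;
  float f;
  std::memcpy(&f, &wide, sizeof f);
  return f;
}

// Hypothetical helper: narrow float to bfloat16 bits using
// round-to-nearest-even. NaN inputs are not handled in this sketch.
inline std::uint16_t floatToBf16(float f) {
  std::uint32_t wide;
  std::memcpy(&wide, &f, sizeof wide);
  std::uint32_t roundingBias = 0x7FFFu + ((wide >> 16) & 1u);
  return static_cast<std::uint16_t>((wide + roundingBias) >> 16);
}

// Evaluate a + b + c with excess precision: the intermediate sum stays in
// float and only the final result is rounded to bfloat16.
inline std::uint16_t bf16Sum3(std::uint16_t a, std::uint16_t b,
                              std::uint16_t c) {
  return floatToBf16(bf16ToFloat(a) + bf16ToFloat(b) + bf16ToFloat(c));
}
```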
Reviewed By: rjmccall, pengfei, zahiraam
Differential Revision: https://reviews.llvm.org/D150913
First, removes the invocation of the memprof instrumentation passes from
the end of the module simplification pass builder, where it doesn't
really belong. However, it turns out that this was never being invoked,
as it is guarded by an internal option not used anywhere (even tests).
These passes are actually added via clang under the -fmemory-profile
option. Changed this to add via the EP callback interface, similar to
the sanitizer passes. They are added to the EP for the end of the
optimization pipeline, which is roughly where they were being added
already (end of the pre-LTO link pipelines and non-LTO optimization
pipeline).
Ideally we should plumb the output file through to LLVM and set it up
there, so I have added a TODO.
Differential Revision: https://reviews.llvm.org/D151593
It seems the load of traits.addr should be passed in the runtime call. Currently
the load of the load of traits.addr gets passed, which causes the runtime to fail.
To fix this, skip the call to EmitLoadOfScalar for the extra load.
Differential Revision: https://reviews.llvm.org/D151576
The corresponding function definition was removed by:
commit 56e5a2e13e3048fc2ff39029cde406d9f4eb55f3
Author: George Burgess IV <george.burgess.iv@gmail.com>
Date: Sat Mar 10 01:11:17 2018 +0000
The last use was removed by:
commit c9a52de0026093327daedda7ea2eead8b64657b4
Author: Akira Hatanaka <ahatanaka@apple.com>
Date: Wed Jun 3 16:41:50 2020 -0700
Reported by Static Code Analyzer Tool:
Inside "CGExprConstant.cpp" file, VisitObjCEncodeExpr() returns a null value, which is dereferenced without checking.
This patch adds an assert.
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D151280
This patch fixes the issue that list items in `has_device_addr` are still mapped
to the target device because front end emits map type `OMP_MAP_TO`.
Fixes #59160.
Reviewed By: jyu2
Differential Revision: https://reviews.llvm.org/D141627
Function pointers are checked by loading a prefix structure from just
before the function's entry point. However, on Arm, the function
pointer is not always exactly equal to the address of the entry point,
because Thumb function pointers have the low bit set to tell the BX
instruction to enter them in Thumb state. So the generated code loads
from an odd address and suffers an alignment fault.
Fixed by clearing the low bit of the function pointer before
subtracting 8.
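The address computation described above can be sketched as follows. `prefixAddress` is a hypothetical name used only for illustration; the offset of 8 comes from the description, and the actual fix is made in the IR that Clang generates, not in a helper like this.

```cpp
#include <cstdint>

// Hypothetical helper: compute the address of the prefix structure from a
// function pointer. The low bit (set for Thumb functions) is cleared first
// so that the resulting address is not odd.
constexpr std::uintptr_t prefixAddress(std::uintptr_t fnPtr) {
  return (fnPtr & ~static_cast<std::uintptr_t>(1)) - 8;
}
```

An Arm-state pointer and a Thumb-state pointer to the same entry point then yield the same prefix address.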
Differential Revision: https://reviews.llvm.org/D151308
If there is an infinite cycle in the IR, the loop will never exit. Keep
track of visited basic blocks in a set and return nullptr if a block is
visited again.
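The visited-set guard can be sketched on a simplified model. `Block` and `findTarget` are hypothetical stand-ins, not the LLVM types or the function changed by this patch; the sketch follows a chain of single successors and gives up as soon as a block is seen twice.

```cpp
#include <unordered_set>

// Hypothetical stand-in for a basic block with a single successor.
struct Block {
  Block *next = nullptr;
  bool isTarget = false;
};

// Walk the successor chain looking for a target block. Returns nullptr if
// the chain ends or if a block is visited a second time (a cycle).
inline const Block *findTarget(const Block *start) {
  std::unordered_set<const Block *> visited;
  for (const Block *b = start; b != nullptr; b = b->next) {
    if (!visited.insert(b).second)
      return nullptr; // Already seen: the chain loops forever.
    if (b->isTarget)
      return b;
  }
  return nullptr;
}
```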
Fixes #62830.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D151076
flexible array member
A zero-element array type was incorrectly being used when an incomplete
array was being initialized with a non-empty initializer.
This fixes an assertion failure in AddInitializerToStaticVarDecl. See
the discussion here: https://reviews.llvm.org/D123649#4362210
Differential Revision: https://reviews.llvm.org/D151172
Reported by Coverity static analyzer tool:
Inside "ItaniumCXXABI.cpp" file, in <unnamed>::ItaniumCXXABI::EmitLoadOfMemberFunctionPointer(clang::CodeGen::CodeGenFunction &, clang::Expr const *, clang::CodeGen::Address, llvm::Value *&, llvm::Value *, clang::MemberPointerType const *): Return value of function which returns null is dereferenced without checking.
//returned_null: getAs returns nullptr (checked 130 out of 156 times).
//var_assigned: Assigning: FPT = nullptr return value from getAs.
const FunctionProtoType *FPT =
MPT->getPointeeType()->getAs<FunctionProtoType>();
auto *RD =
cast<CXXRecordDecl>(MPT->getClass()->castAs<RecordType>()->getDecl());
// Dereference null return value (NULL_RETURNS)
//dereference: Dereferencing a pointer that might be nullptr FPT when calling arrangeCXXMethodType.
llvm::FunctionType *FTy = CGM.getTypes().GetFunctionType(
CGM.getTypes().arrangeCXXMethodType(RD, FPT, /*FD=*/nullptr));
This patch replaces getAs with castAs, which asserts if the type doesn't match.
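The difference between the two accessors can be shown on a toy model. The classes below are hypothetical stand-ins that only mimic the contract described here (getAs may return nullptr, castAs asserts on mismatch); they use dynamic_cast for brevity and are not how Clang's type system is implemented.

```cpp
#include <cassert>

// Hypothetical stand-in for a type hierarchy with checked downcasts.
struct Type {
  virtual ~Type() = default;

  // Returns nullptr on mismatch; every caller must check the result.
  template <typename T> const T *getAs() const {
    return dynamic_cast<const T *>(this);
  }

  // Asserts on mismatch; the result can be used without a null check.
  template <typename T> const T *castAs() const {
    const T *result = dynamic_cast<const T *>(this);
    assert(result && "castAs<T>: type mismatch");
    return result;
  }
};

struct FunctionProtoType : Type {};
struct RecordType : Type {};
```

Using castAs documents the expectation that the type always matches and turns a silent null dereference into an assertion failure at the point of the cast.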
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D151054