SPIR target can be used for C/C++ inputs too (i.e. in OpenCL compatible mode for the libs creation).
Patch by Neil Henning!
Review: http://reviews.llvm.org/D19478
llvm-svn: 267561
instantiation is in a module.
This patch fixes the condition for determining whether the debug info for a
template instantiation will exist in an imported clang module by:
- checking whether the ClassTemplateSpecializationDecl is complete and
- checking that the instantiation was in a module by looking at the first field.
I also added a negative check to make sure that a typedef to a forward-declared
template (with the definition outside of the module) is handled correctly.
http://reviews.llvm.org/D19443
rdar://problem/25553724
llvm-svn: 267464
The taskloop construct specifies that the iterations of one or more associated loops will be executed in parallel using OpenMP tasks. The iterations are distributed across tasks created by the construct and scheduled to be executed.
The next code will be generated for the taskloop directive:
#pragma omp taskloop num_tasks(N) lastprivate(j)
for( i=0; i<N*GRAIN*STRIDE-1; i+=STRIDE ) {
int th = omp_get_thread_num();
#pragma omp atomic
counter++;
#pragma omp atomic
th_counter[th]++;
j = i;
}
Generated code:
task = __kmpc_omp_task_alloc(NULL,gtid,1,sizeof(struct
task),sizeof(struct shar),&task_entry);
psh = task->shareds;
psh->pth_counter = &th_counter;
psh->pcounter = &counter;
psh->pj = &j;
task->lb = 0;
task->ub = N*GRAIN*STRIDE-2;
task->st = STRIDE;
__kmpc_taskloop(
NULL, // location
gtid, // gtid
task, // task structure
1, // if clause value
&task->lb, // lower bound
&task->ub, // upper bound
STRIDE, // loop increment
0, // 1 if nogroup specified
2, // schedule type: 0-none, 1-grainsize, 2-num_tasks
N, // schedule value (ignored for type 0)
(void*)&__task_dup_entry // tasks duplication routine
);
llvm-svn: 267395
LLVM really wants a debug location on every inlinable call in a function
with debug info, because it otherwise cannot set up inlining debug info.
This change applies an artificial line 0 debug location (which is how
DWARF marks automatically generated code that has no corresponding
source code) to the .__kmpc_global_dtor_. functions to avoid the
LLVM Verifier complaining.
llvm-svn: 267369
LLVM stopped using MDString-based type references, and DIBuilder no
longer fills 'retainedTypes:' with every DICompositeType that has an
'identifier:' field. There are just minor changes to keep the same
behaviour in CFE.
Leaving 'retainedTypes:' unfilled has a dramatic impact on the output
order of the IR though. There are a huge number of testcase changes,
which were unfortunately not really scriptable.
llvm-svn: 267297
Write out the PGOFuncName meta data if PGOFuncName is different from function's
raw name. This should only apply to internal linkage functions. This is to be
consumed by indirect-call promotion when called in LTO optimization pass.
Differential Revision: http://reviews.llvm.org/D18624
llvm-svn: 267224
For various reasons, involving dllexport and class linkage compuations,
we have to wait until after the semicolon after a class declaration to
emit inline methods. These are "deferred" decls. Before this change,
finishing the tag decl would trigger us to deserialize some PCH so that
we could make a "pretty" IR-level type. Deserializing the PCH triggered
calls to HandleTopLevelDecl, which, when done, checked the deferred decl
list, and emitted some dllexported decls that weren't ready.
Avoid this re-entrancy. Deferred decls should not get emitted when a tag
is finished, they should only be emitted after a real top level decl in
the main file.
llvm-svn: 267186
causes code generation failure.
The codegen part of firstprivate clause for member decls used type of
original variable without skipping reference type from
OMPCapturedExprDecl. Patch fixes this problem.
llvm-svn: 267125
If loop control variable for simd-based directives is explicitly marked
as linear/lastprivate in clauses, codegen for such construct would
crash. Patch fixes this problem.
llvm-svn: 267101
Summary:
Adds a framework to enable the instrumentation pass for the new
EfficiencySanitizer ("esan") family of tools. Adds a flag for esan's
cache fragmentation tool via -fsanitize=efficiency-cache-frag.
Adds appropriate tests for the new flag.
Reviewers: eugenis, vitalybuka, aizatsky, filcab
Subscribers: filcab, kubabrecka, llvm-commits, zhaoqin, kcc
Differential Revision: http://reviews.llvm.org/D19169
llvm-svn: 267059
in the compile unit that contains their implementation even if their
interface is declared in a module.
The private @implementation of an @interface may have additional
hidden ivars so we should not defer to the public version of the
type that is found in the module.
<rdar://problem/25541798>
llvm-svn: 266937
If the untied clause is present on a task construct, any thread in the
team can resume the task region after a suspension. Patch adds proper
codegen for untied tasks.
llvm-svn: 266853
Summary:
This is a follow-on to apply Duncan's new DIType ODR uniquing from
r266549 and r266713 in more places.
When invoking ThinLTO backend compiles via clang (for a distributed
build), invoke enableDebugTypeODRUniquing() before parsing the module.
Reviewers: dexonsmith, joker.eph
Subscribers: llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D19264
llvm-svn: 266852
The intrinsic is now called llvm.thread.pointer, not
llvm.aarch64.thread.pointer. Also, the code handling it in CGBuiltin.cpp
is dead - it's already covered by GCCBuiltin. Remove it.
Differential Revision: http://reviews.llvm.org/D19099
llvm-svn: 266817
r259537 added vfma/vfms to armv7, but the builtin was only lowered
on the AArch64 side. Instead of supporting it on ARM, get rid of it.
The vfms builtin lowered to:
%nb = fsub float -0.0, %b
%r = @llvm.fma.f32(%a, %nb, %c)
Instead, define the operation in terms of vfma, and swap the
multiplicands. It now lowers to:
%na = fsub float -0.0, %a
%r = @llvm.fma.f32(%na, %b, %c)
This matches the instruction more closely, and lets current LLVM
generate the "natural" operand ordering:
fmls.2s v0, v1, v2
instead of the crooked (but equivalent):
fmls.2s v0, v2, v1
Except for theses changes, assembly is identical.
LLVM accepts both commutations, and the LLVM tests in:
test/CodeGen/AArch64/arm64-fmadd.ll
test/CodeGen/AArch64/fp-dp3.ll
test/CodeGen/AArch64/neon-fma.ll
test/CodeGen/ARM/fusedMAC.ll
already check either the new one only, or both.
Also verified against the test-suite unittests.
llvm-svn: 266807
Currently, for the ppc64--gnu and aarch64 ABIs, we recognize:
typedef __attribute__((__ext_vector_type__(3))) float v3f32;
typedef __attribute__((__ext_vector_type__(16))) char v16i8;
struct HFA {
v3f32 a;
v16i8 b;
};
as an HFA. Since the first type encountered is used as the base type,
we pass the HFA as:
[2 x <3 x float>]
Which leads to incorrect IR (relying on padding values) when the
second field is used.
Instead, explicitly widen the vector (after size rounding) in
isHomogeneousAggregate.
Differential Revision: http://reviews.llvm.org/D18998
llvm-svn: 266784
If the untied clause is present on a task construct, any thread in the
team can resume the task region after a suspension. Patch adds proper
codegen for untied tasks.
llvm-svn: 266754
If the untied clause is present on a task construct, any thread in the team can resume the task region after a suspension. Patch adds proper codegen for untied tasks.
llvm-svn: 266722
Since elements of most kinds of DICompositeType have back references,
most are involved in uniquing cycles. Except via the ODR 'identifier:'
field, which doesn't care about the storage type (see r266549),
they have no hope of being uniqued.
Distinct nodes are far more efficient, so use them for most kinds of
DICompositeType definitions (i.e., when DIType::isForwardDecl is false).
The exceptions:
- DW_TAG_array_type, since their elements never have back-references
and they never have ODR 'identifier:' fields;
- DW_TAG_enumeration_type when there is no ODR 'identifier:' field,
since their elements usually don't have back-references.
This breaks the last major uniquing cycle I'm aware of in the debug info
graph. The impact won't be enormous for C++ because references to
ODR-uniqued nodes still use string-based DITypeRefs; but this should
prevent a regression in C++ when we drop the string-based references.
This wouldn't have been reasonable until r266549, when composite types
stopped relying on being uniqued by structural equivalence to prevent
blow-ups at LTO time.
llvm-svn: 266556
Since this patch provided support for the __float128 type but disabled it
on all platforms by default, some platforms can't compile type_traits with
-std=gnu++11 since there is a specialization with __float128.
This reverts the patch until D19125 is approved (i.e. we know which platforms
need this support enabled).
llvm-svn: 266460
Non-owning pointers that cache LLVM types and constants can use
'nullptr' default member initializers so that we don't need to mention
them in the constructor initializer list.
Owning pointers should use std::unique_ptr so that we don't need to
manually delete them in the destructor. They also don't need to be
mentioned in the constructor at that point.
NFC
llvm-svn: 266263
of a table entry in the corresponding decl, store an offset from the current
record to the relevant CXX_CTOR_INITIALIZERS record. This results in fewer
indirections and a minor .pcm file size reduction.
llvm-svn: 266254
This patch corresponds to review:
http://reviews.llvm.org/D15120
It adds support for the __float128 keyword, literals and a target feature to
enable it. This support is disabled by default on all targets and any target
that has support for this type is free to add it.
Based on feedback that I've received from target maintainers, this appears to
be the right thing for most targets. I have not heard from the maintainers of
X86 which I believe supports this type. I will subsequently investigate the
impact of enabling this on X86.
llvm-svn: 266186
Putting OpenCLImageTypes.def to clangAST library violates layering requirement: "It's not OK for a Basic/ header to include an AST/ header".
This fixes the modules build.
Differential revision: http://reviews.llvm.org/D18954
Reviewers: Richard Smith, Vassil Vassilev.
llvm-svn: 266180
exiting the for-in loop.
This commit fixes a bug where EmitObjCForCollectionStmt didn't pop
cleanups for captures.
For example, in the following for-in loop, a block which captures self
is passed to foo1:
for (id x in [self foo1:^{ use(self); }]) {
use(x);
break;
}
Previously, the code in EmitObjCForCollectionStmt wouldn't pop the
cleanup for the captured self before exiting the loop, which caused
code-gen to generate an IR in which objc_release was called twice on
the captured self.
This commit fixes the bug by entering a RunCleanupsScope before the
loop condition is evaluated and forcing its cleanup before exiting the
loop.
rdar://problem/16865751
Differential Revision: http://reviews.llvm.org/D18618
llvm-svn: 266147
Clang should pass -backend-option to LLVM even though there is no target machine, since LLVM passes are used when emitting LLVM IR.
Differential Revision: http://reviews.llvm.org/D17552
llvm-svn: 266117
In codegen different address spaces may be mapped to the same address
space for a target, e.g. in x86/x86-64 all address spaces are mapped
to 0. Therefore AddressSpaceConversion should be translated by
CreatePointerBitCastOrAddrSpaceCast instead of CreateAddrSpaceCast.
Differential Revision: http://reviews.llvm.org/D18713
llvm-svn: 266107
OpenMP 4.0 defines clause 'uniform' in 'declare simd' directive:
'uniform' '(' <argument-list> ')'
The uniform clause declares one or more arguments to have an invariant value for all concurrent invocations of the function in the execution of a single SIMD loop.
The special this pointer can be used as if was one of the arguments to the function in any of the linear, aligned, or uniform clauses.
llvm-svn: 266041
This patch add support for GCC attribute((ifunc("resolver"))) for
targets that use ELF as object file format. In general ifunc is a
special kind of function alias with type @gnu_indirect_function. LLVM
patch http://reviews.llvm.org/D15525
Differential Revision: http://reviews.llvm.org/D15524
llvm-svn: 265917
profiling and optimization remarks and indicate that no debug info shall
be emitted for these compile units.
http://reviews.llvm.org/D18808
<rdar://problem/25427165>
llvm-svn: 265862
It is possible to argue that the EABIVersion field is similar in spirit to the
ABI field in TargetOptions. It represents the embedded ABI that the target
follows. This will allow us to thread this information into the target
information construction.
llvm-svn: 265807