We still lower them to native shuffle IR, but we do it in CGBuiltin.cpp now. This allows us to check the target feature and ensure the immediate fits in 8 bits.
This also improves our -O0 codegen slightly because we're able to see the zeroinitializer in the shuffle. It looks like it got lost behind a store+load previously.
llvm-svn: 334208
Previous, if no Decl's were checked, visibility was set to false. Switch it
so that in cases of no Decl's, return true. These are the Decl's after being
filtered. Also remove an unreachable return statement since it is directly
after another return statement.
llvm-svn: 334160
Summary:
We recently switch to using a selects in the intrinsics header files for FMA instructions. But the 512-bit versions support flavors with rounding mode which must be an Integer Constant Expression. This has forced those intrinsics to be implemented as macros. As it stands now the mask and mask3 intrinsics evaluate one of their macro arguments twice. If that argument itself is another intrinsic macro, we can end up over expanding macros. Or if its something we can CSE later it would show up multiple times when it shouldn't.
I tried adding __extension__ around the macro and making it an expression statement and declaring a local variable. But whatever name you choose for the local variable can never be used as the name of an input to the macro in user code. If that happens you would end up with the same name on the LHS and RHS of an assignment after expansion. We might be safe if we use __ in front of the variable names because those names are reserved and user code shouldn't use that, but I wasn't sure I wanted to make that claim.
The other option which I've chosen here, is to add back _mask, _maskz, and _mask3 flavors of the builtin which we will expand in CGBuiltin.cpp to replicate the argument as needed and insert any fneg needed on the third operand to make a subtract. The _maskz isn't truly necessary if we have an unmasked version or if we use the masked version with a -1 mask and wrap a select around it. But I've chosen to make things more uniform.
I separated out the scalar builtin handling to avoid too many things going on in EmitX86FMAExpr. It was different enough due to the extract and insert that the minor duplication of the CreateCall was probably worth it.
Reviewers: tkrupa, RKSimon, spatel, GBuella
Reviewed By: tkrupa
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D47724
llvm-svn: 334159
We were already performing checks on non-template variables,
but the checks on templated ones were missing.
Differential Revision: https://reviews.llvm.org/D45231
llvm-svn: 334143
Previously we were just using extended vector operations in the header file.
This unfortunately allowed non-constant indices to be used with the intrinsics. This is incompatible with gcc, icc, and MSVC. It also introduces a different performance characteristic because non-constant index gets lowered to a vector store and an element sized load.
By adding the builtins we can check for the index to be a constant and ensure its in range of the vector element count.
User code still has the option to use extended vector operations themselves if they need non-constant indexing.
llvm-svn: 334057
// Primary fixed point types
signed short _Accum s_short_accum;
signed _Accum s_accum;
signed long _Accum s_long_accum;
unsigned short _Accum u_short_accum;
unsigned _Accum u_accum;
unsigned long _Accum u_long_accum;
// Aliased fixed point types
short _Accum short_accum;
_Accum accum;
long _Accum long_accum;
This diff only allows for declaration of the fixed point types. Assignment and other operations done on fixed point types according to http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1169.pdf will be added in future patches. The saturated versions of these types and the equivalent _Fract types will also be added in future patches.
The tests included are for asserting that we can declare these types.
Fixed the test that was failing by not checking for dso_local on some
targets.
Differential Revision: https://reviews.llvm.org/D46084
llvm-svn: 333923
```
// Primary fixed point types
signed short _Accum s_short_accum;
signed _Accum s_accum;
signed long _Accum s_long_accum;
unsigned short _Accum u_short_accum;
unsigned _Accum u_accum;
unsigned long _Accum u_long_accum;
// Aliased fixed point types
short _Accum short_accum;
_Accum accum;
long _Accum long_accum;
```
This diff only allows for declaration of the fixed point types. Assignment and other operations done on fixed point types according to http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1169.pdf will be added in future patches. The saturated versions of these types and the equivalent `_Fract` types will also be added in future patches.
The tests included are for asserting that we can declare these types.
Differential Revision: https://reviews.llvm.org/D46084
llvm-svn: 333814
Ensure latest MPT decl has a MSInheritanceAttr when instantiating
templates, to avoid null MSInheritanceAttr deref in
CXXRecordDecl::getMSInheritanceModel().
See PR#37399 for repo / details.
Patch by Andrew Rogers!
Differential Revision: https://reviews.llvm.org/D46664
llvm-svn: 333680
For example, given:
enum __attribute__((deprecated)) T *p;
-ast-print produced:
enum T *p;
The attribute was lost because the enum forward decl was lost.
Another example is the loss of enum forward decls from C++ namespaces
(in MS compatibility mode).
The trouble was that the EnumDecl node was suppressed, as revealed by
-ast-dump. The suppression of the EnumDecl was intentional in
r116122, but I don't understand why. The suppression isn't needed for
the test suite to behave.
Reviewed by: rsmith
Differential Revision: https://reviews.llvm.org/D46846
llvm-svn: 333574
This patch replaces all packed (and scalar without rounding
mode) fused intrinsics with fmadd/fmaddsub variations.
Then fmadd/fmaddsub are lowered to native IR.
Patch by tkrupa
Reviewers: craig.topper, sroland, spatel, RKSimon
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D47444
llvm-svn: 333555
Summary:
Skipping them was clearly not intentional. It's impossible to
guarantee correctness if the bodies are skipped.
Also adds a test case for r327504, now that it does not produce
invalid errors that made the test fail.
Reviewers: aaron.ballman, sammccall, rsmith
Reviewed By: rsmith
Subscribers: rayglover-ibm, rwols, cfe-commits
Differential Revision: https://reviews.llvm.org/D44480
llvm-svn: 333538
Summary:
The previous implementation misses an opportunity to apply NRVO (Named Return Value
Optimization) below. That discourages user to write early return code.
```
struct Foo {};
Foo f(bool b) {
if (b)
return Foo();
Foo oo;
return oo;
}
```
That is, we can/should apply RVO for a local variable if:
* It's directly returned by at least one return statement.
* And, all reachable return statements in its scope returns the variable directly.
While, the previous implementation disables the RVO in a scope if there are multiple return
statements that refers different variables.
On the new algorithm, local variables are in NRVO_Candidate state at first, and a return
statement changes it to NRVO_Disabled for all visible variables but the return statement refers.
Then, at the end of the function AST traversal, NRVO is enabled for variables in NRVO_Candidate
state and refers from at least one return statement.
Reviewers: rsmith
Reviewed By: rsmith
Subscribers: xbolva00, Quuxplusone, arthur.j.odwyer, cfe-commits
Differential Revision: https://reviews.llvm.org/D47067
llvm-svn: 333500
Codebases that need to be compatible with the Microsoft ABI can pass
this flag to avoid issues caused by the lack of a fixed ABI for
incomplete member pointers.
Differential Revision: https://reviews.llvm.org/D47503
llvm-svn: 333498
Summary:
This patch adds the newly added `%sub` diagnostic modifier to cleanup repetition in the overload candidate diagnostics.
I think this should be good to go.
@rsmith: Some of the notes now emit `function template` where they only said `function` previously. It seems OK to me, but I would like your sign off on it.
Reviewers: rsmith, EricWF
Reviewed By: EricWF
Subscribers: cfe-commits, rsmith
Differential Revision: https://reviews.llvm.org/D47101
llvm-svn: 333485
-Warc-repeated-use-of-weak may trigger a segmentation fault when the Decl
being checked is outside of a function scope, leaving the current function
info pointer null. This adds a check before using the function info.
llvm-svn: 333471
Handling of the third parameter was only checking for *_n and not for the C11 variant, which means that cmpxchg of a 'desired' 0 value was erroneously warning. Handle C11 properly, and add extgensive tests for this as well as NULL pointers in a bunch of places.
Fixes r333246 from D47229.
llvm-svn: 333290
Currently getting such completions requires source correction, reparsing
and calling completion again. And if it shows no results and rollback is
required then it costs one more reparse.
With this change it's possible to get all results which can be later
filtered to split changes which require correction.
Differential Revision: https://reviews.llvm.org/D41537
llvm-svn: 333272
Summary:
As a companion to libc++ patch https://reviews.llvm.org/D47225, mark builtin atomic non-member functions which accept pointers as nonnull.
The atomic non-member functions accept pointers to std::atomic / std::atomic_flag as well as to the non-atomic value. These are all dereferenced unconditionally when lowered, and therefore will fault if null. It's a tiny gotcha for new users, especially when they pass in NULL as expected value (instead of passing a pointer to a NULL value).
<rdar://problem/18473124>
Reviewers: arphaman
Subscribers: aheejin, cfe-commits
Differential Revision: https://reviews.llvm.org/D47229
llvm-svn: 333246
Summary:
Remove the call to DiagnoseUseOfDecl in LookupMemberExpr because:
1. LookupMemberExpr eagerly lookup both getter and setter, reguardless
if they are used or not. It causes wrong diagnostics if you are only
using getter.
2. LookupMemberExpr only diagnoses getter, but not setter.
3. ObjCPropertyOpBuilder already DiagnoseUseOfDecl when building getter
and setter. Doing it again in LookupMemberExpr causes duplicated
diagnostics.
rdar://problem/38479756
Reviewers: erik.pilkington, arphaman, doug.gregor
Reviewed By: arphaman
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D47280
llvm-svn: 333148
It caused asserts, see PR37560.
> Use zeroinitializer for (trailing zero portion of) large array initializers
> more reliably.
>
> Clang has two different ways it emits array constants (from InitListExprs and
> from APValues), and both had some ability to emit zeroinitializer, but neither
> was able to catch all cases where we could use zeroinitializer reliably. In
> particular, emitting from an APValue would fail to notice if all the explicit
> array elements happened to be zero. In addition, for large arrays where only an
> initial portion has an explicit initializer, we would emit the complete
> initializer (which could be huge) rather than emitting only the non-zero
> portion. With this change, when the element would have a suffix of more than 8
> zero elements, we emit the array constant as a packed struct of its initial
> portion followed by a zeroinitializer constant for the trailing zero portion.
>
> In passing, I found a bug where SemaInit would sometimes walk the entire array
> when checking an initializer that only covers the first few elements; that's
> fixed here to unblock testing of the rest.
>
> Differential Revision: https://reviews.llvm.org/D47166
llvm-svn: 333067
more reliably.
Clang has two different ways it emits array constants (from InitListExprs and
from APValues), and both had some ability to emit zeroinitializer, but neither
was able to catch all cases where we could use zeroinitializer reliably. In
particular, emitting from an APValue would fail to notice if all the explicit
array elements happened to be zero. In addition, for large arrays where only an
initial portion has an explicit initializer, we would emit the complete
initializer (which could be huge) rather than emitting only the non-zero
portion. With this change, when the element would have a suffix of more than 8
zero elements, we emit the array constant as a packed struct of its initial
portion followed by a zeroinitializer constant for the trailing zero portion.
In passing, I found a bug where SemaInit would sometimes walk the entire array
when checking an initializer that only covers the first few elements; that's
fixed here to unblock testing of the rest.
Differential Revision: https://reviews.llvm.org/D47166
llvm-svn: 333044
Handle attributes before checking the record layout (e.g. underalignment check
during `alignas` processing), as layout may be cached without taking into
account attributes that may affect it.
Differential Revision: https://reviews.llvm.org/D46439
llvm-svn: 332843
Summary:
Previously this triggered a -Wundefined-internal warning. But it's not
an undefined variable -- any variable of this form is a pointer to the
base of GPU core's shared memory.
Reviewers: tra
Subscribers: sanjoy, rsmith
Differential Revision: https://reviews.llvm.org/D46782
llvm-svn: 332621
Clang used to pass the base lvalue of a non-type template parameter
to the template instantiation phase when the base part is __uuidof
and it's running in C++17 mode.
However, that drops its LValuePath, and unintentionally transforms
&__uuidof(...) to __uuidof(...).
This CL fixes that by passing whole expr. Fixes PR24986.
https://reviews.llvm.org/D46820?id=146557
Patch from Taiju Tsuiki <tzik@chromium.org>!
llvm-svn: 332614
The added test case was triggering assertion
> Assertion failed: (!SpecializedTemplate.is<SpecializedPartialSpecialization*>() && "Already set to a class template partial specialization!"), function setInstantiationOf, file clang/include/clang/AST/DeclTemplate.h, line 1825.
It was happening with ClassTemplateSpecializationDecl
`enable_if_not_same<int, int>`. Because this template is specialized for
equal types not to have a definition, it wasn't instantiated and its
specialization kind remained TSK_Undeclared. And because it was implicit
instantiation, we didn't mark the decl as invalid. So when we try to
find the best matching partial specialization the second time, we hit
the assertion as partial specialization is already set.
Fix by reusing stored partial specialization when available, instead of
looking for the best match every time.
rdar://problem/39524996
Reviewers: rsmith, arphaman
Reviewed By: rsmith
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D46909
llvm-svn: 332509
Clang often tries to create implicit module import for error recovery,
which does a great job helping out with diagnostics. However, sometimes
clang does not have enough information given that it's using an invalid
context to move on. Be more strict in those cases to avoid crashes.
We hit crash on invalids because of this but unfortunately there are no
testcases and I couldn't manage to create one. The crashtrace however
indicates pretty clear why it's happening.
rdar://problem/39313933
llvm-svn: 332491
Add support for __declspec(code_seg("segname"))
This patch is built on the existing support for #pragma code_seg. The code_seg
declspec is allowed on functions and classes. The attribute enables the
placement of code into separate named segments, including compiler-generated
members and template instantiations.
For more information, please see the following:
https://msdn.microsoft.com/en-us/library/dn636922.aspx
A new CodeSeg attribute is used instead of adding a new spelling to the existing
Section attribute since they don’t apply to the same Subjects. Section
attributes are also added for the code_seg declspec since they are used for
#pragma code_seg. No CodeSeg attributes are added to the AST.
The patch is written to match with the Microsoft compiler’s behavior even where
that behavior is a little complicated (see https://reviews.llvm.org/D22931, the
Microsoft feedback page is no longer available since MS has removed the page).
That code is in getImplicitSectionAttrFromClass routine.
Diagnostics messages are added to match with the Microsoft compiler for code-seg
attribute mismatches on base and derived classes and virtual overrides.
Differential Revision: https://reviews.llvm.org/D43352
llvm-svn: 332470
Like other conversion warnings, allow float overflow warnings to be disabled
in known dead paths of template instantiation. This often occurs when a
template template type is a numeric type and the template will check the
range of the numeric type before performing the conversion.
llvm-svn: 332310
After a fatal error Sema::InstantiatingTemplate doesn't allow further
instantiation and doesn't push a CodeSynthesisContext. When we tried to
synthesize implicit deduction guides from constructors we hit the
assertion
> Assertion failed: (!CodeSynthesisContexts.empty() && "Cannot perform an instantiation without some context on the " "instantiation stack"), function SubstType, file clang/lib/Sema/SemaTemplateInstantiate.cpp, line 1580.
Fix by avoiding deduction guide synthesis if InstantiatingTemplate is invalid.
rdar://problem/39051732
Reviewers: rsmith
Reviewed By: rsmith
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D46446
llvm-svn: 332307
If the name after 'template' is an unresolved using declaration (not containing
'typename'), then we don't yet know if it's a valid template-name, so don't
reject it prior to instantiation. Instead, treat it as naming a dependent
member of the current instantiation.
llvm-svn: 332291
The fixit is actively harmful, as it encourages developers to ignore the
warning and to write unsafe code.
It is almost impossible to write safe code while capturing autoreleasing
variables in the block, as in order to check that the block is never
called in the autoreleasing pool the developer has to check the
transitive closure of all potential callers of the block.
Differential Revision: https://reviews.llvm.org/D46778
llvm-svn: 332288
For example, given:
void fn() {
struct T *p0;
struct T { int i; } *p1;
}
-ast-print produced:
void fn() {
struct T { int i; } *p0;
struct T { int i; } *p1;
}
Compiling that fails with a redefinition error.
Given:
void fn() {
struct T *p0;
struct __attribute__((deprecated)) T *p1;
}
-ast-print dropped the attribute.
Details:
For a tag specifier (that is, struct/union/class/enum used as a type
specifier in a declaration) that was also a tag declaration (that is,
first occurrence of the tag) or tag redeclaration (that is, later
occurrence that specifies attributes or a member list), clang printed
the tag specifier as either (1) the full tag definition if one
existed, or (2) the first tag declaration otherwise. Redefinition
errors were sometimes introduced, as in the first example above. Even
when that was impossible because no member list was ever specified,
attributes were sometimes lost, thus changing semantics and
diagnostics, as in the second example above.
This patch fixes a major culprit for these problems. It does so by
creating an ElaboratedType with a new OwnedDecl member wherever an
occurrence of a tag type is a (re)declaration of that tag type.
PrintingPolicy's IncludeTagDefinition used to trigger printing of the
member list, attributes, etc. for a tag specifier by using a tag
(re)declaration selected as described above. Now, it triggers the
same thing except it uses the tag (re)declaration stored in the
OwnedDecl. Of course, other tooling can now make use of the new
OwnedDecl as well.
Also, to be more faithful to the original source, this patch
suppresses printing of attributes inherited from previous
declarations.
Reviewed by: rsmith, aaron.ballman
Differential Revision: https://reviews.llvm.org/D45463
llvm-svn: 332281