Pointer auth protection of the block descriptor pointer is only
supported in some constrained environments so we do actually need it to
be configurable.
The first PR to protect block metadata made it non-configurable because
we believed that was an option, but we subsequently realised it needs to
remain configurable.
This PR revives the flags that permit this.
The #cir.global_view attribute was initially added without support for
the optional index list. This change adds index list support. This is
used when the address of an array or structure member is used as an
initializer.
This patch does not include support for taking the address of a
structure or class member. That will be added later.
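As a rough illustration of the source-level pattern involved, the sketch below shows a global whose initializer is the address of an array element. The names `table` and `third` are made up for this example, and the snippet makes no claim about the exact CIR that is emitted for it.

```cpp
// Hypothetical example: a global whose initializer is the address of an
// array element, i.e. a base symbol plus an index.
constexpr int table[4] = {10, 20, 30, 40};

// The initializer refers to `table` at index 2 rather than to `table` itself.
constexpr const int *third = &table[2];

static_assert(*third == 30, "initializer points at table[2]");
static_assert(third - table == 2, "offset from the base is two elements");
```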
The 'cfi_salt' attribute specifies a string literal that is used as a
"salt" for Control-Flow Integrity (CFI) checks to distinguish between
functions with the same type signature. This attribute can be applied
to function declarations, function definitions, and function pointer
typedefs.
This attribute prevents function pointers from being replaced with
pointers to functions that have a compatible type, which can be a CFI
bypass vector.
The attribute affects type compatibility during compilation and CFI
hash generation during code generation.
Attribute syntax: [[clang::cfi_salt("<salt_string>")]]
GNU-style syntax: __attribute__((cfi_salt("<salt_string>")))
- The attribute takes a single string of non-NULL ASCII characters.
- It only applies to function types; using it on a non-function type
will generate an error.
- All function declarations and the function definition must include
the attribute and use identical salt values.
Example usage:
// Header file:
#define __cfi_salt(S) __attribute__((cfi_salt(S)))
// Convenient typedefs to avoid nested declarator syntax.
typedef int (*fp_unsalted_t)(void);
typedef int (*fp_salted_t)(void) __cfi_salt("pepper");
struct widget_ops {
fp_unsalted_t init; // Regular CFI.
fp_salted_t exec; // Salted CFI.
fp_unsalted_t teardown; // Regular CFI.
};
// bar.c file:
static int bar_init(void) { ... }
static int bar_salted_exec(void) __cfi_salt("pepper") { ... }
static int bar_teardown(void) { ... }
static struct widget_ops _generator = {
.init = bar_init,
.exec = bar_salted_exec,
.teardown = bar_teardown,
};
struct widget_ops *widget_gen = &_generator;
// 2nd .c file:
int generate_a_widget(void) {
int ret;
// Called with non-salted CFI.
ret = widget_gen->init();
if (ret)
return ret;
// Called with salted CFI.
ret = widget_gen->exec();
if (ret)
return ret;
// Called with non-salted CFI.
return widget_gen->teardown();
}
Link: https://github.com/ClangBuiltLinux/linux/issues/1736
Link: https://github.com/KSPP/linux/issues/365
---------
Signed-off-by: Bill Wendling <morbo@google.com>
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
Both clang and gfortran support the -fopenmp-simd flag, which enables
OpenMP support only for simd constructs, while disabling the rest of
OpenMP.
Implement the appropriate parse tree rewriting to remove non-SIMD OpenMP
constructs at the parsing stage.
Add a new SimdOnly flang OpenMP IR pass which rewrites generated OpenMP
FIR to handle untangling composite simd constructs, and clean up OpenMP
operations leftover after the parse tree rewriting stage.
With this approach, the two parts of the logic required to make the flag
work can be self-contained within the parse tree rewriter and the MLIR
pass, respectively. It does not need to be implemented within the core
lowering logic itself.
The flag is expected to have no effect if -fopenmp is passed explicitly,
and is only expected to remove OpenMP constructs, not things like OpenMP
library function calls. This matches the behaviour of other compilers.
---------
Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
These warnings are reported on a per expression basis, however some
potential misaligned accesses are discarded before that happens. The
problem is when a new expression starts while processing another
expression. The new expression will end first and emit all potential
misaligned accesses collected up to that point. That includes candidates
that were found in the parent expression, even though they might have
gotten discarded later.
Fixed by checking if the candidate is located within the currently
processed expression.
Fixes #144729
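The containment check described above can be sketched as follows. This is only a model under stated assumptions: `Range`, `Candidate`, `contains`, and `countReportable` are hypothetical names, and source locations are reduced to plain offsets rather than Clang's own location types.

```cpp
// Hypothetical stand-ins for source locations: plain offsets into a buffer.
struct Range {
  unsigned begin;  // inclusive
  unsigned end;    // exclusive
};

struct Candidate {
  unsigned loc;  // where the potentially misaligned access was seen
};

constexpr bool contains(Range expr, unsigned loc) {
  return expr.begin <= loc && loc < expr.end;
}

// Count the candidates that the expression `current` may report when it ends.
// Candidates outside `current` belong to an enclosing expression and are left
// for that expression to report or discard.
constexpr int countReportable(const Candidate *cands, int n, Range current) {
  int count = 0;
  for (int i = 0; i < n; ++i)
    if (contains(current, cands[i].loc))
      ++count;
  return count;
}

constexpr Candidate kCandidates[] = {{5}, {12}, {30}};
constexpr Range kOuter = {0, 40};
constexpr Range kInner = {10, 20};

static_assert(countReportable(kCandidates, 3, kInner) == 1, "only offset 12");
static_assert(countReportable(kCandidates, 3, kOuter) == 3, "all three");
```

In this model the nested expression only sees the candidate at offset 12, so the candidates at 5 and 30 stay with the parent expression.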
Added constant evaluation support for `__builtin_elementwise_abs` on integer, float and vector types.
Fixes #152276
---------
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
Introduces the use of pointer authentication to protect the invocation,
copy and dispose, reference, and descriptor pointers in Objective-C
block objects.
Resolves #141176
The cleanup structs expect that pointers and (u)int64_t have the same
alignment requirements, which isn't true on sparc32, which causes
SIGBUSes.
See also: https://github.com/llvm/llvm-project/issues/66620
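To make the alignment assumption concrete, here is a small sketch that does not depend on the two alignments being equal. It is not the fix applied by this change; `CleanupSlot` and `kPointerAlignedLikeU64` are hypothetical names used only for illustration.

```cpp
#include <cstdint>

// Hypothetical slot type: the union is aligned for whichever member is
// stricter, so code that stores either kind of value never assumes the two
// alignments are equal.
union CleanupSlot {
  void *ptr;
  std::uint64_t bits;
};

// True only on targets where the assumption described above holds.
constexpr bool kPointerAlignedLikeU64 =
    alignof(void *) == alignof(std::uint64_t);

static_assert(alignof(CleanupSlot) >= alignof(void *), "fits a pointer");
static_assert(alignof(CleanupSlot) >= alignof(std::uint64_t),
              "fits a 64-bit integer");
```

On a target where pointers are less strictly aligned than 64-bit integers, `kPointerAlignedLikeU64` would be false while both assertions still hold.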
There is no documentation for -mimplicit-float, -mno-implicit-float here
https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-mimplicit-float
I believe this is because the text is taken from the positive option
when there is a no- version. Add HelpText to the positive option to
hopefully fix this.
These options also affect vector and not just FP so having text here
that mentions vectors is helpful to users.
The goal is to correctly identify diagnostics that are emitted by virtue
of -Wformat-signedness.
Before this change, diagnostic messages triggered by -Wformat-signedness
might look like:
format specifies type 'unsigned int' but the argument has type 'int'
[-Wformat]
signedness of format specifier 'u' is incompatible with 'c' [-Wformat]
With this change:
format specifies type 'unsigned int' but the argument has type 'int',
which differs in signedness [-Wformat-signedness]
signedness of format specifier 'u' is incompatible with 'c'
[-Wformat-signedness]
Fix:
- handleFormatSignedness can now return NoMatchSignedness. Callers
handle this.
- warn_format_conversion_argument_type extends the message it used to
emit by a string that mentions "signedness".
- warn_format_cmp_specifier_sign_mismatch is now correctly categorized
as a diagnostic controlled by -Wformat-signedness.
This PR introduces the `LabelOp`, which is required for implementing
`GotoOp` lowering in the future.
Lowering to LLVM IR is **not** included in this patch, since it depends
on the upcoming `GotoSolver`.
The `GotoSolver` traverses the function body, and if it finds a
`LabelOp` without a matching `GotoOp`, it erases the label.
This means our implementation differs from the classic codegen approach,
where labels may be retained even if unused.
Example:
https://godbolt.org/z/37Mvr4MMr
The OpenACC standard is going to change to clarify that init, shutdown,
and set should only have a single architecture in each 'device_type'
clause. This patch implements that restriction.
See: https://github.com/OpenACC/openacc-spec/pull/550
This commit optimizes `tok::isLiteral` by replacing a succession of `13`
conditions with a range-based check.
I am not sure whether this is allowed. I believe it is done nowhere else
in the codebase; however, I have seen range-based conditions being used
with other enums.
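The idea can be sketched on a toy enum. `Kind` below is hypothetical and much smaller than the real token kind list; the only property the sketch relies on is that the literal kinds are declared contiguously.

```cpp
// Hypothetical token enum; the real list of token kinds is larger.
enum class Kind : unsigned short {
  identifier,
  numeric_constant,  // first literal kind
  char_constant,
  string_literal,    // last literal kind
  semi,
};

// One comparison per literal kind.
constexpr bool isLiteralChained(Kind k) {
  return k == Kind::numeric_constant || k == Kind::char_constant ||
         k == Kind::string_literal;
}

// Range-based form: valid only while the literal kinds stay contiguous.
constexpr bool isLiteralRange(Kind k) {
  return k >= Kind::numeric_constant && k <= Kind::string_literal;
}

// Check that both forms agree on every enumerator of the toy enum.
constexpr bool formsAgree() {
  for (unsigned short i = 0; i <= static_cast<unsigned short>(Kind::semi); ++i) {
    Kind k = static_cast<Kind>(i);
    if (isLiteralChained(k) != isLiteralRange(k))
      return false;
  }
  return true;
}

static_assert(formsAgree(), "range check matches the chained comparisons");
```

The range form depends on declaration order, so an assertion like the one above is a cheap way to catch a later reordering of the enumerators.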
---------
Co-authored-by: Corentin Jabot <corentinjabot@gmail.com>
Only set a target guard if it deviates from its default value[1].
When a target guard is set, it is automatically AND'd with its default
value. This means there is no need to use SVETargetGuard="sve,bf16"
because SVETargetGuard="bf16" is sufficient.
[1] Defaults: SVETargetGuard="sve", SMETargetGuard="sme"
Replaces the XOP/AVX512 per-element rotation/funnel shift builtins with the generic __builtin_elementwise_fshl/fshr.
We still have uniform immediate variants to handle next.
Part of #153152
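For readers unfamiliar with funnel shifts, the following is a portable scalar model of what one 32-bit lane computes. The helper names `fshl32`, `fshr32`, and `rotl32` are made up for this sketch and are not Clang builtins; a rotation is the special case where both inputs are the same value.

```cpp
#include <cstdint>

// Funnel shift left: treat hi:lo as a 64-bit value, shift left by amt
// (modulo 32), and keep the upper 32 bits.
constexpr std::uint32_t fshl32(std::uint32_t hi, std::uint32_t lo,
                               std::uint32_t amt) {
  amt %= 32;
  return amt == 0 ? hi : (hi << amt) | (lo >> (32 - amt));
}

// Funnel shift right: same concatenation, shift right by amt (modulo 32),
// and keep the lower 32 bits.
constexpr std::uint32_t fshr32(std::uint32_t hi, std::uint32_t lo,
                               std::uint32_t amt) {
  amt %= 32;
  return amt == 0 ? lo : (hi << (32 - amt)) | (lo >> amt);
}

// A rotate is a funnel shift with both inputs equal.
constexpr std::uint32_t rotl32(std::uint32_t x, std::uint32_t amt) {
  return fshl32(x, x, amt);
}

static_assert(rotl32(0x80000001u, 1) == 0x00000003u, "rotate left by one");
static_assert(fshl32(0x12345678u, 0x9ABCDEF0u, 8) == 0x3456789Au,
              "top byte of lo shifts into hi");
```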
SwiftConformsTo specifies an additional conformance that should be
applied on import. Allow this on typedefs, because those can be imported
as wrapper types.
This change introduces the #cir.global_view attribute and adds support
for using that attribute to handle initializing a global variable with
the address of another global variable.
This does not yet include support for the optional list of indices to
get an offset from the base address. Those will be added in a follow-up
patch.
This adds support for initializing the vptr member of a dynamic class in
the constructor of that class.
This does not include support for lowering the
`cir.vtable.address_point` operation to the LLVM dialect. That handling
will be added in a follow-up patch.
Support for normal cleanups was introduced with a simplified
implementation compared to what's in the incubator (which corresponds
closely to the classic codegen implementation).
This change introduces more of the infrastructure that will later be
needed to handle non-trivial cleanup cases, including exception
handling.
The following intrinsics were replaced by a combination of
`__builtin_shufflevector` and `__builtin_convertvector`:
- `__builtin_ia32_vcvtph2ps`
- `__builtin_ia32_vcvtph2ps256`
Fixes #152749
If a resource array does not have an explicit binding attribute,
SemaHLSL will add an implicit one. The attribute will be used to
transfer implicit binding order ID to the codegen, the same way as it is
done for HLSLBufferDecls. This is necessary in order to generate correct
initialization of resources in an array that does not have an explicit
binding.
Depends on #152450
Part 1 of #145424
This continues my patch series started as #142541 where multiple kinds
of Expr all use the same getUnusedResultAttrImpl.
The test suite indicates there is no change in behavior happening here.
This is a major change on how we represent nested name qualifications in
the AST.
* The nested name specifier itself and how it's stored is changed. The
prefixes for types are handled within the type hierarchy, which makes
canonicalization for them super cheap, no memory allocation required.
Also translating a type into nested name specifier form becomes a no-op.
An identifier is stored as a DependentNameType. The nested name
specifier gains a lightweight handle class, to be used instead of
passing around pointers, which is similar to what is implemented for
TemplateName. There is still one free bit available, and this handle can
be used within a PointerUnion and PointerIntPair, which should keep
bit-packing aficionados happy.
* The ElaboratedType node is removed; all type nodes to which it could
previously apply can now store the elaborated keyword and name
qualifier, tail allocating when present.
* TagTypes can now point to the exact declaration found when producing
these, as opposed to the previous situation of there only existing one
TagType per entity. This increases the amount of type sugar retained,
and can have several applications, for example in tracking module
ownership, and other tools which care about source file origins, such as
IWYU. These TagTypes are lazily allocated, in order to limit the
increase in AST size.
This patch offers a great performance benefit.
It greatly improves compilation time for
[stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for
`test_on2.cpp` in that project, which is the slowest compiling test,
this patch improves `-c` compilation time by about 7.2%, with the
`-fsyntax-only` improvement being at ~12%.
This has great results on compile-time-tracker as well.
This patch also further enables other optimizations in the future, and
will reduce the performance impact of template specialization resugaring
when that lands.
It has some other miscellaneous drive-by fixes.
About the review: Yes the patch is huge, sorry about that. Part of the
reason is that I started with the nested name specifier part, before the
ElaboratedType part, but that had a huge performance downside, as
ElaboratedType is a big performance hog. I didn't have the steam to go
back and change the patch after the fact.
There are also a lot of internal API changes, and it made sense to remove
ElaboratedType in one go, versus removing it from one type at a time, as
that would present much more churn to the users. Also, the nested name
specifier having a different API avoids missing changes related to how
prefixes work now, which could make existing code compile but not work.
How to review: The important changes are all in
`clang/include/clang/AST` and `clang/lib/AST`, with also important
changes in `clang/lib/Sema/TreeTransform.h`.
The rest and bulk of the changes are mostly consequences of the changes
in API.
PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just
for easier rebasing. I plan to rename it back after this lands.
Fixes #136624
Fixes https://github.com/llvm/llvm-project/issues/43179
Fixes https://github.com/llvm/llvm-project/issues/68670
Fixes https://github.com/llvm/llvm-project/issues/92757
Static analysis complained that:
child_range(&Init, &Init+1);
in the children member function was potentially out of bounds. This is
false b/c it is forming an iterator range but it would be invalid if
Init was a nullptr.
I add an assertion in the constructor for this and remove the FIXME
checks that are related to this. I checked the various usages and we
always validate that the argument is not nullptr.
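The single-element range idiom can be sketched in isolation. `Stmt`, `InitHolder`, and `countChildren` are hypothetical stand-ins rather than the actual AST classes; the sketch only shows that `&Init` and `&Init + 1` delimit a one-element range and that the constructor is a natural place to assert the pointer is non-null.

```cpp
#include <cassert>

struct Stmt {
  int id;
};

// Hypothetical node with exactly one child, stored as a single pointer member.
struct InitHolder {
  const Stmt *Init;

  constexpr explicit InitHolder(const Stmt *I) : Init(I) {
    assert(I != nullptr && "Init must not be null");
  }

  // The member itself is treated as a one-element array of child pointers.
  constexpr const Stmt *const *begin() const { return &Init; }
  constexpr const Stmt *const *end() const { return &Init + 1; }
};

// Walk the range and count the children it yields.
constexpr int countChildren(const InitHolder &H) {
  int n = 0;
  for (const Stmt *const *it = H.begin(); it != H.end(); ++it)
    ++n;
  return n;
}

constexpr Stmt kStmt{7};
constexpr InitHolder kHolder(&kStmt);

static_assert(countChildren(kHolder) == 1, "exactly one child");
static_assert((*kHolder.begin())->id == 7, "the child is the stored Init");
```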
The following intrinsics were replaced by `__builtin_elementwise_fma`:
- `__builtin_ia32_vfmaddps(256)`
- `__builtin_ia32_vfmaddpd(256)`
- `__builtin_ia32_vfmaddph(256)`
- `__builtin_ia32_vfmaddbf16(128 | 256 | 512)`
All the aforementioned `__builtin_ia32_vfmadd*` intrinsics are
equivalent to a `__builtin_elementwise_fma`, so keeping them is an
unnecessary indirection.
Fixes [#152461](https://github.com/llvm/llvm-project/issues/152461)
---------
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
Binding a value to a location can happen when a new value is created or
when an existing value is updated. This modification exposes whether
the value binding happens at a declaration.
This helps simplify the hacky logic of the BindToImmutable checker.
This patch updates the pmulhw/pmulhuw builtins to support constant
expression handling - extending the VectorExprEvaluator::VisitCallExpr
handling code that handles elementwise integer binop builtins.
Hopefully this can be used as reference patch to show how to add future
target specific constexpr handling with minimal code impact.
I've also enabled pmullw constexpr handling (which are tagged on
#152490) as they all use very similar tests.
I've also had to tweak the MMX -> SSE2 wrapper as undefs are not
permitted in constexpr shuffle masks.
Fixes #152524
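As a reference for the arithmetic involved, here is a portable scalar model of one lane: pmulhw keeps the high 16 bits of the signed 16x16-bit product and pmulhuw does the same for unsigned inputs. The helper names `mulhi_s16` and `mulhi_u16` are made up for this sketch and are not part of the patch. The signed version relies on right shifts of negative values being arithmetic, which mainstream compilers implement and C++20 guarantees.

```cpp
#include <cstdint>

// High half of the signed 16x16 -> 32-bit product (one pmulhw lane).
constexpr std::int16_t mulhi_s16(std::int16_t a, std::int16_t b) {
  return static_cast<std::int16_t>((std::int32_t{a} * std::int32_t{b}) >> 16);
}

// High half of the unsigned 16x16 -> 32-bit product (one pmulhuw lane).
constexpr std::uint16_t mulhi_u16(std::uint16_t a, std::uint16_t b) {
  return static_cast<std::uint16_t>((std::uint32_t{a} * std::uint32_t{b}) >>
                                    16);
}

static_assert(mulhi_s16(0x4000, 4) == 1, "0x4000 * 4 == 0x10000");
static_assert(mulhi_u16(0xFFFF, 0xFFFF) == 0xFFFE,
              "0xFFFF * 0xFFFF == 0xFFFE0001");
```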
When the target enables +sme2p2, the svcompact intrinsic is now
available in streaming SVE mode, through updating the guards in
arm_sve.td. Included Sema test acle_sve_compact.cpp.
Followup work of #140498 to continue the work on clangd/clangd#529
Introduce the use of the Clang doxygen parser to parse the documentation
of hovered code.
- ASTContext independent doxygen parsing
- Parsing doxygen commands to markdown for hover information
Note: after this PR I have planned another patch to rearrange the
information shown in the hover info.
This PR is just for the basic introduction of doxygen parsing for hover
information.
---------
Co-authored-by: Maksim Ivanov <emaxx@google.com>
HIP runtime support for compressed bundle format v3 is in place,
therefore switch the default compressed bundle format to v3 in the compiler.
This allows both compressed and decompressed fat binary size to exceed
4GB by default.
Environment variable COMPRESSED_BUNDLE_FORMAT_VERSION=2 can be used for
backward compatibility for older HIP runtimes not supporting v3.
Fixes: SWDEV-548879
This patch introduces the ability to customize the fork process with an external lambda function. This is useful for downstream clients that want to do stream redirection.
Adds the `isHLSLResourceRecordArray()` method to the `Type` class. This method returns `true` if the `Type` represents an array of HLSL resource records. Defining this method on `Type` makes it accessible from both sema and codegen.
I fixed support for varargs functions
(previously it didn't crash but the codegen was incorrect).
I added tests for structs and unions which already work. With the
multivalue abi they crash in the backend, so I added a sema check that
rejects structs and unions for that abi.
It will also crash in the backend if passed an int128 or float128 type.
When the cleanup handling code was initially upstreamed, a SmallVector
was used to simplify the handling of the stack of cleanup objects.
However, that mechanism won't scale well enough for the rate at which
cleanup handlers are going to be pushed and popped while compiling a
large program. This change introduces the custom memory allocator which
is used in classic codegen and the CIR incubator.
This does not otherwise change the cleanup handling implementation and
many parts of the infrastructure are still missing.
This is not intended to have any observable effect on the generated CIR,
but it does change the internal implementation significantly, so it's
not exactly an NFC change. The functionality is covered by existing
tests.
This change adds the definition of VTableAddrPointOp and the related
AddressPointAttr to the CIR dialect, along with tests for the parsing
and verification of these elements.
Code to generate this operation will be added in a later change.