1773 Commits

Author SHA1 Message Date
Koakuma
8071d279fd
[WIP] [clang] Align cleanup structs to prevent SIGBUS on sparc32 (#152866)
The cleanup structs expect that pointers and (u)int64_t have the same
alignment requirements, which isn't true on sparc32, which causes
SIGBUSes.

See also: https://github.com/llvm/llvm-project/issues/66620
2025-08-13 23:26:36 +07:00
Matheus Izvekov
91cdd35008
[clang] Improve nested name specifier AST representation (#147835)
This is a major change on how we represent nested name qualifications in
the AST.

* The nested name specifier itself and how it's stored is changed. The
prefixes for types are handled within the type hierarchy, which makes
canonicalization for them super cheap, no memory allocation required.
Also translating a type into nested name specifier form becomes a no-op.
An identifier is stored as a DependentNameType. The nested name
specifier gains a lightweight handle class, to be used instead of
passing around pointers, which is similar to what is implemented for
TemplateName. There is still one free bit available, and this handle can
be used within a PointerUnion and PointerIntPair, which should keep
bit-packing aficionados happy.
* The ElaboratedType node is removed, all type nodes in which it could
previously apply to can now store the elaborated keyword and name
qualifier, tail allocating when present.
* TagTypes can now point to the exact declaration found when producing
these, as opposed to the previous situation of there only existing one
TagType per entity. This increases the amount of type sugar retained,
and can have several applications, for example in tracking module
ownership, and other tools which care about source file origins, such as
IWYU. These TagTypes are lazily allocated, in order to limit the
increase in AST size.

This patch offers a great performance benefit.

It greatly improves compilation time for
[stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for
`test_on2.cpp` in that project, which is the slowest compiling test,
this patch improves `-c` compilation time by about 7.2%, with the
`-fsyntax-only` improvement being at ~12%.

This has great results on compile-time-tracker as well:

![image](https://github.com/user-attachments/assets/700dce98-2cab-4aa8-97d1-b038c0bee831)

This patch also further enables other optimziations in the future, and
will reduce the performance impact of template specialization resugaring
when that lands.

It has some other miscelaneous drive-by fixes.

About the review: Yes the patch is huge, sorry about that. Part of the
reason is that I started by the nested name specifier part, before the
ElaboratedType part, but that had a huge performance downside, as
ElaboratedType is a big performance hog. I didn't have the steam to go
back and change the patch after the fact.

There is also a lot of internal API changes, and it made sense to remove
ElaboratedType in one go, versus removing it from one type at a time, as
that would present much more churn to the users. Also, the nested name
specifier having a different API avoids missing changes related to how
prefixes work now, which could make existing code compile but not work.

How to review: The important changes are all in
`clang/include/clang/AST` and `clang/lib/AST`, with also important
changes in `clang/lib/Sema/TreeTransform.h`.

The rest and bulk of the changes are mostly consequences of the changes
in API.

PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just
for easier to rebasing. I plan to rename it back after this lands.

Fixes #136624
Fixes https://github.com/llvm/llvm-project/issues/43179
Fixes https://github.com/llvm/llvm-project/issues/68670
Fixes https://github.com/llvm/llvm-project/issues/92757
2025-08-09 05:06:53 -03:00
Yingwei Zheng
ac8295550b
[Clang][CodeGen] Move EmitPointerArithmetic into CodeGenFunction. NFC. (#152634)
`CodeGenFunction::EmitPointerArithmetic` is needed by
https://github.com/llvm/llvm-project/pull/152575. Separate the NFC
changes into a new PR for smooth review.
2025-08-08 21:41:03 +08:00
Nikita Popov
c23b4fbdbb
[IR] Remove size argument from lifetime intrinsics (#150248)
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.

This removes the ability to only mark a prefix of an alloca alive/dead.
We never used that capability, so we should remove the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect this).
2025-08-08 11:09:34 +02:00
Aaron Ballman
77618a9253
Delete copy constructor/assignment; NFC (#145689)
This is an RAII object and static analysis was flagging it for not
following the rule of three (or five).
2025-06-25 09:11:00 -04:00
Steven Perron
01d648a429
[HLSL][SPIRV] Reapply "[HLSL][SPIRV] Add vk::constant_id attribute." (#144902)
- **Reapply "[HLSL][SPIRV] Add vk::constant_id attribute." (#144812)**
- **Fix memory leak.**
2025-06-19 11:52:55 -04:00
Steven Perron
5f69d680e2
Revert "[HLSL][SPIRV] Add vk::constant_id attribute." (#144812)
Reverts llvm/llvm-project#143544
2025-06-18 19:30:43 -04:00
Steven Perron
acde20b560
[HLSL][SPIRV] Add vk::constant_id attribute. (#143544)
The vk::constant_id attribute is used to indicate that a global const
variable
represents a specialization constant in SPIR-V. This PR adds this
attribute to clang.

The documentation for the attribute is
[here](https://github.com/microsoft/DirectXShaderCompiler/blob/main/docs/SPIR-V.rst#specialization-constants).

The strategy is to to modify the initializer to get the value of a
specialize constant for a builtin defined in the SPIR-V backend.

Implements https://github.com/llvm/wg-hlsl/pull/287

Fixes https://github.com/llvm/llvm-project/issues/142448

---------

Co-authored-by: Nathan Gauër <github@keenuts.net>
2025-06-18 06:39:52 -04:00
Thurston Dang
428afa62b0
[ubsan] Add more -fsanitize-annotate-debug-info checks (#141997)
This extends https://github.com/llvm/llvm-project/pull/138577 to more UBSan checks, by changing SanitizerDebugLocation (formerly SanitizerScope) to add annotations if enabled for the specified ordinals.

Annotations will use the ordinal name if there is exactly one ordinal specified in the SanitizerDebugLocation; otherwise, it will use the handler name.

Updates the tests from https://github.com/llvm/llvm-project/pull/141814.

---------

Co-authored-by: Vitaly Buka <vitalybuka@google.com>
2025-06-06 14:59:32 -07:00
Orlando Cazalet-Hyams
54d544b831
[KeyInstr][Clang] Ret atom (#134652)
This patch is part of a stack that teaches Clang to generate Key Instructions
metadata for C and C++.

When returning a value, stores to the `retval` allocas and branches to `return`
block are put in the same atom group. They are both rank 1, which could in
theory introduce an extra step in some optimized code. This low risk currently
feels an acceptable for keeping the code a bit simpler (as opposed to adding
scaffolding to make the store rank 2).

In the case of a single return (no control flow) the return instruction inherits
the atom group of the branch to the return block when the blocks get folded
togather.

RFC:
https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668

The feature is only functional in LLVM if LLVM is built with CMake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.
2025-06-04 15:43:49 +01:00
Vitaly Buka
2622e6bfa0
[NFC][CodeGen] Extract SanitizerHandler into own header (#142527) 2025-06-02 22:47:08 -07:00
Nikita Popov
e2b536431d
[CodeGen] Move CodeGenPGO behind unique_ptr (NFC) (#142155)
The InstrProf headers are very expensive. Avoid including them in all of
CodeGen/ by moving the CodeGenPGO member behind a unqiue_ptr.

This reduces clang build time by 0.8%.
2025-06-02 09:51:54 +02:00
Orlando Cazalet-Hyams
123bf5f46c [KeyInstr][Clang] If stmt atom (#134642)
This patch is part of a stack that teaches Clang to generate Key Instructions
metadata for C and C++.

RFC:
https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668

The feature is only functional in LLVM if LLVM is built with CMake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.
2025-05-23 13:46:37 +01:00
Orlando Cazalet-Hyams
0ee40ca5cc
[KeyInstr][Clang] Aggregate init + copy (#134639)
This patch is part of a stack that teaches Clang to generate Key Instructions
metadata for C and C++.

RFC:
https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668

The feature is only functional in LLVM if LLVM is built with CMake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.
2025-05-22 17:35:49 +01:00
Orlando Cazalet-Hyams
9459c8309c
[KeyInstr][Clang] Add ApplyAtomGroup (#134632)
This is a scoped helper similar to ApplyDebugLocation that creates a new source
location atom group which instructions can be added to.

A source atom is a source construct that is "interesting" for debug stepping
purposes. We use an atom group number to track the instruction(s) that implement
the functionality for the atom, plus backup instructions/source locations.

This patch is part of a stack that teaches Clang to generate Key Instructions
metadata for C and C++.

RFC:
https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668

The feature is only functional in LLVM if LLVM is built with CMake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.
2025-05-21 17:40:45 +01:00
Thurston Dang
b24c33a9d7
[cfi] Enable -fsanitize-annotate-debug-info functionality for CFI checks (#139809)
This connects the -fsanitize-annotate-debug-info plumbing (https://github.com/llvm/llvm-project/pull/138577) to CFI check codegen, using SanitizerAnnotateDebugInfo() (https://github.com/llvm/llvm-project/pull/139965) and SanitizerInfoFromCFIKind (https://github.com/llvm/llvm-project/pull/140117).

Note: SanitizerAnnotateDebugInfo() is updated to a public function because it is needed in ItaniumCXXABI.

Updates the tests from https://github.com/llvm/llvm-project/pull/139149.
2025-05-19 09:39:26 -07:00
Thurston Dang
5defe490c9
[sanitizer][NFCI] Add 'SanitizerAnnotateDebugInfo' (#139965)
This generalizes the debug info annotation code from https://github.com/llvm/llvm-project/pull/139149 and moves it into a helper function, SanitizerAnnotateDebugInfo().

Future work can use 'ApplyDebugLocation ApplyTrapDI(*this, SanitizerAnnotateDebugInfo(Ordinal));' to add annotations to additional checks.
2025-05-15 09:24:04 -07:00
Bill Wendling
9ae3bce175
[Clang][counted_by] Add support for 'counted_by' on struct pointers (#137250)
The 'counted_by' attribute is now available for pointers in structs.
It generates code for sanity checks as well as
__builtin_dynamic_object_size()
calculations. For example:

  struct annotated_ptr {
    int count;
    char *buf __attribute__((counted_by(count)));
  };

If the pointer's type is 'void *', use the 'sized_by' attribute, which
works similarly to 'counted_by', but can handle the 'void' base type:

  struct annotated_ptr {
    int count;
    void *buf __attribute__((sized_by(count)));
  };

If the 'count' field member occurs after the pointer, use the
'-fexperimental-late-parse-attributes' flag during compilation.

Note that 'counted_by' cannot be applied to a pointer to an incomplete
type, because the size isn't known.

  struct foo;
  struct annotated_ptr {
    int count;
    struct foo *buf __attribute__((counted_by(count))); /* invalid */
  };

Signed-off-by: Bill Wendling <morbo@google.com>
2025-05-13 16:01:36 -07:00
Matt Arsenault
a11d86461e
clang: Fix broken implicit cast to generic address space (#138863)
This fixes emitting undefined behavior where a 64-bit generic
pointer is written to a 32-bit slot allocated for a private pointer.
This can be seen in test/CodeGenOpenCL/amdgcn-automatic-variable.cl's
wrong_pointer_alloca.
2025-05-08 07:51:57 +02:00
Shafik Yaghmour
47681736cb
[Clang][NFC] Explicitly delete copy ctor and assignment for CGAtomicOptionsRAII (#137275)
Static analysis flagged CGAtomicOptionsRAII as having an explicit
destructor but not having explicit copy ctor and assignment. Rule of
three says we should. We are just using this as an RAII object, no need
for either so they will be specified as deleted.
2025-05-01 07:51:58 -07:00
Vitaly Buka
c3000333cd
Revert "[Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers" (#136402)
Reverts llvm/llvm-project#135135

Breaks several bots, details in #135135.
2025-04-18 22:14:03 -07:00
Yingwei Zheng
909a9feda9
[Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (#135135)
This patch relands https://github.com/llvm/llvm-project/pull/130990.
If the check value is passed by reference, add `memory(read)`.

Original PR description:

This patch adds `memory(argmem: read, inaccessiblemem: readwrite)` to
**recoverable** ubsan handlers in order to unblock some
memory/loop optimizations. It provides an average of 3% performance
improvement on llvm-test-suite (except for 49 test failures due to ubsan
diagnostics).
2025-04-17 23:23:30 +08:00
Akira Hatanaka
a3283a92ae
[PAC] Add support for __ptrauth type qualifier (#100830)
The qualifier allows programmer to directly control how pointers are
signed when they are stored in a particular variable.

The qualifier takes three arguments: the signing key, a flag specifying
whether address discrimination should be used, and a non-negative
integer that is used for additional discrimination.

```
typedef void (*my_callback)(const void*);
my_callback __ptrauth(ptrauth_key_process_dependent_code, 1, 0xe27a) callback;
```

Co-Authored-By: John McCall rjmccall@apple.com
2025-04-15 12:54:25 -07:00
Jan Górski
ff687af04f
[clang][CodeGen] Add range metadata for atomic load of boolean type. #131476 (#133546)
Fixes #131476.

For `x86_64` it folds
```
movzbl	t1(%rip), %eax
andb	$1, %al
```
into
```
movzbl	t1(%rip), %eax
```
when run: `clang -S atomic-ops-load.c -o atomic-ops-load.s -O1
--target=x86_64`.

But for riscv replaces:
```
lb	a0, %lo(t1)(a0)
andi	a0, a0, 1
```
with
```
lb	a0, %lo(t1)(a0)
zext.b	a0, a0
``` 
when run: `clang -S atomic-ops-load.c -o atomic-ops-load.s -O1
--target=riscv64`.
2025-04-14 14:26:10 -07:00
Yingwei Zheng
04c38981a9
[Clang][CodeGen] Do not set inbounds flag in EmitMemberDataPointerAddress when the base pointer is null (#130952)
See also https://github.com/llvm/llvm-project/pull/130734 for the
original motivation.

This pattern (`container_of`) is also widely used by real-world
programs.
Examples:

1d89d7d5d7/llvm/include/llvm/IR/SymbolTableListTraits.h (L77-L87)

a2a53cb728/src/util-inl.h (L134-L137)
https://github.com/search?q=*%29nullptr-%3E*&type=code
2025-04-11 10:51:08 +08:00
Yingwei Zheng
1711996805
[Clang][CodeGen] Do not set inbounds flag for struct GEP with null base pointers (#130734)
In the LLVM middle-end we want to fold `gep inbounds null, idx -> null`:
https://alive2.llvm.org/ce/z/5ZkPx-
This pattern is common in real-world programs
(https://github.com/dtcxzyw/llvm-opt-benchmark/pull/55#issuecomment-1870963906).
Generally, it exists in some (actually) unreachable blocks, which is
introduced by JumpThreading.

However, some old-style offsetof macros are still widely used in
real-world C/C++ code (e.g., hwloc/slurm/luajit). To avoid breaking
existing code and inconvenience to downstream users, this patch removes
the inbounds flag from the struct gep if the base pointer is null.
2025-04-11 09:04:23 +08:00
Aaron Ballman
5c8ba28c75
[C11] Implement WG14 N1285 (temporary lifetimes) (#133472)
This feature largely models the same behavior as in C++11. It is
technically a breaking change between C99 and C11, so the paper is not
being backported to older language modes.

One difference between C++ and C is that things which are rvalues in C
are often lvalues in C++ (such as the result of a ternary operator or a
comma operator).

Fixes #96486
2025-04-10 08:12:14 -04:00
Orlando Cazalet-Hyams
308654608c [Clang][NFC] Move some static functions into CodeGenFunction (#134634)
Patches in the Key Instructions (KeyInstr) stack need to access CGF in these
functions. 2 CGF fields are passed to these functions already; at this point it
felt natural to promote them to CGF methods.
2025-04-08 08:44:10 +01:00
Farzon Lotfi
16c84c4475
[DirectX] Add target builtins (#134439)
- fixes #132303
- Moves dot2add from a language builtin to a target builtin.
-  Sets the scaffolding for Sema checks for DX builtins
-  Setup DirectX backend as able to have target builtins
- Adds a DX TargetBuiltins emitter in
`clang/lib/CodeGen/TargetBuiltins/DirectX.cpp`
2025-04-07 12:06:57 -04:00
Nikita Popov
b384d6d6cc
[CodeGen] Don't include CGDebugInfo.h in CodeGenFunction.h (NFC) (#134100)
This is an expensive header, only include it where needed. Move some
functions out of line to achieve that.

This reduces time to build clang by ~0.5% in terms of instructions
retired.
2025-04-03 08:04:19 +02:00
Lukacma
6c3adaafe3
[AARCH64][Neon] switch to using bitcasts in arm_neon.h where appropriate (#127043)
Currently arm_neon.h emits C-style casts to do vector type casts. This
relies on implicit conversion between vector types to be enabled, which
is currently deprecated behaviour and soon will disappear. To ensure
NEON code will keep working afterwards, this patch changes all this
vector type casts into bitcasts.


Co-authored-by: Momchil Velikov <momchil.velikov@arm.com>
2025-04-01 09:45:16 +01:00
Younan Zhang
f4218753ad
[Clang] Implement P0963R3 "Structured binding declaration as a condition" (#130228)
This implements the R2 semantics of P0963.

The R1 semantics, as outlined in the paper, were introduced in Clang 6.
In addition to that, the paper proposes swapping the evaluation order of
condition expressions and the initialization of binding declarations
(i.e. std::tuple-like decompositions).
2025-03-11 15:41:56 +08:00
erichkeane
d5cec386c1 [OpenACC] Implement 'cache' construct AST/Sema
This statement level construct takes no clauses and has no associated
statement, and simply labels a number of array elements as valid for
caching. The implementation here is pretty simple, but it is a touch of
a special case for parsing, so the parsing code reflects that.
2025-03-03 13:57:23 -08:00
Yaxun (Sam) Liu
240f2269ff
Add clang atomic control options and attribute (#114841)
Add option and statement attribute for controlling emitting of
target-specific
metadata to atomicrmw instructions in IR.

The RFC for this attribute and option is

https://discourse.llvm.org/t/rfc-add-clang-atomic-control-options-and-pragmas/80641,
Originally a pragma was proposed, then it was changed to clang
attribute.

This attribute allows users to specify one, two, or all three options
and must be applied
to a compound statement. The attribute can also be nested, with inner
attributes
overriding the options specified by outer attributes or the target's
default
options. These options will then determine the target-specific metadata
added to atomic
instructions in the IR.

In addition to the attribute, three new compiler options are introduced:
`-f[no-]atomic-remote-memory`, `-f[no-]atomic-fine-grained-memory`,
 `-f[no-]atomic-ignore-denormal-mode`.
These compiler options allow users to override the default options
through the
Clang driver and front end. `-m[no-]unsafe-fp-atomics` is aliased to
`-f[no-]ignore-denormal-mode`.

In terms of implementation, the atomic attribute is represented in the
AST by the
existing AttributedStmt, with minimal changes to AST and Sema.

During code generation in Clang, the CodeGenModule maintains the current
atomic options,
which are used to emit the relevant metadata for atomic instructions.
RAII is used
to manage the saving and restoring of atomic options when entering
and exiting nested AttributedStmt.
2025-02-27 10:41:04 -05:00
Chris B
761d422441
[HLSL] Implement HLSL intialization list support (#123141)
This PR implements HLSL's initialization list behvaior as specified in
the draft language specifcation under

[*Decl.Init.Agg*](https://microsoft.github.io/hlsl-specs/specs/hlsl.html#Decl.Init.Agg).

This behavior is a bit unusual for C/C++ because intermediate braces in
initializer lists are ignored and a whole array of additional
conversions occur unintuitively to how initializaiton works in C.

The implementaiton in this PR generates a valid C/C++ initialization
list AST for the HLSL initializer so that there are no changes required
to Clang's CodeGen to support this. This design will also allow us to
use Clang's rewrite to convert HLSL initializers to valid C/C++
initializers that are equivalent. It does have the downside that it will
generate often redundant accesses during codegen. The IR optimizer is
extremely good at eliminating those so this will have no impact on the
final executable performance.

There is some opportunity for optimizing the initializer list generation
that we could consider in subsequent commits. One notable opportunity
would be to identify aggregate objects that occur in the same place in
both initializers and do not require converison, those aggregates could
be initialized as aggregates rather than fully scalarized.

Closes #56067

---------

Co-authored-by: Finn Plummer <50529406+inbelic@users.noreply.github.com>
Co-authored-by: Helena Kotas <hekotas@microsoft.com>
Co-authored-by: Justin Bogner <mail@justinbogner.com>
2025-02-15 13:21:36 -06:00
Zahira Ammarguellat
cf69b4c668
[Clang] [OpenMP] Add support for '#pragma omp stripe'. (#126927)
This patch was reviewed and approved here:
https://github.com/llvm/llvm-project/pull/119891
However it has been reverted here:
083df25dc2
due to a build issue here:
https://lab.llvm.org/buildbot/#/builders/51/builds/10694

This patch is reintroducing the support.
2025-02-13 07:14:36 -05:00
Peter Rong
53c618c071
[clang] run clang-format on some CGObjC files (#126644)
These files are relatively old and don't confront our formatting rules.
It's hard to change them without massive clang-format changes.

---------

Signed-off-by: Peter Rong <PeterRong@meta.com>
2025-02-12 11:52:49 -08:00
Kazu Hirata
67e1e98811 Revert "[Clang] [OpenMP] Add support for '#pragma omp stripe'. (#119891)"
This reverts commit 070f84ebc89b11df616a83a56df9ac56efbab783.

Buildbot failure:
https://lab.llvm.org/buildbot/#/builders/51/builds/10694
2025-02-11 12:39:01 -08:00
Zahira Ammarguellat
070f84ebc8
[Clang] [OpenMP] Add support for '#pragma omp stripe'. (#119891)
Implement basic parsing and semantic support for `#pragma omp stripe`
constuct introduced in
https://www.openmp.org/wp-content/uploads/[OpenMP-API-Specification-6-0.pdf](https://www.openmp.org/wp-content/uploads/OpenMP-API-Specification-6-0.pdf),
section 11.7.
2025-02-11 13:58:21 -05:00
Sarah Spall
3f8e280206
[HLSL] Implement HLSL Elementwise casting (excluding splat cases); Re-land #118842 (#126258)
Implement HLSLElementwiseCast excluding support for splat cases
Do not support casting types that contain bitfields.
Partly closes https://github.com/llvm/llvm-project/issues/100609 and
partly closes https://github.com/llvm/llvm-project/issues/100619
Re-land #118842 after fixing warning as an error, found by a buildbot.
2025-02-07 09:12:55 -08:00
Sarah Spall
14716f2e4b
Revert "[HLSL] Implement HLSL Flat casting (excluding splat cases)" (#126149)
Reverts llvm/llvm-project#118842
2025-02-06 15:25:20 -08:00
Sarah Spall
01072e546f
[HLSL] Implement HLSL Flat casting (excluding splat cases) (#118842)
Implement HLSLElementwiseCast excluding support for splat cases
Do not support casting types that contain bitfields.
Partly closes #100609 and partly closes #100619
2025-02-06 14:38:01 -08:00
erichkeane
99a9133a68 [OpenACC] Implement Sema/AST for 'atomic' construct
The atomic construct is a particularly complicated one.  The directive
itself is pretty simple, it has 5 options for the 'atomic-clause'.
However, the associated statement is fairly complicated.

'read' accepts:
  v = x;
'write' accepts:
  x = expr;
'update' (or no clause) accepts:
  x++;
  x--;
  ++x;
  --x;
  x binop= expr;
  x = x binop expr;
  x = expr binop x;

'capture' accepts either a compound statement, or:
  v = x++;
  v = x--;
  v = ++x;
  v = --x;
  v = x binop= expr;
  v = x = x binop expr;
  v = x = expr binop x;

IF 'capture' has a compound statement, it accepts:
  {v = x; x binop= expr; }
  {x binop= expr; v = x; }
  {v = x; x = x binop expr; }
  {v = x; x = expr binop x; }
  {x = x binop expr ;v = x; }
  {x = expr binop x; v = x; }
  {v = x; x = expr; }
  {v = x; x++; }
  {v = x; ++x; }
  {x++; v = x; }
  {++x; v = x; }
  {v = x; x--; }
  {v = x; --x; }
  {x--; v = x; }
  {--x; v = x; }

While these are all quite complicated, there is a significant amount
of similarity between the 'capture' and 'update' lists, so this patch
reuses a lot of the same functions.

This patch implements the entirety of 'atomic', creating a new Sema file
for the sema for it, as it is fairly sizable.
2025-02-03 07:22:22 -08:00
Bill Wendling
cff0a460ae
[Clang][counted_by] Refactor __builtin_dynamic_object_size on FAMs (#122198)
Refactoring of how __builtin_dynamic_object_size() is calculated for
flexible array members (in preparation for adding support for the
'counted_by' attribute on pointers in structs).

The only functionality change is that we use the already emitted Expr
code to build our calculations off of rather than re-emitting the Expr.
That allows the 'StructFieldAccess' visitor to sift through all casts
and ArraySubscriptExprs to find the first MemberExpr. We build our GEPs
and calculate offsets based off of relative distances from that
MemberExpr.

The testcase passes execution tests.

Calculate the flexible array member's object size using these formulae
(note: if the calculation is negative, we return 0.): 

     struct p;
     struct s {
         /* ... */
         int count;
         struct p *array[] __attribute__((counted_by(count)));
     };   

1) 'ptr->array':

   count = ptr->count;

   flexible_array_member_base_size = sizeof (*ptr->array);
   flexible_array_member_size =
           count * flexible_array_member_base_size;

   if (flexible_array_member_size < 0) 
       return 0;
   return flexible_array_member_size;

2) '&ptr->array[idx]':

   count = ptr->count;
   index = idx; 

   flexible_array_member_base_size = sizeof (*ptr->array);
   flexible_array_member_size =
           count * flexible_array_member_base_size;

   index_size = index * flexible_array_member_base_size;

   if (flexible_array_member_size < 0 || index < 0) 
       return 0;
   return flexible_array_member_size - index_size;

3) '&ptr->field':

   count = ptr->count;
   sizeof_struct = sizeof (struct s);

   flexible_array_member_base_size = sizeof (*ptr->array);
   flexible_array_member_size =
           count * flexible_array_member_base_size;

   field_offset = offsetof (struct s, field);
   offset_diff = sizeof_struct - field_offset;

   if (flexible_array_member_size < 0) 
       return 0;
   return offset_diff + flexible_array_member_size;

4) '&ptr->field_array[idx]':

   count = ptr->count;
   index = idx; 
   sizeof_struct = sizeof (struct s);

   flexible_array_member_base_size = sizeof (*ptr->array);
   flexible_array_member_size =
           count * flexible_array_member_base_size;

   field_base_size = sizeof (*ptr->field_array);
   field_offset = offsetof (struct s, field)
   field_offset += index * field_base_size;

   offset_diff = sizeof_struct - field_offset;

   if (flexible_array_member_size < 0 || index < 0) 
       return 0;
   return offset_diff + flexible_array_member_size;

---------

Signed-off-by: Bill Wendling <morbo@google.com>
2025-01-30 15:36:13 -08:00
Stephen Tozer
822f74a911
[Clang] Cleanup docs and comments relating to -fextend-variable-liveness (#124767)
This patch contains a number of changes relating to the above flag;
primarily it updates comment references to the old flag names,
"-fextend-lifetimes" and "-fextend-this-ptr" to refer to the new names,
"-fextend-variable-liveness[={all,this}]". These changes are all NFC.

This patch also removes the explicit -fextend-this-ptr-liveness flag
alias, and shortens the help-text for the main flag; these are both
changes that were meant to be applied in the initial PR (#110000), but
due to some user-error on my part they were not included in the merged
commit.
2025-01-28 18:25:32 +00:00
Wolfgang Pieb
4424c44c8c [Clang] Add fake use emission to Clang with -fextend-lifetimes (#110102)
Following the previous patch which adds the "extend lifetimes" flag
without (almost) any functionality, this patch adds the real feature by
allowing Clang to emit fake uses. These are emitted as a new form of cleanup,
set for variable addresses, which just emits a fake use intrinsic when the
variable falls out of scope. The code for achieving this is simple, with most
of the logic centered on determining whether to emit a fake use for a given
address, and on ensuring that fake uses are ignored in a few cases.

Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>
2025-01-28 12:30:31 +00:00
Momchil Velikov
f75860f895
[AArch64] Implement NEON FP8 intrinsics for fused multiply-add (#123615)
This patch adds the following intrinsics:

* Fused multiply-add non-indexed

float16x8_t vmlalbq_f16_mf8_fpm(float16x8_t, mfloat8x16_t, mfloat8x16_t,
fpm_t)
float16x8_t vmlaltq_f16_mf8_fpm(float16x8_t, mfloat8x16_t, mfloat8x16_t,
fpm_t)
        
float32x4_t vmlallbbq_f32_mf8_fpm(float32x4_t, mfloat8x16_t,
mfloat8x16_t, fpm_t)
float32x4_t vmlallbtq_f32_mf8_fpm(float32x4_t, mfloat8x16_t,
mfloat8x16_t, fpm_t)
float32x4_t vmlalltbq_f32_mf8_fpm(float32x4_t, mfloat8x16_t,
mfloat8x16_t, fpm_t)
float32x4_t vmlallttq_f32_mf8_fpm(float32x4_t, mfloat8x16_t,
mfloat8x16_t, fpm_t)

* Floating-point multiply-add long to half-precision (vector, by
element)

float16x8_t vmlalbq_lane_f16_mf8_fpm(float16x8_t vd, mfloat8x16_t vn,
mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm)
float16x8_t vmlalbq_laneq_f16_mf8_fpm(float16x8_t vd, mfloat8x16_t vn,
mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm)
float16x8_t vmlaltq_lane_f16_mf8_fpm(float16x8_t vd, mfloat8x16_t vn,
mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm)
float16x8_t vmlaltq_laneq_f16_mf8_fpm(float16x8_t vd, mfloat8x16_t vn,
mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm)
    
* Floating-point multiply-add long-long to single-precision (vector, by
element)

float32x4_t vmlallbbq_lane_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn,
mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm)
float32x4_t vmlallbbq_laneq_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn,
mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm)
float32x4_t vmlallbtq_lane_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn,
mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm)
float32x4_t vmlallbtq_laneq_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn,
mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm)
float32x4_t vmlalltbq_lane_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn,
mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm)
float32x4_t vmlalltbq_laneq_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn,
mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm)
float32x4_t vmlallttq_lane_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn,
mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm)
float32x4_t vmlallttq_laneq_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn,
mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm)
2025-01-28 00:38:44 +00:00
Momchil Velikov
804b81d39f
[AArch64] Add FP8 Neon intrinsics for dot-product (#123613)
This patch adds the following intrinsics:

float16x4_t vdot_f16_mf8_fpm(float16x4_t vd, mfloat8x8_t vn, mfloat8x8_t
vm, fpm_t fpm)
float16x8_t vdotq_f16_mf8_fpm(float16x8_t vd, mfloat8x16_t vn,
mfloat8x16_t vm, fpm_t fpm)
    
float16x4_t vdot_lane_f16_mf8_fpm(float16x4_t vd, mfloat8x8_t vn,
mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm)
float16x4_t vdot_laneq_f16_mf8_fpm(float16x4_t vd, mfloat8x8_t vn,
mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm)
float16x8_t vdotq_lane_f16_mf8_fpm(float16x8_t vd, mfloat8x16_t vn,
mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm)
float16x8_t vdotq_laneq_f16_mf8_fpm(float16x8_t vd, mfloat8x16_t vn,
mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm)
    
float32x2_t vdot_f32_mf8_fpm(float32x2_t vd, mfloat8x8_t vn, mfloat8x8_t
vm, fpm_t fpm)
float32x4_t vdotq_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn,
mfloat8x16_t vm, fpm_t fpm)

float32x2_t vdot_lane_f32_mf8_fpm(float32x2_t vd, mfloat8x8_t vn,
mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm)
float32x2_t vdot_laneq_f32_mf8_fpm(float32x2_t vd, mfloat8x8_t vn,
mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm)
float32x4_t vdotq_lane_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn,
mfloat8x8_t vm, __builtin_constant_p(lane), fpm_t fpm)
float32x4_t vdotq_laneq_f32_mf8_fpm(float32x4_t vd, mfloat8x16_t vn,
mfloat8x16_t vm, __builtin_constant_p(lane), fpm_t fpm)
2025-01-27 21:14:16 +00:00
Momchil Velikov
99bd2e3f12
[AArch64] Add Neon FP8 conversion intrinsics (#123612)
The patch adds the following intrinsics:

    bfloat16x8_t vcvt1_bf16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm)
    bfloat16x8_t vcvt1_low_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    bfloat16x8_t vcvt2_bf16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm)
    bfloat16x8_t vcvt2_low_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    
    bfloat16x8_t vcvt1_high_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    bfloat16x8_t vcvt2_high_bf16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    
    float16x8_t vcvt1_f16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm)
    float16x8_t vcvt1_low_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    float16x8_t vcvt2_f16_mf8_fpm(mfloat8x8_t vn, fpm_t fpm)
    float16x8_t vcvt2_low_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    
    float16x8_t vcvt1_high_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    float16x8_t vcvt2_high_f16_mf8_fpm(mfloat8x16_t vn, fpm_t fpm)
    
mfloat8x8_t vcvt_mf8_f32_fpm(float32x4_t vn, float32x4_t vm, fpm_t fpm)
mfloat8x16_t vcvt_high_mf8_f32_fpm(mfloat8x8_t vd, float32x4_t vn,
float32x4_t vm, fpm_t fpm)
    
mfloat8x8_t vcvt_mf8_f16_fpm(float16x4_t vn, float16x4_t vm, fpm_t fpm)
mfloat8x16_t vcvtq_mf8_f16_fpm(float16x8_t vn, float16x8_t vm, fpm_t
fpm)

Co-Authored-By: Caroline Concatto <caroline.concatto@arm.com>
2025-01-27 17:32:47 +00:00
Jeremy Morse
e14962a39c
[NFC][DebugInfo] Use iterators for instruction insertion in more places (#124291)
As part of the "RemoveDIs" work to eliminate debug intrinsics, we're
replacing methods that use Instruction*'s as positions with iterators.
This patch changes some more complex call-sites, those crossing file
boundaries and where I've had to perform some minor rewrites.
2025-01-27 15:25:17 +00:00