1650 Commits

Author SHA1 Message Date
Bill Wendling
712d7dba4f
[Clang] Improve testing for the flexible array member (#89462)
Testing for the name of the flexible array member isn't as robust as
testing the FieldDecl pointers.
2024-04-24 19:39:33 +00:00
Utkarsh Saxena
9d8be24087
Revert "[codegen] Emit missing cleanups for stmt-expr and coro suspensions" and related commits (#88884)
The original change caused widespread breakages in msan/ubsan tests and
causes `use-after-free`. Most likely we are adding more cleanups than
necessary.
2024-04-16 15:30:32 +02:00
Utkarsh Saxena
5a46123ddf
Fix missing dtor in function calls accepting trivial ABI structs (#88751)
Fixes https://github.com/llvm/llvm-project/issues/88478

Promoting the `EHCleanup` to `NormalAndEHCleanup` in `EmitCallArgs`
surfaced another bug with deactivation of normal cleanups. Here we
missed emitting CPP scope ends for deactivated normal cleanups. This
patch also fixes that bug.

We missed emitting CPP scope ends because we remove the `fallthrough`
(clears the insertion point) before deactivating normal cleanups. This
is to make the emitted "normal" cleanup code unreachable. But we still
need to emit CPP scope ends in the original basic block even for a
deactivated normal cleanup.
(This worked correctly before because we did not remove `fallthrough` for
`EHCleanup`s.)
2024-04-16 11:01:03 +02:00
Utkarsh Saxena
89ba7e183e
[codegen] Emit missing cleanups for stmt-expr and coro suspensions [take-2] (#85398)
Fixes https://github.com/llvm/llvm-project/issues/63818 for control flow
out of an expression.

#### Background

Control flow can leave the middle of an expression because of
stmt-expr and coroutine suspensions.

Due to branch-in-expr, we missed running cleanups for the temporaries
constructed in the expression before the branch.
Previously, these cleanups were only added as `EHCleanup` during the
expression and as normal cleanups after the full expression.

Examples of such deferred cleanups include:

`ParenList/InitList`: Cleanups for fields are performed by the
destructor of the object being constructed.
`Array init`: Cleanup for elements of an array is included in the array
cleanup.
`Lifetime-extended temporaries`: reference-binding temporaries in
braced-init are lifetime extended to the parent scope.
`Lambda capture init`: init in the lambda capture list is destroyed by
the lambda object.

---

#### In this PR

In this PR, we change some of the `EHCleanup` cleanups to
`NormalAndEHCleanup` to make sure these are emitted when we see a
branch inside an expression (through statement expressions or coroutine
suspensions).

These are supposed to be deactivated after full expression and destroyed
later as part of the destructor of the aggregate or array being
constructed. To simplify deactivating cleanups, we add two utilities as
well:
* `DeferredDeactivationCleanupStack`: A stack to remember cleanups with
deferred deactivation.
* `CleanupDeactivationScope`: RAII for deactivating cleanups added to
the above stack.

---

#### Deactivating normal cleanups
These were previously `EHCleanups` and not `Normal` and **deactivation**
of **required** `Normal` cleanups had some bugs. These specifically
include deactivating `Normal` cleanups which are not the top of
`EHStack`
[source1](92b56011e6/clang/lib/CodeGen/CGCleanup.cpp (L1319)),
[2](92b56011e6/clang/lib/CodeGen/CGCleanup.cpp (L722-L746)).
This has not been part of our test suite (maybe it was never required
before statement expressions). In this PR, we also fix the emission of
required-deactivated-normal cleanups.
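
The interplay of the two utilities described above can be pictured with a small model. This is a sketch under stated assumptions, not Clang's implementation: `CleanupStack`, `DeactivationScope`, `branchOut` and `simulate` are hypothetical names chosen for illustration, and "emitting" a cleanup is reduced to recording its name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model of a cleanup stack. All names here are hypothetical.
struct Cleanup {
  std::string name;
  bool active;
};

struct CleanupStack {
  std::vector<Cleanup> entries;
  std::vector<std::string> emitted; // cleanups "run" by a branch

  std::size_t push(const std::string &name) {
    entries.push_back({name, true});
    return entries.size() - 1;
  }

  // A branch out of the expression runs every still-active cleanup,
  // innermost first.
  void branchOut() {
    for (std::size_t i = entries.size(); i-- > 0;)
      if (entries[i].active)
        emitted.push_back(entries[i].name);
  }
};

// RAII helper: cleanups registered through it are deactivated when the
// scope ends, i.e. once the full expression has completed and the
// destructor of the enclosing aggregate takes over.
class DeactivationScope {
  CleanupStack &stack;
  std::vector<std::size_t> deferred;

public:
  explicit DeactivationScope(CleanupStack &s) : stack(s) {}
  void pushDeferred(const std::string &name) {
    deferred.push_back(stack.push(name));
  }
  ~DeactivationScope() {
    for (std::size_t idx : deferred)
      stack.entries[idx].active = false;
  }
};

// Returns the cleanups run by a branch taken either in the middle of the
// expression or after the expression has completed.
std::vector<std::string> simulate(bool branchInsideExpression) {
  CleanupStack stack;
  {
    DeactivationScope scope(stack);
    scope.pushDeferred("field a");
    scope.pushDeferred("field b");
    if (branchInsideExpression) {
      stack.branchOut();
      return stack.emitted;
    }
  }
  stack.branchOut();
  return stack.emitted;
}
```

In this model `simulate(true)` reports both field cleanups, innermost first, while `simulate(false)` reports none, because the scope has already deactivated them.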
2024-04-10 12:59:24 +02:00
Axel Lundberg
708c8cd743
Fix "[clang][UBSan] Add implicit conversion check for bitfields" (#87761)
Fix since #75481 got reverted.

- Explicitly set BitfieldBits to 0 to avoid uninitialized field member
for the integer checks:
```diff
-       llvm::ConstantInt::get(Builder.getInt8Ty(), Check.first)};
+      llvm::ConstantInt::get(Builder.getInt8Ty(), Check.first),
+      llvm::ConstantInt::get(Builder.getInt32Ty(), 0)};
```
- `Value **Previous` was erroneously `Value *Previous` in
`CodeGenFunction::EmitWithOriginalRHSBitfieldAssignment`, fixed now.
- Update following:
```diff
-     if (Kind == CK_IntegralCast) {
+     if (Kind == CK_IntegralCast || Kind == CK_LValueToRValue) {
```
The cast kind is CK_LValueToRValue when going from, e.g., char to char, and
CK_IntegralCast otherwise.
- Make sure that `Value *Previous = nullptr;` is initialized (see
1189e87951)
- Add another extensive testcase
`ubsan/TestCases/ImplicitConversion/bitfield-conversion.c`

---------

Co-authored-by: Vitaly Buka <vitalybuka@gmail.com>
2024-04-08 12:30:27 -07:00
Vitaly Buka
029e1d7515
Revert "Revert "Revert "[clang][UBSan] Add implicit conversion check for bitfields""" (#87562)
Reverts llvm/llvm-project#87529

Reverts #87518

https://lab.llvm.org/buildbot/#/builders/37/builds/33262 is still broken
2024-04-03 15:19:03 -07:00
Vitaly Buka
8a5a1b7704
Revert "Revert "[clang][UBSan] Add implicit conversion check for bitfields"" (#87529)
Reverts llvm/llvm-project#87518

Revert is not needed as the regression was fixed with
1189e87951e59a81ee097eae847c06008276fef1.

I assumed the crash and warning are different issues, but according to
https://lab.llvm.org/buildbot/#/builders/240/builds/26629
fixing the warning resolves the crash.
2024-04-03 10:58:39 -07:00
Vitaly Buka
5822ca5a01
Revert "[clang][UBSan] Add implicit conversion check for bitfields" (#87518)
Reverts llvm/llvm-project#75481

Breaks multiple bots, see #75481
2024-04-03 10:27:09 -07:00
Axel Lundberg
450f1952ac
[clang][UBSan] Add implicit conversion check for bitfields (#75481)
This patch implements the implicit truncation and implicit sign change
checks for bitfields using UBSan. E.g.,
`-fsanitize=implicit-bitfield-truncation` and
`-fsanitize=implicit-bitfield-sign-change`.
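
As an illustration of the kind of assignment such a check is meant to flag, here is a minimal sketch. The struct `Flags` and the helper `storeIntoBitfield` are hypothetical; the block only demonstrates the silent value change that the language performs and says nothing about how the sanitizer reports it.

```cpp
#include <cassert>

struct Flags {
  unsigned small : 4; // can hold the values 0..15
};

// Stores `value` into the 4-bit field and returns what the field holds
// afterwards. The implicit conversion keeps only the low 4 bits.
unsigned storeIntoBitfield(unsigned value) {
  Flags f{};
  f.small = value; // implicit truncation whenever value > 15
  return f.small;
}
```

For example, `storeIntoBitfield(20)` yields 4, since 20 is `0b10100` and only the low four bits survive.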
2024-04-03 08:55:03 -04:00
Nathan Gauër
0f61051f54
[clang][HLSL][SPIR-V] Add convergence intrinsics (#80680)
HLSL has wave operations and other kinds of functions which require the
control flow to either be converged, or respect certain constraints as
to where and how to re-converge.

At the HLSL level, convergence is mostly obvious: the control flow
is expected to re-converge at the end of a scope.
Once translated to IR, HLSL scopes disappear. This means we need a way to
communicate convergence restrictions down to the backend.

For this, the SPIR-V backend uses convergence intrinsics. So this commit
adds some code to generate convergence intrinsics when required.

---------

Signed-off-by: Nathan Gauër <brioche@google.com>
2024-03-28 17:18:05 +01:00
Akira Hatanaka
84780af4b0
[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86923)
To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.

This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.

In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.

This reapplies d9a685a9dd589486e882b722e513ee7b8c84870c, which was
reverted because it broke ubsan bots. There seems to be a bug in
coroutine code-gen, which is causing EmitTypeCheck to use the wrong
alignment. For now, pass alignment zero to EmitTypeCheck so that it can
compute the correct alignment based on the passed type (see function
EmitCXXMemberOrOperatorMemberCallExpr).
2024-03-28 06:54:36 -07:00
Akira Hatanaka
f75eebab88
Revert "[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86721)" (#86898)
This reverts commit d9a685a9dd589486e882b722e513ee7b8c84870c.

The commit broke ubsan bots.
2024-03-27 18:14:04 -07:00
Akira Hatanaka
d9a685a9dd
[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86721)
To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.

This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.

In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.

This reapplies 8bd1f9116aab879183f34707e6d21c7051d083b6. The commit
broke msan bots because LValue::IsKnownNonNull was uninitialized.
2024-03-27 12:24:49 -07:00
Akira Hatanaka
b311756450
Revert "[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#67454)" (#86674)
This reverts commit 8bd1f9116aab879183f34707e6d21c7051d083b6.

It appears that the commit broke msan bots.
2024-03-26 07:37:57 -07:00
Akira Hatanaka
8bd1f9116a
[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#67454)
To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.

This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.

In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.
2024-03-25 18:05:42 -07:00
mikaoP
4e3310a813
[clang] Fix OMPT ident flag in combined distribute parallel for pragma (#80987)
Authored-by: Raúl Peñacoba Veigas <rpenacob@bsc.es>
2024-03-12 07:50:35 -04:00
fpasserby
f786881340
[coroutine] Implement llvm.coro.await.suspend intrinsic (#79712)
Implement `llvm.coro.await.suspend` intrinsics, to deal with performance
regression after prohibiting `.await_suspend` inlining, as suggested in
#64945.
Actually, there are three new intrinsics, which directly correspond to
each of three forms of `await_suspend`:
```
void llvm.coro.await.suspend.void(ptr %awaiter, ptr %frame, ptr @wrapperFunction)
i1 llvm.coro.await.suspend.bool(ptr %awaiter, ptr %frame, ptr @wrapperFunction)
ptr llvm.coro.await.suspend.handle(ptr %awaiter, ptr %frame, ptr @wrapperFunction)
```
There are three different versions instead of one, because in the `bool`
case its result is used for resuming via a branch, and in the
`coroutine_handle` case exceptions from `await_suspend` are handled in
the coroutine, and exceptions from the subsequent `.resume()` are
propagated to the caller.

Await-suspend block is simplified down to intrinsic calls only, for
example for symmetric transfer:
```
%id = call token @llvm.coro.save(ptr null)
%handle = call ptr @llvm.coro.await.suspend.handle(ptr %awaiter, ptr %frame, ptr @wrapperFunction)
call void @llvm.coro.resume(%handle)
%result = call i8 @llvm.coro.suspend(token %id, i1 false)
switch i8 %result, ...
```
All await-suspend logic is moved out into a wrapper function, generated
for each suspension point.
The signature of the function is `<type> wrapperFunction(ptr %awaiter,
ptr %frame)` where `<type>` is one of `void` `i1` or `ptr`, depending on
the return type of `await_suspend`.
Intrinsic calls are lowered during `CoroSplit` pass, right after the
split.

Because I'm new to LLVM, I'm not sure if the helper function generation,
calls to them and lowering are implemented in the right way, especially
with regard to various metadata and attributes, i.e. for TBAA. All
things that seemed questionable are marked with `FIXME` comments.

There is another detail: in the case of symmetric transfer, a raw pointer
to the frame of the coroutine that should be resumed is returned from the
helper function and a direct call to `@llvm.coro.resume` is generated.
The C++ standard demands that the `.resume()` method is evaluated. Not
sure how important this is, because code has been generated in the same
way before, sans helper function.
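
The difference between the three forms can be summarized with a small model of what runs next once the wrapper returns. This is only a sketch of the language-level behavior of `await_suspend`; the enum `Next` and the three helper functions are hypothetical and do not correspond to anything in LLVM.

```cpp
#include <cassert>

// Hypothetical model: who continues after each await_suspend form returns.
enum class Next { Caller, CurrentCoroutine, OtherCoroutine };

// void form: the coroutine stays suspended and control goes back to the
// caller or resumer.
Next afterVoidForm() { return Next::Caller; }

// bool form: the result feeds a branch; `false` means "do not suspend",
// so the current coroutine continues immediately.
Next afterBoolForm(bool result) {
  return result ? Next::Caller : Next::CurrentCoroutine;
}

// handle form: the returned handle is resumed (symmetric transfer).
Next afterHandleForm() { return Next::OtherCoroutine; }
```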
2024-03-11 10:00:00 +08:00
gulfemsavrun
23f895f656
[InstrProf] Single byte counters in coverage (#75425)
This patch inserts 1-byte counters instead of 8-byte counters into
llvm profiles for source-based code coverage. The original idea was
proposed as block-cov for PGO, and this patch repurposes that idea for
coverage: https://groups.google.com/g/llvm-dev/c/r03Z6JoN7d4

The current 8-byte counter mechanism adds counters to minimal regions,
and infers the counters in the remaining regions by adding or
subtracting counters. For example, it infers the counter in the if.else
region by subtracting the counters between if.entry and if.then regions
in an if statement. Whenever there is a control-flow merge, it adds the
counters from all the incoming regions. However, we are not going to be
able to infer counters by subtracting two execution counts when using
single-byte counters. Therefore, this patch conservatively inserts
additional counters for the cases where we need to add or subtract
counters.

RFC:
https://discourse.llvm.org/t/rfc-single-byte-counters-for-source-based-code-coverage/75685
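
The if/else case described above can be sketched as follows. This is a hypothetical model, not the instrumentation the patch emits: `WideCounters` and `ByteCounters` are illustrative names, and the 1-byte counter is modeled as a plain "was executed" flag.

```cpp
#include <cassert>
#include <cstdint>

// Model of instrumenting `if (cond) A; else B;`.

// 8-byte counters: only the entry and the then-branch are instrumented;
// the else count is inferred by subtraction.
struct WideCounters {
  std::uint64_t entry = 0;
  std::uint64_t then = 0;
  void run(bool cond) {
    ++entry;
    if (cond)
      ++then;
  }
  std::uint64_t elseCount() const { return entry - then; }
};

// 1-byte counters only record "executed at least once", so subtracting
// them is meaningless and the else region needs a counter of its own.
struct ByteCounters {
  std::uint8_t entry = 0;
  std::uint8_t then = 0;
  std::uint8_t otherwise = 0;
  void run(bool cond) {
    entry = 1;
    if (cond)
      then = 1;
    else
      otherwise = 1;
  }
  bool elseCovered() const { return otherwise != 0; }
};
```

After one run through each branch, `entry - then` is 0 for `ByteCounters` even though the else branch executed, which is why the extra counter is needed.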
2024-02-26 14:44:55 -08:00
Farzon Lotfi
82acec15af
[HLSL] Implementation of dot intrinsic (#81190)
This change implements https://github.com/llvm/llvm-project/issues/70073

HLSL has a dot intrinsic defined here:

https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-dot

The intrinsic itself is defined as a HLSL_LANG LangBuiltin in
Builtins.td.
This is used to associate all the dot product typedefs defined in
hlsl_intrinsics.h
with a single intrinsic check in CGBuiltin.cpp & SemaChecking.cpp.

In IntrinsicsDirectX.td we define the LLVM IR for the dot product.
A few goals were in mind for this IR. First, it should operate only on
vectors. Second, the return type should be the vector element type. Third,
the second parameter vector should be of the same size as the first
parameter. Finally, `a dot b` should be the same as `b dot a`.

In CGBuiltin.cpp, HLSL has built on top of existing clang intrinsics via
EmitBuiltinExpr. The dot product, though, is a language-specific
intrinsic and so is guarded behind getLangOpts().HLSL.
The call chain looks like this: EmitBuiltinExpr -> EmitHLSLBuiltinExpr

For dot product intrinsics, EmitHLSLBuiltinExpr makes a distinction
between vectors and scalars. This is because HLSL supports dot product
on scalars, which simplifies down to a multiply.
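
The scalar/vector distinction and the symmetry goal can be illustrated in plain C++. This is a sketch of the arithmetic only; `dotScalar` and `dotVector` are hypothetical helpers and are not the HLSL intrinsic or the code Clang generates.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Scalar case: the dot product degenerates to a single multiply.
template <typename T> T dotScalar(T a, T b) { return a * b; }

// Vector case: both operands have the same size N and the result has the
// element type T.
template <typename T, std::size_t N>
T dotVector(const std::array<T, N> &a, const std::array<T, N> &b) {
  T sum{};
  for (std::size_t i = 0; i < N; ++i)
    sum += a[i] * b[i];
  return sum;
}
```

With `a = {1, 2, 3}` and `b = {4, 5, 6}`, both `dotVector(a, b)` and `dotVector(b, a)` give 32.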

Sema.h & SemaChecking.cpp saw the addition of
CheckHLSLBuiltinFunctionCall, a language specific semantic validation
that can be expanded for other hlsl specific intrinsics.

Fixes #70073
2024-02-26 10:08:59 -06:00
Pavel Iliin
568babab7e
[AArch64] Implement __builtin_cpu_supports, compiler-rt tests. (#82378)
The patch complements https://github.com/llvm/llvm-project/pull/68919
and adds AArch64 support for the builtin
`__builtin_cpu_supports("feature1+...+featureN")`,
which returns true if all CPU features specified in the argument are
detected. Also, compiler-rt AArch64 native run tests for the feature
detection mechanism were added, and the 'cpu_model' check was fixed after
its refactor was merged in https://github.com/llvm/llvm-project/pull/75635.
The original RFC was https://reviews.llvm.org/D153153
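
The "all features must be detected" semantics of the `+`-separated argument can be modeled as below. This is a sketch only: `supportsAll` is a hypothetical helper operating on an explicit set of names, whereas the real builtin takes a string literal and queries the CPU.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Returns true only when every '+'-separated name in `query` is present
// in `detected`.
bool supportsAll(const std::string &query,
                 const std::set<std::string> &detected) {
  std::size_t start = 0;
  while (true) {
    const std::size_t plus = query.find('+', start);
    const std::size_t len =
        (plus == std::string::npos) ? std::string::npos : plus - start;
    if (detected.count(query.substr(start, len)) == 0)
      return false; // one missing feature fails the whole query
    if (plus == std::string::npos)
      return true; // last name checked
    start = plus + 1;
  }
}
```

Given a detected set of `{"aes", "sha2"}`, the query `"aes+sha2"` succeeds and `"aes+sve2"` fails.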
2024-02-22 23:33:54 +00:00
Erich Keane
f655778300
[OpenACC] Implement AST for OpenACC Compute Constructs (#81188)
'serial', 'parallel', and 'kernel' constructs are all considered
'Compute' constructs. This patch creates the AST type, plus the required
infrastructure for such a type, plus some base types that will be useful
in the future for breaking this up.

The only difference between the three is the 'kind' (plus some minor
clause legalization rules, but those can be differentiated easily
enough), so rather than representing them as separate AST nodes, it
seems to make sense to make them the same.

Additionally, no clause AST functionality is being implemented yet, as
that fits better in a separate patch, and this is enough to get the
'naked' constructs implemented.

This is otherwise an 'NFC' patch, as it doesn't alter execution at all,
so there aren't any tests.  I did this to break up the review workload
and to get feedback on the layout.
2024-02-13 06:02:13 -08:00
Vlad Serebrennikov
35737beaef
[clang][NFC] Annotate CodeGenFunction.h with preferred_type
This helps debuggers to display values in bit-fields in a more helpful way.
2024-02-11 12:11:49 +03:00
Bill Wendling
00b6d032a2
[Clang] Implement the 'counted_by' attribute (#76348)
The 'counted_by' attribute is used on flexible array members. The
argument for the attribute is the name of the field member holding the
count of elements in the flexible array. This information is used to
improve the results of the array bound sanitizer and the
'__builtin_dynamic_object_size' builtin. The 'count' field member must
be within the same non-anonymous, enclosing struct as the flexible array
member. For example:

```
  struct bar;
  struct foo {
    int count;
    struct inner {
      struct {
        int count; /* The 'count' referenced by 'counted_by' */
      };
      struct {
        /* ... */
        struct bar *array[] __attribute__((counted_by(count)));
      };
    } baz;
  };
```

This example specifies that the flexible array member 'array' has the
number of elements allocated for it in 'count':

```
  struct bar;
  struct foo {
    size_t count;
     /* ... */
    struct bar *array[] __attribute__((counted_by(count)));
  };
```

This establishes a relationship between 'array' and 'count';
specifically that 'p->array' must have *at least* 'p->count' number of
elements available. It's the user's responsibility to ensure that this
relationship is maintained throughout changes to the structure.

In the following, the allocated array erroneously has fewer elements
than what's specified by 'p->count'. This would result in an
out-of-bounds access not being detected:

```
  struct foo *p;

  void foo_alloc(size_t count) {
    p = malloc(MAX(sizeof(struct foo),
                   offsetof(struct foo, array[0]) + count *
                       sizeof(struct bar *)));
    p->count = count + 42;
  }
```

The next example updates 'p->count', breaking the relationship
requirement that 'p->array' must have at least 'p->count' number of
elements available:

```
  void use_foo(int index, int val) {
    p->count += 42;
    p->array[index] = val; /* The sanitizer can't properly check this access */
  }
```

In this example, an update to 'p->count' maintains the relationship
requirement:

```
  void use_foo(int index, int val) {
    if (p->count == 0)
      return;
    --p->count;
    p->array[index] = val;
  }
```
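
Why the relationship matters can be shown with a portable model. This is a sketch under stated assumptions: a `std::vector` stands in for the flexible array member, and `Foo`, `accessPassesCheck` and `relationshipHolds` are hypothetical names; the real check is emitted by the sanitizer, not written by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a struct with a counted flexible array member.
struct Foo {
  std::size_t count;      // plays the role of the 'counted_by' field
  std::vector<int> array; // plays the role of the flexible array member
};

// A bounds check derived from the annotation trusts `count`, not the real
// allocation size.
bool accessPassesCheck(const Foo &p, std::size_t index) {
  return index < p.count;
}

// The user's obligation: at least `count` elements must be available.
bool relationshipHolds(const Foo &p) { return p.array.size() >= p.count; }
```

When `count` is larger than the real allocation, an index past the end of the array still passes the check, which is the undetected out-of-bounds access described above.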
2024-01-16 14:26:12 -08:00
Rashmi Mudduluru
a511c1a9ec
Revert "[Clang] Implement the 'counted_by' attribute (#76348)"
This reverts commit 164f85db876e61cf4a3c34493ed11e8f5820f968.
2024-01-15 18:37:52 -08:00
Bill Wendling
164f85db87
[Clang] Implement the 'counted_by' attribute (#76348)
The 'counted_by' attribute is used on flexible array members. The
argument for the attribute is the name of the field member holding the
count of elements in the flexible array. This information is used to
improve the results of the array bound sanitizer and the
'__builtin_dynamic_object_size' builtin. The 'count' field member must
be within the same non-anonymous, enclosing struct as the flexible array
member. For example:

```
  struct bar;
  struct foo {
    int count;
    struct inner {
      struct {
        int count; /* The 'count' referenced by 'counted_by' */
      };
      struct {
        /* ... */
        struct bar *array[] __attribute__((counted_by(count)));
      };
    } baz;
  };
```

This example specifies that the flexible array member 'array' has the
number of elements allocated for it in 'count':

```
  struct bar;
  struct foo {
    size_t count;
     /* ... */
    struct bar *array[] __attribute__((counted_by(count)));
  };
```

This establishes a relationship between 'array' and 'count';
specifically that 'p->array' must have *at least* 'p->count' number of
elements available. It's the user's responsibility to ensure that this
relationship is maintained throughout changes to the structure.

In the following, the allocated array erroneously has fewer elements
than what's specified by 'p->count'. This would result in an
out-of-bounds access not being detected:

```
  struct foo *p;

  void foo_alloc(size_t count) {
    p = malloc(MAX(sizeof(struct foo),
                   offsetof(struct foo, array[0]) + count *
                       sizeof(struct bar *)));
    p->count = count + 42;
  }
```

The next example updates 'p->count', breaking the relationship
requirement that 'p->array' must have at least 'p->count' number of
elements available:

```
  void use_foo(int index, int val) {
    p->count += 42;
    p->array[index] = val; /* The sanitizer can't properly check this access */
  }
```

In this example, an update to 'p->count' maintains the relationship
requirement:

```
  void use_foo(int index, int val) {
    if (p->count == 0)
      return;
    --p->count;
    p->array[index] = val;
  }
```
2024-01-10 22:20:31 -08:00
Nico Weber
2dce77201c
Revert "[Clang] Implement the 'counted_by' attribute (#76348)"
This reverts commit fefdef808c230c79dca2eb504490ad0f17a765a5.

Breaks check-clang, see
https://github.com/llvm/llvm-project/pull/76348#issuecomment-1886029515

Also revert follow-on "[Clang] Update 'counted_by' documentation"

This reverts commit 4a3fb9ce27dda17e97341f28005a28836c909cfc.
2024-01-10 21:05:19 -05:00
Bill Wendling
fefdef808c
[Clang] Implement the 'counted_by' attribute (#76348)
The 'counted_by' attribute is used on flexible array members. The
argument for the attribute is the name of the field member holding the
count of elements in the flexible array. This information is used to
improve the results of the array bound sanitizer and the
'__builtin_dynamic_object_size' builtin. The 'count' field member must
be within the same non-anonymous, enclosing struct as the flexible array
member. For example:

```
  struct bar;
  struct foo {
    int count;
    struct inner {
      struct {
        int count; /* The 'count' referenced by 'counted_by' */
      };
      struct {
        /* ... */
        struct bar *array[] __attribute__((counted_by(count)));
      };
    } baz;
  };
```

This example specifies that the flexible array member 'array' has the
number of elements allocated for it in 'count':

```
  struct bar;
  struct foo {
    size_t count;
     /* ... */
    struct bar *array[] __attribute__((counted_by(count)));
  };
```

This establishes a relationship between 'array' and 'count';
specifically that 'p->array' must have *at least* 'p->count' number of
elements available. It's the user's responsibility to ensure that this
relationship is maintained throughout changes to the structure.

In the following, the allocated array erroneously has fewer elements
than what's specified by 'p->count'. This would result in an
out-of-bounds access not being detected:

```
  struct foo *p;

  void foo_alloc(size_t count) {
    p = malloc(MAX(sizeof(struct foo),
                   offsetof(struct foo, array[0]) + count *
                       sizeof(struct bar *)));
    p->count = count + 42;
  }
```

The next example updates 'p->count', breaking the relationship
requirement that 'p->array' must have at least 'p->count' number of
elements available:

```
  void use_foo(int index, int val) {
    p->count += 42;
    p->array[index] = val; /* The sanitizer can't properly check this access */
  }
```

In this example, an update to 'p->count' maintains the relationship
requirement:

```
  void use_foo(int index, int val) {
    if (p->count == 0)
      return;
    --p->count;
    p->array[index] = val;
  }
```
2024-01-10 15:21:10 -08:00
Alan Phipps
8b2bdfbca7
[Coverage][clang] Enable MC/DC Support in LLVM Source-based Code Coverage (3/3)
Part 3 of 3. This includes the MC/DC clang front-end components.

Differential Revision: https://reviews.llvm.org/D138849
2024-01-04 12:29:18 -06:00
Bill Wendling
cca4d6cfd2
Revert counted_by attribute feature (#75857)
There are many issues that popped up with the counted_by feature. The
patch #73730 has grown too large and approval is blocking Linux testing.

Includes reverts of:
commit 769bc11f684d ("[Clang] Implement the 'counted_by' attribute
(#68750)")
commit bc09ec696209 ("[CodeGen] Revamp counted_by calculations
(#70606)")
commit 1a09cfb2f35d ("[Clang] counted_by attr can apply only to C99
flexible array members (#72347)")
commit a76adfb992c6 ("[NFC][Clang] Refactor code to calculate flexible
array member size (#72790)")
commit d8447c78ab16 ("[Clang] Correct handling of negative and
out-of-bounds indices (#71877)")
Partial commit b31cd07de5b7 ("[Clang] Regenerate test checks (NFC)")

Closes #73168
Closes #75173
2023-12-18 15:16:09 -08:00
Bill Wendling
a76adfb992
[NFC][Clang] Refactor code to calculate flexible array member size (#72790)
The code that calculates the flexible array member size is big enough to
warrant its own method.
2023-11-19 19:25:10 -08:00
Joseph Huber
237adfca4e
[OpenMP] Rework handling of global ctor/dtors in OpenMP (#71739)
Summary:
This patch reworks how we handle global constructors in OpenMP.
Previously, we emitted individual kernels that were all registered and
called individually. In order to provide more generic support, this
patch moves all handling of this to the target backend and the runtime
plugin. This has the benefit of supporting the GNU extensions for
constructors and destructors, removing a class of failures related to
shared library destruction order, and allowing targets other than OpenMP
to use the same support without needing to change the frontend.

This is primarily done by calling kernels that the backend emits to
iterate a list of ctor / dtor functions. For x64, this is automatic and
we get it for free with the standard `dlopen` handling. For AMDGPU, we
emit `amdgcn.device.init` and `amdgcn.device.fini` functions which
handle everything automatically and simply need to be called. For NVPTX,
a patch https://github.com/llvm/llvm-project/pull/71549 provides the
kernels to call, but the runtime needs to set up the array manually by
pulling out all the known constructor / destructor functions.
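
The idea of a single entry point that walks a list of ctor / dtor functions can be sketched on the host as follows. This is an illustrative model only: `Hook`, `runAll` and the four sample functions are hypothetical, and priorities and ordering rules are deliberately left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// A ctor or dtor entry; the log parameter exists only to observe the order.
using Hook = void (*)(std::vector<std::string> &log);

// Hypothetical stand-ins for the functions a backend would collect.
void ctorA(std::vector<std::string> &log) { log.push_back("ctor A"); }
void ctorB(std::vector<std::string> &log) { log.push_back("ctor B"); }
void dtorA(std::vector<std::string> &log) { log.push_back("dtor A"); }
void dtorB(std::vector<std::string> &log) { log.push_back("dtor B"); }

// Plays the role of an init or fini kernel: it simply iterates the list it
// is given and calls each entry in stored order.
void runAll(const std::vector<Hook> &hooks, std::vector<std::string> &log) {
  for (Hook hook : hooks)
    hook(log);
}
```

The runtime then only has to call `runAll` once with the ctor list at load time and once with the dtor list at teardown, rather than registering each function separately.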

One concession that this patch requires is the change that for GPU
targets in OpenMP offloading we will use `llvm.global_dtors` instead of
using `atexit`. This is because `atexit` is a separate runtime function
that does not mesh well with the handling we're trying to do here. This
should be equivalent in all cases except for cases where we would need
to destruct manually such as:

```
struct S { ~S() { foo(); } };
void foo() {
  static S s;
}
```

However this is broken in many other ways on the GPU, so it is not
regressing any support, simply increasing the scope of what we can
handle.

This changes the handling of ctors / dtors. This patch now outputs an
informational message regarding the deprecation if the old format is used.
This will be completely removed in a later release.

Depends on: https://github.com/llvm/llvm-project/pull/71549
2023-11-10 14:53:53 -06:00
Bill Wendling
bc09ec6962
[CodeGen] Revamp counted_by calculations (#70606)
Break down the counted_by calculations so that they correctly handle
anonymous structs, which are specified internally as IndirectFieldDecls.

Improves the calculation of __bdos on a different field member in the struct.
And also improves support for __bdos in an index into the FAM. If the index
is further out than the length of the FAM, then we return __bdos's "can't
determine the size" value (zero or negative one, depending on type).

Also simplify the code to use helper methods to get the field referenced
by counted_by and the flexible array member itself, which also had some
issues with FAMs in sub-structs.
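
The "can't determine the size" values mentioned above can be observed with the static sibling of `__bdos`. This is a sketch assuming a GCC- or Clang-compatible compiler; it uses `__builtin_object_size` rather than `__builtin_dynamic_object_size` for wider availability, and `storage`, `opaque` and the two helper functions are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Reading the pointer through a volatile object hides its origin from the
// compiler, so the size of the pointed-to object is unknown.
static char storage[16];
static char *volatile opaque = storage;

// Type 0 asks for the maximum remaining size; "unknown" is (size_t)-1.
std::size_t maxSizeOrUnknown() {
  char *p = opaque;
  return __builtin_object_size(p, 0);
}

// Type 2 asks for the minimum remaining size; "unknown" is 0.
std::size_t minSizeOrUnknown() {
  char *p = opaque;
  return __builtin_object_size(p, 2);
}
```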
2023-11-09 10:18:17 -08:00
Pravin Jagtap
1f21e49870
Revert "Revert "[AMDGPU] const-fold imm operands of amdgcn_update_dpp intrinsic (#71139)"" (#71669)

This reverts commit d1fb9307951319eea3e869d78470341d603c8363 and fixes
the lit test clang/test/CodeGenHIP/dpp-const-fold.hip

---------

Authored-by: Pravin Jagtap <Pravin.Jagtap@amd.com>
2023-11-09 10:09:22 +05:30
Mitch Phillips
d1fb930795
Revert "[AMDGPU] const-fold imm operands of amdgcn_update_dpp intrinsic (#71139)"
This reverts commit 32a3f2afe6ea7ffb02a6a188b123ded6f4c89f6c.

Reason: Broke the sanitizer buildbots. More details at
32a3f2afe6
2023-11-08 12:50:53 +01:00
Pravin Jagtap
32a3f2afe6
[AMDGPU] const-fold imm operands of amdgcn_update_dpp intrinsic (#71139)
Operands of `__builtin_amdgcn_update_dpp` need to evaluate to constant
to match the intrinsic requirements.

Fixes: SWDEV-426822, SWDEV-431138
---------

Authored-by: Pravin Jagtap <Pravin.Jagtap@amd.com>
2023-11-08 15:09:10 +05:30
Kerry McLaughlin
48fb8ee081
[Clang][SME2] Add multi-vector add/sub builtins (#69725)
Adds the following SME2 builtins:
 - sv(add|sub)
 - sv(add|sub)_za32/za64,
 - sv(add|sub)_write_za32/za64

Other changes in this patch:
 - CGBuiltin.cpp: The GetAArch64SMEProcessedOperands function is created
   to avoid duplicating existing code from EmitAArch64SVEBuiltinExpr.
 - arm_sve.td: The add/sub SME2 builtins which do not operate on ZA have
   been added to arm_sve.td, matching the corresponding LLVM IR intrinsic
   names which start with @llvm.aarch64.sve for this reason.
 - SveEmitter.cpp: Adds the createCoreHeaderIntrinsics function to remove
   duplicated code in createHeader & createSMEHeader. Uses a new enum
   (ACLEKind) to choose either "__builtin_sme_" or "__builtin_sve_" when
   emitting the intrinsics.

See https://github.com/ARM-software/acle/pull/217/files
2023-11-07 15:42:43 +00:00
Kerry McLaughlin
8f59c168a9
[AArch64][Clang] Refactor code to emit SVE & SME builtins (#70959)
This patch removes duplicated code in EmitAArch64SVEBuiltinExpr and
EmitAArch64SMEBuiltinExpr by creating a new function called
GetAArch64SVEProcessedOperands which handles splitting up multi-vector
arguments using vector extracts.

These changes are non-functional.
2023-11-02 15:47:37 +00:00
Kerry McLaughlin
e2550b7aa0
Revert "[AArch64][Clang] Refactor code to emit SVE & SME builtins (#70662)"
This reverts commit c34efe3c2734629b925d9411b3c86a710911a93a.
2023-11-01 15:57:14 +00:00
Kerry McLaughlin
c34efe3c27
[AArch64][Clang] Refactor code to emit SVE & SME builtins (#70662)
This patch removes duplicated code in EmitAArch64SVEBuiltinExpr and
EmitAArch64SMEBuiltinExpr by creating a new function called
GetAArch64SVEProcessedOperands which handles splitting up multi-vector
arguments using vector extracts.

These changes are non-functional.
2023-11-01 15:21:08 +00:00
Caroline Concatto
7cad5a9eb4
[Clang][SVE2.1] Add svpext builtins
As described in: https://github.com/ARM-software/acle/pull/257

Reviewed By: hassnaa-arm

Differential Revision: https://reviews.llvm.org/D151081
2023-10-17 16:15:22 +00:00
Sam Tebbs
4b8f23e93d
[AArch64][SME] Remove immediate argument restriction for svldr and svstr (#68908)
The svldr_vnum_za and svstr_vnum_za builtins/intrinsics currently
require that the vnum argument be an immediate, but since vnum is used
to modify the base register via a mul and add, that restriction is not
necessary. This patch removes that restriction.
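
The address arithmetic behind this argument can be sketched in plain C.
This is a model for illustration only: vnum_address is a hypothetical
helper, and the streaming vector length in bytes is passed in as
svl_bytes rather than read from the hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the base-register update: one multiply and one
   add. 'svl_bytes' stands in for the streaming vector length in bytes. */
static const uint8_t *vnum_address(const uint8_t *base, int64_t vnum,
                                   int64_t svl_bytes) {
  assert(svl_bytes > 0);
  return base + vnum * svl_bytes;
}
```

Because vnum is an ordinary operand of the multiply, nothing in this
computation needs it to be a compile-time constant.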
2023-10-17 16:02:36 +01:00
Bill Wendling
769bc11f68
[Clang] Implement the 'counted_by' attribute (#68750)
The 'counted_by' attribute is used on flexible array members. The
argument for the attribute is the name of the field member in the same
structure holding the count of elements in the flexible array. This
information can be used to improve the results of the array bound
sanitizer and the '__builtin_dynamic_object_size' builtin.

This example specifies that the flexible array member 'array' has
the number of elements allocated for it in 'count':

  struct bar;
  struct foo {
    size_t count;
     /* ... */
    struct bar *array[] __attribute__((counted_by(count)));
  };

This establishes a relationship between 'array' and 'count',
specifically that 'p->array' must have *at least* 'p->count' number of
elements available. It's the user's responsibility to ensure that this
relationship is maintained through changes to the structure.

In the following, the allocated array erroneously has fewer elements
than what's specified by 'p->count'. This would result in an
out-of-bounds access not being detected:

  struct foo *p;

  void foo_alloc(size_t count) {
    p = malloc(MAX(sizeof(struct foo),
                   offsetof(struct foo, array[0]) + count *
                       sizeof(struct bar *)));
    p->count = count + 42;
  }

The next example updates 'p->count', breaking the relationship
requirement that 'p->array' must have at least 'p->count' number of
elements available:

  struct foo *p;

  void foo_alloc(size_t count) {
    p = malloc(MAX(sizeof(struct foo),
                   offsetof(struct foo, array[0]) + count *
                       sizeof(struct bar *)));
    p->count = count + 42;
  }

  void use_foo(int index) {
    p->count += 42;
    p->array[index] = 0; /* The sanitizer cannot properly check this access */
  }
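
For contrast, a minimal sketch of code that keeps the relationship intact
follows. It is not part of the patch: the COUNTED_BY macro and the
foo_alloc_checked and foo_set helpers are hypothetical, and 'array' holds
int rather than 'struct bar *' to keep the sketch short. The macro expands
to nothing on compilers that do not report support for the attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper macro: use the attribute only where the compiler
   reports support for it, so the sketch still builds elsewhere. */
#if defined(__has_attribute)
#  if __has_attribute(counted_by)
#    define COUNTED_BY(member) __attribute__((counted_by(member)))
#  endif
#endif
#ifndef COUNTED_BY
#  define COUNTED_BY(member)
#endif

struct foo {
  size_t count;
  int array[] COUNTED_BY(count);
};

/* Allocate room for exactly 'count' elements and record that same value,
   so 'p->array' really has at least 'p->count' elements. */
static struct foo *foo_alloc_checked(size_t count) {
  struct foo *p;
  if (count > (SIZE_MAX - sizeof(struct foo)) / sizeof(int))
    return NULL; /* the size computation would overflow */
  p = calloc(1, sizeof(struct foo) + count * sizeof(int));
  if (p != NULL)
    p->count = count;
  return p;
}

/* Store 'value' only when 'index' is within the recorded count.
   Returns 1 on success and 0 when the index is out of range. */
static int foo_set(struct foo *p, size_t index, int value) {
  assert(p != NULL);
  if (index >= p->count)
    return 0;
  p->array[index] = value;
  return 1;
}
```

Here 'count' is written once, at allocation time, with the value that was
actually used to size the array, so a bounds check based on 'p->count'
stays meaningful.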

Reviewed By: nickdesaulniers, aaron.ballman

Differential Revision: https://reviews.llvm.org/D148381
2023-10-14 04:18:02 -07:00
alexfh
67b675ee55
Revert "[Clang] Implement the 'counted_by' attribute" (#68603)
This reverts commit 9a954c693573281407f6ee3f4eb1b16cc545033d, which
causes clang crashes when compiling with `-fsanitize=bounds`. See
9a954c6935 (commitcomment-129529574) for details.
2023-10-09 20:53:48 +02:00
Bill Wendling
9a954c6935 [Clang] Implement the 'counted_by' attribute
The 'counted_by' attribute is used on flexible array members. The
argument for the attribute is the name of the field member in the same
structure holding the count of elements in the flexible array. This
information can be used to improve the results of the array bound sanitizer
and the '__builtin_dynamic_object_size' builtin.

This example specifies that the flexible array member 'array' has the
number of elements allocated for it in 'count':

  struct bar;
  struct foo {
    size_t count;
     /* ... */
    struct bar *array[] __attribute__((counted_by(count)));
  };

This establishes a relationship between 'array' and 'count', specifically
that 'p->array' must have *at least* 'p->count' number of elements available.
It's the user's responsibility to ensure that this relationship is maintained
through changes to the structure.

In the following, the allocated array erroneously has fewer elements than
what's specified by 'p->count'. This would result in an out-of-bounds access
not being detected:

  struct foo *p;

  void foo_alloc(size_t count) {
    p = malloc(MAX(sizeof(struct foo),
                   offsetof(struct foo, array[0]) + count *
                       sizeof(struct bar *)));
    p->count = count + 42;
  }

The next example updates 'p->count', breaking the relationship requirement that
'p->array' must have at least 'p->count' number of elements available:

  struct foo *p;

  void foo_alloc(size_t count) {
    p = malloc(MAX(sizeof(struct foo),
                   offsetof(struct foo, array[0]) + count *
                       sizeof(struct bar *)));
    p->count = count + 42;
  }

  void use_foo(int index) {
    p->count += 42;
    p->array[index] = 0; /* The sanitizer cannot properly check this access */
  }

Reviewed By: nickdesaulniers, aaron.ballman

Differential Revision: https://reviews.llvm.org/D148381
2023-10-04 18:26:15 -07:00
Corentin Jabot
af4751738d [C++] Implement "Deducing this" (P0847R7)
This patch implements P0847R7 (partially),
CWG2561 and CWG2653.

Reviewed By: aaron.ballman, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D140828
2023-10-02 14:33:02 +02:00
Chuanqi Xu
572cc8d38f Revert "[C++20] [Coroutines] Mark await_suspend as noinline if the awaiter is not empty"
This reverts commit 9d9c25f81456aace2bec4b58498a420e650007d9.
This reverts commit 19ab2664ad3182ffa8fe3a95bb19765e4ae84653.
This reverts commit c4672454743e942f148a1aff1e809dae73e464f6.

As issue https://github.com/llvm/llvm-project/issues/65018 shows,
the previous fix actually introduced a regression, so this commit reverts
the fix in line with our policies.
2023-08-28 13:21:17 +08:00
Kazu Hirata
5ab7c285fb [CodeGen] Modernize PeepholeProtection (NFC)
2023-08-27 09:24:28 -07:00
Fangrui Song
7a41af8604 [X86] Support arch=x86-64{,-v2,-v3,-v4} for target_clones attribute
GCC 12 (https://gcc.gnu.org/PR101696) allows `arch=x86-64`,
`arch=x86-64-v2`, `arch=x86-64-v3`, and `arch=x86-64-v4` in the
target_clones function attribute. This patch ports the feature.

* Set KeyFeature to `x86-64{,-v2,-v3,-v4}` in `Processors[]`, to be used
  by X86TargetInfo::multiVersionSortPriority
* builtins: change `__cpu_features2` to an array like libgcc. Define
  `FEATURE_X86_64_{BASELINE,V2,V3,V4}` and depended ISA feature bits.
* CGBuiltin.cpp: update EmitX86CpuSupports to handle `arch=x86-64*`.

Close https://github.com/llvm/llvm-project/issues/55830
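
A hedged usage sketch of the attribute follows. The CLONE_X86_64_LEVELS
macro, its compiler-version guard, and the sum_ints function are
assumptions made for illustration; the guard is deliberately conservative
and falls back to a plain function on other targets or older compilers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h> /* on glibc systems this also defines __GLIBC__ */

/* Hypothetical guard: only request clones on x86-64 glibc systems with a
   compiler assumed to accept arch=x86-64-v* in target_clones. */
#if defined(__x86_64__) && defined(__GLIBC__) &&                            \
    ((defined(__clang__) && __clang_major__ >= 18) ||                       \
     (!defined(__clang__) && defined(__GNUC__) && __GNUC__ >= 12))
#  define CLONE_X86_64_LEVELS                                               \
    __attribute__((target_clones("arch=x86-64-v3", "arch=x86-64-v2",        \
                                 "default")))
#else
#  define CLONE_X86_64_LEVELS
#endif

/* One source definition; where the guard holds, the compiler emits one
   clone per listed level plus a resolver that picks one at load time. */
CLONE_X86_64_LEVELS
long sum_ints(const int *values, size_t n) {
  long total = 0;
  assert(values != NULL || n == 0);
  for (size_t i = 0; i < n; ++i)
    total += values[i];
  return total;
}
```

Every clone computes the same result; only the instructions the compiler
is allowed to use differ between them.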

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D158329
2023-08-23 22:08:55 -07:00
Manna, Soumi
30c60ec52f [NFC][CLANG] Fix static analyzer bugs about large copy by values
The static analyzer tool complains about large function call parameters that are passed by value in CGBuiltin.cpp.

1. In CodeGenFunction::EmitSMELdrStr(clang::SVETypeFlags, llvm::SmallVectorImpl<llvm::Value *> &, unsigned int): We are passing parameter TypeFlags of type clang::SVETypeFlags by value.

2. In CodeGenFunction::EmitSMEZero(clang::SVETypeFlags, llvm::SmallVectorImpl<llvm::Value *> &, unsigned int): We are passing parameter TypeFlags of type clang::SVETypeFlags by value.

3. In CodeGenFunction::EmitSMEReadWrite(clang::SVETypeFlags, llvm::SmallVectorImpl<llvm::Value *> &, unsigned int): We are passing parameter TypeFlags of type clang::SVETypeFlags by value.

4. In CodeGenFunction::EmitSMELd1St1(clang::SVETypeFlags, llvm::SmallVectorImpl<llvm::Value *> &, unsigned int): We are passing parameter TypeFlags of type clang::SVETypeFlags by value.

Many other places in CGBuiltin.cpp already pass the TypeFlags parameter of type clang::SVETypeFlags by reference.

clang::SVETypeFlags inherits several other types.

This patch passes the TypeFlags parameter by reference instead of by value in these functions.

Reviewed By: tahonermann, sdesmalen

Differential Revision: https://reviews.llvm.org/D158522
2023-08-23 07:57:04 -07:00
Chuanqi Xu
c467245474 [C++20] [Coroutines] Mark await_suspend as noinline if the awaiter is not empty
Close https://github.com/llvm/llvm-project/issues/56301
Close https://github.com/llvm/llvm-project/issues/64151

See the summary and the discussion of https://reviews.llvm.org/D157070
to get the full context.

As @rjmccall pointed out, the root cause is that we currently don't
implement the semantics of '@llvm.coro.save' well ("after the await-ready
returns false, the coroutine is considered to be suspended").
These semantics imply that we (the compiler) shouldn't write spills
into the coroutine frame in await_suspend. But that is currently possible
due to some combinations of optimizations, so the semantics are
broken, and inlining is the optimization that enables the others.
So in this patch, we add the `noinline` attribute to the
await_suspend call.

Also as an optimization, we don't add the `noinline` attribute to the
await_suspend call if the awaiter is an empty class. This should be
correct since the programmers can't access the local variables in
await_suspend if the awaiter is empty. I think this is necessary for the
performance since it is pretty common.

Another potential optimization is:

    call @llvm.coro.await_suspend(ptr %awaiter, ptr %handle,
                                  ptr @awaitSuspendFn)

Then it is much easier to perform the safety analysis in the middle
end.
If it is safe to inline the call to awaitSuspend, we can replace it
in the CoroEarly pass. Otherwise we could replace it in the CoroSplit
pass.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D157833
2023-08-22 09:56:44 +08:00