This patch adds support for atomic loads and stores. Specifically, it
adds support for the following intrinsic calls:
- `__atomic_load` and `__atomic_store`;
- `__c11_atomic_load` and `__c11_atomic_store`.
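For context, a minimal sketch of source code that exercises these intrinsics. The helper names are hypothetical, and the `__c11_atomic_*` calls are guarded because they are Clang builtins; other compilers fall back to the `<stdatomic.h>` equivalents.

```c
#include <stdatomic.h>

/* Hypothetical helpers; only the builtin names come from the patch. */

/* GCC-style generic builtins: the value travels through a pointer. */
int gnu_load(int *src) {
    int result;
    __atomic_load(src, &result, __ATOMIC_SEQ_CST);
    return result;
}

void gnu_store(int *dst, int value) {
    __atomic_store(dst, &value, __ATOMIC_SEQ_CST);
}

/* C11-style builtins: operate on _Atomic objects and pass values directly. */
int c11_load(_Atomic int *src) {
#if defined(__clang__)
    return __c11_atomic_load(src, __ATOMIC_SEQ_CST);
#else
    return atomic_load_explicit(src, memory_order_seq_cst);
#endif
}

void c11_store(_Atomic int *dst, int value) {
#if defined(__clang__)
    __c11_atomic_store(dst, value, __ATOMIC_SEQ_CST);
#else
    atomic_store_explicit(dst, value, memory_order_seq_cst);
#endif
}
```

Each store followed by a load of the same object should return the stored value.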
This patch adds the constant attribute to cir.global, adds the
corresponding lowering to an LLVM constant, and updates the tests.
---------
Co-authored-by: Andy Kaylor <akaylor@nvidia.com>
This PR fixes the access to bitfields inside a union.
Previously, we were using a `getMemberOp` to access the field, but
because it is a union, `getMemberOp` would always use index `0`.
For example, given:
```c
typedef union {
int x;
int y : 4;
int z : 8;
} demo;
```
```mlir
!rec_demo = !cir.record<union "demo" {!s32i, !u8i, !u8i}>
```
In the case of:
```c
d.y = 2;
```
It would generate:
```mlir
cir.get_member %0[0] {name = "y"} : !cir.ptr<!rec_demo> -> !cir.ptr<!s32i>
```
with a return type of `!s32i`, when it should be `!u8i`.
The `get_member` verifier would then detect that the return type does
not match the `y` member.
To fix this, we now use `bitcast` to get the start of the union.
If an array initializer list leaves eight or more elements that require
zero fill, we had been generating an individual zero element for every
one of them. This change instead follows the behavior of classic
codegen, which creates a constant structure with the specified elements
followed by a zero-initializer for the trailing zeros.
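As a rough illustration (the array name and sizes are made up for this sketch), an initializer like the following leaves fourteen trailing elements to be zero-filled, which is the kind of case this change targets:

```c
/* Hypothetical example: 2 explicit elements, 14 trailing zeros (>= 8). */
int table[16] = {1, 2};

/* Sums the implicitly zero-filled tail; always 0 by the C initialization rules. */
int tail_sum(void) {
    int sum = 0;
    for (int i = 2; i < 16; ++i)
        sum += table[i];
    return sum;
}
```

The observable values are the same either way; only the shape of the emitted constant changes.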
This patch is the last of the 'firstprivate' clause lowering patches. It
takes the already generated 'copy' init from Sema and uses it to
generate the IR for the copy section of the recipe.
However, one thing this patch had to do was come up with a way to
hijack the decl registration in CIRGenFunction. Because these decls are
being created in a 'different' place, we need to remove the things we've
added. We could alternatively generate these 'differently', but it seems
worth a little extra effort here to avoid having to re-implement
variable initialization.
Depends on #153625
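A minimal source-level sketch of a `firstprivate` clause, for orientation only. The function is hypothetical, and with a scalar like this the copy initialization is trivial; the interesting cases are types with non-trivial copy construction.

```c
/* Hypothetical function. Compiled without OpenACC support, the pragma is
   ignored and the loop runs serially with the same result. */
int scaled_sum(int n, int scale) {
    int total = 0;
    /* Each gang gets its own copy of `scale`, initialized from the original. */
#pragma acc parallel loop firstprivate(scale) reduction(+ : total)
    for (int i = 0; i < n; ++i)
        total += i * scale;
    return total;
}
```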
This patch adds support for statement expressions. It also changes
emitCompoundStmt and emitCompoundStmtWithoutScope to accept an Address
that the optional result is written to. This allows the creation of the
alloca ahead of the creation of the scope which saves us from hoisting
the alloca to its parent scope.
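For reference, a small sketch of a GNU statement expression (the function name is hypothetical). The value of the last expression in the braces is the result of the whole construct, which is the optional result mentioned above.

```c
/* Hypothetical helper using a GNU statement expression. */
int clamp_to_byte(int v) {
    int r = ({
        int t = v;          /* local to the statement expression */
        if (t < 0)
            t = 0;
        if (t > 255)
            t = 255;
        t;                  /* value of the whole ({ ... }) expression */
    });
    return r;
}
```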
This change adds support for calling virtual functions. This includes
adding the cir.vtable.get_virtual_fn_addr operation to look up the
address of the function being called from an object's vtable.
This PR upstreams `GotoOp`. It moves some tests from the `goto` test
file to the `label` test file, and adds verify logic to `FuncOp`. The
`GotoSolver`, required for lowering, will be implemented in a future PR.
The original patch to implement basic lowering for firstprivate didn't
have the Sema work to change the name of the variable being generated
from openacc.private.init to openacc.firstprivate.init. I forgot about
that when I merged the Sema changes this morning, so the tests started
failing. This patch fixes them up.
Additionally, as suggested post-commit on #153622, it seems like a good
idea to use an APInt whose width matches the size type, so this patch
switches to that.
This patch implements the basic lowering infrastructure, but does not
quite implement the copy initialization, which requires #153622.
It does however pass verification for the 'copy' section, which just
contains a yield.
This adds ReturnAddrOp and FrameAddrOp, which represent
__builtin_return_address and __builtin_frame_address respectively, along
with their lowering to LLVM.
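A short sketch of the two builtins in use. The wrapper names are hypothetical, and `noinline` is only there so that level 0 refers to the wrapper itself.

```c
/* Hypothetical wrappers; level 0 refers to the current function. */
__attribute__((noinline)) void *my_return_address(void) {
    /* Address the current function will return to. */
    return __builtin_return_address(0);
}

__attribute__((noinline)) void *my_frame_address(void) {
    /* Frame address of the current function. */
    return __builtin_frame_address(0);
}
```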
---------
Co-authored-by: Andy Kaylor <akaylor@nvidia.com>
The #cir.global_view attribute was initially added without support for
the optional index list. This change adds index list support. This is
used when the address of an array or structure member is used as an
initializer.
This patch does not include support for taking the address of a
structure or class member. That will be added later.
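As an illustration (the variable names are invented for this sketch), a global initialized with the address of an array element is the kind of initializer that needs a base symbol plus an index:

```c
/* Hypothetical globals. */
int values[4] = {10, 20, 30, 40};

/* Address of an array element used as a constant initializer. */
int *third = &values[2];
```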
This updates the array initialization loop to use a do..while loop
rather than a fully serialized initialization. It also allows the
initialization of destructed objects when exception handling is not
enabled.
Array initialization when exception handling is enabled remains
unimplemented, but more precise messages are now emitted.
This PR introduces the `LabelOp`, which is required for implementing
`GotoOp` lowering in the future.
Lowering to LLVM IR is **not** included in this patch, since it depends
on the upcoming `GotoSolver`.
The `GotoSolver` traverses the function body, and if it finds a
`LabelOp` without a matching `GotoOp`, it erases the label.
This means our implementation differs from the classic codegen approach,
where labels may be retained even if unused.
Example:
https://godbolt.org/z/37Mvr4MMr
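A small sketch, separate from the linked example, showing one label with a matching `goto` and one without. The function and label names are hypothetical; per the description above, only the second label would be a candidate for erasure.

```c
/* Hypothetical function with a used and an unused label. */
int count_steps(int n) {
    int steps = 0;
again:              /* targeted by the goto below */
    if (n > 0) {
        --n;
        ++steps;
        goto again;
    }
unused:             /* no goto refers to this label */
    return steps;
}
```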
The OpenACC standard is going to change to clarify that init, shutdown,
and set should only have a single architecture in each 'device_type'
clause. This patch implements that restriction.
See: https://github.com/OpenACC/openacc-spec/pull/550
This change introduces the #cir.global_view attribute and adds support
for using that attribute to handle initializing a global variable with
the address of another global variable.
This does not yet include support for the optional list of indices to
get an offset from the base address. Those will be added in a follow-up
patch.
This adds support for initializing the vptr member of a dynamic class in
the constructor of that class.
This does not include support for lowering the
`cir.vtable.address_point` operation to the LLVM dialect. That handling
will be added in a follow-up patch.
This change adds the definition of VTableAddrPointOp and the related
AddressPointAttr to the CIR dialect, along with tests for the parsing
and verification of these elements.
Code to generate this operation will be added in a later change.
This PR adds support for loading and storing volatile bit-field members
according to the AAPCS specification.
> A volatile bit-field must always be accessed using an access width
appropriate to the type of its container, except when any of the
following are true:
>
> * The bit-field container overlaps with a zero-length bit-field.
> * The bit-field container overlaps with a non-bit-field member.
For example, if a bit-field is declared as `int`, the load/store must
use a 32-bit access, even if the field itself is only 3 bits wide.
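A minimal sketch of the case described, assuming a hypothetical register layout. The struct and function names are not from the patch; the values are kept small so they fit whether plain `int` bit-fields are signed or unsigned on the target.

```c
/* Hypothetical layout: both fields share one int-sized container. */
struct device_reg {
    volatile int mode : 3;
    volatile int flags : 5;
};

/* Under AAPCS this read should use an access as wide as the int container. */
int read_mode(struct device_reg *r) {
    return r->mode;
}

/* Read-modify-write of the container; neighbouring fields keep their values. */
void write_mode(struct device_reg *r, int m) {
    r->mode = m;
}
```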
Previously, #151360 implemented 'private' clause lowering, but didn't
properly initialize the variables. This patch adds that initialization so
that the constructor or other initializer is called correctly.
This PR fixes the outdated logic for accumulating bitfields in
`accumulateFields`. The old approach remained after the algorithm was
updated. A non-bitfield member would act as a barrier, causing
`accumulateBitFields` to receive an incomplete range of fields. As a
result, it failed to accumulate them properly when clipping was
necessary.
For reference, in ClangIR we already handle this correctly:
b647f4b97b/clang/lib/CIR/CodeGen/CIRRecordLayoutBuilder.cpp (L711-L714)
The private clause is the first with 'recipes', so a lot of
infrastructure is included here, including some MLIR dialect changes
that make it simple to add a privatization. We'll likely need something
similar for firstprivate and reduction.
Also, we have quite a bit of infrastructure in clause lowering to make
sure most of the cases we could think of are covered.
At the moment, ONLY private is implemented, so all it requires is an
'init' segment (that doesn't call any copy operations), and potentially
a 'destroy' segment. However, actually calling 'init' functions on each
of the elements is not yet properly implemented and will be handled in a
follow-up patch.
This patch implements all of that, and adds tests in a way that will be
useful for firstprivate as well.