This PR implements a semantic checker to ensure the legality of nested
OpenACC parallelism. The following are quotes from Spec 3.3. We need to
disallow loops from having parallelism at the same level as or at a
sub-level of child loops.
>**2.9.2 gang clause**
>[2064] <ins>When the parent compute construct is a parallel
construct</ins>, or on an orphaned loop construct, the gang clause
behaves as follows. (...) The associated dimension is the value of the
dim argument, if it appears, or is dimension one. The dim argument must
be a constant positive integer with value 1, 2, or 3.
>[2112] The region of a loop with a gang(dim:d) clause may not contain a
loop construct with a gang(dim:e) clause where e >= d unless it appears
within a nested compute region.
>[2074] <ins>When the parent compute construct is a kernels
construct</ins>, the gang clause behaves as follows. (...)
>[2148] The region of a loop with the gang clause may not contain
another loop with a gang clause unless within a nested compute region.
>**2.9.3 worker clause**
>[2122]/[2129] The region of a loop with the worker clause may not
contain a loop with the gang or worker clause unless within a nested
compute region.
>**2.9.4 vector clause**
>[2141]/[2148] The region of a loop with the vector clause may not
contain a loop with a gang, worker, or vector clause unless within a
nested compute region.
https://openacc.org/sites/default/files/inline-images/Specification/OpenACC-3.3-final.pdf
The vk::binding attribute allows users to explicitly set the set and
binding for a resource in SPIR-V without chaning the "register"
attribute, which will be used when targeting DXIL.
Fixes https://github.com/llvm/llvm-project/issues/136894
For `select`, we don't have the equivalent of the branch probability analysis to offer defaults, so we make up our own and allow their overriding with flags.
Issue #147390
The codegen for the final class dynamic_cast optimization fails to
consider pointer authentication. This change resolves this be simply
disabling the optimization when pointer authentication enabled.
This was set if `TT.isTargetAEABI()`. This was previously set above
if `TM.isAAPCS_ABI() && (TT.isTargetAEABI() || TT.isTargetGNUAEABI() ||
TT.isTargetMuslAEABI() || TT.isAndroid())`.
So this could differ based on a manually specified -target-abi flag due
to the `isAAPCS_ABI` part of the original condition. I'm guessing
these should be consistent, so either this second group of
setLibcallImpl
calls should have been guarded by the `isAAPCS_ABI` check, or the first
condition should remove it.
There doesn't appear to be any meaningful test coverage using the
manually specified ABI option, so #152108 tries to remove it
In https://github.com/llvm/llvm-project/pull/149248, clang-format
applied some formatting to lines untouched by that PR, because the
existing code is not clang-format compliant. Hence applying clang-format
on the entire files here.
Fixes corner cases of https://github.com/llvm/llvm-project/pull/151406:
- Don't run TestPTrace() on SPARC, because internal_fork() on SPARC
actually calls __fork(). We can't safely __fork(), because it's possible
seccomp has been configured to disallow fork() but allow clone().
- if internal_fork() fails for whatever reason, we shouldn't give up. It
is strictly worse to give up early than to attempt StopTheWorld.
Also updates some comments/TODOs.
The `EmulateInstructionARM64::EvaluateInstruction()` method used
`uint32_t` to store the PC value, causing addresses greater than 2^32 to
be truncated.
As for now, the issue does not affect the mainline because the method is
never called with both `eEmulateInstructionOptionAutoAdvancePC` and
`eEmulateInstructionOptionIgnoreConditions` options set. However, it can
trigger on a downstream that uses software stepping.
Summary:
This being weak forces the external reference to be weak. Either we
define it weak or not by pulling it from `libc`. Doing it here causes it
to not be extracted properly.
A initial commit for #97929, this adds a stub implementation for
`dladdr` and includes the definition for the `DL_info` type used as one
of its arguments.
While the `dladdr` implementation relies on dynamic linker support, this
patch will add its prototype in the generated `dlfcn.h` header so that
it can be used by downstream platforms that have their own `dladdr`
implementation.
* `tools/llvm-objcopy/MachO/update-section-object.test` was failing on
Windows since the input file (`macho_sections.s`) might be checked out
with the wrong line ending, resulting in difference in the size of
sections being checked.
* Removed the check for Windows in `AArch64Arm64ECCallLowering`: when
`llc` is run without an explicit target, the module's target triple is
unknown so this assert fires.
* Expect `llvm/test/CodeGen/Generic/allow-check.ll` to fail for Arm64EC:
Global ISel is not supported.
HandleMemberPointerAccess considered whether the lvalue path in a member
pointer access matched the bases of the containing class of the member,
but neglected to check the same for the containing class of the member
itself, thereby ignoring access attempts to members in direct sibling
classes.
Fixes#150705.
Fixes#150709.
Reason is UpdateNdOffset source operand not retaining the layouts when
it is yielded by the warp op. `warp_execute_on_lane0` op expects that
TensorDesc type is unchanged during distribution out of its region. we
use UnrealizedCasts to reconcile this mismatch outside the warpOp (via
`resolveDistributedTy`)
This moves all the generic MCP code into the new ProtocolMCP library so
it can be shared between the MCP implementation in LLDB (built on the
private API) and lldb-mcp (built on the public API).
This PR doesn't include the actual MCP server. The reason is that the
transport mechanism will be different between the LLDB implementation
(using sockets) and the lldb-mcp implementation (using stdio). The goal
s to abstract away that difference by making the server use
JSONTransport (again) but I'm saving that for a separate PR.
Module files shouldn't ever produce parsing errors, and if they did in
the case of a badly-generated module file, the compiler will notice and
crash. So we can run the parser on module files with message deferral
enabled, and that saves time that would otherwise be spent generating
messages on failed parsing alternatives that are discarded anyway when
backtracking. It's not a big savings (single digit percentage on overall
compilation time for a big application with lots of modules), but worth
doing.
InputNamelist() returns early if any value list read in by
InputDerivedType() or DescriptorIo<Input>() is empty, since they return
false. But an empty value list is okay, and the early return should
occur only on error.
Fixes https://github.com/llvm/llvm-project/issues/151756.
For more accurate compatibility with other compilers' extensions, accept
a bare exponent letter as valid real input to a formatted READ statement
only in a fixed-width input field. So it works with (G1.0) editing, but
not (G)/(D)/(E)/(F) or list-directed input.
Fixes https://github.com/llvm/llvm-project/issues/151465.
When NAMELIST input takes place on a derived type, we need to preserve
the type in the descriptor that is created for storage sequence
association. Further, the fact that any child list input in within the
context of a NAMELIST must be inherited so that input fields don't try
to consume later "variable=" strings.
Fixes https://github.com/llvm/llvm-project/issues/151222.
An initial commit for #149911, this adds a stub implementation for
dlinfo and the enums list of `RTLD_DI_*` values.
While the dlinfo implementation relies on dynamic linker support, this
patch will add its prototype in the generated dlfcn.h header so that it
can be used by downstream platforms that have their own dlinfo
implementation.
On X86-64, LLVM currently generates the same DWARF debug info for `call
rax` and `call [rax]`; in both cases, the generated DWARF claims that
the call goes to address RAX. This bug occurs because the X86 machine
instructions CALL64r and CALL64m both receive register operands, but
those register operands have different semantics.
To fix it, change DwarfDebug::constructCallSiteEntryDIEs() to validate
the callee operand's semantics (`OperandType`) and make sure it is not
semantically describing a memory location.
This fix will result in less DW_TAG_call_site and DW_AT_call_target
entries being generated.
There is an existing test in dwarf-callsite-related-attrs.ll that
asserts the broken behavior; remove the broken check, and instead add a
new test dwarf-callsite-related-attrs-indirect.ll that checks behavior
for indirect calls.
The existing test xray-custom-log.ll is validating something even more
broken: It checks the debug info generated by a PATCHABLE_EVENT_CALL.
`TII->getCalleeOperand()` assumes that the first argument of a call
instruction is always the destination, but the first argument of
PATCHABLE_EVENT_CALL is instead the event structure; and so we were
emitting debug info claiming the callee was stored in a register that
actually contains some kind of xray event descriptor, and the test
validates that this happens.
I am breaking and deleting this test.
I guess the intent there might have been to validate that we emit
debuginfo referencing the target of the direct call that LLVM emits
(which we don't do)? But I'm not sure.
I find myself doing this alot
```
for (unsigned varIndex = rel.getVarKindOffset(VarKind::Domain);
varIndex < rel.getVarKindEnd(VarKind::Domain); ++varIndex) {
...
}
```
Adding this convenience method so I can instead do
```
for (unsigned varIndex : rel.iterVarKind(VarKind::Domain)) {
...
}
```
---------
Co-authored-by: Jeremy Kun <j2kun@users.noreply.github.com>
Update handling of canonical IV resume phi for the epilogue loop to make
sure the resume phi for the canonical IV is always the first phi in the
scalar preheader.
This makes it easier to retrieve it in preparePlanForEpilogueVectorLoop.
For now, we keep an assert to make sure we use the same resume phi as
before. This will be removed in the future.
GlobalVariableOp describes that built_in specifies SPIR-V BuiltIn
decoration associated with the op. The attribute was defined as builtin
in the tablegen (no uderscore). This was causing correct
GlobalVariableOp decorations like: built_in("GlobalInvocationId") to be
saved as a new attribute making it impossible to access the built_in
attribute through getBuiltinAttr.
`PlatformList::Create()` added an item to the list even when
`Platform::Create()` returned `nullptr`. Other methods use these items
without checking, which can lead to a crash. For example:
```
> lldb
(lldb) platform select unknown
error: unable to find a plug-in for the platform named "unknown"
(lldb) file a.out-arm64
PLEASE submit a bug report to...
Stack dump:
0. Program arguments: lldb
1. HandleCommand(command = "file a.out-arm64 ")
...
```
This fixes failures in a number of tests, in the
clang-cl-no-vcruntime configuration (where libcxx provides dummy, no-op
replacements of some vcruntime base exception classes), if building with
optimization enabled.
Previously, with optimization enabled, the compiler concluded that these
fields would be uninitialized at the points of asserts in the tests.
This fixes the following tests in this configuration:
llvm-libc++-shared-no-vcruntime-clangcl.cfg.in ::
std/language.support/support.dynamic/alloc.errors/bad.alloc/bad_alloc.pass.cpp
llvm-libc++-shared-no-vcruntime-clangcl.cfg.in ::
std/language.support/support.dynamic/alloc.errors/new.badlength/bad_array_new_length.pass.cpp
llvm-libc++-shared-no-vcruntime-clangcl.cfg.in ::
std/language.support/support.exception/bad.exception/bad_exception.pass.cpp
llvm-libc++-shared-no-vcruntime-clangcl.cfg.in ::
std/language.support/support.exception/exception/exception.pass.cpp
llvm-libc++-shared-no-vcruntime-clangcl.cfg.in ::
std/language.support/support.rtti/bad.cast/bad_cast.pass.cpp
llvm-libc++-shared-no-vcruntime-clangcl.cfg.in ::
std/language.support/support.rtti/bad.typeid/bad_typeid.pass.cpp
This patch adds a new set of conformance tests for single-precision math
functions provided by the LLVM libm for GPUs.
The functions included in this set were selected based on the following
criteria:
- An implementation exists in `libc/src/math/generic` (i.e., it is not
just a wrapper around a compiler built-in).
- The corresponding LLVM CPU libm implementation is correctly rounded.
- The function is listed in Table 65 of the OpenCL C Specification
v3.0.19.
Binary I/O (also called buffered I/O) expects bytes-like objects and
produces bytes objects [1]. Switch from using a Python buffer to using
Python bytes to read the data. This eliminates calls to functions that
aren't part of the Python stable C API.
[1] https://docs.python.org/3/library/io.html#binary-i-o