This PR adds a test for parsing the bitfield_info attribute.
Additionally, it updates the `storage_type` and `is_signed` fields to
match the style used in the incubator ASM format guide.
Collect all spellings from all supported OpenMP versions before parsing.
Break up the list of spellings by the initial letter to speed up parsing
a little.
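A minimal sketch of the bucketing idea, not the actual parser code: the `SpellingIndex` class, its 26-entry bucket array, and the longest-first ordering are all assumptions made for illustration. It only does prefix matching on lowercase ASCII and performs no word-boundary check.

```cpp
#include <algorithm>
#include <array>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical index: spellings grouped by their first letter ('a'..'z').
class SpellingIndex {
public:
  explicit SpellingIndex(const std::vector<std::string> &Spellings) {
    for (const std::string &S : Spellings) {
      if (S.empty() || S[0] < 'a' || S[0] > 'z')
        continue; // This sketch only indexes lowercase ASCII spellings.
      Buckets[S[0] - 'a'].push_back(S);
    }
    // Longest first, so "target teams" is tried before "target".
    for (auto &B : Buckets)
      std::sort(B.begin(), B.end(),
                [](const std::string &L, const std::string &R) {
                  return L.size() > R.size();
                });
  }

  // Returns the longest spelling that is a prefix of Input, or "" if none.
  // Only the bucket for Input's first letter is scanned.
  std::string_view match(std::string_view Input) const {
    if (Input.empty() || Input[0] < 'a' || Input[0] > 'z')
      return {};
    for (const std::string &S : Buckets[Input[0] - 'a'])
      if (Input.substr(0, S.size()) == S)
        return S;
    return {};
  }

private:
  std::array<std::vector<std::string>, 26> Buckets;
};
```

The returned view points into the index, so it stays valid only while the `SpellingIndex` object is alive.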
In the case of nested loops, `acc.loop` is meant to subsume all of the
loops that it applies to (when explicitly described as doing so in the
OpenACC specification). So when there is an `acc loop tile(...)` present
on nested Fortran DO loops, `acc.loop` should apply to the `n` loops
that `tile` applies to. This change lowers such nested Fortran loops
with tile clause into a collapsed `acc.loop` with `n` IVs, loop bounds,
and step, in a similar fashion to the current lowering for acc loops
with `collapse` clause.
On most operating systems, the x16 and x17 registers are not special,
so there is no benefit, and only a code size cost, to constraining AUT to
only using them. Therefore, adjust the backend to only use the AUT pseudo
(renamed AUTx16x17 for clarity) on Darwin platforms. All other platforms
use an unconstrained variant of the pseudo, AUTxMxN, for selection.
Reviewers: ahmedbougacha, kovdan01, atrosinenko
Reviewed By: atrosinenko
Pull Request: https://github.com/llvm/llvm-project/pull/132857
This PR changes some Breakpoint-related interfaces to return errors. On
its own these improvements are small, but they encourage better error
handling going forward. There are a bunch of other candidates, but these
were the functions that I touched while working on #146602.
When true16 is enabled, isel starts to emit sgpr_lo16 registers when a
trunc/sext i16/i32 is generated, or when a salu32 is used by a vgpr16 or
vice versa. This causes a problem because sgpr_lo16 is not fully supported
in the pipeline.
True16 mode works fine at -O3 because the folding pass removes sgpr_lo16
from the pipeline. However, it hits a problem at -O0 because the folding
pass is skipped.
This patch:
1. stops emitting sgpr_lo16 from isel
2. updates the codegen patterns to split uniform/divergent patterns for
i16/i32 conversion
3. updates the fix-sgpr-copy pass to address the legalization requirement
in true16 mode, and updates the fix-sgpr-copies-f16-true16.mir
test to include all possible combinations
This patch is tested with CTS and a downstream repo with -O0 testing.
Anatoly Trosinenko found that when hasSideEffect was set to 0 in the
definition of LOADgotAUTH, the MultiSource/Benchmarks/Ptrdist/ks/ks test
from llvm-test-suite started to crash. The issue was traced down to the
MachineLICM pass placing LOADgotAUTH right after an unrelated copy to
x16, rewriting this code:
````
bb.0:
renamable $x16 = COPY renamable $x12
B %bb.1
bb.1:
...
/* use $x16 */
...
renamable $x20 = LOADgotAUTH target-flags(aarch64-got) @some_variable, implicit-def dead $x16, implicit-def dead $x17, implicit-def dead $nzcv
/* use $x20 */
...
````
into the following:
````
bb.0:
renamable $x16 = COPY renamable $x12
renamable $x20 = LOADgotAUTH target-flags(aarch64-got) @some_variable, implicit-def dead $x16, implicit-def dead $x17, implicit-def dead $nzcv
B %bb.1
bb.1:
...
/* use $x16 */
...
/* use $x20 */
...
````
The issue was caused by inconsistent logic between implicit and explicit
operand definitions, where the implicit side was incorrectly skipping
checking RUDefs for dead operands, leading to RuledOut not being set
for the X16 operand.
Because there isn't really a semantic difference between implicit and
explicit operands at this point, let's remove the isImplicit check and
adjust the logic to do the same thing in both cases:
- For implicit operands, we now check and update RUDefs in the same way
as explicit operands.
- For explicit operands, we now allow dead operands to be skipped.
Reviewers: arsenm, s-barannikov, atrosinenko
Reviewed By: arsenm, s-barannikov
Pull Request: https://github.com/llvm/llvm-project/pull/147624
This PR enables support for BFloat16 type in LLVM libc along with
support for testing BFloat16 functions via MPFR.
---------
Signed-off-by: krishna2803 <kpandey81930@gmail.com>
Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
Co-authored-by: OverMighty <its.overmighty@gmail.com>
Make `AppendZero` a class member instead of an argument to
`GetOrAddStringOffset` to reflect the intended usage that for a given
`StringToOffsetTable`, all strings must use the same value of
`AppendZero`.
Modify `EmitStringTableDef` to drop the `Indent` argument as it's always
set to `""`, and to fail if it's called for a table with
non-null-terminated strings.
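The design point can be sketched with a simplified stand-in. `OffsetTableSketch` below is hypothetical and is not the LLVM `StringToOffsetTable` class; it only shows why fixing the terminator policy at construction keeps every string in one table consistent.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Simplified stand-in, not the LLVM class: the terminator policy is chosen
// once at construction, so all strings in a table are stored the same way.
class OffsetTableSketch {
public:
  explicit OffsetTableSketch(bool AppendZero = true)
      : AppendZero(AppendZero) {}

  // Returns the offset of Str in the aggregate buffer, adding it if needed.
  std::size_t getOrAddStringOffset(const std::string &Str) {
    auto [It, Inserted] = Offsets.try_emplace(Str, Buffer.size());
    if (Inserted) {
      Buffer += Str;
      if (AppendZero)
        Buffer += '\0';
    }
    return It->second;
  }

  bool isNullTerminated() const { return AppendZero; }
  const std::string &buffer() const { return Buffer; }

private:
  bool AppendZero;
  std::map<std::string, std::size_t> Offsets;
  std::string Buffer;
};
```

With a per-call flag, two callers could add the same table's strings with different terminator settings; with a member, an emitter can check `isNullTerminated()` once and reject tables it cannot print.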
127bf44385424891eb04cff8e52d3f157fc2cb7c implemented most of the
infrastructure for capturing structured bindings in lambdas, but missed
one piece: constant evaluation of such lambdas. Refactor the code to
handle this case.
Fixes #145956.
As noted in post commit review, the API change here was not required.
I'd apparently confused myself when teasing apart patches from my
development branch.
This PR resolves some discrepancies in verification during `validate` in
`DXILRootSignature.cpp`.
Note: we don't add a backend test for version 1.0 flag values because it
treats the struct as though there is no flags value. However, this will
be used when we use the verifications in the frontend.
- Updates `verifyDescriptorFlag` to check for valid flags based on
version, as reflected [here](https://github.com/llvm/wg-hlsl/pull/297)
- Adds a test to demonstrate the updated flag verifications
- Adds `verifyNumDescriptors` to the validation of `DescriptorRange`s
- Adds a test to demonstrate `numDescriptors` verification
- Updates a number of tests that mistakenly had an invalid
`numDescriptors` specified
Resolves: https://github.com/llvm/llvm-project/issues/147107
Move documentation for macros up to where the macros are initially defined and
add new custom MMA builtin macro in prep for adding more accumulate builtins to clang.
---------
Co-authored-by: Amy Kwan <amy.kwan1@ibm.com>
Calls to the POSIX `write` function can return -1 and set errno to
`EINTR` or perform partial writes when interrupted by signals. In those
cases applications are supposed to just try again. See for example the
documentation in glibc:
https://sourceware.org/glibc/manual/latest/html_node/I_002fO-Primitives.html#index-write
This fixes the uses in `ErrorHandling.cpp` to retry as needed.
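A minimal sketch of such a retry loop, assuming a POSIX system. The helper name `writeAll` is hypothetical and this is not the code in `ErrorHandling.cpp`; it only illustrates handling both `EINTR` and partial writes.

```cpp
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Writes all Size bytes of Buf to FD, retrying on EINTR and continuing
// after partial writes. Returns true on success; on failure, errno holds
// the error reported by write().
bool writeAll(int FD, const char *Buf, std::size_t Size) {
  while (Size > 0) {
    ssize_t Written = ::write(FD, Buf, Size);
    if (Written < 0) {
      if (errno == EINTR)
        continue;   // Interrupted before any byte was written: try again.
      return false; // Any other error is reported to the caller.
    }
    // Partial write: advance past the bytes that were accepted.
    Buf += Written;
    Size -= static_cast<std::size_t>(Written);
  }
  return true;
}
```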
This change cleans up DAG-to-DAG instruction selection around FTZ and
SETP comparison mode. Largely these changes do not impact functionality,
though support for `{sin,cos}.approx.ftz.f32` is added.
Callsite offsets will help map addresses to the right position in the
basic block (before or after a callsite).
This PR also bumps the BBAddrMap version to 3.
The encoding/decoding ability was already pushed upstream in
8d7a8fcc3ab9f6d4c4a7e4312876fe94ed3d6c4f.
Introduces saturated truncate instructions to Global ISel:
G_TRUNC_SSAT_S, G_TRUNC_SSAT_U, G_TRUNC_USAT_U. These were previously
introduced to SDAG to reduce redundant code.
This patch only introduces the instructions; a later patch will
follow to add combines and legalization for each instruction.
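As a scalar illustration of the intended semantics, here is a sketch for an i32 to i8 truncation. It assumes the three opcodes mirror the corresponding SelectionDAG nodes (signed input to signed result, signed input to unsigned result, unsigned input to unsigned result); the function names are hypothetical.

```cpp
#include <algorithm>
#include <cstdint>

// G_TRUNC_SSAT_S (assumed): signed input, clamped to signed [-128, 127].
std::int8_t truncSSatS(std::int32_t X) {
  return static_cast<std::int8_t>(
      std::clamp<std::int32_t>(X, INT8_MIN, INT8_MAX));
}

// G_TRUNC_SSAT_U (assumed): signed input, clamped to unsigned [0, 255].
std::uint8_t truncSSatU(std::int32_t X) {
  return static_cast<std::uint8_t>(
      std::clamp<std::int32_t>(X, 0, UINT8_MAX));
}

// G_TRUNC_USAT_U (assumed): unsigned input, clamped to unsigned [0, 255].
std::uint8_t truncUSatU(std::uint32_t X) {
  return static_cast<std::uint8_t>(
      std::min<std::uint32_t>(X, UINT8_MAX));
}
```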
Inliner currently treats every "call asm" IR instruction as a single
instruction regardless of how many instructions the inline assembly may
contain. This may underestimate the cost of inlining for a callee
containing long inline assembly. Besides, we may need to assign a higher
cost to instructions in inline assembly since they cannot be analyzed
and optimized by the compiler.
This PR introduces a new option, `-inline-asm-instr-cost` (set to zero by
default), which controls the cost of inline assembly instructions in the
inliner's cost-benefit analysis.
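A rough sketch of how a per-instruction cost could scale with the length of an inline assembly string. How the inliner actually counts instructions is not described here, so the statement counting below (splitting on newlines and `;`) and both function names are assumptions for illustration only.

```cpp
#include <cstddef>
#include <string_view>

// Counts non-blank statements, treating '\n' and ';' as separators.
// This is only a proxy: labels, directives, and comments are not skipped.
std::size_t countAsmStatements(std::string_view Asm) {
  std::size_t Count = 0;
  bool SawText = false;
  for (char C : Asm) {
    if (C == '\n' || C == ';') {
      Count += SawText ? 1 : 0;
      SawText = false;
    } else if (C != ' ' && C != '\t') {
      SawText = true;
    }
  }
  return Count + (SawText ? 1 : 0);
}

// Hypothetical cost model: each statement costs InstrCost units, so a cost
// of zero contributes nothing beyond the call itself.
std::size_t inlineAsmCost(std::string_view Asm, std::size_t InstrCost) {
  return countAsmStatements(Asm) * InstrCost;
}
```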
Adds support to RTSan for `free_sized` and `free_aligned_sized` from
C23.
Other sanitizers will be handled with their own separate PRs.
For https://github.com/llvm/llvm-project/issues/144435
Signed-off-by: Justin King <jcking@google.com>
This patch removes setting the MAX_PARALLEL_COMPILE_JOBS and
MAX_PARALLEL_LINK_JOBS env variables in the Windows runs. These were
originally used to control the parallelism on the old infrastructure, and
we set them on the new infrastructure explicitly so that we could
maintain both at the same time. It no longer makes sense to keep them
explicitly set, as we do not need to control the parallelism
given the amount of RAM we have on the machines. Keeping them also adds a
maintenance cost, as evidenced by the fact that these have been incorrect
(64 instead of 32) for quite a while.
For the fixed vector cases, we already support this, but the
deinterleave intrinsic cases (primarily used by scalable vectors) didn't.
Supporting it requires plumbing through the Factor separately from the
extracts, as there can now be fewer extracts than the Factor. Note that
the fixed vector path handles this slightly differently - it uses the
shuffle and indices scheme to achieve the same thing.
XSfvqmaccdod/qoq and XSfvfwmaccqqq are SiFive's small-size matrix
multiplication extensions. This patch adds scheduling info for their
instructions along with six new SchedReadWrites.
## Purpose
Statically link `TableGenTests` so it can still build when linked
against an LLVM Windows DLL.
## Background
The effort to build LLVM as a Windows DLL is tracked in #109483.
Additional context is provided in [this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307).
If `TableGenTests` is linked against LLVM built as a DLL on Windows, it
will fail due to a large number of duplicate symbols found in both the
LLVM DLL and TableGen libraries. This is because `LLVMTableGenBasic` and
`LLVMTableGenCommon` are linked statically against LLVM (using
`DISABLE_LLVM_LINK_LLVM_DYLIB`) so already contain a sub-set of symbols
also exported from the LLVM DLL.
This patch was originally part of #145448.
Relands the commit to upstream the lldb-rpc-gen tool in order to fix a
build failure on the linux remote bots. The reland adds the Clang
resource dir unconditionally to the invocation for the tool instead of
only adding it in the event that we're using a standalone build.
Original PR description:
This commit upstreams the lldb-rpc-gen tool, a ClangTool that generates
the LLDB RPC client and server interfaces. This tool, as well as LLDB
RPC itself, is built by default. If it needs to be disabled, put
-DLLDB_BUILD_LLDBRPC=OFF in your CMake invocation.
https://discourse.llvm.org/t/rfc-upstreaming-lldb-rpc/85804
Original PR Link:
https://github.com/llvm/llvm-project/pull/138031
Enable SWIG support for translating Doxygen comments found in interface
and header files into a target language's normal documentation language.
This feature was introduced in SWIG 4.0 and currently only supports
Python (and Java). Hand-written documentation still takes precedence.
SWIG documentation: https://www.swig.org/Doc4.0/Doxygen.html