This partially reverts b605dab7a8352158ee0d399b8c3433f9a8b495a3,
dropping the changes to isl. This is an external library, so
we shouldn't modify it unless strictly necessary.
The lowering code was using the wrong chain value, which meant that
the 'smstart' after the call from streaming agnostic-ZA functions ->
non-streaming private-ZA functions was incorrectly removed from the DAG.
For some ABIs `update_cc_test_checks.py` is unable to generate tests
because of the mismatch between the mangled function names reported by
clang's `-asd-dump` and the function names in LLVM IR.
This patch fixes it by striping the leading underscore from the mangled
name for global functions if the data layout string says they have one.
This relands the reverted #120721 with a fix for cases where neither
reduction operand are the reduction phi. Only
63114239cc8d26225a0ef9920baacfc7cc00fc58 and
63114239cc8d26225a0ef9920baacfc7cc00fc58 are new on top of the reverted
PR.
---------
Co-authored-by: Nicholas Guy <nicholas.guy@arm.com>
This extension adds eleven instructions to accelerate interrupt
servicing.
The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/latest
This patch adds assembler only support.
---------
Co-authored-by: Harsh Chandel <hchandel@qti.qualcomm.com>
Add a release note for optimization change related to pointer overflow
checks. I've put this in the breaking changes section to give it the
best chance of being seen.
When passing an instruction with a register mask, the machine copy
propagation pass was dropping the information about some copy
instructions which define a register which is preserved by the mask,
because that register overlaps a register which is partially clobbered
by it. This resulted in a miscompilation for AArch64, because this
caused a live copy to be considered dead.
The fix is to clobber register masks by finding the set of reg units
which is preserved by the mask, and clobbering all units not in that
set.
The PR fixes the datatype error for `nvvm.mma.sync` when the operand is
`bf16`. This operation originally requires the A/B type to be `f16x2`
for the `bf16` MMA. However, it violates the NVVM intrinsic
[[here](372044ee09/llvm/include/llvm/IR/IntrinsicsNVVM.td (L119))],
where the A/B operand type should be `i32`. This is a bug, and there are
no tests in MLIR that cover this datatype.
```
// mma bf16 -> s32 @ m16n8k16/m16n8k8
!eq(gft,"m16n8k16🅰️bf16") : !listsplat(llvm_i32_ty, 4),
!eq(gft,"m16n8k16🅱️bf16") : !listsplat(llvm_i32_ty, 2),
!eq(gft,"m16n8k8🅰️bf16") : !listsplat(llvm_i32_ty, 2),
!eq(gft,"m16n8k8🅱️bf16") : [llvm_i32_ty],
```
This PR addresses this bug and adds tests to guarantee correctness.
Co-authored-by: Xiaolei Shi <xiaoleis@nvidia.com>
In this build:
https://buildkite.com/llvm-project/github-pull-requests/builds/126961
The builds actually failed, probably because prerequisite of a test
suite failed to build.
However they still ran other tests and all those passed. This meant that
the test reports were green even though the build was red. On some level
this is technically correct, but it is very misleading in practice.
So I've also passed the build script's return code, as it was when we
entered the on exit handler, to the generator, so that when this happens
again, the report will draw the viewer's attention to the overall
failure. There will be a link in the report to the build's log file, so
the next step to investigate is clear.
It would be nice to say "tests failed and there was some other build
error", but we cannot tell what the non-zero return code was caused by.
Could be either.
The script handles the following situations now:
| Have Result Files? | Tests reported failed? | Return code | Report |
|--------------------|------------------------|-------------|-----------------------------------------------------------------------------|
| Yes | No | 0 | Success style report. |
| Yes | Yes | 0 | Shouldn't happen, but if it did, failure style report
showing the failures. |
| Yes | No | 1 | Failure style report, showing no failures but noting
that the build failed. |
| Yes | Yes | 1 | Failure style report, showing the test failures. |
| No | ? | 0 | No test report, success shown in the normal build
display. |
| No | ? | 1 | No test report, failure shown in the normal build
display. |
`ASTImporterLookupTable` did use the `getPrimaryContext` function to get
the declaration context of the inserted items. This is problematic
because the primary context can change during import of AST items, most
likely if a definition of a previously not defined class is imported.
(For any record the primary context is the definition if there is one.)
The use of primary context is really not important, only for namespaces
because these can be re-opened and lookup in one namespace block is not
enough. This special search is now moved into ASTImporter instead of
relying on the lookup table.
Makes Inline Spiller amenable to the new PM.
This reapplies commit a531800344dc54e9c197a13b22e013f919f3f5e1 reverted
because of two unused private members reported on sanitizer bots.
This patch simplifies select-based integer min/max reductions by
utilizing `llvm::getMinMaxReductionPredicate`, and generates
intrinsic-based min/max reductions by utilizing
`llvm::getMinMaxReductionIntrinsicOp`.
This partially reverts commit 07ff786e39e2190449998d3af1000454dee501be.
The hunk being reverted in this patch seems to break:
tools/llvm-gsymutil/ARM_AArch64/macho-merged-funcs-dwarf.yaml
under LLVM_ENABLE_EXPENSIVE_CHECKS.
The decomposition of `linalg.softmax` uses `maxnumf`, but the identity
element that is used in the generated code is the one for `maximumf`.
They are not the same, as the identity for `maxnumf` is `NaN`, while the
one of `maximumf` is `-Infty`. This is wrong and prevents the maxnumf
from being folded.
Related to #114595, which fixed the folder for maxnumf.
`PassRegistry.def` already has this entry, but the dummy definition was
being pulled instead.
I couldn't reproduce the build failures that FIXME referenced, maybe the
Dummy pass getting in the way was part of the cause.
This commit fixes -print-library-module-manifest-path on macos.
Currently, this only works on linux systems. This is because on macos
systems the library and header files are installed in a different
location. The module manifest is next to the libraries and the search
function was not looking in both places. There is also a test included.
Use the existing VPlan-based analysis to identify recipes that only have
their first lane demanded and transform them to uniform recpliate
recipes. This simplifies the generated code in some places and prepares
for fixing https://github.com/llvm/llvm-project/issues/122496.
Note that PointerUnion::is have been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
In this patch, I'm calling call().getBase() for an instance of
PointerUnion. call() alone would return an instance of IndexCall,
which wraps PointerUnion. Note that isa<> cannot directly accept an
instance of IndexCall, at least without defining CastInfo.
I'm not touching PointerUnion::dyn_cast for now because it's a bit
complicated; we could blindly migrate it to dyn_cast_if_present, but
we should probably use dyn_cast when the operand is known to be
non-null.