523604 Commits

Author SHA1 Message Date
Nikita Popov
5eb9acff28 [Polly] Revert changes to isl Python code
This partially reverts b605dab7a8352158ee0d399b8c3433f9a8b495a3,
dropping the changes to isl. This is an external library, so
we shouldn't modify it unless strictly necessary.
2025-01-13 13:11:39 +01:00
Sander de Smalen
3efe83291f [AArch64] Fix chain for calls from agnostic-ZA functions.
The lowering code was using the wrong chain value, which meant that
the 'smstart' after the call from streaming agnostic-ZA functions ->
non-streaming private-ZA functions was incorrectly removed from the DAG.
2025-01-13 12:06:50 +00:00
Eisuke Kawashima
5609724c2e
[Polly] Fix invalid escape sequences (#94037)
These generate a SyntaxWarning since Python 3.12.
2025-01-13 13:05:10 +01:00
Eisuke Kawashima
ca92bdfa3e
[cross-project-tests] Use "is" instead of "==" to check for None (#94016)
From PEP8
(https://peps.python.org/pep-0008/#programming-recommendations):

> Comparisons to singletons like None should always be done with is or
is not, never the equality operators.
2025-01-13 13:03:04 +01:00
Eisuke Kawashima
b605dab7a8
[Polly] Use "is" instead of "==" to check for None (#94021)
From PEP8
(https://peps.python.org/pep-0008/#programming-recommendations):

> Comparisons to singletons like None should always be done with is or
is not, never the equality operators.
2025-01-13 13:00:35 +01:00
Simon Pilgrim
6c5941b09f [X86] subvectorwise-store-of-vector-splat.ll - regenerate VPTERNLOG comments 2025-01-13 11:36:58 +00:00
Momchil Velikov
5315f3f8cb
Handle leading underscores in update_cc_test_checks.py (#121800)
For some ABIs `update_cc_test_checks.py` is unable to generate tests
because of the mismatch between the mangled function names reported by
clang's `-asd-dump` and the function names in LLVM IR.

This patch fixes it by striping the leading underscore from the mangled
name for global functions if the data layout string says they have one.
2025-01-13 11:24:05 +00:00
Sam Tebbs
795e35a653
Reland "[LoopVectorizer] Add support for partial reductions" with non-phi operand fix. (#121744)
This relands the reverted #120721 with a fix for cases where neither
reduction operand are the reduction phi. Only
63114239cc8d26225a0ef9920baacfc7cc00fc58 and
63114239cc8d26225a0ef9920baacfc7cc00fc58 are new on top of the reverted
PR.

---------

Co-authored-by: Nicholas Guy <nicholas.guy@arm.com>
2025-01-13 11:20:35 +00:00
quic_hchandel
171d3edd05
[RISCV] Add Qualcomm uC Xqciint (Interrupts) extension (#122256)
This extension adds eleven instructions to accelerate interrupt
servicing.

The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/latest

This patch adds assembler only support.

---------

Co-authored-by: Harsh Chandel <hchandel@qti.qualcomm.com>
2025-01-13 16:36:05 +05:30
Haojian Wu
d7e79663e7 Remove an extra trailing `` in Modules.rst, NFC 2025-01-13 12:04:21 +01:00
Haojian Wu
d2ba364440 Fix an unused-variable warning in release build. 2025-01-13 12:03:35 +01:00
vfdev
f136c800b6
Enabled freethreading support in MLIR python bindings (#122684)
Reland reverted https://github.com/llvm/llvm-project/pull/107103 with
the fixes for Python 3.8

cc @jpienaar

Co-authored-by: Peter Hawkins <phawkins@google.com>
2025-01-13 03:00:31 -08:00
Durgadoss R
7e2eb0f83e
[NVPTX] Add float to tf32 conversion intrinsics (#121507)
This patch adds the missing variants of float to tf32 conversion
intrinsics, with their corresponding lit tests.

PTX Spec link:

https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cvt

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2025-01-13 16:17:42 +05:30
Balázs Kéri
7e01a322f8
[clang][ASTImporter] Fix unused variable warning (NFC) (#122686) 2025-01-13 11:35:38 +01:00
Jay Foad
a3b3c26048
[TableGen] Use assert instead of PrintFatalError in TGLexer. NFC. (#122303)
Do not use the PrintFatalError diagnostic machinery for conditions that
can never happen with any input.
2025-01-13 10:30:55 +00:00
Kareem Ergawy
b5987157e8
[flang][OpenMP] Fix omp-declarative-allocate-align.f90 expectations (#122675)
The test was effectively a no-op since we used `//` instead of `!` for
`RUN` and `CHECK` lines. Also, we have to specify the proper OpenMP
version.
2025-01-13 11:27:23 +01:00
Nikita Popov
c2979c58d4
[Clang] Add release note for pointer overflow optimization change (#122462)
Add a release note for optimization change related to pointer overflow
checks. I've put this in the breaking changes section to give it the
best chance of being seen.
2025-01-13 11:24:02 +01:00
xtex
16923da241
Revert "[clang] Canonicalize absolute paths in dependency file" (#121638)
Reverts llvm/llvm-project#117458

https://github.com/llvm/llvm-project/pull/117458#issuecomment-2568804774

https://github.com/ninja-build/ninja/issues/2528
2025-01-13 11:12:23 +01:00
Oliver Stannard
e2a071ece5
[MachineCP] Correctly handle register masks and sub-registers (#122472)
When passing an instruction with a register mask, the machine copy
propagation pass was dropping the information about some copy
instructions which define a register which is preserved by the mask,
because that register overlaps a register which is partially clobbered
by it. This resulted in a miscompilation for AArch64, because this
caused a live copy to be considered dead.

The fix is to clobber register masks by finding the set of reg units
which is preserved by the mask, and clobbering all units not in that
set.
2025-01-13 09:55:08 +00:00
xiaoleis-nv
d03f35f9b6
[MLIR][NVVM] Fix the datatype error for nvvm.mma.sync when the operand is bf16 (#122664)
The PR fixes the datatype error for `nvvm.mma.sync` when the operand is
`bf16`. This operation originally requires the A/B type to be `f16x2`
for the `bf16` MMA. However, it violates the NVVM intrinsic
[[here](372044ee09/llvm/include/llvm/IR/IntrinsicsNVVM.td (L119))],
where the A/B operand type should be `i32`. This is a bug, and there are
no tests in MLIR that cover this datatype.

```
    // mma bf16 -> s32 @ m16n8k16/m16n8k8
    !eq(gft,"m16n8k16🅰️bf16") : !listsplat(llvm_i32_ty, 4),
    !eq(gft,"m16n8k16🅱️bf16") : !listsplat(llvm_i32_ty, 2),
    !eq(gft,"m16n8k8🅰️bf16") : !listsplat(llvm_i32_ty, 2),
    !eq(gft,"m16n8k8🅱️bf16") : [llvm_i32_ty],
```

This PR addresses this bug and adds tests to guarantee correctness.

Co-authored-by: Xiaolei Shi <xiaoleis@nvidia.com>
2025-01-13 15:03:05 +05:30
David Spickett
1b199d1990
[ci] Handle the case where all reported tests pass but the build is still a failure (#120264)
In this build:
https://buildkite.com/llvm-project/github-pull-requests/builds/126961

The builds actually failed, probably because prerequisite of a test
suite failed to build.

However they still ran other tests and all those passed. This meant that
the test reports were green even though the build was red. On some level
this is technically correct, but it is very misleading in practice.

So I've also passed the build script's return code, as it was when we
entered the on exit handler, to the generator, so that when this happens
again, the report will draw the viewer's attention to the overall
failure. There will be a link in the report to the build's log file, so
the next step to investigate is clear.

It would be nice to say "tests failed and there was some other build
error", but we cannot tell what the non-zero return code was caused by.
Could be either.

The script handles the following situations now:
| Have Result Files? | Tests reported failed? | Return code | Report |

|--------------------|------------------------|-------------|-----------------------------------------------------------------------------|
| Yes | No | 0 | Success style report. |
| Yes | Yes | 0 | Shouldn't happen, but if it did, failure style report
showing the failures. |
| Yes | No | 1 | Failure style report, showing no failures but noting
that the build failed. |
| Yes | Yes | 1 | Failure style report, showing the test failures. |
| No | ? | 0 | No test report, success shown in the normal build
display. |
| No | ? | 1 | No test report, failure shown in the normal build
display. |
2025-01-13 09:05:18 +00:00
Balázs Kéri
b270525f73
[clang][ASTImporter] Not using primary context in lookup table (#118466)
`ASTImporterLookupTable` did use the `getPrimaryContext` function to get
the declaration context of the inserted items. This is problematic
because the primary context can change during import of AST items, most
likely if a definition of a previously not defined class is imported.
(For any record the primary context is the definition if there is one.)
The use of primary context is really not important, only for namespaces
because these can be re-opened and lookup in one namespace block is not
enough. This special search is now moved into ASTImporter instead of
relying on the lookup table.
2025-01-13 09:46:45 +01:00
Akshat Oke
4f96fb5fb3
Reapply "Spiller: Detach legacy pass and supply analyses instead (#119181)" (#122665)
Makes Inline Spiller amenable to the new PM.

This reapplies commit a531800344dc54e9c197a13b22e013f919f3f5e1 reverted
because of two unused private members reported on sanitizer bots.
2025-01-13 14:14:13 +05:30
Mel Chen
56a37a3c76
[SLPVectorizer] Refactor HorizontalReduction::createOp (NFC) (#121549)
This patch simplifies select-based integer min/max reductions by
utilizing `llvm::getMinMaxReductionPredicate`, and generates
intrinsic-based min/max reductions by utilizing
`llvm::getMinMaxReductionIntrinsicOp`.
2025-01-13 16:11:31 +08:00
Kazu Hirata
76af93fbea Partially revert "[TableGen] Avoid repeated hash lookups (NFC) (#122586)"
This partially reverts commit 07ff786e39e2190449998d3af1000454dee501be.

The hunk being reverted in this patch seems to break:

  tools/llvm-gsymutil/ARM_AArch64/macho-merged-funcs-dwarf.yaml

under LLVM_ENABLE_EXPENSIVE_CHECKS.
2025-01-12 23:50:58 -08:00
Clément Fournier
36c3466aef
[mlir][linalg] Fix neutral elt for softmax (#118952)
The decomposition of `linalg.softmax` uses `maxnumf`, but the identity
element that is used in the generated code is the one for `maximumf`.
They are not the same, as the identity for `maxnumf` is `NaN`, while the
one of `maximumf` is `-Infty`. This is wrong and prevents the maxnumf
from being folded.

Related to #114595, which fixed the folder for maxnumf.
2025-01-13 15:21:07 +08:00
CHANDRA GHALE
6f558e0e12
[OpenMP] codegen support for masked combined construct masked taskloop (#121914)
Added codegen support for combined masked constructs `masked taskloop.`
Added implementation for `EmitOMPMaskedTaskLoopDirective`.

---------

Co-authored-by: Chandra Ghale <ghale@pe31.hpc.amslabs.hpecorp.net>
2025-01-13 11:42:13 +05:30
Shourya Goel
b861539196
[libc][complex] fix compiler support matrix for cfloat128 (#122593)
Before this patch, [godbolt](https://godbolt.org/z/6PPsvv9qd) failed to
compile `cfloat128` with `-ffreestanding` but with the patch, the
compilation succeeds, [godbolt](https://godbolt.org/z/4M8zzejss).

Fixes: #122500

cc: @nickdesaulniers
2025-01-13 11:23:36 +05:30
Akshat Oke
f431f93a77
[CodeGen][NewPM] Use proper NPM AtomicExpandPass in AMDGPU (#122086)
`PassRegistry.def` already has this entry, but the dummy definition was
being pulled instead.

I couldn't reproduce the build failures that FIXME referenced, maybe the
Dummy pass getting in the way was part of the cause.
2025-01-13 10:38:24 +05:30
wldfngrs
ecf4f95c4f
[libc][math][c23] Add tanf16 function (#121018)
- Implementation of tan for 16-bit floating point inputs.
- Exhaustive tests across the 16-bit input range
2025-01-12 23:46:53 -05:00
Akshat Oke
7bf1cb702b
[AMDGPU][NewPM] Port AMDGPURemoveIncompatibleFunctions to NPM (#122261) 2025-01-13 10:11:40 +05:30
Shilei Tian
f15da5fb78
[AMDGPU] Fix an invalid cast in AMDGPULateCodeGenPrepare::visitLoadInst (#122494)
Fixes: SWDEV-507695
2025-01-12 23:40:25 -05:00
Sameer Sahasrabuddhe
77e6f434ec
[SPIRV] convergence anchor intrinsic does not have a parent token (#122230) 2025-01-13 09:54:57 +05:30
Pengcheng Wang
681c4a2068 Reapply "[RISCV] Rework memcpy test (#120364)"
Use descriptive names and add more cases.

This recommits 59bba39 which was reverted in 4637c77.
2025-01-13 12:06:26 +08:00
Pengcheng Wang
4637c77746
Revert "[RISCV] Rework memcpy test" (#122662)
Reverts llvm/llvm-project#120364

The test should be updated due to some recent changes.
2025-01-13 11:36:37 +08:00
Pengcheng Wang
59bba39a69
[RISCV] Rework memcpy test (#120364)
Use descriptive names and add more cases.
2025-01-13 11:28:24 +08:00
Bill Hoffman
acbd822879
Fix print module manifest file for macos (#122370)
This commit fixes -print-library-module-manifest-path on macos.
Currently, this only works on linux systems. This is because on macos
systems the library and header files are installed in a different
location. The module manifest is next to the libraries and the search
function was not looking in both places. There is also a test included.
2025-01-13 10:20:20 +08:00
Justin Bogner
0e51b54b7a
[DirectX] Implement the resource.store.rawbuffer intrinsic (#121282)
This introduces `@llvm.dx.resource.store.rawbuffer` and generalizes the
buffer store docs under DirectX/DXILResources.

Fixes #106188
2025-01-12 18:52:20 -07:00
Sander de Smalen
08028d68a9 [Clang] Fix buildbot failure introduced by #121788
Silences 'enumeration not handled in switch' warning,
which causes buildbot failures with -Werror.
2025-01-12 22:09:26 +00:00
Florian Hahn
8df64ed777 [LV] Don't consider IV increments uniform if exit value is used outside.
In some cases, there might be a chain of uniform instructions producing
the exit value. To generate correct code in all cases, consider the IV
increment not uniform, if there are users outside the loop.

Instead, let VPlan narrow the IV, if possible using the logic from
3ff1d01985752.

Test case from #122602 verified with Alive2:
    https://alive2.llvm.org/ce/z/bA4EGj

Fixes https://github.com/llvm/llvm-project/issues/122496.
Fixes https://github.com/llvm/llvm-project/issues/122602.
2025-01-12 22:03:21 +00:00
Sander de Smalen
b4ce29ab31
[AArch64][Clang] Add support for __arm_agnostic("sme_za_state") (#121788)
This adds support for parsing the attribute and codegen to map it to
"aarch64_za_state_agnostic" LLVM IR attribute.

This attribute is described in the Arm C Language Extensions (ACLE)
document:

  https://github.com/ARM-software/acle/blob/main/main/acle.md#__arm_agnostic
2025-01-12 21:35:44 +00:00
Fangrui Song
5c0aa31c3c -ftime-report: Move FrontendTimer closer to TimeTraceScope
... to improve consistency and make "Clang time report" cover
`FrontendAction::BeginSourceFile` and `FrontendAction::EndSourceFile`.
2025-01-12 13:17:49 -08:00
Florian Hahn
f5a35a31bf [LV] Add test cases with incorrect IV live-outs.
Add test cases for https://github.com/llvm/llvm-project/issues/122496
and https://github.com/llvm/llvm-project/issues/122602.
2025-01-12 20:55:20 +00:00
Florian Hahn
3ff1d01985 Recommit "[VPlan] Try to narrow wide and replicating recipes to uniform recipes."
This reverts commit 0ebb3ac7c92c4c1c44e7f3d17832d75ec5a42a67.

Re-applies commit with typos fixed.
2025-01-12 20:10:28 +00:00
Florian Hahn
0ebb3ac7c9 Revert "[VPlan] Try to narrow wide and replicating recipes to uniform recipes."
This reverts commit 1afba19913253dda865a8e57b37b9f4dabead1ac.

Typo breaking the build
2025-01-12 19:37:45 +00:00
Florian Hahn
1afba19913 [VPlan] Try to narrow wide and replicating recipes to uniform recipes.
Use the existing VPlan-based analysis to identify recipes that only have
their first lane demanded and transform them to uniform recpliate
recipes. This simplifies the generated code in some places and prepares
for fixing https://github.com/llvm/llvm-project/issues/122496.
2025-01-12 19:32:01 +00:00
Kazu Hirata
16aa400a27
[ELF] Avoid repeated hash lookups (NFC) (#122628) 2025-01-12 11:07:07 -08:00
Kazu Hirata
fd87188c2b
[wasm] Avoid repeated hash lookups (NFC) (#122626) 2025-01-12 11:06:56 -08:00
Kazu Hirata
43fdd6e81d
[memprof] Migrate away from PointerUnion::is (NFC) (#122622)
Note that PointerUnion::is have been soft deprecated in
PointerUnion.h:

  // FIXME: Replace the uses of is(), get() and dyn_cast() with
  //        isa<T>, cast<T> and the llvm::dyn_cast<T>

In this patch, I'm calling call().getBase() for an instance of
PointerUnion.  call() alone would return an instance of IndexCall,
which wraps PointerUnion.  Note that isa<> cannot directly accept an
instance of IndexCall, at least without defining CastInfo.

I'm not touching PointerUnion::dyn_cast for now because it's a bit
complicated; we could blindly migrate it to dyn_cast_if_present, but
we should probably use dyn_cast when the operand is known to be
non-null.
2025-01-12 11:06:42 -08:00
Jacques Pienaar
3f1486f08e Revert "Added free-threading CPython mode support in MLIR Python bindings (#107103)"
Breaks on 3.8, rolling back to avoid breakage while fixing.

This reverts commit 9dee7c44491635ec9037b90050bcdbd3d5291e38.
2025-01-12 18:30:42 +00:00