The patch fixed a bug introduced patch [[PowePC] using MTVSRBMI
instruction instead of constant pool in
power10+](https://github.com/llvm/llvm-project/pull/144084#top).
The issue arose because the layout of vector register elements differs
between little-endian and big-endian modes — specifically, the elements
appear in reverse order. This led to incorrect behavior when loading
constants using MTVSRBMI in big-endian configurations.
Summary:
We need `malloc` to return a larger size now that it's aligned properly
and we use a bunch of threads. Also the `match_any` test was wrong
because it assumed a 32-bit lanemask.
We run a couple Github only steps in the monolithic-* scripts like
writing to $GITHUB_STEP_SUMMARY and denoting log groups. These can throw
errors when running the scripts locally or as part of other CI systems
(like buildbot). This patch makes it so that we only run these commands
when in the relevant environment. We also add a $POSTCOMMIT_CI check for
groups so that the annotated builder in buildbot works properly with
these scripts.
Reviewers: cmtice, ldionne, Endilll, gburgessiv, Keenuts, dschuff, lnihlen
Pull Request: https://github.com/llvm/llvm-project/pull/152197
This change rebases the anonymous `xxeval` patterns to use the new
XXEvalPattern class based on the [[PowerPC] Exploit xxeval instruction
for ternary patterns - ternary(A, X,
and(B,C))](https://github.com/llvm/llvm-project/pull/141733#top)
Co-authored-by: Tony Varghese <tony.varghese@ibm.com>
This patch avoids a comparison against zero when lowering abs(sub(a, b))
patterns, instead reusing the condition codes generated by a subs of the
operands directly.
For example, currently:
```
sxtb w8, w0
sub w8, w8, w1, sxtb
cmp w8, #0
cneg w0, w8, mi
```
becomes:
```
sxtb w8, w0
subs w8, w8, w1, sxtb
cneg w0, w8, mi
```
Together with #151177, this should handle the remaining patterns in
#118413.
This reverts commit 2b4b3fd03f716b9ddbb2a69ccfbe144312bedd12.
Turns out the CI still fails with this test enabled:
```
11:08:50 File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/expression/import-std-module/queue/TestQueueFromStdModule.py", line 37, in test
11:08:50 self.expect_expr(
11:08:50 File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 2571, in expect_expr
11:08:50 value_check.check_value(self, eval_result, str(eval_result))
11:08:50 File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 301, in check_value
11:08:50 test_base.assertSuccess(val.GetError())
11:08:50 File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 2606, in assertSuccess
11:08:50 self.fail(self._formatMessage(msg, "'{}' is not success".format(error)))
11:08:50 AssertionError: 'error: /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/module.modulemap:93:11: header 'stdarg.h' not found
11:08:50 93 | header "stdarg.h" // note: supplied by the compiler
11:08:50 | ^
11:08:50 /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/lib/clang/22/include/stdint.h:56:16: submodule of top-level module 'Darwin' implicitly imported here
11:08:50 56 | # include_next <stdint.h>
11:08:50 | ^
11:08:50 error: While building module 'std' imported from <lldb wrapper prefix>:42:
```
NFC patch to clean up extra lines of code in the file
`llvm/test/CodeGen/PowerPC/check-zero-vector.ll` as the current one has
loop unrolled.
Also removing the file `clang/test/CodeGen/PowerPC/check-zero-vector.c`
as the patch affects only the backend.
Co-authored-by: himadhith <himadhith.v@ibm.com>
This has been introduced by #151304. This problem is diagnosed by UBSan
with optimizations enabled. Since we run UBSan only with optimizations
disabled currently, this isn't caught in our CI. We should look into
enabling UBSan with optimizations enabled to catch these sorts of issues
before landing a patch.
Add `ValueObject::CreateValueObjectFromScalar` function and adjust
`Scalar::GetData` to be able to both extend and truncate the data bytes
in Scalar to the specified size.
This reverts commit 600976f4bfb06526c283dcc4efc4801792f08ca5.
The test was crashing trying to access any element of the GPR struct.
(gdb) disas
Dump of assembler code for function _ZN7testing8internal11CmpHelperEQIyyEENS_15AssertionResultEPKcS4_RKT_RKT0_:
0x00450afc <+0>: push {r4, r5, r6, r7, r8, r9, r10, r11, lr}
0x00450b00 <+4>: sub sp, sp, #60 ; 0x3c
0x00450b04 <+8>: ldr r5, [sp, #96] ; 0x60
=> 0x00450b08 <+12>: ldm r3, {r4, r7}
0x00450b0c <+16>: ldm r5, {r6, r9}
0x00450b10 <+20>: eor r7, r7, r9
0x00450b14 <+24>: eor r6, r4, r6
0x00450b18 <+28>: orrs r7, r6, r7
(gdb) p/x r3
$3 = 0x3e300f6e
"However, load and store multiple instructions (LDM and STM) and load and store double-word (LDRD or STRD) must be aligned to at least a word boundary."
https://developer.arm.com/documentation/den0013/d/Porting/Alignment
>>> 0x3e300f6e % 4
2
Program received signal SIGBUS, Bus error.
0x00450b08 in testing::AssertionResult testing::internal::CmpHelperEQ<unsigned long long, unsigned long long>(char const*, char const*, unsigned long long const&, unsigned long long const&) ()
The struct is packed with 1 byte alignment, but it needs to start at an aligned address for us
to ldm from it. So I've done that with alignas.
Also fixed some compiler warnings in the test itself.
There is an issue tracking lsan incompatibility on these platforms:
https://github.com/llvm/llvm-project/issues/131678. Many of these tests
are currently failing and creating CI noise.
rdar://157252316
I discovered this building the lldb test suite with `-mthumb` set, so
that all test programs are purely Arm Thumb code.
When C++ expressions did a function lookup, they took a different path
from C programs. That path happened to land on the line that I've
changed. Where we try to look something up as a function, but don't then
resolve the address as if it's a callable.
With Thumb, if you do the non-callable lookup, the bottom bit won't be
set. This means when lldb's expression wrapper function branches into
the found function, it'll be in Arm mode trying to execute Thumb code.
Thumb is the only instance where you'd notice this. Aside from maybe
MicroMIPS or MIPS16 perhaps but I expect that there are 0 users of that
and lldb.
I have added a new test case that will simulate this situation in
"normal" Arm builds so that it will get run on Linaro's buildbot.
This change also fixes the following existing tests when `-mthumb` is
used:
```
lldb-api :: commands/expression/anonymous-struct/TestCallUserAnonTypedef.py
lldb-api :: commands/expression/argument_passing_restrictions/TestArgumentPassingRestrictions.py
lldb-api :: commands/expression/call-function/TestCallStopAndContinue.py
lldb-api :: commands/expression/call-function/TestCallUserDefinedFunction.py
lldb-api :: commands/expression/char/TestExprsChar.py
lldb-api :: commands/expression/class_template_specialization_empty_pack/TestClassTemplateSpecializationParametersHandling.py
lldb-api :: commands/expression/context-object/TestContextObject.py
lldb-api :: commands/expression/formatters/TestFormatters.py
lldb-api :: commands/expression/import_base_class_when_class_has_derived_member/TestImportBaseClassWhenClassHasDerivedMember.py
lldb-api :: commands/expression/inline-namespace/TestInlineNamespace.py
lldb-api :: commands/expression/namespace-alias/TestInlineNamespaceAlias.py
lldb-api :: commands/expression/no-deadlock/TestExprDoesntBlock.py
lldb-api :: commands/expression/pr35310/TestExprsBug35310.py
lldb-api :: commands/expression/static-initializers/TestStaticInitializers.py
lldb-api :: commands/expression/test/TestExprs.py
lldb-api :: commands/expression/timeout/TestCallWithTimeout.py
lldb-api :: commands/expression/top-level/TestTopLevelExprs.py
lldb-api :: commands/expression/unwind_expression/TestUnwindExpression.py
lldb-api :: commands/expression/xvalue/TestXValuePrinting.py
lldb-api :: functionalities/thread/main_thread_exit/TestMainThreadExit.py
lldb-api :: lang/cpp/call-function/TestCallCPPFunction.py
lldb-api :: lang/cpp/chained-calls/TestCppChainedCalls.py
lldb-api :: lang/cpp/class-template-parameter-pack/TestClassTemplateParameterPack.py
lldb-api :: lang/cpp/constructors/TestCppConstructors.py
lldb-api :: lang/cpp/function-qualifiers/TestCppFunctionQualifiers.py
lldb-api :: lang/cpp/function-ref-qualifiers/TestCppFunctionRefQualifiers.py
lldb-api :: lang/cpp/global_operators/TestCppGlobalOperators.py
lldb-api :: lang/cpp/llvm-style/TestLLVMStyle.py
lldb-api :: lang/cpp/multiple-inheritance/TestCppMultipleInheritance.py
lldb-api :: lang/cpp/namespace/TestNamespace.py
lldb-api :: lang/cpp/namespace/TestNamespaceLookup.py
lldb-api :: lang/cpp/namespace_conflicts/TestNamespaceConflicts.py
lldb-api :: lang/cpp/operators/TestCppOperators.py
lldb-api :: lang/cpp/overloaded-functions/TestOverloadedFunctions.py
lldb-api :: lang/cpp/rvalue-references/TestRvalueReferences.py
lldb-api :: lang/cpp/static_methods/TestCPPStaticMethods.py
lldb-api :: lang/cpp/template/TestTemplateArgs.py
lldb-api :: python_api/thread/TestThreadAPI.py
```
There are other failures that are due to different problems, and this
change does not make those worse.
(I have no plans to run the test suite with `-mthumb` regularly, I just
did it to test some other refactoring)
As a follow-up to #151177, when lowering SELECT_CC nodes of absolute
difference patterns, drop poison-generating flags from the negated
operand to avoid inadvertently propagating poison.
As discussed in the PR above, I didn't find practical issues with the
current code, but it seems safer to do this preemptively.
Add the llvm.amdgcn.call.whole.wave intrinsic for calling whole wave
functions. This will take as its first argument the callee with the
amdgpu_gfx_whole_wave calling convention, followed by the call
parameters which must match the signature of the callee except for the
first function argument (the i1 original EXEC mask, which doesn't need
to be passed in). Indirect calls are not allowed.
Make direct calls to amdgpu_gfx_whole_wave functions a verifier error.
Unspeakable horrors happen around calls from whole wave functions, the
plan is to improve the handling of caller/callee-saved registers in
a future patch.
Tail calls are also handled in a future patch.
The main change in this patch is we go from emitting the expression:
@ cfa - NumBytes - NumScalableBytes * VG
To:
@ cfa - VG * NumScalableBytes - NumBytes
That is, VG is the first expression. This is for a future patch that
adds an alternative way to resolve VG (which uses the CFA, so it is
convenient for the CFA to be at the top of the stack).
Since doing this is fairly churn-heavy, I took the opportunity to also
save up to 4-bytes per SVE CFI expression. This is done by folding
LEB128 constants to literals when in the range 0 to 31, and using the
offset in `DW_OP_breg*` expressions.
Failing on the macOS matrix bot for Clang-15:
```
07:38:40 File "/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake-matrix/llvm-project/lldb/test/API/commands/expression/import-std-module/queue/TestQueueFromStdModule.py", line 38, in test
07:38:40 self.expect_expr(
07:38:40 File "/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake-matrix/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 2571, in expect_expr
07:38:40 value_check.check_value(self, eval_result, str(eval_result))
07:38:40 File "/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake-matrix/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 301, in check_value
07:38:40 test_base.assertSuccess(val.GetError())
07:38:40 File "/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake-matrix/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 2606, in assertSuccess
07:38:40 self.fail(self._formatMessage(msg, "'{}' is not success".format(error)))
07:38:40 AssertionError: 'error: /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/module.modulemap:93:11: header 'stdarg.h' not found
07:38:40 93 | header "stdarg.h" // note: supplied by the compiler
07:38:40 | ^
07:38:40 /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake-matrix/clang_1501_build/include/c++/v1/ctype.h:38:15: submodule of top-level module 'Darwin' implicitly imported here
07:38:40 38 | #include_next <ctype.h>
07:38:40 | ^
07:38:40 error: While building module 'std' imported from <lldb wrapper prefix>:42:
```
Update clang-cl/LLVM to 20.1.8.
Update to llvm-mingw 20250709 (with also is built on LLVM 20.1.8). This
release of llvm-mingw is the first release to be built with PGO, making
it significantly faster for the CI runs (on par with the clang-cl
cases); running the current tests in around 1 h rather than 1 h 20 min.
https://github.com/llvm/llvm-project/pull/150511 changed the
canonicalization pattern to not allow casts from ranked to unranked
anymore. This patch restores this functionality, while still keeping the
fix to preserve memory space and layout.
## Purpose
This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch annotates symbols that were recently
added to LLVM and fixes incorrectly annotated symbols.
## Background
This effort is tracked in #109483. Additional context is provided in
[this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).
## Overview
The bulk of these changes were generated automatically using the
[Interface Definition Scanner (IDS)](https://github.com/compnerd/ids)
tool, followed formatting with `git clang-format`.
The following manual adjustments were also applied after running IDS:
- Add `LLVM_EXPORT_TEMPLATE` and `LLVM_TEMPLATE_ABI` annotations to
explicitly instantiated instances of `llvm::object::SFrameParser`.
## Validation
On Windows 11:
```
cmake -B build -S llvm -G Ninja -DLLVM_ENABLE_PROJECTS="llvm;clang;clang-tools-extra;lldb;lld" -DLLVM_OPTIMIZED_TABLEGEN=ON -DLLVM_BUILD_LLVM_DYLIB=ON -DLLVM_BUILD_LLVM_DYLIB_VIS=ON -DLLVM_LINK_LLVM_DYLIB=ON -DLLVM_BUILD_TESTS=ON -DCLANG_LINK_CLANG_DYLIB=OFF -DCMAKE_BUILD_TYPE=Release
ninja -C build
```
## Purpose
Add missing annotations so that LLVM can build as a shared library on
Linux with `-fvisibility=hidden` using GCC.
## Overview
Add a couple of annotations that make GCC happy:
* Add `LLVM_TEMPLATE_ABI` to the `extern` declaration of
`LoopBase<MachineBasicBlock, MachineLoop>`. The instantiation of this of
this template is already properly annotated with `LLVM_EXPORT_TEMPLATE`
and this declaration was missed. This omission did not cause problems
with MSVC or Clang but results in undefined reference to the destructor
with GCC.
* Add `LLVM_ATTRIBUTE_VISIBILITY_DEFAULT` to the template declaration
for `InnerAnalysisManagerProxy::Key`. `LLVM_ABI` cannot be used here
because MSVC disallows storage-class specifiers on class members outside
of the class declaration (C2720). Since
`LLVM_ATTRIBUTE_VISIBILITY_DEFAULT` only applies to non-Windows targets,
it can be used in its place. Omitting this annotation does not cause
problems with Clang but results in undefined reference to
`InnerAnalysisManagerProxy::Key` fields for for explicitly instantiated
declarations when compiling with GCC.
## Background
This effort is tracked in #109483. Additional context is provided in
[this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).
## Validation
On Fedora 42, build with GCC `LLVM_BUILD_LLVM_DYLIB=ON`,
`LLVM_BUILD_LLVM_DYLIB_VIS=ON`, and `LLVM_LINK_LLVM_DYLIB=ON`.
```
cmake -B build -S llvm -G Ninja -DLLVM_ENABLE_PROJECTS="llvm;clang;clang-tools-extra;lldb;lld" -DLLVM_OPTIMIZED_TABLEGEN=ON -DLLVM_BUILD_TESTS=ON -DLLVM_BUILD_EXAMPLES=ON -DLLDB_ENABLE_PYTHON=NO -DLLVM_BUILD_LLVM_DYLIB=ON -DLLVM_BUILD_LLVM_DYLIB_VIS=ON -DLLVM_LINK_LLVM_DYLIB=ON -DCLANG_LINK_CLANG_DYLIB=OFF -DCMAKE_BUILD_TYPE=Release
ninja -C build
```
Fixes a bug that surfaces in frame recognizers.
Details about the bug:
A new frame recognizer is configured to match a specific symbol
(`swift_willThrow`). This is an `extern "C"` symbol defined in a C++
source file. When Swift is built with debug info, the function
`ParseFunctionFromDWARF` will use the debug info to construct a function
name that looks like a C++ declaration (`::swift_willThrow(void *,
SwiftError**)`). The `Mangled` instance will have this string as its
`m_demangled` field, and have _no_ string for its `m_mangled` field.
The result is the frame recognizer would not match the symbol to the
name (`swift_willThrow` != `::swift_willThrow(void *, SwiftError**)`.
By changing `ParseFunctionFromDWARF` to assign both a demangled name and
a mangled, frame recognizers can successfully match symbols in this
configuration.
Forward SourceLocation to `EmitCall` so that clang triggers an error
when a function inside `[[gnu::cleanup(func)]]` is annotated with
`[[gnu::error("some message")]]`.
resolves#146520
This pattern previously checked a specific variant of 4 bytes being
packed that is generated by unaligned load expansion.
Our individual PACK patterns don't handle that particular case because a
DAG combine turns (or (or A, (shl B, 8)), (shl (or C, (shl D, 8)), 16))
into (or (or A, (shl B, 8)), (or (shl C, 16), (shl D, 24)). After this,
the outer OR doesn't have a shl operand so we needed a pattern that
looks through 2 layers of OR.
To match this pattern we don't need to look at the (or A, (shl B, 8))
part since that part wasn't affected by the DAG combine and can be
matched to PACKH by itself. It's enough to make sure that part of the
pattern has zeros in the upper 16 bits.
This allows tablegen to automatically generate more permutations of this pattern.
The associative variant expansion is limited to 3 children.
Languages other than C/C++ don't necessarily emit mangled names in the
`UniqueName` field of type records. Rust specifically emits a unique ID
that doesn't contain the name.
For example, `(i32, i32)` is emitted as
```llvm
!266 = !DICompositeType(
tag: DW_TAG_structure_type, name: "tuple$<i32,i32>", file: !9, size: 64, align: 32,
elements: !267, templateParams: !17, identifier: "19122721b0632fe96c0dd37477674472"
)
```
which results in
```
0x1091 | LF_STRUCTURE [size = 72, hash = 0x1AC67] `tuple$<i32,i32>`
unique name: `19122721b0632fe96c0dd37477674472`
vtable: <no type>, base list: <no type>, field list: 0x1090
options: has unique name, sizeof 8
```
In C++ with Clang and MSVC, a structure similar to this would result in
```
0x136F | LF_STRUCTURE [size = 44, hash = 0x30BE2] `MyTuple`
unique name: `.?AUMyTuple@@`
vtable: <no type>, base list: <no type>, field list: 0x136E
options: has unique name, sizeof 8
```
With this PR, if a `UniqueName` is encountered that couldn't be parsed,
it will fall back to using the undecorated (→ do the same as if the
unique name is empty/unavailable).
I'm not sure how to test this. Maybe compiling the LLVM IR that rustc
emits?
Fixes#152051.
PyMemoryView_FromMemory is part of stable ABI but the flag constants
such as PyBUF_READ are not. This was fixed in Python 3.11 [1], but still
requires this workaround when using older versions.
[1] https://github.com/python/cpython/issues/98680
That change adds support for folding a SETCC when one or both of the
operands is a TRUNCATE with the appropriate no-wrap flags. This pattern
can occur when promoting i8 operations in NVPTX, and we currently have
some ISel rules to try to handle it.
__clc_mem_fence and __clc_work_group_barrier function have two
parameters memory_scope and memory_order. The design allows the clc
functions to implement SPIR-V ControlBarrier and MemoryBarrier
functions in the future.
The default memory ordering in clc is set to __ATOMIC_SEQ_CST, which is
also the default and strongest ordering in OpenCL and C++.
OpenCL cl_mem_fence_flags parameter is converted to combination of
__MEMORY_SCOPE_DEVICE and __MEMORY_SCOPE_WRKGRP, which is passed to clc.
llvm-diff shows no change to nvptx64--nvidiacl.bc.
llvm-diff show a small change to amdgcn--amdhsa.bc and the number of
LLVM IR instruction is reduced by 1: https://alive2.llvm.org/ce/z/_Uhqvt
This PR implements a semantic checker to ensure the legality of nested
OpenACC parallelism. The following are quotes from Spec 3.3. We need to
disallow loops from having parallelism at the same level as or at a
sub-level of child loops.
>**2.9.2 gang clause**
>[2064] <ins>When the parent compute construct is a parallel
construct</ins>, or on an orphaned loop construct, the gang clause
behaves as follows. (...) The associated dimension is the value of the
dim argument, if it appears, or is dimension one. The dim argument must
be a constant positive integer with value 1, 2, or 3.
>[2112] The region of a loop with a gang(dim:d) clause may not contain a
loop construct with a gang(dim:e) clause where e >= d unless it appears
within a nested compute region.
>[2074] <ins>When the parent compute construct is a kernels
construct</ins>, the gang clause behaves as follows. (...)
>[2148] The region of a loop with the gang clause may not contain
another loop with a gang clause unless within a nested compute region.
>**2.9.3 worker clause**
>[2122]/[2129] The region of a loop with the worker clause may not
contain a loop with the gang or worker clause unless within a nested
compute region.
>**2.9.4 vector clause**
>[2141]/[2148] The region of a loop with the vector clause may not
contain a loop with a gang, worker, or vector clause unless within a
nested compute region.
https://openacc.org/sites/default/files/inline-images/Specification/OpenACC-3.3-final.pdf