ExtractFromEnd only has 2 uses, extracting the last and penultimate
elements. Replace it with 2 separate opcodes, removing the need to
materialize and handle a constant argument.
PR: https://github.com/llvm/llvm-project/pull/137030
These instructions are included in XRivosVisni. They perform a scalar
insert into a vector (with a potentially non-zero index) and a scalar
extract from a vector (with a potentially non-zero index) respectively.
They're very analogous to vmv.s.x and vmv.x.s respectively.
The instructions do have a couple restrictions:
1) Only constant indices are supported w/a uimm5 format.
2) There are no FP variants.
One important property of these instructions is that their throughput
and latency are expected to be LMUL independent.
Add following semantic checks for ALLOCATE directive as per OpenMP 6.0
standard.
- List item in ALLOCATE directive must not be a dummy argument
- List item in ALLOCATE directive must not have POINTER attribute
- List item in ALLOCATE directive must not be a associate name
Towards
#https://github.com/llvm/llvm-project/issues/134809#issuecomment-2787206873
This change moves WasmSym from a static global struct to an instance
owned by Ctx, allowing it to be reset cleanly between linker runs. This
enables safe support for multiple invocations of wasm-ld within the same
process
Changes done
- Converted WasmSym from a static struct to a regular struct with
instance members.
- Added a std::unique_ptr<WasmSym> wasmSym field inside Ctx.
- Reset wasmSym in Ctx::reset() to clear state between links.
- Replaced all WasmSym:: references with ctx.wasmSym->.
- Removed global symbol definitions from Symbols.cpp that are no longer
needed.
Clearing wasmSym in ctx.reset() ensures a clean slate for each link
invocation, preventing symbol leakage across runs—critical when using
wasm-ld/lld as a reentrant library where global state can cause subtle,
hard-to-debug errors.
---------
Co-authored-by: Vassil Vassilev <v.g.vassilev@gmail.com>
The AArch64 procedure call standard does not mandate that the callee
extends the return value. Clang does not add signext to functions
returning i8 or i16 on linux aarch64, but flang does.
This means that runtime routines returning i8's will have signext on the
callsite/declaration, but not on the implementation, and the call site
will assume the return value has already been sign extended when it has
not. This showed up in a test case calling MINVAL on an array of
INTEGER*1.
Adjust our integer extension flags to match clang and aarch64pcs on
linux. The behavior on Darwin should be preserved. This is listed on the
apple developer guide as a divergence from aarch64pcs.
Main algorithm:
The Taylor series expansion of `asin(x)` is:
```math
\begin{align*}
asin(x) &= x + x^3 / 6 + 3x^5 / 40 + ... \\
&= x \cdot P(x^2) \\
&= x \cdot P(u) &\text{, where } u = x^2.
\end{align*}
```
For the fast path, we perform range reduction mod 1/64 and use degree-7
(minimax + Taylor) polynomials to approximate `P(x^2)`.
When `|x| >= 0.5`, we use the transformation:
```math
u = \frac{1 + x}{2}
```
and apply half-angle formula to reduce `asin(x)` to:
```math
\begin{align*}
asin(x) &= sign(x) \cdot \left( \frac{\pi}{2} - 2 \cdot asin(\sqrt{u}) \right) \\
&= sign(x) \cdot \left( \frac{\pi}{2} - 2 \cdot \sqrt{u} \cdot P(u) \right).
\end{align*}
```
Since `0.5 <= |x| <= 1`, `|u| <= 0.5`. So we can reuse the polynomial
evaluation of `P(u)` when `|x| < 0.5`.
For the accurate path, we redo the computations in 128-bit precision
with degree-15 (minimax + Taylor) polynomials to approximate `P(u)`.
This commit ensures that convolution operators including: conv2d,
depthwise_conv2d, transpose_conv2d and conv3d, can have unranked
input/weight operands.
In order to support operands with unranked tensors, the tablegen
definition was relaxed. The relaxation of tensor type will later be
checked by the validation pass, should the user wish to use it.
Signed-off-by: Luke Hutton <luke.hutton@arm.com>
Otherwise `lldb-dotest` fails with:
```
Traceback (most recent call last):
File "/Users/jonas/Git/llvm-worktrees/llvm-project/lldb/test/API/dotest.py", line 8, in <module>
lldbsuite.test.run_suite()
File "/Users/jonas/Git/llvm-worktrees/llvm-project/lldb/packages/Python/lldbsuite/test/dotest.py", line 1063, in run_suite
visit("Test", dirpath, filenames)
File "/Users/jonas/Git/llvm-worktrees/llvm-project/lldb/packages/Python/lldbsuite/test/dotest.py", line 701, in visit
raise Exception("Found multiple tests with the name %s" % name)
Exception: Found multiple tests with the name TestReverseContinueNotSupported.py
```
verifyConvOpErrorIf() assumes output channel is the 4th dimension of the
output type but this is wrong for conv3d which now uses that verifier.
Use rank - 1 which works accross the operations using this verifier
(conv2d, conv3d and depthwise_conv3d).
In #132722 spills of ZT0 were disabled around all SME ABI routines to
avoid a case where ZT0 is spilled before ZA is enabled (resulting in a
crash).
It turns out that the ABI does not promise that routines will preserve
ZT0 (however in practice they do), so generally disabling ZT0 spills for
ABI routines is not correct.
The case where a crash was possible was "aarch64_new_zt0" functions with
ZA disabled on entry and a ZT0 spill around __arm_tpidr2_save. In this
case, ZT0 will be undefined at the call to __arm_tpidr2_save, so this
patch avoids the ZT0 spill by marking the callsite with
"aarch64_zt0_undef". This attribute only applies to callsites and marks
that at the point the call is made ZT0 is not defined, so does not need
preserving.
Unlike C++, C allows the definition of an uninitialized `const` object.
If the object has static or thread storage duration, it is still
zero-initialized, otherwise, the object is left uninitialized. In either
case, the code is not compatible with C++.
This adds a new diagnostic group, `-Wdefault-const-init-unsafe`, which
is on by default and diagnoses any definition of a `const` object which
remains uninitialized.
It also adds another new diagnostic group, `-Wdefault-const-init` (which
also enabled the `unsafe` variant) that diagnoses any definition of a
`const` object (including ones which are zero-initialized). This
diagnostic is off by default.
Finally, it adds `-Wdefault-const-init` to `-Wc++-compat`. GCC diagnoses
these situations under this flag.
Fixes#19297
Despite our attempt (build-docs.sh)
to build the documentation with SVG,
it still uses PNG https://llvm.org/doxygen/classllvm_1_1StringRef.html,
and that renders terribly on any high dpi display.
SVG leads to smasller installation and works fine
on all browser (that has been true for _a while_
https://caniuse.com/svg), so this patch just unconditionally build all
dot graphs as SVG in all subprojects and remove the option.
This PR implements the following 8 functions along with the tests.
```c++
int idivr(fract, fract);
long int idivlr(long fract, long fract);
int idivk(accum, accum);
long int idivlk(long accum, long accum);
unsigned int idivur(unsigned fract, unsigned fract);
unsigned long int idivulr(unsigned long fract, unsigned long fract);
unsigned int idivuk(unsigned accum, unsigned accum);
unsigned long int idivulk(unsigned long accum, unsigned long accum);
```
ref: https://www.iso.org/standard/51126.htmlFixes#129125
---------
Signed-off-by: krishna2803 <kpandey81930@gmail.com>
Reverted following several issues appearing related to fake uses, reported
on the github PR.
This reverts commit a9dff35ad251cd20376ab25b26d1e5394e18ff4c.
Based on the DAP specification.
The output categories stdout and stderr should only be used for the
debuggee's stdout and stderr.
```jsonc
/**
* The output category. If not specified or if the category is not
* understood by the client, `console` is assumed.
* Values:
* 'console': Show the output in the client's default message UI, e.g. a
* 'debug console'. This category should only be used for informational
* output from the debugger (as opposed to the debuggee).
* 'important': A hint for the client to show the output in the client's UI
* for important and highly visible information, e.g. as a popup
* notification. This category should only be used for important messages
* from the debugger (as opposed to the debuggee). Since this category value
* is a hint, clients might ignore the hint and assume the `console`
* category.
* 'stdout': Show the output as normal program output from the debuggee.
* 'stderr': Show the output as error program output from the debuggee.
* 'telemetry': Send the output to telemetry instead of showing it to the
* user.
* etc.
*/
category?: 'console' | 'important' | 'stdout' | 'stderr' | 'telemetry' | string;
```
What I am not sure if error should use the important category ?
---------
Signed-off-by: Ebuka Ezike <yerimyah1@gmail.com>
Per suggestion in
https://github.com/llvm/llvm-project/pull/128251#discussion_r2055916229,
adding a new helper function in `SValBuilder` to conjure a symbol when
given a `CallEvent`.
Tested manually (with assertions) that the `LocationContext *` obtained
from the `CallEvent` are identical to those passed in the original
argument.
Fixes the issue reported following the merge of #118026.
When a valid `musttail` call is made, the function it is made from must
return immediately after the call; if there are any cleanups left in the
function, then an error is triggered. This is not necessary for fake
uses however - it is perfectly valid to simply emit the fake use
"cleanup" code before the tail call, and indeed LLVM will automatically
move any fake uses following a tail call to come before the tail call.
Therefore, this patch specifically choose to handle fake use cleanups
when a musttail call is present by simply emitting them immediately
before the call.
This adds verifier checks for the gather op
to make sure the shapes of inputs and output
are consistent with respect to spec.
---------
Signed-off-by: Tai Ly <tai.ly@arm.com>
Co-authored-by: Luke Hutton <luke.hutton@arm.com>
This is the subset of binops (mul and fmul are already enabled) whose
behaviour fully aligns with the equivalent SVE intrinsic. The omissions
are integer divides and shifts that are defined to return poison for
values where the intrinsics have a defined result. These will be covered
in a seperate PR.
Adds a `combine` action (DAG operator) which allows for easy definition of
combine rule that only match one or more instructions, and defer all remaining
match/apply logic to C++ code.
This avoids the need for split match/apply function in such cases. One function
can do the trick as long as it returns `true` if it changed any code.
This is implemented as syntactic sugar over match/apply. The combine rule is
just a match pattern BUT every C++ pattern inside is treated as an "apply" function.
This makes it fit seamlessly with the current backend.
Fixes#92410
The CXX module initializer function, which is called at program startup,
needs to be tagged with Pointer Authentication and Branch Target
Identification marks whenever relevant.
Before this patch, in CPUs set up for PACBTI execution, the function
wasn't protected with return address signing and no BTI instruction was
inserted at the start of it, thus leading to an execution fault.
This patch fixes the issue by marking the function with the function
attributes related to PAC and BTI if relevant.
Add support for using the existing SCRATCH_STORE_BLOCK and
SCRATCH_LOAD_BLOCK instructions for saving and restoring callee-saved
VGPRs. This is controlled by a new subtarget feature, block-vgpr-csr. It
does not include WWM registers - those will be saved and restored
individually, just like before. This patch does not change the ABI.
Use of this feature may lead to slightly increased stack usage, because
the memory is not compacted if certain registers don't have to be
transferred (this will happen in practice for calling conventions where
the callee and caller saved registers are interleaved in groups of 8).
However, if the registers at the end of the block of 32 don't have to be
transferred, we don't need to use a whole 128-byte stack slot - we can
trim some space off the end of the range.
In order to implement this feature, we need to rely less on the
target-independent code in the PrologEpilogInserter, so we override
several new methods in SIFrameLowering. We also add new pseudos,
SI_BLOCK_SPILL_V1024_SAVE/RESTORE.
One peculiarity is that both the SI_BLOCK_V1024_RESTORE pseudo and the
SCRATCH_LOAD_BLOCK instructions will have all the registers that are not
transferred added as implicit uses. This is done in order to inform
LiveRegUnits that those registers are not available before the restore
(since we're not really restoring them - so we can't afford to scavenge
them). Unfortunately, this trick doesn't work with the save, so before
the save all the registers in the block will be unavailable (see the
unit test).
This was reverted due to failures in the builds with expensive checks
on, now fixed by always updating LiveIntervals and SlotIndexes in
SILowerSGPRSpills.
For
```c++
struct S {
constexpr S(int=0) : i(1) {}
int i;
};
constexpr volatile S vs;
```
reading from `vs.i` is not allowed, even though `i` is not volatile
qualified. Propagate the IsVolatile bit down the hierarchy, so we know
reading from `vs.i` is a volatile read.
This wasn't defined correctly and was unused. And it was causing a CI
build failure:
```
/home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/lldb/include/lldb/Core/DemangledNameInfo.h: In function ‘bool lldb_private::operator==(const lldb_private::DemangledNameInfo&, const lldb_private::DemangledNameInfo&)’:
/home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/lldb/include/lldb/Core/DemangledNameInfo.h:60:25: error: ‘const struct lldb_private::DemangledNameInfo’ has no member named ‘QualifiersRange’
60 | lhs.QualifiersRange) ==
| ^~~~~~~~~~~~~~~
/home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/lldb/include/lldb/Core/DemangledNameInfo.h:62:25: error: ‘const struct lldb_private::DemangledNameInfo’ has no member named ‘QualifiersRange’
62 | lhs.QualifiersRange);
| ^~~~~~~~~~~~~~~
111.469 [1284/7/3896] Building CXX object tools/lldb/source/Expression/CMakeFiles/lldbExpression.dir/ObjectFileJIT.cpp.o
111.470 [1284/6/3897] Building CXX object tools/lldb/source/Expression/CMakeFiles/lldbExpression.dir/Materializer.cpp.o
111.492 [1284/5/3898] Building CXX object tools/lldb/source/Expression/CMakeFiles/lldbExpression.dir/REPL.cpp.o
111.496 [1284/4/3899] Building CXX object tools/lldb/source/Expression/CMakeFiles/lldbExpression.dir/UserExpression.cpp.o
111.695 [1284/3/3900] Building CXX object tools/lldb/source/Commands/CMakeFiles/lldbCommands.dir/CommandObjectTarget.cpp.o
112.944 [1284/2/3901] Linking CXX shared library lib/libclang.so.21.0.0git
113.098 [1284/1/3902] Linking CXX shared library lib/libclang-cpp.so.21.0git
```
This makes sure that COMPILER_RT_ARMHF_TARGET is set properly for
targets without a specific "armhf" target name, such as armv7 windows.
This fixes the builtins test comparesf2_test.c on Windows on armv7.