If all incoming values of `div.d` are sign-extended and all users only
use the lower 32 bits, then convert them to W versions.
Fixes: #107946
(cherry picked from commit 0f47e3aebdd2a4a938468a272ea4224552dbf176)
After https://github.com/llvm/llvm-project/pull/92205, LoongArch ISel
selects `div.w` for `trunc i64 (sdiv i64 3202030857, (sext i32 X to
i64)) to i32`. It is incorrect since `3202030857` is not a signed 32-bit
constant. It will produce wrong result when `X == 2`:
https://alive2.llvm.org/ce/z/pzfGZZ
This patch adds additional `sexti32` checks to operands of
`PatGprGpr_32`.
Alive2 proof: https://alive2.llvm.org/ce/z/AkH5MpFix#107414.
(cherry picked from commit a111f9119a5ec77c19a514ec09454218f739454f)
SMUL_LOHI and UMUL_LOHI are different operations because the high part
of the result is different, so it is not OK to optimize the signed
version to MUL_U24/MULHI_U24 or the unsigned version to
MUL_I24/MULHI_I24.
The minbitwidth restrictions can be skipped only for immediate reduced
values, for other nodes still need to check if external users allow
bitwidth reduction.
Fixes https://github.com/llvm/llvm-project/issues/104422
(cherry picked from commit 56140a8258a3498cfcd9f0f05c182457d43cbfd2)
We were incorrectly not deduplicating results when looking up `_` which,
for a lambda init capture, would result in an ambiguous lookup.
The same bug caused some diagnostic notes to be emitted twice.
Fixes#107024
Corrupted live interval information can cause window scheduling to crash
in some cases. By adding the missing MBB's live interval information in the
ModuloScheduleExpander, the information can be correctly analyzed in
the window scheduler.
(cherry picked from commit 43ba1097ee747b4ec5e757762ed0c9df6255a292)
`compiler-rt/lib/builtins/divtc3.c` and `multc3.c` don't compile on
Solaris/sparcv9 with `gcc -m32`:
```
FAILED: projects/compiler-rt/lib/builtins/CMakeFiles/clang_rt.builtins-sparc.dir/divtc3.c.o
[...]
compiler-rt/lib/builtins/divtc3.c: In function ‘__divtc3’:
compiler-rt/lib/builtins/divtc3.c:22:18: error: implicit declaration of function ‘__compiler_rt_logbtf’ [-Wimplicit-function-declaration]
22 | fp_t __logbw = __compiler_rt_logbtf(
| ^~~~~~~~~~~~~~~~~~~~
```
and many more. It turns out that while the definition of `__divtc3` is
guarded with `CRT_HAS_F128`, the `__compiler_rt_logbtf` and other
declarations use `CRT_HAS_128BIT && CRT_HAS_F128` as guard. This only
shows up with `gcc` since, as documented in Issue #41838, `clang`
violates the SPARC psABI in not using 128-bit `long double`, so this
code path isn't used.
Fixed by changing the guards to match.
Tested on `sparcv9-sun-solaris2.11`.
(cherry picked from commit 63a7786111c501920afc4cc27a4633f76cdaf803)
The functionality to make use of SVE's load/store pair instructions for
the callee-saves is broken because the offsets used in the instructions
are incorrect.
This is addressed by #105518 but given the complexity of this code
and the subtleties around calculating the right offsets, we favour
disabling the behaviour altogether for LLVM 19.
This fix is critical for any programs being compiled with `+sme2`.
Only some instructions should be considered as potentially reducing the
size of the operands types, not all instructions should be considered.
Fixes https://github.com/llvm/llvm-project/issues/107036
(cherry picked from commit f381cd069965dabfeb277f30a4e532d7fd498f6e)
Fix#105571 which demonstrates an end() iterator dereference when
performing a non-empty splice to end() from a region that ends at
Src::end().
Rather than calling Instruction::adoptDbgRecords from Dest, create a marker
(which takes an iterator) and absorbDebugValues onto that. The "absorb" variant
doesn't clean up the source marker, which in this case we know is a trailing
marker, so we have to do that manually.
(cherry picked from commit 43661a1214353ea1773a711f403f8d1118e9ca0f)
This removes a redundant 'COPY' instruction that #81716 probably forgot
to remove.
This redundant COPY led to an issue because because code in
LiveRangeSplitting expects that the instruction emitted by
`loadRegFromStackSlot` is an instruction that accesses memory, which
isn't the case for the COPY instruction.
(cherry picked from commit 91a3c6f3d66b866bcda8a0f7d4815bc8f2dbd86c)
In #78436 we made some SourceLocExpr dependent to
deal with the fact that their value should reflect the name of
specialized function - rather than the rtemplate in which they are first
used.
However SourceLocExpr are unusual in two ways
- They don't depend on template arguments
- They morally depend on the context in which they are used (rather than
called from).
It's fair to say that this is quite novels and confuses clang. In
particular, in some cases, we used to create dependent SourceLocExpr and
never subsequently transform them, leaving dependent objects in
instantiated functions types. To work around that we avoid replacing
SourceLocExpr when we think they could remain dependent.
It's certainly not perfect but it fixes a number of reported bugs, and
seem to only affect scenarios in which the value of the SourceLocExpr
does not matter (overload resolution).
Fixes#106428Fixes#81155Fixes#80210Fixes#85373
---------
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
Fix codegen of consteval functions returning an empty class, and related
issues
If a class is empty, don't store it to memory: the store might overwrite
useful data. Similarly, if a class has tail padding that might overlap
other fields, don't store the tail padding to memory.
The problem here turned out a bit more general than I initially thought:
basically all uses of EmitAggregateStore were broken. Call lowering had
a method that did mostly the right thing, though: CreateCoercedStore.
Adapt CreateCoercedStore so it always does the conservatively right
thing, and use it for both calls and ConstantExpr.
Also, along the way, fix the "overlap" bit in AggValueSlot: the bit was
set incorrectly for empty classes in some cases.
Fixes#93040.
(cherry picked from commit 1762e01cca0186f1862db561cfd9019164b8c654)
Since we don't generate relocations for those, it doesn't make sense to
assert them here; fallout of
https://github.com/llvm/llvm-project/pull/106722.
(cherry picked from commit a3816b5a573dbf57ba3082a919ca2de6b47257e9)
Ever since 6859685a87ad093d60c8bed60b116143c0a684c7 (or, precisely,
84428dafc0941e3a31303fa1b286835ab2b8e234) relative jumps emitted by the
AVR codegen are off by two bytes - this pull request fixes it.
## Abstract
As compared to absolute jumps, relative jumps - such as rjmp, rcall or
brsh - have an implied `pc+2` behavior; that is, `jmp 100` is `pc =
100`, but `rjmp 100` gets understood as `pc = pc + 100 + 2`.
This is not reflected in the AVR codegen:
f95026dbf6/llvm/lib/Target/AVR/MCTargetDesc/AVRAsmBackend.cpp (L89)
... which always emits relative jumps that are two bytes too far - or
rather it _would_ emit such jumps if not for this check:
f95026dbf6/llvm/lib/Target/AVR/MCTargetDesc/AVRAsmBackend.cpp (L517)
... which causes most of the relative jumps to be actually resolved
late, by the linker, which applies the offsetting logic on its own,
hiding the issue within LLVM.
[Some time
ago](697a162fa6)
we've had a similar "jumps are off" problem that got solved by touching
`shouldForceRelocation()`, but I think that has worked only by accident.
It's exploited the fact that absolute vs relative jumps in the parsed
assembly can be distinguished through a "side channel" check relying on
the existence of labels (i.e. absolute jumps happen to named labels, but
relative jumps are anonymous, so to say). This was an alright idea back
then, but it got broken by 6859685a87ad093d60c8bed60b116143c0a684c7.
I propose a different approach:
- when emitting relative jumps, offset them by `-2` (well, `-1`,
strictly speaking, because those instructions rely on right-shifted
offset),
- when parsing relative jumps, treat `.` as `+2` and read `rjmp .+1234`
as `rjmp (1234 + 2)`.
This approach seems to be sound and now we generate the same assembly as
avr-gcc, which can be confirmed with:
```cpp
// avr-gcc test.c -O3 && avr-objdump -d a.out
int main() {
asm(
" foo:\n\t"
" rjmp .+2\n\t"
" rjmp .-2\n\t"
" rjmp foo\n\t"
" rjmp .+8\n\t"
" rjmp end\n\t"
" rjmp .+0\n\t"
" end:\n\t"
" rjmp .-4\n\t"
" rjmp .-6\n\t"
" x:\n\t"
" rjmp x\n\t"
" .short 0xc00f\n\t"
);
}
```
avr-gcc is also how I got the opcodes for all new tests like `inst-brbc.s`, so we should be good.
(cherry picked from commit 86a60e7f1e8f361f84ccb6e656e848dd4fbaa713)
When serialising to textual IR, there can be constant Values referred to
by DbgRecords that don't appear anywhere else, and have types hidden
even deeper in side them. Enumerate these when enumerating all types.
Test by Mikael Holmén.
(cherry picked from commit 25f87f2d703178bb4bc13a62cb3df001b186cba2)
WideInc/WideIncExpr can be null. Previously this worked out
because the comparison with WideIncExpr would fail. Now we have
accesses to WideInc prior to that. Avoid the issue with an
explicit check.
Fixes https://github.com/llvm/llvm-project/issues/106239.
(cherry picked from commit c9a5e1b665dbba898e9981fd7d48881947e6560e)
In these environments, the architecture name is armv7; recognize that
and enable the relevant runtimes.
Fix building the sanitizer_common library for this target, by using the
right registers for the architecture - this is similar to what
0c391133c9201ef29273554a1505ef855ce17668 did for aarch64.
(Still, address sanitizer doesn't support hooking functions at runtime
on armv7 or aarch64 - but other runtimes such as ubsan do work.)
(cherry picked from commit 5ea9dd8c7076270695a1d90b9c73718e7d95e0bf)
Run for clang-tidy checks available in release/19.x branch.
Some notable findings:
- altera-id-dependent-backward-branch, stays slow with 13%.
- misc-const-correctness become faster, going from 261% to 67%, but
still above
8% threshold.
- misc-header-include-cycle is a new SLOW check with 10% runtime
implications
- readability-container-size-empty went from 16% to 13%, still SLOW.
(cherry picked from commit b47d7ce8121b1cb1923e879d58eaa1d63aeaaae2)
because that doesn't work (results in `LINK : error LNK2001: unresolved
external symbol malloc`).
Based on the title of #91862 it was only intended for use in 64-bit
builds.
(cherry picked from commit ef26afcb88dcb5f2de79bfc3cf88a8ea10f230ec)
The CMake docs state that `check_c_source_compiles()` checks whether the
supplied code "can be compiled as a C source file and linked as an
executable (so it must contain at least a `main()` function)."
https://cmake.org/cmake/help/v3.30/module/CheckCSourceCompiles.html
In practice, this command is a wrapper around `try_compile()`:
- 2904ce00d2/Modules/CheckCSourceCompiles.cmake (L54)
- 2904ce00d2/Modules/Internal/CheckSourceCompiles.cmake (L101)
When `CMAKE_SOURCE_DIR` is compiler-rt/lib/builtins/,
`CMAKE_TRY_COMPILE_TARGET_TYPE` is set to `STATIC_LIBRARY`, so the
checks for `float16` and `bfloat16` support work as intended in a
Clang + compiler-rt runtime build for example, as it runs CMake
recursively from that directory.
However, when using llvm/ or compiler-rt/ as CMake source directory, as
`CMAKE_TRY_COMPILE_TARGET_TYPE` defaults to `EXECUTABLE`, these checks
will indeed fail if the code doesn't have a `main()` function. This
results in LLVM using x86 SIMD registers when generating calls to
builtins that, with Arch Linux's compiler-rt package for example,
actually use a GPR for their argument or return value as they use
`uint16_t` instead of `_Float16`.
This had been caught in post-commit review:
https://reviews.llvm.org/D145237#4521152. Use of the internal
`CMAKE_C_COMPILER_WORKS` variable is not what hides the issue, however.
PR #69842 tried to fix this by unconditionally setting
`CMAKE_TRY_COMPILE_TARGET_TYPE` to `STATIC_LIBRARY`, but it apparently
caused other issues, so it was reverted. This PR just adds a `main()`
function in the checks, as per the CMake docs.
(cherry picked from commit 68d8b3846ab1e6550910f2a9a685690eee558af2)
Pull request #96032 unconditionall adds the `cwchar` include in the
`format` umbrella header. However support for wchar_t can be disabled in
the build system (LIBCXX_ENABLE_WIDE_CHARACTERS).
This patch guards against inclusion of `cwchar` in `format` by checking
the `_LIBCPP_HAS_NO_WIDE_CHARACTERS` define.
For clarity I've also merged the include header section that `cwchar`
was in with the one above as they were both guarded by the same `#if`
logic.
(cherry picked from commit ec56790c3b27df4fa1513594ca9a74fd8ad5bf7f)
Not quite NFC, fixes splitBasicBlockBefore case when we split before an
instruction with debug records (but without the headBit set, i.e., we are
splitting before the instruction but after the debug records that come before
it). splitBasicBlockBefore splices the instructions before the split point into
a new block. Prior to this patch, the debug records would get shifted up to the
front of the spliced instructions (as seen in the modified unittest - I believe
the unittest was checking erroneous behaviour). We instead want to leave those
debug records at the end of the spliced instructions.
The functionality of the deleted `else if` branch is covered by the remaining
`if` now that `DestMarker` is set to the trailing marker if `Dest` is `end()`.
Previously the "===" markers were sometimes detached, now we always detach
them and always reattach them.
Note: `deleteTrailingDbgRecords` only "unlinks" the tailing marker from the
block, it doesn't delete anything. The trailing marker is still cleaned up
properly inside the final `if` body with `DestMarker->eraseFromParent();`.
Part 1 of 2 needed for #105571
(cherry picked from commit f5815534d180c544bffd46f09c28b6fc334260fb)
Resolves#106361. Adding #include <unordered_map> to
llvm/lib/Support/Z3Solver.cpp fixes compilation errors for homebrew
build on macOS with Xcode 14.
https://github.com/Homebrew/homebrew-core/actions/runs/10604291631/job/29390993615?pr=181351
shows that this is resolved when the include is patched in (Linux CI
failure is due to unrelated timeout).
(cherry picked from commit fcb3a0485857c749d04ea234a8c3d629c62ab211)
This reverts commit 2e1ad93961a3f444659c5d02d800e3144acccdb4.
Reverting #95202 in the 19.x branch
Fixes#106182
The change in #95202 causes code to crash and there is
no good way to backport a fix for that as there are ABI-impacting
changes at play.
Instead we revert #95202 in the 19x branch, fixing the regression
and preserving the 18.x behavior (which is GCC's behavior)
https://github.com/llvm/llvm-project/pull/106335#discussion_r1735174841
Called workflows don't have access to secrets by default, so we need to
explicitly pass secrets that we use.
(cherry picked from commit 9d81e7e36e33aecdee05fef551c0652abafaa052)
Addresses a regression in JavaScript when formatting anonymous classes.
---------
Co-authored-by: Owen Pan <owenpiano@gmail.com>
(cherry picked from commit 77d63cfd18aa6643544cf7acd5ee287689d54cca)
32-bit Windows uses `unsigned int` for uintptr_t and size_t.
Commit 18e06e3e2f3d47433e1ed323b8725c76035fc1ac changed uptr to
unsigned long, so it no longer matches the real size_t/uintptr_t and
therefore the current definition of usize result in:
`error C2821: first formal parameter to 'operator new' must be 'size_t'`
However, the real problem is that uptr is wrong to work around the fact
that we have local SIZE_T and SSIZE_T typedefs that trample on the
basetsd.h definitions of the same name and therefore need to match
exactly. Unlike size_t/ssize_t the uppercase ones always use unsigned
long (even on 32-bit).
This commit works around the build breakage by keeping the existing
definitions of uptr/sptr and just changing usize. A follow-up change
will attempt to fix this properly.
Fixes: https://github.com/llvm/llvm-project/issues/101998
Reviewed By: mstorsjo
Pull Request: https://github.com/llvm/llvm-project/pull/106151
(cherry picked from commit bb27dd853a713866c025a94ead8f03a1e25d1b6e)
We were using a `_LIBCPP_ASSERT_FOO` macro without including `<__assert>`.
rdar://134425695
(cherry picked from commit 0df78123fdaed39d5135c2e4f4628f515e6d549d)
In the last patch #82310, we used template depths to tell if such alias
decls contain lambdas, which is wrong because the lambda can also appear
as a part of the default argument, and that would make
`getTemplateInstantiationArgs` provide extra template arguments in
undesired contexts. This leads to issue #89853.
Moreover, our approach
for https://github.com/llvm/llvm-project/issues/82104 was sadly wrong.
We tried to teach `DeduceReturnType` to consider alias template
arguments; however, giving these arguments in the context where they
should have been substituted in a `TransformCallExpr` call is never
correct.
This patch addresses such problems by using a `RecursiveASTVisitor` to
check if the lambda is contained by an alias `Decl`, as well as
twiddling the lambda dependencies - we should also build a dependent
lambda expression if the surrounding alias template arguments were
dependent.
Fixes#89853Fixes#102760Fixes#105885
(cherry picked from commit b412ec5d3924c7570c2c96106f95a92403a4e09b)
These builtins are currently returning CR0 which will have the format
[0, 0, flag_true_if_saved, XER].
We only want to return flag_true_if_saved. This patch adds a shift to
remove the XER bit before returning.
(cherry picked from commit 327edbe07ab4370ceb20ea7c805f64950871d835)