Running the test-release.sh script with PGO enabled causes build errors
like:
ld.lld: error: Function Import: link error: linking module flags
'ProfileSummary': IDs have conflicting values
I believe this a build system bug due to the PGO profile data being
generated unconditionally. If you run `ninja check-all` and then `ninja
install` like we do in test-release.sh, then the profile data is
regenerated during `ninja install` and some of the clang tools which are
not test dependencies get build during the ninja install step with
different profile data. When these tools link against the LLVM
libraries, like libSupport, we end up with these errors.
(cherry picked from commit 0d2bb7f017f13ceae793fab7d83d3e67e8d8d8f8)
- The opcode of the mina.fmt and max.fmt is documented wrong, the
object code compiled from the same assembly with LLVM behaves
differently than one compiled with GCC and Binutils.
- Modify the opcodes to match Binutils. The actual opcodes are as
follows:
{5,3} | bits {2,0} of func
| ... | 100 | 101 | 110 | 111
-----+-----+-----+-----+-----+-----
010 | ... | min | mina | max | maxa
(cherry picked from commit 8b859c6e4a8e9ab9969582267bbdc04ed6bfa535)
Previously, the `override` keyword in C++ was being print in the left
side of a method decl, which is unsupported by C++ standard. This commit
fixes that by setting the `CanPrintOnLeft` field to 0, forcing it to be
print on the right side of the decl.
Signed-off-by: Giuliano Belinassi <gbelinassi@suse.de>
When doing GC, we normally won't have dangling references, because such
a reference would keep the other section alive, keeping it from being
eliminated.
However, references within DWARF sections are ignored for the purposes
of GC (because otherwise, they would essentially keep everything alive,
defeating the point of the GC), see
c579a5b1d92a9bc2046d00ee2d427832e0f5ddec for more context.
Therefore, dangling relocations against discarded symbols are ignored
within DWARF sections (see maybeReportRelocationToDiscarded in
Chunks.cpp). Consequently, we also shouldn't create any pseudo
relocations for these cases, as we run into a null pointer dereference
when trying to generate the pseudo relocation info for it.
This fixes the downstream bug
https://github.com/mstorsjo/llvm-mingw/issues/418, fixing crashes on
combinations with -ffunction-sections, -fdata-sections,
-Wl,--gc-sections and debug info.
(cherry picked from commit 9c970d5ecd6a85188cd2b0a941fcd4d60063ef81)
As reported in #86843, we must have #pragma GCC system_header before we
use #include_next, otherwise the compiler may not understand that we're
in a system header and may issue a diagnostic for our usage of
(cherry picked from commit 3c4b673af05f53e8a4d1a382b5c86367ea512c9e)
Fixes#84831
When matching carry pattern with `getAsCarry`, it may produce different
type of carryout. This patch checks such case and does early exit.
I'm new to DAG, any suggestion is appreciated.
(cherry picked from commit cb4453dc69d75064c9a82e9a6a9bf0d0ded4b204)
After aa02002491333c42060373bc84f1ff5d2c76b4ce we started passing the
user name to the create_release function and this was being interpreted
as the git tag.
(cherry picked from commit 0b9ce71a256d86c08f2b52ad2e337395b8f54b41)
This fixes an edge case where functions starting with inline assembly
would assert while trying to lower that inline asm instruction.
After this PR, for now we always add a no-op (xchgw in this case) without
considering the size of the next inline asm instruction. We might want
to revisit this in the future.
This fixes Unreal Engine 5.3.2 compilation with clang-cl and /HOTPATCH.
Should close https://github.com/llvm/llvm-project/issues/56234
(cherry picked from commit ec1af63dde58c735fe60d6f2aafdb10fa93f410d)
While attempting to build some Rust code, I was getting linker errors
due to missing functions that are implemented in `compiler-rt`. Turns
out that when `compiler-rt` is built for Arm64EC, all its function names
are mangled with the leading `#`.
This change removes the hard-coded list of library-implemented
intrinsics to mangle for Arm64EC, and instead assumes that they all must
be mangled.
__precision_ is declared as an int32_t which on some hexagon platforms
is defined as a long.
This change fixes errors like the ones below:
In file included from
/local/mnt/workspace/hex/llvm-project/libcxx/test/libcxx/diagnostics/format.nodiscard_extensions.compile.pass.cpp:19:
In file included from
/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/format:202:
In file included from
/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/format_functions.h:29:
/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/formatter_floating_point.h:700:17:
error: no matching function for call to 'max'
700 | int __p = std::max(1, (__specs.__has_precision() ?
__specs.__precision_ : 6));
| ^~~~~~~~
/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/formatter_floating_point.h:771:25:
note: in instantiation of function template specialization
'std::__formatter::__format_floating_point<float, char,
std::format_context>' requested here
771 | return __formatter::__format_floating_point(__value, __ctx,
__parser_.__get_parsed_std_specifications(__ctx));
| ^
/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/format_functions.h:284:42:
note: in instantiation of function template specialization
'std::__formatter_floating_point<char>::format<float,
std::format_context>' requested here
284 | __ctx.advance_to(__formatter.format(__arg, __ctx));
| ^
/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/format_functions.h:429:15:
note: in instantiation of function template specialization
'std::__vformat_to<std::back_insert_iterator<std::string>, char,
std::back_insert_iterator<std::__format::__output_buffer<char>>>'
requested here
429 | return std::__vformat_to(std::move(__out_it), __fmt, __args);
| ^
/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/format_functions.h:462:8:
note: in instantiation of function template specialization
'std::vformat_to<std::back_insert_iterator<std::string>>' requested here
462 | std::vformat_to(std::back_inserter(__res), __fmt, __args);
| ^
/local/mnt/workspace/hex/llvm-project/libcxx/test/libcxx/diagnostics/format.nodiscard_extensions.compile.pass.cpp:29:8:
note: in instantiation of function template specialization
'std::vformat<void>' requested here
29 | std::vformat("", std::make_format_args());
| ^
/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__algorithm/max.h:35:1:
note: candidate template ignored: deduced conflicting types for
parameter '_Tp' ('int' vs. 'int32_t' (aka 'long'))
35 | max(_LIBCPP_LIFETIMEBOUND const _Tp& __a, _LIBCPP_LIFETIMEBOUND
const _Tp& __b) {
| ^
/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__algorithm/max.h:43:1:
note: candidate template ignored: could not match
'initializer_list<_Tp>' against 'int'
43 | max(initializer_list<_Tp> __t, _Compare __comp) {
| ^
/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__algorithm/max.h:48:86:
note: candidate function template not viable: requires single argument
'__t', but 2 arguments were provided
48 | _LIBCPP_NODISCARD_EXT inline _LIBCPP_HIDE_FROM_ABI
_LIBCPP_CONSTEXPR_SINCE_CXX14 _Tp max(initializer_list<_Tp> __t) {
| ^ ~~~~~~~~~~~~~~~~~~~~~~~~~
/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__algorithm/max.h:29:1:
note: candidate function template not viable: requires 3 arguments, but
2 were provided
29 | max(_LIBCPP_LIFETIMEBOUND const _Tp& __a, _LIBCPP_LIFETIMEBOUND
const _Tp& __b, _Compare __comp) {
| ^
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
(cherry picked from commit e1830f586ac4c504f632bdb69aab49234256e899)
This adds support for using the L and H argument modifiers for twinword
operands in inline asm code, such as in:
```
%1 = tail call i64 asm sideeffect "rd %pc, ${0:L} ; srlx ${0:L}, 32, ${0:H}", "={o4}"()
```
This is needed by the Linux kernel.
(cherry picked from commit 697dd93ae30f489e5bcdac74c2ef2d876e3ca064)
Libc++'s own <stddef.h> is complicated by the need to handle various
platform-specific macros and to support duplicate inclusion. In reality,
we only need to add a declaration of nullptr_t to it, so we can simply
include the underlying <stddef.h> outside of our guards to let it handle
re-inclusion itself.
(cherry picked from commit 2950283dddab03c183c1be2d7de9d4999cc86131)
Add wheel publishing in addition to existing source distribution
publishing of lit.
Fixes#63369. This also uses the exact fix proposed by @EFord36 in
#63369.
Signed-off-by: Schuyler Eldridge <schuyler.eldridge@sifive.com>
(cherry picked from commit 8a8ab8f70cbb5507d1aa55efcd9c6e61ad4e891c)
Even if __need_unreachable is set, stddef.h should not declare
unreachable() in C++ because it conflicts with the declaration in
\<utility>.
(cherry picked from commit df69a305253f1d1b4a4066055a07101a4cc03e55)
We were passing the min and max values of the range to the ConstantRange
constructor, but the constructor expects the upper bound to 1 more than
the max value so we need to add 1.
We also need to use getNonEmpty so that passing 0, 0 to the constructor
creates a full range rather than an empty range. And passing smin,
smax+1 doesn't cause an assertion.
I believe this fixes at least some of the reason #79158 was reverted.
(cherry picked from commit 12836467b76c56872b4c22a6fd44bcda696ea720)
The range for these operations is being constructed without the
maximum value for the range due to an incorrect usage of the
ConstantRange constructor.
This causes Float2Int to think the range for 'uitofp i1' only
contains 0 instead of 0 and 1.
(cherry picked from commit 6295e677220bb6ec1fa8abe2f4a94b513b91b786)
In glibc versions before 2.33. `libc_nonshared.a` defines
`__fxstat/__fxstat64` but there is no `fstat/fstat64`. glibc 2.33 added
`fstat/fstat64` and obsoleted `__fxstat/__fxstat64`. Ports added after
2.33 do not provide `__fxstat/__fxstat64`, so our `fstat/fstat64`
interceptors using `__fxstat/__fxstat64` interceptors would lead to
runtime failures on such ports (LoongArch and certain RISC-V ports).
Similar to https://reviews.llvm.org/D118423, refine the conditions that
we define fstat{,64} interceptors. `fstat` is supported by musl/*BSD
while `fstat64` is glibc only.
(cherry picked from commit d5224b73ccd09a6759759791f58426b6acd4a2e2)
The most recent declaration of a template as a friend can introduce a
different template parameter depth compared to what we anticipate from a
CTAD guide.
Fixes https://github.com/llvm/llvm-project/issues/86769
This patch implements `affinity` for AIX, which is quite different from
platforms such as Linux.
- Setting CPU affinity through masks and related functions are not
supported. System call `bindprocessor()` is used to bind a thread to one
CPU per call.
- There are no system routines to get the affinity info of a thread. The
implementation of `get_system_affinity()` for AIX gets the mask of all
available CPUs, to be used as the full mask only.
- Topology is not available from the file system. It is obtained through
system SRAD (Scheduler Resource Allocation Domain).
This patch has run through the libomp LIT tests successfully with
`affinity` enabled.
(cherry picked from commit d394f3a162b871668d0c8e8bf6a94922fa8698ae)
The color methods in formatted_raw_ostream were forwarding directly to
the underlying stream without considering existing buffered output. This
would cause incorrect colored output for buffered uses of
formatted_raw_ostream.
Fix this issue by applying the color to the formatted_raw_ostream itself
and temporarily disabling scanning of any color related output so as not
to affect the position tracking.
This fix means that workarounds that forced formatted_raw_ostream
buffering to be disabled can be removed. In the case of llvm-objdump,
this can improve disassembly performance when redirecting to a file by
more than an order of magnitude on both Windows and Linux. This
improvement restores the disassembly performance when redirecting to a
file to a level similar to before color support was added.
(cherry picked from commit c9db031c48852af491747dab86ef6f19195eb20d)
This reapplies 272d1b44efdedb68c194970a610f0ca1b7b769c5 (from #85756),
which was reverted in
407937036fa7640f61f225474b1ea6623a40dbdd.
In the previous attempt, empty CMAKE_INSTALL_PREFIX was handled by
quoting them, in d209d1340b99d4fbd325dffb5e13b757ab8264ea. That made the
calls to cmake_path(ABSOLUTE_PATH) succeed, but the output paths of that
weren't actually absolute, which was required by file(RELATIVE_PATH).
Avoid this issue by constructing a non-empty base directory variable
to use for calculating the relative path.
(cherry picked from commit 50801f1095d33e712c3a51fdeef82569bd09007f)
We use the VSCBIQ/VSBIQ/VSBCBIQ family of instructions to implement
USUBO/USUBO_CARRY for the i128 data type. However, these instructions
use an inverted sense of the borrow indication flag (a value of 1
indicates *no* borrow, while a value of 0 indicated borrow). This does
not match the semantics of the boolean "overflow" flag of the
USUBO/USUBO_CARRY ISD nodes.
Fix this by generating code to explicitly invert the flag. These cancel
out of the result of USUBO feeds into an USUBO_CARRY.
To avoid unnecessary zero-extend operations, also improve the DAGCombine
handling of ZERO_EXTEND to optimize (zext (xor (trunc))) sequences where
appropriate.
Fixes: https://github.com/llvm/llvm-project/issues/83268
The existing implementation didn't handle when the input text section
was some offset from the output section.
This resulted in an assert in relaxGot() with an lld built with asserts
for some large binaries, or even worse, a silently broken binary with an
lld without asserts.
(cherry picked from commit 48048051323d5dd74057dc5f32df8c3c323afcd5)
Using range.size() "as is" means we accumulate 'size_t' values into
'int32_t' variable. This may produce narrowing conversion warnings
(particularly, on MSVC). The surrounding code seems to cast <x>.size()
to 'int32_t' so following this practice seems safe enough.
Co-authored-by: Ovidiu Pintican <ovidiu.pintican@intel.com>
(cherry picked from commit bce17034157fdfe4d898d30366c1eeca3442fa3d)
Delete the code that skips the CFI for the condition register on ELF32.
The code checked !MustSaveCR, which happened only when
Subtarget.is32BitELFABI(), where spillCalleeSavedRegisters is spilling
cr in a different way. The spill was missing CFI. After deleting this
code, a spill of cr2 to cr4 gets CFI in the same way as a spill of r14
to r31.
Fixes#83094
(cherry picked from commit 6b70c5d79fe44cbe01b0443454c6952c5b541585)
This ports the change from TSan
(0784b1eefa).
Testing notes: run 'sudo sysctl vm.mmap_rnd_bits=32; ninja check-msan'
before and after this patch.
N.B. aggressive ASLR may also cause the app to overlap with the
allocator region; for MSan, this was fixed in
af2bf86a37
(cherry picked from commit 58f7251820b14c93168726a24816d8a094599be5)
MSan divides the virtual address space into APP, INVALID, SHADOW and
ORIGIN memory. The allocator usually just steals a bit of the APP
address space: typically the bottom portion of the PIE binaries section,
which works because the Linux kernel maps from the top of the PIE
binaries section. However, if ASLR is very aggressive, the binary may
end up mapped in the same location where the allocator wants to live;
this results in a segfault.
This patch adds in a MappingDesc::ALLOCATOR type and enforces that the
memory range for the allocator is not occupied by anything else.
Since the allocator range information is not readily available in
msan.h, we duplicate the information from msan_allocator.cpp.
Note: aggressive ASLR can also lead to a different type of failure,
where the PIE binaries/libraries are mapped entirely outside of the
APP/ALLOCATOR sections; that will be addressed in a separate patch
(https://github.com/llvm/llvm-project/pull/85142).
(cherry picked from commit af2bf86a372cacf5f536bae06e2f2d3886eefb7b)
When emitting the storage (or memory copy operations) for constant
initializers, the decision whether to split a constant structure or
array store into a sequence of field stores or to use `memcpy` is
based upon the optimization level and the size of the initializer.
In afe8b93ffdfef5d8879e1894b9d7dda40dee2b8d, we extended this by
allowing constants to be split when the array (or struct) type does
not match the type of data the address to the object (constant) is
expected to contain. This may happen when `emitStoresForConstant` is
called by `EmitAutoVarInit`, as the element type of the address gets
shrunk. When this occurs, let the initializer be split into a bunch
of stores only under `-ftrivial-auto-var-init=pattern`.
Fixes: https://github.com/llvm/llvm-project/issues/84178.
This PR indicates that `addrspacecasts` are always no-ops on LoongArch.
Fixes#82330
(cherry picked from commit dd3e0a4643670f33850278ad281a358bbdd04e92)
PR #75527 fixed ARMFrameLowering to set the IsRestored flag for LR based
on all of the return instructions in the function, not just one.
However, there is also code in ARMLoadStoreOptimizer which changes
return instructions, but it set IsRestored based on the one instruction
it changed, not the whole function.
The fix is to factor out the code added in #75527, and also call it from
ARMLoadStoreOptimizer if it made a change to return instructions.
Fixes#80287.
(cherry picked from commit 749384c08e042739342c88b521c8ba5dac1b9276)
This test shows the bug where LR is used as a general-purpose register
on a code path where it is not spilled to the stack.
(cherry picked from commit 8779cf68e80dcc0b15e8034f39e6ce18b08352b6)
Previously we disabled to compute ODR hash for declarations from the
global module fragment. However, we missed the case that the functions
lives in the concept requiments (see the attached the test files for
example). And the mismatch causes the potential crashment.
Due to we will set the function body as lazy after we deserialize it and
we will only take its body when needed. However, we don't allow to take
the body during deserializing. So it is actually potentially problematic
if we set the body as lazy first and computing the hash value of the
function, which requires to deserialize its body. So we will meet a
crash here.
This patch tries to solve the issue by not taking the body of the
function from GMF. Note that we can't skip comparing the constraint
expression from the GMF directly since it is an key part of the
function selecting and it may be the reason why we can't return 0
directly for `FunctionDecl::getODRHash()` from the GMF.
This fixes the miscompile reported in #82430 by telling
isSimpleVIDSequence to sign extend to XLen instead of the width of the
indices, since the "sequence" of indices generated by a strided load
will be at XLen.
This was the simplest way I could think of getting isSimpleVIDSequence
to treat the indexes as if they were zero extended to XLenVT.
Another way we could do this is by refactoring out the "get constant
integers" part from isSimpleVIDSequence and handle them as APInts so we
can separately zero extend it.
Fixes#82430
(cherry picked from commit 815644b4dd882ade2e5649d4f97c3dd6f7aea200)
This shows the issue in #82430, but triggers it via the widening SEW combine
rather than a GEP that RISCVGatherScatterLowering doesn't detect.
(cherry picked from commit 2cd59bdc891ab59a1abfe5205feb45791a530a47)
atomicrmw xchg also accepts pointer and floating-point values. To handle
those, insert necessary casts to and from integer. This is what we do
for cmpxchg as well.
Fixes https://github.com/llvm/llvm-project/issues/85226.
(cherry picked from commit ff2fb2a1d78585944dcdb9061c8487fe1476dfa4)