`half` currently uses the default legalization of promoting to a `f32`;
however, this implementation implements math in a way that results in
incorrect rounding. Switch to the soft promote implementation, which
does not have this problem.
The SPARC ABI does not specify a `_Float16` type, so there is no concern
with keeping interface compatibility.
Fixes the SPARC part of
https://github.com/llvm/llvm-project/issues/97975
Fixes the SPARC part of
https://github.com/llvm/llvm-project/issues/97981
`f16` is passed and returned in vector registers on both x86 on AArch64,
the same calling convention as `f32`, so it is a straightforward type to
support. The calling convention support already exists, added as part of
a6065f0fa55a ("Arm64EC entry/exit thunks, consolidated. (#79067)").
Thus, add mangling and remove the error in order to make `half` work.
MSVC does not yet support `_Float16`, so for now this will remain an
LLVM-only extension.
Fixes the `f16` portion of
https://github.com/llvm/llvm-project/issues/94434
There are a number of platforms affected by [1]. It is easy enough to
check in a cross-platform way that bitcasts aren't using f16<->f32
libcalls; thus, add a generic test covering most supported
architectures, with an XFAIL for targets that are currently broken. As
they get fixed, this test will fail and can be updated.
[1]: https://github.com/llvm/llvm-project/issues/97981
* `tools/llvm-objcopy/MachO/update-section-object.test` was failing on
Windows since the input file (`macho_sections.s`) might be checked out
with the wrong line ending, resulting in difference in the size of
sections being checked.
* Removed the check for Windows in `AArch64Arm64ECCallLowering`: when
`llc` is run without an explicit target, the module's target triple is
unknown so this assert fires.
* Expect `llvm/test/CodeGen/Generic/allow-check.ll` to fail for Arm64EC:
Global ISel is not supported.
`f128` intrinsic functions from libm sometimes lower to `long double`
library calls when they instead need to be `f128` versions. Add a
generic test demonstrating current behavior.
MSVC always emits minimal CodeView metadata with compiler information,
even when debug info is otherwise disabled. Other tools may rely on this
metadata being present. For example, linkers use it to determine whether
hotpatching is enabled for the object file.
Over in 6a45fce, this flag (experimental-debuginfo-iterators) was
switched to do nothing, to flush out anything that depended on the
debug-intrinsics way of doing things. It's been a month and nothing's
super-broken, so we'll start to rip things out.
This commit deletes MergeFunc's debuginfo-iterators test: in d2942a86d7
it's documented that that test is specifically because of differences
between intrinsic/non-intrinsic data structures, and we're deleting the
possibility of that difference.
During the transition from debug intrinsics to debug records, we used
several different command line options to customise handling: the
printing of debug records to bitcode and textual could be independent of
how the debug-info was represented inside a module, whether the
autoupgrader ran could be customised. This was all valuable during
development, but now that totally removing debug intrinsics is coming
up, this patch removes those options in favour of a single flag
(experimental-debuginfo-iterators), which enables autoupgrade, in-memory
debug records, and debug record printing to bitcode and textual IR.
We need to do this ahead of removing the
experimental-debuginfo-iterators flag, to reduce the amount of
test-juggling that happens at that time.
There are quite a number of weird test behaviours related to this --
some of which I simply delete in this commit. Things like
print-non-instruction-debug-info.ll , the test suite now checks for
debug records in all tests, and we don't want to check we can print as
intrinsics. Or the update_test_checks tests -- these are duplicated with
write-experimental-debuginfo=false to ensure file writing for intrinsics
is correct, but that's something we're imminently going to delete.
A short survey of curious test changes:
* free-intrinsics.ll: we don't need to test that debug-info is a zero
cost intrinsic, because we won't be using intrinsics in the future.
* undef-dbg-val.ll: apparently we pinned this to non-RemoveDIs in-memory
mode while we sorted something out; it works now either way.
* salvage-cast-debug-info.ll: was testing intrinsics-in-memory get
salvaged, isn't necessary now
* localize-constexpr-debuginfo.ll: was producing "dead metadata"
intrinsics for optimised-out variable values, dbg-records takes the
(correct) representation of poison/undef as an operand. Looks like we
didn't update this in the past to avoid spurious test differences.
* Transforms/Scalarizer/dbginfo.ll: this test was explicitly testing
that debug-info affected codegen, and we deferred updating the tests
until now. This is just one of those silent gnochange issues that get
fixed by RemoveDIs.
Finally: I've added a bitcode test, dbg-intrinsics-autoupgrade.ll.bc,
that checks we can autoupgrade debug intrinsics that are in bitcode into
the new debug records.
After outputting block scalar string, the indent will be wrong.
This patch fixes Padding after block scalar string to ensure the correct
format of yaml.
The new added ut will fail in main.
```diff
@@ -3,4 +3,4 @@
Just a block
scalar doc
-scalar: a
+ scalar: a
...\n
```
These date back to when the non-intrinsic format of variable locations
was still being tested and was behind a compile-time flag, so not all
builds / bots would correctly run them. The solution at the time, to get
at least some test coverage, was to have tests opt-in to non-intrinsic
debug-info if it was built into LLVM.
Nowadays, non-intrinsic format is the default and has been on for more
than a year, there's no need for this flag to exist.
(I've downgraded the flag from "try" to explicitly requesting
non-intrinsic format in some places, so that we can deal with tests that
are explicitly about non-intrinsic format in their own commit).
Adds test case for X86 to check that the output of
@llvm.expect.with.probability's generic lowering is reasonable. This
replaces a generic test which only asserts that llc does not crash.
After 9fe78db4, the pass inserts `store volatile i32 -1, ptr %call_site`
before all invoke instruction except the one in the entry block, which
has the effect of bypassing landing pads on exceptions.
When configuring the call site for a potentially throwing instruction
check that it is not `InvokeInst` -- they are handled by earlier code.
Handle \@llvm.expect.with.probability in SelectionDAGBuilder, FastISel,
and IntrinsicLowering in the same way \@llvm.expect is handled, where
the value is passed through as-is. This can be reached if the intrinsic
is used without optimizations, where it would otherwise be properly
transformed out.
Fixes#115411 for SelectionDAG. A similar patch is likely needed for
GlobalISel.
This update follows up on change #112671 and is mostly a NFC, with the following exceptions:
- Introduced `-global-merging-skip-no-params` to bypass merging when no parameters are required.
- Parameter count is now calculated based on the unique hash count.
- Added `-global-merging-inst-overhead` to adjust the instruction overhead, reflecting the machine instruction size.
- Costs and benefits are now computed using the double data type. Since the finalization process occurs offline, this should not significantly impact build time.
- Moved a sorting operation outside of the loop.
This is a patch for
https://discourse.llvm.org/t/rfc-global-function-merging/82608.
This update follows up on change #112671 and is mostly a NFC, with the following exceptions:
- Introduced `-global-merging-skip-no-params` to bypass merging when no parameters are required.
- Parameter count is now calculated based on the unique hash count.
- Added `-global-merging-inst-overhead` to adjust the instruction overhead, reflecting the machine instruction size.
- Costs and benefits are now computed using the double data type. Since the finalization process occurs offline, this should not significantly impact build time.
- Moved a sorting operation outside of the loop.
This is a patch for
https://discourse.llvm.org/t/rfc-global-function-merging/82608.
This PR allows mixing `-basic-block-sections` with
`-enable-machine-function-splitter`. The strategy is to let
`-basic-block-sections` take precedence over functions with profiles.
SPARC ABI doesn't use stack realignment, so let LLVM know about it in
`SparcFrameLowering`. This has the side effect of making all overaligned
allocations go through `LowerDYNAMIC_STACKALLOC`, so implement the
missing logic there too for overaligned allocations.
This makes the SPARC backend not crash on overaligned `alloca`s and fix
https://github.com/llvm/llvm-project/issues/89569.
Fix bug that `mir-strip-debug` pass does not remove debug location from
bundled instructions.
Problem arises during testing that debug info does not affect
optimization passes output (`llvm-lit` with ` -Dllc="llc
-debugify-and-strip-all-safe"`), when pass operates on MIR with bundled
instructions + memory operands.
Let mir test check looks like:
```
CHECK-NEXT: BUNDLE {
CHECK-NEXT: $r3 = LD $r1, $r2 :: (load (s64) from %ir.a, !tbaa !2)
CHECK-NEXT: }
```
So as `mir-strip-debug` pass does not process bundled instructions,
running `llc -debugify-and-strip-all-safe` on the test will produce the
following output:
```
BUNDLE {
$r3 = LD $r1, $r2, debug-location !DILocation(line: 3, column: 1, scope: <0x608cb2b99b10>) :: (load (s64) from %ir.a, !tbaa !2)
}
```
And test will fail, but it shouldn't.
Seems like the root cause is that `mir-strip-debug` pass should remove
debug location from bundled instructions.
Previously this would fail if the default target enabled the loop
terminator folding pass (currently just RISC-V), as it runs after loop
strength reduction.
I came across the subtly when setting up lit for z/OS and running it on
a Linux on Power machine. Linux on Power is little endian. This was
resulting in all of these tests being run even though the target triple
was z/OS which is big endian. The lit should really be checking if the
target is little endian not the host. The previous way didn't handle
cross compilation while running lit.
extractelement-shuffle.ll: Test for bugfix in DAGCombiner, moved to
Generic.
2010-07-06-DbgCrash.ll and 2006-10-02-BoolRetCrash.ll: Bugfixes in X86,
run tests with X86 backend.
This was first introduced way back in in 2010 by
6c74a872a8d34d41b751efb68e335cbe91b5a5cc, and has little evidence
of use. Only one test attempts to make use of this, but it's
also redundant since it's also using strip to drop debug info anyway
(and that also makes the test buggy, since it's intended to test
with and without debug info).
The other tests using it were only added to test the option after
discovering it was untested and moved, in later commits.
This patch makes the final major change of the RemoveDIs project, changing the
default IR output from debug intrinsics to debug records. This is expected to
break a large number of tests: every single one that tests for uses or
declarations of debug intrinsics and does not explicitly disable writing
records.
If this patch has broken your downstream tests (or upstream tests on a
configuration I wasn't able to run):
1. If you need to immediately unblock a build, pass
`--write-experimental-debuginfo=false` to LLVM's option processing for all
failing tests (remember to use `-mllvm` for clang/flang to forward arguments to
LLVM).
2. For most test failures, the changes are trivial and mechanical, enough that
they can be done by script; see the migration guide for a guide on how to do
this: https://llvm.org/docs/RemoveDIsDebugInfo.html#test-updates
3. If any tests fail for reasons other than FileCheck check lines that need
updating, such as assertion failures, that is most likely a real bug with this
patch and should be reported as such.
For more information, see the recent PSA:
https://discourse.llvm.org/t/psa-ir-output-changing-from-debug-intrinsics-to-debug-records/79578
Remove support for the icmp and fcmp constant expressions.
This is part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179
As usual, many of the updated tests will no longer test what they were
originally intended to -- this is hard to preserve when constant
expressions get removed, and in many cases just impossible as the
existence of a specific kind of constant expression was the cause of the
issue in the first place.
Correct missing cases in a switch that result in @llvm.vp.fma.v4f32
getting lowered to a constrained fma intrinsic. Vector predicated
lowering to contrained intrinsics is not supported currently, and
there's no consensus on the path forward. We certainly shouldn't be
introducing constrained intrinsics into a function that isn't strictfp.
Problem found with D146845.
This patch changes debugify to support debug variable records, and
subsequently to no longer convert modules automatically to intrinsics
when entering debugify.
D33412/D33413 introduced this to support a clang pragma to set section
names for a symbol depending on if it would be placed in
bss/data/rodata/text, which may not be known until the backend. However,
for text we know that only functions will go there, so just directly set
the section in clang instead of going through a completely separate
attribute.
Autoupgrade the "implicit-section-name" attribute to directly setting
the section on a Fuction.
This is mostly NFC but some output does change due to consistently
inserting into poison rather than undef and using i64 as the index
type for inserts.
Hi,
AMD has it's own implementation of vector calls. This patch include the
changes to enable the use of AMD's math library using -fveclib=AMDLIBM.
Please refer https://github.com/amd/aocl-libm-ose
---------
Co-authored-by: Rohit Aggarwal <Rohit.Aggarwal@amd.com>
Now that the work embedding PGO information in SHT_LLVM_BB_ADDR_MAP ELF
sections has landed, there is no longer a need to keep around the
mbb-profile-dump flag.
Both RISC-V and AMDGPU(GCN) deploy two VirtRegRewriter in their codegen
pipeline. This test prematurely stops at the first one, which doesn't
cleanup the virtual register map and cause an assertion failure. Ideally
we can solve this by teaching `-stop-after` how to stop at the last
instance of a Pass, but we're just marking XFAIL for these two targets
for now.