llvm-project

Author	SHA1	Message	Date
Matt Arsenault	4d98ee2a22	ARM: Add watchos run line to llvm.sincos test (#166271 )	2025-11-03 18:20:24 -08:00
Matt Arsenault	c77b614564	ARM: Add more ABIs to llvm.sincos test (#166264 ) Make sure the iOS with/without sincos_stret are tested	2025-11-03 16:00:54 -08:00
Erik Enikeev	1523332fbd	[ARM] Mark function calls as possibly changing FPSCR (#160699 ) This patch does the same changes as D143001 for AArch64. This PR is part of the work on adding strict FP support in ARM, which was previously discussed in #137101.	2025-10-30 16:36:55 +00:00
Erik Enikeev	242ebcf13e	[ARM] Add instruction selection for strict FP (#160696 ) This consists of marking the various strict opcodes as legal, and adjusting instruction selection patterns so that 'op' is 'any_op'. The changes are similar to those in D114946 for AArch64. Custom lowering and promotion are set for some FP16 strict ops to work correctly. This PR is part of the work on adding strict FP support in ARM, which was previously discussed in #137101.	2025-10-29 21:43:43 +00:00
AZero13	5d0f1591f8	[DAGCombine] Improve bswap lowering for machines that support bit rotates (#164848 ) Source: Hacker's delight.	2025-10-25 10:17:15 -07:00
David Green	a1e59bdc17	[GlobalISel] Make scalar G_SHUFFLE_VECTOR illegal. (#140508 ) I'm not sure if this is the best way forward or not, but we have a lot of issues with forgetting that shuffle_vectors can be scalar again and again. (There is another example from the recently added known-bits code). As a scalar-dst shuffle vector is just an extract, and a scalar-source shuffle vector is just a build vector, this patch makes scalar shuffle vector illegal and adjusts the irbuilder to create the correct node as required. Most targets do this already through lowering or combines. Making scalar shuffles illegal simplifies gisel as a whole; it just requires transforms that create shuffles of new sizes to account for the scalar shuffle being illegal (mostly IRBuilder and LessElements).	2025-10-24 08:21:35 +01:00
Kees Cook	d130f40264	[ARM][KCFI] Add backend support for Kernel Control-Flow Integrity (#163698 ) Implement KCFI (Kernel Control Flow Integrity) backend support for ARM32, Thumb2, and Thumb1. The Linux kernel has supported ARM KCFI via Clang's generic KCFI implementation, but this has finally started to [cause problems](https://github.com/ClangBuiltLinux/linux/issues/2124) so it's time to get the KCFI operand bundle lowering working on ARM. Supports patchable-function-prefix with adjusted load offsets. Provides an instruction size worst case estimate of how large the KCFI bundle is so that range-limited instructions (e.g. cbz) know how big the indirect calls can become. ARM implementation notes: - Four-instruction EOR sequence builds the 32-bit type ID byte-by-byte to work within ARM's modified immediate encoding constraints. - Scratch register selection: r12 (IP) is preferred, r3 used as fallback when r12 holds the call target. r3 gets spilled/reloaded if it is being used as a call argument. - UDF trap encoding: 0x8000 | (0x1F << 5) | target_reg_index, similar to aarch64's trap encoding. Thumb2 implementation notes: - Logically the same as ARM - UDF trap encoding: 0x80 | target_reg_index Thumb1 implementation notes: - Due to register pressure, 2 scratch registers are needed: r3 and r2, which get spilled/reloaded if they are being used as call args. - Instead of EOR, add/lsl sequence to load immediate, followed by a compare. - No trap encoding. Update tests to validate all three sub targets.	2025-10-23 08:27:13 -07:00
paperchalice	542703fa68	[test][ARM] Remove unsafe-fp-math-uses (NFC) (#164744 ) Post cleanup for #164534.	2025-10-23 15:07:46 +08:00
Prabhu Rajasekaran	b7c7083c1f	[llvm] Update call graph ELF section type. (#164461 ) Make call graph section to have a dedicated type instead of the generic progbits type.	2025-10-22 15:08:36 -07:00
David Green	6d5dea63ed	[ARM][SDAG] Add llvm.lround half promotion. (#164235 ) Similar to #161088, add llvm.lround and llvm.llround promotion.	2025-10-21 16:56:55 +01:00
Prabhu Rajasekaran	cac8bdb56c	[NFC][llvm] Update call graph section's name. (#163429 ) Call graph section emitted by LLVM was named `.callgraph`. Renaming it to `.llvm.callgraph`.	2025-10-15 07:52:54 -07:00
paperchalice	bfee9db785	[DAGCombiner] Remove NoNaNsFPMath uses (#163504 ) Users should use `nnan` flag instead.	2025-10-15 21:22:13 +08:00
Simon Pilgrim	4c3ec9cda0	[ARM] carry.ll - regenerate test checks (#163172 )	2025-10-13 11:12:09 +00:00
Yatao Wang	c4bcbf02a5	[GlobalISel] Add G_SUB for computeNumSignBits (#158384 ) This patch ports the ISD::SUB handling from SelectionDAG’s ComputeNumSignBits to GlobalISel. Related to https://github.com/llvm/llvm-project/issues/150515. --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com> Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>	2025-10-13 10:45:26 +00:00
beetrees	11571a005a	Fix legalizing `FNEG` and `FABS` with `TypeSoftPromoteHalf` (#156343 ) Based on top of #157211. `FNEG` and `FABS` must preserve signalling NaNs, meaning they should not convert to f32 to perform the operation. Instead legalize to `XOR` and `AND`. Fixes almost all of #104915	2025-10-11 11:08:26 +09:00
Prabhu Rajasekaran	6fb87b231f	[llvm][AsmPrinter] Call graph section format. (#159866 ) Make .callgraph section's layout efficient in space. Document the layout of the section.	2025-10-10 12:20:11 -07:00
Brad Smith	31e85cc572	[Android] Drop workarounds for older Android API levels pre 9, 17 and 21 (#161911 ) Drop workarounds for Android API levels pre 9, 17, 21. The minimum Android API currently supported by the LTS NDK is 21.	2025-10-10 03:59:44 -04:00
Erik Enikeev	5c613f287d	[ARM] Add mayRaiseFPException to appropriate instructions and mark all instructions that read/write fpscr rounding bits as doing so (#160698 ) Added new register FPSCR_RM to correctly model interactions with rounding mode control bits of fpscr and to avoid performance regressions in the normal non-strictfp case. This PR is part of the work on adding strict FP support in ARM, which was previously discussed in #137101.	2025-10-07 22:19:53 +01:00
David Green	125f0ac757	[ARM][SDAG] Half promote llvm.lrint nodes. (#161088 ) As shown in #137101, fp16 lrint are not handled correctly on Arm. This adds soft-half promotion for them, reusing the function that promotes a value with operands (and can handle strict fp once that is added).	2025-10-07 22:04:39 +01:00
Luke Lau	795a115d19	[RegAlloc] Remove default restriction on non-trivial rematerialization (#159211 ) In the register allocator we define non-trivial rematerialization as the rematerialization of an instruction with virtual register uses. We have been able to perform non-trivial rematerialization for a while, but it has been prevented by default unless specifically overridden by the target in `TargetInstrInfo::isReMaterializableImpl`. The original reasoning for this given by the comment in the default implementation is because we might increase a live range of the virtual register, but we don't actually do this. LiveRangeEdit::allUsesAvailableAt makes sure that we only rematerialize instructions whose virtual registers are already live at the use sites. https://reviews.llvm.org/D106408 had originally tried to remove this restriction but it was reverted after some performance regressions were reported. We think it is likely that the regressions were caused by the fact that the old isTriviallyReMaterializable API sometimes returned true for non-trivial rematerializations. However https://github.com/llvm/llvm-project/pull/160377 recently split the API out into a separate non-trivial and trivial version and updated the call-sites accordingly, and https://github.com/llvm/llvm-project/pull/160709 and #159180 fixed heuristics which weren't accounting for the difference between non-trivial and trivial. With these fixes in place, this patch proposes to again allow non-trivial rematerialization by default which reduces a significant amount of spills and reloads across various targets. For llvm-test-suite built with -O3 -flto, we get the following geomean reduction in reloads: - arm64-apple-darwin: 11.6% - riscv64-linux-gnu: 8.1% - x86_64-linux-gnu: 6.5%	2025-10-04 22:50:44 +00:00
David Green	9e4af2ffa6	[ARM] Update and cleanup lround/llround tests. NFC Similar to f4370fb801aa, the fp16 tests do not work yet.	2025-10-04 19:52:46 +01:00
Yatao Wang	178e2a704b	[LLVM][CodeGen] Check Non Saturate Case in isSaturatingMinMax (#160637 ) Fix Issue #160611	2025-10-03 20:39:45 +01:00
AZero13	90582ad284	[ARM] shouldFoldMaskToVariableShiftPair should be true for scalars up to the biggest legal type (#158070 ) For ARM, we want to do this up to 32-bits. Otherwise the code ends up bigger and bloated.	2025-10-03 08:10:22 +01:00
David Green	f4370fb801	[ARM] Update and cleanup lrint/llrint tests. NFC Most of the fp16 cases still do not work properly. See #161088.	2025-10-02 21:51:45 +01:00
Matt Arsenault	c6e280e7ed	PeepholeOpt: Fix losing subregister indexes on full copies (#161310 ) Previously if we had a subregister extract reading from a full copy, the no-subregister incoming copy would overwrite the DefSubReg index of the folding context. There's one ugly rvv regression, but it's a downstream issue of this; an unnecessary same class reg-to-reg full copy was avoided.	2025-10-02 13:36:47 +09:00
Un1q32	133406e3d9	Reserve R9 on armv6 iOS 2.x (#150835 ) The iOS 2.x ABI had R9 as a reserved register, 3.0 made it available, but support for the 2.x ABI was never added to LLVM. We only use the 2.x ABI on armv6 since before 3.0 armv6 was the only architecture supported by iOS.	2025-09-30 21:05:27 -07:00
Matt Arsenault	9811226967	PeepholeOpt: Try to constrain uses to support subregister (#161338 ) This allows removing a special case hack in ARM. ARM's implementation of getExtractSubregLikeInputs has the strange property that it reports a register with a class that does not support the reported subregister index. We can however reconstrain the register to support this usage. This is an alternative to #159600. I've included the test, but the output is different. In this version the VMOVSR is replaced with an ordinary subregister extract copy.	2025-10-01 00:18:51 +09:00
paperchalice	8ce3b8b518	[ARM] Remove `UnsafeFPMath` uses (#151275 ) Try to remove `UnsafeFPMath` uses in arm backend. These global flags block some improvements like https://discourse.llvm.org/t/rfc-honor-pragmas-with-ffp-contract-fast/80797. Remove them incrementally.	2025-09-28 13:50:20 +08:00
David Green	9bf51b2b19	[ARM] Generate build-attributes more correctly in the presence of intrinsic declarations. (#160749 ) This code doesn't work very well, but this makes it work when intrinsic declarations are present. It now discounts function declarations from the set of attributes it looks at. The code would have worked better before 0ab5b5b8581d9f2951575f7245824e6e4fc57dec when module-level attributes could provide the information used to construct build-attributes.	2025-09-27 16:50:48 +01:00
David Green	02746f80c1	[ARM] Remove -fno-unsafe-math from a number of tests. NFC llvm.convert/to.fp16 and from.fp16 are no longer used / deprecated and do not need to be tested any more.	2025-09-26 11:48:34 +01:00
paperchalice	3257dc35fe	[ARM] Remove `UnsafeFPMath` uses in code generation part (#160801 ) Factor out from #151275 Remove all UnsafeFPMath uses but ABI tags related part.	2025-09-26 15:54:30 +08:00
paperchalice	add906ffe4	[ARM] Consider denormal mode in `ARMSubtarget` (#160456 ) Factor out from #151275. Add denormal mode to subtarget.	2025-09-25 07:51:48 +08:00
Simon Pilgrim	6f188056b3	[ARM] ha-alignstack-call.ll - regenerate test checks (#159988 )	2025-09-21 16:16:08 +00:00
Mikhail Gudim	562146499c	[CodeGen][NewPM] Port `ReachingDefAnalysis` to new pass manager. (#159572 ) In this commit: (1) Added new pass manager support for `ReachingDefAnalysis`. (2) Added printer pass. (3) Make old pass manager use `ReachingDefInfoWrapperPass`	2025-09-19 09:38:34 -04:00
Nikita Popov	1723f80b08	[ARM] Allow s constraints on half (#157860 ) Fix a regression from https://github.com/llvm/llvm-project/pull/147559.	2025-09-11 08:50:32 +02:00
Matt Arsenault	fc0f1fc695	ARM: Move remaining half convert libcall config into tablegen (#153408 ) The __truncdfhf2 handling is kind of convoluted, but reproduces the existing, likely wrong, handling.	2025-09-11 12:11:46 +09:00
Francesco Petrogalli	f82023d72e	[clang][driver][arm][macho] Default to -mframe-pointer=non-leaf. (#154216 ) The commit in [1] changes the behavior of the Arm backend for the attribute frame-pointer=all. Before [1], leaf functions marked with frame-pointer=all were not emitting the frame-pointer. After [1], frame-pointer=all started generating a frame pointer for all functions, including leaf functions. However, the default behavior for the driver in clang is to emit the command line option `-mframe-pointer=all` on Arm if no option for handling the frame pointer is specified on the command line. This causes observable regressions. This patch addresses these regressions by configuring the driver to emit `-mframe-pointer=non-leaf` when targeting Arm. Codegen tests dealing with frame pointer generation have been extended to handle functions with a tail call, since this configuration was missing. [1] 4a2bd78f5b0d0661c23dff9c4b93a393a49dbf9a	2025-09-09 18:39:26 +00:00
paperchalice	667f919214	[SelectionDAG][ARM] Propagate fast math flags in visitBRCOND (#156647 ) Factor out from #151275.	2025-09-06 20:44:25 +08:00
Nikita Popov	3f757a39f2	[CodeGen] Remove ExpandInlineAsm hook (#156617 ) This hook replaces inline asm with LLVM intrinsics. It was intended to match inline assembly implementations of bswap in libc headers and replace them with more optimizable implementations. At this point, it has outlived its usefulness (see https://github.com/llvm/llvm-project/issues/156571#issuecomment-3247638412), as libc implementations no longer use inline assembly for this purpose. Additionally, it breaks the "black box" property of inline assembly, which some languages like Rust would like to guarantee. Fixes https://github.com/llvm/llvm-project/issues/156571.	2025-09-04 09:28:11 +02:00
zhijian lin	36cb33bbca	support branch hint for AtomicExpandImpl::expandAtomicCmpXchg (#152366 ) This patch adds branch hints for AtomicExpandImpl::expandAtomicCmpXchg. For example, PowerPC supports branch hints as follows: ``` loop: lwarx r6,0,r3 # load and reserve cmpw r4,r6 #1st 2 operands equal? bne- exit #skip if not stwcx. r5,0,r3 #store new value if still res’ved bne- loop #loop if lost reservation exit: mr r4,r6 #return value from storage ``` `-` hints not taken, `+` hints taken.	2025-09-02 09:33:28 -04:00
beetrees	1fae86d5d3	[ARM] Improve fp16-promote.ll test (NFC) (#156341 ) Update the test to use `utils/update_llc_test_checks.py`, and add a check for `fneg`. Prerequisite to #156343.	2025-09-01 16:10:10 +00:00
Amara Emerson	64b9896754	[ARM] Use t2LDRLIT_ga_pcrel for loading stack guards with no-movt in PIC mode. (#156208 ) When using no-movt we don't use the pcrel version of the literal load. This change also unifies logic with the ARM version of this function as well, which has: ``` if (!Subtarget.useMovt() || ForceELFGOTPIC) { // For ELF non-PIC, use GOT PIC code sequence as well because R_ARM_GOT_ABS // does not have assembler support. if (TM.isPositionIndependent() || ForceELFGOTPIC) expandLoadStackGuardBase(MI, ARM::LDRLIT_ga_pcrel, ARM::LDRi12); else expandLoadStackGuardBase(MI, ARM::LDRLIT_ga_abs, ARM::LDRi12); return; } ``` rdar://138334512	2025-08-31 22:31:01 -07:00
AZero13	79dfe48865	[ARM] Set isCheapToSpeculateCtlz as true for hasV5TOps and no Thumb 1 (#154848 ) This is so that we don't expand to include unneeded 0 checks. Also fix the logic error in LegalizerInfo so it is NOT legal on Thumb1 in Fast-ISEL. Finally, remove the README entry regarding this issue.	2025-08-25 12:43:48 -07:00
paperchalice	945a186089	[DAGCombiner] Remove most `UnsafeFPMath` references (#146295 ) This pull request removes all references to `UnsafeFPMath` in dag combiner except FP_ROUND. - Set fast math flags in some tests.	2025-08-22 15:27:25 +08:00
Craig Topper	9240061800	[RegAllocFast] Don't align stack slots if the stack can't be realigned (#153682 ) This is the fast regalloc equivalent of 773771ba382b1fbcf6acccc0046bfe731541a599.	2025-08-19 08:17:26 -07:00
Arne Stenkrona	ea2f5395b1	[SimplifyCFG] Avoid threading for loop headers (#151142 ) Updates SimplifyCFG to avoid jump threading through loop headers if -keep-loops is requested. Canonical loop form requires a loop header that dominates all blocks in the loop. If we thread through a header, we risk breaking its domination of the loop. This change avoids this issue by conservatively avoiding threading through headers entirely. Fixes: https://github.com/llvm/llvm-project/issues/151144	2025-08-18 09:46:55 +00:00
David Green	06d2d1e156	[ARM] Protect against odd sized vectors in isVTRNMask and friends (#153413 ) Fixes the issue reported on #153138, where odd-sized vectors would cause the checks to iterate off the end of the mask.	2025-08-13 20:57:46 +01:00
Philip Reames	4d629f9744	[MIR] Remove std::variant from multiple save/restore point handling [nfc] (#153226 ) In review of bbde6b, I had originally proposed that we support the legacy text format. As review evolved, it became clear this had been a bad idea (too much complexity), but in order to let that patch finally move forward, I approved the change with the variant. This change undoes the variant, and updates all the tests to just use the array form.	2025-08-12 11:23:05 -07:00
Elizaveta Noskova	bbde6be841	[llvm] Support multiple save/restore points in mir (#119357 ) Currently mir supports only one save and one restore point specification: ``` savePoint: '%bb.1' restorePoint: '%bb.2' ``` This patch makes it possible to have multiple save and multiple restore points in mir: ``` savePoints: - point: '%bb.1' restorePoints: - point: '%bb.2' ``` Shrink-Wrap points split Part 3. RFC: https://discourse.llvm.org/t/shrink-wrap-save-restore-points-splitting/83581 Part 1: https://github.com/llvm/llvm-project/pull/117862 Part 2: https://github.com/llvm/llvm-project/pull/119355 Part 4: https://github.com/llvm/llvm-project/pull/119358 Part 5: https://github.com/llvm/llvm-project/pull/119359	2025-08-12 16:34:29 +03:00
Trevor Gross	00c4be3c9e	[Test] Add and update tests for `lrint`/`llrint` (NFC) (#152662 ) Many backends are missing either all tests for lrint, or specifically those for f16, which currently crashes for `softPromoteHalf` targets. For a number of popular backends, do the following: * Ensure f16, f32, f64, and f128 are all covered * Ensure both a 32- and 64-bit target are tested, if relevant * Add `nounwind` to clean up CFI output * Add a test covering the above if one did not exist * Always specify the integer type in intrinsic calls There are quite a few FIXMEs here, especially for `f16`, but much of this will be resolved in the near future.	2025-08-12 09:56:51 +09:00

5164 Commits