llvm-project

Author	SHA1	Message	Date
AtariDreams	c5d000b1a8	[Thumb] Resolve FIXME: Use 'mov hi, $src; mov $dst, hi' (#81908 ) Consider the following: ldr r0, [r4] ldr r7, [r0, #4] cmp r7, r3 bhi .LBB0_6 cmp r0, r2 push {r0} pop {r4} bne .LBB0_3 movs r0, r6 pop {r4, r5, r6, r7} pop {r1} bx r1 Here is a snippet of the generated THUMB1 code of the K&R malloc function that clang currently compiles to. push {r0} ends up being popped to pop {r4}. movs r4, r0 would destroy the flags set by cmp right above. The compiler has no alternative in this case, except one: the only alternative is to transfer through a high register. However, it seems like LLVM does not consider that this is a valid approach, even though it is a free clobbering a high register. This patch addresses the FIXME so the compiler can do that when it can in r10 or r11, or r12.	2024-04-05 10:18:22 +01:00
Victor Campos	74373c1bef	Revert "[ARM][Thumb2] Mark BTI-clearing instructions as scheduling region boundaries" (#87699 ) Reverts llvm/llvm-project#79173 The testcase fails in non-asserts builds.	2024-04-04 21:29:21 +01:00
Victor Campos	5ad320abe3	[ARM][Thumb2] Mark BTI-clearing instructions as scheduling region boundaries (#79173 ) Following https://github.com/llvm/llvm-project/pull/68313 this patch extends the idea to M-profile PACBTI. The Machine Scheduler can reorder instructions within a scheduling region depending on the scheduling policy set. If a BTI-clearing instruction happens to partake in one such region, it might be moved around, therefore ending up where it shouldn't. The solution is to mark all BTI-clearing instructions as scheduling region boundaries. This essentially means that they must not be part of any scheduling region, and as consequence never get moved: - PAC - PACBTI - BTI - SG Note that PAC isn't BTI-clearing, but it's replaced by PACBTI late in the compilation pipeline. As far as I know, currently it isn't possible to organically obtain code that's susceptible to the bug: - Instructions that write to SP are region boundaries. PAC seems to always be followed by the pushing of r12 to the stack, so essentially PAC is always by itself in a scheduling region. - CALL_BTI is expanded into a machine instruction bundle. Bundles are unpacked only after the last machine scheduler run. Thus setjmp and BTI can be separated only if someone deliberately run the scheduler once more. - The BTI insertion pass is run late in the pipeline, only after the last machine scheduling has run. So once again it can be reordered only if someone deliberately runs the scheduler again. Nevertheless, one can reasonably argue that we should prevent the bug in spite of the compiler not being able to produce the required conditions for it. If things change, the compiler will be robust against this issue. The tests written for this are contrived: bogus MIR instructions have been added adjacent to the BTI-clearing instructions in order to have them inside non-trivial scheduling regions.	2024-04-04 12:44:32 +01:00
Jonas Paulsson	7564566779	Reapply "Move assertion for AdjustsStack from PEI to MachineVerifier (#85698 )" - The check is now actually done in both PEI and the MachineVerifier. - More .mir tests trivially updated with "adjustsStack: true" as needed.	2024-03-21 20:24:57 -04:00
David Green	686f4599cf	[ARM] Regenerate some check lines. NFC	2024-03-21 13:45:44 +00:00
Jonas Paulsson	9ebd329ad8	Revert "Move assertion for AdjustsStack from PEI to MachineVerifier. (#85698 )" This reverts commit 05bde30585710a51592eee0a6cf6df8184d09c92. Reverting due to verifier complaints with expensive checks on build-bot.	2024-03-20 11:48:30 -04:00
Jonas Paulsson	05bde30585	Move assertion for AdjustsStack from PEI to MachineVerifier. (#85698 ) Have the verifier report a missing AdjustsStack flag rather than waiting until PEI asserts.	2024-03-20 10:29:12 -04:00
Sivan Shani	5e688f0dbd	[llvm][arm] add T1 and T2 assembly options for vlldm and vlstm Re-land 634b0243b8f7acc85af4f16b70e91d86ded4dc83. T1 allow for an optional registers list, the register list must be {d0-d15}. T2 define a mandatory register list, the register list must be {d0-d31}. The requirements for T1/T2 are as follows: T1 T2 Require: v8-M.Main, v8.1-M.Main, secure state secure state 16 D Regs valid valid 32 D Regs UNDEFINED valid No D Regs NOP NOP	2024-03-11 14:27:28 +00:00
Jay Foad	fd3eaf76ba	[GISel] Enforce G_PTR_ADD RHS type matching index size for addr space (#84352 )	2024-03-09 09:07:22 +00:00
David Green	5f058398ab	[ARM] Mark AESD and AESE instructions as commutative. Similar to #83390, this marks AESD and AESE as commutative, as the logic of the instructions starts as a XOR between the two operands.	2024-03-03 16:56:21 +00:00
Fangrui Song	d89b771ef5	[ARM] Add alias tests for ROPI/RWPI https://reviews.llvm.org/D23195 does not test aliases.	2024-03-02 10:33:57 -08:00
Tomas Matheson	03420f570e	Revert "[llvm][arm] add T1 and T2 assembly options for vlldm and vlstm (#83116 )" This reverts commit 634b0243b8f7acc85af4f16b70e91d86ded4dc83. Failing EXPENSIVE_CHECKS builds with "undefined physical register".	2024-02-29 09:48:29 +00:00
SivanShani-Arm	634b0243b8	[llvm][arm] add T1 and T2 assembly options for vlldm and vlstm (#83116 ) T1 allows for an optional registers list, the register list must be {d0-d15}. T2 defines a mandatory register list, the register list must be {d0-d31}. The requirements for T1/T2 are as follows: T1 T2 Require: v8-M.Main, v8.1-M.Main, secure state secure state 16 D Regs valid valid 32 D Regs UNDEFINED valid No D Regs NOP NOP	2024-02-28 17:02:51 +00:00
ostannard	749384c08e	[ARM] Update IsRestored for LR based on all returns (#82745 ) PR #75527 fixed ARMFrameLowering to set the IsRestored flag for LR based on all of the return instructions in the function, not just one. However, there is also code in ARMLoadStoreOptimizer which changes return instructions, but it set IsRestored based on the one instruction it changed, not the whole function. The fix is to factor out the code added in #75527, and also call it from ARMLoadStoreOptimizer if it made a change to return instructions. Fixes #80287.	2024-02-26 12:23:25 +00:00
Oliver Stannard	8779cf68e8	Pre-commit test showing bug #80287 This test shows the bug where LR is used as a general-purpose register on a code path where it is not spilled to the stack.	2024-02-26 12:21:13 +00:00
Jack Styles	28233408a2	[CodeGen] [ARM] Make RISC-V Init Undef Pass Target Independent and add support for the ARM Architecture. (#77770 ) When using Greedy Register Allocation, there are times where early-clobber values are ignored, and assigned the same register. This is illeagal behaviour for these intructions. To get around this, using Pseudo instructions for early-clobber registers gives them a definition and allows Greedy to assign them to a different register. This then meets the ARM Architecture Reference Manual and matches the defined behaviour. This patch takes the existing RISC-V patch and makes it target independent, then adds support for the ARM Architecture. Doing this will ensure early-clobber restraints are followed when using the ARM Architecture. Making the pass target independent will also open up possibility that support other architectures can be added in the future.	2024-02-26 12:12:31 +00:00
Serge Pavlov	213b0ae497	[GlobalISel][ARM] legalize G_FPENV_RESET for soft-float mode (#81456 )	2024-02-12 17:46:59 +07:00
Serge Pavlov	b0785cd1cb	[GlobalISel][ARM] Support missing case for G_CONSTANT (#80555 ) Global Instruction Selector could not select the code: %0:gprb(s32) = G_CONSTANT i32 -1 In DAG selector the similar code is selected to the instruction MVNi using custom operand `mod_imm_not`. Changing its definition from `PatLeaf` to `ImmLeaf` and providing counterpart for `imm_not_XFORM` make the relevant rule available for GlobalISel too.	2024-02-07 12:53:20 +07:00
Nikita Popov	b31fffbc7f	[ARM] Convert tests to opaque pointers (NFC)	2024-02-05 13:56:59 +01:00
Craig Topper	6590d0fed5	[DAGCombiner][ARM] Teach reduceLoadWidth to handle (and (srl (load), C, ShiftedMask)) (#80342 ) If we have a shifted mask, we may be able to reduce the load width to the width of the non-zero part of the mask and use an offset to the base address to remove the srl. The offset is given by C+trailingzeros(ShiftedMask). Then we add a final shl to restore the trailing zero bits. I've use the ARM test because that's where the existing (and (srl (load))) tests were. The X86 test was modified to keep the H register.	2024-02-04 16:05:51 -08:00
Serge Pavlov	b4eb7a10c0	[GlobalISel][ARM] Legalze set_fpenv and get_fpenv (#79852 ) Implement handling of get/set floating point environment for ARM in Global Instruction Selector. Lowering of these intrinsics to operations on FPSCR was previously inplemented in DAG selector, in GlobalISel it is reused.	2024-02-04 12:30:33 +07:00
Harald van Dijk	52864d9c7b	[ARM] Switch to soft promoting half types. (#80440 ) The traditional promotion is known to generate wrong code. Fixes #73805.	2024-02-02 21:40:40 +00:00
Quentin Dian	112fba974c	[MIRPrinter] Don't print line break when there is no instructions (NFC) (#80147 ) Per #80143, we can remove the extra line break when there is no instruction.	2024-02-01 22:10:52 +08:00
Simon Pilgrim	ea2984287d	[ARM] Add ctpop codegen tests	2024-02-01 11:42:18 +00:00
Quentin Dian	b7738e275d	[MIRPrinter] Don't print space when there is no successor (#80143 ) Extra space causes the checks generated by update_mir_test_checks to be unavailable. ``` # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 4 # RUN: llc -mtriple=x86_64-- -o - %s -run-pass=none -verify-machineinstrs -simplify-mir \| FileCheck %s --- name: foo body: \| ; CHECK-LABEL: name: foo ; CHECK: bb.0: ; CHECK-NEXT: successors: ; CHECK-NEXT: {{ $}} ; CHECK-NEXT: {{ $}} ; CHECK-NEXT: bb.1: ; CHECK-NEXT: RET 0, $eax bb.0: successors: bb.1: RET 0, $eax ... ``` The failure log is as follows: ``` llvm/test/CodeGen/MIR/X86/unreachable-block-print.mir:9:16: error: CHECK-NEXT: is on the same line as previous match ; CHECK-NEXT: {{ $}} ^ <stdin>:21:13: note: 'next' match was here successors: ^ <stdin>:21:13: note: previous match ended here successors: ```	2024-01-31 22:35:41 +08:00
Alfie Richards	de75e5079a	[ARM][NEON] Add constraint to vld2 Odd/Even Pseudo instructions. (#79287 ) This ensures the odd/even pseudo instructions are allocated to the same register range. This fixes #71763	2024-01-31 14:08:02 +00:00
Fangrui Song	4cb90ca8f8	[Thumb,ELF] Fix access to dso_preemptable __stack_chk_guard with static relocation model (#78950 ) PR #70014 fixes A32 to use GOT for dso_preemptable `__stack_chk_guard` with static relocation model (e.g. -fPIE/-fPIC LTO compiles with -no-pie linking). This patch fixes such `__stack_chk_guard` access for Thumb1 and Thumb2. Note: `t2LDRLIT_ga_pcrel` is only for ELF. mingw needs `.refptr.__stack_chk_guard` (https://reviews.llvm.org/D92738). Fix #64999	2024-01-22 13:16:31 -08:00
Fangrui Song	3b943c0203	[Thumb,test] Improve __stack_chk_guard test	2024-01-22 00:20:40 -08:00
Simon Pilgrim	d92ce344bf	Revert faecc736e2ac3cd8c77 #74443 [DAG] isSplatValue - node is a splat if all demanded elts have the same whole constant value (#74443 ) Relying on ComputeKnownBits to find a splat is causing miscompilations where a shift of zero is being assumed to give zero, but further simplification leads to a shift of zero by undef, resulting in an unexpected undef value. Fixes #78109	2024-01-17 15:59:33 +00:00
Alfie Richards	60c775769b	[ARM] Add missing earlyclobber to sqrshr and uqrshl instructions. (#77782 ) This avoids possible undefined behavior using the same register for Rm and Rda. Additionally adds a check in MC to produce an error upon parsing this case.	2024-01-16 10:30:16 +00:00
Nick Anderson	f1ec0d12bb	Port CodeGenPrepare to new pass manager (and BasicBlockSectionsProfil… (#77182 ) Port CodeGenPrepare to new pass manager and dependency BasicBlockSectionsProfileReader Fixes: #75380 Co-authored-by: Krishna-13-cyber <84722531+Krishna-13-cyber@users.noreply.github.com>	2024-01-09 13:32:59 +07:00
Orlando Cazalet-Hyams	10b03e6662	[RemoveDIs] Handle DPValues in FastISel (#76952 ) The change is fairly mechanical: 1. Factor code from `FastISel::selectIntrinsicCall`, which converts debug intrinsics into debug instructions, into functions (NFC). 2. Call those functions for DPValues attached to instructions too. The test updates look the same as other RemoveDIs changes: re-run the tests with `--try-experimental-debuginfo-iterators`, which checks the output is identical using the new debug info format (if it has been enabled in the cmake configuration). Depends on #76941 (otherwise some modified tests spuriously fail).	2024-01-05 15:11:47 +00:00
Simon Pilgrim	7648371c25	Revert 4d7c5ad58467502fcbc433591edff40d8a4d697d "[NewPM] Update CodeGenPreparePass reference in CodeGenPassBuilder (#77054 )" Revert e0c554ad87d18dcbfcb9b6485d0da800ae1338d1 "Port CodeGenPrepare to new pass manager (and BasicBlockSectionsProfil… (#75380)" Revert #75380 and #77054 as they were breaking EXPENSIVE_CHECKS buildbots: https://lab.llvm.org/buildbot/#/builders/104	2024-01-05 12:28:10 +00:00
Nick Anderson	e0c554ad87	Port CodeGenPrepare to new pass manager (and BasicBlockSectionsProfil… (#75380 ) Port CodeGenPrepare to new pass manager and dependency BasicBlockSectionsProfileReader Fixes: #64560 Co-authored-by: Krishna-13-cyber <84722531+Krishna-13-cyber@users.noreply.github.com>	2024-01-05 13:47:56 +07:00
Serge Pavlov	2f81788067	[ARM][FPEnv] Lowering of fpmode intrinsics (#74054 ) LLVM intrinsics `get_fpmode`, `set_fpmode` and `reset_fpmode` operate control modes, the bits of FP environment that affect FP operations. On ARM these bits are in FPSCR together with the status bits. The implementation of these intrinsics produces code close to that of functions `fegetmode` and `fesetmode` from GLIBC. Pull request: https://github.com/llvm/llvm-project/pull/74054	2023-12-18 18:57:36 +07:00
ostannard	4888218d03	[ARM] Do not emit unwind tables when saving LR around outlined call (#69611 ) In some cases, the machine outliner needs to preserve LR across an outlined call by pushing it onto the stack. Previously, this also generated unwind table instructions, which is incorrect because EHABI unwind tables cannot represent different stack frames a different points in the function, so the extra unwind info applied to the entire function. The outliner code already avoided generating CFI instructions, but EHABI unwind data is generated later from the actual instructions, so we need to avoid using the FrameSetup and FrameDestroy flags to prevent unwind data being generated.	2023-12-14 14:46:13 +00:00
Shih-Po Hung	b97c5a9554	[VPlan] Add a test for testing unused interleave recipes (#75026 ) - Precommit of tests from #71360. - Replace `undef` pointer operands and add stores to avoid the loads being optmized away.	2023-12-14 21:16:11 +08:00
Simon Pilgrim	b7fc78255e	Revert rG2047ab00eaf0a17e71ce5e8a5b27a8c90f034c3d "[VPlan] Add a test for testing unused interleave recipes (#75026 )" vplan-unused-interleave-group.ll is causing buildbot failures	2023-12-14 10:25:41 +00:00
Shih-Po Hung	2047ab00ea	[VPlan] Add a test for testing unused interleave recipes (#75026 ) - Precommit of tests from #71360. - Replace `undef` pointer operands and add stores to avoid the loads being optmized away.	2023-12-14 17:36:58 +08:00
paperchalice	b0cc42ae0f	[CodeGen] Port `SjLjEHPrepare` to new pass manager (#75023 ) `doInitialization` in `SjLjEHPrepare` is trivial. This is the last pass suffix with `ehprepare`.	2023-12-12 16:07:26 +08:00
Simon Pilgrim	faecc736e2	[DAG] isSplatValue - node is a splat if all demanded elts have the same whole constant value (#74443 )	2023-12-08 10:53:51 +00:00
Simon Pilgrim	22df0886a1	[DAG] Don't split f64 constant stores if the fp imm is legal (#74622 ) If the target can generate a specific fp immediate constant, then don't split the store into 2 x i32 stores Another cleanup step for #74304	2023-12-07 10:33:03 +00:00
Simon Pilgrim	609d980b3f	[ARM] Regenerate aapcs-hfa-code.ll	2023-12-06 12:09:30 +00:00
Nikita Popov	eecb99c5f6	[Tests] Add disjoint flag to some tests (NFC) These tests rely on SCEV looking recognizing an "or" with no common bits as an "add". Add the disjoint flag to relevant or instructions in preparation for switching SCEV to use the flag instead of the ValueTracking query. The IR with disjoint flag matches what InstCombine would produce.	2023-12-05 14:09:36 +01:00
simpal01	74cdb8e6f8	[llvm][ARM] Emit MVE .arch_extension after .fpu directive if it does not include MVE features (#71545 ) The floating-point and MVE features together specify the MVE functionality that is supported on the Cortex-M85 processor. But the FPU extension for the underlying architecture(armv8.1-m.main) is FPV5 which does not include MVE-F. So Compiler's -S output and `-save-temps=obj` loses MVE feature which leads to assembler error. What happening here is .fpu directive overrides any previously set features by .cpu directive. Since the the corresponding .fpu generated (.fpu fpv5-d16) does not include MVE-F, it overrides those features even though it is supported and set by the .cpu directive. Looks like .fpu is supposed to do this. In this case, there should be an .arch_extension directive re-enabling the relevant extensions after .fpu if the goal is to keep these extensions enabled. GCC also does the same. So this patch enables the MVE features by emitting the below arch extension: .fpu fpv5-d16 .arch_extension mve.fp --------- Co-authored-by: Simi Pallipurath <simi.pallipurath.com>	2023-11-22 09:16:58 +00:00
Serge Pavlov	a2e1de1934	[ARM][FPEnv] Lowering of fpenv intrinsics The change implements lowering of `get_fpenv`, `set_fpenv` and `reset_fpenv`. Differential Revision: https://reviews.llvm.org/D81843	2023-11-20 15:08:25 +07:00
Tavian Barnes	75cf672b12	[SDAG] Simplify is-power-of-2 codegen (#72275 ) When x is not known to be nonzero, ctpop(x) == 1 is expanded to x != 0 && (x & (x - 1)) == 0 resulting in codegen like leal -1(%rdi), %eax testl %eax, %edi sete %cl testl %edi, %edi setne %al andb %cl, %al But another expression that works is (x ^ (x - 1)) > x - 1 which has nicer codegen: leal -1(%rdi), %eax xorl %eax, %edi cmpl %eax, %edi seta %al	2023-11-15 22:26:34 +09:00
Serge Pavlov	5b0f703918	Revert "[ARM][FPEnv] Lowering of fpenv intrinsics" This reverts commit d62f040418bd167d1ddd2b79c640a90c0c2ea353. Some cuda buildbots start failing.	2023-11-10 16:24:51 +07:00
Serge Pavlov	d62f040418	[ARM][FPEnv] Lowering of fpenv intrinsics The change implements lowering of `get_fpenv`, `set_fpenv` and `reset_fpenv`. Differential Revision: https://reviews.llvm.org/D81843	2023-11-10 16:06:33 +07:00
Nikita Popov	e4a4122eb6	[IR] Remove zext and sext constant expressions (#71040 ) Remove support for zext and sext constant expressions. All places creating them have been removed beforehand, so this just removes the APIs and uses of these constant expressions in tests. There is some additional cleanup that can be done on top of this, e.g. we can remove the ZExtInst vs ZExtOperator footgun. This is part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179.	2023-11-03 10:46:07 +01:00

1 2 3 4 5 ...

4867 Commits