llvm-project

Author	SHA1	Message	Date
Ellis Hoag	819f020b28	Use F.hasOptSize() instead of checking optsize directly (#147348 )	2025-07-28 08:38:52 -07:00
Kazu Hirata	094a7087b8	[Target] Use range-based for loops (NFC) (#146198 )	2025-06-27 22:07:58 -07:00
Kazu Hirata	228f66807d	[llvm] Remove unused includes (NFC) (#142733 ) These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.	2025-06-04 12:30:52 -07:00
David Green	af6910194c	[ARM] Remove unused enable-arm-3-addr-conv (#141850 ) This code is not enabled by default and has no tests, having been added back in 10043e215bcfd. It can be safely removed to help keep things simpler, not needing to maintain code that is never used.	2025-05-29 18:31:25 +01:00
Iris Shi	bdf03fcff3	Revert "[llvm][NFC] Use `llvm::sort()`" (#140668 )	2025-05-20 11:27:03 +08:00
Iris Shi	061a7699f3	[llvm][NFC] Use `llvm::sort()` (#140335 )	2025-05-17 14:49:46 +08:00
Sergei Barannikov	2afef58e40	[ARM] Use helper class for emitting CFI instructions into MIR (#135994 ) Similar to #135845. PR: https://github.com/llvm/llvm-project/pull/135994	2025-04-17 00:03:34 +03:00
Kazu Hirata	ad1ba15ea8	[Target] Use llvm::append_range (NFC) (#133606 )	2025-03-29 18:47:47 -07:00
Craig Topper	571b787b83	[CodeGen] Change copyPhysReg interface to use Register instead of MCRegister. (#128473 ) NVPTX, SPIRV, and WebAssembly pass virtual registers to this function since they don't perform register allocation. We need to use Register to avoid a virtual register being converted to MCRegister by the caller.	2025-02-24 09:55:34 -08:00
Christopher Di Bella	309e3ca081	Revert "[CodeGen] Remove static member function Register::isPhysicalRegister. NFC" This reverts commit 5fadb3d680909ab30b37eb559f80046b5a17045e.	2025-02-20 22:06:21 +00:00
Craig Topper	5fadb3d680	[CodeGen] Remove static member function Register::isPhysicalRegister. NFC Prefer the nonstatic member by converting unsigned to Register instead.	2025-02-20 10:49:53 -08:00
Hua Tian	a9d2834508	[llvm][CodeGen] Fix the issue caused by live interval checking in window scheduler (#123184 ) At some corner cases, the cloned MI still retains an old slot index, which leads to the compiler crashing. This patch update the slot index map before delete the recycled MI. https://github.com/llvm/llvm-project/issues/123165	2025-01-23 09:39:03 +08:00
Venkata Ramanaiah Nalamothu	f7d8336a2f	[llvm] Pass MachineInstr flags to storeRegToStackSlot/loadRegFromStackSlot (NFC) (#120622 ) This patch is in preparation to enable setting the MachineInstr::MIFlag flags, i.e. FrameSetup/FrameDestroy, on callee saved register spill/reload instructions in prologue/epilogue. This eventually helps in setting the prologue_end and epilogue_begin markers more accurately. The DWARF Spec in "6.4 Call Frame Information" says: The code that allocates space on the call frame stack and performs the save operation is called the subroutine’s prologue, and the code that performs the restore operation and deallocates the frame is called its epilogue. which means the callee saved register spills and reloads are part of prologue (a.k.a frame setup) and epilogue (a.k.a frame destruction), respectively. And, IIUC, LLVM backend uses FrameSetup/FrameDestroy flags to identify instructions that are part of call frame setup and destruction. In the trunk, while most targets consistently set FrameSetup/FrameDestroy on save/restore call frame information (CFI) instructions of callee saved registers, they do not consistently set those flags on the actual callee saved register spill/reload instructions. I believe this patch provides a clean mechanism to set FrameSetup/FrameDestroy flags on the actual callee saved register spill/reload instructions as needed. And, by having default argument of MachineInstr::NoFlags for Flags, this patch is a NFC. With this patch, the targets have to just pass FrameSetup/FrameDestroy flag to the storeRegToStackSlot/loadRegFromStackSlot calls from the target derived spillCalleeSavedRegisters and restoreCalleeSavedRegisters to set those flags on callee saved register spill/reload instructions. Also, this patch makes it very easy to set the source line information on callee saved register spill/reload instructions which is needed by the DwarfDebug.cpp implementation to set prologue_end and epilogue_begin markers more accurately. As per DwarfDebug.cpp implementation: prologue_end is the first known non-DBG_VALUE and non-FrameSetup location that marks the beginning of the function body epilogue_begin is the first FrameDestroy location that has been seen in the epilogue basic block With this patch, the targets have to just do the following to set the source line information on callee saved register spill/reload instructions, without hampering the LLVM's efforts to avoid adding source line information on the artificial code generated by the compiler. <Foo>InstrInfo::storeRegToStackSlot() { ... DebugLoc DL = Flags & MachineInstr::FrameSetup ? DebugLoc() : MBB.findDebugLoc(I); ... } <Foo>InstrInfo::loadRegFromStackSlot() { ... DebugLoc DL = Flags & MachineInstr::FrameDestroy ? MBB.findDebugLoc(I) : DebugLoc(); ... } While I understand this patch would break out-of-tree backend builds, I think it is in the right direction. One immediate use case that can benefit from this patch is fixing #120553 becomes simpler.	2025-01-22 13:36:39 +05:30
Craig Topper	9d9c5619a5	[ARM] Use MCRegister instead of unsigned. NFC Primarily around uses of getSubReg/getSuperReg.	2025-01-20 19:36:44 -08:00
Pengcheng Wang	f421a7a6ee	[ARM] Use `RegisterClassInfo::getRegPressureSetLimit` (#120377 ) `RegisterClassInfo::getRegPressureSetLimit` is a wrapper of `TargetRegisterInfo::getRegPressureSetLimit` with some logics to adjust the limit by removing reserved registers. It seems that we shouldn't use `TargetRegisterInfo::getRegPressureSetLimit` directly, just like the comment "This limit must be adjusted dynamically for reserved registers" said. Separate from https://github.com/llvm/llvm-project/pull/118787	2024-12-20 11:43:12 +08:00
Matt Arsenault	5fb8d70e5f	ARM: Handle vldrh and vstrh in stack access hooks (#120527 )	2024-12-19 17:55:19 +07:00
Kazu Hirata	9571cc2b28	[ARM] Remove unused includes (NFC) (#115995 ) Identified with misc-include-cleaner.	2024-11-12 23:15:21 -08:00
Oliver Stannard	9f02950a15	[ARM] Allow spilling FPSCR for MVE adc/sbc intrinsics (#115174 ) The MVE VADC and VSBC instructions read and write a carry bit in FPSCR, which is exposed through the intrinsics. This makes it possible to write code which has the FPSCR live across a function call, or which uses the same value twice, so it needs to be possible to spill and reload it. There is a missed optimisation in one of the test cases, where we reload the FPSCR from the stack despite it still being live, I've not found a simple way to prevent the register allocator from doing this.	2024-11-07 11:23:49 +00:00
Benson Chu	0b32769444	[ARM] Apply sign-return-address attribute to outlined function This make checking for whether PAC is necessary simpler when building the outlined frame.	2024-10-23 08:50:56 -07:00
Kazu Hirata	9c64b5e759	[ARM] Simplify code with std::map::operator[] (NFC) (#112159 )	2024-10-14 06:56:39 -07:00
Kyungwoo Lee	93b8d07a75	[MachineOutliner][NFC] Refactor (#105398 ) This patch prepares the NFC groundwork for global outlining using CGData, which will follow https://github.com/llvm/llvm-project/pull/90074. - The `MinRepeats` parameter is now explicitly passed to the `getOutliningCandidateInfo` function, rather than relying on a default value of 2. For local outlining, the minimum number of repetitions is typically 2, but for the global outlining (mentioned above), we will optimistically create a single `Candidate` for each `OutlinedFunction` if stable hashes match a specific code sequence. This parameter is adjusted accordingly in global outlining scenarios. - I have also implemented `unique_ptr` for `OutlinedFunction` to ensure safe and efficient memory management within `FunctionList`, avoiding unnecessary implicit copies. This depends on https://github.com/llvm/llvm-project/pull/101461. This is a patch for https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753.	2024-08-27 14:38:36 -07:00
Piyou Chen	b01c006f73	[TII][RISCV] Add renamable bit to copyPhysReg (#91179 ) The renamable flag is useful during MachineCopyPropagation but renamable flag will be dropped after lowerCopy in some case. This patch introduces extra arguments to pass the renamable flag to copyPhysReg.	2024-08-27 10:08:43 +08:00
Kazu Hirata	dca820951c	[llvm] Use llvm::any_of (NFC) (#104443 )	2024-08-15 17:59:10 -07:00
Pengcheng Wang	ed4e75d5e5	[CodeGen] Remove AA parameter of isSafeToMove (#100691 ) This `AA` parameter is not used and for most uses they just pass a nullptr. The use of `AA` was removed since 8d0383e.	2024-07-26 15:47:47 +08:00
Matt Arsenault	2ce865d490	ARM: Avoid using MachineFunction::getMMI	2024-07-24 15:21:14 +04:00
Matt Arsenault	3cb5604d2c	MachineOutliner: Use PM to query MachineModuleInfo (#99688 ) Avoid getting this from the MachineFunction	2024-07-24 13:22:56 +04:00
Nikita Popov	4169338e75	[IR] Don't include Module.h in Analysis.h (NFC) (#97023 ) Replace it with a forward declaration instead. Analysis.h is pulled in by all passes, but not all passes need to access the module.	2024-06-28 14:30:47 +02:00
Kazu Hirata	fef144cebb	Revert "[llvm] Use llvm::sort (NFC) (#96434 )" This reverts commit 05d167fc201b4f2e96108be0d682f6800a70c23d. Reverting the patch fixes the following under EXPENSIVE_CHECKS: LLVM :: CodeGen/AMDGPU/sched-group-barrier-pipeline-solver.mir LLVM :: CodeGen/AMDGPU/sched-group-barrier-pre-RA.mir LLVM :: CodeGen/PowerPC/aix-xcoff-used-with-stringpool.ll LLVM :: CodeGen/PowerPC/merge-string-used-by-metadata.mir LLVM :: CodeGen/PowerPC/mergeable-string-pool-large.ll LLVM :: CodeGen/PowerPC/mergeable-string-pool-pass-only.mir LLVM :: CodeGen/PowerPC/mergeable-string-pool.ll	2024-06-25 11:18:40 -07:00
Kazu Hirata	05d167fc20	[llvm] Use llvm::sort (NFC) (#96434 )	2024-06-23 10:38:51 -07:00
Nikita Popov	db08b0999d	[ARM][AArch64] Bail out if CandidatesWithoutStackFixups is empty (#95410 ) The following code assumes that RepeatedSequenceLocs is non-empty. Bail out if there are less than 2 candidates left, as no outlining is possible in that case. The same check is already present in all the other places where elements from RepeatedSequenceLocs may be dropped. This fixes the issue reported at: https://github.com/llvm/llvm-project/pull/93965#issuecomment-2151989716	2024-06-14 09:29:21 +02:00
Nikita Popov	1c9f4d4b6f	[ARM] Avoid reference into modified vector (#93965 ) FirstCand is a reference to RepeatedSequenceLocs[0]. However, that vector is being modified a lot throughout the function, including one place that reassigns the whole vector. I'm not sure whether this can really happen in practice, but it doesn't seem unlikely that this could lead to a use-after-free. Avoid this by directly using RepeatedSequenceLocs[0] at the start of the function (as a lot of other places already do) and only creating FirstCand at the end where no more modifications take place.	2024-06-03 17:10:35 +02:00
Xu Zhang	f6d431f208	[CodeGen] Make the parameter TRI required in some functions. (#85968 ) Fixes #82659 There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI parameters, as shown in issue #82411. Following @RKSimon 's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have also been updated correctly to ensure no additional impact. After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.	2024-04-24 14:24:14 +01:00
David Green	601e102bdb	[CodeGen] Use LocationSize for MMO getSize (#84751 ) This is part of #70452 that changes the type used for the external interface of MMO to LocationSize as opposed to uint64_t. This means the constructors take LocationSize, and convert ~UINT64_C(0) to LocationSize::beforeOrAfter(). The getSize methods return a LocationSize. This allows us to be more precise with unknown sizes, not accidentally treating them as unsigned values, and in the future should allow us to add proper scalable vector support but none of that is included in this patch. It should mostly be an NFC. Global ISel is still expected to use the underlying LLT as it needs, and are not expected to see unknown sizes for generic operations. Most of the changes are hopefully fairly mechanical, adding a lot of getValue() calls and protecting them with hasValue() where needed.	2024-03-17 18:15:56 +00:00
Philip Reames	3ff7caea33	[TTI] Use Register in isLoadFromStackSlot and isStoreToStackSlot [nfc] (#80339 )	2024-02-01 17:52:35 -08:00
Shengchen Kan	550f0eb2ce	[NFC] Rename TargetInstrInfo::FoldImmediate to TargetInstrInfo::foldImmediate and simplify implementation for X86	2024-01-26 20:50:58 +08:00
Anatoly Trosinenko	10bd69a4f7	[MachineOutliner] Refactor iterating over Candidate's instructions (#78972 ) Make Candidate's front() and back() functions return references to MachineInstr and introduce begin() and end() returning iterators, the same way it is usually done in other container-like classes. This makes possible to iterate over the instructions contained in Candidate the same way one can iterate over MachineBasicBlock (note that begin() and end() return bundled iterators, just like MachineBasicBlock does, but no instr_begin() and instr_end() are defined yet).	2024-01-23 17:21:40 +03:00
Alex Bradbury	80aeb62211	[llvm][NFC] Use SDValue::getConstantOperandVal(i) where possible (#76708 ) This helper function shortens examples like `cast<ConstantSDNode>(Node->getOperand(1))->getZExtValue();` to `Node->getConstantOperandVal(1);`. Implemented with: `git grep -l "cast<ConstantSDNode>\(.->getOperand\(.\)\)->getZExtValue\(\)" \| xargs sed -E -i 's/cast<ConstantSDNode>\((.)->getOperand\((.)\)\)->getZExtValue\(\)/\1->getConstantOperandVal(\2)/` and `git grep -l "cast<ConstantSDNode>\(.\.getOperand\(.\)\)->getZExtValue\(\)" \| xargs sed -E -i 's/cast<ConstantSDNode>\((.)\.getOperand\((.)\)\)->getZExtValue\(\)/\1.getConstantOperandVal(\2)/'`. With a couple of simple manual fixes needed. Result then processed by `git clang-format`.	2024-01-02 13:14:28 +00:00
ostannard	4888218d03	[ARM] Do not emit unwind tables when saving LR around outlined call (#69611 ) In some cases, the machine outliner needs to preserve LR across an outlined call by pushing it onto the stack. Previously, this also generated unwind table instructions, which is incorrect because EHABI unwind tables cannot represent different stack frames a different points in the function, so the extra unwind info applied to the entire function. The outliner code already avoided generating CFI instructions, but EHABI unwind data is generated later from the actual instructions, so we need to avoid using the FrameSetup and FrameDestroy flags to prevent unwind data being generated.	2023-12-14 14:46:13 +00:00
Ramkumar Ramachandra	e8dbe945f3	TargetInstrInfo, TargetSchedule: fix non-NFC parts of 9468de4 (#74338 ) Follow up on a post-commit review of 9468de4 (TargetInstrInfo: make getOperandLatency return optional (NFC)) by Bjorn Pettersson to fix a couple of things that are not NFC: - std::optional<T>::operator<= returns true if the first operand is a std::nullopt and second operand is T. Fix a couple of places where we assumed it would return false. - In TargetSchedule, computeInstrCost could take another codepath, returning InstrLatency instead of DefaultDefLatency. Fix one instance not accounting for this behavior.	2023-12-05 08:18:17 +00:00
Simon Pilgrim	83e01ea1a5	Fix MSVC signed/unsigned mismatch warning. NFC.	2023-12-04 11:11:33 +00:00
Ramkumar Ramachandra	d222fa4521	TargetInstrInfo: squelch a signedness warning on MSVC (#74078 ) Follow up on 9468de4 (TargetInstrInfo: make getOperandLatency return optional (NFC)) to squelch a signedness warning on MSVC, reported by Simon Pilgrim.	2023-12-01 16:08:41 +00:00
Ramkumar Ramachandra	9468de48fc	TargetInstrInfo: make getOperandLatency return optional (NFC) (#73769 ) getOperandLatency has the following behavior: it returns -1 as a special value, negative numbers other than -1 on some target-specific overrides, or a valid non-negative latency. This behavior can be surprising, as some callers do arithmetic on these negative values. Change the interface of getOperandLatency to return a std::optional<unsigned> to prevent surprises in callers. While at it, change the interface of getInstrLatency to return unsigned instead of int. This change was inspired by a refactoring in TargetSchedModel::computeOperandLatency.	2023-12-01 11:29:19 +00:00
Alex Bradbury	5b3eb1bc22	[ARM][X86][NFC] Use lambda to avoid duplicate switches in areLoadsFromSameBasePtr (#72376 ) Both the Arm and X86 implementations of areLoadsFromSameBasePtr use a switch over the machine opcode, and repeat the same logic for both SDNode operands. We can avoid the duplicated logic (especially lengthy in the X86 case) by just using a lambda. This could obviously be a candidate for moving out to a separate helper function if there were other users, but I've made the minimal change in this patch.	2023-11-15 12:35:35 +00:00
Fangrui Song	5888dee7d0	[ARM,ELF] Fix access to dso_preemptable __stack_chk_guard with static relocation model (#70014 ) The ELF code from https://reviews.llvm.org/D112811 emits LDRLIT_ga_pcrel when `TM.isPositionIndependent()` but uses a different condition `Subtarget.isGVIndirectSymbol(GV)` (aka dso_preemptable on ELF targets). This would cause incorrect access for dso_preemptable `__stack_chk_guard` with the static relocation model. Regarding whether `__stack_chk_guard` gets the dso_local specifier, https://reviews.llvm.org/D150841 switched to `M.getDirectAccessExternalData()` (implied by "PIC Level") instead of `TM.getRelocationModel() == Reloc::Static`. The result is that when non-zero "PIC Level" is used with static relocation model (e.g. -fPIE/-fPIC LTO compiles with -no-pie linking), `__stack_chk_guard` accesses are incorrect. ``` ldr r0, .LCPI0_0 ldr r0, [r0] ldr r0, [r0] // incorrectly dereferences __stack_chk_guard ... .LCPI0_0: .long __stack_chk_guard ``` To fix this, for dso_preemptable `__stack_chk_guard`, emit a GOT PIC code sequence like for -fpic using `LDRLIT_ga_pcrel`: ``` ldr r0, .LCPI0_0 .LPC0_0: add r0, pc, r0 ldr r0, [r0] ldr r0, [r0] ... LCPI0_0: .Ltmp0: .long __stack_chk_guard(GOT_PREL)-((.LPC0_0+8)-.Ltmp0) ``` Technically, `LDRLIT_ga_abs` with `R_ARM_GOT_ABS` could be used, but `R_ARM_GOT_ABS` does not have GNU or integrated assembler support. (Note, `.LCPI0_0: .long __stack_chk_guard@GOT` produces an `R_ARM_GOT_BREL`, which is not desired). This patch fixes #6499 while not changing behavior for the following configurations: ``` run arm.linux.nopic --target=arm-linux-gnueabi -fno-pic run arm.linux.pie --target=arm-linux-gnueabi -fpie run arm.linux.pic --target=arm-linux-gnueabi -fpic run armv6.darwin.nopic --target=armv6-apple-darwin -fno-pic run armv6.darwin.dynamicnopic --target=armv6-apple-darwin -mdynamic-no-pic run armv6.darwin.pic --target=armv6-apple-darwin -fpic run armv7.darwin.nopic --target=armv7-apple-darwin -mcpu=cortex-a8 -fno-pic run armv7.darwin.dynamicnopic --target=armv7-apple-darwin -mcpu=cortex-a8 -mdynamic-no-pic run armv7.darwin.pic --target=armv7-apple-darwin -mcpu=cortex-a8 -fpic run arm64.darwin.pic --target=arm64-apple-darwin ```	2023-10-31 15:37:26 -07:00
Jie Fu	9d116b4b12	[ARM] Remove unused variable 'MF' in ARMBaseInstrInfo.cpp (NFC) /Users/jiefu/llvm-project/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp:4986:24: error: unused variable 'MF' [-Werror,-Wunused-variable] MachineFunction &MF = *MBB.getParent(); ^ 1 error generated.	2023-08-09 15:47:28 +08:00
Simon Wallis	33b9634394	[ARM] v6-M XO: save CPSR around LoadStackGuard For Thumb-1 Execute-Only, expandLoadStackGuardBase generates a tMOVimm32 pseudo when calculating the stack offset. It does this in a context where the CSPR maybe be live. tMOVimm32 may corrupt CPSR. To fix this, generate save/restore CPSR around the tMOVimm32 using MRS/MSR to/from a scratch register. expandLoadStackGuardBase this runs after register allocation, so the scratch register needs to be a physical register. Use R12 as a scratch register, as is usual when expanding a pseudo. MSR/MRS are some of the few v6-M instructions which operate on a high register. New stack-guard test case added which was generating incorrect code without the save/restore CPSR. Reviewed By: stuij Differential Revision: https://reviews.llvm.org/D156968	2023-08-09 08:40:35 +01:00
Sander de Smalen	bbb95893de	[TII] NFCI: Simplify the interface for isTriviallyReMaterializable Currently `isTriviallyReMaterializable` calls `isReallyTriviallyReMaterializable` and `isReallyTriviallyReMaterializableGeneric`. The two interfaces are confusing, but there are also some real issues with this. The documentation of this function (see below) suggests that `isReallyTriviallyRematerializable` allows the target to override the default behaviour. /// For instructions with opcodes for which the M_REMATERIALIZABLE flag is /// set, this hook lets the target specify whether the instruction is actually /// trivially rematerializable, taking into consideration its operands. It however implements something different. The default behaviour is the analysis done in `isReallyTriviallyReMaterializableGeneric`, which is testing if it is safe to rematerialize the MachineInstr. The result of `isReallyTriviallyReMaterializable` is only considered if `isReallyTriviallyReMaterializableGeneric` returns `false`. That means there is no way to override the default behaviour if `isReallyTriviallyReMaterializableGeneric` returns true (i.e. it is safe to rematerialize, but we'd rather not). By making this a single interface, we can override the interface to do either. Reviewed By: craig.topper, nemanjai Differential Revision: https://reviews.llvm.org/D156520	2023-08-07 13:01:06 +00:00
Ties Stuij	2273741ea2	[ARM] generate armv6m eXecute Only (XO) code [ARM] generate armv6m eXecute Only (XO) code for immediates, globals Previously eXecute Only (XO) support was implemented for targets that support MOVW/MOVT (~armv7+). See: https://reviews.llvm.org/D27449 XO prevents the compiler from generating data accesses to code sections. This patch implements XO codegen for armv6-M, which does not support MOVW/MOVT, and must resort to the following general pattern to avoid loads: movs r3, :upper8_15:foo lsls r3, #8 adds r3, :upper0_7:foo lsls r3, #8 adds r3, :lower8_15:foo lsls r3, #8 adds r3, :lower0_7:foo ldr r3, [r3] This is equivalent to the code pattern generated by GCC. The above relocations are new to LLVM and have been implemented in a parent patch: https://reviews.llvm.org/D149443. This patch limits itself to implementing codegen for this pattern and enabling XO for armv6-M in the backend. Separate patches will follow for: - switch tables - replacing specific loads from constant islands which are spread out over the ARM backend codebase. Amongst others: FastISel, call lowering, stack frames. Reviewed By: john.brawn Differential Revision: https://reviews.llvm.org/D152795	2023-06-23 10:50:47 +01:00
Simon Tatham	10e4228114	[ARM,AArch64] Add a full set of -mtp= options. AArch64 has five system registers intended to be useful as thread pointers: one for each exception level which is RW at that level and inaccessible to lower ones, and the special TPIDRRO_EL0 which is readable but not writable at EL0. AArch32 has three, corresponding to the AArch64 ones that aren't specific to EL2 or EL3. Currently clang supports only a subset of these registers, and not even a consistent subset between AArch64 and AArch32: - For AArch64, clang permits you to choose between the four TPIDR_ELn thread registers, but not the fifth one, TPIDRRO_EL0. - In AArch32, on the other hand, the //only// thread register you can choose (apart from 'none, use a function call') is TPIDRURO, which corresponds to (the bottom 32 bits of) AArch64's TPIDRRO_EL0. So there is no thread register that you can currently use in both targets! For custom and bare-metal purposes, users might very reasonably want to use any of these thread registers. There's no reason they shouldn't all be supported as options, even if the default choices follow existing practice on typical operating systems. This commit extends the range of values acceptable to the `-mtp=` clang option, so that you can specify any of these registers by (the lower-case version of) their official names in the ArmARM: - For AArch64: tpidr_el0, tpidrro_el0, tpidr_el1, tpidr_el2, tpidr_el3 - For AArch32: tpidrurw, tpidruro, tpidrprw All existing values of the option are still supported and behave the same as before. Defaults are also unchanged. No command line that worked already should change behaviour as a result of this. The new values for the `-mtp=` option have been agreed with Arm's gcc developers (although I don't know whether they plan to implement them in the near future). Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D152433	2023-06-15 09:27:41 +01:00
David Green	8998ff53c9	Revert "[ARM] Allow D-reg copies to use VMOVD with fpregs64" This reverts commit 0a762ec1b09d96734a3462f8792a5574d089b24d. Some CPUs enable fp64 by default (such as cortex-m7). When specifying a single-precision fpu with them like -mfpu=fpv5-sp-d16, the fp64 feature will be disabled, but fpreg64 will not. We need to disable them both correctly under clang in order for the backend to be able to use the reliably. In the meantime this reverts 0a762ec1b09d96734 until that issue is fixed.	2023-06-01 17:49:25 +01:00

1 2 3 4 5 ...

792 Commits