llvm-project

Author	SHA1	Message	Date
Florian Hahn	b1a5ee1feb	[ARM] Check all terms in emitPopInst when clearing Restored for LR. (#75527 ) emitPopInst checks a single function exit MBB. If other paths also exit the function and any of their terminators uses LR implicitly, it is not safe to clear the Restored bit. Check all terminators for the function before clearing Restored. This fixes a mis-compile in outlined-fn-may-clobber-lr-in-caller.ll where the machine-outliner previously introduced BLs that clobbered LR which in turn is used by the tail call return. Alternative to #73553	2023-12-20 16:56:15 +01:00
Serge Pavlov	2f81788067	[ARM][FPEnv] Lowering of fpmode intrinsics (#74054 ) LLVM intrinsics `get_fpmode`, `set_fpmode` and `reset_fpmode` operate on control modes, the bits of the FP environment that affect FP operations. On ARM these bits are in FPSCR together with the status bits. The implementation of these intrinsics produces code close to that of functions `fegetmode` and `fesetmode` from GLIBC. Pull request: https://github.com/llvm/llvm-project/pull/74054	2023-12-18 18:57:36 +07:00
ostannard	4888218d03	[ARM] Do not emit unwind tables when saving LR around outlined call (#69611 ) In some cases, the machine outliner needs to preserve LR across an outlined call by pushing it onto the stack. Previously, this also generated unwind table instructions, which is incorrect because EHABI unwind tables cannot represent different stack frames at different points in the function, so the extra unwind info applied to the entire function. The outliner code already avoided generating CFI instructions, but EHABI unwind data is generated later from the actual instructions, so we need to avoid using the FrameSetup and FrameDestroy flags to prevent unwind data being generated.	2023-12-14 14:46:13 +00:00
Kazu Hirata	586ecdf205	[llvm] Use StringRef::{starts,ends}_with (NFC) (#74956 ) This patch replaces uses of StringRef::{starts,ends}with with StringRef::{starts,ends}_with for consistency with std::{string,string_view}::{starts,ends}_with in C++20. I'm planning to deprecate and eventually remove StringRef::{starts,ends}with.	2023-12-11 21:01:36 -08:00
Jonathan Thackray	f576cbe44e	[AArch64] Correctly mark Neoverse N2 as an Armv9.0a core (#75055 ) Neoverse N2 was incorrectly marked as an Armv8.5a core. This has been changed to an Armv9.0a core. However, crypto options are not enabled by default for Armv9 cores, so -mcpu=neoverse-n2+crypto is required to enable crypto for this core. Neoverse N2 Technical Reference Manual: https://developer.arm.com/documentation/102099/0003/	2023-12-11 18:52:25 +00:00
Kazu Hirata	d57a26a714	[ARM] Include bit.h instead of MathExtras.h (NFC)	2023-12-10 11:58:52 -08:00
Kazu Hirata	b85f1f9b18	[Target] Include bitset (NFC) These files are relying on the transitive include of <bitset> from GIMatchTableExecutor.h, which doesn't actually use std::bitset.	2023-12-09 18:34:57 -08:00
Jonathan Thackray	8758e648da	[ARM][AArch32] Add support for AArch32 Cortex-M52 CPU (#74822 ) Cortex-M52 is an Armv8.1 AArch32 CPU. Technical specifications available at: https://developer.arm.com/processors/cortex-m52	2023-12-08 15:04:08 +00:00
Kazu Hirata	286ef12b47	[Target] Remove unnecessary includes (NFC)	2023-12-07 21:03:56 -08:00
Craig Topper	e87f33d9ce	[RISCV][MC] Pass MCSubtargetInfo down to shouldForceRelocation and evaluateTargetFixup. (#73721 ) Instead of using the STI stored in RISCVAsmBackend, try to get it from the MCFragment. This addresses the issue raised here https://discourse.llvm.org/t/possible-problem-related-to-subtarget-usage/75283	2023-12-07 13:17:58 -08:00
Kazu Hirata	124b4ab85a	[Target] Stop including bitset (NFC) Identified with clangd.	2023-12-05 20:48:04 -08:00
Craig Topper	e888e83fb6	[ARM][AArch64] Use SelectionDAG::SplitScalar to simplify some code. (#74411 ) We know we're splitting a type in half to two legal values. Instead of using shift and truncate that need to be legalized, we can use two ISD::EXTRACT_ELEMENTs. Spotted while reviewing #67918 for RISC-V which copied this code.	2023-12-05 07:51:54 -08:00
Ramkumar Ramachandra	e8dbe945f3	TargetInstrInfo, TargetSchedule: fix non-NFC parts of 9468de4 (#74338 ) Follow up on a post-commit review of 9468de4 (TargetInstrInfo: make getOperandLatency return optional (NFC)) by Bjorn Pettersson to fix a couple of things that are not NFC: - std::optional<T>::operator<= returns true if the first operand is a std::nullopt and second operand is T. Fix a couple of places where we assumed it would return false. - In TargetSchedule, computeInstrCost could take another codepath, returning InstrLatency instead of DefaultDefLatency. Fix one instance not accounting for this behavior.	2023-12-05 08:18:17 +00:00
Momchil Velikov	e3a97dffee	[Verifier] Check function attributes related to branch protection (NFC) (#70565 )	2023-12-04 16:16:55 +00:00
Simon Pilgrim	83e01ea1a5	Fix MSVC signed/unsigned mismatch warning. NFC.	2023-12-04 11:11:33 +00:00
Simon Pilgrim	5c672d87ea	Fix MSVC signed/unsigned mismatch warning. NFC.	2023-12-04 10:48:33 +00:00
Kazu Hirata	92c2529ccd	[llvm] Stop including vector (NFC) Identified with clangd.	2023-12-03 22:32:21 -08:00
Kazu Hirata	f6d6809d78	[llvm] Stop including array (NFC) Identified with clangd.	2023-12-02 00:52:25 -08:00
Ramkumar Ramachandra	d222fa4521	TargetInstrInfo: squelch a signedness warning on MSVC (#74078 ) Follow up on 9468de4 (TargetInstrInfo: make getOperandLatency return optional (NFC)) to squelch a signedness warning on MSVC, reported by Simon Pilgrim.	2023-12-01 16:08:41 +00:00
Eleanor Bonnici	6e3b2cb46e	[llvm][MC][ARM][Assembly] Emit relocations for ADRs and big-endian targets (#73834 ) Follow-up on https://github.com/llvm/llvm-project/pull/72873/ When ADR/LDR instructions reference a label in a different section, the offset is not known until link time, however, the assembler assumes it can resolve them in some cases. The previous patch addressed the issue for most LDR instructions, focusing on little-endian targets. This patch addresses the remaining work for ADRs and big-endian targets.	2023-12-01 13:54:04 +00:00
Ramkumar Ramachandra	9468de48fc	TargetInstrInfo: make getOperandLatency return optional (NFC) (#73769 ) getOperandLatency has the following behavior: it returns -1 as a special value, negative numbers other than -1 on some target-specific overrides, or a valid non-negative latency. This behavior can be surprising, as some callers do arithmetic on these negative values. Change the interface of getOperandLatency to return a std::optional<unsigned> to prevent surprises in callers. While at it, change the interface of getInstrLatency to return unsigned instead of int. This change was inspired by a refactoring in TargetSchedModel::computeOperandLatency.	2023-12-01 11:29:19 +00:00
Eleanor Bonnici	bbc5d9fe42	[llvm][MC][ARM][Assembly] Emit relocations for LDRs (#72873 ) It's possible (though inadvisable) to use LDR and refer to labels in different sections. In the Arm state, the assembler resolves the LDR instruction without emitting a relocation. That's incorrect because the assembler cannot make any assumptions about the relative position of the sections and the compiler output is therefore wrong. This patch ensures relocations are generated for all `LDR <Rt...>, label` instructions in the Arm state (little endian). This is not necessary when the label is in the same section but the relocation is now generated regardless. Instructions that now generate relocations have been removed from the pcrel-global.s test. Fortunately, LLD already implements the generated relocations and can fix LDR instructions when the symbol is in a different section, or report an error if the offset is too large for the immediate field in the particular LDR's encoding. The patch to address this problem for big endian targets will follow, as well as a fix for ADR that exhibits a similar behavior.	2023-11-25 12:36:00 +00:00
Nikita Popov	c1e3a94105	[TargetLowering] Don't include ComplexDeinterleavingPass.h (NFC) TargetLowering.h shouldn't include any passes and thus pull in the entire pass infrastructure. Replace the include with forward declarations.	2023-11-24 12:13:38 +01:00
simpal01	74cdb8e6f8	[llvm][ARM] Emit MVE .arch_extension after .fpu directive if it does not include MVE features (#71545 ) The floating-point and MVE features together specify the MVE functionality that is supported on the Cortex-M85 processor. But the FPU extension for the underlying architecture (armv8.1-m.main) is FPV5, which does not include MVE-F. So the compiler's -S output and `-save-temps=obj` lose the MVE feature, which leads to an assembler error. What is happening here is that the .fpu directive overrides any features previously set by the .cpu directive. Since the corresponding .fpu generated (.fpu fpv5-d16) does not include MVE-F, it overrides those features even though they are supported and set by the .cpu directive. It looks like .fpu is supposed to do this. In this case, there should be an .arch_extension directive re-enabling the relevant extensions after .fpu if the goal is to keep these extensions enabled. GCC also does the same. So this patch enables the MVE features by emitting the below arch extension: .fpu fpv5-d16 .arch_extension mve.fp --------- Co-authored-by: Simi Pallipurath <simi.pallipurath.com>	2023-11-22 09:16:58 +00:00
Sander de Smalen	81b7f115fb	[llvm][TypeSize] Fix addition/subtraction in TypeSize. (#72979 ) It seems TypeSize is currently broken in the sense that: TypeSize::Fixed(4) + TypeSize::Scalable(4) => TypeSize::Fixed(8) without failing its assert that explicitly tests for this case: assert(LHS.Scalable == RHS.Scalable && ...); The reason this fails is that `Scalable` is a static method of class TypeSize, and LHS and RHS are both objects of class TypeSize. So this is evaluating if the pointer to the function Scalable == the pointer to the function Scalable, which is always true because LHS and RHS have the same class. This patch fixes the issue by renaming `TypeSize::Scalable` -> `TypeSize::getScalable`, as well as `TypeSize::Fixed` to `TypeSize::getFixed`, so that it no longer clashes with the variable in FixedOrScalableQuantity. The new methods now also better match the coding standard, which specifies that: * Variable names should be nouns (as they represent state) * Function names should be verb phrases (as they represent actions)	2023-11-22 08:52:53 +00:00
Simon Pilgrim	cfee7152d4	[DAG] clang-format createBranchMacroFusionDAGMutation calls. NFC. Reduces diff in #72227	2023-11-20 12:13:09 +00:00
Serge Pavlov	a2e1de1934	[ARM][FPEnv] Lowering of fpenv intrinsics The change implements lowering of `get_fpenv`, `set_fpenv` and `reset_fpenv`. Differential Revision: https://reviews.llvm.org/D81843	2023-11-20 15:08:25 +07:00
Alex Bradbury	5b3eb1bc22	[ARM][X86][NFC] Use lambda to avoid duplicate switches in areLoadsFromSameBasePtr (#72376 ) Both the Arm and X86 implementations of areLoadsFromSameBasePtr use a switch over the machine opcode, and repeat the same logic for both SDNode operands. We can avoid the duplicated logic (especially lengthy in the X86 case) by just using a lambda. This could obviously be a candidate for moving out to a separate helper function if there were other users, but I've made the minimal change in this patch.	2023-11-15 12:35:35 +00:00
Kazu Hirata	01702c3f7f	[llvm] Stop including llvm/ADT/SmallSet.h (NFC) Identified with clangd.	2023-11-11 12:32:15 -08:00
Serge Pavlov	5b0f703918	Revert "[ARM][FPEnv] Lowering of fpenv intrinsics" This reverts commit d62f040418bd167d1ddd2b79c640a90c0c2ea353. Some cuda buildbots start failing.	2023-11-10 16:24:51 +07:00
Serge Pavlov	d62f040418	[ARM][FPEnv] Lowering of fpenv intrinsics The change implements lowering of `get_fpenv`, `set_fpenv` and `reset_fpenv`. Differential Revision: https://reviews.llvm.org/D81843	2023-11-10 16:06:33 +07:00
Craig Topper	c4821073cd	[GISel] Make target's PartMapping, ValueMapping, and BankIDToCopyMapIdx arrays const. (#71079 ) AMDGPU arrays were already const.	2023-11-09 17:03:56 -08:00
Paulo Matos	7b9d73c2f9	[NFC] Remove Type::getInt8PtrTy (#71029 ) Replace this with PointerType::getUnqual(). Followup to the opaque pointer transition. Fixes an in-code TODO item.	2023-11-07 17:26:26 +01:00
David Green	fee2953f23	[ARM] Fix for undef elements from demanded elements (#70504 ) I think this is right, that the undef bits should be the undef bits from the passthrough (operand 0), with the top/bottom lanes cleared, as they come from the second arg (operand 1). We don't yet attempt to look for undef elements in the second operand, but this should fix the bug with all elements being marked as undef and the instruction being optimized away.	2023-11-02 14:28:40 +00:00
Fangrui Song	5888dee7d0	[ARM,ELF] Fix access to dso_preemptable __stack_chk_guard with static relocation model (#70014 ) The ELF code from https://reviews.llvm.org/D112811 emits LDRLIT_ga_pcrel when `TM.isPositionIndependent()` but uses a different condition `Subtarget.isGVIndirectSymbol(GV)` (aka dso_preemptable on ELF targets). This would cause incorrect access for dso_preemptable `__stack_chk_guard` with the static relocation model. Regarding whether `__stack_chk_guard` gets the dso_local specifier, https://reviews.llvm.org/D150841 switched to `M.getDirectAccessExternalData()` (implied by "PIC Level") instead of `TM.getRelocationModel() == Reloc::Static`. The result is that when non-zero "PIC Level" is used with static relocation model (e.g. -fPIE/-fPIC LTO compiles with -no-pie linking), `__stack_chk_guard` accesses are incorrect.
```
  ldr r0, .LCPI0_0
  ldr r0, [r0]
  ldr r0, [r0] // incorrectly dereferences __stack_chk_guard
  ...
.LCPI0_0:
  .long __stack_chk_guard
```
To fix this, for dso_preemptable `__stack_chk_guard`, emit a GOT PIC code sequence like for -fpic using `LDRLIT_ga_pcrel`:
```
  ldr r0, .LCPI0_0
.LPC0_0:
  add r0, pc, r0
  ldr r0, [r0]
  ldr r0, [r0]
  ...
LCPI0_0:
.Ltmp0:
  .long __stack_chk_guard(GOT_PREL)-((.LPC0_0+8)-.Ltmp0)
```
Technically, `LDRLIT_ga_abs` with `R_ARM_GOT_ABS` could be used, but `R_ARM_GOT_ABS` does not have GNU or integrated assembler support. (Note, `.LCPI0_0: .long __stack_chk_guard@GOT` produces an `R_ARM_GOT_BREL`, which is not desired).
This patch fixes #6499 while not changing behavior for the following configurations:
```
run arm.linux.nopic --target=arm-linux-gnueabi -fno-pic
run arm.linux.pie --target=arm-linux-gnueabi -fpie
run arm.linux.pic --target=arm-linux-gnueabi -fpic
run armv6.darwin.nopic --target=armv6-apple-darwin -fno-pic
run armv6.darwin.dynamicnopic --target=armv6-apple-darwin -mdynamic-no-pic
run armv6.darwin.pic --target=armv6-apple-darwin -fpic
run armv7.darwin.nopic --target=armv7-apple-darwin -mcpu=cortex-a8 -fno-pic
run armv7.darwin.dynamicnopic --target=armv7-apple-darwin -mcpu=cortex-a8 -mdynamic-no-pic
run armv7.darwin.pic --target=armv7-apple-darwin -mcpu=cortex-a8 -fpic
run arm64.darwin.pic --target=arm64-apple-darwin
```
	2023-10-31 15:37:26 -07:00
David Green	75b3c3d267	[ARM] Disable UpperBound loop unrolling for MVE tail predicated loops. (#69709 ) For MVE tail predicated loops, better code can be generated by keeping the loop whole than by unrolling to an upper bound, which requires the expansion of active lane masks that can be difficult to generate good code for. This patch disables UpperBound unrolling when we find an active_lane_mask in the loop.	2023-10-31 09:51:30 +00:00
Fangrui Song	8e247b8f47	Replace TypeSize::{getFixed,getScalable} with canonical TypeSize::{Fixed,Scalable}. NFC	2023-10-27 00:30:41 -07:00
Craig Topper	2f4328e697	[GISel] Make assignValueToReg take CCValAssign by const reference. (#70086 ) This was previously passed by value. It used to be passed by non-const reference, but it was changed to value in D110610. I'm not sure why.	2023-10-24 15:47:04 -07:00
Craig Topper	9f592cbc18	[GISel] Pass MPO and VA to assignValueToAddress by const reference. NFC (#69810 ) Previously they were passed by non-const reference. No in tree target modifies the values. This makes it possible to call assignValueToAddress from assignCustomValue without a const_cast. For example in this patch https://github.com/llvm/llvm-project/pull/69138.	2023-10-24 09:58:22 -07:00
David Green	8a701024f3	[ARM] Lower i1 concat via MVETRUNC The MVETRUNC operation can perform the same truncate of two vectors, without requiring lane inserts/extracts from every vector lane. This moves the concat i1 lowering to use it for v8i1 and v16i1 result types, trading a bit of extra stack space for fewer instructions.	2023-10-18 19:40:11 +01:00
David Green	c060757bcc	[ARM] Correct v2i1 concat extract types. For two v2i1 concat into a v4i1, we cannot extract each i64 element as an i32. This casts to a v4i32 instead and extracts the correct vector lanes.	2023-10-18 13:40:38 +01:00
Kazu Hirata	9bcc094d37	[llvm] Use llvm::erase_if (NFC)	2023-10-12 22:59:25 -07:00
Kazu Hirata	4a0ccfa865	Use llvm::endianness::{big,little,native} (NFC) Note that llvm::support::endianness has been renamed to llvm::endianness while becoming an enum class as opposed to an enum. This patch replaces support::{big,little,native} with llvm::endianness::{big,little,native}.	2023-10-12 21:21:45 -07:00
ostannard	b98b567c25	[ARM] Correctly handle .inst in IT and VPT blocks (#68902 ) Advance the IT and VPT block state when parsing the .inst directive, so that it is possible to use them to emit conditional instructions. If we don't do this, then a later instruction inside or just after the block will have a mismatched condition, and so will be incorrectly reported as an error.	2023-10-12 17:03:01 +01:00
Kazu Hirata	b8885926f8	Use llvm::endianness::{big,little,native} (NFC) Note that llvm::support::endianness has been renamed to llvm::endianness while becoming an enum class as opposed to an enum. This patch replaces llvm::support::{big,little,native} with llvm::endianness::{big,little,native}.	2023-10-10 22:54:51 -07:00
Kazu Hirata	a9d5056862	Use llvm::endianness (NFC) Now that llvm::support::endianness has been renamed to llvm::endianness, we can use the shorter form. This patch replaces support::endianness with llvm::endianness.	2023-10-10 21:54:15 -07:00
Kazu Hirata	d7b18d5083	Use llvm::endianness{,::little,::native} (NFC) Now that llvm::support::endianness has been renamed to llvm::endianness, we can use the shorter form. This patch replaces llvm::support::endianness with llvm::endianness.	2023-10-09 00:54:47 -07:00
Alexey Bataev	e22818d5c9	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() does not always match the sizes of the permuted vector(s). This allows better cost estimation in SLP and fixes uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-10-05 06:17:07 -07:00
Arthur Eubanks	07389535a7	Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst." This reverts commit b186f1f68be11630355afb0c08b80374a6d31782. Causes crashes, see https://reviews.llvm.org/D158449.	2023-10-04 14:37:16 -07:00
Alexey Bataev	b186f1f68b	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() does not always match the sizes of the permuted vector(s). This allows better cost estimation in SLP and fixes uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-10-04 07:53:30 -07:00
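
Commit e8dbe945f3 in the log above notes that comparing an empty `std::optional<T>` against a `T` with `operator<=` returns true. The following is a minimal standalone sketch of that standard-library behavior only. The helper names `fitsBudgetNaive` and `fitsBudgetChecked` are hypothetical, chosen for illustration, and are not taken from LLVM.

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper (not LLVM code): compares the optional directly.
// An empty optional orders before every value, so this returns true
// when the latency is unknown.
bool fitsBudgetNaive(std::optional<unsigned> Latency, unsigned Budget) {
  return Latency <= Budget;
}

// Hypothetical helper (not LLVM code): treats an unknown latency as
// "does not fit" by checking for a value before comparing.
bool fitsBudgetChecked(std::optional<unsigned> Latency, unsigned Budget) {
  return Latency.has_value() && *Latency <= Budget;
}

// Small demonstration of the difference between the two helpers.
void demoOptionalComparison() {
  std::optional<unsigned> Unknown;             // no latency available
  assert(fitsBudgetNaive(Unknown, 3));         // empty optional <= 3 is true
  assert(!fitsBudgetChecked(Unknown, 3));      // explicit check rejects it
  assert(fitsBudgetNaive(2u, 3));              // engaged values compare normally
  assert(!fitsBudgetNaive(5u, 3));
}
```

Callers that previously relied on a negative sentinel comparing as "not less than" a bound would need something like the explicit `has_value()` check once the return type becomes `std::optional<unsigned>`.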


12216 Commits