llvm-project

Author	SHA1	Message	Date
Nashe Mncube	21f7c62627	[LLVM][ARM] Latency mutations for cortex m55,m7 and m85 (#115153 ) This patch adds latency mutations as a scheduling related speedup for the above mentioned cores. When benchmarking this pass on selected benchmarks we see a performance improvement of 1% on most benchmarks with some improving by up to 6%. Author: David Penry <david.penry@arm.com> Co-authored-by: Nashe Mncube <nashe.mncube@arm.com>	2024-11-13 17:13:41 +00:00
Oliver Stannard	dff114b356	[ARM] Optimise non-ABI frame pointers (#110286 ) With -fomit-frame-pointer, even if we set up a frame pointer for other reasons (e.g. variable-sized or over-aligned stack allocations), we don't need to create an ABI-compliant frame record. This means that we can save all of the general-purpose registers in one push, instead of splitting it to ensure that the frame pointer and link register are adjacent on the stack, saving two instructions per function.	2024-10-28 09:01:06 +00:00
Oliver Stannard	493529fbce	Re-land: [ARM] Fix frame chains with M-profile PACBTI (#110285 ) When using AAPCS-compliant frame chains with PACBTI return address signing, there were a number of bugs in the generation of the frame pointer and function prologues. The most obvious was that we sometimes would modify r11 before pushing it to the stack, so it wasn't preserved as required by the PCS. We also sometimes did not push R11 and LR adjacent to one another on the stack, or used R11 as a frame pointer without pointing it at the saved value of R11, both of which are required to have an AAPCS compliant frame chain. The original work of this patch was done by James Westwood, reviewed as #82801 and #81249, with some tidy-ups done by Mark Murray and myself.	2024-10-24 16:44:16 +01:00
Oliver Stannard	18ac0178ad	Revert "[ARM] Fix frame chains with M-profile PACBTI (#110285 )" Reverting because this is causing failures with MSan: https://lab.llvm.org/buildbot/#/builders/169/builds/4378 This reverts commit e1f8f84acec05997893c305c78fbf7feecf44dd7.	2024-10-18 09:04:28 +01:00
Oliver Stannard	e1f8f84ace	[ARM] Fix frame chains with M-profile PACBTI (#110285 ) When using AAPCS-compliant frame chains with PACBTI return address signing, there were a number of bugs in the generation of the frame pointer and function prologues. The most obvious was that we sometimes would modify r11 before pushing it to the stack, so it wasn't preserved as required by the PCS. We also sometimes did not push R11 and LR adjacent to one another on the stack, or used R11 as a frame pointer without pointing it at the saved value of R11, both of which are required to have an AAPCS compliant frame chain. The original work of this patch was done by James Westwood, reviewed as #82801 and #81249, with some tidy-ups done by Mark Murray and myself.	2024-10-17 09:32:44 +01:00
Michał Górny	387b37af1a	[LLVM] [Clang] Support for Gentoo `t64` triples (64-bit time_t ABIs) (#111302 ) Gentoo is planning to introduce a `t64` suffix for triples that will be used by 32-bit platforms that use 64-bit `time_t`. Add support for parsing and accepting these triples, and while at it make clang automatically enable the necessary glibc feature macros when this suffix is used. An open question is whether we can backport this to LLVM 19.x. After all, adding new triplets to Triple sounds like an ABI change — though I suppose we can minimize the risk of breaking something if we move new enum values to the very end.	2024-10-14 11:18:04 +00:00
Oliver Stannard	67200f5dc8	[ARM] Tidy up stack frame strategy code (NFC) (#110283 ) We have two different ways of splitting the pushes of callee-saved registers onto the stack, controlled by the confusingly similar names STI.splitFramePushPop() and STI.splitFramePointerPush(). This removes those functions and replaces them with a single function which returns an enum. This is in preparation for adding another value to that enum. The original work of this patch was done by James Westwood, reviewed as #82801 and #81249, with some tidy-ups done by Mark Murray and myself.	2024-10-09 09:29:27 +01:00
Nashe Mncube	439dcfafc5	[llvm][ARM][NFC] Renaming FeaturePrefLoopAlignment (#109932 ) The feature 'FeaturePrefLoopAlignment' was misleading as it was used to set the alignment of branch targets such as functions. Renamed to FeaturePreferfBranchAlignment.	2024-09-26 13:36:12 +01:00
Nikita Popov	dfa54298ff	[InitUndef] Enable the InitUndef pass on non-AMDGPU targets (#108353 ) The InitUndef pass works around a register allocation issue, where undef operands can be allocated to the same register as early-clobber result operands. This may lead to ISA constraint violations, where certain input and output registers are not allowed to overlap. Originally this pass was implemented for RISCV, and then extended to ARM in #77770. I've since removed the target-specific parts of the pass in #106744 and #107885. This PR reduces the pass to use a single requiresDisjointEarlyClobberAndUndef() target hook and enables it by default. The hook is disabled for AMDGPU, because overlapping early-clobber and undef operands are known to be safe for that target, and we get significant codegen diffs otherwise. The motivating case is the one in arm64-ldxr-stxr.ll, where we were previously incorrectly allocating a stxp input and output to the same register.	2024-09-16 09:48:25 +02:00
Tomas Matheson	71c5964f5c	[ARM][AArch64] autogenerate header file for TargetParser from Target tablegen files (#88378 ) Introduce a mechanism to share data between the ARM and AArch64 backends and TargetParser, to reduce duplication of code. This is similar to the current RISC-V implementation. The target tablegen file (in this case `ARM.td` or `AArch64.td`) is processed during building of `TargetParser` to generate the following files in the build tree: - `build/include/llvm/TargetParser/ARMTargetParserDef.inc` - `build/include/llvm/TargetParser/AArch64TargetParserDef.inc` For now, the use of these generated files is limited to files _outside_ of `TargetParser`. The main reason for this is that the modifications to `TargetParser` will require additional data added to the tablegen files, which I want to split into separate PRs.	2024-04-24 09:18:36 +01:00
Jonathan Thackray	8160139136	Add support for Arm Cortex A78AE CPU (#84485 ) Add support for Arm Cortex A78AE CPU Technical Reference Manual for Arm Cortex A78AE: https://developer.arm.com/documentation/101779/0003 Fixes #84450	2024-03-08 16:11:36 +00:00
James Westwood	b2c16e7ff4	Revert "[ARM] R11 not pushed adjacent to link register with PAC-M and… (#84019 ) … AAPCS frame chain fix (#82801)" This reverts commit 00e4a4197137410129d4725ffb82bae9ce44bdde. This patch was found to cause miscompilations and compilation failures.	2024-03-05 14:34:43 +00:00
James Westwood	00e4a41971	[ARM] R11 not pushed adjacent to link register with PAC-M and AAPCS frame chain fix (#82801 ) When code for M class architecture was compiled with AAPCS and PAC enabled, the frame pointer, r11, was not pushed to the stack adjacent to the link register. Due to PAC being enabled, r12 was placed between r11 and lr. This patch fixes this by adding an extra case to the already existing code that splits the GPR push in two when R11 is the frame pointer and certain parameters are met. The differential revision for this previous change can be found here: https://reviews.llvm.org/D125649. This now ensures that r11 and lr are pushed in a separate push instruction to the other GPRs when PAC and AAPCS are enabled, meaning the frame pointer and link register are now pushed onto the stack adjacent to each other.	2024-03-04 12:00:36 +00:00
Jack Styles	28233408a2	[CodeGen] [ARM] Make RISC-V Init Undef Pass Target Independent and add support for the ARM Architecture. (#77770 ) When using Greedy Register Allocation, there are times where early-clobber values are ignored, and assigned the same register. This is illegal behaviour for these instructions. To get around this, using Pseudo instructions for early-clobber registers gives them a definition and allows Greedy to assign them to a different register. This then meets the ARM Architecture Reference Manual and matches the defined behaviour. This patch takes the existing RISC-V patch and makes it target independent, then adds support for the ARM Architecture. Doing this will ensure early-clobber constraints are followed when using the ARM Architecture. Making the pass target independent will also open up the possibility that support for other architectures can be added in the future.	2024-02-26 12:12:31 +00:00
Lucas Duarte Prates	6bbaad1ed4	[ARM] Introduce the v9.5-A architecture version to Arm targets (#78994 ) This introduces the Armv9.5-A architecture version to the Arm backend, following on from the existing implementation for AArch64 targets. More details about the Armv9.5-A architecture version can be found at: * https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-developments-2023 * https://developer.arm.com/documentation/ddi0602/2023-09/	2024-01-23 14:39:15 +00:00
Kazu Hirata	b85f1f9b18	[Target] Include bitset (NFC) These files are relying on the transitive include of <bitset> from GIMatchTableExecutor.h, which doesn't actually use std::bitset.	2023-12-09 18:34:57 -08:00
Jonathan Thackray	8758e648da	[ARM][AArch32] Add support for AArch32 Cortex-M52 CPU (#74822 ) Cortex-M52 is an Armv8.1 AArch32 CPU. Technical specifications available at: https://developer.arm.com/processors/cortex-m52	2023-12-08 15:04:08 +00:00
Nicholas Guy	d65feccb12	[ARM] Set preferred function alignment Aligning functions yields small performance gains on embedded cores, moreso with numerous small function calls. Similar to aligning loops, if the function can fit within a single cache line then the performance overhead of fetching more instructions can be limited. Differential Revision: https://reviews.llvm.org/D157514	2023-08-16 17:31:21 +01:00
Maurice Heumann	249bd9eab0	[ARM] Fix codegen of unaligned volatile load/store of i64 Volatile loads/stores of i64 are lowered to LDRD/STRD on ARMv5TE. However, these instructions require the addresses to be aligned. Unaligned loads/stores therefore should be ignored by this handling. Differential Revision: https://reviews.llvm.org/D152790	2023-06-26 10:45:41 -07:00
Simon Tatham	10e4228114	[ARM,AArch64] Add a full set of -mtp= options. AArch64 has five system registers intended to be useful as thread pointers: one for each exception level which is RW at that level and inaccessible to lower ones, and the special TPIDRRO_EL0 which is readable but not writable at EL0. AArch32 has three, corresponding to the AArch64 ones that aren't specific to EL2 or EL3. Currently clang supports only a subset of these registers, and not even a consistent subset between AArch64 and AArch32: - For AArch64, clang permits you to choose between the four TPIDR_ELn thread registers, but not the fifth one, TPIDRRO_EL0. - In AArch32, on the other hand, the //only// thread register you can choose (apart from 'none, use a function call') is TPIDRURO, which corresponds to (the bottom 32 bits of) AArch64's TPIDRRO_EL0. So there is no thread register that you can currently use in both targets! For custom and bare-metal purposes, users might very reasonably want to use any of these thread registers. There's no reason they shouldn't all be supported as options, even if the default choices follow existing practice on typical operating systems. This commit extends the range of values acceptable to the `-mtp=` clang option, so that you can specify any of these registers by (the lower-case version of) their official names in the ArmARM: - For AArch64: tpidr_el0, tpidrro_el0, tpidr_el1, tpidr_el2, tpidr_el3 - For AArch32: tpidrurw, tpidruro, tpidrprw All existing values of the option are still supported and behave the same as before. Defaults are also unchanged. No command line that worked already should change behaviour as a result of this. The new values for the `-mtp=` option have been agreed with Arm's gcc developers (although I don't know whether they plan to implement them in the near future). Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D152433	2023-06-15 09:27:41 +01:00
Kazu Hirata	d7aa279e98	[ARM] Remove unused declaration computeIssueWidth The corresponding function definition was removed by: commit b2680c718fc49698e820441ed30c692a63476660 Author: Andrew Trick <atrick@apple.com> Date: Tue Jun 5 03:44:43 2012 +0000	2023-05-16 18:42:49 -07:00
David Green	15d2821263	[ARM] Fix qsat for armv5te/armv6 + thumb-mode This is a Thumb1 target, so will not have qsat instructions available. There was a mismatch between hasBaseDSP and the instruction patterns when +dsp was present, which is set by clang (but maybe shouldn't be). The target being thumb1-only should override that, implying that it does not have any qadds. Fixes #62273	2023-04-23 17:20:28 +01:00
Pavel Kosov	c417b7a695	[OHOS] Add support for OpenHarmony Add support for OpenHarmony OS General OpenHarmony OS discussion on discourse thread "[RFC] Add support for OpenHarmony OS" https://discourse.llvm.org/t/rfc-add-support-for-openharmony-os/66656 Reviewed By: DavidSpickett Differential Revision: https://reviews.llvm.org/D138202	2023-02-27 17:15:45 +03:00
Archibald Elliott	62c7f035b4	[NFC][TargetParser] Remove llvm/ADT/Triple.h I also ran `git clang-format` to get the headers in the right order for the new location, which has changed the order of other headers in two files.	2023-02-07 12:39:46 +00:00
Ties Stuij	983f63f7f0	[AArch64][ARM] add Armv8.9-a/Armv9.4-a identifier support For both ARM and AArch64 add support for specifying -march=armv8.9a/armv9.4a to clang. Add backend plumbing like target parser and predicate support. For a summary of Armv8.9/Armv9.4 features, see: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-2022 For detailed information, consult the Arm Architecture Reference Manual for A-profile architecture: https://developer.arm.com/documentation/ddi0487/latest/ People who contributed to this patch: - Keith Walker - Ties Stuij Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D138010	2022-11-16 10:20:14 +00:00
David Spickett	e428baf001	[LLVM][ARM] Remove options for armv2, 2A, 3 and 3M Fixes #57486 These pre v4 architectures are not specifically supported by codegen. As demonstrated in the linked issue. GCC has not supported 3M since GCC 9 and presumably 2 and 2A earlier than that. So we are aligned in that sense. (see https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2abd6e34fcf3bd9f9ffafcaa47cdc3ed443f9add) This removes the options and associated testing. The Pre_v4 build attribute remains mainly because its absence would be more confusing. It will not be used other than to complete the list of build attributes as shown in the ABI. https://github.com/ARM-software/abi-aa/blob/main/addenda32/addenda32.rst#3352the-target-related-attributes Reviewed By: nickdesaulniers, peter.smith, rengolin Differential Revision: https://reviews.llvm.org/D133109	2022-09-08 09:49:48 +00:00
Lucas Prates	70a5c52534	[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records Currently, an AAPCS compliant frame record is not always created for functions when it should be. Although a consistent frame record might not be required in some cases, there are still scenarios where applications may want to make use of the call hierarchy made available through it. In order to enable the use of AAPCS compliant frame records whilst keeping backwards compatibility, this patch introduces a new command-line option (`-mframe-chain=[none|aapcs|aapcs+leaf]`) for AArch32 and Thumb backends. The option allows users to explicitly select when to use it, and is also useful to ensure the extra overhead introduced by the frame records is only introduced when necessary, in particular for Thumb targets. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D125094	2022-06-27 14:08:48 +01:00
Krasimir Georgiev	8f2ba36336	Revert "[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records AND [NFC][Thumb] Update frame-chain codegen test to use thumbv6m" This reverts commit 7625e01d661644a560884057755d48a0da8b77b4 and dependent cbcce82ef6b512d97e92a319a75a03e997c844e1. Commit 7625e01d661644a560884057755d48a0da8b77b4 causes some new codegen test failures under asan, e.g., CodeGen/ARM/execute-only.ll: https://lab.llvm.org/buildbot/#/builders/5/builds/24659/steps/15/logs/stdio.	2022-06-15 16:10:02 +02:00
Lucas Prates	7625e01d66	[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records Currently, an AAPCS compliant frame record is not always created for functions when it should be. Although a consistent frame record might not be required in some cases, there are still scenarios where applications may want to make use of the call hierarchy made available through it. In order to enable the use of AAPCS compliant frame records whilst keeping backwards compatibility, this patch introduces a new command-line option (`-mframe-chain=[none|aapcs|aapcs+leaf]`) for AArch32 and Thumb backends. The option allows users to explicitly select when to use it, and is also useful to ensure the extra overhead introduced by the frame records is only introduced when necessary, in particular for Thumb targets. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D125094	2022-06-14 13:37:51 +01:00
Lucas Prates	33b9ad647e	Revert "[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records" Reverting change due to test failure. This reverts commit 6119053dab67129eb1700dbf36db3524dd3e421f.	2022-06-13 11:00:49 +01:00
Lucas Prates	6119053dab	[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records Currently, an AAPCS compliant frame record is not always created for functions when it should be. Although a consistent frame record might not be required in some cases, there are still scenarios where applications may want to make use of the call hierarchy made available through it. In order to enable the use of AAPCS compliant frame records whilst keeping backwards compatibility, this patch introduces a new command-line option (`-mframe-chain=[none|aapcs|aapcs+leaf]`) for AArch32 and Thumb backends. The option allows users to explicitly select when to use it, and is also useful to ensure the extra overhead introduced by the frame records is only introduced when necessary, in particular for Thumb targets. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D125094	2022-06-13 10:21:06 +01:00
Martin Storsjö	2ab19bfa41	[ARM] Adjust the frame pointer when it's needed for SEH unwinding For functions that require restoring SP from FP (e.g. that need to align the stack, or that have variable sized allocations), the prologue and epilogue previously used to look like this: push {r4-r5, r11, lr} add r11, sp, #8 ... sub r4, r11, #8 mov sp, r4 pop {r4-r5, r11, pc} This is problematic, because this unwinding operation (restoring sp from r11 - offset) can't be expressed with the SEH unwind opcodes (probably because this unwind procedure doesn't map exactly to individual instructions; note the detour via r4 in the epilogue too). To make unwinding work, the GPR push is split into two; the first one pushing all other registers, and the second one pushing r11+lr, so that r11 can be set pointing at this spot on the stack: push {r4-r5} push {r11, lr} mov r11, sp ... mov sp, r11 pop {r11, lr} pop {r4-r5} bx lr For the same setup, MSVC generates code that uses two registers; r11 still pointing at the {r11,lr} pair, but a separate register used for restoring the stack at the end: push {r4-r5, r7, r11, lr} add r11, sp, #12 mov r7, sp ... mov sp, r7 pop {r4-r5, r7, r11, pc} For cases with clobbered float/vector registers, they are pushed after the GPRs, before the {r11,lr} pair. Differential Revision: https://reviews.llvm.org/D125649	2022-06-02 12:28:46 +03:00
David Penry	dcb77643e3	Reapply [CodeGen][ARM] Enable Swing Module Scheduling for ARM Fixed "private field is not used" warning when compiled with clang. original commit: 28d09bbbc3d09c912b54a4d5edb32cab7de32a6f reverted in: fa49021c68ef7a7adcdf7b8a44b9006506523191 ------ This patch permits Swing Modulo Scheduling for ARM targets and turns it on by default for the Cortex-M7. The t2Bcc instruction is recognized as a loop-ending branch. MachinePipeliner is extended by adding support for "unpipelineable" instructions. These instructions are those which contribute to the loop exit test; in the SMS papers they are removed before creating the dependence graph and then inserted into the final schedule of the kernel and prologues. Support for these instructions was not previously necessary because current targets supporting SMS have only supported it for hardware loop branches, which have no loop-exit-contributing instructions in the loop body. The current structure of the MachinePipeliner makes it difficult to remove/exclude these instructions from the dependence graph. Therefore, this patch leaves them in the graph, but adds a "normalization" method which moves them in the schedule to stage 0, which causes them to appear properly in kernel and prologues. It was also necessary to be more careful about boundary nodes when iterating across successors in the dependence graph because the loop exit branch is now a non-artificial successor to instructions in the graph. In addition, schedules with physical use/def pairs in the same cycle should be treated as creating an invalid schedule because the scheduling logic doesn't respect physical register dependence once scheduled to the same cycle. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D122672	2022-04-29 10:54:39 -07:00
David Penry	fa49021c68	Revert "[CodeGen][ARM] Enable Swing Module Scheduling for ARM" This reverts commit 28d09bbbc3d09c912b54a4d5edb32cab7de32a6f while I investigate a buildbot failure.	2022-04-28 13:29:27 -07:00
David Penry	28d09bbbc3	[CodeGen][ARM] Enable Swing Module Scheduling for ARM This patch permits Swing Modulo Scheduling for ARM targets and turns it on by default for the Cortex-M7. The t2Bcc instruction is recognized as a loop-ending branch. MachinePipeliner is extended by adding support for "unpipelineable" instructions. These instructions are those which contribute to the loop exit test; in the SMS papers they are removed before creating the dependence graph and then inserted into the final schedule of the kernel and prologues. Support for these instructions was not previously necessary because current targets supporting SMS have only supported it for hardware loop branches, which have no loop-exit-contributing instructions in the loop body. The current structure of the MachinePipeliner makes it difficult to remove/exclude these instructions from the dependence graph. Therefore, this patch leaves them in the graph, but adds a "normalization" method which moves them in the schedule to stage 0, which causes them to appear properly in kernel and prologues. It was also necessary to be more careful about boundary nodes when iterating across successors in the dependence graph because the loop exit branch is now a non-artificial successor to instructions in the graph. In addition, schedules with physical use/def pairs in the same cycle should be treated as creating an invalid schedule because the scheduling logic doesn't respect physical register dependence once scheduled to the same cycle. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D122672	2022-04-28 13:01:18 -07:00
Eli Friedman	2f497ec3a0	[ARM] Fix ARM backend to correctly use atomic expansion routines. Without this patch, clang would generate calls to __sync_* routines on targets where it does not make sense; we can't assume the routines exist on unknown targets. Linux has special implementations of the routines that work on old ARM targets; other targets have no such routines. In general, atomics operations which aren't natively supported should go through libatomic (__atomic_) APIs, which can support arbitrary atomics through locks. ARM targets older than v6, where this patch makes a difference, are rare in practice, but not completely extinct. See, for example, discussion on D116088. This also affects Cortex-M0, but I don't think __sync_ routines actually exist in any Cortex-M0 libraries. So in practice this just leads to a slightly different linker error for those cases, I think. Mechanically, this patch does the following: - Ensures we run atomic expansion unconditionally; it never makes sense to completely skip it. - Fixes getMaxAtomicSizeInBitsSupported() so it returns an appropriate number on all ARM subtargets. - Fixes shouldExpandAtomicRMWInIR() and shouldExpandAtomicCmpXchgInIR() to correctly handle subtargets that don't have atomic instructions. Differential Revision: https://reviews.llvm.org/D120026	2022-03-18 12:43:57 -07:00
Tomas Matheson	831ab35b2f	[ARM][AArch64] generate subtarget feature flags Reland of D120906 after sanitizer failures. This patch aims to reduce a lot of the boilerplate around adding new subtarget features. From the SubtargetFeatures tablegen definitions, a series of calls to the macro GET_SUBTARGETINFO_MACRO are generated in ARM/AArch64GenSubtargetInfo.inc. ARMSubtarget/AArch64Subtarget can then use this macro to define bool members and the corresponding getter methods. Some naming inconsistencies have been fixed to allow this, and one unused member removed. This implementation only applies to boolean members; in future both BitVector and enum members could also be generated. Differential Revision: https://reviews.llvm.org/D120906	2022-03-18 16:07:00 +00:00
Tomas Matheson	62c481542e	Revert "[ARM][AArch64] generate subtarget feature flags" This reverts commit dd8b0fecb95df7689aac26c2ef9ebd1f527f9f46.	2022-03-18 11:58:20 +00:00
Tomas Matheson	dd8b0fecb9	[ARM][AArch64] generate subtarget feature flags This patch aims to reduce a lot of the boilerplate around adding new subtarget features. From the SubtargetFeatures tablegen definitions, a series of calls to the macro GET_SUBTARGETINFO_MACRO are generated in ARM/AArch64GenSubtargetInfo.inc. ARMSubtarget/AArch64Subtarget can then use this macro to define bool members and the corresponding getter methods. Some naming inconsistencies have been fixed to allow this, and one unused member removed. This implementation only applies to boolean members; in future both BitVector and enum members could also be generated. Differential Revision: https://reviews.llvm.org/D120906	2022-03-18 11:48:20 +00:00
Mircea Trofin	cb2160760e	[nfc][codegen] Move RegisterBank[Info].h under CodeGen This wraps up from D119053. The 2 headers are moved as described, fixed file headers and include guards, updated all files where the old paths were detected (simple grep through the repo), and `clang-format`-ed it all. Differential Revision: https://reviews.llvm.org/D119876	2022-03-01 21:53:25 -08:00
Egor Zhdan	3a1cb36237	Add DriverKit support This patch is the first in a series of patches to upstream the support for Apple's DriverKit. Once complete, it will allow targeting DriverKit platform with Clang similarly to AppleClang. This code was originally authored by JF Bastien. Differential Revision: https://reviews.llvm.org/D118046	2022-02-22 13:42:53 +00:00
Mark Murray	3d7662142d	[ARM] Undeprecate complex IT blocks AArch32/Armv8A introduced the performance deprecation of certain patterns of IT instructions. After some debate internal to ARM, this is now being reverted; i.e. no IT instruction patterns are performance deprecated anymore, as the performance degradation is not significant enough. This reverts the following: "ARMv8-A deprecates some uses of the T32 IT instruction. All uses of IT that apply to instructions other than a single subsequent 16-bit instruction from a restricted set are deprecated, as are explicit references to the PC within that single 16-bit instruction. This permits the non-deprecated forms of IT and subsequent instructions to be treated as a single 32-bit conditional instruction." The deprecation no longer applies, but the behaviour may be controlled by the -arm-restrict-it and -arm-no-restrict-it command-line options, with the latter being the default. No warnings about complex IT blocks will be generated. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D118044	2022-02-07 15:47:53 +00:00
Ties Stuij	6b1e844b69	[ARM] Add Cortex-X1C Support for Clang and LLVM This patch upstreams support for the Arm-v8 Cortex-X1C processor for AArch64 and ARM. For more information, see: - https://community.arm.com/arm-community-blogs/b/announcements/posts/arm-cortex-x1c - https://developer.arm.com/documentation/101968/0002/Functional-description/Technical-overview/Components The following people contributed to this patch: - Simon Tatham - Ties Stuij Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D117202	2022-01-31 14:23:35 +00:00
Lucas Prates	cd7f621a0a	[ARM][AArch64] Introduce Armv9.3-A This patch introduces support for targeting the Armv9.3-A architecture, which should map to the existing Armv8.8-A extensions. Differential Revision: https://reviews.llvm.org/D116158	2022-01-03 12:40:43 +00:00
Simon Tatham	d50072f74e	[ARM] Introduce an empty "armv8.8-a" architecture. This is the first commit in a series that implements support for "armv8.8-a" architecture. This should contain all the necessary boilerplate to make the 8.8-A architecture exist from LLVM and Clang's point of view: it adds the new arch as a subtarget feature, a definition in TargetParser, a name on the command line, an appropriate set of predefined macros, and adds appropriate tests. The new architecture name is supported in both AArch32 and AArch64. However, in this commit, no actual _functionality_ is added as part of the new architecture. If you specify -march=armv8.8a, the compiler will accept it and set the right predefines, but generate no code any differently. Differential Revision: https://reviews.llvm.org/D115694	2021-12-31 16:43:53 +00:00
Ties Stuij	63eb7ff47d	[ARM] Implement PAC return address signing mechanism for PACBTI-M This patch implements PAC return address signing for armv8-m. This patch roughly accomplishes the following things: - PAC and AUT instructions are generated. - They're part of the stack frame setup, so that shrink-wrapping can move them inwards to cover only part of a function - The auth code generated by PAC is saved across subroutine calls so that AUT can find it again to check - PAC is emitted before stacking registers (so that the SP it signs is the one on function entry). - The new pseudo-register ra_auth_code is mentioned in the DWARF frame data - With CMSE also in use: PAC is emitted before stacking FPCXTNS, and AUT validates the corresponding value of SP - Emit correct unwind information when PAC is replaced by PACBTI - Handle tail calls correctly Some notes: We make the assembler accept the `.save {ra_auth_code}` directive that is emitted by the compiler when it saves a register that contains a return address authentication code. For EHABI we need to have the `FrameSetup` flag on the instruction and handle the `t2PACBTI` opcode (identically to `t2PAC`), so we can emit `.save {ra_auth_code}`, instead of `.save {r12}`. For PACBTI-M, the instruction which computes return address PAC should use SP value before adjustment for the argument registers save area (used for variadic functions and when a parameter is split between stack and register), but at the same time it should be after the instruction that saves FPCXT when compiling a CMSE entry function. This patch moves the varargs SP adjustment after the FPCXT save (they are never enabled at the same time), so in a following patch handling of the `PAC` instruction can be placed between them. Epilogue emission code adjusted in a similar manner. PACBTI-M code generation should not emit any instructions for architectures v6-m, v8-m.base, and for A- and R-class cores. Diagnostic message for such cases is handled separately by a future ticket. Note on tail calls: If the called function has four arguments that occupy registers `r0`-`r3`, the only option for holding the function pointer itself is `r12`, but this register is used to keep the PAC during function prologue/epilogue and clobbers the function pointer. When we do the tail call we need the five registers (`r0`-`r3` and `r12`) to keep six values - the four function arguments, the function pointer and the PAC, which is obviously impossible. One option would be to authenticate the return address before all callee-saved registers are restored, so we have a scratch register to temporarily keep the value of `r12`. The issue with this approach is that it violates a fundamental invariant that PAC is computed using CFA as a modifier. It would also mean using separate instructions to pop `lr` and the rest of the callee-saved registers, which would offset the advantages of doing a tail call. Instead, this patch disables indirect tail calls when the called function takes four or more arguments and the return address signing and authentication is enabled for the caller function, conservatively assuming the caller function would spill LR. This patch is part of a series that adds support for the PACBTI-M extension of the Armv8.1-M architecture, as detailed here: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension The PACBTI-M specification can be found in the Armv8-M Architecture Reference Manual: https://developer.arm.com/documentation/ddi0553/latest The following people contributed to this patch: - Momchil Velikov - Ties Stuij Reviewed By: danielkiss Differential Revision: https://reviews.llvm.org/D112429	2021-12-07 10:15:19 +00:00
Ties Stuij	0fbb17458a	[ARM] Implement setjmp BTI placement for PACBTI-M This patch intends to guard indirect branches performed by longjmp by inserting BTI instructions after calls to setjmp. Calls with 'returns-twice' are lowered to a new pseudo-instruction named t2CALL_BTI that is later expanded to a bundle of {tBL,t2BTI}. This patch is part of a series that adds support for the PACBTI-M extension of the Armv8.1-M architecture, as detailed here: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension The PACBTI-M specification can be found in the Armv8-M Architecture Reference Manual: https://developer.arm.com/documentation/ddi0553/latest The following people contributed to this patch: - Alexandros Lamprineas - Ties Stuij Reviewed By: labrinea Differential Revision: https://reviews.llvm.org/D112427	2021-12-06 11:07:10 +00:00
Ties Stuij	5cff77c23f	[clang][ARM] PACBTI-M assembly support Introduce assembly support for Armv8.1-M PACBTI extension. This is an optional extension in v8.1-M. There are 10 new system registers and 5 new instructions, all predicated on the feature. The attribute for llvm-mc is called "pacbti". For armclang, an architecture extension also called "pacbti" was created. This patch is part of a series that adds support for the PACBTI-M extension of the Armv8.1-M architecture, as detailed here: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension The PACBTI-M specification can be found in the Armv8-M Architecture Reference Manual: https://developer.arm.com/documentation/ddi0553/latest The following people contributed to this patch: - Victor Campos - Ties Stuij Reviewed By: labrinea Differential Revision: https://reviews.llvm.org/D112420	2021-11-30 09:28:18 +00:00
Mubashar Ahmad	8e47b83ec9	[AArch64][ARM] Enablement of Cortex-A710 Support Phabricator review: https://reviews.llvm.org/D113256	2021-11-18 10:58:05 +00:00
Daniel Kiss	d8075e8781	Reland "[ARM] __cxa_end_cleanup should be called instead of _UnwindResume." This is relanding commit da1d1a08694bbfe0ea7a23ea094612436e8a2dd0 . This patch additionally addresses failures found in buildbots & post review comments. ARM EHABI[1] specifies the __cxa_end_cleanup to be called after cleanup. It will call the UnwindResume. __cxa_begin_cleanup will be called from libcxxabi while __cxa_end_cleanup is never called. This will trigger a termination when a foreign exception is processed while UnwindResume is called because the global state will be wrong due to the missing __cxa_end_cleanup call. Additional test here: D109856 [1] https://github.com/ARM-software/abi-aa/blob/main/ehabi32/ehabi32.rst#941compiler-helper-functions Reviewed By: logan Differential Revision: https://reviews.llvm.org/D111703	2021-10-28 21:45:09 +02:00

395 Commits