llvm-project

Author	SHA1	Message	Date
Thorsten Schütt	4b028773b2	Revert "[GlobalISel] Import samesign flag" (#114256 ) Reverts llvm/llvm-project#113090	2024-10-30 17:03:17 +01:00
Thorsten Schütt	72b115301d	[GlobalISel] Import samesign flag (#113090 ) Credits: https://github.com/llvm/llvm-project/pull/111419	2024-10-30 16:34:01 +01:00
Jack Styles	86f76c3b17	[AArch64][Libunwind] Add Support for FEAT_PAuthLR DWARF Instruction (#112171 ) As part of FEAT_PAuthLR, a new DWARF Frame Instruction was introduced, `DW_CFA_AARCH64_negate_ra_state_with_pc`. This instructs Libunwind that the PC has been used with the signing instruction. This change includes three commits - Libunwind support for the newly introduced DWARF Instruction - CodeGen Support for the DWARF Instructions - Reversing the changes made in #96377. Due to `DW_CFA_AARCH64_negate_ra_state_with_pc`'s requirements to be placed immediately after the signing instruction, this would mean the CFI Instruction location was not consistent with the generated location when not using FEAT_PAuthLR. The commit reverses the changes and makes the location consistent across the different branch protection options. While this does have a code size effect, this is a negligible one. For the ABI information, see here: `853286c7ab/aadwarf64/aadwarf64.rst (id23)`	2024-10-28 08:22:38 +00:00
Akshat Oke	6360652e9f	Reland [AMDGPU] Serialize WWM_REG vreg flag (#110229 ) (#112492 ) A reland but not an exact copy as `VRegInfo.Flags` from the parser is now an int8 instead of a vector; so only need to copy over the value.	2024-10-21 13:44:09 +05:30
Peter Collingbourne	3cab8827fd	Revert "[AMDGPU] Serialize WWM_REG vreg flag (#110229 )" This reverts commit bec839d8eed9dd13fa7eaffd50b28f8f913de2e2. Caused buildbot failures, e.g. https://lab.llvm.org/buildbot/#/builders/52/builds/2928	2024-10-15 13:18:43 -07:00
Akshat Oke	8b20f1b924	[MIR] Fix tests for flags in register info (#112179 ) [MIR] Serialize virtual register flags #110228 introduces register flags which appear empty in .mir dumps. Future tests should use `-simplify-mir`.	2024-10-14 18:28:54 +05:30
Akshat Oke	bec839d8ee	[AMDGPU] Serialize WWM_REG vreg flag (#110229 )	2024-10-14 14:37:21 +05:30
Akshat Oke	dbfca24b99	[MIR] Serialize virtual register flags (#110228 ) [MIR] Serialize virtual register flags This introduces target-specific vreg flag serialization. Flags are represented as `uint8_t` and the `TargetRegisterInfo` override provides methods `getVRegFlagValue` to deserialize and `getVRegFlagsOfReg` to serialize.	2024-10-14 14:19:53 +05:30
Stephen Tozer	d826b0c90f	[LLVM] Add HasFakeUses to MachineFunction (#110097 ) Following the addition of the llvm.fake.use intrinsic and corresponding MIR instruction, two further changes are planned: to add an -fextend-lifetimes flag to Clang that emits these intrinsics, and to have -Og enable this flag by default. Currently, some logic for handling fake uses is gated by the optdebug attribute, which is intended to be switched on by -fextend-lifetimes (and by extension -Og later on). However, the decision was made that a general optdebug attribute should be incompatible with other opt_ attributes (e.g. optsize, optnone), since they all express different intents for how to optimize the program. We would still like to allow -fextend-lifetimes with optsize however (i.e. -Os -fextend-lifetimes should be legal), since it may be a useful configuration and there is no technical reason to not allow it. This patch resolves this by tracking MachineFunctions that have fake uses, allowing us to run passes that interact with them and skip passes that clash with them.	2024-10-04 13:13:30 +01:00
Dominik Montada	d853adee00	[MIR] Fix return value when computed properties conflict with given prop (#109923 ) This fixes a test failure when expensive checks are enabled. Use the correct return value when computing machine function properties resulted in an error (e.g. when conflicting with explicitly set values). Without this, the machine verifier would crash even in the presence of parsing errors which should have gently terminated execution.	2024-09-25 10:47:14 +02:00
Dominik Montada	8ba334bc4a	[MIR] Allow overriding isSSA, noPhis, noVRegs in MIR input (#108546 ) Allow setting the computed properties IsSSA, NoPHIs, NoVRegs for MIR functions in MIR input. The default value is still the computed value. If the property is set to false, the computed result is ignored. Conflicting values (e.g. setting IsSSA where the input MIR is clearly not SSA) lead to an error. Closes #37787	2024-09-24 14:21:45 +02:00
gonzalobg	78ae2de4c6	[NVPTX] Load/Store/Fence syncscope support (#106101 ) Adds "initial" support for `syncscope` to the NVPTX backend `load`/`store`/`fence` instructions. Atomic Read-Modify-Write operations intentionally not supported as part of this initial PR.	2024-09-23 10:18:00 -07:00
Abinaya Saravanan	c010b72e9b	[HEXAGON] AddrModeOpt support for HVX and optimize adds (#106368 ) This patch does 3 things: 1. Add support for optimizing the address mode of HVX load/store instructions 2. Reduce the value of Add instruction immediates by replacing with the difference from other Addi instructions that share common base: For Example, If we have the below sequence of instructions: r1 = add(r2,# 1024) ... r3 = add(r2,# 1152) ... r4 = add(r2,# 1280) Where the register r2 has the same reaching definition, They get modified to the below sequence: r1 = add(r2,# 1024) ... r3 = add(r1,# 128) ... r4 = add(r1,# 256) 3. Fixes a bug pass where the addi instructions were modified based on a predicated register definition, leading to incorrect output. Eg: INST-1: if (p0) r2 = add(r13,# 128) INST-2: r1 = add(r2,# 1024) INST-3: r3 = add(r2,# 1152) INST-4: r5 = add(r2,# 1280) In the above case, since r2's definition is predicated, we do not want to modify the uses of r2 in INST-3/INST-4 with add(r1,#128/256) 4.Fixes a corner case It looks like we never check whether the offset register is actually live (not clobbered) at optimization site. Add the check whether it is live at MBB entrance. The rest should have already been verified. 5. Fixes a bad codegen For whatever reason we do transformation without checking if the value in register actually reaches the user. This is second identical fix for this pass. Co-authored-by: Anirudh Sundar <quic_sanirudh@quicinc.com> Co-authored-by: Sergei Larin <slarin@quicinc.com>	2024-09-13 18:48:34 -05:00
Diana Picus	3356208531	Reland "[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic" (#108512 ) This reverts commit `7792b4ae79`. The problem was a conflict with `e55d6f5ea2` "[AMDGPU] Simplify and improve codegen for llvm.amdgcn.set.inactive (https://github.com/llvm/llvm-project/pull/107889)" which changed the syntax of V_SET_INACTIVE (and thus made my MIR test crash). ...if only we had a merge queue.	2024-09-13 11:54:30 +02:00
Diana Picus	7792b4ae79	Revert "Reland "[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic" (#108054 )"" (#108341 ) Reverts llvm/llvm-project#108173 si-init-whole-wave.mir crashes on some buildbots (although it passed both locally with sanitizers enabled and in pre-merge tests). Investigating.	2024-09-12 10:12:09 +02:00
Diana Picus	703ebca869	Reland "[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic" (#108054 )" (#108173 ) This reverts commit `c7a7767fca`. The buildbots failed because I removed a MI from its parent before updating LIS. This PR should fix that.	2024-09-12 09:11:41 +02:00
Vitaly Buka	c7a7767fca	Revert "[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic" (#108054 ) Breaks bots, see #105822. Reverts llvm/llvm-project#105822	2024-09-10 09:51:43 -07:00
Diana Picus	44556e64f2	[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic (#105822 ) This intrinsic is meant to be used in functions that have a "tail" that needs to be run with all the lanes enabled. The "tail" may contain complex control flow that makes it unsuitable for the use of the existing WWM intrinsics. Instead, we will pretend that the function starts with all the lanes enabled, then branches into the actual body of the function for the lanes that were meant to run it, and then finally all the lanes will rejoin and run the tail. As such, the intrinsic will return the EXEC mask for the body of the function, and is meant to be used only as part of a very limited pattern (for now only in amdgpu_cs_chain functions): ``` entry: %func_exec = call i1 @llvm.amdgcn.init.whole.wave() br i1 %func_exec, label %func, label %tail func: ; ... stuff that should run with the actual EXEC mask br label %tail tail: ; ... stuff that runs with all the lanes enabled; ; can contain more than one basic block ``` It's an error to use the result of this intrinsic for anything other than a branch (but unfortunately checking that in the verifier is non-trivial because SIAnnotateControlFlow will introduce an amdgcn.if between the intrinsic and the branch). The intrinsic is lowered to a SI_INIT_WHOLE_WAVE pseudo, which for now is expanded in si-wqm (which is where SI_INIT_EXEC is handled too); however the information that the function was conceptually started in whole wave mode is stored in the machine function info (hasInitWholeWave). This will be useful in prolog epilog insertion, where we can skip saving the inactive lanes for CSRs (since if the function started with all the lanes active, then there are no inactive lanes to preserve).	2024-09-10 13:24:53 +02:00
Carl Ritson	16cda01d22	[AMDGPU] V_SET_INACTIVE optimizations (#98864 ) Optimize V_SET_INACTIVE by allow it to run in WWM. Hence WWM sections are not broken up for inactive lane setting. WWM V_SET_INACTIVE can typically be lower to V_CNDMASK. Some cases require use of exec manipulation V_MOV as previous code. GFX9 sees slight instruction count increase in edge cases due to smaller constant bus. Additionally avoid introducing exec manipulation and V_MOVs where a source of V_SET_INACTIVE is the destination. This is a common pattern as WWM register pre-allocation often assigns the same register.	2024-09-05 14:39:28 +09:00
Stephen Tozer	9a58b12fe7	[ExtendLifetimes][NFC] Add explicit triple to new fake-use tests Several tests for the new fake use intrinsic are failing on NVPTX buildbots due to relying on behaviour for their expected triple; this commit adds that triple to each of them to prevent failures. Fixes commit 3d08ade (#86149). Example buildbot failures: https://lab.llvm.org/buildbot/#/builders/160/builds/4175 https://lab.llvm.org/buildbot/#/builders/180/builds/4173	2024-08-29 18:43:35 +01:00
Stephen Tozer	3d08ade7bd	[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149 ) This patch is part of a set of patches that add an `-fextend-lifetimes` flag to clang, which extends the lifetimes of local variables and parameters for improved debuggability. In addition to that flag, the patch series adds a pragma to selectively disable `-fextend-lifetimes`, and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes` for this pointers only. All changes and tests in these patches were written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer) has handled review and merging. The extend lifetimes flag is intended to eventually be set on by `-Og`, as discussed in the RFC here: https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850 This patch implements a new intrinsic instruction in LLVM, `llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand and has no effect other than "using" its operand, to ensure that its operand remains live until after the fake use. This patch does not emit fake uses anywhere; the next patch in this sequence causes them to be emitted from the clang frontend, such that for each variable (or this) a fake.use operand is inserted at the end of that variable's scope, using that variable's value. This patch covers everything post-frontend, which is largely just the basic plumbing for a new intrinsic/instruction, along with a few steps to preserve the fake uses through optimizations (such as moving them ahead of a tail call or translating them through SROA). Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>	2024-08-29 17:53:32 +01:00
Matt Arsenault	b1bcb7ca46	Reapply "AMDGPU: Move attributor into optimization pipeline (#83131 )" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851 ) This reverts commit adaff46d087799072438dd744b038e6fd50a2d78. Drop the -O3 checks from default-attributes.hip. I don't know why they are different on some bots but reverting this is far too disruptive.	2024-07-15 11:51:44 +04:00
paperchalice	c09ed6a29e	[CodeGen][NewPM] Port `MachineVerifier` to new pass manager (#98628 ) - Add `MachineVerifierPass`. - Use complete `MachineVerifierPass` in `VerifyInstrumentation` if possible. `LiveStacksAnalysis` will be added in future, all other analyses are done.	2024-07-15 12:42:44 +08:00
dyung	adaff46d08	Revert "AMDGPU: Move attributor into optimization pipeline (#83131 )" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851 ) This reverts commits 677cc15e0ff2e0e6aa30538eb187990a6a8f53c0 and 78bc1b64a6dc3fb6191355a5e1b502be8b3668e7. The test CodeGenHIP/default-attributes.hip is failing on multiple bots even after the attempted fix including the following: - https://lab.llvm.org/buildbot/#/builders/3/builds/1473 - https://lab.llvm.org/buildbot/#/builders/65/builds/1380 - https://lab.llvm.org/buildbot/#/builders/161/builds/595 - https://lab.llvm.org/buildbot/#/builders/154/builds/1372 - https://lab.llvm.org/buildbot/#/builders/133/builds/1547 - https://lab.llvm.org/buildbot/#/builders/81/builds/755 - https://lab.llvm.org/buildbot/#/builders/40/builds/570 - https://lab.llvm.org/buildbot/#/builders/13/builds/748 - https://lab.llvm.org/buildbot/#/builders/12/builds/1845 - https://lab.llvm.org/buildbot/#/builders/11/builds/1695 - https://lab.llvm.org/buildbot/#/builders/190/builds/1829 - https://lab.llvm.org/buildbot/#/builders/193/builds/962 - https://lab.llvm.org/buildbot/#/builders/23/builds/991 - https://lab.llvm.org/buildbot/#/builders/144/builds/2256 - https://lab.llvm.org/buildbot/#/builders/46/builds/1614 These bots have been broken for a day, so reverting to get everything back to green.	2024-07-14 18:48:54 -07:00
Matt Arsenault	78bc1b64a6	AMDGPU: Move attributor into optimization pipeline (#83131 ) Removing it from the codegen pipeline induces a lot of test churn because llc is no longer optimizing out implicit arguments to kernels. Mostly mechanical, but there are some creative test updates. I preferred to take the changes as-is in tests where the ABI isn't relevant. In cases where it's more relevant, or the optimize out logic was too ingrained in the test, I pre-run the optimization. Some cases manually add attributes to disable inputs.	2024-07-14 08:36:33 +04:00
Scott Linder	9f5756abef	[MIR] Replace bespoke DIExpression parser Resolve FIXME by using the LLParser implementation of parseDIExpression from the MIParser.	2024-07-10 19:26:13 +00:00
Stephen Chou	3c24eb39fb	[LLVM][MIR] Support parsing bfloat immediates in MIR parser (#96010 ) Adds support in MIR parser for parsing bfloat immediates, and adds a test for this.	2024-06-25 16:44:14 -04:00
Alan Zhao	836703087d	[BranchFolder] Fix missing debug info with tail merging (#94715 ) `BranchFolder::TryTailMergeBlocks(...)` removes unconditional branch instructions and then recreates them. However, this process loses debug source location information from the previous branch instruction, even if tail merging doesn't change IR. This patch preserves the debug information from the removed instruction and inserts them into the recreated instruction. Fixes #94050	2024-06-20 10:48:18 -07:00
paperchalice	1bc8b3258e	[NewPM][CodeGen] Port `regallocfast` to new pass manager (#94426 ) This pull request port `regallocfast` to new pass manager. It exposes the parameter `filter` to handle different register classes for AMDGPU. IIUC AMDGPU need to allocate different register classes separately so it need implement its own `--<reg-class>-regalloc`. Now users can use e.g. `-passe=regallocfast<filter=sgpr>` to allocate specific register class. The command line option `--regalloc-npm` is still in work progress, plan to reuse the syntax of passes, e.g. use `--regalloc-npm=regallocfast<filter=sgpr>,greedy<filter=vgpr>` to replace `--sgpr-regalloc` and `--vgpr-regalloc`.	2024-06-07 12:22:42 +08:00
paperchalice	e7939d0df6	[Instrumentation] Support verifying machine function (#90931 ) We need it to test isel related passes. Currently `verifyMachineFunction` is incomplete (no LiveIntervals support), but is enough for testing isel pass, will migrate to complete `MachineVerifierPass` in future.	2024-05-04 09:00:59 +08:00
Fangrui Song	b9ae06ba15	[test] Convert text files from CRLF to LF Skip .pdb, .rc, crlf, and FileCheck/dos-style-eol.txt	2024-05-03 10:09:52 -07:00
David Tellenbach	cf2f32c97f	[MIR] Serialize MachineFrameInfo::isCalleeSavedInfoValid() (#90561 ) In case of functions without a stack frame no "stack" field is serialized into MIR which leads to isCalleeSavedInfoValid being false when reading a MIR file back in. To fix this we should serialize MachineFrameInfo::isCalleeSavedInfoValid() into MIR.	2024-05-01 10:07:51 -07:00
Jonas Paulsson	09bc6abba6	[MachineFrameInfo] Refactoring around computeMaxcallFrameSize() (NFC) (#78001 ) - Use computeMaxCallFrameSize() in PEI::calculateCallFrameInfo() instead of duplicating the code. - Set AdjustsStack in FinalizeISel instead of in computeMaxCallFrameSize().	2024-03-18 10:37:59 -04:00
Sameer Sahasrabuddhe	ec34699f75	[GlobalISel] convergence control tokens and intrinsics (#67006 ) [GlobalISel] Implement convergence control tokens and intrinsics in GMIR In the IR translator, convert the LLVM token type to LLT::token(), which is an alias for the s0 type. These show up as implicit uses on convergent operations. Differential Revision: https://reviews.llvm.org/D158147	2024-03-18 10:34:11 +05:30
Diana Picus	bc6955f18c	[AMDGPU] Don't fix the scavenge slot at offset 0 (#79136 ) At the moment, the emergency spill slot is a fixed object for entry functions and chain functions, and a regular stack object otherwise. This patch adopts the latter behaviour for entry/chain functions too. It seems this was always the intention [1] and it will also save us a bit of stack space in cases where the first stack object has a large alignment. [1] `34c8b835b1`	2024-02-09 09:20:25 +01:00
Nikita Popov	ff9af4c43a	[CodeGen] Convert tests to opaque pointers (NFC)	2024-02-05 14:07:09 +01:00
Quentin Dian	112fba974c	[MIRPrinter] Don't print line break when there is no instructions (NFC) (#80147 ) Per #80143, we can remove the extra line break when there is no instruction.	2024-02-01 22:10:52 +08:00
Quentin Dian	b7738e275d	[MIRPrinter] Don't print space when there is no successor (#80143 ) Extra space causes the checks generated by update_mir_test_checks to be unavailable. ``` # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 4 # RUN: llc -mtriple=x86_64-- -o - %s -run-pass=none -verify-machineinstrs -simplify-mir \| FileCheck %s --- name: foo body: \| ; CHECK-LABEL: name: foo ; CHECK: bb.0: ; CHECK-NEXT: successors: ; CHECK-NEXT: {{ $}} ; CHECK-NEXT: {{ $}} ; CHECK-NEXT: bb.1: ; CHECK-NEXT: RET 0, $eax bb.0: successors: bb.1: RET 0, $eax ... ``` The failure log is as follows: ``` llvm/test/CodeGen/MIR/X86/unreachable-block-print.mir:9:16: error: CHECK-NEXT: is on the same line as previous match ; CHECK-NEXT: {{ $}} ^ <stdin>:21:13: note: 'next' match was here successors: ^ <stdin>:21:13: note: previous match ended here successors: ```	2024-01-31 22:35:41 +08:00
Mirko Brkušanin	1d286ad59b	[AMDGPU] Add mark last scratch load pass (#75512 )	2024-01-18 09:36:44 +01:00
Fangrui Song	9e9907f1cf	[AMDGPU,test] Change llc -march= to -mtriple= (#75982 ) Similar to 806761a7629df268c8aed49657aeccffa6bca449. For IR files without a target triple, -mtriple= specifies the full target triple while -march= merely sets the architecture part of the default target triple, leaving a target triple which may not make sense, e.g. amdgpu-apple-darwin. Therefore, -march= is error-prone and not recommended for tests without a target triple. The issue has been benign as we recognize $unknown-apple-darwin as ELF instead of rejecting it outrightly. This patch changes AMDGPU tests to not rely on the default OS/environment components. Tests that need fixes are not changed: ``` LLVM :: CodeGen/AMDGPU/fabs.f64.ll LLVM :: CodeGen/AMDGPU/fabs.ll LLVM :: CodeGen/AMDGPU/floor.ll LLVM :: CodeGen/AMDGPU/fneg-fabs.f64.ll LLVM :: CodeGen/AMDGPU/fneg-fabs.ll LLVM :: CodeGen/AMDGPU/r600-infinite-loop-bug-while-reorganizing-vector.ll LLVM :: CodeGen/AMDGPU/schedule-if-2.ll ```	2024-01-16 21:54:58 -08:00
Matt Arsenault	c44dca15a4	MachineVerifier: Reject extra non-register operands on instructions (#73758 ) We were allowing extra immediate arguments, and only bothering to check if registers were implicit or not. Also consolidate extra operand checks in verifier, to make this testable. We had 3 different places checking if you were trying to build an instruction with more operands than allowed by the definition. We had an assertion in addOperand, a direct check in the MIRParser to avoid the assertion, and the machine verifier checks. Remove the assert and parser check so the verifier can provide a consistent verification experience, which will also handle instructions modified in place.	2023-11-30 22:33:42 +09:00
Nick Desaulniers	b053359892	[X86InstrInfo] support memfold on spillable inline asm (#70832 ) This enables -regalloc=greedy to memfold spillable inline asm MachineOperands. Because no instruction selection framework marks MachineOperands as spillable, no language frontend can observe functional changes from this patch. That will change once instruction selection frameworks are updated. Link: https://github.com/llvm/llvm-project/issues/20571	2023-11-29 08:18:51 -08:00
Shengchen Kan	c9017bc793	[X86] Support EGPR (R16-R31) for APX (#70958 ) 1. Map R16-R31 to DWARF registers 130-145. 2. Make R16-R31 caller-saved registers. 3. Make R16-31 allocatable only when feature EGPR is supported 4. Make R16-31 availabe for instructions in legacy maps 0/1 and EVEX space, except XSAVE*/XRSTOR RFC: https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031/4 Explanations for some seemingly unrelated changes: inline-asm-registers.mir, statepoint-invoke-ra-enter-at-end.mir: The immediate (TargetInstrInfo.cpp:1612) used for the regdef/reguse is the encoding for the register class in the enum generated by tablegen. This encoding will change any time a new register class is added. Since the number is part of the input, this means it can become stale. seh-directive-errors.s: R16-R31 makes ".seh_pushreg 17" legal musttail-varargs.ll: It seems some LLVM passes use the number of registers rather the number of allocatable registers as heuristic. This PR is to reland #67702 after #70222 in order to reduce some compile-time regression when EGPR is not used.	2023-11-09 23:39:40 +08:00
Michael Maitland	801a30aa8f	[CodeGen][MIR] Support parsing of scalable vectors in MIR (#70893 ) This patch builds on the support for vectors by adding ability to parse scalable vectors in MIR and updates error messages to reflect that ability.	2023-11-02 21:49:18 -04:00
akirchhoff-modular	4480e650b3	[YAMLParser] Improve plain scalar spec compliance (#68946 ) The `YAMLParser.h` header file claims support for YAML 1.2 with a few deviations, but our plain scalar parsing failed to parse some valid YAML according to the spec. This change puts us more in compliance with the YAML spec, now letting us parse plain scalars containing additional special characters in cases where they are not ambiguous.	2023-10-17 11:28:14 -06:00
Nikita Popov	8cc2b51e63	Revert "[X86] Support EGPR (R16-R31) for APX (#67702 )" This reverts commit feea5db01360b477b8cf2df03abfa9fc986633d5. This causes significant compile-time regressions, even if EGPR is not used.	2023-10-10 10:34:49 +02:00
Shengchen Kan	feea5db013	[X86] Support EGPR (R16-R31) for APX (#67702 ) 1. Map R16-R31 to DWARF registers 130-145. 2. Make R16-R31 caller-saved registers. 3. Make R16-31 allocatable only when feature EGPR is supported 4. Make R16-31 availabe for instructions in legacy maps 0/1 and EVEX space, except XSAVE*/XRSTOR RFC: https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031/4 Explanations for some seemingly unrelated changes: inline-asm-registers.mir, statepoint-invoke-ra-enter-at-end.mir: The immediate (TargetInstrInfo.cpp:1612) used for the regdef/reguse is the encoding for the register class in the enum generated by tablegen. This encoding will change any time a new register class is added. Since the number is part of the input, this means it can become stale. seh-directive-errors.s: R16-R31 makes ".seh_pushreg 17" legal musttail-varargs.ll: It seems some LLVM passes use the number of registers rather the number of allocatable registers as heuristic.	2023-10-10 10:51:04 +08:00
Anatoly Trosinenko	eb02ee44d3	[AArch64] Move PAuth codegen down the machine pipeline To simplify handling PAuth in the machine outliner, introduce a separate AArch64PointerAuth pass that is executed after both Prologue/Epilogue Inserter and Machine Outliner passes. After moving to AArch64PointerAuth, signLR and authenticateLR are not used outside of their class anymore, so make them private and simplify accordingly. The new pass is added via AArch64PassConfig::addPostBBSections(), so that it can change the code size before branch relaxation occurs. AArch64BranchTargets is placed there too, so it can take into account any PACI(A\|B)SP instructions and not excessively add BTIs at the start of functions. Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D159357	2023-09-22 14:49:14 +03:00
Diana Picus	5272ae667d	[AMDGPU] Add IsChainFunction to the MachineFunctionInfo This will represent functions with the amdgpu_cs_chain or amdgpu_cs_chain_preserve calling conventions. Differential Revision: https://reviews.llvm.org/D156410	2023-08-21 12:37:32 +02:00
Sameer Sahasrabuddhe	ef38e6d97f	[GlobalISel] introduce MIFlag::NoConvergent Some opcodes in MIR are defined to be convergent by the target by setting IsConvergent in the corresponding TD file. For example, in AMDGPU, the opcodes G_SI_CALL and G_INTRINSIC* are marked as convergent. But this is too conservative, since calls to functions that do not execute convergent operations should not be marked convergent. This information is available in LLVM IR. The new flag MIFlag::NoConvergent now allows the IR translator to mark an instruction as not performing any convergent operations. It is relevant only on occurrences of opcodes that are marked isConvergent in the target. Differential Revision: https://reviews.llvm.org/D157475	2023-08-20 21:14:46 +05:30

1 2 3 4 5 ...

672 Commits