llvm-project

Author	SHA1	Message	Date
Ruiling, Song	67c55b1ffc	[AMDGPU] Make max dwords of memory cluster configurable (#119342) We find it helpful to increase the value for graphics workload. Make it configurable so we can experiment with a different value.	2024-12-18 14:17:27 +08:00
Fangrui Song	cd12922235	[test] Change llc -march= to -mtriple= Similar to 806761a7629df268c8aed49657aeccffa6bca449 -march= is error-prone when running on a host whose OS is different.	2024-12-15 13:08:02 -08:00
Fangrui Song	e339f0a9da	[test] Remove redundant -march=x86-64 when target triple is specified in IR	2024-12-15 11:30:14 -08:00
Fangrui Song	40a4cbb0f2	[MIR,test] Change llc -march=x86-64 to -mtriple=x86_64 Similar to 806761a7629df268c8aed49657aeccffa6bca449 -mtriple= specifies the full target triple while -march= merely sets the architecture part of the default target triple (e.g. Windows, macOS). Therefore, -march= is error-prone and not recommended for tests without a target triple. The issue has been benign as these MIR tests do not utilize object file format specific detail, but it's good to change these tests to neighbor files that use -mtriple=x86_64	2024-12-15 11:23:08 -08:00
Fangrui Song	b279f6b098	[NVPTX,test] Change llc -march= to -mtriple= Similar to 806761a7629df268c8aed49657aeccffa6bca449 -mtriple= specifies the full target triple while -march= merely sets the architecture part of the default target triple (e.g. Windows, macOS), leaving a target triple which may not make sense. Therefore, -march= is error-prone and not recommended for tests without a target triple. The issue has been benign as we recognize nvptx{,64}-apple-darwin as ELF instead of rejecting it outright.	2024-12-15 10:45:11 -08:00
Fangrui Song	2208c97c1b	[Hexagon,test] Change llc -march= to -mtriple= Similar to 806761a7629df268c8aed49657aeccffa6bca449 -mtriple= specifies the full target triple while -march= merely sets the architecture part of the default target triple, leaving a target triple which may not make sense. Therefore, -march= is error-prone and not recommended for tests without a target triple. The issue has been benign as we recognize $unknown-apple-darwin as ELF instead of rejecting it outright.	2024-12-15 10:20:22 -08:00
Fangrui Song	ae26f50aea	[test] Change llc -march=mips* to -mtriple=mips* Similar to 806761a7629df268c8aed49657aeccffa6bca449	2024-12-10 22:14:06 -08:00
Michael Maitland	b816c26289	[RISCV][MIR] Move skip-mir-comment-trailing-whitespace.mir into RISCV subdirectory	2024-11-11 12:02:29 -08:00
Michael Maitland	2b58458225	[MIRLexer][RISCV] Eat a space after the Machine comment (#115365) The MIRPrinter emits ` :: ` at the start of a MMO. The MIRLexer eats all the white space after the operand and before the `::` when there is no comment. We need to eat the space after the comment to allow MIRLexer to parse comments on a MMO.	2024-11-11 14:48:31 -05:00
Shilei Tian	6548b6354d	Reapply "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)" This reverts commit ca33649abe5fad93c57afef54e43ed9b3249cd86.	2024-11-08 20:21:16 -05:00
Shilei Tian	ca33649abe	Revert "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)" This reverts commit e215a1e27d84adad2635a52393621eb4fa439dc9 as it broke both hip and openmp buildbots.	2024-11-08 16:36:35 -05:00
Shilei Tian	e215a1e27d	[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)	2024-11-08 13:05:35 -05:00
dyung	bc7e099aa8	Revert "[AMDGPU][MIR] Serialize NumPhysicalVGPRSpillLanes" (#115353) Reverts llvm/llvm-project#115291 Reverting due to test failures on many bots including https://lab.llvm.org/buildbot/#/builders/174/builds/8049	2024-11-07 13:02:51 -05:00
Akshat Oke	21835ee28d	[AMDGPU][MIR] Serialize NumPhysicalVGPRSpillLanes (#115291)	2024-11-07 20:08:36 +05:30
Akshat Oke	e76d9214c8	[AMDGPU] Fix 3495d04 MIR test (#114963) Needed to specify scratchRSrcReg and spreg in order to stop after prologepilog. - Fixes #113129 test failure	2024-11-05 17:11:47 +05:30
Akshat Oke	3495d04560	[AMDGPU][MIR] Serialize SpillPhysVGPRs (#113129)	2024-11-05 13:17:25 +05:30
Thorsten Schütt	4b028773b2	Revert "[GlobalISel] Import samesign flag" (#114256) Reverts llvm/llvm-project#113090	2024-10-30 17:03:17 +01:00
Thorsten Schütt	72b115301d	[GlobalISel] Import samesign flag (#113090) Credits: https://github.com/llvm/llvm-project/pull/111419	2024-10-30 16:34:01 +01:00
Jack Styles	86f76c3b17	[AArch64][Libunwind] Add Support for FEAT_PAuthLR DWARF Instruction (#112171) As part of FEAT_PAuthLR, a new DWARF Frame Instruction was introduced, `DW_CFA_AARCH64_negate_ra_state_with_pc`. This instructs Libunwind that the PC has been used with the signing instruction. This change includes three commits - Libunwind support for the newly introduced DWARF Instruction - CodeGen Support for the DWARF Instructions - Reversing the changes made in #96377. Due to `DW_CFA_AARCH64_negate_ra_state_with_pc`'s requirements to be placed immediately after the signing instruction, this would mean the CFI Instruction location was not consistent with the generated location when not using FEAT_PAuthLR. The commit reverses the changes and makes the location consistent across the different branch protection options. While this does have a code size effect, this is a negligible one. For the ABI information, see here: `853286c7ab/aadwarf64/aadwarf64.rst (id23)`	2024-10-28 08:22:38 +00:00
Akshat Oke	6360652e9f	Reland [AMDGPU] Serialize WWM_REG vreg flag (#110229) (#112492) A reland but not an exact copy as `VRegInfo.Flags` from the parser is now an int8 instead of a vector; so only need to copy over the value.	2024-10-21 13:44:09 +05:30
Peter Collingbourne	3cab8827fd	Revert "[AMDGPU] Serialize WWM_REG vreg flag (#110229)" This reverts commit bec839d8eed9dd13fa7eaffd50b28f8f913de2e2. Caused buildbot failures, e.g. https://lab.llvm.org/buildbot/#/builders/52/builds/2928	2024-10-15 13:18:43 -07:00
Akshat Oke	8b20f1b924	[MIR] Fix tests for flags in register info (#112179) [MIR] Serialize virtual register flags #110228 introduces register flags which appear empty in .mir dumps. Future tests should use `-simplify-mir`.	2024-10-14 18:28:54 +05:30
Akshat Oke	bec839d8ee	[AMDGPU] Serialize WWM_REG vreg flag (#110229)	2024-10-14 14:37:21 +05:30
Akshat Oke	dbfca24b99	[MIR] Serialize virtual register flags (#110228) [MIR] Serialize virtual register flags This introduces target-specific vreg flag serialization. Flags are represented as `uint8_t` and the `TargetRegisterInfo` override provides methods `getVRegFlagValue` to deserialize and `getVRegFlagsOfReg` to serialize.	2024-10-14 14:19:53 +05:30
Stephen Tozer	d826b0c90f	[LLVM] Add HasFakeUses to MachineFunction (#110097) Following the addition of the llvm.fake.use intrinsic and corresponding MIR instruction, two further changes are planned: to add an -fextend-lifetimes flag to Clang that emits these intrinsics, and to have -Og enable this flag by default. Currently, some logic for handling fake uses is gated by the optdebug attribute, which is intended to be switched on by -fextend-lifetimes (and by extension -Og later on). However, the decision was made that a general optdebug attribute should be incompatible with other opt_ attributes (e.g. optsize, optnone), since they all express different intents for how to optimize the program. We would still like to allow -fextend-lifetimes with optsize however (i.e. -Os -fextend-lifetimes should be legal), since it may be a useful configuration and there is no technical reason to not allow it. This patch resolves this by tracking MachineFunctions that have fake uses, allowing us to run passes that interact with them and skip passes that clash with them.	2024-10-04 13:13:30 +01:00
Dominik Montada	d853adee00	[MIR] Fix return value when computed properties conflict with given prop (#109923) This fixes a test failure when expensive checks are enabled. Use the correct return value when computing machine function properties resulted in an error (e.g. when conflicting with explicitly set values). Without this, the machine verifier would crash even in the presence of parsing errors which should have gently terminated execution.	2024-09-25 10:47:14 +02:00
Dominik Montada	8ba334bc4a	[MIR] Allow overriding isSSA, noPhis, noVRegs in MIR input (#108546) Allow setting the computed properties IsSSA, NoPHIs, NoVRegs for MIR functions in MIR input. The default value is still the computed value. If the property is set to false, the computed result is ignored. Conflicting values (e.g. setting IsSSA where the input MIR is clearly not SSA) lead to an error. Closes #37787	2024-09-24 14:21:45 +02:00
gonzalobg	78ae2de4c6	[NVPTX] Load/Store/Fence syncscope support (#106101) Adds "initial" support for `syncscope` to the NVPTX backend `load`/`store`/`fence` instructions. Atomic Read-Modify-Write operations intentionally not supported as part of this initial PR.	2024-09-23 10:18:00 -07:00
Abinaya Saravanan	c010b72e9b	[HEXAGON] AddrModeOpt support for HVX and optimize adds (#106368) This patch does the following: 1. Adds support for optimizing the address mode of HVX load/store instructions. 2. Reduces the value of Add instruction immediates by replacing them with the difference from other Addi instructions that share a common base. For example, if we have the below sequence of instructions: r1 = add(r2,#1024) ... r3 = add(r2,#1152) ... r4 = add(r2,#1280) where the register r2 has the same reaching definition, they get modified to the below sequence: r1 = add(r2,#1024) ... r3 = add(r1,#128) ... r4 = add(r1,#256) 3. Fixes a bug in the pass where the addi instructions were modified based on a predicated register definition, leading to incorrect output. E.g.: INST-1: if (p0) r2 = add(r13,#128) INST-2: r1 = add(r2,#1024) INST-3: r3 = add(r2,#1152) INST-4: r5 = add(r2,#1280) In the above case, since r2's definition is predicated, we do not want to modify the uses of r2 in INST-3/INST-4 with add(r1,#128/256). 4. Fixes a corner case: it looks like we never check whether the offset register is actually live (not clobbered) at the optimization site. Add the check whether it is live at MBB entrance. The rest should have already been verified. 5. Fixes bad codegen: for whatever reason we do the transformation without checking if the value in the register actually reaches the user. This is the second identical fix for this pass. Co-authored-by: Anirudh Sundar <quic_sanirudh@quicinc.com> Co-authored-by: Sergei Larin <slarin@quicinc.com>	2024-09-13 18:48:34 -05:00
Diana Picus	3356208531	Reland "[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic" (#108512) This reverts commit `7792b4ae79`. The problem was a conflict with `e55d6f5ea2` "[AMDGPU] Simplify and improve codegen for llvm.amdgcn.set.inactive (https://github.com/llvm/llvm-project/pull/107889)" which changed the syntax of V_SET_INACTIVE (and thus made my MIR test crash). ...if only we had a merge queue.	2024-09-13 11:54:30 +02:00
Diana Picus	7792b4ae79	Revert "Reland "[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic" (#108054)"" (#108341) Reverts llvm/llvm-project#108173 si-init-whole-wave.mir crashes on some buildbots (although it passed both locally with sanitizers enabled and in pre-merge tests). Investigating.	2024-09-12 10:12:09 +02:00
Diana Picus	703ebca869	Reland "[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic" (#108054)" (#108173) This reverts commit `c7a7767fca`. The buildbots failed because I removed a MI from its parent before updating LIS. This PR should fix that.	2024-09-12 09:11:41 +02:00
Vitaly Buka	c7a7767fca	Revert "[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic" (#108054) Breaks bots, see #105822. Reverts llvm/llvm-project#105822	2024-09-10 09:51:43 -07:00
Diana Picus	44556e64f2	[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic (#105822) This intrinsic is meant to be used in functions that have a "tail" that needs to be run with all the lanes enabled. The "tail" may contain complex control flow that makes it unsuitable for the use of the existing WWM intrinsics. Instead, we will pretend that the function starts with all the lanes enabled, then branches into the actual body of the function for the lanes that were meant to run it, and then finally all the lanes will rejoin and run the tail. As such, the intrinsic will return the EXEC mask for the body of the function, and is meant to be used only as part of a very limited pattern (for now only in amdgpu_cs_chain functions): ``` entry: %func_exec = call i1 @llvm.amdgcn.init.whole.wave() br i1 %func_exec, label %func, label %tail func: ; ... stuff that should run with the actual EXEC mask br label %tail tail: ; ... stuff that runs with all the lanes enabled; ; can contain more than one basic block ``` It's an error to use the result of this intrinsic for anything other than a branch (but unfortunately checking that in the verifier is non-trivial because SIAnnotateControlFlow will introduce an amdgcn.if between the intrinsic and the branch). The intrinsic is lowered to a SI_INIT_WHOLE_WAVE pseudo, which for now is expanded in si-wqm (which is where SI_INIT_EXEC is handled too); however the information that the function was conceptually started in whole wave mode is stored in the machine function info (hasInitWholeWave). This will be useful in prolog epilog insertion, where we can skip saving the inactive lanes for CSRs (since if the function started with all the lanes active, then there are no inactive lanes to preserve).	2024-09-10 13:24:53 +02:00
Carl Ritson	16cda01d22	[AMDGPU] V_SET_INACTIVE optimizations (#98864) Optimize V_SET_INACTIVE by allowing it to run in WWM. Hence WWM sections are not broken up for inactive lane setting. WWM V_SET_INACTIVE can typically be lowered to V_CNDMASK. Some cases require use of exec manipulation V_MOV as in the previous code. GFX9 sees a slight instruction count increase in edge cases due to its smaller constant bus. Additionally avoid introducing exec manipulation and V_MOVs where a source of V_SET_INACTIVE is the destination. This is a common pattern as WWM register pre-allocation often assigns the same register.	2024-09-05 14:39:28 +09:00
Stephen Tozer	9a58b12fe7	[ExtendLifetimes][NFC] Add explicit triple to new fake-use tests Several tests for the new fake use intrinsic are failing on NVPTX buildbots due to relying on behaviour for their expected triple; this commit adds that triple to each of them to prevent failures. Fixes commit 3d08ade (#86149). Example buildbot failures: https://lab.llvm.org/buildbot/#/builders/160/builds/4175 https://lab.llvm.org/buildbot/#/builders/180/builds/4173	2024-08-29 18:43:35 +01:00
Stephen Tozer	3d08ade7bd	[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149) This patch is part of a set of patches that add an `-fextend-lifetimes` flag to clang, which extends the lifetimes of local variables and parameters for improved debuggability. In addition to that flag, the patch series adds a pragma to selectively disable `-fextend-lifetimes`, and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes` for this pointers only. All changes and tests in these patches were written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer) has handled review and merging. The extend lifetimes flag is intended to eventually be set on by `-Og`, as discussed in the RFC here: https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850 This patch implements a new intrinsic instruction in LLVM, `llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand and has no effect other than "using" its operand, to ensure that its operand remains live until after the fake use. This patch does not emit fake uses anywhere; the next patch in this sequence causes them to be emitted from the clang frontend, such that for each variable (or this) a fake.use operand is inserted at the end of that variable's scope, using that variable's value. This patch covers everything post-frontend, which is largely just the basic plumbing for a new intrinsic/instruction, along with a few steps to preserve the fake uses through optimizations (such as moving them ahead of a tail call or translating them through SROA). Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>	2024-08-29 17:53:32 +01:00
Matt Arsenault	b1bcb7ca46	Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851) This reverts commit adaff46d087799072438dd744b038e6fd50a2d78. Drop the -O3 checks from default-attributes.hip. I don't know why they are different on some bots but reverting this is far too disruptive.	2024-07-15 11:51:44 +04:00
paperchalice	c09ed6a29e	[CodeGen][NewPM] Port `MachineVerifier` to new pass manager (#98628) - Add `MachineVerifierPass`. - Use complete `MachineVerifierPass` in `VerifyInstrumentation` if possible. `LiveStacksAnalysis` will be added in future, all other analyses are done.	2024-07-15 12:42:44 +08:00
dyung	adaff46d08	Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851) This reverts commits 677cc15e0ff2e0e6aa30538eb187990a6a8f53c0 and 78bc1b64a6dc3fb6191355a5e1b502be8b3668e7. The test CodeGenHIP/default-attributes.hip is failing on multiple bots even after the attempted fix including the following: - https://lab.llvm.org/buildbot/#/builders/3/builds/1473 - https://lab.llvm.org/buildbot/#/builders/65/builds/1380 - https://lab.llvm.org/buildbot/#/builders/161/builds/595 - https://lab.llvm.org/buildbot/#/builders/154/builds/1372 - https://lab.llvm.org/buildbot/#/builders/133/builds/1547 - https://lab.llvm.org/buildbot/#/builders/81/builds/755 - https://lab.llvm.org/buildbot/#/builders/40/builds/570 - https://lab.llvm.org/buildbot/#/builders/13/builds/748 - https://lab.llvm.org/buildbot/#/builders/12/builds/1845 - https://lab.llvm.org/buildbot/#/builders/11/builds/1695 - https://lab.llvm.org/buildbot/#/builders/190/builds/1829 - https://lab.llvm.org/buildbot/#/builders/193/builds/962 - https://lab.llvm.org/buildbot/#/builders/23/builds/991 - https://lab.llvm.org/buildbot/#/builders/144/builds/2256 - https://lab.llvm.org/buildbot/#/builders/46/builds/1614 These bots have been broken for a day, so reverting to get everything back to green.	2024-07-14 18:48:54 -07:00
Matt Arsenault	78bc1b64a6	AMDGPU: Move attributor into optimization pipeline (#83131) Removing it from the codegen pipeline induces a lot of test churn because llc is no longer optimizing out implicit arguments to kernels. Mostly mechanical, but there are some creative test updates. I preferred to take the changes as-is in tests where the ABI isn't relevant. In cases where it's more relevant, or the optimize out logic was too ingrained in the test, I pre-run the optimization. Some cases manually add attributes to disable inputs.	2024-07-14 08:36:33 +04:00
Scott Linder	9f5756abef	[MIR] Replace bespoke DIExpression parser Resolve FIXME by using the LLParser implementation of parseDIExpression from the MIParser.	2024-07-10 19:26:13 +00:00
Stephen Chou	3c24eb39fb	[LLVM][MIR] Support parsing bfloat immediates in MIR parser (#96010) Adds support in MIR parser for parsing bfloat immediates, and adds a test for this.	2024-06-25 16:44:14 -04:00
Alan Zhao	836703087d	[BranchFolder] Fix missing debug info with tail merging (#94715) `BranchFolder::TryTailMergeBlocks(...)` removes unconditional branch instructions and then recreates them. However, this process loses debug source location information from the previous branch instruction, even if tail merging doesn't change IR. This patch preserves the debug information from the removed instruction and inserts them into the recreated instruction. Fixes #94050	2024-06-20 10:48:18 -07:00
paperchalice	1bc8b3258e	[NewPM][CodeGen] Port `regallocfast` to new pass manager (#94426) This pull request ports `regallocfast` to the new pass manager. It exposes the parameter `filter` to handle different register classes for AMDGPU. IIUC AMDGPU needs to allocate different register classes separately, so it needs to implement its own `--<reg-class>-regalloc`. Now users can use e.g. `-passes=regallocfast<filter=sgpr>` to allocate a specific register class. The command line option `--regalloc-npm` is still a work in progress; the plan is to reuse the syntax of passes, e.g. use `--regalloc-npm=regallocfast<filter=sgpr>,greedy<filter=vgpr>` to replace `--sgpr-regalloc` and `--vgpr-regalloc`.	2024-06-07 12:22:42 +08:00
paperchalice	e7939d0df6	[Instrumentation] Support verifying machine function (#90931) We need it to test isel related passes. Currently `verifyMachineFunction` is incomplete (no LiveIntervals support), but is enough for testing isel pass, will migrate to complete `MachineVerifierPass` in future.	2024-05-04 09:00:59 +08:00
Fangrui Song	b9ae06ba15	[test] Convert text files from CRLF to LF Skip .pdb, .rc, crlf, and FileCheck/dos-style-eol.txt	2024-05-03 10:09:52 -07:00
David Tellenbach	cf2f32c97f	[MIR] Serialize MachineFrameInfo::isCalleeSavedInfoValid() (#90561) In case of functions without a stack frame no "stack" field is serialized into MIR which leads to isCalleeSavedInfoValid being false when reading a MIR file back in. To fix this we should serialize MachineFrameInfo::isCalleeSavedInfoValid() into MIR.	2024-05-01 10:07:51 -07:00
Jonas Paulsson	09bc6abba6	[MachineFrameInfo] Refactoring around computeMaxCallFrameSize() (NFC) (#78001) - Use computeMaxCallFrameSize() in PEI::calculateCallFrameInfo() instead of duplicating the code. - Set AdjustsStack in FinalizeISel instead of in computeMaxCallFrameSize().	2024-03-18 10:37:59 -04:00
Sameer Sahasrabuddhe	ec34699f75	[GlobalISel] convergence control tokens and intrinsics (#67006) [GlobalISel] Implement convergence control tokens and intrinsics in GMIR In the IR translator, convert the LLVM token type to LLT::token(), which is an alias for the s0 type. These show up as implicit uses on convergent operations. Differential Revision: https://reviews.llvm.org/D158147	2024-03-18 10:34:11 +05:30


688 Commits