llvm-project

Author	SHA1	Message	Date
Jon Chesterfield	d77ae7f251	[amdgpu] Reimplement LDS lowering Renames the current lowering scheme to "module" and introduces two new ones, "kernel" and "table", plus a "hybrid" that chooses between those three on a per-variable basis. Unit tests are set up to pass with the default lowering of "module" or "hybrid" with this patch defaulting to "module", which will be a less dramatic codegen change relative to the current. This reflects the sparsity of test coverage for the table lowering method. Hybrid is better than module in every respect and will be default in a subsequent patch. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D139433	2022-12-07 22:02:54 +00:00
Chris Bieneman	c861ea8736	Generate DXIL Shader hash DXIL shader bitcode is hashed and the hash is placed into the final output object file in its own data part. This change modifies the DXContainerGlobals pass to compute the shader hash (just an MD5 of the bitcode) and put the shader hash data into a global for the HASH part. This also sets the hash flag as appropriate for if the hashed shader contained debug information. There is additional handling required to get debug information in shaders working correctly with our tooling, but that will be addressed in subsequent patches. Reviewed By: python3kgae Differential Revision: https://reviews.llvm.org/D139357	2022-12-07 15:22:55 -06:00
Craig Topper	2c52d516da	Revert "[RISCV] Return InstSeq from generateInstSeqImpl instead of using an output parameter. NFC" This reverts commit d24915207c631b7cf637081f333b41bc5159c700. Thinking about this more this probably chewed up 100+ bytes of stack for each recursive call. So this probably needs more thought. The code simplification wasn't that much.	2022-12-07 12:59:31 -08:00
Matt Arsenault	90f60a6a73	NVPTX: Cleanup check for denormal mode Go through the common query and be explicit about the supported flush type.	2022-12-07 15:56:21 -05:00
Koakuma	f8f41c3fcd	[SPARC] Lower SELECT_CC to MOVr on 64-bit target whenever possible On 64-bit target, when doing i64 SELECT_CC where one of the comparison operands is a constant zero, try to fold the compare and MOVcc into a MOVr instruction. For all integers, EQ and NE comparisons are available; for signed integers, GT, GE, LT, and LE are also available. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D138922	2022-12-07 15:34:58 -05:00
Brad Smith	7806f86a5e	Revert "[SPARC] Mark the %g0 register as constant & use it to materialize zeros" 2 of the Sparc tests are now failing. This reverts commit 2c41310fc146a1f609147c65ac5f30e5a57e84a8.	2022-12-07 15:27:57 -05:00
Craig Topper	258bb453fb	[RISCV] Without Zfh, promote f16 inputs before creating RISCVISD::FCVT_W(U)_RV64 nodes. This allows us to remove a couple more Zfhmin isel patterns.	2022-12-07 12:25:30 -08:00
Craig Topper	e3540fb948	[RISCV] Promote f16 fp_to_int_sat with Zfhmin during lowering instead of isel. We already have a custom handler for FP_TO_(S/U)INT_SAT. It's easy enough to inject an FP_EXTEND in there.	2022-12-07 11:58:30 -08:00
James Y Knight	099001979f	[SPARC] Simplify instruction decoder. After https://reviews.llvm.org/D137653 named sub-operands can be used in the auto-generated instruction decoders. This allows the auto-generated decoders to work properly, so all the hand-coded decoders in the sparc target can be removed. In some instances, a manually-written decoder had not been implemented for an instruction, and thus that instruction was not decoded properly. These have been fixed (and tests added). Differential Revision: https://reviews.llvm.org/D137727	2022-12-07 14:37:08 -05:00
James Y Knight	372240dfe3	[TableGen] More named sub-operands work. Commit a538d1f13a13 first added support for named sub-operands in CodeEmitterGen. We now add a few more features to that, enabling further target cleanups. 1. Adds support for handling an EncoderMethod in a sub-operand in CodeEmitterGen. Previously, the specified encoder of a sub-operand was ignored, and only the default used. 2. Adds support for sub-operands in DecoderEmitter, along with support for tied sub-operands. The changes to the decoder required a few minor tweaks to a few targets, where existing brokenness was exposed. In order to keep this patch small, I left FIXMEs which will be addressed in upcoming patches. (Except MIPS16, since its object file emission/decoding is totally broken). Differential Revision: https://reviews.llvm.org/D137653	2022-12-07 14:37:08 -05:00
Craig Topper	b12fe0d429	[RISCV] Consolidate identical (fcopysign FPR32:, FPR16:) isel patterns. NFC	2022-12-07 11:35:55 -08:00
Keith Smiley	c9b6d641f0	Fix @llvm.global_ctors docs (NFC)	2022-12-07 11:24:08 -08:00
Tim Northover	6b98824a58	AArch64: emit `fcmp ord %a, zeroinitializer` as a single fcmeq. Most "ord" checks need two real-world compares to implement, but this is the canonical form of a "!isnan" check, which is equivalent to comparing the input for equality against itself.	2022-12-07 19:17:30 +00:00
Koakuma	2c41310fc1	[SPARC] Mark the %g0 register as constant & use it to materialize zeros Materialize zeros by copying from %g0, which is now marked as constant. This makes it possible for some common operations (like integer negation) to be performed in fewer instructions. This continues @arichardson's patch at D132561. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D138887	2022-12-07 13:34:13 -05:00
Joe Nash	bbfbec94b1	[AMDGPU] Enable OMod on more VOP3 instructions OMod was disabled if OpSel was enabled, but that restriction is more specific than necessary. Any VOP3 with float operands can use OMod. On GFX11, FMAC_F16_e64 can use op_sel. Previously, SIFoldOperands and convertToThreeAddress were accidentally correct when they reinterpreted the zero OMod operand on V_FMAC_F16_e64 as the OpSel operand on V_FMA_F16_gfx9_e64. Now we explicitly add op_sel if required. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D139469	2022-12-07 13:30:33 -05:00
Philip Reames	14ea545a7d	[RISCV][InsertVSETVLI] Generalize scalar move rule for when AVL is unchanged By definition, the AVL of the scalar move is equally zero to the prior AVL if they are the same value. This generalizes the existing code to the case where the scalar move has a register AVL which is unknown, but unchanged from the preceding instruction. This doesn't cause any interesting diffs on its own, but another patch makes this case much more common. Split off to reduce a future diff.	2022-12-07 10:28:31 -08:00
Craig Topper	f1fd5c9b36	[RISCV] Remove pseudos for whole register load, store, and move. The MC layer instructions have the correct register classes, and the pseudos don't have any additional operands. So there doesn't seem to be any reason for them to exist. The pseudos were incorrectly going through code in RISCVMCInstLower that converted LMUL>1 register classes to LMUL1 register class. This makes the MCInst technically malformed, and prevented the vl2r.v, vl4r.v, and vl8r.v InstAliases from matching. This accounts for all of the .ll test diffs. Differential Revision: https://reviews.llvm.org/D139511	2022-12-07 10:19:58 -08:00
Haojian Wu	50daddf279	Fix an -Wunused-variable warning in release build, NFC	2022-12-07 18:59:17 +01:00
Craig Topper	938d0d6d7b	[RISCV] Replace uses of hasStdExtC with COrZca. Except MakeCompressible which will need more work. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D139504	2022-12-07 09:34:01 -08:00
Mirko Brkusanin	fe42ebe442	[AMDGPU][GlobalISel] Fix legalizing image intrinsics for new types We no longer need to increase vector size to 16 for intrinsics that use more than 8 vgprs for addr. There is no image intrinsic that needs more than 12 so all currently existing cases will be covered. Using incorrect size was causing an error in instruction selection because instructions were updated to require new types (9x32, 10x32, 11x32, 12x32). Differential Revision: https://reviews.llvm.org/D139546	2022-12-07 18:20:58 +01:00
Simon Pilgrim	de0ff1fcb1	X86SelectionDAGInfo.cpp - move dyn_cast check inside if(). NFC. Minor cleanup - we only use the non-null pointer inside the if() block	2022-12-07 14:21:24 +00:00
Jay Foad	632118a090	[AMDGPU] Use SOP_Pseudo more consistently. NFC. SOPK_Pseudo was not inheriting from SOP_Pseudo at all, and some other Pseudo classes were needlessly redefining things that were already defined by SOP_Pseudo. Differential Revision: https://reviews.llvm.org/D139527	2022-12-07 12:56:42 +00:00
Haojian Wu	a2215149ae	Add implementation isTargetCanonicalConstantNode for hexagon. This fixes an infinite compiling loop caused by https://reviews.llvm.org/D137140 Differential Revision: https://reviews.llvm.org/D139525	2022-12-07 13:21:29 +01:00
Stephen Thomas	ab2e27faa4	[AMDGPU] Small cleanup in insertWaitcntInBlock() Move some code that checks if an instruction is a waitcount into a separate function, mainly to aid readability in the logic where it is used. Differential Revision: https://reviews.llvm.org/D139522	2022-12-07 11:58:59 +00:00
Janek van Oirschot	587747d8d1	[AMDGPU] G_IS_FPCLASS lower() support for IEEE fp types Simplified globalisel version of sdag's expandIS_FPCLASS. Reviewed By: arsenm, #amdgpu Differential Revision: https://reviews.llvm.org/D139128	2022-12-07 11:53:09 +00:00
Anton Sidorenko	f8ed709345	[MachineCombiner] Extend reassociation logic to handle inverse instructions Machine combiner supports generic reassociation only of associative and commutative instructions, for example (A + X) + Y => (X + Y) + A. However, we can extend this generic support to handle patterns like (X + A) - Y => (X - Y) + A, where `-` is the inverse of `+`. This patch adds interface functions to process reassociation patterns of associative/commutative instructions and their inverse variants with minimal changes in backends. Differential Revision: https://reviews.llvm.org/D136754	2022-12-07 13:50:28 +03:00
David Sherwood	93d9c2e563	[SVE] Commonise bfmlal* and fmlal* instruction classes Given the significant commonality between the bfmlal* and fmlal* instructions it makes sense to use just a single class for both. We can do this now that the bfmlal* lane intrinsics take an i32 index. Differential Revision: https://reviews.llvm.org/D138906	2022-12-07 09:30:32 +00:00
Piotr Sobczak	1e3abd82b9	[AMDGPU] Fix wide spills Update spill code to account for new vector types with bit widths: 288, 320, 352, 384. Related to D138205. Differential Revision: https://reviews.llvm.org/D139203	2022-12-07 10:23:52 +01:00
David Sherwood	bfb6f47e9e	[SVE] Change some bfloat lane intrinsics to use i32 immediates Almost all of the other SVE LLVM IR intrinsics take i32 values for lane indices or other immediates. We should bring the bfloat intrinsics in line with that. It will also make it easier to add support for the SVE2.1 float intrinsics in future, since they reuse the same underlying instruction classes. I've maintained backwards compatibility with the old i64 variants and used the autoupgrade mechanism. Differential Revision: https://reviews.llvm.org/D138788	2022-12-07 09:19:54 +00:00
Qiu Chaofan	62f20f51ce	[PowerPC] Support test data class intrinsic of 128-bit float We've exploited test data class instructions introduced in ISA 3.0. This change unifies the scalar intrinsics into ppc_test_data_class and adds support for 128-bit precision float values using xststdcqp. Vector versions of the intrinsic can't be unified because they return vector int instead of int. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D138105	2022-12-07 16:44:12 +08:00
Jay Foad	b4ce9e9521	[AMDGPU] Change handling of s_endpgm's optional operand. NFC. s_endpgm is a special SOPP instruction in that its operand is optional and if it is not present then we don't want to print a space after the mnemonic. Previously this was handled by defaulting real_name to the mnemonic with a trailing space, and having s_endpgm override it to be the mnemonic with no trailing space. This patch implements a different approach where the separator between Mnemonic and AsmOperands defaults to a space, but s_endpgm overrides it to be the empty string. Differential Revision: https://reviews.llvm.org/D139412	2022-12-07 08:12:14 +00:00
Yeting Kuo	0f8c761c48	[VP][RISCV] Recommit "Add vp.fshl/fshr and RISC-V support." This reverts commit 7883e5b061bdbbe8bee5f479ebe911db5045b7e9. The original commit was reverted because it didn't update test files after D136263 landed. The recommit fixes those. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D139509	2022-12-07 15:58:12 +08:00
Xiaodong Liu	6d34074d86	Reland: "[LoongArch] Use tablegen size for getInstSizeInBytes" Correct the pseudo atomic instruction size for branch relaxation and branch folding passes. Inspired by D118175, D118009 and D117970. Depends on D138481 Reviewed By: SixWeining, gonglingqin, xen0n Differential Revision: https://reviews.llvm.org/D138469	2022-12-07 15:51:23 +08:00
Justin Bogner	bcfdaa96f5	[AMDGPU] Handle `min(max(x, y), max(min(x, y), z))` in med3 combines Differential Revision: https://reviews.llvm.org/D139508	2022-12-06 22:59:43 -08:00
Justin Bogner	916ae0a060	[AMDGPU] Handle nnan and fast on the call in fpmed3 patterns We were only allowing these med3 patterns if the operands were known to not be NaN, but we should also allow it if the calls to max/min have the `nnan` or `fast` flags. Differential Revision: https://reviews.llvm.org/D139506	2022-12-06 22:57:52 -08:00
Kazu Hirata	7883e5b061	Revert "[VP][RISCV] Add vp.fshl/fshr and RISC-V support." This reverts commit 70de0e014013b4d97febe6704881a9a8c893d078. I'm seeing: Failed Tests (2): LLVM :: CodeGen/RISCV/rvv/fixed-vectors-fshr-fshl-vp.ll LLVM :: CodeGen/RISCV/rvv/fshr-fshl-vp.ll Also reported at: https://lab.llvm.org/buildbot/#/builders/123/builds/14531	2022-12-06 22:27:43 -08:00
Yeting Kuo	8c8a6e1488	[RISCV] Add basic cost model for vp float rounding instructions. Reviewed By: craig.topper, reames Differential Revision: https://reviews.llvm.org/D137766	2022-12-07 14:15:13 +08:00
Monk Chiang	7b50c18360	[RISCV] Codegen support for Zfhmin. The Zfhmin subset only has FLH, FSH, FMV.X.H, FMV.H.X, FCVT.S.H, and FCVT.H.S. If the D extension is present, the FCVT.D.H and FCVT.H.D instructions are also included. Since most instructions are not included for Zfhmin, most operations are promoted. The patch is primarily about making f16 a legal type. RISC-V ISA info: https://wiki.riscv.org/display/HOME/Recently+Ratified+Extensions Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D139391	2022-12-06 22:14:15 -08:00
Craig Topper	d42c76aba0	[RISCV] Remove trailing whitespace. NFC	2022-12-06 21:20:00 -08:00
Yeting Kuo	70de0e0140	[VP][RISCV] Add vp.fshl/fshr and RISC-V support. The patch made VectorLegalizer expand ISD::VP_FSHL and ISD::VP_FSHR to achieve the codegen. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D138379	2022-12-07 12:16:36 +08:00
Kazu Hirata	405fc404bf	[ADT] Don't include None.h (NFC) These source files no longer use None, so they do not need to include None.h. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-06 20:14:51 -08:00
Michael Maitland	297e95b865	[RISCV][CodeGen] Kill dead pseudo classes and replace with specific LMUL versions. NFC Since changes to account for LMUL in scheduler model existed over patches, we had to keep both LMUL specific and all LMUL classes around. Now that only the LMUL specific classes are used, we can remove the old ones.	2022-12-06 17:11:20 -08:00
Michael Maitland	7b36502854	[RISCV][CodeGen] Account for LMUL for Vector Integer load store instructions It is likely that subtargets act differently for vector load store instructions based on the LMUL. This patch creates separate SchedRead, SchedWrite, WriteRes, ReadAdvance for each relevant LMUL. Differential Revision: https://reviews.llvm.org/D137429	2022-12-06 16:57:35 -08:00
Michael Maitland	1a43227ba5	[RISCV][CodeGen] Account for LMUL for Vector Permutation Instructions It is likely that subtargets act differently for vector fixed-point arithmetic instructions based on the LMUL. This patch creates separate SchedRead, SchedWrite, WriteRes, ReadAdvance for each relevant LMUL. Differential Revision: https://reviews.llvm.org/D137428	2022-12-06 16:54:49 -08:00
Gregory Alfonso	cb38be9ed3	[NFC] Use Register instead of unsigned for variables that receive a Register object Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D139451	2022-12-07 00:23:34 +00:00
Craig Topper	8d30b9e64f	[RISCV] Move VSPILL/VRELOAD expansion for vector tuples to eliminateFrameIndex. We need a scratch GPR to increment the base pointer for each subsequent register. We currently reuse the input GPR for the base pointer without declaring it as a Def of the pseudo. We can't add it as a Def of the pseudo at creation time because it doesn't get register allocated. This was tried in D109405. Seems the only choice we have is to scavenge the GPR. This patch moves the expansion to eliminateFrameIndex where we can create virtual registers that will be scavenged. This also eliminates the extra operand for passing vlenb from frame lowering to expand pseudos. I need to do more testing on real world code, but wanted to get this up for early review. I hope this will fix the issue reported in D123394, but I haven't checked yet. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D139169	2022-12-06 15:42:00 -08:00
Craig Topper	d6cfdf0440	[RISCV] Pass ZB_Undefined to countTrailingZeros/countLeadingZeros. NFC We know the input is not zero so we can simplify the generated code.	2022-12-06 14:57:28 -08:00
Craig Topper	d24915207c	[RISCV] Return InstSeq from generateInstSeqImpl instead of using an output parameter. NFC We should be able to rely on RVO here.	2022-12-06 14:57:27 -08:00
Krzysztof Parzyszek	c589730ad5	[YAML] Convert Optional to std::optional	2022-12-06 12:49:32 -08:00
Amy Kwan	48634b3b93	[NFC][PowerPC] Add NFC fixes to PPCInstrInfo.cpp when getting the defined machine instruction. This patch adds the following NFC fixes to PPCInstrInfo.cpp when getting the DefMI: - Fix documentation error to state that we want to flag a use of register between the def and the MI (in post-RA) - Setting the DefMI to null if the DefMI is neither an LI nor an ADDI (while still being in SSA form). In terms of setting the DefMI to null, this change aims to account for the scenario of when we end up going through all operands on the machine instruction MI and updating OpNoForForwarding accordingly once an ADDI is found as the DefMI. It is possible that once an ADDI is found, we will continue to go through all operands in attempts to find an LI, but end up looking at every operand until we reach the end if we have not yet found an LI. In the case where the end is reached and we never end up finding an LI/ADDI, DefMI would be pointing to the last operand of MI while OpNoForForwarding would still be pointing at the previous ADDI operand found. We reset DefMI to avoid having DefMI point to an instruction that differs from the one represented by OpNoForForwarding. Differential Revision: https://reviews.llvm.org/D137483	2022-12-06 14:23:50 -06:00
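
Two of the commits listed above rest on small identities that are easy to check by hand. Commit bcfdaa96f5 treats `min(max(x, y), max(min(x, y), z))` as a med3 pattern, and commit 6b98824a58 relies on `fcmp ord %a, zeroinitializer` being a "!isnan" check equivalent to comparing the input against itself. The Python sketch below is not LLVM code. It only verifies both identities on sample values, and every function name in it is hypothetical. The median check uses integers only, which sidesteps the NaN concerns that commit 916ae0a060 addresses for the floating-point patterns.

```python
import math
from itertools import product


def med3_pattern(x, y, z):
    """The expression shape named in commit bcfdaa96f5."""
    return min(max(x, y), max(min(x, y), z))


def median3(x, y, z):
    """Reference median: the middle element of the sorted triple."""
    return sorted((x, y, z))[1]


def ord_with_zero(a):
    """Model of 'fcmp ord a, 0.0': true when neither operand is NaN."""
    return not math.isnan(a) and not math.isnan(0.0)


def self_equal(a):
    """Model of a single equality compare of the input against itself."""
    return a == a


# Check the med3 identity exhaustively over a small integer cube.
for x, y, z in product(range(-3, 4), repeat=3):
    assert med3_pattern(x, y, z) == median3(x, y, z)

# Check the ordered-compare identity on ordinary, infinite, and NaN inputs.
for a in (0.0, -0.0, 1.5, float("inf"), float("-inf"), float("nan")):
    assert ord_with_zero(a) == self_equal(a)
```

Both loops finish without raising, which matches the reasoning in the commit messages: the nested min/max expression selects the middle value, and a NaN input is the only value that fails a self-equality compare.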

70086 Commits