llvm-project

Author	SHA1	Message	Date
Yeting Kuo	d83620d101	[RISCV] Support vector strict_fsetcc/fsetccs. The patch supports vector strict_fsetcc/fsetccs. Instead of preserving fflags, the method used to implement scalar quiet compares, the patch implements quiet compares by masking the signaling compares when either input is NaN [0]. [0]: https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#vector-floating-point-compare-instructions Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D147998	2023-04-14 09:10:41 +08:00
Yeting Kuo	6858a920b8	[RISCV] Support vector type strict_[su]int_to_fp and strict_fp_to_[su]int. The patch also loosens the fixed vector constraint in llvm/lib/IR/Verifier.cpp. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D147380	2023-04-06 10:09:44 +08:00
Dinar Temirbulatov	7f05bdf4ee	[AArch64][SME] Fix an infinite loop in DAGCombine related to adding -force-streaming-compatible-sve flag. The compiler hits an infinite loop in DAGCombine. For force-streaming-compatible-sve mode we have custom lowering for 128-bit vector splats, and later in DAGCombiner::SimplifyVCastOp() we scalarized SPLAT because we have custom lowering for SME. Later, we restored the SPLAT operation via performMulCombine().	2023-04-05 10:10:55 +00:00
Craig Topper	219ff07f72	[Targets] Rename Flag->Glue. NFC Long long ago Glue was called Flag, and it was never completely renamed.	2023-04-02 19:28:51 -07:00
Luke Lau	ec26c9cdc0	[RISCV] Lower fixed length interleaved accesses via vssegN/vlsegN This enables the interleaved access pass on O1 and above, and causes interleaving/deinterleaving shuffles of fixed length vectors with stores/loads to be lowered into vssegN/vlsegN. We need to be careful and make sure that we only lower vsseg/vlseg whenever we know the fixed vector type will fit within the minimum vlen, and that the interleaving factor is supported for the given LMUL. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D145085	2023-04-02 16:47:44 +01:00
Luke Lau	80f3be9603	Revert "[RISCV] Lower fixed length interleaved accesses via vssegN/vlsegN" This reverts commit b95913e8c3a3521b85d689a358e620d89a4e83de.	2023-04-02 15:56:24 +01:00
Luke Lau	b95913e8c3	[RISCV] Lower fixed length interleaved accesses via vssegN/vlsegN This enables the interleaved access pass on O1 and above, and causes interleaving/deinterleaving shuffles of fixed length vectors with stores/loads to be lowered into vssegN/vlsegN. We need to be careful and make sure that we only lower vsseg/vlseg whenever we know the fixed vector type will fit within the minimum vlen, and that the interleaving factor is supported for the given LMUL. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D145085	2023-04-02 15:20:21 +01:00
Yeting Kuo	84c8c2b4b4	[DAG][RISCV] Allow scalable vector ISD::STRICT_FP_ROUND and support vector ISD::STRICT_FP_ROUND for RISC-V. The patch customized lower vector type ISD::STRICT_FP_ROUND to RISCVISD::STRICT_FP_ROUND. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D147113	2023-03-30 08:20:02 +08:00
Yeting Kuo	5cb4619a1f	[RISCV][NFC] Fix ident in RISCVISelLowering.h. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D147120	2023-03-29 16:11:10 +08:00
Yeting Kuo	0676c6d91f	[RISCV] Support vector type strict_fma. Like D145900, the patch also supports fixed vector strict_fma nodes in RISC-V by customized lowering them to riscv_strict_vfmadd_vl nodes. riscv_strict_vfmadd_vl is created to avoid some riscv_vfmadd_vl optimizations happening to original strict_fma nodes. The patch also adds combine patterns for riscv_strict_fmadd_vl nodes with negation operands. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D146939	2023-03-28 09:01:46 +08:00
Craig Topper	29463612d2	[RISCV] Replace RISCV -> RISC-V in comments. NFC To be consistent with RISC-V branding guidelines https://riscv.org/about/risc-v-branding-guidelines/ Think we should be using RISC-V where possible. More patches will follow. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D146449	2023-03-27 09:50:17 -07:00
Yeting Kuo	946d29e7e9	[RISCV] Support vector type strict_fsqrt. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D146911	2023-03-27 14:02:22 +08:00
Craig Topper	b50c6857a4	[RISCV] Move fli selection in RISCVISelDAGToDAG.cpp. NFC We custom isel for ConstantFP that has higher priority than isel patterns. We were previously detecting valid FP constants for fli to early exit from the custom code. This detection called getLoadFPImm. Then we would run the isel patterns which would call getLoadFPImm a second time. With a little bit more code we can directly select the fli instruction in the custom handler and avoid a second call. Remove the incorrect mayRaiseFPException flag from the FLI instructions. Reviewed By: joshua-arch1 Differential Revision: https://reviews.llvm.org/D146093	2023-03-21 19:33:27 -07:00
Yeting Kuo	9637e950cb	[RISCV] Support ISD::STRICT_FADD/FSUB/FMUL/FDIV for vector types. The patch handles fixed type strict-fp by new RISCVISD::STRICT_ prefixed isd nodes. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D145900	2023-03-15 07:47:16 +08:00
LiaoChunyu	eb54254b6e	[RISCV] Return false from shouldFormOverflowOp when type is i8 and i16 i8 and i16 are not using overflow. Reduce the number of zero extension instructions. To reduce the uncertainty of the unknown, most of the checks of the virtual function are kept Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D143646	2023-03-14 20:42:55 +08:00
Craig Topper	30705e9770	[RISCV] Support Zfa fli instructions with vector splats. -Return false from RISCVDAGToDAGISel::selectFPImm for fli constants so we don't try to use integer expansion. -Support fli.h with Zvfh+Zfhmin. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D145766	2023-03-10 09:16:21 -08:00
Yeting Kuo	b2c48559c8	[IR][DAG][RISCV] Allow scalable vector ISD::STRICT_FP_EXTEND and RISC-V supports for vector ISD::STRICT_FP_EXTEND. The patch mainly does two things. The first is allowing scalable vector ISD::STRICT_FP_EXTEND. The second is making RISC-V customized lower strict_fpextend to riscv_strict_fpextend_vl, the strict version of riscv_fpextend_vl. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D145548	2023-03-09 17:37:59 +08:00
LiaoChunyu	fbace95408	[RISCV] Enable preferZeroCompareBranch to optimize branch on zero in codegenprepare Similar to ARM and SystemZ. Related patches: D101778 (preferZeroCompareBranch) https://reviews.llvm.org/rG9a9421a461166482465e786a46f8cced63cd2e9f (== 0 to u< 1) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D142071	2023-02-28 14:43:40 +08:00
Manolis Tsamis	f6262201d8	[RISCV] Add vendor-defined XTheadMemIdx (Indexed Memory Operations) extension The vendor-defined XTHeadMemIdx (no comparable standard extension exists at the time of writing) extension adds indexed load/store instructions as well as load/store and update register instructions. It is supported by the C9xx cores (e.g., found in the wild in the Allwinner D1) by Alibaba T-Head. The current (as of this commit) public documentation for this extension is available at: https://github.com/T-head-Semi/thead-extension-spec/releases/download/2.2.2/xthead-2023-01-30-2.2.2.pdf Support for these instructions has already landed in GNU Binutils: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=27cfd142d0a7e378d19aa9a1278e2137f849b71b Depends on D144002 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D144249	2023-02-24 00:17:58 +01:00
Luke Lau	8d15e7275f	[RISCV] Lower interleave and deinterleave intrinsics Lower the two intrinsics introduced in D141924. These intrinsics can be combined with loads and stores into the much more efficient segmented load and store instructions in a following patch. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D144092	2023-02-23 16:23:02 +00:00
Philip Reames	9168c98553	[RISCV] Extract a helper routine for computing (runtime) VLMax [nfc]	2023-02-21 09:55:59 -08:00
Manolis Tsamis	bbb58a2302	[RISCV] Add vendor-defined XTheadMemPair (two-GPR Memory Operations) extension The vendor-defined XTHeadMemPair (no comparable standard extension exists at the time of writing) extension adds two-GPR load/store pair instructions. It is supported by the C9xx cores (e.g., found in the wild in the Allwinner D1) by Alibaba T-Head. The current (as of this commit) public documentation for this extension is available at: https://github.com/T-head-Semi/thead-extension-spec/releases/download/2.2.2/xthead-2023-01-30-2.2.2.pdf Support for these instructions has already landed in GNU Binutils: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=6e17ae625570ff8f3c12c8765b8d45d4db8694bd Depends on D143847 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D144002	2023-02-21 12:21:49 +01:00
Luke Lau	7e2f2f0fc8	[RISCV][NFC] Make a note of the operands for RISCVISD::VNSRL_VL Split out from https://reviews.llvm.org/D144092 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D144387	2023-02-21 09:45:09 +00:00
Philipp Tomsich	16a66af0a0	Revert "[RISCV] Add vendor-defined XTheadMemPair (two-GPR Memory Operations) extension" This reverts commit d2918544a7fc4b5443879fe12f32a712e6dfe325.	2023-02-17 19:45:55 +01:00
Manolis Tsamis	d2918544a7	[RISCV] Add vendor-defined XTheadMemPair (two-GPR Memory Operations) extension The vendor-defined XTHeadMemPair (no comparable standard extension exists at the time of writing) extension adds two-GPR load/store pair instructions. It is supported by the C9xx cores (e.g., found in the wild in the Allwinner D1) by Alibaba T-Head. The current (as of this commit) public documentation for this extension is available at: https://github.com/T-head-Semi/thead-extension-spec/releases/download/2.2.2/xthead-2023-01-30-2.2.2.pdf Support for these instructions has already landed in GNU Binutils: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=6e17ae625570ff8f3c12c8765b8d45d4db8694bd Depends on D143847 Differential Revision: https://reviews.llvm.org/D144002	2023-02-17 19:45:22 +01:00
Matt Arsenault	09dd4d870e	DAG: Remove hasBitPreservingFPLogic This doesn't make sense as an option. fneg and fabs are bit preserving by definition. If a target has some fneg or fabs instruction that are not bitpreserving it's incorrect to lower fneg/fabs to use it.	2023-02-14 10:25:24 -04:00
Roland McGrath	34b21e817f	[RISCV] Use OS-specific stack-guard ABI for Fuchsia Fuchsia provides a slot relative to tp for the stack-guard value, which is cheaper to materialize than the default GOT load. Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D143353	2023-02-05 18:45:59 -08:00
Alex Bradbury	ae14754612	[RISCV] Implement isMultiStoresCheaperThanBitsMerge hook Grabs the same logic and reasoning from the X86 implementation of the hook. The benefit is slightly less clear for when the soft float ABI is used (i.e. there's no transfer from an FPR to a GPR), but I've opted not to gate it based on ABI. Differential Revision: https://reviews.llvm.org/D140408	2023-01-31 12:47:48 +00:00
Luke Lau	f5a6447196	[RISCV] Combine FP_TO_INT to vfwcvt/vfncvt Adds new pseudo instructions to make sure that the fcvt instructions have all rounding mode (RM) and unsigned (XU) variants across single-width, widening and narrowing conversions. And likewise, extends the VL patterns to accompany them. We don't add new VL nodes for the widening/narrowing conversions though, instead we just add specific patterns for vfcvts on those wider/narrower types. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D142102	2023-01-24 09:44:57 +00:00
Luke Lau	a0d80c2398	[RISCV] Generalize performFP_TO_INTCombine to vectors Like in the scalar domain, combine calls to (fp_to_int (ftrunc X)) on scalable and fixed-length vectors into a single vfcvt instruction. For truncating rounds, the static vfcvt.rtz rounding mode is used. Otherwise use the VFCVT_RM_ variants to set the rounding mode dynamically. Closes https://github.com/llvm/llvm-project/issues/56737 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D141599	2023-01-18 10:53:24 +00:00
Roman Lebedev	cc39c3b17f	[Codegen][LegalizeIntegerTypes] New legalization strategy for scalar shifts: shift through stack https://reviews.llvm.org/D140493 is going to teach SROA how to promote allocas that have variably-indexed loads. That does bring up questions of cost model, since that requires creating wide shifts. Indeed, our legalization for them is not optimal. We either split it into parts, or lower it into a libcall. But if the shift amount is a multiple of CHAR_BIT, we can also legalize it through the stack. The basic idea is very simple: 1. Get a stack slot 2x the width of the shift type 2. store the value we are shifting into one half of the slot 3. pad the other half of the slot: with zero for logical shifts, with the sign bit for arithmetic shifts 4. index into the slot (starting from the base half into which we spilled, either upwards or downwards) 5. load 6. split loaded integer This works for both little-endian and big-endian machines: https://alive2.llvm.org/ce/z/YNVwd5 And better yet, if the original shift amount was not a multiple of CHAR_BIT, we can just shift by that remainder afterwards: https://alive2.llvm.org/ce/z/pz5G-K I think, if we are going to perform shift->shift-by-parts expansion more than once, we should instead go through the stack, which is what this patch does. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D140638	2023-01-14 19:12:18 +03:00
Yeting Kuo	5280d3e738	[RISCV] Teach lowerCTLZ_CTTZ_ZERO_UNDEF to handle conversion i32/i64 vectors to f32 vectors. Previously lowerCTLZ_CTTZ_ZERO_UNDEF converted the source to a float value by ISD::UINT_TO_FP. ISD::UINT_TO_FP uses the dynamic rounding mode, so the rounding may make the exponent of the result not as expected when converting i32/i64 to f32. This is the reason why we constrained lowerCTLZ_CTTZ_ZERO_UNDEF to only handle an i32 source when the f64 type having the same element count as the source is legal. The patch teaches lowerCTLZ_CTTZ_ZERO_UNDEF to convert i32/i64 vectors to f32 vectors by vfcvt.f.xu.v with RTZ rounding mode. Using RTZ makes sure the exponent of the result is correct, although f32 cannot exactly represent every value in i32/i64. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D140782	2023-01-12 14:42:47 +08:00
Francesco Petrogalli	ac1ffd3cac	[TargetParser] Generate the defs for RISCV CPUs using llvm-tblgen. Rework the change to prevent build failures. NFCI. The failing code was submitted as cf7a8305a2b4ddfd299c748136cb9a2960ef7089 and reverted via 8bd65e535fb33bc48805bafed8217b16a853e158. The rework in this new commit prevents failures like the following: FAILED: tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/Targets/RISCV.cpp.o /usr/bin/c++ [bunch of non interesting stuff] -c <path-to>/llvm-project/clang/lib/Basic/Targets/RISCV.cpp In file included from <path-to>/llvm-project/clang/lib/Basic/Targets/RISCV.cpp:19: <path-to>/llvm-project/llvm/include/llvm/TargetParser/RISCVTargetParser.h:29:10: fatal error: llvm/TargetParser/RISCVTargetParserDef.inc: No such file or directory 29 \| #include "llvm/TargetParser/RISCVTargetParserDef.inc" \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ These failures happen because the library LLVMTargetParser depends on RISCVTargetParserTableGen, which is a tablegen target that generates the list of CPUs in llvm/TargetParser/RISCVTargetParserDef.inc. This *.inc file is included by the public header file llvm/TargetParser/RISCVTargetParser.h. The header file llvm/TargetParser/RISCVTargetParser.h is also used in components (clangDriver and clangBasic) that link into LLVMTargetParser, but on some configurations such components might end up being built before TargetParser is ready. The fix is to make sure that clangDriver and clangBasic depend on the tablegen target RISCVTargetParserTableGen, which generates the .inc file whether or not LLVMTargetParser is ready. WRT the original patch at https://reviews.llvm.org/D137517, this commit is just adding RISCVTargetParserTableGen in the DEPENDS list of clangDriver and clangBasic.	2023-01-11 11:18:44 +01:00
Francesco Petrogalli	8bd65e535f	Revert "[TargetParser] Generate the defs for RISCV CPUs using llvm-tblgen." This reverts commit cf7a8305a2b4ddfd299c748136cb9a2960ef7089.	2023-01-11 10:22:56 +01:00
Francesco Petrogalli	cf7a8305a2	[TargetParser] Generate the defs for RISCV CPUs using llvm-tblgen. This patch removes the file `llvm/include/llvm/TargetParser/RISCVTargetParser.def` and replaces it with a tablegen-generated `.inc` file out of `llvm/lib/Target/RISCV/RISCV.td`. The module system has been updated to make sure we can build clang/llvm with `-DLLVM_ENABLE_MODULES=On` Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D137517	2023-01-11 10:00:04 +01:00
jacquesguan	db3f3243bb	[RISCV] Use vfirst.m to extract the first element from mask vector. This patch uses vfirst.m to extract the first bit of mask. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D139512	2023-01-03 11:24:18 +08:00
Anton Sidorenko	37f9eec142	[RISCV] Allow conversion of fp divisions to fp multiplications by the reciprocal If the divisor is repeated at least twice, we will convert the FDIVs to the calculation of the reciprocal and FMULs. We perform the transformation only under fast-math mode. FDIVs must have 'arcp' flag. Differential Revision: https://reviews.llvm.org/D140024	2022-12-15 13:00:36 +03:00
jacquesguan	c2f199fa48	[DAGCombiner] Scalarize extend/truncate for splat vector. This revision scalarizes extend/truncate for splat vector. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D122875	2022-12-12 14:53:10 +08:00
jacquesguan	f7a46aa8fb	[RISCV] Fold vector binary operation into select with identity constant. This patch implements shouldFoldSelectWithIdentityConstant for RISCV. It tries to generate a vmerge after the binary instruction so that the two can later be folded into a masked instruction. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D131551	2022-12-06 11:19:31 +08:00
Fangrui Song	b0df70403d	[Target] llvm::Optional => std::optional The updated functions are mostly internal with a few exceptions (virtual functions in TargetInstrInfo.h, TargetRegisterInfo.h). To minimize changes to LLVMCodeGen, GlobalISel files are skipped. https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-04 22:43:14 +00:00
Krzysztof Parzyszek	864aaa21b4	TargetLowering: convert Optional to std::optional	2022-12-01 16:19:10 -08:00
Philip Reames	7d82c99403	[RISCV][TTI] Account for constant materialization cost when costing arithmetic operations At the IR level, we generally assume that constants are free to materialize. However, for RISCV due to some quirks of the ISA, materializing arbitrary constants can be rather expensive. We frequently fall back to constant pool loads. We've been slowly moving in the direction of modeling the cost of the remat as part of the instruction cost. This has the effect of disincentivizing vectorization - mostly SLP - when we'd have to materialize an expensive constant. We need better modeling of which constants are expensive and which are not, but for the moment let's be consistent with how we model arithmetic and memory instructions. The difference between the two is that arithmetic can sometimes fold a splat operation, which stores cannot. Differential Revision: https://reviews.llvm.org/D138941	2022-11-30 07:20:51 -08:00
Philip Reames	b25672ba82	[RISCV] Separate out helper for checking if vector splat supported for operand [nfc]	2022-11-29 11:05:46 -08:00
Stanislav Mekhanoshin	bcaf31ec3f	[AMDGPU] Allow finer grain control of an unaligned access speed A target can return whether a misaligned access is 'fast' as defined by the target or not. In reality there can be different levels of 'fast' and 'slow'. This patch changes the boolean 'Fast' argument of the allowsMisalignedMemoryAccesses family of functions to an unsigned representing its speed. A target can still define it as it wants, and the direct translation of the current code uses 0 and 1 for the current false and true. This makes the change an NFC. A subsequent patch will start using an actual speed value in the load/store vectorizer to check whether a vectorized access is going to be not just fast, but no slower than before. Differential Revision: https://reviews.llvm.org/D124217	2022-11-17 09:23:53 -08:00
Yeting Kuo	ed9638c44b	[VP][RISCV] Add vp.nearbyint and RISC-V support. nearbyint must execute without raising exceptions. To avoid modifying fflags, the patch adds a new machine opcode PseudoVFROUND_NOEXCEPT_V that expands to vfcvt.x.f.v and vfcvt.f.x.v between a pair of frflags and fsflags. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D137685	2022-11-16 14:05:35 +08:00
Craig Topper	dde8423f21	[RISCV] Expand i32 abs to negw+max at isel. This adds a RISCVISD::ABSW to remember that we started with an i32 abs. Previously we used a DAG combine of (sext_inreg (abs)) to delay emitting a freeze from type legalization in order to make ComputeNumSignBits optimizations work on other promoted nodes. This new approach always uses negw+max even if the result doesn't need to be sign extended. This helps the RISCVSExtWRemoval pass if the sext.w is in another basic block.	2022-11-14 19:44:05 -08:00
Craig Topper	6254495c6b	[RISCV] Move RVVBitsPerBlock to TargetParser.h so we can use it in clang. NFC Differential Revision: https://reviews.llvm.org/D137266	2022-11-02 13:09:14 -07:00
Yeting Kuo	71e4e35581	[VP][RISCV] Add vp.rint and RISC-V support. FRINT uses the dynamic rounding mode instead of a static rounding mode. The patch renames VFCVT_X_F_VL to VFCVT_RM_X_F_VL for static rounding mode uses and adds a new ISD node VFCVT_X_F_VL that is directly selected to PseudoVFCVT_X_F_V. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D136662	2022-11-01 14:52:47 +08:00
Craig Topper	e94dc58dff	[RISCV] Inline scalar ceil/floor/trunc/rint/round/roundeven. This avoids the call overhead as well as the save/restore of fflags and the snan handling in the libm function. The save/restore of fflags and snan handling are needed to be correct for -ftrapping-math. I think we can ignore them in the default environment. The inline sequence will generate an invalid exception for nan and an inexact exception if fractional bits are discarded. I've used a custom inserter to explicitly create the control flow around the float->int->float conversion. We can probably avoid the final fsgnj after the conversion for no signed zeros FMF, but I'll leave that for future work. Note the comparison constant is slightly different than glibc uses. They use 1<<53 for double, I'm using 1<<52. I believe either is valid. Numbers >= 1<<52 can't have any fractional bits. It's ok to do the float->int->float conversion on numbers between 1<<53 and 1<<52 since they will all fit in 64. We only have a problem if the double can't fit in i64. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D136508	2022-10-26 14:36:49 -07:00
Philip Reames	60c91fd364	[RISCV] Disallow scale for scatter/gather RISCV doesn't actually support a scaled form of indexed load and store. We previously handled this by forming the scaled SDNode, and then doing custom legalization during lowering. This patch instead adds a callback via TLI to prevent formation entirely. This has two effects: * First, the GEP gets expanded (and used). Instead of the shift being created with an SDLoc of the memory operation, it has the SDLoc of the GEP instruction. This avoids the scheduler perturbing IR order when there's no reason to. * Second, we fix what appears to be a bug in index calculation with RV32. The rules for GEPs require index calculation be done in particular bitwidth, and it appears the custom legalization code got this wrong for the case where index type exceeds pointer width. (Or at least, I trust the generic GEP lowering to be correct a lot more.) The DAGCombiner change to handle VPScatter/VPGather is technically separate, but is required to prevent a regression on those intrinsics. Differential Revision: https://reviews.llvm.org/D134382	2022-09-22 15:31:26 -07:00
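
The "shift through stack" steps listed in cc39c3b17f can be modelled outside the compiler. The sketch below is only an illustration in Python, not the LegalizeIntegerTypes code: the function name `shift_through_stack`, its parameters, and the use of a little-endian `bytes` object as the "stack slot" are all assumptions made for the example.

```python
def shift_through_stack(value: int, amount: int, width: int = 128,
                        kind: str = "lshr") -> int:
    """Model a `width`-bit shift done via a 2x-wide byte slot (hypothetical helper).

    kind is "lshr", "ashr" or "shl". Little-endian layout is assumed.
    """
    assert width % 8 == 0 and 0 <= amount < width
    nbytes = width // 8
    mask = (1 << width) - 1
    value &= mask
    byte_off, bit_rem = divmod(amount, 8)  # whole bytes, leftover bits
    raw = value.to_bytes(nbytes, "little")

    if kind == "shl":
        # Value goes in the high half, zero padding in the low half;
        # index downwards from the base of the spilled half.
        slot = bytes(nbytes) + raw
        start = nbytes - byte_off
    else:
        # Value goes in the low half; pad with zero (logical) or sign bytes.
        negative = kind == "ashr" and bool(value >> (width - 1))
        slot = raw + (b"\xff" if negative else b"\x00") * nbytes
        start = byte_off

    # "Load" a width-bit integer from the computed byte offset.
    loaded = int.from_bytes(slot[start:start + nbytes], "little")

    # Finish with an ordinary shift by the sub-byte remainder.
    if kind == "shl":
        return (loaded << bit_rem) & mask
    if kind == "ashr" and loaded >> (width - 1):
        loaded -= 1 << width  # reinterpret as signed before shifting
    return (loaded >> bit_rem) & mask
```

For example, `shift_through_stack(0x1234, 4, 16, "lshr")` gives `0x123`. The byte offset selects where to load from, and only the remainder modulo 8 needs a real shift.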

303 Commits
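
The inline rounding described in e94dc58dff rests on the observation that doubles with magnitude at or above 1<<52 have no fractional bits, so only smaller values need the float->int->float round trip. The following Python sketch illustrates that idea only. The name `inline_round`, the `mode` parameter, and the use of `math.floor`/`math.ceil`/`math.trunc` in place of a conversion with a static rounding mode are assumptions, and floating-point exception flags are not modelled.

```python
import math

TWO_52 = 2.0 ** 52  # doubles at or above this magnitude are already integral


def inline_round(x: float, mode: str = "floor") -> float:
    """Round a double via an int round trip (hypothetical helper)."""
    # NaN fails this comparison as well, so NaN, infinities and large
    # values skip the conversion and are returned unchanged.
    if not (abs(x) < TWO_52):
        return x
    to_int = {"floor": math.floor, "ceil": math.ceil, "trunc": math.trunc}[mode]
    as_int = to_int(x)  # float -> int; always fits in 64 bits on this path
    # int -> float, then copy the sign of the input back so that
    # e.g. trunc(-0.5) produces -0.0 rather than +0.0.
    return math.copysign(float(as_int), x)
```

The final `copysign` plays the role of the fsgnj mentioned in the commit message: without it, negative inputs that round to zero would lose their sign.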