llvm-project

Author	SHA1	Message	Date
Serge Pavlov	462d5830da	[GlobalISel] Add support for *_fpmode intrinsics The change implements support of the intrinsics `get_fpmode`, `set_fpmode` and `reset_fpmode` in Global Instruction Selector. Now they are lowered into library function calls. Differential Revision: https://reviews.llvm.org/D158260	2023-10-09 21:14:07 +07:00
Matt Arsenault	1328a8534b	AMDGPU: Fix handling of -0 in round lowering (#65761)	2023-09-19 09:14:17 +03:00
Allen	eaf23b2480	[GIsel][AArch64] Legalize <2 x i16> for G_INSERT_VECTOR_ELT (#65830) Widen the vector elements to 64 bits to make sure it is legal, instead of clamping the number of elements. Depend on D153394. Fixes https://github.com/llvm/llvm-project/issues/63826	2023-09-12 21:15:01 +08:00
Jay Foad	71ca53b6cf	[GlobalISel] Lower G_SHUFFLE_VECTOR with scalar result (#65275)	2023-09-04 13:32:43 -04:00
Matt Arsenault	b14e83d1a4	IR: Add llvm.exp10 intrinsic We currently have log, log2, log10, exp and exp2 intrinsics. Add exp10 to fix this asymmetry. AMDGPU already has most of the code for f32 exp10 expansion implemented alongside exp, so the current implementation is duplicating nearly identical effort between the compiler and library which is inconvenient. https://reviews.llvm.org/D157871	2023-09-01 19:45:03 -04:00
David Green	58a2f839fd	[AArch64][GISel] Expand coverage of FDiv and move into place. This adds some more extensive test coverage for fdiv through global isel, switching the opcodes to use the more complete ActionDefinitions to handle more cases and moving it into the position of the existing code which is no longer needed.	2023-08-30 22:09:53 +01:00
David Green	ef0b8cf3f4	[AArch64][GISel] Expand coverage of FAdd and FSub. This adds some more extensive test coverage for fadd/fsub through global isel, switching the opcodes to use the more complete ActionDefinitions to handle more cases.	2023-08-23 09:51:06 +01:00
Tuan Chuong Goh	a40c984976	[AArch64][GlobalISel] Support more legal types for EXTEND Expand (s/z/any)ext instructions to be compatible with more types for GlobalISel. This patch mainly focuses on 64-bit and 128-bit vectors with element size of powers of 2. It also notably handles larger than legal vectors. Differential Revision: https://reviews.llvm.org/D157113	2023-08-21 09:51:17 +01:00
Craig Topper	c6dee6982f	[GlobalISel][Mips] Sync G_UADDE and G_USUBE legalization with LegalizeDAG. This modifies the G_UADDE legalization to a version that looks shorter on Mips and RISC-V when feeding the equivalent IR to SelectionDAG. This also removes the boolean select from G_USUBE. Comments taken from LegalizeDAG and tweaked. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D158232	2023-08-17 20:36:55 -07:00
Jie Fu	d1a4b8c56f	[GlobalISel] Remove unused variable 'Or' (NFC) /Users/jiefu/llvm-project/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:3450:10: error: unused variable 'Or' [-Werror,-Wunused-variable] auto Or = MIRBuilder.buildOr(CarryOut, And, Res_ULT_LHS); ^ 1 error generated.	2023-08-18 06:40:41 +08:00
Craig Topper	ebb2e5ebb2	[GlobalISel][Mips] Correct corner case in G_UADDE legalization. If carryin was 1, and RHS is 0xffffffff we were not giving a carry out. In that case Res would be equal to LHS, so Res <u LHS would be false. But there should be a carry out since carryin+RHS wraps around to 0. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D157943	2023-08-17 15:06:16 -07:00
David Green	cf65afbf93	[AArch64][GISel] Extend lowering for fp round intrinsics. This extends the lowering of ceil, floor, nearbyint, rint, round, roundeven and trunc. They are all very similar, so can reuse the same legalization info. selectIntrinsicTrunc and selectIntrinsicRound can be removed as they can be selected via tablegen patterns, and G_INTRINSIC_ROUNDEVEN is marked as a gisel equivalent of froundeven. Otherwise this reuses the existing code, filling it out to handle more types. Differential Revision: https://reviews.llvm.org/D157679	2023-08-17 16:25:32 +01:00
David Green	a3f2751f78	[AArch64][GISel] Add handling for G_VECREDUCE_FMAXIMUM and G_VECREDUCE_FMINIMUM This is a lot of copy-pasting for the existing handling of G_VECREDUCE_FMAX/G_VECREDUCE_FMIN to add handling for G_VECREDUCE_FMAXIMUM/G_VECREDUCE_FMINIMUM in the same way. Differential Revision: https://reviews.llvm.org/D156615	2023-08-14 10:03:25 +01:00
David Green	d199478af4	[AArch64][GISel] Handling for G_VECREDUCE_FMIN and G_VECREDUCE_FMAX This adds legalization for G_VECREDUCE_FMIN and G_VECREDUCE_FMAX, where the selection can go via tablegen patterns. I haven't tried to get non-power2 types working yet, just the more legal types. Differential Revision: https://reviews.llvm.org/D156614	2023-08-14 09:19:47 +01:00
Bjorn Pettersson	a7ee80fab2	[llvm] Drop some more typed pointer bitcasts etc.	2023-08-13 16:46:56 +02:00
Amara Emerson	b9669789c3	[GlobalISel][NFC] Introduce a GVecReduce wrapper class and a minor refactor.	2023-08-12 13:55:08 -07:00
David Green	acd17ea662	[AArch64][GISel] Expand handling for G_FSQRT to more vector types Similar to G_FABS, these can reuse the existing lowering to successfully handle more types.	2023-08-11 10:16:45 +01:00
Matt Arsenault	1ca0808db2	GlobalISel: Don't expand stacksave/stackrestore in IRTranslator In some edge cases (likely invalid anyway), it's not correct to directly copy the stack pointer register.	2023-08-09 18:33:55 -04:00
Sameer Sahasrabuddhe	d9847cde48	[GlobalISel] convergent intrinsics Introduced the convergent equivalent of the existing G_INTRINSIC opcodes: - G_INTRINSIC_CONVERGENT - G_INTRINSIC_CONVERGENT_W_SIDE_EFFECTS Out of the targets that currently have some support for GlobalISel, the patch assumes that the convergent intrinsics are only relevant to SPIRV and AMDGPU. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D154766	2023-07-31 12:15:39 +05:30
David Green	6edc9a7662	[AArch64][GISel] Additional FPExt vector lowering Similar to D155311, this adds lowering for more vector cases for FPExt Differential Revision: https://reviews.llvm.org/D155601	2023-07-23 16:58:13 +01:00
David Green	74c0bdff7d	[AArch64][GISel] Additional FPTrunc vector lowering I was attempting to add llvm.reduce.fminimum/fmaximum support for GlobalISel. In the process I noticed that llvm.reduce.fmin/fmax was missing, and could do with being added first. That led on to adding additional vector support for minnum/maxnum, which in turn led to needing to handle fptrunc and fpext for some of the fp16 types. So this patch extends the vector handling for fptrunc, adding support for f16 types which are clamped to 4 elements, and scalarizing the rest. I went round in circles a little with how smaller than legal vectors should be handled, but this seems simple and seems to work, if not always optimally yet. Differential Revision: https://reviews.llvm.org/D155311	2023-07-18 18:52:19 +01:00
Ivan Kosarev	e705b2b1f4	Fix warnings about unused variables on builds without asserts.	2023-07-12 14:45:29 +01:00
Ivan Kosarev	15e7749e19	[Codegen] Generate fast fp64-to-fp16 conversions in unsafe mode. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D154528	2023-07-12 11:55:19 +01:00
Matt Arsenault	61820f8b5d	CodeGen: Optimize lowering of is.fpclass fcZero\|fcSubnormal Combine the two checks into a check if the exponent bits are 0. The inverted case isn't reachable until a future change, and GlobalISel currently doesn't attempt the inversion optimization. https://reviews.llvm.org/D143182	2023-07-06 13:03:57 -04:00
Matt Arsenault	003b58f65b	IR: Add llvm.frexp intrinsic Add an intrinsic which returns the two pieces as multiple return values. Alternatively could introduce a pair of intrinsics to separately return the fractional and exponent parts. AMDGPU has native instructions to return the two halves, but could use some generic legalization and optimization handling. For example, we should be able to handle legalization of f16 on older targets, and for bf16. Additionally antique targets need a hardware workaround which would be better handled in the backend rather than in library code where it is now.	2023-06-28 14:50:16 -04:00
David Green	2802739dfd	[NFC] Replace ;; with ;	2023-06-11 10:25:24 +01:00
Matt Arsenault	eece6ba283	IR: Add llvm.ldexp and llvm.experimental.constrained.ldexp intrinsics AMDGPU has native instructions and target intrinsics for this, but these really should be subject to legalization and generic optimizations. This will enable legalization of f16->f32 on targets without f16 support. Implement a somewhat horrible inline expansion for targets without libcall support. This could be better if we could introduce control flow (GlobalISel version not yet implemented). Support for strictfp legalization is less complete but works for the simple cases.	2023-06-06 17:07:18 -04:00
Mateja Marjanovic	cf76074a36	[AMDGPU][GlobalISel] Check exact width in get*ClassForBitWidth and widen if necessary Instead of checking if the given bitwidth is less or equal to a bitwidth of an existing RegClass, check if it has the exact same value. For LLVM vector types that don't have a corresponding Register Class, widen them during legalization. That goes for G_EXTRACT_VECTOR_ELT, G_INSERT_VECTOR_ELT and G_BUILD_VECTOR. Differential revision: https://reviews.llvm.org/D148096 Reviewers: foad, arsenm	2023-05-03 17:32:24 +02:00
Mateja Marjanovic	6175ec0bb6	Revert "[AMDGPU][GlobalISel] Widen the vector operand in G_BUILD/INSERT/EXTRACT_VECTOR" This reverts commit b25c7cafcbe1b52ea2d1ff5e5c2f13674b5f297d.	2023-05-03 17:28:01 +02:00
Mateja Marjanovic	b25c7cafcb	[AMDGPU][GlobalISel] Widen the vector operand in G_BUILD/INSERT/EXTRACT_VECTOR Widen the vector operand type in G_BUILD_VECTOR, G_INSERT_VECTOR_ELT, G_EXTRACT_VECTOR_ELT to the nearest larger RegClass.	2023-05-03 17:14:38 +02:00
Sergei Barannikov	38d84e3d76	[GISel] Legalize G_FSUB to G_FADD + G_FNEG even if G_FNEG is illegal `G_FNEG` used to be legalized to `G_FSUB -0, x` causing infinite loop. This is no longer the case after D84287. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D148187	2023-04-15 08:11:49 +03:00
Amara Emerson	719024a0d0	[GlobalISel][NFC] Add MachineInstr::getFirst[N]{Regs,LLTs}() helpers to extract regs & types. These reduce the typing and clutter from: Register Dst = MI.getOperand(0).getReg(); Register Src1 = MI.getOperand(1).getReg(); Register Src2 = MI.getOperand(2).getReg(); Register Src3 = MI.getOperand(3).getReg(); LLT DstTy = MRI.getType(Dst); ... etc etc To just: auto [Dst, Src1, Src2, Src3] = MI.getFirst4Regs(); auto [DstTy, Src1Ty, Src2Ty, Src3Ty] = MI.getFirst4LLTs(); Or even more concise: auto [Dst, DstTy, Src1, Src1Ty, Src2, Src2Ty, Src3, Src3Ty] = MI.getFirst4RegLLTs(); Differential Revision: https://reviews.llvm.org/D144687	2023-04-12 16:43:14 -07:00
Matt Arsenault	9356ec1516	CodeGen: Reorder case handling for is.fpclass legalization Subnormal and zero checks can be combined into one, so move the code closer to reduce the diff in a future change.	2023-03-17 11:29:50 -04:00
Matt Arsenault	61f2f2c64a	GlobalISel: Use FPClassTest in is.fpclass lowering	2023-03-17 10:23:01 -04:00
Vladislav Dzhidzhoev	3a51eed948	[AArch64][GlobalISel] Legalize G_SHUFFLE_VECTOR with smaller dest size Legalize G_SHUFFLE_VECTOR having destination vector length smaller than source vector length by reshaping destination vector. Differential Revision: https://reviews.llvm.org/D144670	2023-02-27 23:46:44 +01:00
Jessica Del	fc672b6a8b	[AMDGPU] Improved wide multiplies These checks show optimized instructions if an operand is known to be (partially) zero. Change-Id: Ie2f6d0d3ee9d5b279d1f4c1dd0787492e39cc77a Differential Revision: https://reviews.llvm.org/D140208	2023-02-22 16:39:06 +01:00
Kazu Hirata	b7ffd9686d	Use APInt::getAllOnes instead of APInt::getAllOnesValue (NFC) Note that getAllOnesValue has been soft-deprecated in favor of getAllOnes.	2023-02-19 22:54:23 -08:00
Chen Zheng	6ee2f770ef	[PowerPC][GISel] add support for fpconstant Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D133340	2023-02-14 02:39:22 +00:00
Kazu Hirata	f20b5071f3	[llvm] Use llvm::bit_floor instead of llvm::PowerOf2Floor (NFC)	2023-01-28 09:06:31 -08:00
Diana Picus	f95a5fbe7c	MachineIRBuilder: Rename buildMerge. NFC `buildMerge` may build a G_MERGE_VALUES, G_BUILD_VECTOR or G_CONCAT_VECTORS. Rename it to `buildMergeLikeInstr`. This is a follow-up suggested in https://reviews.llvm.org/D140964 Differential Revision: https://reviews.llvm.org/D141372	2023-01-13 09:32:58 +01:00
serge-sans-paille	38818b60c5	Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ part Use deduction guides instead of helper functions. The only non-automatic changes have been: 1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t), (uint8_t)) 2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There were a few similar situations across the codebase. 3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated. 4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that). Per reviewers' comment, some useless makeArrayRef have been removed in the process. This is a follow-up to https://reviews.llvm.org/D140896 that introduced the deduction guides. Differential Revision: https://reviews.llvm.org/D140955	2023-01-05 14:11:08 +01:00
Kevin Athey	ec7cffc579	Revert "Revert "[AArch64][GlobalISel][Legalizer] Legalize G_SHUFFLE_VECTOR with different lengths"" This reverts commit 192cc76e0be688106492989cd845ba786a7ae36d. Reverted Revert, as build was fixed while I was examining.	2022-12-15 11:19:24 -08:00
Kevin Athey	192cc76e0b	Revert "[AArch64][GlobalISel][Legalizer] Legalize G_SHUFFLE_VECTOR with different lengths" This reverts commit 4c52fb1a5ee20846627d16e38f5dec08c08f8884. Breaks sanitizer ubsan buildbot: https://lab.llvm.org/buildbot/#/builders/85/builds/12983	2022-12-15 11:15:55 -08:00
Vladislav Dzhidzhoev	4c52fb1a5e	[AArch64][GlobalISel][Legalizer] Legalize G_SHUFFLE_VECTOR with different lengths Legalize G_SHUFFLE_VECTOR having destination vector length greater than source vector length by reshaping source vectors. Partial implementation of SelectionDAGBuilder::visitShuffleVector. Differential Revision: https://reviews.llvm.org/D132190	2022-12-15 15:03:34 +03:00
Amara Emerson	53445f5b1c	[GlobalISel] Add a new G_INVOKE_REGION_START instruction to fix an EH bug. We currently have a bug where the legalizer, when dealing with phi operands, may create instructions in the phi's incoming blocks at points which are effectively dead due to a possible exception throw. Say we have: throwbb: EH_LABEL x0 = %callarg1 BL @may_throw_call EH_LABEL B returnbb bb: %v = phi i1 %true, throwbb, %false.... When legalizing we may need to widen the i1 %true value, and to do that we need to create new extension instructions in the incoming block. Our insertion point currently is the MBB::getFirstTerminator() which puts the IP before the unconditional branch terminator in throwbb. These extensions may never be executed if the call throws, and therefore we need to emit them before the call (but not too early, since our new instruction may need values defined within throwbb as well). throwbb: EH_LABEL x0 = %callarg1 BL @may_throw_call EH_LABEL %true = G_CONSTANT i32 1 ; <<<-- ruh'roh, this never executes if may_throw_call() throws! B returnbb bb: %v = phi i32 %true, throwbb, %false.... To fix this, I've added two new instructions. The main idea is that G_INVOKE_REGION_START is a terminator, which tries to model the fact that in the IR, the original invoke inst is actually a terminator as well. By using that as the new insertion point, we make sure to place new instructions on always executing paths. Unfortunately we still need to make the legalizer use a new insertion point API that I've added, since the existing `getFirstTerminator()` method does a reverse walk up the block, and any non-terminator instructions cause it to bail out. To avoid impacting compile time for all `getFirstTerminator()` uses, I've added a new method that does a forward walk instead. Differential Revision: https://reviews.llvm.org/D137905	2022-12-07 10:28:51 -08:00
Janek van Oirschot	587747d8d1	[AMDGPU] G_IS_FPCLASS lower() support for IEEE fp types Simplified globalisel version of sdag's expandIS_FPCLASS. Reviewed By: arsenm, #amdgpu Differential Revision: https://reviews.llvm.org/D139128	2022-12-07 11:53:09 +00:00
Janek van Oirschot	322966f8f8	[AMDGPU] Add llvm.is.fpclass intrinsic to existing SelectionDAG fp class support and introduce GlobalISel implementation for AMDGPU Uses existing SelectionDAG lowering of the llvm.amdgcn.class intrinsic for llvm.is.fpclass	2022-11-28 16:00:36 -05:00
Kazu Hirata	3ccbfc34c0	[GlobalISel] Use std::optional in LegalizerHelper.cpp (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-11-26 14:44:54 -08:00
Matt Arsenault	1fe1299a93	GlobalISel: Legalize strict_fsub In the future should probably have a more convenient way to switch between building strict and non-strict ops.	2022-11-18 15:21:41 -08:00
Matt Arsenault	08ec15e44b	AMDGPU/GlobalISel: Fix strictfp fmul	2022-11-18 08:53:49 -08:00
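
Note on ebb2e5ebb2 (G_UADDE corner case): the carry-out problem described in that message can be reproduced with plain 32-bit integers. The sketch below is an illustration only, not the code from LegalizerHelper.cpp. The names `AddResult`, `uaddeNaive`, `uaddeFixed` and `checkCornerCase` are hypothetical, and the corrected form shown is one way to compute the carry, which may differ from the exact instruction sequence the commits emit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: the sum and the carry-out of an add-with-carry.
struct AddResult {
  uint32_t Res;
  bool CarryOut;
};

// Flawed version: derives the carry-out only from "Res <u LHS".
// If CarryIn is 1 and RHS is 0xffffffff, Res equals LHS and the carry is lost.
AddResult uaddeNaive(uint32_t LHS, uint32_t RHS, bool CarryIn) {
  uint32_t Res = LHS + RHS + (CarryIn ? 1u : 0u);
  return {Res, Res < LHS};
}

// One possible correction: check the two additions separately.
AddResult uaddeFixed(uint32_t LHS, uint32_t RHS, bool CarryIn) {
  uint32_t Tmp = LHS + RHS;    // first add, wraps modulo 2^32
  bool Carry1 = Tmp < LHS;     // carry out of LHS + RHS
  uint32_t Res = Tmp + (CarryIn ? 1u : 0u);
  // The second add can only carry when Tmp was all ones and CarryIn is set,
  // in which case the final sum wraps to 0.
  bool Carry2 = CarryIn && Res == 0;
  return {Res, Carry1 || Carry2};
}

// Exercises the corner case from the commit message.
void checkCornerCase() {
  assert(!uaddeNaive(5u, 0xffffffffu, true).CarryOut); // carry is missed
  assert(uaddeFixed(5u, 0xffffffffu, true).CarryOut);  // carry is reported
}
```

Both carries can never be set at once, so combining them with a logical OR is sufficient.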

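Note on 61820f8b5d (is.fpclass fcZero|fcSubnormal): the message says the zero and subnormal checks collapse into a single test that the exponent bits are 0. A minimal sketch of that idea for a 32-bit float follows. It assumes `float` is IEEE-754 binary32, and the function names `isZeroOrSubnormal` and `checkClasses` are hypothetical, not taken from the LLVM sources.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(uint32_t),
              "sketch assumes a 32-bit float");

// True for +0, -0 and every subnormal: all of them have a zero exponent field.
bool isZeroOrSubnormal(float X) {
  uint32_t Bits;
  std::memcpy(&Bits, &X, sizeof(Bits)); // well-defined bit copy
  const uint32_t ExpMask = 0x7f800000u; // the 8 exponent bits of binary32
  return (Bits & ExpMask) == 0;
}

// A few spot checks across the floating-point classes.
void checkClasses() {
  using Lim = std::numeric_limits<float>;
  assert(isZeroOrSubnormal(0.0f));
  assert(isZeroOrSubnormal(-0.0f));
  assert(isZeroOrSubnormal(Lim::denorm_min()));
  assert(!isZeroOrSubnormal(Lim::min())); // smallest normal value
  assert(!isZeroOrSubnormal(Lim::infinity()));
  assert(!isZeroOrSubnormal(Lim::quiet_NaN()));
}
```

The sign bit is masked out, so the test covers both signs with one comparison.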
595 Commits