llvm-project

Author	SHA1	Message	Date
Arthur Eubanks	6f538f6a2d	Revert "Recommit "[VPlan] First step towards VPlan cost modeling. (#92555 )"" This reverts commit 90fd99c0795711e1cf762a02b29b0a702f86a264. This reverts commit 43e6f46936e177e47de6627a74b047ba27561b44. Causes crashes, see comments on https://github.com/llvm/llvm-project/pull/92555.	2024-06-14 17:47:08 +00:00
Stephen Tozer	094572701d	[RemoveDIs] Print IR with debug records by default (#91724 ) This patch makes the final major change of the RemoveDIs project, changing the default IR output from debug intrinsics to debug records. This is expected to break a large number of tests: every single one that tests for uses or declarations of debug intrinsics and does not explicitly disable writing records. If this patch has broken your downstream tests (or upstream tests on a configuration I wasn't able to run): 1. If you need to immediately unblock a build, pass `--write-experimental-debuginfo=false` to LLVM's option processing for all failing tests (remember to use `-mllvm` for clang/flang to forward arguments to LLVM). 2. For most test failures, the changes are trivial and mechanical, enough that they can be done by script; see the migration guide for a guide on how to do this: https://llvm.org/docs/RemoveDIsDebugInfo.html#test-updates 3. If any tests fail for reasons other than FileCheck check lines that need updating, such as assertion failures, that is most likely a real bug with this patch and should be reported as such. For more information, see the recent PSA: https://discourse.llvm.org/t/psa-ir-output-changing-from-debug-intrinsics-to-debug-records/79578	2024-06-14 15:07:27 +01:00
Florian Hahn	90fd99c079	Recommit "[VPlan] First step towards VPlan cost modeling. (#92555 )" This reverts commit 46080abe9b136821eda2a1a27d8a13ceac349f8c. Extra tests have been added in 52d29eb287. Original message: This adds a new interface to compute the cost of recipes, VPBasicBlocks, VPRegionBlocks and VPlan, initially falling back to the legacy cost model for all recipes. Follow-up patches will gradually migrate recipes to compute their own costs step-by-step. It also adds getBestPlan function to LVP which computes the cost of all VPlans and picks the most profitable one together with the most profitable VF. The VPlan selected by the VPlan cost model is executed and there is an assert to catch cases where the VPlan cost model and the legacy cost model disagree. Even though I checked a number of different build configurations on AArch64 and X86, there may be some differences that have been missed. Additional discussions and context can be found in @arcbbb's https://github.com/llvm/llvm-project/pull/67647 and https://github.com/llvm/llvm-project/pull/67934 which is an earlier version of the current PR. PR: https://github.com/llvm/llvm-project/pull/92555	2024-06-14 12:33:48 +01:00
Florian Hahn	52d29eb287	[LV] Add extra cost model tests with truncated inductions. Extra test cases that caused revert of https://github.com/llvm/llvm-project/pull/92555	2024-06-13 20:42:53 +01:00
Jay Foad	d4a0154902	[llvm-project] Fix typo "seperate" (#95373 )	2024-06-13 20:20:27 +01:00
Arthur Eubanks	46080abe9b	Revert "[VPlan] First step towards VPlan cost modeling. (#92555 )" This reverts commit 00798354c553d48d27006a2b06a904bd6013e31b. Causes crashes, see comments on https://github.com/llvm/llvm-project/pull/92555.	2024-06-13 16:37:21 +00:00
Florian Hahn	00798354c5	[VPlan] First step towards VPlan cost modeling. (#92555 ) This adds a new interface to compute the cost of recipes, VPBasicBlocks, VPRegionBlocks and VPlan, initially falling back to the legacy cost model for all recipes. Follow-up patches will gradually migrate recipes to compute their own costs step-by-step. It also adds getBestPlan function to LVP which computes the cost of all VPlans and picks the most profitable one together with the most profitable VF. The VPlan selected by the VPlan cost model is executed and there is an assert to catch cases where the VPlan cost model and the legacy cost model disagree. Even though I checked a number of different build configurations on AArch64 and X86, there may be some differences that have been missed. Additional discussions and context can be found in @arcbbb's https://github.com/llvm/llvm-project/pull/67647 and https://github.com/llvm/llvm-project/pull/67934 which is an earlier version of the current PR. PR: https://github.com/llvm/llvm-project/pull/92555	2024-06-13 14:26:18 +01:00
Florian Hahn	c46a6e6c92	[LV] Remove unnecessary getRuntimeVF call when computing vector TC. As Step is VF * UF, there is no need to compute it again, which may require multiple instructions for scalable VFs.	2024-06-12 14:35:37 +01:00
Florian Hahn	2e4c06780c	[LV] Add extra X86 cost tests for any_of reduction and multi-exit loops. Add extra test coverage to ensure decisions do not change when transitioning to a VPlan-based cost model.	2024-06-10 13:13:04 +01:00
Florian Hahn	2f4ebf8545	[VPlan] Handle more cases in VPInstruction::onlyFirstPartUsed. Handle binary ops and a few other instructions in onlyFirstPartUsed; they only use the first part if they themselves only have their first part used.	2024-06-09 13:19:44 +01:00
Florian Hahn	998c33e5fc	[VPlan] Mark FirstOrderRecurrenceSplice as not having side-effects. Now that FOR exit and resume value creation is explicitly modeled in VPlan (05e1b5340b0caf1, 07b330132c0b) it doesn't depend on the first order recurrence splice being preserved and it can now be marked as not having side-effects. This allows removal of first-order-recurrence-splice if the FOR is only used in the exit or as scalar ph resume value.	2024-06-08 21:40:30 +01:00
Florian Hahn	a43d999d14	[VPlan] Check if only first part is used for all per-part VPInsts. Apply the onlyFirstPartUsed logic generally to all per-part VPInstructions. Note that the test changes remove the second part of an unused first-order recurrence splice.	2024-06-08 20:31:54 +01:00
Florian Hahn	4f9c0fa223	[LV] Add test with dead load and vector pointer.	2024-06-07 16:14:02 +01:00
Farzon Lotfi	2f0308ed02	[arm64] Add tan intrinsic lowering (#94545 ) This change is an implementation of https://github.com/llvm/llvm-project/issues/87367's investigation on supporting IEEE math operations as intrinsics. Which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 This PR is just for Tan. Now that x86 tan backend landed: https://github.com/llvm/llvm-project/pull/90503 we can add other backends since the shared pieces are in tree now. Changes: - `llvm/include/llvm/Analysis/VecFuncs.def` - vectorization of tan for arm64 backends. - `llvm/lib/Target/AArch64/AArch64FastISel.cpp` - Add tan to the libcall table - `llvm/lib/Target/AArch64/AArch64ISelLowering.cpp` - Add tan expansion for f128, f16, and vector\neon operations - `llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp` define `G_FTAN` as a legal arm64 instruction resolves #94755	2024-06-07 09:42:06 -04:00
Farzon Lotfi	1d87433593	[x86] Add tan intrinsic part 4 (#90503 ) This change is an implementation of #87367's investigation on supporting IEEE math operations as intrinsics. Which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 Much of this change was following how G_FSIN and G_FCOS were used. Changes: - `llvm/docs/GlobalISel/GenericOpcode.rst` - Document the `G_FTAN` opcode - `llvm/docs/LangRef.rst` - Document the tan intrinsic - `llvm/include/llvm/Analysis/VecFuncs.def` - Associate the tan intrinsic as a vector function similar to the tanf libcall. - `llvm/include/llvm/CodeGen/BasicTTIImpl.h` - Map the tan intrinsic to `ISD::FTAN` - `llvm/include/llvm/CodeGen/ISDOpcodes.h` - Define ISD opcodes for `FTAN` and `STRICT_FTAN` - `llvm/include/llvm/IR/Intrinsics.td` - Create the tan intrinsic - `llvm/include/llvm/IR/RuntimeLibcalls.def` - Define tan libcall mappings - `llvm/include/llvm/Target/GenericOpcodes.td` - Define the `G_FTAN` Opcode - `llvm/include/llvm/Support/TargetOpcodes.def` - Create a `G_FTAN` Opcode handler - `llvm/include/llvm/Target/GlobalISel/SelectionDAGCompat.td` - Map `G_FTAN` to `ftan` - `llvm/include/llvm/Target/TargetSelectionDAG.td` - Define `ftan`, `strict_ftan`, and `any_ftan` and map them to the ISD opcodes for `FTAN` and `STRICT_FTAN` - `llvm/lib/Analysis/VectorUtils.cpp` - Associate the tan intrinsic as a vector intrinsic - `llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp` Map the tan intrinsic to `G_FTAN` Opcode - `llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp` - Add `G_FTAN` to the list of floating point math operations also associate `G_FTAN` with the `TAN_F` runtime lib. - `llvm/lib/CodeGen/GlobalISel/Utils.cpp` - More floating point math operation common behaviors. - llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp - List the function expansion operations for `FTAN` and `STRICT_FTAN`. Also define both opcodes in `PromoteNode`. 
- `llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp` - More `FTAN` and `STRICT_FTAN` handling in the legalizer - `llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h` - Define `SoftenFloatRes_FTAN` and `ExpandFloatRes_FTAN`. - `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp` - Define `FTAN` as a legal vector operation. - `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp` - Define `FTAN` as a legal vector operation. - `llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp` - define tan as an intrinsic that doesn't return NaN. - `llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp` Map `LibFunc_tan`, `LibFunc_tanf`, and `LibFunc_tanl` to `ISD::FTAN`. Map `Intrinsic::tan` to `ISD::FTAN` and add selection dag handling for `Intrinsic::tan`. - `llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp` - Define `ftan` and `strict_ftan` names for the equivalent ISD opcodes. - `llvm/lib/CodeGen/TargetLoweringBase.cpp` -Define a Tan128 libcall and ISD::FTAN as a target lowering action. - `llvm/lib/Target/X86/X86ISelLowering.cpp` - Add x86_64 lowering for tan intrinsic resolves https://github.com/llvm/llvm-project/issues/70082	2024-06-05 15:01:33 -04:00
Florian Hahn	05e1b5340b	[VPlan] Model FOR resume value extraction in VPlan. (#93396 ) This patch uses the ExtractFromEnd VPInstruction opcode to extract the value of a FOR to be used as resume value for the ph in the scalar loop. It adds a new live-out that temporarily wraps the FOR phi in the scalar loop. fixFixedOrderRecurrence will process live outs for fixed order recurrence phis by creating a new phi node in the scalar preheader, using the generated value for the live-out as incoming value from the middle block and the original start value as incoming value for the other edge. Creation of the phi in the preheader, as well as updating the phi in the scalar loop will also be moved to VPlan in the future, eventually retiring fixFixedOrderRecurrence Depends on https://github.com/llvm/llvm-project/pull/93395 PR: https://github.com/llvm/llvm-project/pull/93396	2024-06-05 11:18:06 +01:00
Florian Hahn	e949b54a5b	[LAA] Use PSE::getSymbolicMaxBackedgeTakenCount. (#93499 ) Update LAA to use PSE::getSymbolicMaxBackedgeTakenCount which returns the minimum of the countable exits. When analyzing dependences and computing runtime checks, we need the smallest upper bound on the number of iterations. In terms of memory safety, it shouldn't matter if any uncomputable exits leave the loop, as long as we prove that there are no dependences given the minimum of the countable exits. The same should apply also for generating runtime checks. Note that this shifts the responsibility of checking whether all exit counts are computable or handling early-exits to the users of LAA. Depends on https://github.com/llvm/llvm-project/pull/93498 PR: https://github.com/llvm/llvm-project/pull/93499	2024-06-04 22:23:30 +01:00
Florian Hahn	e775efcec4	[LV] Apply loop guards when checking recur during hoisting RT checks. Apply loop guards when checking if the recurrence is non-negative in cases where runtime checks are hoisted out of an inner loop.	2024-06-04 20:37:46 +01:00
Florian Hahn	164597616c	[LV] Add test for RT check hoisting where loop guards simplify check. Add a test case with a missed simplification when hoisting runtime checks due to not applying loop guards.	2024-06-04 09:32:22 +01:00
Ramkumar Ramachandra	59cb55d384	VPlan: add missing case for LogicalAnd; fix crash (#93553 ) VPTypeAnalysis::inferScalarTypeForRecipe is missing the case for VPInstruction::LogicalAnd, due to which the test vplan-incomplete-cases.ll crashes. Add this missing case, and move the test in vplan-infer-not-or-type.ll to vplan-incomplete-cases.ll, showing correct codegen for trip-counts 2 and 3.	2024-06-04 08:58:16 +01:00
Florian Hahn	07b330132c	[VPlan] Model FOR extract of exit value in VPlan. (#93395 ) This patch introduces a new ExtractFromEnd VPInstruction opcode to extract the value of a FOR for users outside the loop (i.e. in the scalar loop's exits). This moves the first part of fixing first order recurrences to VPlan, and removes some additional code to patch up live-outs, which is now handled automatically. The majority of test changes is due to changes in the order of which the extracts are generated now. As we are now using VPTransformState to generate the extracts, we may be able to re-use existing extracts in the loop body in some cases. For scalable vectors, in some cases we now have to compute the runtime VF twice, as each extract is now independent, but those should be trivial to clean up for later passes (and in line with other places in the code that also liberally re-compute runtime VFs). PR: https://github.com/llvm/llvm-project/pull/93395	2024-06-03 20:20:30 +01:00
Florian Hahn	f7e63e8b46	[LV] Operands feeding pointers of interleave member pointers are free. For interleave groups we only create a pointer for the start of the interleave group, not all original loads/stores. Mark single-use ops feeding interleave group mem ops as free when vectorizing.	2024-06-01 13:59:29 +01:00
Florian Hahn	4c6367b3e5	[LV] Add test with strided interleave groups and maximizing bandwidth.	2024-06-01 12:26:00 +01:00
Florian Hahn	f38d84ce32	[VPlan] Use ir-bb prefix for VPIRBasicBlock. Follow-up to adjust the names and tests after https://github.com/llvm/llvm-project/pull/93398.	2024-05-30 17:43:40 -07:00
Ramkumar Ramachandra	43100766f2	LV: generalize profitability criterion over TC (#93300 ) Generalize LoopVectorizationPlanner::isMoreProfitable smoothly across the fixed-vector and scalable-vector cases, taking the trip-count into account, and fixing logical pitfalls that arise from a lack of generality.	2024-05-30 10:54:32 +01:00
Florian Hahn	8b037862b6	[VPlan] Preserve DT (and SCEV) in VPlan-native path (#93287 ) As a follow-up to b2f65e80, use the DTU to also update and preserve the DT in the native path. This should also allow preserving SCEV in the native path PR: https://github.com/llvm/llvm-project/pull/93287	2024-05-27 17:03:53 -07:00
Florian Hahn	bb4c8f9219	[SCEV] Don't add predicates already implied by UnionPredicate. (#93397 ) Update SCEVUnionPredicate::add to only add predicates from another union predicate, if they aren't already implied by the union predicate we add them to. Note that there exists logic elsewhere to avoid adding predicates if they are already implied, but this logic misses cases when only some predicates of a union predicate are implied by the current set of predicates. PR: https://github.com/llvm/llvm-project/pull/93397	2024-05-26 18:31:36 -07:00
Florian Hahn	686600b521	[LV] Add test showing missed removal of implied predicate. Tests for https://github.com/llvm/llvm-project/pull/93397	2024-05-26 17:23:14 -07:00
Florian Hahn	ac17fbc076	[VPlan] Add test for printing FOR with live-out. Add additional test coverage for printing VPlans with a first-order recurrence with its result used outside the loop.	2024-05-25 21:25:57 -07:00
Shih-Po Hung	0338c55ea5	[LV, VPlan] Check if plan is compatible to EVL transform (#92092 ) The transform updates all users of inductions to work based on EVL, instead of the VF directly. At the moment, widened inductions cannot be updated, so bail out if the plan contains any. This patch introduces a check before applying EVL transform. If any recipes in loop rely on RuntimeVF, the plan is discarded.	2024-05-25 08:22:49 +08:00
Ramkumar Ramachandra	bb0d29a72d	[LV] fix logical error in trunc cost (#91136 ) In LoopVectorizationCostModel::getInstructionCost(), when the condition canTruncateToMinimalBitwidth() is satisfied, for a trunc, the source type is computed as the smallest type of the source vector and the destination vector, and the destination type is computed as the largest type of the instruction and destination type. This is clearly a logical error, as the original source vector type could be smaller than the original destination vector type, and the trunc semantics are broken because we're attempting to widen. Fixes #47665.	2024-05-24 18:01:58 +01:00
Shih-Po Hung	b008a2d12a	[LV][NFC] precommit test for EVL transform (#92203 ) A precommit test case to show vector loops generated from EVL transform - This is a precommit test for https://github.com/llvm/llvm-project/pull/92092	2024-05-24 23:21:59 +08:00
Ramkumar Ramachandra	dc148c9fb8	[LV] add test for #47665 , #88802 (#91135 )	2024-05-24 10:50:43 +01:00
Freddy Ye	4def1ce101	Reland "[X86] Remove knl/knm specific ISAs supports (#92883 )" (#93136 ) This reverts commit aa4069ea96e5eb62bc8c7895b9d920f129611b3a.	2024-05-24 13:46:34 +08:00
David Green	46541a3636	[ARM] Add an extra MVE low-trip-count loop. NFC This makes use of half floats, which makes the masked stores expensive.	2024-05-23 21:50:47 +01:00
Freddy Ye	aa4069ea96	Revert "[X86] Remove knl/knm specific ISAs supports (#92883 )" (#93123 ) This reverts commit 282d2ab58f56c89510f810a43d4569824a90c538.	2024-05-23 10:25:23 +08:00
Freddy Ye	282d2ab58f	[X86] Remove knl/knm specific ISAs supports (#92883 ) Cont. patch after https://github.com/llvm/llvm-project/pull/75580	2024-05-23 09:46:44 +08:00
Simon Pilgrim	0873b4ca29	[LoopVectorize] optimal-epilog-vectorization-profitability.ll - fix LABLE -> LABEL typo Typo identified in #91854	2024-05-22 11:07:24 +01:00
Florian Hahn	a56e6dfd2e	[LV] Add test for header mask and invariant compare cost-modeling. Additional test coverage for the VPlan-based cost model work.	2024-05-22 09:57:35 +01:00
Sander de Smalen	1015f51dd9	[AArch64] NFC: Rename -force-streaming-compatible-sve to -force-streaming-compatible (#92774 ) The behaviour of the flag should be equivalent to __arm_streaming_compatible. At the moment, the name suggests that '-force-streaming-compatible-sve' on its own (i.e. without specifying `+sve`) enables the compiler to use the streaming-compatible subset of SVE instructions, but the semantics merely are that the function can be called with either PSTATE.SM=0 or PSTATE.SM=1.	2024-05-22 07:58:54 +01:00
Florian Hahn	352dc7d4bb	[LV] Propagate PredicatedBBsAfterVectorization to predecessors. This fixes some cases where predicated BBs were missed previously, leading to under-estimating the cost of those blocks.	2024-05-21 10:27:32 +01:00
hev	1e86e92428	[LoongArch] Enable interleaved vectorization (#92629 ) This PR enables interleaved vectorization for LoongArch, with a default interleaving factor of `2`.	2024-05-21 15:31:02 +08:00
Florian Hahn	82c5d350d2	[VPlan] Add commutative binary OR matcher, use in transform. (#92539 ) Split off from https://github.com/llvm/llvm-project/pull/89386, this extends the binary matcher to support matching commutative operations. This is used for a new m_c_BinaryOr matcher, used in simplifyRecipe. PR: https://github.com/llvm/llvm-project/pull/92539	2024-05-20 13:03:48 +01:00
Nikita Popov	8e8d2595da	[ConstantFolding] Canonicalize constexpr GEPs to i8 (#89872 ) This patch canonicalizes constant expression GEPs to use i8 source element type, aka ptradd. This is the ConstantFolding equivalent of the InstCombine canonicalization introduced in #68882. I believe all our optimizations working on constant expression GEPs (like GlobalOpt etc) have already been switched to work on offsets, so I don't expect any significant fallout from this change. This is part of: https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699	2024-05-20 11:47:30 +02:00
Florian Hahn	b050048d35	[VPlan] Simplify (X && Y) || (X && !Y) -> X. (#89386 ) Simplify a common pattern generated for masks when folding the tail. PR: https://github.com/llvm/llvm-project/pull/89386	2024-05-19 15:45:23 +00:00
Florian Hahn	1e7d047c71	[VPlan] Mark LoopInfo preserved in native-path as well (NFC). LoopInfo is updated during VPlan execution now, so it will also be updated correctly in the native path.	2024-05-17 12:18:01 +01:00
Craig Topper	487b43cdc9	[RISCV] Pass subvector type to isLegalInterleavedAccessType in getInterleavedMemoryOpCost. (#91825 ) isLegalInterleavedAccessType expects the subvector type, but getInterleavedMemoryOpCost is called with the full vector type. So we need to divide by Factor.	2024-05-15 21:47:29 -07:00
Pietro Ghiglio	83d9aa2768	[VPlan] Add scalar inferencing support for addrspace cast (#92107 ) Fixes https://github.com/llvm/llvm-project/issues/91434 PR: https://github.com/llvm/llvm-project/pull/92107	2024-05-15 14:03:21 +01:00
Florian Hahn	b0a1ae2cca	[LV] Add additional variants of tests with udiv/urem/sdiv/srem in TC. Add additional tests with udiv/urem/sdiv/srem in trip counts, where the divisor is constant. For https://github.com/llvm/llvm-project/pull/92177.	2024-05-15 11:17:23 +01:00
Florian Hahn	d187005cad	[VPlan] Update VPBlendRecipe codegen for first-lane only. Update VPBlendRecipe::execute to support generating code for first-lane only. This fixes a crash in the newly added test @test_not_first_lane_only_wide_compare_incoming_order_swapped.	2024-05-15 11:00:15 +01:00


2475 Commits