llvm-project

Author	SHA1	Message	Date
Sander de Smalen	13ccb09725	[LV] Don't let ForceTargetInstructionCost override Invalid cost. Invalid costs can be used to avoid vectorization with a given VF, which is used for scalable vectors to avoid things that the code-generator cannot handle. If we override the cost using the -force-target-instruction-cost option of the LV, we would override this mechanism, rendering the flag useless. This change ensures the cost is only overridden when the original cost that was calculated is valid. That allows the flag to be used in combination with the -scalable-vectorization option. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D106677	2021-07-26 20:27:49 +01:00
Sander de Smalen	b9051ba848	[LV] Remove assert that VF cannot be scalable in setCostBasedWideningDecision. Scalarization for scalable vectors is not (yet) supported, so the LV discards a VF when scalarization is chosen as the widening decision. It should therefore not assert that the VF is not scalable when it computes the decision to scalarize. The code can get here when both the interleave-cost, gather/scatter cost and scalarization-cost are all illegal. This may e.g. happen for SVE when the VF=1, to avoid generating `<vscale x 1 x eltty>` types that the code-generator cannot yet handle. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D106656	2021-07-26 17:11:45 +01:00
Sander de Smalen	981e9dce54	[LV] Don't assume isScalarAfterVectorization if one of the uses needs widening. This fixes an issue that was found in D105199, where a GEP instruction is used both as the address of a store, as well as the value of a store. For the former, the value is scalar after vectorization, but the latter (as value) requires widening. Other code in that function seems to prevent similar cases from happening, but it seems this case was missed. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D106164	2021-07-26 16:01:55 +01:00
Florian Hahn	7a1e73f0b9	Recommit "[VPlan] Add recipe for first-order rec phis, make splicing explicit." This reverts the revert commit b1777b04dc4b1a9fee0e7effa7e177892ab32ef0. The patch originally got reverted due to a crash: https://bugs.chromium.org/p/chromium/issues/detail?id=1232798#c2 The underlying issue was that we were not using the stored values from the modified memory recipes, but the out-of-date values directly from the IR (accessed via the VPlan). This should be fixed in d995d6376. A reduced version of the reproducer has been added in 93664503be6b.	2021-07-26 15:50:30 +01:00
Kerry McLaughlin	e484e1ae03	[SVE] Fix casts to <FixedVectorType> in truncateToMinimalBitwidths Fixes more casts to `<FixedVectorType>` for the cases where the instruction is a Insert/ExtractElementInst. For fixed-width, this part of truncateToMinimalBitWidths is tested by AArch64/type-shrinkage-insertelt.ll. I attempted to write a test case for this part of truncateToMinimalBitWidths which uses scalable vectors, but was unable to add one. The tests in type-shrinkage-insertelt.ll rely on scalarization to create extract element instructions for instance, which is not possible for scalable vectors. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D106163	2021-07-26 13:44:51 +01:00
Florian Hahn	d995d63767	[VPlan] Use stored value from recipes for interleave groups. Instead of getting the VPValue for the stored IR values through the current plan, use the stored value of the recipes directly. This way, the correct VPValues are used if the store recipes have been modified in the VPlan and the IR value is not correct any longer. This can happen, e.g. due to D105008.	2021-07-26 12:05:23 +01:00
David Sherwood	0aff1798b5	[Analysis] Add simple cost model for strict (in-order) reductions I have added a new FastMathFlags parameter to getArithmeticReductionCost to indicate what type of reduction we are performing: 1. Tree-wise. This is the typical fast-math reduction that involves continually splitting a vector up into halves and adding each half together until we get a scalar result. This is the default behaviour for integers, whereas for floating point we only do this if reassociation is allowed. 2. Ordered. This now allows us to estimate the cost of performing a strict vector reduction by treating it as a series of scalar operations in lane order. This is the case when FP reassociation is not permitted. For scalable vectors this is more difficult because at compile time we do not know how many lanes there are, and so we use the worst case maximum vscale value. I have also fixed getTypeBasedIntrinsicInstrCost to pass in the FastMathFlags, which meant fixing up some X86 tests where we always assumed the vector.reduce.fadd/mul intrinsics were 'fast'. New tests have been added here: Analysis/CostModel/AArch64/reduce-fadd.ll Analysis/CostModel/AArch64/sve-intrinsics.ll Transforms/LoopVectorize/AArch64/strict-fadd-cost.ll Transforms/LoopVectorize/AArch64/sve-strict-fadd-cost.ll Differential Revision: https://reviews.llvm.org/D105432	2021-07-26 10:26:06 +01:00
Nico Weber	b1777b04dc	Revert "[VPlan] Add recipe for first-order rec phis, make splicing explicit." Makes clang crash: https://reviews.llvm.org/D105008#2903350 This reverts commit d2a73fb44ea0b8c981e4b923f811f18793fc4770. Also revert a minor formatting follow-up: This reverts commit 82834a673246f27a541ffcc57e0eb65b008102ef.	2021-07-25 17:39:28 -04:00
Caroline Concatto	5a4de84d55	[LoopVectorize] Fix crash for predicated instruction with scalable VF This patch avoids computing discounts for predicated instructions when the VF is scalable. There is no support for vectorization of loops with division because the vectorizer cannot guarantee that zero divisions will not happen. This loop now does not use VF scalable ``` for (long long i = 0; i < n; i++) if (cond[i]) a[i] /= b[i]; ``` Differential Revision: https://reviews.llvm.org/D101916	2021-07-22 12:48:27 +01:00
David Green	72dc5cab4f	[LV] Make use of PatternMatchers in getReductionPatternCost. NFC Pulled out of D106166, this modifies getReductionPatternCost to use PatternMatchers, hopefully simplifying the code a little.	2021-07-21 11:34:30 +01:00
David Green	4272e64acd	[LV] Change interface of getReductionPatternCost to return Optional Currently the Instruction cost of getReductionPatternCost returns an Invalid cost to specify "did not find the pattern". This changes that to return an Optional with None specifying not found, allowing Invalid to mean an infinite cost as is used elsewhere. Differential Revision: https://reviews.llvm.org/D106140	2021-07-20 16:44:50 +01:00
Caroline Concatto	cf78995c4a	[NFC][LoopVectorizer] Remove VF.isScalable() assertion from collectInstsToScalarize and getInstructionCost This patch removes the assertion when VF is scalable and replaces getKnownMinValue() by getFixedValue(), so it still guards the code against scalable vector types. The assertions were used to guarantee that getknownMinValue were not used for scalable vectors. Differential Revision: https://reviews.llvm.org/D106359	2021-07-20 15:56:30 +01:00
Florian Hahn	d2a73fb44e	[VPlan] Add recipe for first-order rec phis, make splicing explicit. This patch adds a VPFirstOrderRecurrencePHIRecipe, to further untangle VPWidenPHIRecipe into distinct recipes for distinct use cases/lowering. See D104989 for a new recipe for reduction phis. This patch also introduces a new `FirstOrderRecurrenceSplice` VPInstruction opcode, which is used to make the forming of the vector recurrence value explicit in VPlan. This more accurately models def-uses in VPlan and also simplifies code-generation. Now, the vector recurrence values are created at the right place during VPlan-codegeneration, rather than during post-VPlan fixups. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D105008	2021-07-20 16:14:17 +02:00
Kerry McLaughlin	49d73130ca	[LV] Avoid scalable vectorization for loops containing alloca This patch returns an Invalid cost from getInstructionCost() for alloca instructions if the VF is scalable, as otherwise loops which contain these instructions will crash when attempting to scalarize the alloca. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D105824	2021-07-16 11:47:13 +01:00
Sander de Smalen	239d01fa88	Reland "[LV] Print remark when loop cannot be vectorized due to invalid costs." The original patch was: https://reviews.llvm.org/D105806 There were some issues with nondeterministic behaviour of the sorting function, which led to scalable-call.ll passing and/or failing. This patch fixes the issue by numbering all instructions in the array first, and using that number as the order, which should provide a consistent ordering. This reverts commit a607f64118240f70bf1b14ec121b65f49d63800d.	2021-07-16 10:52:01 +01:00
Sander de Smalen	a607f64118	Revert "[LV] Print remark when loop cannot be vectorized due to invalid costs." This reverts commit efaf3099c8cec1954831ee28a2f75a72096f50eb. This reverts commit dc7bdc1e7121693df112f2fdb11cc6b88580ba4b. Reverting patches due to buildbot failures.	2021-07-15 15:21:57 +01:00
Sander de Smalen	dc7bdc1e71	[LV] Fix determinism for failing scalable-call.ll test. The sort function for emitting an OptRemark was not deterministic, which caused scalable-call.ll to fail on some buildbots. This patch fixes that. This patch also fixes an issue where `Instruction::comesBefore()` is called when two Instructions are in different basic blocks, which would otherwise cause an assertion failure.	2021-07-15 13:16:59 +01:00
Sander de Smalen	efaf3099c8	[LV] Print remark when loop cannot be vectorized due to invalid costs. This patch emits remarks for instructions that have invalid costs for a given set of vectorization factors. Some example output: t.c:4:19: remark: Instruction with invalid costs prevented vectorization at VF=(vscale x 1): load dst[i] = sinf(src[i]); ^ t.c:4:14: remark: Instruction with invalid costs prevented vectorization at VF=(vscale x 1, vscale x 2, vscale x 4): call to llvm.sin.f32 dst[i] = sinf(src[i]); ^ t.c:4:12: remark: Instruction with invalid costs prevented vectorization at VF=(vscale x 1): store dst[i] = sinf(src[i]); ^ Reviewed By: fhahn, kmclaughlin Differential Revision: https://reviews.llvm.org/D105806	2021-07-14 17:11:33 +01:00
Sander de Smalen	d2e4ccc790	[LV] Ignore candidate VFs with invalid costs. This follows on from discussion on the mailing-list: https://lists.llvm.org/pipermail/llvm-dev/2021-June/151047.html to interpret an Invalid cost as 'infinitely expensive', as this simplifies some of the legalization issues with scalable vectors. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D105473	2021-07-12 09:58:22 +01:00
Sander de Smalen	239fcda268	[LV] NFCI: Do cost comparison on InstructionCost directly. Instead of performing the isMoreProfitable() operation on InstructionCost::CostTy the operation is performed on InstructionCost directly, so that it can handle the case where one of the costs is Invalid. This patch also changes the CostTy to be int64_t, so that the type is wide enough to deal with multiplications with e.g. `unsigned MaxTripCount`. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D105113	2021-07-10 11:57:16 +01:00
David Green	38c9a4068d	[TTI] Remove IsPairwiseForm from getArithmeticReductionCost This patch removes the IsPairwiseForm flag from the Reduction Cost TTI hooks, along with some accompanying code for pattern matching reductions from trees starting at extract elements. IsPairWise is now assumed to be false, which was the predominant way that the value was used from both the Loop and SLP vectorizers. Since the adjustments such as D93860, the SLP vectorizer has not relied upon this distinction between pairwise and non-pairwise reductions. This also removes some code that was detecting reduction trees starting from extract elements inside the costmodel. This case was double-counting costs though, adding the individual costs on the individual instruction _and_ the total cost of the reduction. Removing it changes the costs in llvm/test/Analysis/CostModel/X86/reduction.ll to not double count. The cost of reduction intrinsics is still tested through the various tests in llvm/test/Analysis/CostModel/X86/reduce-xyz.ll. Differential Revision: https://reviews.llvm.org/D105484	2021-07-09 11:51:16 +01:00
Philip Reames	723144665b	[LV] Unconditionally branch from middle to scalar preheader if the scalar loop must execute (try 4) Resubmit after the following changes: * Fix a latent bug related to unrolling with required epilogue (see e49d65f). I believe this is the cause of the prior PPC buildbot failure. * Disable non-latch exits for epilogue vectorization to be safe (9ffa90d) * Split out assert movement (600624a) to reduce churn if this gets reverted again. Previous commit message (try 3) Resubmit after fixing test/Transforms/LoopVectorize/ARM/mve-gather-scatter-tailpred.ll Previous commit message... This is a resubmit of 3e5ce4 (which was reverted by 7fe41ac). The original commit caused a PPC build bot failure we never really got to the bottom of. I can't reproduce the issue, and the bot owner was non-responsive. In the meantime, we stumbled across an issue which seems possibly related, and worked around a latent bug in 80e8025. My best guess is that the original patch exposed that latent issue at higher frequency, but it really is just a guess. Original commit message follows... If we know that the scalar epilogue is required to run, modify the CFG to end the middle block with an unconditional branch to scalar preheader. This is instead of a conditional branch to either the preheader or the exit block. The motivation to do this is to support multiple exit blocks. Specifically, the current structure forces us to identify immediate dominators and which exit block to branch from in the middle terminator. For the multiple exit case - where we know require scalar will hold - these questions are ill formed. This is the last change needed to support multiple exit loops, but since the diffs are already large enough, I'm going to land this, and then enable separately. You can think of this as being NFCIish prep work, but the changes are a bit too involved for me to feel comfortable tagging the review that way. Differential Revision: https://reviews.llvm.org/D94892	2021-07-07 07:44:35 -07:00
Dylan Fleming	7215dcfe36	[SVE] Fix ShuffleVector cast<FixedVectorType> in truncateToMinimalBitwidths Depends on D104239 Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D105341	2021-07-07 15:30:10 +01:00
Dylan Fleming	7586b47fb6	[SVE] Fix cast<FixedVectorType> in truncateToMinimalBitwidths Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D104239	2021-07-07 09:58:05 +01:00
Philip Reames	9ffa90d6c2	[LV] Disable epilogue vectorization for non-latch exits When skimming through old review discussion, I noticed a post commit comment on an earlier patch which had gone unaddressed. Better late (4 months), than never right? I'm not aware of an active problem with the combination of non-latch exits and epilogue vectorization, but the interaction was not considered and I'm not motivated to make epilogue vectorization work with early exits. If there were a bug in the interaction, it would be pretty hard to hit right now (as we canonicalize towards bottom tested loops), but an upcoming change to allow multiple exit loops will greatly increase the chance for error. Thus, let's play it safe for now.	2021-07-06 10:57:10 -07:00
Florian Hahn	ef0d147cdc	Recommit "[VPlan] Add VPReductionPHIRecipe (NFC)." and follow-ups. This reverts commit 706bbfb35bd31051e46ac77aab3e9b2dbc3abe78. The committed version moves the definition of VPReductionPHIRecipe out of an ifdef only intended for ::print helpers. This should resolve the build failures that caused the revert	2021-07-06 14:15:42 +01:00
Kerry McLaughlin	a7512401e5	[LV] Prevent vectorization with unsupported element types. This patch adds a TTI function, isElementTypeLegalForScalableVector, to query whether it is possible to vectorize a given element type. This is called by isLegalToVectorizeInstTypesForScalable to reject scalable vectorization if any of the instruction types in the loop are unsupported, e.g: int foo(__int128_t* ptr, int N) #pragma clang loop vectorize_width(4, scalable) for (int i=0; i<N; ++i) ptr[i] = ptr[i] + 42; This example currently crashes if we attempt to vectorize since i128 is not a supported type for scalable vectorization. Reviewed By: sdesmalen, david-arm Differential Revision: https://reviews.llvm.org/D102253	2021-07-06 13:06:21 +01:00
Florian Hahn	706bbfb35b	Revert "[VPlan] Add VPReductionPHIRecipe (NFC)." and follow-ups This reverts commit 3fed6d443f802c43aade1b5b1b09f5e2f8b3edb1, bbcbf21ae60c928e07dde6a1c468763b3209d1e6 and 6c3451cd76cbd0cd973d9c2b08b168dcd0bce3c2. The changes causing build failures with certain configurations, e.g. https://lab.llvm.org/buildbot/#/builders/67/builds/3365/steps/6/logs/stdio lib/libLLVMVectorize.a(LoopVectorize.cpp.o): In function `llvm::VPRecipeBuilder::tryToCreateWidenRecipe(llvm::Instruction, llvm::ArrayRef<llvm::VPValue>, llvm::VFRange&, std::unique_ptr<llvm::VPlan, std::default_delete<llvm::VPlan> >&) [clone .localalias.8]': LoopVectorize.cpp:(.text._ZN4llvm15VPRecipeBuilder22tryToCreateWidenRecipeEPNS_11InstructionENS_8ArrayRefIPNS_7VPValueEEERNS_7VFRangeERSt10unique_ptrINS_5VPlanESt14default_deleteISA_EE+0x63b): undefined reference to `vtable for llvm::VPReductionPHIRecipe' collect2: error: ld returned 1 exit status	2021-07-06 12:10:03 +01:00
Florian Hahn	6c3451cd76	[VPlan] Add VPReductionPHIRecipe (NFC). This patch is a first step towards splitting up VPWidenPHIRecipe into separate recipes for the 3 distinct cases they model: 1. reduction phis, 2. first-order recurrence phis, 3. pointer induction phis. This allows untangling the code generation and allows us to reduce the reliance on LoopVectorizationCostModel during VPlan code generation. Discussed/suggested in D100102, D100113, D104197. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D104989	2021-07-06 11:25:28 +01:00
Kerry McLaughlin	17b701c43c	[LV] Collect a list of all element types found in the loop (NFC) Splits `getSmallestAndWidestTypes` into two functions, one of which now collects a list of all element types found in the loop (`ElementTypesInLoop`). This ensures we do not have to iterate over all instructions in the loop again in other places, such as in D102253 which disables scalable vectorization of a loop if any of the instructions use invalid types. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D105437	2021-07-06 10:37:41 +01:00
Nikita Popov	fabc17192e	[IRBuilder] Add type argument to CreateMaskedLoad/Gather Same as other CreateLoad-style APIs, these need an explicit type argument to support opaque pointers. Differential Revision: https://reviews.llvm.org/D105395	2021-07-04 12:17:59 +02:00
David Sherwood	51b4ab26ca	[NFC] Add new setDebugLocFromInst that uses the class Builder by default In lots of places we were calling setDebugLocFromInst and passing in the same Builder member variable found in InnerLoopVectorizer. I personally found this confusing so I've changed the interface to take an Optional<IRBuilder<> *> and we can now pass in None when we want to use the class member variable. Differential Revision: https://reviews.llvm.org/D105100	2021-07-01 14:23:34 +01:00
David Sherwood	7b7b5b5a26	[NFC] Rename shadowed variable in InnerLoopVectorizer::createInductionVariable Avoid creating a IRBuilder stack variable with the same name as the class member.	2021-06-30 11:11:49 +01:00
Philip Reames	e49d65f36d	[LV] Fix bug when unrolling (only) a loop with non-latch exit If we unroll a loop in the vectorizer (without vectorizing), and the cost model requires a epilogue be generated for correctness, the code generation must actually do so. The included test case on an unmodified opt will access memory one past the expected bound. As a result, this patch is fixing a latent miscompile. Differential Revision: https://reviews.llvm.org/D103700	2021-06-29 08:04:26 -07:00
David Sherwood	9de63367d8	Revert "[NFC] Remove shadowed variable in InnerLoopVectorizer::createInductionVariable" This reverts commit 9dde51416209a5552156384b9c2b08b676818d70.	2021-06-29 15:20:22 +01:00
David Sherwood	9dde514162	[NFC] Remove shadowed variable in InnerLoopVectorizer::createInductionVariable Avoid creating a IRBuilder stack variable with the same name as the class member.	2021-06-29 14:34:30 +01:00
David Sherwood	8a3365fba2	Revert "[NFC] Remove shadowed variable in InnerLoopVectorizer::createInductionVariable" This reverts commit dcfc2c3fac980b137415c17f2f19c06c3e2bd7fb.	2021-06-29 14:04:42 +01:00
Florian Hahn	47215e1c62	[LV] Fix crash when target instruction for sinking is dead. This patch fixes a crash when the target instruction for sinking is dead. In that case, no recipe is created and trying to get the recipe for it results in a crash. To ensure all sink targets are alive, find & use the first previous alive instruction. Note that the case where the sink source is dead is already handled. Found by https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=35320 Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D104603	2021-06-29 13:31:22 +01:00
David Sherwood	303b6d5e98	[LoopVectorize] Add support for scalable vectorization of invariant stores Previously in setCostBasedWideningDecision if we encountered an invariant store we just assumed that we could scalarize the store and called getUniformMemOpCost to get the associated cost. However, for scalable vectors this is not an option because it is not currently possible to scalarize the store. At the moment we crash in VPReplicateRecipe::execute when trying to scalarize the store. Therefore, I have changed setCostBasedWideningDecision so that if we are storing a scalable vector out to a uniform address and the target supports scatter instructions, then we should use those instead. Tests have been added here: Transforms/LoopVectorize/AArch64/sve-inv-store.ll Differential Revision: https://reviews.llvm.org/D104624	2021-06-29 11:56:09 +01:00
David Sherwood	dcfc2c3fac	[NFC] Remove shadowed variable in InnerLoopVectorizer::createInductionVariable Avoid creating a IRBuilder stack variable with the same name as the class member.	2021-06-29 09:14:35 +01:00
Kerry McLaughlin	f99672568f	[LoopVectorize] Fix strict reductions where VF = 1 Currently we will allow loops with a fixed width VF of 1 to vectorize if the -enable-strict-reductions flag is set. However, the loop vectorizer will not use ordered reductions if `VF.isScalar()` and the resulting vectorized loop will be out of order. This patch removes `VF.isVector()` when checking if ordered reductions should be used. Also, instead of converting the FAdds to reductions if the VF = 1, operands of the FAdds are changed such that the order is preserved. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D104533	2021-06-28 11:27:10 +01:00
Florian Hahn	80aa7e147e	[VPlan] Merge predicated-triangle regions, after sinking. Sinking scalar operands into predicated-triangle regions may allow merging regions. This patch adds a VPlan-to-VPlan transform that tries to merge predicate-triangle regions after sinking. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D100260	2021-06-28 11:10:38 +01:00
Florian Hahn	f1a6430272	[VPlan] Track both incoming values for first-order recurrence phis. This patch updates VPWidenPHI recipes for first-order recurrences to also track the incoming value from the back-edge. Similar to D99294, which did the same for reductions. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D104197	2021-06-27 14:29:35 +01:00
Florian Hahn	7f36981977	[LV] Adjust trip count based on IsOrdered in widenPHIInstruction (NFC). Suggested in D104197, avoids the early exit.	2021-06-26 13:13:25 +01:00
Florian Hahn	91053e327c	[LV] Reflow comment for VectorizationCostTy (NFC).	2021-06-25 14:20:06 +01:00
Florian Hahn	833bdbe93c	[LV] Support sinking recipe in replicate region after another region. This patch handles sinking a replicate region after another replicate region. In that case, we can connect the sink region after the target region. This properly handles the case for which an assertion has been added in 337d7652823f. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=34842. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D103514	2021-06-24 13:58:42 +01:00
Roman Lebedev	37dfc467ac	[NFC] LoopVectorizationCostModel::getMaximizedVFForTarget(): clarify debug msg This really isn't talking about vectors in general, but only about either fixed or scalable vectors, and it's pretty confusing to see it state that there aren't any vectors :)	2021-06-17 21:07:34 +03:00
Florian Hahn	80a403348b	[VPlan] Support PHIs as LastInst when inserting scalars in ::get(). At the moment, we create insertelement instructions directly after LastInst when inserting scalar values in a vector in VPTransformState::get. This results in invalid IR when LastInst is a phi, followed by another phi. In that case, the new instructions should be inserted just after the last PHI node in the block. At the moment, I don't think the problematic case can be triggered, but it can happen once predicate regions are merged and multiple VPredInstPHI recipes are in the same block (D100260). Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D104188	2021-06-17 09:36:44 +01:00
Bjorn Pettersson	4c7f820b2b	Update @llvm.powi to handle different int sizes for the exponent This can be seen as a follow up to commit 0ee439b705e82a4fe20e2, that changed the second argument of __powidf2, __powisf2 and __powitf2 in compiler-rt from si_int to int. That was to align with how those runtimes are defined in libgcc. One thing that seem to have been missing in that patch was to make sure that the rest of LLVM also handle that the argument now depends on the size of int (not using the si_int machine mode for 32-bit). When using __builtin_powi for a target with 16-bit int clang crashed. And when emitting libcalls to those rtlib functions, typically when lowering @llvm.powi), the backend would always prepare the exponent argument as an i32 which caused miscompiles when the rtlib was compiled with 16-bit int. The solution used here is to use an overloaded type for the second argument in @llvm.powi. This way clang can use the "correct" type when lowering __builtin_powi, and then later when emitting the libcall it is assumed that the type used in @llvm.powi matches the rtlib function. One thing that needed some extra attention was that when vectorizing calls several passes did not support that several arguments could be overloaded in the intrinsics. This patch allows overload of a scalar operand by adding hasVectorInstrinsicOverloadedScalarOpd, with an entry for powi. Differential Revision: https://reviews.llvm.org/D99439	2021-06-17 09:38:28 +02:00
Simon Pilgrim	5e6bfb661e	[Analysis] Pass RecurrenceDescriptor as const reference. NFCI. We were passing the RecurrenceDescriptor by value to most of the reduction analysis methods, despite it being rather bulky with TrackingVH members (that can be costly to copy). In all these cases we're only using the RecurrenceDescriptor for rather basic purposes (access to types/kinds etc.). Differential Revision: https://reviews.llvm.org/D104029	2021-06-11 10:24:14 +01:00


1400 Commits