749 Commits

Author SHA1 Message Date
Florian Hahn
ef89e3efa9
[VPlan] Collect ephemeral values for VPlan.
Port collectEphemeralValues to VPlan as collectEphemeralRecipesForVPlan,
use it in willGenerateVectors. This fixes a regression caused by
29b8b72117 for loops where the only vector values are ephemeral.
2024-07-09 21:34:49 +01:00
Florian Hahn
27ccc8835e
[LV] Add tests with ephemeral values that are widened.
Add tests with loops with ephemeral values that are widened.
After 29b8b72117, @ephemeral_load_and_compare_another_load_used_outside
is vectorized even though the only vector values that are generated are
ephemeral.
2024-07-08 13:15:39 +01:00
Florian Hahn
29b8b72117
[LV] Move check if any vector insts will be generated to VPlan. (#96622)
This patch moves the check if any vector instructions will be generated
from getInstructionCost to be based on VPlan. This simplifies
getInstructionCost, is more accurate as we check the final result and
also allows us to exit early once we visit a recipe that generates
vector instructions.

The helper can then be re-used by the VPlan-based cost model to match
the legacy selectVectorizationFactor behavior, thus fixing a crash and
paving the way to recommit
https://github.com/llvm/llvm-project/pull/92555.

PR: https://github.com/llvm/llvm-project/pull/96622
2024-07-07 20:08:01 +01:00
Florian Hahn
99d6c6d936
[VPlan] Model branch cond to enter scalar epilogue in VPlan. (#92651)
This patch moves branch condition creation to enter the scalar epilogue
loop to VPlan. Modeling the branch in the middle block also requires
modeling the successor blocks. This is done using the recently
introduced VPIRBasicBlock.

Note that the middle.block is still created as part of the skeleton and
then patched in during VPlan execution. Unfortunately the skeleton needs
to create the middle.block early on, as it is also used for induction
resume value creation and is also needed to properly update the
dominator tree during skeleton creation.

After this patch lands, I plan to move induction resume value and phi
node creation in the scalar preheader to VPlan. Once that is done, we
should be able to create the middle.block in VPlan directly.

This is a re-worked version based on the earlier
https://reviews.llvm.org/D150398 and the main change is the use of
VPIRBasicBlock.

Depends on https://github.com/llvm/llvm-project/pull/92525

PR: https://github.com/llvm/llvm-project/pull/92651
2024-07-05 10:08:42 +01:00
Noah Goldstein
7c96469ea8 [ValueTracking] Extend LHS/RHS with matching operand to work without constants.
Previously we only handled the `L0 == R0` case if both `L1` and `R1`
were constant.

We can get more out of the analysis using general constant ranges
instead.

For example, `X u> Y` implies `X != 0`.

In general, any strict comparison on `X` implies that `X` is not equal
to the boundary value for the sign and constant ranges with/without
sign bits can be useful in deducing implications.

Closes #85557
2024-07-03 20:18:51 +08:00
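
The implication quoted in the ValueTracking commit above (7c96469ea8) can be checked by brute force. The following is a minimal sketch, not the ValueTracking implementation: the 8-bit width and the helper names are assumptions made for illustration only.

```python
BITS = 8  # illustrative width; not taken from the commit
MASK = (1 << BITS) - 1


def to_signed(v):
    """Reinterpret an unsigned BITS-wide value as two's-complement signed."""
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v


def strict_compare_excludes_boundary():
    """Return True if every strict comparison on X rules out the boundary value."""
    signed_min = -(1 << (BITS - 1))
    signed_max = (1 << (BITS - 1)) - 1
    for x in range(MASK + 1):
        for y in range(MASK + 1):
            if x > y and x == 0:  # X u> Y implies X != 0
                return False
            if x < y and x == MASK:  # X u< Y implies X != UINT_MAX
                return False
            sx, sy = to_signed(x), to_signed(y)
            if sx > sy and sx == signed_min:  # X s> Y implies X != INT_MIN
                return False
            if sx < sy and sx == signed_max:  # X s< Y implies X != INT_MAX
                return False
    return True
```

The exhaustive loop only covers one small bit width; it shows why the boundary value is excluded, not how the analysis derives it from constant ranges.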
David Green
352a836176
[InstCombine] Canonicalize non-i8 gep of mul to i8 (#96606)
This is a small canonicalization for `gep i32, p, (mul x, C)` -> `gep
i8, p, (mul x, C*4)`, so that the mul can combine both of the constant
multiplications, and we take a small step towards canonicalizing more
geps to i8.

It currently doesn't attempt to check for multiple uses on the mul, but
that should be possible if it sounds better. Let me know what you think
of the idea in general.
2024-06-26 14:25:54 +01:00
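
As a rough illustration of why the canonicalization in 352a836176 preserves the address, the sketch below compares the byte offsets of the two forms. It assumes a 64-bit index with wrapping arithmetic and ignores any flags on the `mul` or `gep`; the function names are hypothetical.

```python
WIDTH = 64  # assumed index width
MASK = (1 << WIDTH) - 1


def gep_i32_offset(x, c):
    """Byte offset of `gep i32, p, (mul x, C)`: the index is scaled by 4."""
    index = (x * c) & MASK
    return (index * 4) & MASK


def gep_i8_offset(x, c):
    """Byte offset of `gep i8, p, (mul x, C*4)`: the index is already in bytes."""
    folded = (c * 4) & MASK  # the two constant multiplications combined
    return (x * folded) & MASK
```

For example, `gep_i32_offset(3, 5)` and `gep_i8_offset(3, 5)` both give 60.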
Florian Hahn
8681bb8bed
[LV] Add additional test coverage for cost modeling.
Add missing tests uncovered by
https://github.com/llvm/llvm-project/pull/92555.

Includes test for https://github.com/llvm/llvm-project/issues/96294 and
https://github.com/llvm/llvm-project/issues/96328
2024-06-26 10:18:01 +01:00
Nikita Popov
eeb0884e66 [LoopUnroll] Use poison instead of undef for preheader value 2024-06-25 12:09:58 +02:00
Florian Hahn
3808ba78de
[VPlan] Model middle block via VPIRBasicBlock. (#95816)
Use VPIRBasicBlock to wrap the middle block and implement patching up
branches in predecessors in VPIRBasicBlock::execute. The IR middle block
is only created after skeleton creation. Initially a regular
VPBasicBlock is created, which will later be replaced by a
VPIRBasicBlock once the middle IR basic block has been created.

Note that this slightly changes the order of instructions created in the
middle block; code generated by recipe execution in the middle block
will now be inserted before the terminator (and between the terminator
and the compare it uses). The original order will be restored in
https://github.com/llvm/llvm-project/pull/92651.


PR: https://github.com/llvm/llvm-project/pull/95816
2024-06-20 13:42:20 +01:00
Florian Hahn
b9702bb12f
[LV] Consider insts feeding interleave group pointers free.
For interleave groups, we only generate a pointer for the start of the
interleave group (the instruction at the insert position). The other
addresses for other members are already considered free, but so are
their operands, if they are only used in address computations for
other interleave group members.
2024-06-19 17:06:52 +01:00
Florian Hahn
3be7312f81
[LV] Add more masked store cost tests with different masks.
Add additional masked store tests which caused crashes with earlier
versions of https://github.com/llvm/llvm-project/pull/92555.
2024-06-19 15:34:03 +01:00
Florian Hahn
fb86cb7ec1
[LV] Add extra tests for interleave-group, reduction store costing.
Add extra cost model tests exposed by VPlan cost-model transition,
causing revert in 6f538f6a2d3224efda985e9eb09012fa4275ea92
2024-06-18 14:35:51 +01:00
Florian Hahn
52d29eb287
[LV] Add extra cost model tests with truncated inductions.
Extra test cases that caused revert of
https://github.com/llvm/llvm-project/pull/92555
2024-06-13 20:42:53 +01:00
Florian Hahn
2e4c06780c
[LV] Add extra X86 cost tests for any_of reduction and multi-exit loops.
Add extra test coverage to ensure decisions do not change when
transitioning to a VPlan-based cost model.
2024-06-10 13:13:04 +01:00
Florian Hahn
998c33e5fc
[VPlan] Mark FirstOrderRecurrenceSplice as not having side-effects.
Now that FOR exit and resume value creation is explicitly modeled in
VPlan (05e1b5340b0caf1, 07b330132c0b) it doesn't depend on the first
order recurrence splice being preserved and it can now be marked as not
having side-effects. This allows removal of first-order-recurrence-splice
if the FOR is only used in the exit or as scalar ph resume value.
2024-06-08 21:40:30 +01:00
Florian Hahn
a43d999d14
[VPlan] Check if only first part is used for all per-part VPInsts.
Apply the onlyFirstPartUsed logic generally to all per-part
VPInstructions. Note that the test changes remove the second part
of an unused first-order recurrence splice.
2024-06-08 20:31:54 +01:00
Farzon Lotfi
1d87433593
[x86] Add tan intrinsic part 4 (#90503)
This change is an implementation of #87367's investigation on supporting
IEEE math operations as intrinsics,
which was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294


Much of this change was following how G_FSIN and G_FCOS were used.

Changes:
- `llvm/docs/GlobalISel/GenericOpcode.rst` - Document the `G_FTAN`
opcode
- `llvm/docs/LangRef.rst` - Document the tan intrinsic
- `llvm/include/llvm/Analysis/VecFuncs.def` - Associate the tan
intrinsic as a vector function similar to the tanf libcall.
- `llvm/include/llvm/CodeGen/BasicTTIImpl.h` - Map the tan intrinsic to
`ISD::FTAN`
- `llvm/include/llvm/CodeGen/ISDOpcodes.h` - Define ISD opcodes for
`FTAN` and `STRICT_FTAN`
- `llvm/include/llvm/IR/Intrinsics.td` - Create the tan intrinsic
- `llvm/include/llvm/IR/RuntimeLibcalls.def` - Define tan libcall
mappings
- `llvm/include/llvm/Target/GenericOpcodes.td` - Define the `G_FTAN`
Opcode
- `llvm/include/llvm/Support/TargetOpcodes.def` - Create a `G_FTAN`
Opcode handler
- `llvm/include/llvm/Target/GlobalISel/SelectionDAGCompat.td` - Map
`G_FTAN` to `ftan`
- `llvm/include/llvm/Target/TargetSelectionDAG.td` - Define `ftan`,
`strict_ftan`, and `any_ftan` and map them to the ISD opcodes for `FTAN`
and `STRICT_FTAN`
- `llvm/lib/Analysis/VectorUtils.cpp` - Associate the tan intrinsic as a
vector intrinsic
- `llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp` - Map the tan intrinsic
to `G_FTAN` Opcode
- `llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp` - Add `G_FTAN` to
the list of floating point math operations also associate `G_FTAN` with
the `TAN_F` runtime lib.
- `llvm/lib/CodeGen/GlobalISel/Utils.cpp` - More floating point math
operation common behaviors.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp` - List the function
expansion operations for `FTAN` and `STRICT_FTAN`. Also define both
opcodes in `PromoteNode`.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp` - More `FTAN`
and `STRICT_FTAN` handling in the legalizer
- `llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h` - Define
`SoftenFloatRes_FTAN` and `ExpandFloatRes_FTAN`.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp` - Define `FTAN`
as a legal vector operation.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp` - Define
`FTAN` as a legal vector operation.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp` - define tan as an
intrinsic that doesn't return NaN.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp` - Map
`LibFunc_tan`, `LibFunc_tanf`, and `LibFunc_tanl` to `ISD::FTAN`. Map
`Intrinsic::tan` to `ISD::FTAN` and add selection dag handling for
`Intrinsic::tan`.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp` - Define `ftan`
and `strict_ftan` names for the equivalent ISD opcodes.
- `llvm/lib/CodeGen/TargetLoweringBase.cpp` - Define a Tan128 libcall and
ISD::FTAN as a target lowering action.
- `llvm/lib/Target/X86/X86ISelLowering.cpp` - Add x86_64 lowering for
tan intrinsic

resolves https://github.com/llvm/llvm-project/issues/70082
2024-06-05 15:01:33 -04:00
Florian Hahn
05e1b5340b
[VPlan] Model FOR resume value extraction in VPlan. (#93396)
This patch uses the ExtractFromEnd VPInstruction opcode
to extract the value of a FOR to be used as resume value for the ph in
the scalar loop.

It adds a new live-out that temporarily wraps the FOR phi in the scalar
loop. fixFixedOrderRecurrence will process live outs for fixed order
recurrence phis by creating a new phi node in the scalar preheader, 
using the generated value for the live-out as incoming value from the
middle block and the original start value as incoming value for the
other edge. Creation of the phi in the preheader, as well as updating
the phi in the scalar loop will also be moved to VPlan in the future,
eventually retiring fixFixedOrderRecurrence.

Depends on https://github.com/llvm/llvm-project/pull/93395

PR: https://github.com/llvm/llvm-project/pull/93396
2024-06-05 11:18:06 +01:00
Florian Hahn
e949b54a5b
[LAA] Use PSE::getSymbolicMaxBackedgeTakenCount. (#93499)
Update LAA to use PSE::getSymbolicMaxBackedgeTakenCount which returns
the minimum of the countable exits.

When analyzing dependences and computing runtime checks, we need the
smallest upper bound on the number of iterations. In terms of memory
safety, it shouldn't matter if any uncomputable exits leave the loop,
as long as we prove that there are no dependences given the minimum of
the countable exits. The same should apply also for generating runtime
checks.

Note that this shifts the responsibility of checking whether all exit
counts are computable or handling early-exits to the users of LAA.

Depends on https://github.com/llvm/llvm-project/pull/93498

PR: https://github.com/llvm/llvm-project/pull/93499
2024-06-04 22:23:30 +01:00
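
The "minimum of the countable exits" described in e949b54a5b can be pictured with a toy model. This is only a sketch under strong assumptions: exit counts are concrete integers rather than SCEV expressions, `None` stands for an uncomputable exit, and the function name is hypothetical rather than the PSE API.

```python
def min_countable_exit_count(exit_counts):
    """Return the smallest computable exit count, or None if there is none.

    `exit_counts` holds one entry per loop exit: an int when the exit count
    is computable, None when it is not (hypothetical encoding).
    """
    computable = [count for count in exit_counts if count is not None]
    return min(computable) if computable else None
```

With exits `[10, None, 7]` the bound is 7: the uncomputable exit may leave the loop earlier, but never later than the smallest countable one, which is why the bound is enough for dependence analysis and runtime checks.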
Florian Hahn
f7e63e8b46
[LV] Operands feeding pointers of interleave member pointers are free.
For interleave groups we only create a pointer for the start of the
interleave group, not all original loads/stores. Mark single-use ops
feeding interleave group mem ops as free when vectorizing.
2024-06-01 13:59:29 +01:00
Florian Hahn
4c6367b3e5
[LV] Add test with strided interleave groups and maximizing bandwidth. 2024-06-01 12:26:00 +01:00
Freddy Ye
4def1ce101
Reland "[X86] Remove knl/knm specific ISAs supports (#92883)" (#93136)
This reverts commit aa4069ea96e5eb62bc8c7895b9d920f129611b3a.
2024-05-24 13:46:34 +08:00
Freddy Ye
aa4069ea96
Revert "[X86] Remove knl/knm specific ISAs supports (#92883)" (#93123)
This reverts commit 282d2ab58f56c89510f810a43d4569824a90c538.
2024-05-23 10:25:23 +08:00
Freddy Ye
282d2ab58f
[X86] Remove knl/knm specific ISAs supports (#92883)
Cont. patch after https://github.com/llvm/llvm-project/pull/75580
2024-05-23 09:46:44 +08:00
Nikita Popov
8e8d2595da
[ConstantFolding] Canonicalize constexpr GEPs to i8 (#89872)
This patch canonicalizes constant expression GEPs to use i8 source
element type, aka ptradd. This is the ConstantFolding equivalent of the
InstCombine canonicalization introduced in #68882.

I believe all our optimizations working on constant expression GEPs
(like GlobalOpt etc) have already been switched to work on offsets, so I
don't expect any significant fallout from this change.

This is part of:
https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699
2024-05-20 11:47:30 +02:00
Florian Hahn
67d840b60f
[VPlan] Relax over-aggressive assertion in VPTransformState::get().
There are cases where a vector value has some users that demand the
single scalar value only (NeedsScalar), while other users demand the
vector value (see attached test cases). In those cases, the NeedsScalar
users should only demand the first lane.

Fixes https://github.com/llvm/llvm-project/issues/91883.
2024-05-14 19:10:49 +01:00
Florian Hahn
55fc5eb95f
[LV] Add additional cost model tests with inductions and truncates.
Add test coverage for additional cases not covered by current tests with
multiple inductions and truncates.
2024-04-23 09:24:01 +01:00
Ramkumar Ramachandra
73e7f2ff70
LoopVectorize: guard marking iv as scalar; fix bug (#88730)
When collecting loop scalars, LoopVectorize over-eagerly marks the
induction variable and its update as scalars after vectorization, even
if the induction variable update is a first-order recurrence. Guard the
process with this check, fixing a crash.

Fixes #72969.
2024-04-18 14:41:07 +01:00
Ramkumar Ramachandra
63d8058ef5
LoopVectorize: guard appending InstsToScalarize; fix bug (#88720)
In the process of collecting instructions to scalarize, LoopVectorize
uses faulty reasoning whereby it also adds instructions that will be
scalar after vectorization. If an instruction satisfies
isScalarAfterVectorization() for the given VF, it should not be appended
to InstsToScalarize. Add this extra guard, fixing a crash.

Fixes #55096.
2024-04-18 10:03:07 +01:00
Florian Hahn
a9bafe91dd
[VPlan] Split VPWidenMemoryInstructionRecipe (NFCI). (#87411)
This patch introduces a new VPWidenMemoryRecipe base class and distinct
sub-classes to model loads and stores.

This is a first step in an effort to simplify and modularize code
generation for widened loads and stores and enable adding further more
specialized memory recipes.

PR: https://github.com/llvm/llvm-project/pull/87411
2024-04-17 11:00:58 +01:00
Noah Goldstein
b6bd41db31 [InstCombine] Add canonicalization of sitofp -> uitofp nneg
This is essentially the same as #82404 but has the `nneg` flag which
allows the backend to reliably undo the transform.

Closes #88299
2024-04-16 15:26:25 -05:00
Yingwei Zheng
b109477615
[InstCombine] Infer nsw/nuw for trunc (#87910)
This patch adds support for inferring trunc's nsw/nuw flags.
2024-04-11 19:10:53 +08:00
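
For reference, the conditions under which `trunc` may carry `nuw` or `nsw` can be checked on concrete values. This sketch follows the flag semantics as documented for the `trunc` instruction; it does not reproduce how the patch in b109477615 infers the flags at compile time, and the helper names are assumptions.

```python
def trunc_is_nuw(value, src_bits, dst_bits):
    """True if every truncated (high) bit of `value` is zero."""
    value &= (1 << src_bits) - 1
    return value >> dst_bits == 0


def trunc_is_nsw(value, src_bits, dst_bits):
    """True if every truncated bit equals the sign bit of the truncated result."""
    value &= (1 << src_bits) - 1
    high = value >> dst_bits
    sign = (value >> (dst_bits - 1)) & 1
    all_ones = (1 << (src_bits - dst_bits)) - 1
    return high == (all_ones if sign else 0)
```

For an i16 to i8 truncation, `0x00FF` satisfies `nuw` but not `nsw`, while `0xFFFF` (that is, -1) satisfies `nsw` but not `nuw`.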
Florian Hahn
c836983671
[VPlan] Remove unused first mask op from VPBlendRecipe. (#87770)
VPBlendRecipe does not use the first mask operand. Removing it allows
VPlan-based DCE to remove unused mask computations.

This also fixes #87410, where unused Not VPInstructions are considered
having only their first lane demanded, but some of their operands
providing a vector value due to other users.

Fixes https://github.com/llvm/llvm-project/issues/87410

PR: https://github.com/llvm/llvm-project/pull/87770
2024-04-09 11:14:05 +01:00
Florian Hahn
233c030dcb
[LV] Add extra tests for induction cost modeling. 2024-04-06 12:36:07 +01:00
Alexey Bataev
413a66f339
[LV, VP]VP intrinsics support for the Loop Vectorizer + adding new tail-folding mode using EVL. (#76172)
This patch introduces generating VP intrinsics in the Loop Vectorizer.

Currently the Loop Vectorizer supports vector predication in a very
limited capacity via tail-folding and masked load/store/gather/scatter
intrinsics. However, this does not let architectures with active vector
length predication support take advantage of their capabilities.
Architectures with general masked predication support also can only take
advantage of predication on memory operations. Having a way for the
Loop Vectorizer to generate Vector Predication intrinsics, which (will)
provide a target-independent way to model predicated vector
instructions, lets these architectures make better use of their
predication capabilities.

Our first approach (implemented in this patch) builds on top of the
existing tail-folding mechanism in the LV (just adds a new tail-folding
mode using EVL), but instead of generating masked intrinsics for memory
operations it generates VP intrinsics for load/store instructions. The
patch adds a new VPlanTransforms to replace the wide header predicate
compare with EVL and updates codegen for load/stores to use VP
store/load with EVL.

Another important part of this approach is how the Explicit Vector Length
is computed. (VP intrinsics define this vector length parameter as
Explicit Vector Length (EVL)). We use an experimental intrinsic
`get_vector_length`, that can be lowered to architecture specific
instruction(s) to compute EVL.

Also, added a new recipe to emit instructions for computing EVL. Using
VPlan in this way will eventually help build and compare VPlans
corresponding to different strategies and alternatives.

Differential Revision: https://reviews.llvm.org/D99750
2024-04-04 18:30:17 -04:00
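
The EVL-based tail folding described in 413a66f339 can be pictured as a loop that asks, on every iteration, how many lanes to process. The sketch below is a scalar toy model, not generated IR: `get_vector_length` here is a hypothetical stand-in that simply returns `min(remaining, vf)`, which is an assumption about one possible lowering, and list slices stand in for VP loads and stores.

```python
def get_vector_length(remaining, vf):
    """Toy stand-in for the EVL computation (assumed to be min(remaining, vf))."""
    return min(remaining, vf)


def evl_vector_add(a, b, vf=4):
    """Add two equal-length lists in chunks of at most `vf` lanes.

    Returns the result and the EVL used on each iteration.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    out = [0] * len(a)
    evls = []
    i = 0
    while i < len(a):
        evl = get_vector_length(len(a) - i, vf)
        # Stand-in for VP loads of both inputs, an add, and a VP store,
        # all limited to the first `evl` lanes.
        out[i:i + evl] = [x + y for x, y in zip(a[i:i + evl], b[i:i + evl])]
        evls.append(evl)
        i += evl
    return out, evls
```

With ten elements and `vf=4`, the loop runs with EVLs 4, 4 and 2, so the tail needs neither a mask nor a scalar epilogue.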
Florian Hahn
6ef829941b
Recommit "[VPlan] Replace disjoint or with add instead of dropping disjoint. (#83821)"
Recommit with a fix for the use-after-free causing the revert.
This reverts the revert commit f872043e055f4163c3c4b1b86ca0354490174987.

Original commit message:

Dropping disjoint from an OR may yield incorrect results, as some
analysis may have converted it to an Add implicitly (e.g. SCEV used for
dependence analysis). Instead, replace it with an equivalent Add.

This is possible as all users of the disjoint OR only access lanes where
the operands are disjoint or poison otherwise.

Note that replacing all disjoint ORs with ADDs instead of dropping the
flags is not strictly necessary. It is only needed for disjoint ORs that
SCEV treated as ADDs, but those are not tracked.

There are other places that may drop poison-generating flags; those
likely need similar treatment.

Fixes https://github.com/llvm/llvm-project/issues/81872

PR: https://github.com/llvm/llvm-project/pull/83821
2024-03-27 19:11:18 +00:00
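
The equivalence that 6ef829941b relies on, namely that an OR whose operands share no set bits computes the same value as an ADD, can be verified exhaustively for a small width. This is an illustrative check only; the 8-bit width and the function name are assumptions.

```python
def disjoint_or_equals_add(bits=8):
    """Return True if x | y == x + y for all `bits`-wide x, y with x & y == 0."""
    mask = (1 << bits) - 1
    for x in range(mask + 1):
        for y in range(mask + 1):
            if x & y == 0 and (x | y) != ((x + y) & mask):
                return False
    return True
```

The disjointness condition matters: for `x = 3, y = 1` the operands overlap, and `3 | 1` is 3 while `3 + 1` is 4.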
Florian Hahn
06bb8c9f20
[VPlan] Explicitly handle scalar pointer inductions. (#83068)
Add a new PtrAdd opcode to VPInstruction that corresponds to
IRBuilder::CreatePtrAdd, which creates a GEP with source element type
i8.

This is then used to model scalarizing VPWidenPointerInductionRecipe by
introducing scalar-steps to model the index increment followed by a
PtrAdd.

Note that PtrAdd needs to be able to generate code for only the first
lane or for all lanes. This may warrant introducing a separate recipe
for scalarizing that can be created without relying on the underlying
IR.

Depends on https://github.com/llvm/llvm-project/pull/80271

PR: https://github.com/llvm/llvm-project/pull/83068
2024-03-26 16:01:57 +01:00
Benjamin Kramer
f872043e05 Revert "[VPlan] Replace disjoint or with add instead of dropping disjoint. (#83821)"
This reverts commit c2c1e6ee4ce0df3d000ba880fa6cf58441da6462. It creates
a use after free.

==8342==ERROR: AddressSanitizer: heap-use-after-free on address 0x50f000001760 at pc 0x55b9fb84a8fb bp 0x7ffc18468a10 sp 0x7ffc18468a08
READ of size 1 at 0x50f000001760 thread T0
 #0 0x55b9fb84a8fa in dropPoisonGeneratingFlags llvm/lib/Transforms/Vectorize/VPlan.h:1040:13
 #1 0x55b9fb84a8fa in llvm::VPlanTransforms::dropPoisonGeneratingRecipes(llvm::VPlan&, llvm::function_ref<bool (llvm::BasicBlock*)>)::$_0::operator()(llvm::VPRecipeBase*) const llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:1236:23
 #2 0x55b9fb84a196 in llvm::VPlanTransforms::dropPoisonGeneratingRecipes(llvm::VPlan&, llvm::function_ref<bool (llvm::BasicBlock*)>) llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp

Can be reproduced with asan on
Transforms/LoopVectorize/AArch64/sve-interleaved-masked-accesses.ll
Transforms/LoopVectorize/X86/pr81872.ll
Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll
2024-03-20 15:14:58 +01:00
Noah Goldstein
6960ace534 Revert "[InstCombine] Canonicalize (sitofp x) -> (uitofp x) if x >= 0"
This reverts commit d80d5b923c6f611590a12543bdb33e0c16044d44.

It wasn't a particularly important transform to begin with and caused
some codegen regressions on targets that prefer `sitofp` so dropping.

Might re-visit along with adding `nneg` flag to `uitofp` so it's easily
reversible for the backend.
2024-03-20 00:50:45 -05:00
Florian Hahn
c2c1e6ee4c
[VPlan] Replace disjoint or with add instead of dropping disjoint. (#83821)
Dropping disjoint from an OR may yield incorrect results, as some
analysis may have converted it to an Add implicitly (e.g. SCEV used for
dependence analysis). Instead, replace it with an equivalent Add.

This is possible as all users of the disjoint OR only access lanes where
the operands are disjoint or poison otherwise.

Note that replacing all disjoint ORs with ADDs instead of dropping the
flags is not strictly necessary. It is only needed for disjoint ORs that
SCEV treated as ADDs, but those are not tracked.

There are other places that may drop poison-generating flags; those
likely need similar treatment.

Fixes https://github.com/llvm/llvm-project/issues/81872


PR: https://github.com/llvm/llvm-project/pull/83821
2024-03-19 20:16:18 +01:00
Noah Goldstein
d80d5b923c [InstCombine] Canonicalize (sitofp x) -> (uitofp x) if x >= 0
Just a standard canonicalization.

Proofs: https://alive2.llvm.org/ce/z/9W4VFm

Closes #82404
2024-03-13 18:26:21 -05:00
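
The canonicalization in d80d5b923c rests on signed and unsigned conversion agreeing whenever the sign bit is clear. A minimal sketch, assuming a 32-bit source integer and using Python's `float` as the destination type; the helper names mirror the IR opcodes but are otherwise hypothetical.

```python
def sitofp(pattern, bits=32):
    """Convert a `bits`-wide bit pattern to float, reading it as signed."""
    pattern &= (1 << bits) - 1
    signed = pattern - (1 << bits) if pattern >> (bits - 1) else pattern
    return float(signed)


def uitofp(pattern, bits=32):
    """Convert a `bits`-wide bit pattern to float, reading it as unsigned."""
    return float(pattern & ((1 << bits) - 1))
```

The two functions agree on every pattern below `2**31` and differ once the sign bit is set, for instance on `0xFFFFFFFF`.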
annamthomas
866ac9a165
[LV] Address postcommit review for PR84782 (#84797)
This testcase was added to show miscompile in
https://github.com/llvm/llvm-project/issues/81872
2024-03-11 13:23:00 -04:00
annamthomas
34acdb3ec2
Precommit testcase for pr81872 (#84782)
Testcase shows miscompile when dropping disjoint flag from disjoint or
during vectorization.
2024-03-11 12:16:52 -04:00
Florian Hahn
911055e34f
[VPlan] Consistently use (Part, 0) for first lane scalar values (#80271)
At the moment, some VPInstructions create only a single scalar value,
but use VPTransformState's 'vector' storage for this value. Those
values are effectively uniform-per-VF (or in some cases
uniform-across-VF-and-UF). Using the vector/per-part storage doesn't
interact well with other recipes that more accurately use (Part,
Lane) to look up scalar values, and prevents VPInstructions creating
scalars from interacting with other recipes working with scalars.

This PR tries to unify handling of scalars by using (Part, 0) for scalar
values where only the first lane is demanded. This allows using
VPInstructions with other recipes like VPScalarCastRecipe and is also
needed when using VPInstructions in more cases outside the vector loop
region to generate scalars.

Depends on https://github.com/llvm/llvm-project/pull/80269
2024-02-26 19:06:43 +00:00
Benjamin Kramer
e7c60915e6 Remove duplicated REQUIRES: asserts 2024-02-23 12:01:30 +01:00
Ramkumar Ramachandra
f5c8e9e531
LoopVectorize/test: guard pr72969 with asserts (#82653)
Follow up on 695a9d8 (LoopVectorize: add test for crash in #72969) to
guard pr72969.ll with REQUIRES: asserts, in order to be reasonably
confident that it will crash reliably.
2024-02-22 19:55:18 +00:00
Benjamin Kramer
3168af56bc LoopVectorize: Mark crash test as requiring assertions 2024-02-22 20:25:58 +01:00
Ramkumar Ramachandra
695a9d84dc
LoopVectorize: add test for crash in #72969 (#74111) 2024-02-22 16:00:33 +00:00
Rohit Aggarwal
36adfec155
Adding support of AMDLIBM vector library (#78560)
Hi,

AMD has its own implementation of vector calls. This patch includes the
changes to enable the use of AMD's math library using -fveclib=AMDLIBM.
Please refer to https://github.com/amd/aocl-libm-ose

---------

Co-authored-by: Rohit Aggarwal <Rohit.Aggarwal@amd.com>
2024-02-15 12:13:07 +05:30
Nilanjana Basu
c1c5b854ad
[LV] Remove loop trip count threshold for deciding whether to interleave a loop (#67725)
A set of microbenchmarks (https://github.com/llvm/llvm-test-suite/pull/26) showed that loop interleaving can be beneficial for loops with low trip count as well. Loop interleaving count computation is updated accordingly in prior patches while this patch removes the loop trip count threshold for interleaving.
2024-02-05 17:23:58 -08:00