llvm-project

Author	SHA1	Message	Date
Alexandros Lamprineas	6c2ad8ac7b	[TLI][NFC] Autogenerate vectorized call tests for SLEEF/ArmPL. (#76146 ) This patch prepares the ground for #76060. * Unifies ArmPL and SLEEF tests for better coverage * Replaces deprecated float* and double* types with ptr * Adds noalias attribute to pointer arguments * Adds some cmd-line options to the RUN lines to simplify output * Removes datalayout since target triple is provided * Removes checks for return statements * Refactors the regex filter for autogenerated checks * Removes redundant test file suffix (already under the AArch64 dir)	2023-12-22 16:29:18 +00:00
Paschalis Mpeis	2349731992	[TLI] Add SLEEFGNUABI mappings for fmod/fmodf fixed-width. (#75803 ) Cleanup test sleef-calls-aarch64.ll: - make the util update script's regex more clear - eliminate scalar epilogues in tests	2023-12-20 09:08:17 +00:00
Nikita Popov	a5f3415533	[InstCombine] Replace non-demanded undef vector with poison If an operand (esp to shufflevector or insertelement) is not demanded, canonicalize it from undef to poison.	2023-12-18 16:12:37 +01:00
Shih-Po Hung	b97c5a9554	[VPlan] Add a test for testing unused interleave recipes (#75026 ) - Precommit of tests from #71360. - Replace `undef` pointer operands and add stores to avoid the loads being optimized away.	2023-12-14 21:16:11 +08:00
Simon Pilgrim	b7fc78255e	Revert rG2047ab00eaf0a17e71ce5e8a5b27a8c90f034c3d "[VPlan] Add a test for testing unused interleave recipes (#75026 )" vplan-unused-interleave-group.ll is causing buildbot failures	2023-12-14 10:25:41 +00:00
Shih-Po Hung	2047ab00ea	[VPlan] Add a test for testing unused interleave recipes (#75026 ) - Precommit of tests from #71360. - Replace `undef` pointer operands and add stores to avoid the loads being optimized away.	2023-12-14 17:36:58 +08:00
Nilanjana Basu	41a3828838	[LV] Added pre-commit tests for changing loop interleaving count computation (#74689 ) Added more pre-commit tests for evaluating changes to loop interleaving count computation in (https://github.com/llvm/llvm-project/pull/73766). The new set of tests address the change in IC computation to minimize the remainder TC of the vectorized loop while maximizing the IC when the remainder TC is the same.	2023-12-12 11:09:25 +05:30
Florian Hahn	a5891fa4d2	[VPlan] Initial modeling of VF * UF as VPValue. (#74761 ) This patch starts initial modeling of VF * UF in VPlan. Initially, introduce a dedicated VFxUF VPValue, which is then populated during VPlan::prepareToExecute. Initially, the VF * UF applies only to the main vector loop region. Once we extend the scope of VPlan in the future, we may want to associate different VFxUFs with different vector loop regions (e.g. the epilogue vector loop) This allows explicitly parameterizing recipes that rely on the VF * UF, like the canonical induction increment. At the moment, this mainly helps to avoid generating some duplicated calls to vscale with scalable vectors. It should also allow using EVL as induction increments explicitly in D99750. Referring to VF * UF is also needed in other places that we plan to migrate to VPlan, like the minimum trip count check during skeleton creation. The first version creates the value for VF * UF directly in prepareToExecute to limit the scope of the patch. A follow-on patch will model VF * UF computation explicitly in VPlan using recipes. Moved from Phabricator (https://reviews.llvm.org/D157322)	2023-12-08 18:30:30 +00:00
Florian Hahn	5ea6a3fc6d	[VPlan] Compute scalable VF in preheader for induction increment. (#74762 ) UF * VF is loop invariant and can be computed directly in the preheader. This prepares the code for #74761 and reduces the test changes.	2023-12-08 12:18:31 +00:00
Graham Hunter	d0d5ef8133	[LV] Add support for linear arguments for vector function variants (#73941 ) If we have vectorized variants of a function which take linear parameters, we should be able to vectorize assuming the strides match.	2023-12-08 10:24:05 +00:00
Nikita Popov	d77067d08a	[ValueTracking] Add dominating condition support in computeKnownBits() (#73662 ) This adds support for using dominating conditions in computeKnownBits() when called from InstCombine. The implementation uses a DomConditionCache, which stores which branches may provide information that is relevant for a given value. DomConditionCache is similar to AssumptionCache, but does not try to do any kind of automatic tracking. Relevant branches have to be explicitly registered and invalidated values explicitly removed. The necessary tracking is done inside InstCombine. The reason why this doesn't just do exactly the same thing as AssumptionCache is that a lot more transforms touch branches and branch conditions than assumptions. AssumptionCache is an immutable analysis and mostly gets away with this because only a handful of places have to register additional assumptions (mostly as a result of cloning). This is very much not the case for branches. This change regresses compile-time by about 0.2%. It also improves stage2-O0-g builds by about 0.2%, which indicates that this change results in additional optimizations inside clang itself. Fixes https://github.com/llvm/llvm-project/issues/74242.	2023-12-06 14:17:18 +01:00
Graham Hunter	f0f899932b	[LV] Linear argument tests for vectorization of function calls (#73936 ) Tests to exercise vectorization of function calls where a vector variant takes a linear parameter.	2023-12-06 11:55:03 +00:00
Florian Hahn	bbd1941a38	[VPlan] Add disjoint flag to VPRecipeWithIRFlags. (#74364 ) A new disjoint flag was added for OR instructions in #72583. Update VPRecipeWithIRFlags to also support the new flag. This allows printing and preserving the disjoint flag in vectorized code.	2023-12-05 15:21:59 +00:00
Nikita Popov	eecb99c5f6	[Tests] Add disjoint flag to some tests (NFC) These tests rely on SCEV recognizing an "or" with no common bits as an "add". Add the disjoint flag to relevant or instructions in preparation for switching SCEV to use the flag instead of the ValueTracking query. The IR with disjoint flag matches what InstCombine would produce.	2023-12-05 14:09:36 +01:00
Florian Hahn	cd4348349a	[VPlan] Sink cases where no truncate is needed in truncateMinimalBWs. MinBWs contains entries that specify the minimum required bitwidth. In some cases, the old and new bitwidths can be equal (see test case) and in those cases no truncations are needed, so skip those cases. Fixes #74307.	2023-12-04 15:35:54 +00:00
Florian Hahn	efec4cc501	[LV] Remove unused CHECK lines, remove IR references from test. Clean up sve-tail-folding-option.ll by removing the unused CHECK-TF-NEOVERSE-V1 prefix (note the use of non-opaque pointers) and remove IR value references.	2023-12-04 13:06:30 +00:00
Florian Hahn	c890582912	[VPlan] Account for live-in entries in MinBW used by replicate recipes. In some cases MinBWs may contain entries for live-ins that are not used by VPWidenRecipe or VPWidenSelectRecipes. In those cases, the live-ins won't get processed, so make sure we include them in the count when used as operands in VPWidenCast and VPWidenSelectRecipe. Fixes https://github.com/llvm/llvm-project/issues/74231	2023-12-03 11:15:29 +00:00
Craig Topper	7ec4f6094e	[InstCombine] Infer disjoint flag on Or instructions. (#72912 ) The disjoint flag was recently added to IR in #72583 We already set it when we turn an add into an or. This patch sets it on Ors that weren't converted from an Add.	2023-12-02 14:11:12 -08:00
Florian Hahn	70535f5e60	[VPlan] Replace IR based truncateToMinimalBitwidths with VPlan version. This patch replaces the IR based truncateToMinimalBitwidths with a VPlan version. This has 3 benefits: 1) the VPlan-based version is simpler; we don't need to implement special codegen for each supported instruction type like the IR based one. 2) Removes a dependency on the cost-model after VPlan execution and 3) Removes a use of getVPValue that uses underlying values after VPlan execution (See removed FIXME). Depends on D149081. Depends on D149079. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D149903	2023-12-02 16:12:38 +00:00
Paschalis Mpeis	1bfb84b477	[NFC][TLI] Improve tests for ArmPL and SLEEF Intrinsics. (#73352 ) Auto-generate test `armpl-intrinsics.ll` and simplify tests: - Eliminate scalar tail with no tail-folding flag. - Use active lane mask for shorter check lines (no long `shufflevectors`). - Eliminate scalar loops by providing `noalias` to relevant arguments and run `simplifycfg` to drop them. - Update script now uses `@llvm.compiler.used` instead of a longer regex.	2023-11-29 11:19:10 +00:00
Graham Hunter	104b7c624e	[LV] Add support for uniform parameters on vectorized function variants (#72891 ) Parameters marked as uniform take a scalar value, assuming the value is invariant in the scalar loop.	2023-11-28 15:01:32 +00:00
Nikita Popov	f0faff8b9b	[LoopVectorize] Regenerate test checks (NFC)	2023-11-28 15:50:27 +01:00
Craig Topper	03d4a9d94d	[InstCombine] Set disjoint flag when turning Add into Or. (#72702 ) The disjoint flag was recently added to IR in #72583	2023-11-27 12:54:11 -08:00
pasmpe01	de6c9c84e2	[TLI][AArch64] Add TLI Mappings of @llvm.exp10 for ArmPL and SLEEF. Update regex to _explicitly_ show which exp versions are added. The previous regex used `exp[^e]` to avoid matching calls like: `@llvm.experimental.stepvector`. Note: ArmPL Mappings for scalable types are not yet utilized (eg, `llvm.exp10.nxv2f64`, `llvm.exp10.nxv4f32`), as `replace-with-veclib` pass needs improvements.	2023-11-24 12:24:33 +00:00
Graham Hunter	b1fba568f6	[SVE] Don't require lookup when demangling vector function mappings (#72260 ) We can determine the VF from a combination of the mangled name (which indicates the arguments that take vectors) and the element sizes of the arguments for the scalar function the mapping has been established for. The assert when demangling fails has been removed in favour of just not adding the mapping, which prevents the crash seen in https://github.com/llvm/llvm-project/issues/71892 This patch also stops using _LLVM_ as an ISA for scalable vector tests, since there aren't defined rules for the way vector arguments should be handled (e.g. packed vs. unpacked representation).	2023-11-23 17:15:48 +00:00
Florian Hahn	32d1197a8f	[LV] Use SCEV for subtraction of src/sink for diff runtime checks. Instead of expanding the src/sink SCEV expressions and emitting an IR sub to compute the difference, the subtraction can be performed directly by ScalarEvolution. This allows the subtraction to be simplified by SCEV, which in turn can reduce the number of redundant runtime check instructions generated. It also allows generating checks that are invariant w.r.t. an outer loop, if the inner loop AddRecs have the same outer loop AddRec as start.	2023-11-22 12:48:04 +00:00
Graham Hunter	84ebe5b7e8	[LV] Precommit tests for uniform arguments for vector function variants See https://github.com/llvm/llvm-project/pull/68879	2023-11-20 13:30:25 +00:00
Nilanjana Basu	e2210cefb1	[LV] Pre-committing tests for changing loop interleaving count computation (#70272 ) Added tests for evaluating changes to loop interleaving count computation and for removing loop interleaving threshold in subsequent patches.	2023-11-17 17:38:04 -08:00
Florian Hahn	e5e71affb7	[LV] Reverse mask up front, not when creating vector pointer. (#72163 ) Reverse mask early on when populating BlockInMask. This will enable separating mask management and address computation from the memory recipes in the future and is also needed to enable explicit unrolling in VPlan.	2023-11-17 13:59:35 +00:00
Florian Hahn	95eaaa7d71	[LV] Replace undef with constant and pointer argument in tests. This makes the tests more defined, prevents uses of the add being folded and remove UB when loading from undef.	2023-11-16 12:23:17 +00:00
Graham Hunter	b070629c10	[LV] Increase max VF if vectorized function variants exist (#66639 ) If there are function calls in the candidate loop and we have vectorized variants available, try some wider VFs in case the conservative initial maximum based on the widest types in the loop won't actually allow us to make use of those function variants.	2023-11-13 10:27:10 +00:00
Philip Reames	3f2ed812f0	[InstCombine] Infer nneg on zext when forming from non-negative sext (#70706 ) Builds on #67982 which recently introduced the nneg flag on a zext instruction. InstCombine is one of our largest canonicalizers of zext from non-negative sext instructions, so set the flag there.	2023-10-30 12:09:43 -07:00
Igor Kirillov	70904226e1	[LoopVectorize] Enhance Vectorization decisions for predicate tail-folded loops with low trip counts (#69588 ) * Avoid using `CM_ScalarEpilogueNotAllowedLowTripLoop` for loops known to be predicate tail-folded, delegating to `areRuntimeChecksProfitable` to decide on the profitability of vectorizing loops with runtime checks. * Update the `areRuntimeChecksProfitable` function to consider the `ScalarEpilogueLowering` setting when assessing vectorization of a loop. With this patch, we can make more informed decisions for loops with low trip counts, especially when leveraging Profile-Guided Optimization (PGO) data.	2023-10-30 13:43:26 +00:00
Alex Richardson	e39f6c1844	[opt] Infer DataLayout from triple if not specified There are many tests that specify a target triple/CPU flags but no DataLayout which can lead to IR being generated that has unusual behaviour. This commit attempts to use the default DataLayout based on the relevant flags if there is no explicit override on the command line or in the IR file. One case that currently cannot be differentiated from a missing datalayout is `target datalayout = ""` in the IR file, since the current APIs don't allow detecting this case. If it is considered useful to support this case (instead of passing "-data-layout=" on the command line), I can change IR parsers to track whether they have seen such a directive and change the callback type. Differential Revision: https://reviews.llvm.org/D141060	2023-10-26 12:07:37 -07:00
Lou Knauer	852bac4439	[VPlan] Support scalable vectors in outer-loop vectorization This patch enables scalable vectors in the VPlan-native path. If a vectorization factor is specified via loop vectorization hints, that factor is used. If no vectorization factor is specified, but the target prefers scalable vectorization, a scalable vectorization factor is selected. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D157484	2023-10-20 23:17:35 +01:00
Graham Hunter	1abc28fea0	[NFC][LV] Add test for vectorizing fmuladd with another call (#68601 ) As requested in (#66521) I confirmed a crash with "return" instead of "continue" in setVectorizedCallDecision's fmuladd reduction recognition.	2023-10-20 10:23:31 +01:00
JolantaJensen	afdb18df4d	[NFC][AArch64][LV] Reorganise LV tests using symbols from SLEEF (#68207 ) The tests introduced by https://reviews.llvm.org/D134719 and later modified in https://reviews.llvm.org/D146839 are not testing LV in isolation. This patch: 1. Ensures that all tests test LV in isolation. 2. Adds LV tests using llvm intrinsics that have libm mappings. llrint, llround and lrint are not included as the IR verifier pass currently does not allow vector types with them.	2023-10-13 12:10:21 +01:00
Rin	df8e0d057d	[AArch64][LoopVectorize] Use upper bound trip count instead of the constant TC when choosing max VF (#67697 ) This patch is based off of https://github.com/llvm/llvm-project/pull/67543. We are currently using the exact trip count to make decisions regarding the maximum VF. We can instead use the upper bound TC, which will be the same as the constant trip count when that is known.	2023-10-09 16:26:19 +01:00
Dmitriy Smirnov	e13bed4c5f	[PATCH] [llvm] [InstCombine] Canonicalise ADD+GEP This patch tries to canonicalise add + gep to gep + gep. Co-authored-by: Paul Walker <paul.walker@arm.com> Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D155688	2023-10-06 12:29:06 +01:00
Rin	d3e4702c0f	[AArch64] [LoopVectorize] Use either fixed-width or scalable VF when tail-folding (#67543 ) Since the getMaximisedVFForTarget function is called twice, once for fixed-width and once for scalable, it adds no value to always return a fixed-width VF. Instead, when we are tail-folding, we can use either fixed-width or scalable vectors.	2023-10-05 10:24:30 +01:00
JolantaJensen	01797dad86	Fix mechanism propagating mangled names for TLI function mappings (#66656 ) Currently the mappings from TLI are used to generate the list of available "scalar to vector" mappings attached to scalar calls as "vector-function-abi-variant" LLVM IR attribute. Function names from TLI are wrapped in a mangled name following the pattern: _ZGV<isa><mask><vlen><parameters>_<scalar_name>[(<vector_redirection>)] The problem is the mangled name uses _LLVM_ as the ISA name, which prevents the compiler from computing the vectorization factor for scalable vectors as it cannot make any decision based on the _LLVM_ ISA. If we use "s" as the ISA name, the compiler can make decisions based on the VFABI specification where SVE specific rules are described. This patch is only a refactoring stage where there is no change to the compiler's behaviour.	2023-10-02 18:58:39 +01:00
Florian Hahn	97687b7aea	[VPlan] Add active-lane-mask as VPlan-to-VPlan transformation. This patch updates the mask creation code to always create compares of the form (ICMP_ULE, wide canonical IV, backedge-taken-count) up front when tail folding and introduce active-lane-mask as later transformation. This effectively makes (ICMP_ULE, wide canonical IV, backedge-taken-count) the canonical form for tail-folding early on. Introducing more specific active-lane-mask recipes is treated as a VPlan-to-VPlan optimization. This has the advantage of keeping the logic (and complexity) of introducing active-lane-mask recipes in a single place, instead of spreading the logic out across multiple functions. It also simplifies initial VPlan construction and enables treating introducing EVL as similar optimization. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D158779	2023-09-25 13:34:45 +01:00
Florian Hahn	96e83d3705	[LV] Use IRBuilder to create and optimize middle-block compare. Split off from D150398 to avoid builder-related diff changes there. Using IRBuilder to create ICmps simplifies the result if both operands are constants. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D158332	2023-08-29 11:42:18 +01:00
Kerry McLaughlin	5d814b3848	Revert "[AArch64][SVE2] Change the cost of extends with S/URHADD to 0" This reverts commit dda2cd2505301aa626fcd3e8dea2a447227d00ca.	2023-08-14 10:44:13 +00:00
Kerry McLaughlin	dda2cd2505	[AArch64][SVE2] Change the cost of extends with S/URHADD to 0 When SVE2 is enabled, we can combine an add of 1, add & shift right by 1 to a single s/urhadd instruction. If the operands to the adds are extended, these extends will fold into the s/urhadd and their costs should be 0. Reviewed By: dtemirbulatov Differential Revision: https://reviews.llvm.org/D157628	2023-08-14 10:32:06 +00:00
Florian Hahn	af635a5547	[VPlan] Model wrap flags directly, remove NUW opcodes (NFC) Model wrap flags directly using VPRecipeWithIRFlags and clean up the duplicated NUW opcodes. D157144 will build on this and also model FMFs for VPInstruction. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D157194	2023-08-08 12:12:30 +01:00
Florian Hahn	93c5bae00e	[VPlan] Use printOperands for VPInstruction. Use the printOperands for printing VPInstruction's operands to be more in line with other recipes and ensure consistent printing after D15719. Also removes some stray spaces in print output.	2023-08-08 11:31:21 +01:00
Jolanta Jensen	3feb63e112	[TLI][AArch64] Add SLEEF mappings to scalable vector functions for fmod and fmodf This patch adds SLEEF mappings to scalable vector functions for fmod and fmodf. Differential Revision: https://reviews.llvm.org/D156920	2023-08-03 14:33:33 +00:00
Florian Hahn	cdb7d5767c	[LV] Add test for select truncation. Add test coverage for truncating selects for D149903.	2023-08-01 18:53:36 +01:00
Florian Hahn	707359ecf5	Recommit "[LV] Re-use existing broadcast value for live-ins." This reverts commit 245ec675a4e41f7ec24dfc998720bffdc46a6c53. Recommits eea9258648ce with a fix to only erase the instruction from the first part if it is defined outside the loop. This fixes a use-after-free error reported.	2023-08-01 15:54:02 +01:00


514 Commits