llvm-project

Author	SHA1	Message	Date
Kazu Hirata	94f9cbbe49	[Scalar] Remove unused includes (NFC) (#114645 ) Identified with misc-include-cleaner.	2024-11-02 08:32:26 -07:00
Sameer Sahasrabuddhe	e0ac087ff0	[LoopUnroll] Consider convergence control tokens when unrolling (#91715 ) - There is no restriction on a loop with controlled convergent operations when the relevant tokens are defined and used within the loop. - When a token defined outside a loop is used inside (also called a loop convergence heart), unrolling is allowed only in the absence of remainder or runtime checks. - When a token defined inside a loop is used outside, such a loop is said to be "extended". This loop can only be unrolled by also duplicating the extended part lying outside the loop. Such unrolling is disabled for now. - Clean up loop hearts: When unrolling a loop with a heart, duplicating the heart will introduce multiple static uses of a convergence control token in a cycle that does not contain its definition. This violates the static rules for tokens, and needs to be cleaned up into a single occurrence of the intrinsic. - Spell out the initializer for UnrollLoopOptions to improve readability. Original implementation [D85605] by Nicolai Haehnle <nicolai.haehnle@amd.com>.	2024-06-06 13:13:46 +05:30
Florian Hahn	175d297102	[LoopUnroll] Add CSE to remove redundant loads after unrolling. (#83860 ) This patch adds loadCSE support to simplifyLoopAfterUnroll. It is based on EarlyCSE's implementation using ScopeHashTable and is using SCEV for accessed pointers to check to find redundant loads after unrolling. This applies to the late unroll pass only, for full unrolling those redundant loads will be cleaned up by the regular pipeline. The current approach constructs MSSA on-demand per-loop, but there is still small but notable compile-time impact: stage1-O3 +0.04% stage1-ReleaseThinLTO +0.06% stage1-ReleaseLTO-g +0.05% stage1-O0-g +0.02% stage2-O3 +0.09% stage2-O0-g +0.04% stage2-clang +0.02% https://llvm-compile-time-tracker.com/compare.php?from=c089fa5a729e217d0c0d4647656386dac1a1b135&to=ec7c0f27cb5c12b600d9adfc8543d131765ec7be&stat=instructions:u This benefits some workloads with runtime-unrolling disabled, where users use pragmas to force unrolling, as well as with runtime unrolling enabled. On SPEC/MultiSource, this removes a number of loads after unrolling on AArch64 with runtime unrolling enabled. ``` External/S...te/526.blender_r/526.blender_r 96 MultiSourc...rks/mediabench/gsm/toast/toast 39 SingleSource/Benchmarks/Misc/ffbench 4 External/SPEC/CINT2006/403.gcc/403.gcc 18 MultiSourc.../Applications/JM/ldecod/ldecod 4 MultiSourc.../mediabench/jpeg/jpeg-6a/cjpeg 6 MultiSourc...OE-ProxyApps-C/miniGMG/miniGMG 9 MultiSourc...e/Applications/ClamAV/clamscan 4 MultiSourc.../MallocBench/espresso/espresso 3 MultiSourc...dence-flt/LinearDependence-flt 2 MultiSourc...ch/office-ispell/office-ispell 4 MultiSourc...ch/consumer-jpeg/consumer-jpeg 6 MultiSourc...ench/security-sha/security-sha 11 MultiSourc...chmarks/McCat/04-bisect/bisect 3 SingleSour...tTests/2020-01-06-coverage-009 12 MultiSourc...ench/telecomm-gsm/telecomm-gsm 39 MultiSourc...lds-flt/CrossingThresholds-flt 24 MultiSourc...dence-dbl/LinearDependence-dbl 2 External/S...C/CINT2006/445.gobmk/445.gobmk 6 MultiSourc...enchmarks/mafft/pairlocalalign 53 External/S...31.deepsjeng_r/531.deepsjeng_r 3 External/S...rate/510.parest_r/510.parest_r 58 External/S...NT2006/464.h264ref/464.h264ref 29 External/S...NT2017rate/502.gcc_r/502.gcc_r 45 External/S...C/CINT2006/456.hmmer/456.hmmer 6 External/S...te/538.imagick_r/538.imagick_r 18 External/S.../CFP2006/447.dealII/447.dealII 4 MultiSourc...OE-ProxyApps-C++/miniFE/miniFE 12 External/S...2017rate/525.x264_r/525.x264_r 36 MultiSourc...Benchmarks/7zip/7zip-benchmark 33 MultiSourc...hmarks/ASC_Sequoia/AMGmk/AMGmk 2 MultiSourc...chmarks/VersaBench/8b10b/8b10b 1 MultiSourc.../Applications/JM/lencod/lencod 116 MultiSourc...lds-dbl/CrossingThresholds-dbl 24 MultiSource/Benchmarks/McCat/05-eks/eks 15 ``` PR: https://github.com/llvm/llvm-project/pull/83860	2024-05-02 11:01:24 +01:00
Sergey Kachkov	ffd79b3312	[LoopUnroll] Consider simplified operands while retrieving TTI instruction cost (#70929 ) Get more precise cost of instruction after LoopUnroll considering that some operands of it can be simplified, e.g. induction variable will be replaced by constant after full unrolling.	2024-02-06 17:01:38 +03:00
modiking	99ddd77ed9	[LoopUnroll] Introduce PragmaUnrollFullMaxIterations as a hard cap on how many iterations we try to unroll (#78648 ) Fixes [PR77842](https://github.com/llvm/llvm-project/issues/77842) where UBSAN causes pragma full unroll to try and unroll INT_MAX times. This sets a cap to make sure we don't attempt this and crash the compiler. Testing: ninja check-all with new test --------- Co-authored-by: Nikita Popov <github@npopov.com>	2024-02-05 17:01:00 -08:00
boxu.zhang	d3ef867082	[LoopUnroll] Make UnrollMaxUpperBound to be overridable by target (#76029 ) The UnrollMaxUpperBound should be target dependent, since different chips provide different register set which brings different ability of storing more temporary values of a program. So I add a MaxUpperBound value in UnrollingPreference which can be override by targets. All uses of UnrollMaxUpperBound are replaced with UP.MaxUpperBound. The default value is still 8 and the command line argument '--unroll-max-upperbound' takes final effect if provided.	2023-12-21 09:47:46 +01:00
XiangZhang	1d6a678591	[LoopUnroll] Make use of MaxTripCount for loops with "#pragma unroll" (#74703 ) Fix loop unroll fail caused by branches folding. For example: SimplifyCFG foldloop branches then cause loop unroll failed for "#program unroll" loop. ``` #program unroll for (int I = 0; I < ConstNum; ++I) { // folding "I < ConstNum" and "Cond2" if (Cond2) { break; } xxx loop body; } ``` The pragma unroll metadata only takes effect if there is an exact trip count, but not if there is an upper bound trip count. This patch make it work with an upper bound trip count as well in shouldPragmaUnroll(). Loop unroll is important in stack nervous devices (e.g. GPU, and that is why a lot of GPU code mark loop with "#program unroll"). It usually much simplify the address (offset) calculations in old iterations, then we can do a lot of others optimizations, e.g, SROA, for these simplifed address (escape alloca the whole aggregates).	2023-12-08 19:43:10 +08:00
Nikita Popov	296671f059	[LoopUnroll] Store more information in UnrollCostEstimator (NFCI) Instead of having ApproximateLoopSize() use a bunch of out parameters, from which we later construct an UnrollCostEstimator, directly construct UnrollCostEstimator which holds all the information derived from loop analysis. This makes it easier to add additional metrics in the future.	2023-09-27 12:06:13 +02:00
Kazu Hirata	e35cfc03d3	[Transforms] Remove unused function createSimpleLoopUnrollPass The last use was removed by: commit d623b2f95fd559901f008a0588dddd0949a8db01 Author: Arthur Eubanks <aeubanks@google.com> Date: Fri Mar 10 17:24:19 2023 -0800	2023-06-10 13:51:38 -07:00
Nikita Popov	143ed21b26	Revert "[LCSSA] Remove unused ScalarEvolution argument (NFC)" This reverts commit 5362a0d859d8e96b3f7c0437b7866e17a818a4f7. In preparation for reverting a dependent revision.	2023-06-05 16:45:38 +02:00
Nikita Popov	5362a0d859	[LCSSA] Remove unused ScalarEvolution argument (NFC) After D149435, LCSSA formation no longer needs access to ScalarEvolution, so remove the argument from the utilities.	2023-05-02 12:17:05 +02:00
Yashwant Singh	aea2a14736	[LoopUnroll] Prevent LoopFullUnrollPass to perform partial/runtime unrolling FullLoopUnroll was performing runtime unrolling in certain cases when '#pragma unroll' was specified. Patch to fix this by introducing new parameter to tryToUnrollLoop() to differentiate between LoopUnrollPass and FullLoopUnrollPass. Based on the discussion here (https://discourse.llvm.org/t/loop-unroller-fails-to-unroll-loop/69834) Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D148071	2023-04-13 10:21:24 +05:30
Liren Peng	529ee9750b	[NFC] Use single quotes for single char output during `printPipline` Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D144365	2023-02-22 02:35:13 +00:00
Anna Thomas	05b060b0b0	[LoopPeel] Expose ValueMap of last peeled iteration. NFC The value map of last peeled iteration is computed within peelLoop API. This patch exposes it for callers of peelLoop. While this is not currently used by upstream passes, we have a usecase downstream which benefits from this API update. Future users of peelLoop can also use the ValueMap if needed. Similar value maps are exposed by other loop utilities such as loop cloning. Differential Revision: https://reviews.llvm.org/D138228	2022-12-19 09:55:29 -05:00
Fangrui Song	51b685734b	[Transforms,CodeGen] std::optional::value => operator*/operator-> value() has undesired exception checking semantics and calls __throw_bad_optional_access in libc++. Moreover, the API is unavailable without _LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).	2022-12-16 23:21:27 +00:00
Kazu Hirata	6eb0b0a045	Don't include Optional.h These files no longer use llvm::Optional.	2022-12-14 21:16:22 -08:00
Fangrui Song	3152156334	[Transforms/Scalar] llvm::Optional => std::optional	2022-12-13 08:05:14 +00:00
Fangrui Song	c178ed33bd	Transforms/Utils: llvm::Optional => std::optional	2022-12-12 08:29:05 +00:00
Kazu Hirata	f7dffc28b3	Don't include None.h (NFC) I've converted all known uses of None to std::nullopt, so we no longer need to include None.h. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-10 11:24:26 -08:00
Kazu Hirata	8a7cbea525	[llvm] Use std::nullopt instead of None in comments (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-08 23:22:00 -08:00
Kazu Hirata	343de6856e	[Transforms] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 21:11:37 -08:00
Kazu Hirata	3bb0c7075b	[Scalar] Use std::optional in LoopUnrollPass.cpp (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-11-26 17:29:32 -08:00
Matt Arsenault	53fa00b3ae	LoopUnroll: Pass through AssumptionCache (NFC) Using these queries with a context instruction and without a cache seems to be about 2x slower than with it so this theoretically improves compile time.	2022-09-26 14:52:59 -04:00
Daniil Fukalov	9c710ebbdb	[TTI] NFC: Reduce InstructionCost::getValue() usage... in order to propagate `InstructionCost` value upper. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D103406	2022-08-26 16:37:32 +03:00
Simon Pilgrim	fdec50182d	[CostModel] Replace getUserCost with getInstructionCost * Replace getUserCost with getInstructionCost, covering all cost kinds. * Remove getInstructionLatency, it's not implemented by any backends, and we should fold the functionality into getUserCost (now getInstructionCost) to make it easier for targets to handle the cost kinds with their existing cost callbacks. Original Patch by @samparker (Sam Parker) Differential Revision: https://reviews.llvm.org/D79483	2022-08-18 11:55:23 +01:00
Kazu Hirata	50724716cd	[Transforms] Qualify auto in range-based for loops (NFC) Identified with readability-qualified-auto.	2022-08-14 12:51:58 -07:00
Kazu Hirata	0e37ef0186	[Transforms] Fix comment typos (NFC)	2022-08-07 23:55:24 -07:00
Kazu Hirata	611ffcf4e4	[llvm] Use value instead of getValue (NFC)	2022-07-13 23:11:56 -07:00
Kazu Hirata	a7938c74f1	[llvm] Don't use Optional::hasValue (NFC) This patch replaces Optional::hasValue with the implicit cast to bool in conditionals only.	2022-06-25 21:42:52 -07:00
Kazu Hirata	3b7c3a654c	Revert "Don't use Optional::hasValue (NFC)" This reverts commit aa8feeefd3ac6c78ee8f67bf033976fc7d68bc6d.	2022-06-25 11:56:50 -07:00
Kazu Hirata	aa8feeefd3	Don't use Optional::hasValue (NFC)	2022-06-25 11:55:57 -07:00
Philip Reames	206f10d3f6	Plumb InstructionCost through unroll costing Teach the unroller(s) how to handle an invalid cost. This avoids crashes when the backend can't provide a cost due to either a fundemental limitation or an unimplemented cost model case. Differential Revision: https://reviews.llvm.org/D127305	2022-06-09 15:42:53 -07:00
Philip Reames	f85c5079b8	Pipe potentially invalid InstructionCost through CodeMetrics Per the documentation in Support/InstructionCost.h, the purpose of an invalid cost is so that clients can change behavior on impossible to cost inputs. CodeMetrics was instead asserting that invalid costs never occurred. On a target with an incomplete cost model - e.g. RISCV - this means that transformations would crash on (falsely) invalid constructs - e.g. scalable vectors. While we certainly should improve the cost model - and I plan to do so in the near future - we also shouldn't be crashing. This violates the explicitly stated purpose of an invalid InstructionCost. I updated all of the "easy" consumers where bailouts were locally obvious. I plan to follow up with loop unroll in a following change. Differential Revision: https://reviews.llvm.org/D127131	2022-06-09 15:17:24 -07:00
Fangrui Song	557efc9a8b	[llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!` error. More were added due to cargo cult. Since the error has been removed, cl::ZeroOrMore is unneeded. Also remove cl::init(false) while touching the lines.	2022-06-03 21:59:05 -07:00
Nuno Lopes	80b3dcc045	[Support] Make report_fatal_error respect its GenCrashDiag argument so it doesn't generate a backtrace There are a few places where we use report_fatal_error when the input is broken. Currently, this function always crashes LLVM with an abort signal, which then triggers the backtrace printing code. I think this is excessive, as wrong input shouldn't give a link to LLVM's github issue URL and tell users to file a bug report. We shouldn't print a stack trace either. This patch changes report_fatal_error so it uses exit() rather than abort() when its argument GenCrashDiag=false. Reviewed by: nikic, MaskRay, RKSimon Differential Revision: https://reviews.llvm.org/D126550	2022-05-30 19:19:23 +01:00
Anna Thomas	205246cb64	[CompileTime] [Passes] Avoid computing unnecessary analyses. NFC Similar to c515b2f39e77, If there are no loops in the function as seen through LI, we should avoid computing the remaining expensive analyses (such as SCEV, BPI). Reordered the analyses requests and early return if there are no loops. The logic of avoiding expensive analyses is applied to LoopVectorizer, LoopLoadElimination and LoopUnrollPass, i.e. all function passes which operate on loops. This is an NFC with compile time improvement. Differential Revision: https://reviews.llvm.org/D124529	2022-04-29 10:00:06 -04:00
Whitney Tsang	80304c5f88	[LoopUnroll] Always respect user unroll pragma IMO when user provide unroll pragma, compiler should always respect it. It is not clear to me why loop unroll pass currently ensure that the unrolled loop size is limited by PragmaUnrollThreshold. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D119148	2022-04-11 14:33:24 -04:00
serge-sans-paille	59630917d6	Cleanup includes: Transform/Scalar Estimated impact on preprocessor output line: before: 1062981579 after: 1062494547 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120817	2022-03-03 07:56:34 +01:00
Anna Thomas	bc48a26655	[LoopPeel] Use reference instead of pointer for DT argument Cleanup code in peelLoop API. We already have usage of DT without guarding against a null DT, so this change constant folds the remaining null DT checks. Also make the argument a reference so that it is clear the argument is a nonnull DT. Extracted from D118472.	2022-02-01 17:00:08 -05:00
Kazu Hirata	5a667c0e74	[llvm] Use nullptr instead of 0 (NFC) Identified with modernize-use-nullptr.	2021-12-28 08:52:25 -08:00
Zaara Syeda	dd245bab9f	[LoopUnroll] Disable loop unroll when user explicitly asks for unroll-and-jam If a loop isn't forced to be unrolled, we want to avoid unrolling it when there is an explicit unroll-and-jam pragma. This is to prevent automatic unrolling from interfering with the user requested transformation. Differential Revision: https://reviews.llvm.org/D114886	2021-12-14 16:46:37 +00:00
Daniel Sanders	54e21df973	[unroll] Fix a functional change in an NFC patch 5c77aa2b917c [unroll] Use early return in shouldFullUnroll [nfc] wasn't quite NFC since !(x <= y) is x > y rather than x >= y Credit to Justin Bogner for spotting the bug Reviewed By: reames Differential Revision: https://reviews.llvm.org/D114894	2021-12-01 17:28:12 -08:00
Philip Reames	f50207c015	[unroll] Use early return in shouldPartialUnroll [nfc]	2021-11-29 14:37:18 -08:00
Philip Reames	a655e0f991	[unroll] Reduce scope of UnrollFactor variable in computeUnrollCount [NFC] Suggested in review of D114453, done as a separate change to get all uses at once.	2021-11-29 14:33:14 -08:00
Philip Reames	829b62adf5	[unroll] Split full exact and full bound unroll costing [NFC] This change should be NFC. It's posted for review mostly to make sure others are happy with the names I'm introducing for "exact full unroll" and "bounded full unroll". The motivation here is that our cost model for bounded unrolling is too aggressive - it gives benefits for exits we aren't going to prune - but I also just think the new version of the code is a lot easier to follow. Differential Revision: https://reviews.llvm.org/D114453	2021-11-29 14:18:15 -08:00
Philip Reames	18086186ab	[unroll] Remove two dead variable assignments [nfc] These variables are not out-params, and we immediately return after assigning them. Thus, the assignments are dead and just confusing. I believe these used to be out-params, but they're not any more.	2021-11-23 09:12:20 -08:00
Philip Reames	5c77aa2b91	[unroll] Use early return in shouldFullUnroll [nfc]	2021-11-23 09:01:36 -08:00
Florian Hahn	cd0ba9dc58	[LoopPeel] Peel if it turns invariant loads dereferenceable. This patch adds a new cost heuristic that allows peeling a single iteration off read-only loops, if the loop contains a load that 1. is feeding an exit condition, 2. dominates the latch, 3. is not already known to be dereferenceable, 4. and has a loop invariant address. If all non-latch exits are terminated with unreachable, such loads in the loop are guaranteed to be dereferenceable after peeling, enabling hoisting/CSE'ing them. This enables vectorization of loops with certain runtime-checks, like multiple calls to `std::vector::at` if the vector is passed as pointer. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D108114	2021-10-12 11:42:28 +01:00
Markus Lavin	1ac209ed76	[NPM] Added -print-pipeline-passes print params for a few passes. Added '-print-pipeline-passes' printing of parameters for those passes declared with _WITH_PARAMS macro in PassRegistry.def. Note that it only prints the parameters declared inside _WITH_PARAMS as in a few cases there appear to be additional parameters not parsable. The following passes are now covered (i.e. all of those with *_WITH_PARAMS in PassRegistry.def). LoopExtractorPass - loop-extract HWAddressSanitizerPass - hwsan EarlyCSEPass - early-cse EntryExitInstrumenterPass - ee-instrument LowerMatrixIntrinsicsPass - lower-matrix-intrinsics LoopUnrollPass - loop-unroll AddressSanitizerPass - asan MemorySanitizerPass - msan SimplifyCFGPass - simplifycfg LoopVectorizePass - loop-vectorize MergedLoadStoreMotionPass - mldst-motion GVN - gvn StackLifetimePrinterPass - print<stack-lifetime> SimpleLoopUnswitchPass - simple-loop-unswitch Differential Revision: https://reviews.llvm.org/D109310	2021-09-15 08:34:04 +02:00
Ali Sedaghati	cc7bcef3e3	Reapply: [NFC] factor out unrolling decision logic reverting ffd8a268bdc518f87e9ba7524aba0458f4b9979c (reapplying 4d559837e887c278d7c27274f4f6b1b78b97c00d) - removed spurious inclusion of <optional> Differential Revision: https://reviews.llvm.org/D106001	2021-08-18 12:04:33 -07:00

1 2 3 4 5 ...

366 Commits