llvm-project


Philip Reames 269bc684e7 [LV][RISCV] Disable vectorization of epilogue loops

Epilogue loop vectorization is a feature in the vectorizer intended to avoid running fully scalar code when the vector length of the main loop is either longer than the actual loop's trip count or leaves a huge remainder.

In practice, this feature appears to not have been well tuned. I honestly don't think it should be on by default at all, but it definitely shouldn't be on for RISCV. Note that other targets have also disabled it, but they've done so via disabling interleaving - which is, well, completely unrelated - and we don't want to do that for RISCV.

In the near term, many examples I'm seeing have terrible codegen for epilogue vectorization. We are greatly increasing code size for little value at reasonable VLEN values for small types. In the long term, the cases that epilogue vectorization is intended to handle are likely better handled via tail folding on RISCV.

As an aside, I also don't really trust the correctness of epilogue vectorization. The code structure is such that otherwise straightforward changes sometimes break only epilogue vectorization. The reuse of an existing VPlan without careful validation opens significant room for nasty bugs. Given how rarely the code is exercised, that is not a good combination.

As such, this patch introduces a TTI hook, and completely disables epilogue vectorization on RISCV.

Differential Revision: https://reviews.llvm.org/D136695

2022-10-25 14:28:02 -07:00
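
The TTI hook described in the commit message above can be pictured with a small standalone model. This is only a sketch, not LLVM's code: the types TargetHooks and RISCVHooks, the hook name preferEpilogueVectorization, the helper shouldVectorizeEpilogue, and its two VF parameters are all assumptions made for illustration. The commit message itself says only that a TTI hook is introduced and that RISCV uses it to turn epilogue vectorization off.

```cpp
// Hypothetical stand-in for a target hook interface; NOT LLVM's real
// TargetTransformInfo class.
struct TargetHooks {
  virtual ~TargetHooks() = default;
  // Assumed default: a target allows the epilogue loop to be vectorized.
  virtual bool preferEpilogueVectorization() const { return true; }
};

// Hypothetical RISCV implementation: opt out completely, as the commit
// message describes.
struct RISCVHooks : TargetHooks {
  bool preferEpilogueVectorization() const override { return false; }
};

// Hypothetical vectorizer-side query. The target hook is consulted first;
// the VF comparison is an invented placeholder for whatever profitability
// check the vectorizer would otherwise apply.
bool shouldVectorizeEpilogue(const TargetHooks &Target, unsigned MainLoopVF,
                             unsigned MinEpilogueVF) {
  if (!Target.preferEpilogueVectorization())
    return false; // The target veto wins regardless of the heuristic.
  return MainLoopVF >= MinEpilogueVF;
}
```

With this shape, a target opts out in one place and no unrelated knob such as interleaving has to be touched, which is the separation the commit message argues for.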

models

Reland "[MLGO] ML Regalloc Priority Advisor"

2022-09-30 16:27:26 -05:00

AliasAnalysis.cpp

[ObjCARC] Remove legacy PM versions of optimization passes

2022-10-21 13:40:54 -07:00

AliasAnalysisEvaluator.cpp

[AA] Do not track Must in ModRefInfo

2022-08-01 07:14:31 +02:00

AliasAnalysisSummary.cpp

…

AliasAnalysisSummary.h

AliasAnalysisSummary.h - cleanup includes and forward declarations. NFC.

2020-04-21 11:32:58 +01:00

AliasSetTracker.cpp

[AA] Rename getModRefBehavior() to getMemoryEffects() (NFC)

2022-10-19 11:03:54 +02:00

Analysis.cpp

[ObjCARC] Remove legacy PM versions of optimization passes

2022-10-21 13:40:54 -07:00

AssumeBundleQueries.cpp

Cleanup includes: LLVMAnalysis

2022-03-01 18:01:54 +01:00

AssumptionCache.cpp

Cleanup includes: LLVMAnalysis

2022-03-01 18:01:54 +01:00

BasicAliasAnalysis.cpp

[AA] Rename getModRefBehavior() to getMemoryEffects() (NFC)

2022-10-19 11:03:54 +02:00

BlockFrequencyInfo.cpp

Cleanup includes: LLVMAnalysis

2022-03-01 18:01:54 +01:00

BlockFrequencyInfoImpl.cpp

[llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC

2022-06-03 21:59:05 -07:00

BranchProbabilityInfo.cpp

[PGO] Support PGO annotation of CallBrInst

2022-09-01 14:13:50 -07:00

CallGraph.cpp

[CallGraph] Port -print-callgraph-sccs to new pass manager

2022-10-11 14:43:16 -07:00

CallGraphSCCPass.cpp

[NFC] Rename Instrinsic to Intrinsic

2022-04-25 18:13:23 +01:00

CallPrinter.cpp

[CallPrinter] Port CallPrinter passes to new pass manager

2022-04-18 10:02:18 -07:00

CaptureTracking.cpp

[CaptureTracking] Increase limit and use it for all visited uses.

2022-06-02 21:43:58 +01:00

CFG.cpp

[CFG] Add const qualifier to isPotentiallyReachableFromMany() (NFC)

2022-10-18 10:06:07 +02:00

CFGPrinter.cpp

Cleanup includes: LLVMAnalysis

2022-03-01 18:01:54 +01:00

CFGSCCPrinter.cpp

Ensure newlines at the end of files (NFC)

2022-10-22 09:29:40 -07:00

CFLAndersAliasAnalysis.cpp

[Analysis] Qualify auto variables in for loops (NFC)

2022-07-16 23:26:34 -07:00

CFLGraph.h

[MemoryBuiltins] Remove isFreeCall() function (NFC)

2022-07-21 14:44:23 +02:00

CFLSteensAliasAnalysis.cpp

[Analysis] Qualify auto variables in for loops (NFC)

2022-07-16 23:26:34 -07:00

CGSCCPassManager.cpp

[CGSCC][DevirtWrapper] Properly handle invalidating analyses for invalidated SCCs

2022-09-29 09:55:23 -07:00

CMakeLists.txt

Port print-cfg-sccs to new pass manager

2022-10-18 08:47:08 -07:00

CmpInstAnalysis.cpp

[InstCombine][Analysis] Move getFCmpCode and getPredForFCmpCode to CmpInstAnalysis. NFC

2022-03-03 09:33:24 -08:00

CodeMetrics.cpp

[CostModel] Replace getUserCost with getInstructionCost

2022-08-18 11:55:23 +01:00

ConstantFolding.cpp

[ConstantExpr] Don't create fneg expressions

2022-09-07 11:27:25 +02:00

ConstraintSystem.cpp

[Analysis] Qualify auto variables in for loops (NFC)

2022-07-16 23:26:34 -07:00

CostModel.cpp

[NFC] Fix a few whitespace inconsistencies.

2022-10-20 14:52:25 +00:00

CycleAnalysis.cpp

Cleanup includes: LLVMAnalysis

2022-03-01 18:01:54 +01:00

DDG.cpp

[Analysis] Qualify auto variables in for loops (NFC)

2022-07-16 23:26:34 -07:00

DDGPrinter.cpp

Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options

2022-06-05 00:31:44 -07:00

Delinearization.cpp

[Analysis] Qualify auto variables in for loops (NFC)

2022-07-16 23:26:34 -07:00

DemandedBits.cpp

Cleanup includes: LLVMAnalysis

2022-03-01 18:01:54 +01:00

DependenceAnalysis.cpp

[DependenceAnalysis][PR56275] Normalize negative dependence analysis results

2022-08-03 19:59:00 -04:00

DependenceGraphBuilder.cpp

[NFC] Remove unnecessary #includes

2022-02-04 21:22:41 -08:00

DevelopmentModeInlineAdvisor.cpp

[nfc][mlgo] Separate logger and training-mode model evaluator

2022-08-03 16:20:28 -07:00

DivergenceAnalysis.cpp

[Analysis] Qualify auto variables in for loops (NFC)

2022-07-16 23:26:34 -07:00

DominanceFrontier.cpp

Cleanup includes: LLVMAnalysis

2022-03-01 18:01:54 +01:00

DomPrinter.cpp

[DomPrinter] Migrate -dot-dom to the new pass manager.

2022-05-16 15:07:16 -05:00

DomTreeUpdater.cpp

[NFC] Switch a few uses of undef to poison as placeholders for unreachable code

2022-07-30 13:55:56 +01:00

EHPersonalities.cpp

[PS5] Use __gxx_personality_v0 for TSan

2022-06-14 10:39:34 -07:00

FunctionPropertiesAnalysis.cpp

[FunctionPropertiesAnalysis] Generalize support for unreachable

2022-06-21 08:18:01 -07:00

GlobalsModRef.cpp

[AA] Rename getModRefBehavior() to getMemoryEffects() (NFC)

2022-10-19 11:03:54 +02:00

GuardUtils.cpp

…

HeatUtils.cpp

[iwyu] Move <cmath> out of llvm/Support/MathExtras.h

2022-09-28 20:49:01 +02:00

ImportedFunctionsInliningStatistics.cpp

…

IndirectCallPromotionAnalysis.cpp

Remove unneeded cl::ZeroOrMore for cl::opt options

2022-06-04 00:10:42 -07:00

InlineAdvisor.cpp

[Analysis] clang-format InlineAdvisor.cpp (NFC)

2022-07-13 13:38:50 -07:00

InlineCost.cpp

[Analysis] Introduce getStaticBonusApplied (NFC)

2022-09-25 23:21:40 -07:00

InlineOrder.cpp

[ModuleInliner] Add a cost-benefit-based priority

2022-09-29 09:00:38 -07:00

InlineSizeEstimatorAnalysis.cpp

Fix build breaks on ml-* bots introduced by include cleanups

2022-03-01 11:29:18 -08:00

InstCount.cpp

…

InstructionPrecedenceTracking.cpp

[Analysis] Qualify auto variables in for loops (NFC)

2022-07-16 23:26:34 -07:00

InstructionSimplify.cpp

[instsimplify] Move (extelt (inselt Vec, Value, Index), Index) -> Value from InstCombine

2022-10-17 15:22:06 -07:00

Interval.cpp

Cleanup includes: LLVMAnalysis

2022-03-01 18:01:54 +01:00

IntervalPartition.cpp

[llvm] Use range-based for loops (NFC)

2021-11-20 18:42:10 -08:00

IRSimilarityIdentifier.cpp

[llvm] Use value instead of getValue (NFC)

2022-07-13 23:11:56 -07:00

IVDescriptors.cpp

[IVDescriptors] Before moving an instruction in SinkAfter checking if it is target of other instructions

2022-10-03 18:47:51 +00:00

IVUsers.cpp

[Analysis] Qualify auto variables in for loops (NFC)

2022-07-16 23:26:34 -07:00

LazyBlockFrequencyInfo.cpp

…

LazyBranchProbabilityInfo.cpp

…

LazyCallGraph.cpp

[LazyCallGraph] Handle spurious ref edges when deleting a dead function

2022-09-22 15:01:15 -07:00

LazyValueInfo.cpp

[Analysis] Qualify auto variables in for loops (NFC)

2022-07-16 23:26:34 -07:00

LegacyDivergenceAnalysis.cpp

[Analysis] Qualify auto variables in for loops (NFC)

2022-07-16 23:26:34 -07:00

Lint.cpp

[Analysis] Qualify auto variables in for loops (NFC)

2022-07-16 23:26:34 -07:00

Loads.cpp

Analysis: Remove redundant assertion

2022-09-20 09:39:45 -04:00

LoopAccessAnalysis.cpp

[LAA] Use LoopAccessInfoManager in legacy pass.

2022-10-04 08:37:11 +01:00

LoopAnalysisManager.cpp

Cleanup includes: LLVMAnalysis

2022-03-01 18:01:54 +01:00

LoopCacheAnalysis.cpp

Revert "[llvm] Use llvm::is_contained (NFC)"

2022-08-28 18:52:49 -07:00

LoopInfo.cpp

[Loop] Move block and loop dispo invalidation to makeLoopInvariant.

2022-10-14 21:58:14 +01:00

LoopNestAnalysis.cpp

Revert "[LoopPassManager] Implement and use LoopNestAnalysis::run() instead of manually creating LoopNests"

2022-09-05 15:42:48 -07:00

LoopPass.cpp

[LegacyPassManager] Move structural hashing into Pass classes. NFC.

2022-03-17 09:51:12 +00:00

LoopUnrollAnalyzer.cpp

[NFC] format InstructionSimplify & lowerCaseFunctionNames

2022-06-09 16:10:08 +02:00

MemDepPrinter.cpp

Cleanup includes: LLVMAnalysis

2022-03-01 18:01:54 +01:00

MemDerefPrinter.cpp

Cleanup includes: LLVMAnalysis

2022-03-01 18:01:54 +01:00

MemoryBuiltins.cpp

[MemProf] Update metadata during inlining

2022-09-30 16:46:17 -07:00

MemoryDependenceAnalysis.cpp

[llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC

2022-08-08 11:24:15 -07:00

MemoryLocation.cpp

Cleanup includes: LLVMAnalysis

2022-03-01 18:01:54 +01:00

MemoryProfileInfo.cpp

[llvm] Qualify auto (NFC)

2022-08-07 23:55:27 -07:00

MemorySSA.cpp

[MemorySSA] Remove PerformedPhiTranslation flag

2022-09-21 10:32:09 +02:00

MemorySSAUpdater.cpp

[Analysis] Qualify auto variables in for loops (NFC)

2022-07-16 23:26:34 -07:00

MLInlineAdvisor.cpp

[MLInliner] No need to invalidate everything post-inlining.

2022-06-24 18:22:06 -07:00

ModelUnderTrainingRunner.cpp

[iwyu] Handle regressions in libLLVM header include

2022-05-26 08:12:34 +02:00

ModuleDebugInfoPrinter.cpp

[Analysis] Qualify auto variables in for loops (NFC)

2022-07-16 23:26:34 -07:00

ModuleSummaryAnalysis.cpp

[ModuleSummaryAnalysis] Use helper methods to check readnone/readonly (NFC)

2022-10-21 12:18:57 +02:00

MustExecute.cpp

[MustExec][LICM] Handle latch being part of an inner cycle (PR57780)

2022-10-11 09:30:13 +02:00

NoInferenceModelRunner.cpp

[mlgo] Support exposing more features than those supported by models

2022-05-09 18:01:21 -07:00

ObjCARCAliasAnalysis.cpp

[ObjCARC] Remove legacy PM versions of optimization passes

2022-10-21 13:40:54 -07:00

ObjCARCAnalysisUtils.cpp

…

ObjCARCInstKind.cpp

[ObjCARC] Use "UnsafeClaimRV" to refer to unsafeClaim in enums. NFC.

2022-01-24 19:37:01 -08:00

OptimizationRemarkEmitter.cpp

[llvm] Use value_or instead of getValueOr (NFC)

2022-06-18 23:07:11 -07:00

OverflowInstAnalysis.cpp

Cleanup includes: LLVMAnalysis

2022-03-01 18:01:54 +01:00

PHITransAddr.cpp

[PHITranslateAddr] Require dominance when searching for translated address (PR57025)

2022-09-01 16:26:42 +02:00

PhiValues.cpp

…

PostDominators.cpp

…

ProfileSummaryInfo.cpp

[llvm] Use value instead of getValue (NFC)

2022-07-13 23:11:56 -07:00

PtrUseVisitor.cpp

Cleanup includes: LLVMAnalysis

2022-03-01 18:01:54 +01:00

README.txt

…

RegionInfo.cpp

[NFC] Remove unnecessary #includes

2022-02-04 21:22:41 -08:00

RegionPass.cpp

[LegacyPassManager] Move structural hashing into Pass classes. NFC.

2022-03-17 09:51:12 +00:00

RegionPrinter.cpp

[polly] migrate -polly-show to the new pass manager

2022-05-09 14:04:29 -05:00

ReplayInlineAdvisor.cpp

[Inline] Annotate inline pass name with link phase information for analysis.

2022-06-24 10:06:43 -07:00

ScalarEvolution.cpp

[SCEV] Replace assert with returning CouldNotComp in computeMaxBECountForLT.

2022-10-19 11:24:10 +01:00

ScalarEvolutionAliasAnalysis.cpp

Cleanup includes: LLVMAnalysis

2022-03-01 18:01:54 +01:00

ScalarEvolutionDivision.cpp

Cleanup includes: LLVMAnalysis

2022-03-01 18:01:54 +01:00

ScalarEvolutionNormalization.cpp

Cleanup includes: DebugInfo & CodeGen

2022-03-12 17:26:40 +01:00

ScopedNoAliasAA.cpp

Cleanup includes: LLVMAnalysis

2022-03-01 18:01:54 +01:00

StackLifetime.cpp

[StackLifetime] More efficient loop for LivenessType::Must

2022-09-28 16:28:45 -07:00

StackSafetyAnalysis.cpp

[Analysis] Qualify auto variables in for loops (NFC)

2022-07-16 23:26:34 -07:00

StratifiedSets.h

[llvm] Use Optional::has_value instead of Optional::hasValue (NFC)

2022-06-26 16:10:42 -07:00

SyncDependenceAnalysis.cpp

Cleanup includes: LLVMAnalysis

2022-03-01 18:01:54 +01:00

SyntheticCountsUtils.cpp

[llvm] Don't use Optional::getValue (NFC)

2022-06-20 22:45:45 -07:00

TargetLibraryInfo.cpp

[Analysis][SimplifyLibCalls] Refactor code related to size_t in lib func signatures. NFC

2022-10-03 12:02:50 +02:00

TargetTransformInfo.cpp

[LV][RISCV] Disable vectorization of epilogue loops

2022-10-25 14:28:02 -07:00

TensorSpec.cpp

[mlgo] Factor out TensorSpec

2022-04-25 18:35:46 -07:00

TFLiteUtils.cpp

Move interpreter check before modifying the allocation type.

2022-10-12 19:50:36 +00:00

TFUtils.cpp

[mlgo] Use TFLite for 'development' mode.

2022-08-24 16:07:24 -07:00

Trace.cpp

…

TrainingLogger.cpp

[nfc][mlgo] Separate logger and training-mode model evaluator

2022-08-03 16:20:28 -07:00

TypeBasedAliasAnalysis.cpp

[AA] Rename getModRefBehavior() to getMemoryEffects() (NFC)

2022-10-19 11:03:54 +02:00

TypeMetadataUtils.cpp

[NFC]] Use llvm::all_of instead of std::all_of

2022-08-23 12:21:53 +08:00

ValueLattice.cpp

…

ValueLatticeUtils.cpp

[SCCP] Check that load/store and global type match

2022-02-11 11:01:18 +01:00

ValueTracking.cpp

[NFC] Fix a few whitespace inconsistencies.

2022-10-20 14:52:25 +00:00

VectorUtils.cpp

[LAA] Make getPtrStride return Option instead of overloading zero as error value [nfc]

2022-09-27 15:55:44 -07:00

VFABIDemangling.cpp

[NFC] Fix a few whitespace inconsistencies.

2022-10-20 14:52:25 +00:00

README.txt

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n); however,
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.
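
The equivalence claimed above can be checked numerically. The following is a minimal sketch with invented function names. It steps the recurrence {1,+,3,+,2} directly and evaluates the expanded expression in wrapping 64-bit arithmetic, which stands in for the i65 product only under the assumption that n is at least 1 and below 2^32, so the product fits in 64 bits. Comparing against n * n at iteration n - 1 is an inference from the two formulas, not something the README states about the loop's trip count.

```cpp
#include <cstdint>

// Value of the add recurrence {1,+,3,+,2} after `iter` backedges,
// computed by stepping: the value grows by the step, the step grows by 2.
uint64_t stepRecurrence(uint64_t iter) {
  uint64_t value = 1;
  uint64_t step = 3;
  for (uint64_t i = 0; i < iter; ++i) {
    value += step;
    step += 2;
  }
  return value;
}

// The expanded form quoted above:
//   -2 + 2 * (((n - 2) * (n - 1)) /u 2) + 3 * n
// Assumption: 1 <= n < 2^32, so uint64_t can hold the product that the
// original expression computes in i65. For n == 1 the first factor wraps,
// but it is multiplied by zero.
uint64_t expandedForm(uint64_t n) {
  uint64_t product = (n - 2) * (n - 1);
  uint64_t half = product / 2;
  return (0 - uint64_t{2}) + 2 * half + 3 * n;
}

// The simple form the README proposes.
uint64_t simpleForm(uint64_t n) { return n * n; }
```

For n in that range, stepRecurrence(n - 1), expandedForm(n), and simpleForm(n) agree, because 1 + 3(n - 1) + (n - 1)(n - 2) simplifies to n * n.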

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,
ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))
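
The fold works because truncation commutes with negation in modular arithmetic, so the first two terms cancel. The sketch below uses invented names to illustrate this with fixed-width unsigned integers. As a simplifying assumption it models undef as an ordinary fixed value u, which is not how LLVM's undef behaves in general.

```cpp
#include <cstdint>

// Models `trunc i64 %x to i32`: keep the low 32 bits.
uint32_t trunc32(uint64_t x) { return static_cast<uint32_t>(x); }

// The expression ScalarEvolution forms:
//   trunc(-1 * arg5) + trunc(arg5) + (-1 * trunc(u))
// Unsigned arithmetic wraps, matching LLVM's two's-complement semantics.
uint32_t unfolded(uint64_t arg5, uint64_t u) {
  uint32_t negTrunc = trunc32(0 - arg5);   // trunc i64 (-1 * %arg5) to i32
  uint32_t posTrunc = trunc32(arg5);       // trunc i64 %arg5 to i32
  uint32_t negUndef = 0u - trunc32(u);     // -1 * (trunc i64 u to i32)
  return static_cast<uint32_t>(negTrunc + posTrunc + negUndef);
}

// The folded expression: -1 * trunc(u).
uint32_t folded(uint64_t u) { return static_cast<uint32_t>(0u - trunc32(u)); }
```

For any arg5, trunc32(0 - arg5) equals 0 - trunc32(arg5) modulo 2^32, so unfolded(arg5, u) and folded(u) return the same value.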

//===---------------------------------------------------------------------===//