llvm-project

Elena Demikhovsky 21706cbd24 AVX-512 Loop Vectorizer: Cost calculation for interleave load/store patterns.

The X86 target does not provide any target-specific cost calculation for interleave patterns. It uses the common target-independent calculation, which gives very high numbers. As a result, the scalar version is chosen in many cases. The situation on AVX-512 is even worse, since we have 3-src shuffles that significantly reduce the cost.

In this patch I calculate the cost on AVX-512. This makes it possible to compare the interleave pattern with gather/scatter and choose the better solution (PR31426).

* Shuffle-broadcast cost will be changed in Simon's upcoming patch.

Differential Revision: https://reviews.llvm.org/D28118

llvm-svn: 290810

2017-01-02 10:37:52 +00:00

AliasAnalysis.cpp

[PM] Remove a pointless optimization.

2016-12-27 18:04:11 +00:00

AliasAnalysisEvaluator.cpp

Consistently use FunctionAnalysisManager

2016-08-09 00:28:15 +00:00

AliasAnalysisSummary.cpp

Update a comment.

2016-08-25 01:29:55 +00:00

AliasAnalysisSummary.h

Make some LLVM_CONSTEXPR variables const. NFC.

2016-08-25 01:05:08 +00:00

AliasSetTracker.cpp

[AliasSetTracker] Make AST smarter about assume intrinsics that don't actually affect memory.

2016-11-07 14:11:45 +00:00

Analysis.cpp

[LCSSA] Perform LCSSA verification only for the current loop nest.

2016-10-28 12:57:20 +00:00

AssumptionCache.cpp

Add files I seem to have dropped in my revert (r290086).

2016-12-19 08:32:13 +00:00

BasicAliasAnalysis.cpp

[PM] Remove a pointless optimization.

2016-12-27 18:04:11 +00:00

BlockFrequencyInfo.cpp

[PM] Change the static object whose address is used to uniquely identify

2016-11-23 17:53:26 +00:00

BlockFrequencyInfoImpl.cpp

[GraphTraits] Replace all NodeType usage with NodeRef

2016-08-22 21:09:30 +00:00

BranchProbabilityInfo.cpp

Retry: [BPI] Use a safer constructor to calculate branch probabilities

2016-12-17 01:02:08 +00:00

CallGraph.cpp

[PM] Change the static object whose address is used to uniquely identify

2016-11-23 17:53:26 +00:00

CallGraphSCCPass.cpp

Use StringRef in Pass/PassManager APIs (NFC)

2016-10-01 02:56:57 +00:00

CallPrinter.cpp

…

CaptureTracking.cpp

…

CFG.cpp

…

CFGPrinter.cpp

[PM] Port CFGViewer and CFGPrinter to the new Pass Manager

2016-09-15 18:35:27 +00:00

CFLAndersAliasAnalysis.cpp

[PM] Change the static object whose address is used to uniquely identify

2016-11-23 17:53:26 +00:00

CFLGraph.h

[CFLAA] Check for pointer types in more places.

2016-07-29 01:23:45 +00:00

CFLSteensAliasAnalysis.cpp

[PM] Change the static object whose address is used to uniquely identify

2016-11-23 17:53:26 +00:00

CGSCCPassManager.cpp

[PM] Teach the CGSCC's CG update utility to more carefully invalidate

2016-12-28 10:34:50 +00:00

CMakeLists.txt

Revert @llvm.assume with operator bundles (r289755-r289757)

2016-12-19 08:22:17 +00:00

CodeMetrics.cpp

Revert @llvm.assume with operator bundles (r289755-r289757)

2016-12-19 08:22:17 +00:00

ConstantFolding.cpp

[InstCombiner] Simplify lib calls to round{,f}

2016-12-26 14:29:29 +00:00

CostModel.cpp

AVX-512 Loop Vectorizer: Cost calculation for interleave load/store patterns.

2017-01-02 10:37:52 +00:00

Delinearization.cpp

…

DemandedBits.cpp

Revert @llvm.assume with operator bundles (r289755-r289757)

2016-12-19 08:22:17 +00:00

DependenceAnalysis.cpp

[PM] Change the static object whose address is used to uniquely identify

2016-11-23 17:53:26 +00:00

DivergenceAnalysis.cpp

…

DominanceFrontier.cpp

[PM] Change the static object whose address is used to uniquely identify

2016-11-23 17:53:26 +00:00

DomPrinter.cpp

…

EHPersonalities.cpp

[tsan] Add support for C++ exceptions into TSan (call __tsan_func_exit during unwinding), LLVM part

2016-11-14 21:41:13 +00:00

GlobalsModRef.cpp

[PM] Change the static object whose address is used to uniquely identify

2016-11-23 17:53:26 +00:00

IndirectCallPromotionAnalysis.cpp

Remove another unused variable from r275216

2016-07-12 23:49:17 +00:00

InlineCost.cpp

[PM] Provide an initial, minimal port of the inliner to the new pass manager.

2016-12-20 03:15:32 +00:00

InstCount.cpp

…

InstructionSimplify.cpp

Revert @llvm.assume with operator bundles (r289755-r289757)

2016-12-19 08:22:17 +00:00

Interval.cpp

Apply clang-tidy's modernize-loop-convert to lib/Analysis.

2016-06-26 17:27:42 +00:00

IntervalPartition.cpp

Apply clang-tidy's modernize-loop-convert to lib/Analysis.

2016-06-26 17:27:42 +00:00

IteratedDominanceFrontier.cpp

Normalize file docs. NFC.

2016-07-21 20:52:35 +00:00

IVUsers.cpp

Revert @llvm.assume with operator bundles (r289755-r289757)

2016-12-19 08:22:17 +00:00

LazyBlockFrequencyInfo.cpp

[BPI] Add new LazyBPI analysis

2016-07-28 23:31:12 +00:00

LazyBranchProbabilityInfo.cpp

[BPI] Add new LazyBPI analysis

2016-07-28 23:31:12 +00:00

LazyCallGraph.cpp

[PM] Teach the CGSCC's CG update utility to more carefully invalidate

2016-12-28 10:34:50 +00:00

LazyValueInfo.cpp

[LVI] Remove count/erase idiom in favor of checking result value of erase

2016-12-30 22:09:10 +00:00

Lint.cpp

Revert @llvm.assume with operator bundles (r289755-r289757)

2016-12-19 08:22:17 +00:00

LLVMBuild.txt

Restore "[ThinLTO] Prevent exporting of locals used/defined in module level asm"

2016-11-14 17:12:32 +00:00

Loads.cpp

[Loads] Fix crash in is isDereferenceableAndAlignedPointer()

2016-10-28 15:32:28 +00:00

LoopAccessAnalysis.cpp

[LAA] Prevent invalid IR for loop-invariant bound in loop body

2016-12-05 21:25:03 +00:00

LoopInfo.cpp

[PM] Change the static object whose address is used to uniquely identify

2016-11-23 17:53:26 +00:00

LoopPass.cpp

[LCSSA] Perform LCSSA verification only for the current loop nest.

2016-10-28 12:57:20 +00:00

LoopPassManager.cpp

[PM] Introduce the facilities for registering cross-IR-unit dependencies

2016-12-27 08:40:39 +00:00

LoopUnrollAnalyzer.cpp

[LoopUnrollAnalyzer] Handle out of bounds accesses in visitLoad

2016-07-23 02:56:49 +00:00

MemDepPrinter.cpp

Apply clang-tidy's modernize-loop-convert to lib/Analysis.

2016-06-26 17:27:42 +00:00

MemDerefPrinter.cpp

…

MemoryBuiltins.cpp

[Analysis] Ignore nobuiltin on allocsize function calls.

2016-12-27 06:32:14 +00:00

MemoryDependenceAnalysis.cpp

[MemDep] Handle gep with zeros for invariant.group

2016-12-30 18:45:07 +00:00

MemoryLocation.cpp

…

ModuleDebugInfoPrinter.cpp

[IR] Remove the DIExpression field from DIGlobalVariable.

2016-12-20 02:09:43 +00:00

ModuleSummaryAnalysis.cpp

[ThinLTO] Fix "||" vs "|" mixup.

2016-12-27 17:45:09 +00:00

ObjCARCAliasAnalysis.cpp

Consistently use FunctionAnalysisManager

2016-08-09 00:28:15 +00:00

ObjCARCAnalysisUtils.cpp

…

ObjCARCInstKind.cpp

Create llvm.addressofreturnaddress intrinsic

2016-10-12 22:13:19 +00:00

OptimizationDiagnosticInfo.cpp

[PM] Change the static object whose address is used to uniquely identify

2016-11-23 17:53:26 +00:00

OrderedBasicBlock.cpp

…

PHITransAddr.cpp

Revert @llvm.assume with operator bundles (r289755-r289757)

2016-12-19 08:22:17 +00:00

PostDominators.cpp

[PM] Change the static object whose address is used to uniquely identify

2016-11-23 17:53:26 +00:00

ProfileSummaryInfo.cpp

[PM] Change the static object whose address is used to uniquely identify

2016-11-23 17:53:26 +00:00

PtrUseVisitor.cpp

…

README.txt

…

RegionInfo.cpp

[PM] Change the static object whose address is used to uniquely identify

2016-11-23 17:53:26 +00:00

RegionPass.cpp

[RegionPass] Some minor cleanups

2016-07-19 17:50:27 +00:00

RegionPrinter.cpp

Apply clang-tidy's modernize-loop-convert to lib/Analysis.

2016-06-26 17:27:42 +00:00

ScalarEvolution.cpp

[SCEV] Be less conservative when extending bitwidths for computing ranges.

2016-12-20 23:03:42 +00:00

ScalarEvolutionAliasAnalysis.cpp

[PM] Change the static object whose address is used to uniquely identify

2016-11-23 17:53:26 +00:00

ScalarEvolutionExpander.cpp

Revert @llvm.assume with operator bundles (r289755-r289757)

2016-12-19 08:22:17 +00:00

ScalarEvolutionNormalization.cpp

…

ScopedNoAliasAA.cpp

[PM] Change the static object whose address is used to uniquely identify

2016-11-23 17:53:26 +00:00

SparsePropagation.cpp

Apply clang-tidy's modernize-loop-convert to lib/Analysis.

2016-06-26 17:27:42 +00:00

StratifiedSets.h

Do a sweep over move ctors and remove those that are identical to the default.

2016-10-20 12:20:28 +00:00

TargetLibraryInfo.cpp

[SimplifyLibCalls] Lower fls() to llvm.ctlz().

2016-12-15 23:45:11 +00:00

TargetTransformInfo.cpp

[PM] Change the static object whose address is used to uniquely identify

2016-11-23 17:53:26 +00:00

Trace.cpp

…

TypeBasedAliasAnalysis.cpp

[TBAA] Don't generate invalid TBAA when merging nodes

2016-12-11 20:07:25 +00:00

TypeMetadataUtils.cpp

TypeMetadataUtils: Simplify; spotted by Mehdi.

2016-12-21 19:00:47 +00:00

ValueTracking.cpp

Fix an issue with isGuaranteedToTransferExecutionToSuccessor

2016-12-31 22:12:34 +00:00

VectorUtils.cpp

IR: Change the gep_type_iterator API to avoid always exposing the "current" type.

2016-12-02 02:24:42 +00:00

README.txt

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n); however,
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.
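
As a rough check of the claim above, the following Python sketch evaluates
the recurrence {1,+,3,+,2} through its binomial closed form and compares the
result with both (%n * %n) and the longer expression. It assumes the exit
value is taken at iteration %n - 1, which is what makes the two forms agree;
the helper names eval_chrec and long_form are made up for illustration and
are not ScalarEvolution APIs.

```python
from math import comb

MASK64 = (1 << 64) - 1  # i64 arithmetic wraps modulo 2**64
MASK65 = (1 << 65) - 1  # i65 arithmetic wraps modulo 2**65


def eval_chrec(coeffs, i):
    """Evaluate the add recurrence {c0,+,c1,+,c2,...} at iteration i.

    Uses the closed form sum(c_k * C(i, k)), reduced modulo 2**64.
    """
    return sum(c * comb(i, k) for k, c in enumerate(coeffs)) & MASK64


def long_form(n):
    """The longer expression quoted above, with the product done in 65 bits."""
    a = (n - 2) & MASK64  # zext i64 (-2 + %n) to i65
    b = (n - 1) & MASK64  # zext i64 (-1 + %n) to i65
    half = (((a * b) & MASK65) // 2) & MASK64  # /u 2, then trunc to i64
    return (-2 + 2 * half + 3 * n) & MASK64


# Assumption: the loop exits at iteration n - 1 (n >= 1).
for n in (1, 2, 3, 10, 1000, 2**32 + 5, 2**63 + 7):
    exit_value = eval_chrec([1, 3, 2], n - 1)
    assert exit_value == ((n * n) & MASK64)
    assert exit_value == long_form(n)
```

The extra bit in the i65 product is what keeps the division by 2 exact
before truncation; the single i64 multiply (%n * %n) avoids it entirely.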

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,
ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))
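
The fold works because truncation commutes with wrapping multiplication and
addition, so the first two terms cancel for any value of %arg5. Below is a
small Python sketch of that arithmetic. The helper names are hypothetical,
and an ordinary integer stands in for the undef operand.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def trunc_i64_to_i32(x):
    """Keep the low 32 bits of a 64-bit value."""
    return x & MASK32


def unfolded(arg5, u):
    """The expression ScalarEvolution forms, using wrapping arithmetic."""
    neg = (-1 * arg5) & MASK64  # -1 * %arg5 in i64
    return (trunc_i64_to_i32(neg)
            + trunc_i64_to_i32(arg5)
            + (-1 * trunc_i64_to_i32(u))) & MASK32


def folded(u):
    """The simpler form suggested above."""
    return (-1 * trunc_i64_to_i32(u)) & MASK32


# u is a stand-in for undef; any fixed value gives the same agreement.
for arg5 in (0, 1, 0xFFFFFFFF, 0x123456789ABCDEF0, MASK64):
    for u in (0, 7, MASK64):
        assert unfolded(arg5, u) == folded(u)
```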

//===---------------------------------------------------------------------===//