llvm-project

Tobias Grosser d7eb619299 Model cache size and associativity in TargetTransformInfo

Summary:
We add the precise cache sizes and associativity for the following Intel
architectures:

  - Penryn
  - Nehalem
  - Westmere
  - Sandy Bridge
  - Ivy Bridge
  - Haswell
  - Broadwell
  - Skylake
  - Kabylake

For several months, Polly has used a performance model for BLAS computations that
derives optimal cache and register tile sizes from cache and latency
information (based on ideas from "Analytical Modeling Is Enough for High-Performance BLIS" by Tze Meng Low et al., published in TOMS 2016).
While bootstrapping this model, these target values were kept in Polly.
However, as our implementation is now rather mature, it seems time to teach
LLVM itself about cache sizes.

Interestingly, L1 and L2 cache sizes are pretty constant across
micro-architectures, hence a set of architecture-specific default values
seems like a good start. They can be expanded to more target-specific values
in case certain newer architectures require different values. For now, a set
of Intel architectures is provided.
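
To make the idea of per-architecture defaults concrete, here is a minimal
standalone sketch of a target-level cache query that a client can consult,
with a fallback when the target provides no information. The names
CacheLevel, CacheModel and bytesPerWay are hypothetical and are not LLVM's
actual interface, and the sizes and associativities are illustrative
placeholders rather than the values committed here.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for the data cache levels a target may describe.
enum class CacheLevel { L1D, L2D };

// Hypothetical target description. A target without cache information
// answers every query with std::nullopt.
struct CacheModel {
  bool Known;

  // Cache size in bytes (placeholder values, assumed for illustration).
  std::optional<unsigned> getCacheSize(CacheLevel Level) const {
    if (!Known)
      return std::nullopt;
    switch (Level) {
    case CacheLevel::L1D:
      return 32u * 1024u;
    case CacheLevel::L2D:
      return 256u * 1024u;
    }
    return std::nullopt;
  }

  // Number of ways (placeholder values, assumed for illustration).
  std::optional<unsigned> getCacheAssociativity(CacheLevel Level) const {
    if (!Known)
      return std::nullopt;
    switch (Level) {
    case CacheLevel::L1D:
    case CacheLevel::L2D:
      return 8u;
    }
    return std::nullopt;
  }
};

// Bytes held by a single way of the given cache level, or Fallback when the
// target does not describe that level.
unsigned bytesPerWay(const CacheModel &M, CacheLevel Level, unsigned Fallback) {
  std::optional<unsigned> Size = M.getCacheSize(Level);
  std::optional<unsigned> Ways = M.getCacheAssociativity(Level);
  if (!Size || !Ways || *Ways == 0)
    return Fallback;
  return *Size / *Ways;
}
```

Returning an optional value lets a client such as a tiling heuristic tell
"unknown" apart from a real size and fall back to its own defaults.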

Just as a little teaser, for a simple gemm kernel this model allows us to
reduce the run time from 1.2s to 0.27s. For gemm kernels with less optimal
memory layouts even larger speedups can be reported.

Reviewers: Meinersbur, bollu, singam-sanjay, hfinkel, gareevroman, fhahn, sebpop, efriedma, asb

Reviewed By: fhahn, asb

Subscribers: lsaba, asb, pollydev, llvm-commits

Differential Revision: https://reviews.llvm.org/D37051

llvm-svn: 311647

2017-08-24 09:46:25 +00:00

AliasAnalysis.cpp

[Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).

2017-08-11 21:30:02 +00:00

AliasAnalysisEvaluator.cpp

Sort the remaining #include lines in include/... and lib/....

2017-06-06 11:49:48 +00:00

AliasAnalysisSummary.cpp

Update a comment.

2016-08-25 01:29:55 +00:00

AliasAnalysisSummary.h

Make some LLVM_CONSTEXPR variables const. NFC.

2016-08-25 01:05:08 +00:00

AliasSetTracker.cpp

[Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).

2017-07-24 23:16:33 +00:00

Analysis.cpp

MemorySSA: Move to Analysis, from Transforms/Utils. It's used as

2017-04-11 20:06:36 +00:00

AssumptionCache.cpp

[IR][AssumptionCache] Add m_Shift and m_BitwiseLogic matchers to replace a couple m_CombineOr

2017-06-24 06:27:14 +00:00

BasicAliasAnalysis.cpp

[Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).

2017-08-11 21:30:02 +00:00

BlockFrequencyInfo.cpp

[Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

2017-07-21 21:37:46 +00:00

BlockFrequencyInfoImpl.cpp

[Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

2017-07-21 21:37:46 +00:00

BranchProbabilityInfo.cpp

[PGO] Set edge weights for indirectbr instruction with profile counts

2017-08-23 21:36:02 +00:00

CallGraph.cpp

[Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).

2017-07-24 23:16:33 +00:00

CallGraphSCCPass.cpp

Address http://bugs.llvm.org/pr32207 by making BannerPrinted local to runOnSCC and skipping banner for function declarations.

2017-06-12 02:18:50 +00:00

CallPrinter.cpp

Sort the remaining #include lines in include/... and lib/....

2017-06-06 11:49:48 +00:00

CaptureTracking.cpp

fix trivial typos; NFC

2017-07-09 05:54:44 +00:00

CFG.cpp

…

CFGPrinter.cpp

[PM] Port CFGViewer and CFGPrinter to the new Pass Manager

2016-09-15 18:35:27 +00:00

CFLAndersAliasAnalysis.cpp

[Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).

2017-08-11 21:30:02 +00:00

CFLGraph.h

[Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).

2017-08-11 21:30:02 +00:00

CFLSteensAliasAnalysis.cpp

[Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).

2017-08-11 21:30:02 +00:00

CGSCCPassManager.cpp

[PM] Switch the CGSCC debug messages to use the standard LLVM debug

2017-08-11 05:47:13 +00:00

CMakeLists.txt

Recommit r310869, "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify"

2017-08-14 21:39:51 +00:00

CmpInstAnalysis.cpp

[InstSimplify] Teach decomposeBitTestICmp to handle non-canonical compares

2017-08-14 22:11:43 +00:00

CodeMetrics.cpp

Sort the remaining #include lines in include/... and lib/....

2017-06-06 11:49:48 +00:00

ConstantFolding.cpp

Add strictfp attribute to prevent unwanted optimizations of libm calls

2017-08-14 21:15:13 +00:00

CostModel.cpp

[SLP] Initial rework for min/max horizontal reduction vectorization, NFC.

2017-07-31 14:36:05 +00:00

Delinearization.cpp

…

DemandedBits.cpp

[DemandedBits] simplify call; NFC

2017-08-16 14:28:23 +00:00

DependenceAnalysis.cpp

fix typos in comments and error messages; NFC

2017-07-10 12:44:25 +00:00

DivergenceAnalysis.cpp

DivergencyAnalysis patch for review

2017-06-15 19:33:10 +00:00

DominanceFrontier.cpp

[Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).

2017-07-24 23:16:33 +00:00

DomPrinter.cpp

[DomPrinter] Add a way to programmatically dump a dot representation.

2017-04-24 17:48:44 +00:00

EHPersonalities.cpp

[EH] Recognize __(gxx|gcc)_personality_seh0 as the GNU EH personalities

2017-05-31 22:35:52 +00:00

GlobalsModRef.cpp

GlobalsModRef: Ensure optnone+readonly/readnone attributes are respected

2017-06-07 21:37:39 +00:00

IndirectCallPromotionAnalysis.cpp

Make ICP uses PSI to check for hotness.

2017-08-08 20:57:33 +00:00

InlineCost.cpp

[InlineCost] Add cl::opt to allow full inline cost to be computed for debugging purposes.

2017-08-21 20:00:09 +00:00

InstCount.cpp

[Analysis] Remove TotalMemInst counting in InstCount to avoid reading back other Statistic variables

2017-07-18 02:41:12 +00:00

InstructionSimplify.cpp

Recommit r310869, "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify"

2017-08-14 21:39:51 +00:00

Interval.cpp

[Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).

2017-07-24 23:16:33 +00:00

IntervalPartition.cpp

[Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).

2017-07-24 23:16:33 +00:00

IteratedDominanceFrontier.cpp

[Dominators] Make IsPostDominator a template parameter

2017-07-14 18:26:09 +00:00

IVUsers.cpp

[IVUsers] Don't bail out of normalizing non-affine add recs

2017-04-25 06:53:25 +00:00

LazyBlockFrequencyInfo.cpp

[LazyBFI] Fix typos

2017-02-14 17:21:12 +00:00

LazyBranchProbabilityInfo.cpp

[BPI] Don't assume that strcmp returning >0 is more likely than <0

2017-06-08 09:44:40 +00:00

LazyCallGraph.cpp

[Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).

2017-08-11 21:30:02 +00:00

LazyValueInfo.cpp

[LVI] Fix LVI compile time regression around constantFoldUser()

2017-08-10 02:23:14 +00:00

Lint.cpp

[Constants] If we already have a ConstantInt*, prefer to use isZero/isOne/isMinusOne instead of isNullValue/isOneValue/isAllOnesValue inherited from Constant. NFCI

2017-07-06 18:39:47 +00:00

LLVMBuild.txt

Update libdeps to add BinaryFormat, introduced in r304864.

2017-06-07 04:48:49 +00:00

Loads.cpp

Make visible isDereferenceableAndAlignedPointer(..., const APInt &Size, ...)

2017-06-24 01:35:13 +00:00

LoopAccessAnalysis.cpp

[LAA] Correctly return a half-open range in expandBounds

2017-04-05 09:24:26 +00:00

LoopAnalysisManager.cpp

Revert r293017 and fix the actual underlying issue.

2017-02-07 01:50:48 +00:00

LoopInfo.cpp

[Dominators] Make IsPostDominator a template parameter

2017-07-14 18:26:09 +00:00

LoopPass.cpp

[LegacyPM] Make the 'addLoop' method accept a loop to add rather than

2017-05-25 03:01:31 +00:00

LoopUnrollAnalyzer.cpp

[LoopUnrollAnalyzer] Handle out of bounds accesses in visitLoad

2016-07-23 02:56:49 +00:00

MemDepPrinter.cpp

Sort the remaining #include lines in include/... and lib/....

2017-06-06 11:49:48 +00:00

MemDerefPrinter.cpp

Sort the remaining #include lines in include/... and lib/....

2017-06-06 11:49:48 +00:00

MemoryBuiltins.cpp

[Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

2017-08-16 22:07:40 +00:00

MemoryDependenceAnalysis.cpp

[Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

2017-08-16 22:07:40 +00:00

MemoryLocation.cpp

[Analysis] Add LibFunc_ prefix to enums in TargetLibraryInfo. (NFC)

2017-01-23 23:16:46 +00:00

MemorySSA.cpp

[Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

2017-08-16 22:07:40 +00:00

MemorySSAUpdater.cpp

[mssa] Fix case when there is no definition in a block prior to an inserted use.

2017-06-07 16:46:53 +00:00

ModuleDebugInfoPrinter.cpp

Sort the remaining #include lines in include/... and lib/....

2017-06-06 11:49:48 +00:00

ModuleSummaryAnalysis.cpp

[lib/Analysis] - Mark personality functions as live.

2017-08-22 08:50:56 +00:00

ObjCARCAliasAnalysis.cpp

Consistently use FunctionAnalysisManager

2016-08-09 00:28:15 +00:00

ObjCARCAnalysisUtils.cpp

…

ObjCARCInstKind.cpp

Sort the remaining #include lines in include/... and lib/....

2017-06-06 11:49:48 +00:00

OptimizationDiagnosticInfo.cpp

[ORE] Add diagnostics hotness threshold

2017-06-30 23:14:53 +00:00

OrderedBasicBlock.cpp

[OrderedBasicBlock] Return false for comesBefore(A, A)

2017-06-02 13:10:31 +00:00

PHITransAddr.cpp

PHITransAddr: Use new SimplifyQuery based API.

2017-04-26 20:56:13 +00:00

PostDominators.cpp

[Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

2017-08-16 22:07:40 +00:00

ProfileSummaryInfo.cpp

Adjust the hotness threshold from 99.9% to 99%.

2017-08-04 16:20:54 +00:00

PtrUseVisitor.cpp

[Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

2017-08-18 23:51:26 +00:00

README.txt

…

RegionInfo.cpp

[Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).

2017-06-27 21:52:05 +00:00

RegionPass.cpp

Add opt-bisect support for region passes.

2017-06-01 21:22:26 +00:00

RegionPrinter.cpp

Sort the remaining #include lines in include/... and lib/....

2017-06-06 11:49:48 +00:00

ScalarEvolution.cpp

[Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

2017-08-18 23:51:26 +00:00

ScalarEvolutionAliasAnalysis.cpp

[PM] Change the static object whose address is used to uniquely identify

2016-11-23 17:53:26 +00:00

ScalarEvolutionExpander.cpp

[SCEV] Teach SCEVExpander to expand BinPow

2017-06-19 06:24:53 +00:00

ScalarEvolutionNormalization.cpp

Sort the remaining #include lines in include/... and lib/....

2017-06-06 11:49:48 +00:00

ScopedNoAliasAA.cpp

[Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

2017-08-18 23:51:26 +00:00

SparsePropagation.cpp

[Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

2017-08-18 23:51:26 +00:00

StratifiedSets.h

Do a sweep over move ctors and remove those that are identical to the default.

2016-10-20 12:20:28 +00:00

TargetLibraryInfo.cpp

Revert "Add pthread_self function prototype and make it speculatable."

2017-05-21 00:37:55 +00:00

TargetTransformInfo.cpp

Model cache size and associativity in TargetTransformInfo

2017-08-24 09:46:25 +00:00

Trace.cpp

[Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

2017-07-21 21:37:46 +00:00

TypeBasedAliasAnalysis.cpp

[Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

2017-08-18 23:51:26 +00:00

TypeMetadataUtils.cpp

Analysis: Add appropriate const qualification to functions in TypeMetadataUtils.cpp. NFC.

2017-01-27 22:55:30 +00:00

ValueTracking.cpp

[ValueTracking] Add assertions that the starting Depth in isKnownToBeAPowerOfTwo and ComputeNumSignBitsImpl is not above MaxDepth

2017-08-21 22:56:12 +00:00

VectorUtils.cpp

[Constants] If we already have a ConstantInt*, prefer to use isZero/isOne/isMinusOne instead of isNullValue/isOneValue/isAllOnesValue inherited from Constant. NFCI

2017-07-06 18:39:47 +00:00

README.txt

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n); however,
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.
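
As a sanity check on the claim above, the following C++ sketch steps the
recurrence {1,+,3,+,2} and compares the result with the expanded form. It
assumes the exit value is taken at iteration %n - 1 (inferred from the
expanded expression, not stated in this note) and that %n >= 1. The function
names are made up for illustration, and plain 64-bit arithmetic stands in
for the i65 computation, so it is only meaningful for small %n.

```cpp
#include <cassert>
#include <cstdint>

// Exit value of {1,+,3,+,2}, assuming the loop exits at iteration N - 1.
// The value starts at 1, the step starts at 3, and the step grows by 2 on
// every iteration.
std::uint64_t exitValueByStepping(std::uint64_t N) {
  assert(N >= 1 && "sketch assumes at least one iteration");
  std::uint64_t Value = 1, Step = 3;
  for (std::uint64_t I = 0; I + 1 < N; ++I) {
    Value += Step;
    Step += 2;
  }
  return Value;
}

// The expanded form quoted above, written with 64-bit arithmetic:
//   -2 + 2 * (((N - 2) * (N - 1)) / 2) + 3 * N
// The product can overflow 64 bits for large N, which is why the real
// expression widens to i65; this sketch is intended for small N only.
std::uint64_t expandedForm(std::uint64_t N) {
  return 3 * N - 2 + 2 * (((N - 2) * (N - 1)) / 2);
}

// The simple form suggested above.
std::uint64_t squareForm(std::uint64_t N) { return N * N; }
```

Algebraically, 1 + 3*i + 2*(i*(i-1)/2) = (i+1)^2, so substituting
i = %n - 1 gives %n * %n, which is what all three functions compute.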

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,
ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))
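
The fold relies on truncation commuting with negation in modular
arithmetic, so the first two terms cancel. A small C++ sketch of that
identity, using unsigned 64-bit and 32-bit integers as stand-ins for i64 and
i32 (the function name is hypothetical):

```cpp
#include <cassert>
#include <cstdint>

// Returns true when trunc(-1 * Arg) + trunc(Arg) is zero in 32-bit
// wrapping arithmetic, mirroring the first two terms of the expression.
bool truncatedNegationCancels(std::uint64_t Arg) {
  std::uint32_t NegLow = static_cast<std::uint32_t>(0 - Arg); // trunc(-1 * Arg)
  std::uint32_t Low = static_cast<std::uint32_t>(Arg);        // trunc(Arg)
  return static_cast<std::uint32_t>(NegLow + Low) == 0;
}
```

The low 32 bits of -x are the negation of the low 32 bits of x, so the sum
is zero for every input and only the undef term remains.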

//===---------------------------------------------------------------------===//