llvm-project

Author	SHA1	Message	Date
Uday Bondhugula	5d22044b5f	Fix for getMemRefSizeInBytes: unsigned -> uint64_t PiperOrigin-RevId: 234829637	2019-03-29 16:35:26 -07:00
Uday Bondhugula	f97c1c5b06	Misc. updates/fixes to analysis utils used for DMA generation; update DMA generation pass to make it drop certain assumptions, complete TODOs. - multiple fixes for getMemoryFootprintBytes - pass loopDepth correctly from getMemoryFootprintBytes() - use union while computing memory footprints - bug fixes for addAffineForOpDomain - take into account loop step - add domains of other loop IVs in turn that might have been used in the bounds - dma-generate: drop assumption of "non-unit stride loops being tile space loops and skipping those and recursing to inner depths"; DMA generation is now purely based on available fast mem capacity and memory footprint's calculated - handle memory region compute failures/bailouts correctly from dma-generate - loop tiling cleanup/NFC - update some debug and error messages to use emitNote/emitError in pipeline-data-transfer pass - NFC PiperOrigin-RevId: 234245969	2019-03-29 16:30:26 -07:00
Uday Bondhugula	f5eed89df0	Fix + cleanup for getMemRefRegion() - determine symbols for the memref region correctly - this wasn't exposed earlier since we didn't have any test cases where the portion of the nest being DMAed for was non-hyperrectangular (i.e., bounds of one IV depending on other IVs within that part) PiperOrigin-RevId: 233493872	2019-03-29 16:23:53 -07:00
Uday Bondhugula	c419accea3	Automated rollback of changelist 232728977. PiperOrigin-RevId: 232944889	2019-03-29 16:21:38 -07:00
Uday Bondhugula	4ba8c9147d	Automated rollback of changelist 232717775. PiperOrigin-RevId: 232807986	2019-03-29 16:19:33 -07:00
River Riddle	fd2d7c857b	Rename the 'if' operation in the AffineOps dialect to 'affine.if' and namespace the AffineOps dialect with 'affine'. PiperOrigin-RevId: 232728977	2019-03-29 16:18:59 -07:00
River Riddle	90d10b4e00	NFC: Rename the 'for' operation in the AffineOps dialect to 'affine.for'. The is the second step to adding a namespace to the AffineOps dialect. PiperOrigin-RevId: 232717775	2019-03-29 16:17:59 -07:00
River Riddle	905d84851d	Address post submit review comments for removing Block::findInstPositionInBlock. PiperOrigin-RevId: 232713514	2019-03-29 16:17:44 -07:00
MLIR Team	b9dde91ea6	Adds the ability to compute the MemRefRegion of a sliced loop nest. Utilizes this feature during loop fusion cost computation, to compute what the write region of a fusion candidate loop nest slice would be (without having to materialize the slice or change the IR). ) Adds parameter to public API of MemRefRegion::compute for passing in the slice loop bounds to compute the memref region of the loop nest slice. ) Exposes public method MemRefRegion::getRegionSize for computing the size of the memref region in bytes. PiperOrigin-RevId: 232706165	2019-03-29 16:17:15 -07:00
River Riddle	42a2d7d6e1	Remove findInstPositionInBlock from the Block api. PiperOrigin-RevId: 232704766	2019-03-29 16:16:43 -07:00
River Riddle	10237de8eb	Refactor the affine analysis by moving some functionality to IR and some to AffineOps. This is important for allowing the affine dialect to define canonicalizations directly on the operations instead of relying on transformation passes, e.g. ComposeAffineMaps. A summary of the refactoring: * AffineStructures has moved to IR. * simplifyAffineExpr/simplifyAffineMap/getFlattenedAffineExpr have moved to IR. * makeComposedAffineApply/fullyComposeAffineMapAndOperands have moved to AffineOps. * ComposeAffineMaps is replaced by AffineApplyOp::canonicalize and deleted. PiperOrigin-RevId: 232586468	2019-03-29 16:15:41 -07:00
River Riddle	c9ad4621ce	NFC: Move AffineApplyOp to the AffineOps dialect. This also moves the isValidDim/isValidSymbol methods from Value to the AffineOps dialect. PiperOrigin-RevId: 232386632	2019-03-29 16:12:40 -07:00
Uday Bondhugula	0f50414fa4	Refactor common code getting memref access in getMemRefRegion - NFC - use getAccessMap() instead of repeating it - fold getMemRefRegion into MemRefRegion ctor (more natural, avoid heap allocation and unique_ptr where possible) - change extractForInductionVars - MutableArrayRef -> ArrayRef for the arguments. Since the method is just returning copies of 'Value *', the client can't mutate the pointers themselves; it's fine to mutate the 'Value''s themselves, but that doesn't mutate the pointers to those. - change the way extractForInductionVars returns (see b/123437690) PiperOrigin-RevId: 232359277	2019-03-29 16:12:25 -07:00
Uday Bondhugula	b26900dce5	Update dma-generate pass to (1) work on blocks of instructions (instead of just loops), (2) take into account fast memory space capacity and lower 'dmaDepth' to fit, (3) add location information for debug info / errors - change dma-generate pass to work on blocks of instructions (start/end iterators) instead of 'for' loops; complete TODOs - allows DMA generation for straightline blocks of operation instructions interspersed b/w loops - take into account fast memory capacity: check whether memory footprint fits in fastMemoryCapacity parameter, and recurse/lower the depth at which DMA generation is performed until it does fit in the provided memory - add location information to MemRefRegion; any insufficient fast memory capacity errors or debug info w.r.t dma generation shows location information - allow DMA generation pass to be instantiated with a fast memory capacity option (besides command line flag) - change getMemRefRegion to return unique_ptr's - change getMemRefFootprintBytes to work on a 'Block' instead of 'ForInst' - other helper methods; add postDomInstFilter option for replaceAllMemRefUsesWith; drop forInst->walkOps, add Block::walkOps methods Eg. output $ mlir-opt -dma-generate -dma-fast-mem-capacity=1 /tmp/single.mlir /tmp/single.mlir:9:13: error: Total size of all DMA buffers' for this block exceeds fast memory capacity for %i3 = (d0) -> (d0)(%i1) to (d0) -> (d0 + 32)(%i1) { ^ $ mlir-opt -debug-only=dma-generate -dma-generate -dma-fast-mem-capacity=400 /tmp/single.mlir /tmp/single.mlir:9:13: note: 8 KiB of DMA buffers in fast memory space for this block for %i3 = (d0) -> (d0)(%i1) to (d0) -> (d0 + 32)(%i1) { PiperOrigin-RevId: 232297044	2019-03-29 16:09:52 -07:00
River Riddle	870d778350	Begin the process of fully removing OperationInst. This patch cleans up references to OperationInst in the /include, /AffineOps, and lib/Analysis. PiperOrigin-RevId: 232199262	2019-03-29 16:09:36 -07:00
River Riddle	5052bd8582	Define the AffineForOp and replace ForInst with it. This patch is largely mechanical, i.e. changing usages of ForInst to OpPointer<AffineForOp>. An important difference is that upon construction an AffineForOp no longer automatically creates the body and induction variable. To generate the body/iv, 'createBody' can be called on an AffineForOp with no body. PiperOrigin-RevId: 232060516	2019-03-29 16:06:49 -07:00
River Riddle	755538328b	Recommit: Define a AffineOps dialect as well as an AffineIfOp operation. Replace all instances of IfInst with AffineIfOp and delete IfInst. PiperOrigin-RevId: 231342063	2019-03-29 15:59:30 -07:00
Nicolas Vasilache	ae772b7965	Automated rollback of changelist 231318632. PiperOrigin-RevId: 231327161	2019-03-29 15:42:38 -07:00
River Riddle	5ecef2b3f6	Define a AffineOps dialect as well as an AffineIfOp operation. Replace all instances of IfInst with AffineIfOp and delete IfInst. PiperOrigin-RevId: 231318632	2019-03-29 15:42:08 -07:00
River Riddle	36babbd781	Change the ForInst induction variable to be a block argument of the body instead of the ForInst itself. This is a necessary step in converting ForInst into an operation. PiperOrigin-RevId: 231064139	2019-03-29 15:40:23 -07:00
Nicolas Vasilache	0e7a8a9027	Drop AffineMap::Null and IntegerSet::Null Addresses b/122486036 This CL addresses some leftover crumbs in AffineMap and IntegerSet by removing the Null method and cleaning up the constructors. As the ::Null uses were tracked down, opportunities appeared to untangle some of the Parsing logic and make it explicit where AffineMap/IntegerSet have ambiguous syntax. Previously, ambiguous cases were hidden behind the implicit pointer values of AffineMap* and IntegerSet* that were passed as function parameters. Depending the values of those pointers one of 3 behaviors could occur. This parsing logic convolution is one of the rare cases where I would advocate for code duplication. The more proper fix would be to make the syntax unambiguous or to allow some lookahead. PiperOrigin-RevId: 231058512	2019-03-29 15:40:08 -07:00
River Riddle	c3424c3c75	Allow operations to hold a blocklist and add support for parsing/printing a block list for verbose printing. PiperOrigin-RevId: 230951462	2019-03-29 15:37:37 -07:00
Uday Bondhugula	f94b15c247	Update dma-generate: update for multiple load/store op's per memref - introduce a way to compute union using symbolic rectangular bounding boxes - handle multiple load/store op's to the same memref by taking a union of the regions - command-line argument to provide capacity of the fast memory space - minor change to replaceAllMemRefUsesWith to not generate affine_apply if the supplied index remap was identity PiperOrigin-RevId: 230848185	2019-03-29 15:35:38 -07:00
River Riddle	451869f394	Add cloning functionality to Block and Function, this also adds support for remapping successor block operands of terminator operations. We define a new BlockAndValueMapping class to simplify mapping between cloned values. PiperOrigin-RevId: 230768759	2019-03-29 15:34:20 -07:00
River Riddle	6859f33292	Migrate VectorOrTensorType/MemRefType shape api to use int64_t instead of int. PiperOrigin-RevId: 230605756	2019-03-29 15:33:20 -07:00
Uday Bondhugula	864d9e02a1	Update fusion cost model + some additional infrastructure and debug information for -loop-fusion - update fusion cost model to fuse while tolerating a certain amount of redundant computation; add cl option -fusion-compute-tolerance evaluate memory footprint and intermediate memory reduction - emit debug info from -loop-fusion showing what was fused and why - introduce function to compute memory footprint for a loop nest - getMemRefRegion readability update - NFC PiperOrigin-RevId: 230541857	2019-03-29 15:32:06 -07:00
Uday Bondhugula	94a03f864f	Allocate private/local buffers for slices accurately during fusion - the size of the private memref created for the slice should be based on the memref region accessed at the depth at which the slice is being materialized, i.e., symbolic in the outer IVs up until that depth, as opposed to the region accessed based on the entire domain. - leads to a significant contraction of the temporary / intermediate memref whenever the memref isn't reduced to a single scalar (through store fwd'ing). Other changes - update to promoteIfSingleIteration - avoid introducing unnecessary identity map affine_apply from IV; makes it much easier to write and read test cases and pass output for all passes that use promoteIfSingleIteration; loop-fusion test cases become much simpler - fix replaceAllMemrefUsesWith bug that was exposed by the above update - 'domInstFilter' could be one of the ops erased due to a memref replacement in it. - fix getConstantBoundOnDimSize bug: a division by the coefficient of the identifier was missing (the latter need not always be 1); add lbFloorDivisors output argument - rename getBoundingConstantSizeAndShape -> getConstantBoundingSizeAndShape PiperOrigin-RevId: 230405218	2019-03-29 15:30:31 -07:00
MLIR Team	27d067e164	LoopFusion improvements: ) Adds support for fusing into consumer loop nests with multiple loads from the same memref. ) Adds support for reducing slice loop trip count by projecting out destination loop IVs greater than destination loop depth. *) Removes dependence on src loop depth and simplifies cost model computation. PiperOrigin-RevId: 229575126	2019-03-29 15:21:59 -07:00
Uday Bondhugula	03e15e1b9f	Minor code cleanup - NFC. - readability changes PiperOrigin-RevId: 229443430	2019-03-29 15:19:41 -07:00
MLIR Team	38c2fe3158	LoopFusion: automate selection of source loop nest slice depth and destination loop nest insertion depth based on a simple cost model (cost model can be extended/replaced at a later time). ) LoopFusion: Adds fusion cost function which compares the cost of the fused loop nest, with the cost of the two unfused loop nests to determine if it is profitable to fuse the candidate loop nests. The fusion cost function is run for various combinations for src/dst loop depths attempting find the minimum cost setting for src/dst loop depths which does not increase the computational cost when the loop nests are fused. Combinations of src/dst loop depth are evaluated attempting to maximize loop depth (i.e. take a bigger computation slice from the source loop nest, and insert it deeper in the destination loop nest for better locality). ) LoopFusion: Adds utility to compute op instance count for loop nests, sliced loop nests, and to compute the cost of a loop nest fused with another sliced loop nest. ) LoopFusion: canonicalizes slice bound AffineMaps (and updates related tests). ) Analysis::Utils: Splits getBackwardComputationSlice into two functions: one which calculates and returns the slice loop bounds for analysis by LoopFusion, and the other for insertion of the computation slice (ones fusion has calculated the min-cost src/dst loop depths). *) Test: Adds multiple unit tests to test the new functionality. PiperOrigin-RevId: 229219757	2019-03-29 15:13:53 -07:00
Nicolas Vasilache	362557e11c	Simplify compositions of AffineApply This CL is the 6th and last on the path to simplifying AffineMap composition. This removes `AffineValueMap::forwardSubstitutions` and replaces it by simple calls to `fullyComposeAffineMapAndOperands`. PiperOrigin-RevId: 228962580	2019-03-29 15:11:56 -07:00
Chris Lattner	2b902f1288	Delete FuncBuilder::createChecked. It is perhaps still a good idea, but has no clients. Let's re-add it in the future if there is ever a reason to. NFC. Unrelatedly, add a use of a variable to unbreak the non-assert build. PiperOrigin-RevId: 228284026	2019-03-29 15:05:08 -07:00
Uday Bondhugula	e94ba6815a	Fix 0-d memref corner case for getMemRefRegion() - fix crash on test/Transforms/canonicalize.mlir with -memref-bound-check PiperOrigin-RevId: 228268486	2019-03-29 15:03:50 -07:00
Uday Bondhugula	21baf86a2f	Extend loop-fusion's slicing utility + other fixes / updates - refactor toAffineFromEq and the code surrounding it; refactor code into FlatAffineConstraints::getSliceBounds - add FlatAffineConstraints methods to detect identifiers as mod's and div's of other identifiers - add FlatAffineConstraints::getConstantLower/UpperBound - Address b/122118218 (don't assert on invalid fusion depths cmdline flags - instead, don't do anything; change cmdline flags src-loop-depth -> fusion-src-loop-depth - AffineExpr/Map print method update: don't fail on null instances (since we have a wrapper around a pointer, it's avoidable); rationale: dump/print methods should never fail if possible. - Update memref-dataflow-opt to add an optimization to avoid a unnecessary call to IsRangeOneToOne when it's trivially going to be true. - Add additional test cases to exercise the new support - update a few existing test cases since the maps are now generated uniformly with all destination loop operands appearing for the backward slice - Fix projectOut - fix wrong range for getBestElimCandidate. - Fix for getConstantBoundOnDimSize() - didn't show up in any test cases since we didn't have any non-hyperrectangular ones. PiperOrigin-RevId: 228265152	2019-03-29 15:03:20 -07:00
Uday Bondhugula	56b3640b94	Misc readability and doc / code comment related improvements - NFC - when SSAValue/MLValue existed, code at several places was forced to create additional aggregate temporaries of SmallVector<SSAValue/MLValue> to handle the conversion; get rid of such redundant code - use filling ctors instead of explicit loops - for smallvectors, change insert(list.end(), ...) -> append(... - improve comments at various places - turn getMemRefAccess into MemRefAccess ctor and drop duplicated getMemRefAccess. In the next CL, provide getAccess() accessors for load, store, DMA op's to return a MemRefAccess. PiperOrigin-RevId: 228243638	2019-03-29 15:02:41 -07:00
Uday Bondhugula	8496f2c30b	Complete TODOs / cleanup for loop-fusion utility - this is CL 1/2 that does a clean up and gets rid of one limitation in an underlying method - as a result, fusion works for more cases. - fix bugs/incomplete impl. in toAffineMapFromEq - fusing across rank changing reshapes for example now just works For eg. given a rank 1 memref to rank 2 memref reshape (64 -> 8 x 8) like this, -loop-fusion -memref-dataflow-opt now completely fuses and inlines/store-forward to get rid of the temporary: INPUT // Rank 1 -> Rank 2 reshape for %i0 = 0 to 64 { %v = load %A[%i0] store %v, %B[%i0 floordiv 8, i0 mod 8] } for %i1 = 0 to 8 for %i2 = 0 to 8 %w = load %B[%i1, i2] "foo"(%w) : (f32) -> () OUTPUT $ mlir-opt -loop-fusion -memref-dataflow-opt fuse_reshape.mlir #map0 = (d0, d1) -> (d0 * 8 + d1) mlfunc @fuse_reshape(%arg0: memref<64xf32>) { for %i0 = 0 to 8 { for %i1 = 0 to 8 { %0 = affine_apply #map0(%i0, %i1) %1 = load %arg0[%0] : memref<64xf32> "foo"(%1) : (f32) -> () } } } AFAIK, there is no polyhedral tool / compiler that can perform such fusion - because it's not really standard loop fusion, but possible through a generalized slicing-based approach such as ours. PiperOrigin-RevId: 227918338	2019-03-29 14:57:22 -07:00
Uday Bondhugula	f12182157e	Introduce PostDominanceInfo, fix properlyDominates() for Instructions - introduce PostDominanceInfo in the right/complete way and use that for post dominance check in store-load forwarding - replace all uses of Analysis/Utils::dominates/properlyDominates with DominanceInfo::dominates/properlyDominates - drop all redundant copies of dominance methods in Analysis/Utils/ - in pipeline-data-transfer, replace dominates call with a much less expensive check; similarly, substitute dominates() in checkMemRefAccessDependence with a simpler check suitable for that context - fix a bug in properlyDominates - improve doc for 'for' instruction 'body' PiperOrigin-RevId: 227320507	2019-03-29 14:48:44 -07:00
Uday Bondhugula	b9fe6be6d4	Introduce memref store to load forwarding - a simple memref dataflow analysis - the load/store forwarding relies on memref dependence routines as well as SSA/dominance to identify the memref store instance uniquely supplying a value to a memref load, and replaces the result of that load with the value being stored. The memref is also deleted when possible if only stores remain. - add methods for post dominance for MLFunction blocks. - remove duplicated getLoopDepth/getNestingDepth - move getNestingDepth, getMemRefAccess, getNumCommonSurroundingLoops into Analysis/Utils (were earlier static) - add a helper method in FlatAffineConstraints - isRangeOneToOne. PiperOrigin-RevId: 227252907	2019-03-29 14:47:28 -07:00
Chris Lattner	56e2a6cc3b	Merge the verifier logic for all functions into a unified framework, this requires enhancing DominanceInfo to handle the structure of an ML function, which is required anyway. Along the way, this also fixes a const correctness problem with Instruction::getBlock(). This is step 24/n towards merging instructions and statements. PiperOrigin-RevId: 227228900	2019-03-29 14:45:43 -07:00
Chris Lattner	456ad6a8e0	Standardize naming of statements -> instructions, revisting the code base to be consistent and moving the using declarations over. Hopefully this is the last truly massive patch in this refactoring. This is step 21/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 227178245	2019-03-29 14:44:30 -07:00
Chris Lattner	315a466aed	Rename BasicBlock and StmtBlock to Block, and make a pass cleaning it up. I did not make an effort to rename all of the 'bb' names in the codebase, since they are still correct and any specific missed once can be fixed up on demand. The last major renaming is Statement -> Instruction, which is why Statement and Stmt still appears in various places. This is step 19/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 227163082	2019-03-29 14:43:58 -07:00
Chris Lattner	69d9e990fa	Eliminate the using decls for MLFunction and CFGFunction standardizing on Function. This is step 18/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 227139399	2019-03-29 14:43:13 -07:00
Chris Lattner	d798f9bad5	Rename BBArgument -> BlockArgument, Op::getOperation -> Op::getInst(), StmtResult -> InstResult, StmtOperand -> InstOperand, and remove the old names. This is step 17/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 227121537	2019-03-29 14:42:40 -07:00
Chris Lattner	5187cfcf03	Merge Operation into OperationInst and standardize nomenclature around OperationInst. This is a big mechanical patch. This is step 16/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 227093712	2019-03-29 14:42:23 -07:00
Chris Lattner	4c05f8cac6	Merge CFGFuncBuilder/MLFuncBuilder/FuncBuilder together into a single new FuncBuilder class. Also rename SSAValue.cpp to Value.cpp This is step 12/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 227067644	2019-03-29 14:40:22 -07:00
Chris Lattner	3f190312f8	Merge SSAValue, CFGValue, and MLValue together into a single Value class, which is the new base of the SSA value hierarchy. This CL also standardizes all the nomenclature and comments to use 'Value' where appropriate. This also eliminates a large number of cast<MLValue>(x)'s, which is very soothing. This is step 11/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 227064624	2019-03-29 14:40:06 -07:00
Chris Lattner	abf72a8bb1	Rename findFunction from the ML side of the house to be named getFunction(), making it more similar to the CFG side of things. It is true that in a deeply nested case that this is not a guaranteed O(1) time operation, and that 'get' could lead compiler hackers to think this is cheap, but we need to merge these and we can look into solutions for this in the future if it becomes a problem in practice. This is step 9/n towards merging instructions and statements, NFC. PiperOrigin-RevId: 226983931	2019-03-29 14:38:49 -07:00
Chris Lattner	1301f907a1	Refactor ForStmt: having it contain a StmtBlock instead of subclassing StmtBlock. This is more consistent with IfStmt and also conceptually makes more sense - a forstmt "isn't" its body, it contains its body. This is step 1/N towards merging BasicBlock and StmtBlock. This is required because in the new regime StmtBlock will have a use list (just like BasicBlock does) of operands, and ForStmt already has a use list for its induction variable. This is a mechanical patch, NFC. PiperOrigin-RevId: 226684158	2019-03-29 14:35:19 -07:00
MLIR Team	4eef795a1d	Computation slice update: adds parameters to insertBackwardComputationSlice which specify the source loop nest depth at which to perform iteration space slicing, and the destination loop nest depth at which to insert the compution slice. Updates LoopFusion pass to take these parameters as command line flags for experimentation. PiperOrigin-RevId: 226514297	2019-03-29 14:35:03 -07:00
MLIR Team	4f5ef1619e	Pass loop depth 1 to memref dependence check when constructing dependence constraints used to calculate computation slice for loop fusion. This done so that the dominance check between ancestors of op statements from src/dst memref accesses will be run. PiperOrigin-RevId: 226350443	2019-03-29 14:33:46 -07:00

1 2

64 Commits