llvm-project

Author	SHA1	Message	Date
Hal Finkel	39a95032d2	BBVectorize: Omit unnecessary entries in PairableInstUsers This map is queried only for instructions in pairs of pairable instructions; so make sure that only pairs of pairable instructions are added to the map. This gives a 3.5% speedup on the csa.ll test case from PR15222. No functionality change intended. llvm-svn: 174914	2013-02-11 23:02:09 +00:00
Hal Finkel	0b8ae895b4	BBVectorize: Eliminate one more restricted linear search This eliminates one more linear search over a range of std::multimap entries. This gives a 22% speedup on the csa.ll test case from PR15222. No functionality change intended. llvm-svn: 174893	2013-02-11 17:19:34 +00:00
Hal Finkel	cb268f7995	BBVectorize: Remove the linear searches from pair connection searching This removes the last of the linear searches over ranges of std::multimap iterators, giving a 7% speedup on the doduc.bc input from PR15222. No functionality change intended. llvm-svn: 174859	2013-02-11 05:29:51 +00:00
Hal Finkel	fee38f9754	BBVectorize: Avoid linear searches within the load-move set This is another cleanup aimed at eliminating linear searches in ranges of std::multimap. No functionality change intended. llvm-svn: 174858	2013-02-11 05:29:49 +00:00
Hal Finkel	dd4bc66593	BBVectorize: isa/cast cleanup in getInstructionTypes Profiling suggests that getInstructionTypes is performance-sensitive, this cleans up some double-casting in that function in favor of using dyn_cast. No functionality change intended. llvm-svn: 174857	2013-02-11 05:29:48 +00:00
Hal Finkel	c1cc166948	BBVectorize: Make the bookkeeping to support full cycle checking less expensive By itself, this does not have much of an effect, but only because in the default configuration the full cycle checks are used only for small problem sizes. This is part of a general cleanup of uses of iteration over std::multimap ranges only for the purpose of checking membership. No functionality change intended. llvm-svn: 174856	2013-02-11 05:29:41 +00:00
Hal Finkel	dd2721842d	BBVectorize: Use TTI->getAddressComputationCost This is a follow-up to the cost-model change in r174713 which splits the cost of a memory operation between the address computation and the actual memory access. In r174713, this cost is always added to the memory operation cost, and so BBVectorize will do the same. Currently, this new cost function is used only by ARM, and I don't have any ARM test cases for BBVectorize. Assistance in generating some good ARM test cases for BBVectorize would be greatly appreciated! llvm-svn: 174743	2013-02-08 21:13:39 +00:00
Jakob Stoklund Olesen	479e5a9313	Typos. llvm-svn: 174723	2013-02-08 17:43:32 +00:00
Arnold Schwaighofer	594fa2dc2b	ARM cost model: Address computation in vector mem ops not free Adds a function to target transform info to query for the cost of address computation. The cost model analysis pass now also queries this interface. The code in LoopVectorize adds the cost of address computation as part of the memory instruction cost calculation. Only there, we know whether the instruction will be scalarized or not. Increase the penalty for inserting into D registers on swift. This becomes necessary because we now always assume that address computation has a cost and three is a closer value to the architecture. radar://13097204 llvm-svn: 174713	2013-02-08 14:50:48 +00:00
Michael Kuperstein	f63b77be7f	Test Commit llvm-svn: 174709	2013-02-08 12:58:29 +00:00
Nadav Rotem	a9100f3609	fix 80-col violation and fix the docs. llvm-svn: 174671	2013-02-07 22:34:07 +00:00
Arnold Schwaighofer	3476fc8c82	Loop Vectorizer: Refactor Memory Cost Computation We don't want too many classes in a pass and the classes obscure the details. I was going a little overboard with object modeling here. Replace classes by generic code that handles both loads and stores. No functionality change intended. llvm-svn: 174646	2013-02-07 19:05:21 +00:00
Arnold Schwaighofer	3be40b56c5	Loop Vectorizer: Refactor code to compute vectorized memory instruction cost Introduce a helper class that computes the cost of memory access instructions. No functionality change intended. llvm-svn: 174422	2013-02-05 18:46:41 +00:00
Arnold Schwaighofer	22174f5d5a	Loop Vectorizer: Handle pointer stores/loads in getWidestType() In the loop vectorizer cost model, we used to ignore stores/loads of a pointer type when computing the widest type within a loop. This meant that if we had only stores/loads of pointers in a loop we would return a widest type of 8bits (instead of 32 or 64 bit) and therefore a vector factor that was too big. Now, if we see a consecutive store/load of pointers we use the size of a pointer (from data layout). This problem occurred in SingleSource/Benchmarks/Shootout-C++/hash.cpp (reduced test case is the first test in vector_ptr_load_store.ll). radar://13139343 llvm-svn: 174377	2013-02-05 15:08:02 +00:00
Pekka Jaaskelainen	f50ab84bb1	LoopVectorize: convert TinyTripCountVectorThreshold constant to a command line switch. llvm-svn: 173837	2013-01-29 21:42:08 +00:00
Benjamin Kramer	cf406756ce	LoopVectorize: Clean up ValueMap a bit and avoid double lookups. No intended functionality change. llvm-svn: 173809	2013-01-29 17:31:33 +00:00
Renato Golin	1258519674	Vectorization Factor clarification llvm-svn: 173691	2013-01-28 16:02:45 +00:00
Hal Finkel	293a41d14f	BBVectorize: Better use of TTI->getShuffleCost When flipping the pair of subvectors that form a vector, if the vector length is 2, we can use the SK_Reverse shuffle kind to get more-accurate cost information. Also we can use the SK_ExtractSubvector shuffle kind to get accurate subvector extraction costs. The current cost model implementations don't yet seem complex enough for this to make a difference (thus, there are no test cases with this commit), but it should help in future. Depending on how the various targets optimize and combine shuffles in practice, we might be able to get more-accurate costs by combining the costs of multiple shuffle kinds. For example, the cost of flipping the subvector pairs could be modeled as two extractions and two subvector insertions. These changes, however, should probably be motivated by specific test cases. llvm-svn: 173621	2013-01-27 20:07:01 +00:00
Hal Finkel	2d443e94b4	BBVectorize: Add an additional comment about the cost computation llvm-svn: 173580	2013-01-26 16:49:04 +00:00
Hal Finkel	351a75b6d7	BBVectorize: Fix anomalous capital letter in comment llvm-svn: 173579	2013-01-26 16:49:03 +00:00
Nadav Rotem	69a040d3eb	LoopVectorize: Refactor the code that vectorizes loads/stores to remove duplication. llvm-svn: 173500	2013-01-25 21:47:42 +00:00
Benjamin Kramer	21e8da5990	LoopVectorize: Simplify code. No functionality change. llvm-svn: 173475	2013-01-25 19:43:15 +00:00
Nadav Rotem	8e9ca2f8cb	LoopVectorizer: Refactor more code to use the IRBuilder. llvm-svn: 173471	2013-01-25 19:26:23 +00:00
Nadav Rotem	c8adf3ff6e	Refactor some code to use the IRBuilder. llvm-svn: 173467	2013-01-25 18:34:09 +00:00
Nadav Rotem	ab3e698ee9	Add support for reverse pointer induction variables. These are loops that contain pointers that count backwards. For example, this is the hot loop in BZIP: do { m = --p; p = ( ... ); } while (--n); llvm-svn: 173219	2013-01-23 01:35:00 +00:00
Nadav Rotem	b2e7e7a0b6	Fix a comment. Induction vars don't need to start at zero. llvm-svn: 173061	2013-01-21 17:59:18 +00:00
Benjamin Kramer	a6e2e2a0a7	LoopVectorize: Fix a C++11 incompatibility. llvm-svn: 172990	2013-01-20 20:29:52 +00:00
Nadav Rotem	da9f2adffd	Fix a build error. llvm-svn: 172971	2013-01-20 09:39:17 +00:00
Nadav Rotem	c42f90b1f4	LoopVectorizer: Implement a new heuristic for selecting the unroll factor. We ignore the cpu frontend and focus on pipeline utilization. We do this because we don't have a good way to estimate the loop body size at the IR level. llvm-svn: 172964	2013-01-20 05:24:29 +00:00
Benjamin Kramer	d455ed85d1	LoopVectorizer: Emit memory checks into their own basic block. This separates the check for "too few elements to run the vector loop" from the "memory overlap" check, giving much nicer code and allowing us to skip the memory checks when we're not going to execute the vector code anyway. We still leave the decision of whether to emit the memory checks as branches or setccs, but it seems to be doing a good job. If ugly code pops up we may want to emit them as separate blocks too. Small speedup on MultiSource/Benchmarks/MallocBench/espresso. Most of this is legwork to allow multiple bypass blocks while updating PHIs, dominators and loop info. llvm-svn: 172902	2013-01-19 13:57:58 +00:00
Nadav Rotem	d33ce6f100	LoopVectorizer cost model. Honor the user command line flag that selects the vectorization factor even if the target machine does not have any vector registers. llvm-svn: 172544	2013-01-15 18:25:16 +00:00
Nadav Rotem	40e45eeae2	Fix PR14547. Handle induction variables of small sizes smaller than i32 (i8 and i16). llvm-svn: 172348	2013-01-13 07:56:29 +00:00
Nadav Rotem	853fe0acb9	ARM Cost Model: We need to detect the max bitwidth of types in the loop in order to select the max vectorization factor. We don't have a detailed analysis on which values are vectorized and which stay scalars in the vectorized loop so we use another method. We look at reduction variables, loads and stores, which are the only ways to get information in and out of loop iterations. If the data types are extended and truncated then the cost model will catch the cost of the vector zext/sext/trunc operations. llvm-svn: 172178	2013-01-11 07:11:59 +00:00
Nadav Rotem	6eae65cfac	LoopVectorizer: Fix a bug in the vectorization of BinaryOperators. The BinaryOperator can be folded to an Undef, and we don't want to set NSW flags to undef vals. PR14878 llvm-svn: 172079	2013-01-10 17:34:39 +00:00
Nadav Rotem	b1791a75cd	ARM Cost model: Use the size of vector registers and widest vectorizable instruction to determine the max vectorization factor. llvm-svn: 172010	2013-01-09 22:29:00 +00:00
Nadav Rotem	b696c36fcd	Cost Model: Move the 'max unroll factor' variable to the TTI and add initial Cost Model support on ARM. llvm-svn: 171928	2013-01-09 01:15:42 +00:00
Nadav Rotem	3c352c0f4a	Code cleanup: refactor the switch statements in the generation of reduction variables into an IR builder call. llvm-svn: 171871	2013-01-08 17:37:45 +00:00
Nadav Rotem	6f6d21a17b	Rename the enum members to match the LLVM coding style. llvm-svn: 171868	2013-01-08 17:23:17 +00:00
Nadav Rotem	5a197c06f3	LoopVectorizer: Add support for floating point reductions llvm-svn: 171812	2013-01-07 23:13:00 +00:00
Nadav Rotem	c60d7d96f5	LoopVectorizer: When we vectorize and widen loops we process many elements at once. This is a good thing, except for small loops. On small loops, the post-loop that handles scalars (and runs slower) can take more time to execute than the rest of the loop. This patch disables widening of loops with a small static trip count. llvm-svn: 171798	2013-01-07 21:54:51 +00:00
Chandler Carruth	b348328b5d	Simplify LoopVectorize to require target transform info and rely on it being present. Make a member of one of the helper classes a reference as part of this. Reformatting goodness brought to you by clang-format. llvm-svn: 171726	2013-01-07 11:12:29 +00:00
Chandler Carruth	b7e60f6844	Merge the unused header file for LoopVectorizer into the source file. This makes the loop vectorizer match the pattern followed by roughly all other passes. =] Notably, this header file was broken in several regards: it contained a using namespace directive, global #define's that aren't globally appropriate, and global constants defined directly in the header file. As a side benefit, lots of the types in this file become internal, which will cause the optimizer to chew on this pass more effectively. llvm-svn: 171723	2013-01-07 10:44:06 +00:00
Chandler Carruth	7383bfd67e	Switch BBVectorize to directly depend on having a TTI analysis. This could be simplified further, but Hal has a specific feature for ignoring TTI, and so I preserved that. Also, I needed to use it because a number of tests fail when switching from a null TTI to the NoTTI nonce implementation. That seems suspicious to me and so may be something that you need to look into, Hal. I worked around it by preserving the old behavior for these tests with the flag that ignores all target info. llvm-svn: 171722	2013-01-07 10:22:36 +00:00
Chandler Carruth	04ece8623e	Fix a slew of indentation and parameter naming style issues. 80% of this patch is brought to you by the tool clang-format. I wanted to fix up the names of constructor parameters because they followed a bit of an anti-pattern by naming initialisms with CamelCase: 'Tti', 'Se', etc. This appears to have been in an attempt to not overlap with the names of member variables 'TTI', 'SE', etc. However, constructor arguments can very safely alias members, and in fact that's the conventional way to pass in members. I've fixed all of these I saw, along with making some strange abbreviations such as 'Lp' be simpler 'L', or 'Lgl' be the word 'Legal'. However, the code I was touching had indentation and formatting somewhat all over the map. So I ran clang-format and fixed them. I also fixed a few other formatting or doxygen formatting issues such as using ///< on trailing comments so they are associated with the correct entry. There is still a lot of room for improvement of the formatting and cleanliness of this code. ;] At least a few parts of the coding standards or common practices in LLVM's code aren't followed, the enum naming rules jumped out at me. I may fix some of these while I'm here, but not all of them. llvm-svn: 171719	2013-01-07 09:57:00 +00:00
Chandler Carruth	2109f47d97	Fix the enumerator names for ShuffleKind to match the coding standards, and make its comments doxygen comments. llvm-svn: 171688	2013-01-07 03:20:02 +00:00
Chandler Carruth	d3e73556d6	Move TargetTransformInfo to live under the Analysis library. This no longer would violate any dependency layering and it is in fact an analysis. =] llvm-svn: 171686	2013-01-07 03:08:10 +00:00
Chandler Carruth	21b3c586ab	Switch the loop vectorizer from VTTI to just use TTI directly. llvm-svn: 171620	2013-01-05 10:16:02 +00:00
Chandler Carruth	7c4f91dea5	Switch the BB vectorizer from the VTTI interface to the simple TTI interface. llvm-svn: 171618	2013-01-05 10:05:28 +00:00
Nadav Rotem	e9f5bfd5e9	LoopVectorize: Non commutative operators can be used as reduction variables as long as the reduction chain is used in the LHS. PR14803. llvm-svn: 171583	2013-01-05 01:15:47 +00:00
Paul Redmond	874f01e956	Do not vectorize loops with subtraction reductions Since subtraction does not commute the loop vectorizer incorrectly vectorizes reductions such as x = A[i] - x. Disabling for now. llvm-svn: 171537	2013-01-04 22:10:16 +00:00

1581 Commits