llvm-project

Author	SHA1	Message	Date
Johannes Rudolf Doerfert	41a278f56a	[OpenMP][FIX] Do not add custom state machine eagerly in LTO runs If we run LTO optimization we might end up introducing a custom state machine and later transforming the region into SPMD. This is a problem. While a follow up will introduce a check for the SPMD conversion, this already prevents the eager custom state machine generation. Only if the kernel init function is defined, rather than declared, we will emit a custom state machine. SPMD-zation can happen eagerly though. Tests are adjusted via a weak definition. The LTO test was added to verify this works as expected. Differential Revision: https://reviews.llvm.org/D136740	2022-10-26 10:40:11 -07:00
Alex Brachet	443e2a10f6	Reland "[PGO] Make emitted symbols hidden" This was reverted because it was breaking when targeting Darwin which tried to export these symbols which are now hidden. It should be safe to just stop attempting to export these symbols in the clang driver, though Apple folks will need to change their TAPI allow list described in the commit where these symbols were originally exported `f538018562` Then reverted again because it broke tests on MacOS, they should be fixed now. Bug: https://github.com/llvm/llvm-project/issues/58265 Differential Revision: https://reviews.llvm.org/D135340	2022-10-26 17:13:05 +00:00
Momchil Velikov	9901583968	Revert "[FuncSpec] Fix specialisation based on literals" This reverts commit a8b0f580170089fcd555ade5565ceff0ec60f609 because of "reverse-iteration" buildbot failure.	2022-10-26 13:54:12 +01:00
Momchil Velikov	2c8a4c6e62	Revert "[FuncSpec][NFC] Refactor finding specialisation opportunities" This reverts commit a8853924bd3c50deebfbf993c037257ccf9805f4 due to dependency on a8b0f5801700	2022-10-26 13:54:12 +01:00
Guillaume Chatelet	1a726cfa83	Take memset_inline into account in analyzeLoadFromClobberingMemInst This appeared in https://reviews.llvm.org/D126903#3884061 Differential Revision: https://reviews.llvm.org/D136752	2022-10-26 09:50:13 +00:00
Momchil Velikov	a8853924bd	[FuncSpec][NFC] Refactor finding specialisation opportunities This patch reorders the traversal of function call sites and function formal parameters to: * do various argument feasibility checks (`isArgumentInteresting` ) only once per argument, i.e. doing N-args checks instead of N-calls x N-args checks. * do hash table lookups only once per call site, i.e. N-calls lookups/inserts instead of N-call x N-args lookups/inserts. Reviewed By: ChuanqiXu, labrinea Differential Revision: https://reviews.llvm.org/D135968	2022-10-26 10:18:35 +01:00
Momchil Velikov	606d25e545	[FuncSpec] Compute specialisation gain even when forcing specialisation When rewriting the call sites to call the new specialised functions, a single call site can be matched by two different specialisations - a "less specialised" version of the function and a "more specialised" version of the function, e.g. for a function `void f(int x, int y)` the call like `f(1, 2)` could be matched by either `void f.1(int x /* int y == 2 */);` or `void f.2(/* int x == 1, int y == 2 */);` The `FunctionSpecialisation` pass tries to match specialisation in the order of decreasing gain, so "more specialised" functions are preferred to "less specialised" functions. This breaks, however, when using the flag `-force-function-specialization`, in which case the cost/benefit analysis is not performed and all the specialisations are equally preferable. This patch makes the pass calculate specialisation gain and order the specialisations accordingly even when `-force-function-specialization` is used, under the assumption that this flag has purely debugging purpose and it is reasonable to ignore the extra computing effort it incurs. Reviewed By: ChuanqiXu, labrinea Differential Revision: https://reviews.llvm.org/D136180	2022-10-26 10:08:03 +01:00
Momchil Velikov	a8b0f58017	[FuncSpec] Fix specialisation based on literals The `FunctionSpecialization` pass has support for specialising functions, which are called with literal arguments. This functionality is disabled by default and is enabled with the option `-function-specialization-for-literal-constant` . There are a few issues with the implementation, though: * even with the default, the pass will still specialise based on floating-point literals * even when it's enabled, the pass will specialise only for the `i1` type (or `i2` if all of the possible 4 values occur, or `i3` if all of the possible 8 values occur, etc) The reason for this is an incorrect check of the lattice value of the function formal parameter. The lattice value is `overdefined` when the constant range of the possible arguments is the full set, and this is the reason for the specialisation to trigger. However, if the set of the possible arguments is not the full set, that must not prevent the specialisation. This patch changes the pass to NOT consider a formal parameter when specialising a function if the lattice value for that parameter is: * unknown or undef * a constant * a constant range with a single element on the basis that specialisation is pointless for those cases. It also changes the criteria for picking up an actual argument to specialise if the argument is: * an LLVM IR constant * has `constant` lattice value * has `constantrange` lattice value with a single element. Reviewed By: ChuanqiXu Differential Revision: https://reviews.llvm.org/D135893	2022-10-26 09:55:33 +01:00
Momchil Velikov	1a525dec7f	[FuncSpec] Fix missed opportunities for function specialisation When collecting the possible constant arguments to specialise a function the compiler will abandon the search on the first argument that is for some reason unsuitable as a specialisation constant. Thus, depending on the traversal order of the functions and call sites, the compiler can end up with a different set of possible constants, hence with different set of specialisations. With this patch, the compiler will skip unsuitable constants, but nevertheless will continue searching for more. Reviewed By: ChuanqiXu Differential Revision: https://reviews.llvm.org/D135867	2022-10-25 23:19:48 +01:00
Philip Reames	269bc684e7	[LV][RISCV] Disable vectorization of epilogue loops Epilogue loop vectorization is a feature in the vectorizer intended to avoid running fully scalar code when the vector length of the main loop turns out to be either longer than the trip count of the actual loop, or with a huge remainder. In practice, this feature appears to not have been well tuned. I honestly don't think it should be on by default at all, but it definitely shouldn't be on for RISCV. Note that other targets have also disabled it, but they've done so via disabling interleaving - which is, well, completely unrelated - and we don't want to do that for RISCV. In the near term, many examples I'm seeing have terrible codegen for epilogue vectorization. We are greatly increasing code size for little value at reasonable VLEN values for small types. In the long term, the cases that epilogue vectorization is intended to handle are likely better handled via tail folding on RISCV. As an aside, I also don't really trust the correctness of epilogue vectorization. The code structure is such that otherwise straightforward changes sometimes break only epilogue vectorization. The reuse of an existing vplan without careful validation opens significant room for nasty bugs. Given how rarely the code is exercised, that is not a good combination. As such, this patch introduces a TTI hook, and completely disables epilogue vectorization on RISCV. Differential Revision: https://reviews.llvm.org/D136695	2022-10-25 14:28:02 -07:00
Arthur Eubanks	ef37504879	[Instrumentation] Remove legacy passes Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D136615	2022-10-25 13:11:07 -07:00
Alina Sbirlea	d1b19da854	[LoopPeeling] Add flag to disable support for peeling loops with non-latch exits Add a flag to allow disabling the changes in https://reviews.llvm.org/D134803. Differential Revision: https://reviews.llvm.org/D136643	2022-10-25 12:19:14 -07:00
Momchil Velikov	c47739b45c	[FuncSpec] Consider small noinline functions for specialisation Small functions with size under a given threshold are not considered for specialisation on the presumption that they are easy to inline. This does not apply to `noinline` functions, though. Reviewed By: ChuanqiXu Differential Revision: https://reviews.llvm.org/D135862	2022-10-25 19:49:04 +01:00
Fangrui Song	a527bda520	[LegacyPM] Remove DataFlowSanitizerLegacyPass Using the legacy PM for the optimization pipeline was deprecated in 13.0.0. Following recent changes to remove non-core features of the legacy PM/optimization pipeline, remove DataFlowSanitizerLegacyPass. Differential Revision: https://reviews.llvm.org/D124594	2022-10-25 10:55:29 -07:00
Simon Pilgrim	50fe87a5c8	[Transforms] classifyArgUse - don't dereference pointer before null test Reported here: https://pvs-studio.com/en/blog/posts/cpp/1003/ (N11)	2022-10-25 17:24:00 +01:00
Yaxun (Sam) Liu	9d5adc7e49	Revert "reland e5581df60a35 [SimplifyCFG] accumulate bonus insts cost" This reverts commit bd7949bcd86633bd4203b2ba6f891aea00fce4d1. Revert this patch since reviewers have different opinions regarding the approach in post-commit review. Will open RFC for further discussion. Differential Revision: https://reviews.llvm.org/D132408	2022-10-25 12:15:39 -04:00
zhongyunde	620cff096a	[InstCombine] Fold series of instructions into mull for more types Relax the constraint of wide/vectors types. Address the comment https://reviews.llvm.org/D136015?id=469189#inline-1314520 Reviewed By: spatel, chfast Differential Revision: https://reviews.llvm.org/D136661	2022-10-25 23:04:46 +08:00
Nico Weber	76745d2b58	Revert "[PGO] Make emitted symbols hidden" This reverts commit 04877284b4592e9286cab43467662c1b4ff81861. Looks like this is still breaking the test Profile-x86_64 :: instrprof-darwin-dead-strip.c (see comment on https://reviews.llvm.org/D135340).	2022-10-25 08:54:47 -04:00
LiaoChunyu	e6c8418aab	[ObjCARC][NFC] Fix defined but not used warning from D135041 Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D136665	2022-10-25 15:16:42 +08:00
Kevin Athey	31bfa4a69b	[MSAN] Add handleCountZeroes for ctlz and cttz. This addresses a bug where vector versions of ctlz are creating false positive reports. Depends on D136369 Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D136523	2022-10-24 17:31:34 -07:00
Roy Sundahl	0c35b6165c	[ASAN] Don't inline when -asan-max-inline-poisoning-size=0 When -asan-max-inline-poisoning-size=0, all shadow memory access should be outlined (through asan calls). This was not occurring when partial poisoning was required on the right side of a variable's redzone. This diff contains the changes necessary to implement and utilize __asan_set_shadow_01() through __asan_set_shadow_07(). The change is necessary for the full abstraction of the asan implementation and will enable experimentation with alternate strategies. Differential Revision: https://reviews.llvm.org/D136197	2022-10-24 14:17:59 -07:00
Sanjay Patel	5dcfc32822	[InstCombine] allow more commutative matches for logical-and to select fold This is a sibling transform to the fold just above it. That was changed to allow the corresponding commuted patterns with: 307307456277 e1bd759ea567 8628e6df7000	2022-10-24 16:40:43 -04:00
Yaxun (Sam) Liu	bd7949bcd8	reland e5581df60a35 [SimplifyCFG] accumulate bonus insts cost Fixed compile time increase due to always constructing LocalCostTracker. Now only construct LocalCostTracker when needed.	2022-10-24 15:43:53 -04:00
Craig Topper	1edc51b56a	[InstCombine] Explicitly check for scalable TypeSize. Instead of assuming it is a fixed size. Reviewed By: peterwaller-arm Differential Revision: https://reviews.llvm.org/D136517	2022-10-24 12:29:06 -07:00
Alex Brachet	04877284b4	[PGO] Make emitted symbols hidden This was reverted because it was breaking when targeting Darwin which tried to export these symbols which are now hidden. It should be safe to just stop attempting to export these symbols in the clang driver, though Apple folks will need to change their TAPI allow list described in the commit where these symbols were originally exported `f538018562` Bug: https://github.com/llvm/llvm-project/issues/58265 Differential Revision: https://reviews.llvm.org/D135340	2022-10-24 19:05:10 +00:00
Alexey Bataev	da4e0f7ac5	[SLP][NFC]Fix PR58476: Fix compile time for reductions, NFC. Improve O(N^2) to O(N) in some cases, reduce number of allocations by reserving memory. Also, improve analysis of loads reduction values to avoid analysis of not vectorizable cases.	2022-10-24 10:13:24 -07:00
zhongyunde	81713e893a	[InstCombine] Fold series of instructions into mull The following sequence should be folded into in0 * in1 In0Lo = in0 & 0xffffffff; In0Hi = in0 >> 32; In1Lo = in1 & 0xffffffff; In1Hi = in1 >> 32; m01 = In1Hi * In0Lo; m10 = In1Lo * In0Hi; m00 = In1Lo * In0Lo; addc = m01 + m10; ResLo = m00 + (addc << 32); Reviewed By: spatel, RKSimon Differential Revision: https://reviews.llvm.org/D136015	2022-10-25 01:09:37 +08:00
Kevin P. Neal	cfb88ee3ba	[StrictFP][IPSCCP] Constant fold intrinsics with metadata arguments This teaches the SCCP Solver how to constant fold more intrinsics. Constant folding appears to be just as good as D115737 but much, much lower in code change impact as suggested by nikic. The constrained floating-point intrinsics all take at least one metadata argument and were the motivation for the change. Differential Revision: https://reviews.llvm.org/D136466	2022-10-24 11:43:20 -04:00
Ahmed Bougacha	bddd9b6b91	[InstCombine] Combine ptrauth sign/resign + auth/resign intrinsics. (sign\|resign) + (auth\|resign) can be folded by omitting the middle sign+auth component if the key and discriminator match. Differential Revision: https://reviews.llvm.org/D132383	2022-10-24 08:03:14 -07:00
Matt Arsenault	597b9b7e8e	CodeExtractor: Fix assertion with non-0 alloca address spaces emitCallAndSwitchStatement creates placeholder allocas to pass to these, so the types need to match.	2022-10-23 15:16:55 -07:00
Mike Hommey	86e57e66da	[InstCombine] Bail out of casting calls when a conversion from/to byval is involved. Fixes #58307 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D135738	2022-10-23 09:49:48 +02:00
Kazu Hirata	3f8d2c917c	Ensure newlines at the end of files (NFC)	2022-10-22 09:29:40 -07:00
Sanjay Patel	8628e6df70	[InstCombine] use freeze to enable poison-safe logic->select fold Without a freeze, this transform can leak poison to the output: https://alive2.llvm.org/ce/z/GJuF9i This makes the transform as uniform as possible, and it can help reduce patterns like issue #58313 (although that particular example probably still needs another transform). Differential Revision: https://reviews.llvm.org/D136527	2022-10-22 10:42:14 -04:00
Thomas Symalla	fc26a75280	[NFC] Fixed several misspellings of "Splitter" in Scalarizer Spliiter => Splitter	2022-10-22 15:13:56 +02:00
Sanjay Patel	e1bd759ea5	[InstCombine] allow more matches for logical-ands --> select This allows patterns with real 'and' instructions because those are safe to transform: https://alive2.llvm.org/ce/z/7-U_Ak	2022-10-22 08:15:50 -04:00
Arthur Eubanks	4153f989ba	[ObjCARC] Remove legacy PM versions of optimization passes This doesn't touch objc-arc-contract because that's in the codegen pipeline. However, this does move its corresponding initialize function into initializeCodegen(). Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D135041	2022-10-21 13:40:54 -07:00
Sanjay Patel	3073074562	[InstCombine] allow more commutative matches for logical-and to select fold When the common value is part of either select condition, this is safe to reduce. Otherwise, it is not poison-safe (with the select form of the pattern): https://alive2.llvm.org/ce/z/FxQTzB This is another patch motivated by issue #58313.	2022-10-21 13:29:13 -04:00
Wael Yehia	461a1836d3	[PGO][AIX] Improve dummy var retention and allow -bcdtors:csect linking. 1) Use a static array of pointer to retain the dummy vars. 2) Associate liveness of the array with that of the runtime hook variable __llvm_profile_runtime. 3) Perform the runtime initialization through the runtime hook variable. 4) Preserve the runtime hook variable using the -u linker flag. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D136192	2022-10-21 16:32:42 +00:00
Sanjay Patel	d7fecf26f4	[InstCombine] allow some commutative matches for logical-and to select fold This is obviously correct for real logic instructions, and it also works for the poison-safe variants that use selects: https://alive2.llvm.org/ce/z/wyHiwX This is motivated by the lack of 'xor' folding seen in issue #58313. This more general fold should help reduce some of those patterns, but I'm not sure if this specific case does anything for that particular example.	2022-10-21 11:28:38 -04:00
Sanjay Patel	f6fc3e23b9	[InstCombine] refactor matching code for logical ands; NFCI Separating the matches makes it easier to enhance for commutative patterns.	2022-10-21 11:28:38 -04:00
Sanjay Patel	bf75e937bb	[InstCombine] match logical and/or more generally in fold to select This allows the regular bitwise logic opcodes in addition to the poison-safe select variants: https://alive2.llvm.org/ce/z/8xB9gy Handling commuted variants safely is likely trickier, so that's left to another patch.	2022-10-21 09:03:36 -04:00
Florian Hahn	fd236772f5	[IndVars] Forget SCEV for value after simplifying condition. Additional SCEV verification highlighted a case where the cached loop dispositions were incorrect after simplifying a condition in IndVars and moving the user in LoopDeletion. Fix it by invalidating ICmp and all its users. Fixes #58515.	2022-10-21 11:18:01 +01:00
Nikita Popov	e9754f0211	[IR] Add support for memory attribute This implements IR and bitcode support for the memory attribute, as specified in https://reviews.llvm.org/D135597. The new attribute is not used for anything yet (and as such, the old memory attributes are unaffected). Differential Revision: https://reviews.llvm.org/D135592	2022-10-21 12:11:25 +02:00
Florian Hahn	7eb4ec1c75	[VPlan] Print predicates for widened cmp instructions (NFC).	2022-10-21 08:54:11 +01:00
Michael Francis	922f42d531	[clang][AIX] Fix mcount name and call arguments Currently, compiling a program with the `-pg` flag will result in an undefined symbol error for `.mcount`. This revision fixes the call to use `__mcount`, which requires a pointer argument to a pointer-sized object (unique per inserted call) on AIX. This is only a partial fix. This patch should fix the `-pg` flag's behaviour on AIX to work with code you are compiling, but it will not link against standard libraries with `mcount` instrumentation calls. The next step is to add profiled libraries to the linker search paths in the Clang driver for the AIX toolchain when linking with `-pg`. Differential Review: https://reviews.llvm.org/D135384	2022-10-20 16:20:00 -04:00
William Huang	6c767cef5a	[InstCombine] Canonicalize GEP of GEP by swapping constant-indexed GEP to the back Canonicalize GEP of GEP by swapping GEP with some suffix constant indices to the back (and GEP with all constant indices to the back of that); this allows more constant index GEP merging to happen. Exceptions are: if swapping violates use-def relations, or anti-optimizes LICM. For constant indexed GEP of GEP, if they cannot be merged directly, they will be cast to i8* and merged. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D125845	2022-10-20 17:41:26 +00:00
Paul Walker	ab8257ca0e	[NFC] Fix a few whitespace inconsistencies.	2022-10-20 14:52:25 +00:00
OCHyams	f825214411	[DebugInfo][NFC] Refactor debug intrinsic copy and delete to instead just move Reviewed By: jryans Differential Revision: https://reviews.llvm.org/D133304	2022-10-20 15:12:49 +01:00
Florian Hahn	e25ed058bc	[LV] Use buildScalarSteps to also handle VF = 1. (NFCI) The code in buildScalarSteps already properly handles creating the scalar induction values with VF = 1. Use it directly instead of using extra code to handle that case. Suggested by @Ayal in D133760.	2022-10-20 14:30:01 +01:00
Nikita Popov	7c32c7e777	Reapply [FunctionAttrs] Make location classification more precise Reapplying after the fix for volatile modelling in D135863. ----- Don't add argmem if the pointer is clearly not an argument (e.g. a global). I don't think this makes a difference right now, but gives more obvious results with D135780.	2022-10-20 15:13:20 +02:00
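
The half-word decomposition described in 81713e893a ([InstCombine] Fold series of instructions into mull) can be checked numerically. The following is only a sketch, assuming unsigned 64-bit operands and wrap-around arithmetic modulo 2^64; the names `mul_via_halves` and `check_random` are hypothetical and do not come from LLVM. It shifts the sum of the cross terms left by 32 bits, which is what the identity with `in0 * in1` requires.

```python
import random

MASK64 = (1 << 64) - 1   # emulate 64-bit wrap-around arithmetic
MASK32 = 0xFFFFFFFF


def mul_via_halves(in0: int, in1: int) -> int:
    """Low 64 bits of in0 * in1, built from 32-bit halves (hypothetical helper)."""
    in0_lo, in0_hi = in0 & MASK32, in0 >> 32
    in1_lo, in1_hi = in1 & MASK32, in1 >> 32

    m01 = (in1_hi * in0_lo) & MASK64   # cross term
    m10 = (in1_lo * in0_hi) & MASK64   # cross term
    m00 = (in1_lo * in0_lo) & MASK64   # low x low

    addc = (m01 + m10) & MASK64
    # The high x high term only affects bits >= 64, so it is dropped.
    return (m00 + ((addc << 32) & MASK64)) & MASK64


def check_random(samples: int = 10_000, seed: int = 0) -> bool:
    """Compare the decomposition against a plain wrapped multiply."""
    rng = random.Random(seed)
    for _ in range(samples):
        a, b = rng.getrandbits(64), rng.getrandbits(64)
        if mul_via_halves(a, b) != (a * b) & MASK64:
            return False
    return True
```

Calling `check_random()` compares the two forms on random 64-bit inputs. It is a sanity check of the arithmetic only and says nothing about how the InstCombine pattern match itself is implemented.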

31834 Commits