llvm-project

Author	SHA1	Message	Date
futog	3e0a76b1fd	[Codegen][LegalizeIntegerTypes] Improve shift through stack (#96151 ) Minor improvement on cc39c3b17fb2598e20ca0854f9fe6d69169d85c7. Use an aligned stack slot to store the shifted value. Use the native register width as shifting unit, so the load of the shift result is aligned. If the shift amount is a multiple of the native register width, there is no need to do a follow-up shift after the load. I added new tests for these cases. Co-authored-by: Gergely Futo <gergely.futo@hightec-rt.com>	2024-09-23 11:45:43 +02:00
Roman Lebedev	cc39c3b17f	[Codegen][LegalizeIntegerTypes] New legalization strategy for scalar shifts: shift through stack https://reviews.llvm.org/D140493 is going to teach SROA how to promote allocas that have variably-indexed loads. That does bring up questions of cost model, since that requires creating wide shifts. Indeed, our legalization for them is not optimal. We either split it into parts, or lower it into a libcall. But if the shift amount is by a multiple of CHAR_BIT, we can also legalize it throught stack. The basic idea is very simple: 1. Get a stack slot 2x the width of the shift type 2. store the value we are shifting into one half of the slot 3. pad the other half of the slot. for logical shifts, with zero, for arithmetic shift with signbit 4. index into the slot (starting from the base half into which we spilled, either upwards or downwards) 5. load 6. split loaded integer This works for both little-endian and big-endian machines: https://alive2.llvm.org/ce/z/YNVwd5 And better yet, if the original shift amount was not a multiple of CHAR_BIT, we can just shift by that remainder afterwards: https://alive2.llvm.org/ce/z/pz5G-K I think, if we are going perform shift->shift-by-parts expansion more than once, we should instead go through stack, which is what this patch does. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D140638	2023-01-14 19:12:18 +03:00
Chen Zheng	b5e1fc19da	[PowerPC] don't check CTR clobber in hardware loop insertion pass We added a new post-isel CTRLoop pass in D122125. That pass will expand the hardware loop related intrinsic to CTR loop or normal loop based on the loop context. So we don't need to conservatively check the CTR clobber now on the IR level. Reviewed By: lkail Differential Revision: https://reviews.llvm.org/D135847	2022-12-04 20:53:49 -05:00
Chen Zheng	5f4927da77	[PowerPC][NFC] refactor some test cases.	2022-10-12 12:19:52 +00:00
Kai Nacke	427fb35192	[PPC] Opaque pointer migration, part 1. The LIT test cases were migrated with the script provided by Nikita Popov. Due to the size of the change it is split into several parts. Reviewed By: nemanja, amyk, nikic, PowerPC Differential Revision: https://reviews.llvm.org/D135470	2022-10-11 17:24:06 +00:00
Ehsan Amiri	a538b0f023	Adding -verify-machineinstrs option to PowerPC tests Currently we have a number of tests that fail with -verify-machineinstrs. To detect this cases earlier we add the option to the testcases with the exception of tests that will currently fail with this option. PR 27456 keeps track of this failures. No code review, as discussed with Hal Finkel. llvm-svn: 277624	2016-08-03 18:17:35 +00:00
David Blaikie	a79ac14fa6	[opaque pointer type] Add textual IR support for explicit type parameter to load instruction Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278) import fileinput import sys import re pat = re.compile(r"((?:=\|:\|^)\sload (?:atomic )?(?:volatile )?(.?))(\| addrspace$\d+$ )\($\| (?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).$)") for line in sys.stdin: sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line)) Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7649 llvm-svn: 230794	2015-02-27 21:17:42 +00:00
Hal Finkel	c4c6c87666	[PowerPC] On PPC32, 128-bit shifts might be runtime calls The counter-loops formation pass needs to know what operations might be function calls (because they can't appear in counter-based loops). On PPC32, 128-bit shifts might be runtime calls (even though you can't use __int128 on PPC32, it seems that SROA might form them). Fixes PR19709. llvm-svn: 208501	2014-05-11 16:23:29 +00:00

8 Commits