llvm-project

Author	SHA1	Message	Date
Craig Topper	f668a08e00	[DAGCombiner][RISCV] Optimize (zext nneg (truncate X)) if X has known sign bits. (#82227) This treats the zext nneg as sext if X is known to have sufficient sign bits to allow the zext or truncate or both to be removed. This code is taken from the same optimization for sext.	2024-02-19 10:45:11 -08:00
Nikita Popov	ff9af4c43a	[CodeGen] Convert tests to opaque pointers (NFC)	2024-02-05 14:07:09 +01:00
Matthias Braun	e3cf80c5c1	BlockFrequencyInfoImpl: Avoid big numbers, increase precision for small spreads BlockFrequencyInfo calculates block frequencies as Scaled64 numbers but as a last step converts them to unsigned 64-bit integers (`BlockFrequency`). This improves the factors picked for this conversion so that: * Avoid big numbers close to UINT64_MAX to avoid users overflowing/saturating when adding multiple frequencies together or when multiplying with integers. This leaves the topmost 10 bits unused to allow for some room. * Spread the difference between hottest/coldest block as much as possible to increase precision. * If the hot/cold spread cannot be represented, lose precision at the lower end, but keep the frequencies at the upper end for hot blocks differentiable.	2023-10-24 20:27:39 -07:00
Jay Foad	7b3bbd83c0	Revert "[CodeGen] Really renumber slot indexes before register allocation (#67038)" This reverts commit 2501ae58e3bb9a70d279a56d7b3a0ed70a8a852c. Reverted due to various buildbot failures.	2023-10-09 12:31:32 +01:00
Jay Foad	2501ae58e3	[CodeGen] Really renumber slot indexes before register allocation (#67038) PR #66334 tried to renumber slot indexes before register allocation, but the numbering was still affected by list entries for instructions which had been erased. Fix this to make the register allocator's live range length heuristics even less dependent on the history of how instructions have been added to and removed from SlotIndexes's maps.	2023-10-09 11:44:41 +01:00
Simon Pilgrim	4cd1c07491	[DAG] SimplifyDemandedBits - if we're only demanding the msb, a UMIN/UMAX node can be simplified to a AND/OR node respectively. Alive2: https://alive2.llvm.org/ce/z/qnvmc6	2023-08-18 12:12:22 +01:00
Eli Friedman	bc7f11ccb0	[SelectionDAG] Improve expansion of wide min/max The current implementation tries to handle the high and low halves separately, but that's less efficient in most cases; use a wide SETCC instead. Differential Revision: https://reviews.llvm.org/D151358	2023-06-26 10:45:41 -07:00
Amaury Séchet	61f9cb002d	[NFC] Regenerate several VE codegen tests.	2023-06-14 16:20:37 +00:00
Craig Topper	139392c0a5	[LegalizeTypes][ARM][AArch64][RISCV][VE][WebAssembly] Add special case for smin(X, -1) and smax(X, 0) to ExpandIntRes_MINMAX. We can compute a simpler expression for Lo for these cases. This is an alternative for the test cases in D151180 that works for more targets. This is similar to some of the special cases we have for expanding setcc operands. Differential Revision: https://reviews.llvm.org/D151182	2023-05-23 09:19:55 -07:00
Craig Topper	a983ef2c17	[DAGCombiner][AArch64][VE] Teach BuildUDIV/SDIV to use 2x mul when mulh/mul_lohi are not available. Correct the legality of i32 mul_lohi on AArch64. Previously, AArch64 incorrectly reported i32 mul_lohi as Legal. This allowed BuildUDIV/SDIV to use them. A later DAGCombiner would replace them with MULHS/MULHU because only the high half was used. This conversion does not check the legality of MULHS/MULHU under the assumption that LegalizeDAG can turn it back into MUL_LOHI later. After they are converted to MULHS/MULHU, DAGCombine ran and saw that these operations aren't supported but an i64 MUL is. So they get converted to that plus a shift. Without this, LegalizeDAG would convert back to MUL_LOHI and isel would fail to find a pattern. This patch teaches BuildUDIV/SDIV to create the wide mul and shift so that we can report the correct operation legality on AArch64. It also enables div by constant folding for more cases on VE. I don't know if VE wants this div by constant optimization or not. If they don't want it, they can use the isIntDivCheap hook to disable it. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D150333	2023-05-12 09:06:17 -07:00
Andrew Savonichev	c65b4d64d4	[SelectionDAG] Do not second-guess alignment for alloca Alignment of an alloca in IR can be lower than the preferred alignment on purpose, but this override essentially treats the preferred alignment as the minimum alignment. The patch changes this behavior to always use the specified alignment. If alignment is not set explicitly in LLVM IR, it is set to DL.getPrefTypeAlign(Ty) in computeAllocaDefaultAlign. Tests are changed as well: explicit alignment is increased to match the preferred alignment if it changes output, or omitted when it is hard to determine the right value (e.g. for pointers, some structs, or weird types). Differential Revision: https://reviews.llvm.org/D135462	2023-02-09 18:45:20 +03:00
Matt Arsenault	778cf5431c	IR: Add atomicrmw uinc_wrap and udec_wrap These are essentially add/sub 1 with a clamping value. AMDGPU has instructions for these. CUDA/HIP expose these as atomicInc/atomicDec. Currently we use target intrinsics for these, but those do not carry the ordering and syncscope. Add these to atomicrmw so we can carry these and benefit from the regular legalization processes.	2023-01-24 17:55:11 -04:00
Nikita Popov	4861a58769	[VE] Convert test to opaque pointers (NFC) There is a minor codegen regression here (an extra and instruction). The reason is that CGP only eliminates fallthrough branches if it has made some other kind of change, and with opaque pointers that other change does not occur. Ideally, we should probably always try to eliminate fallthroughs, but this runs into the problem that performing a dummy fallthrough is a common pattern in tests for forcing SDAG to select them separately, so it's not quite that simple.	2022-12-23 12:51:06 +01:00
Nikita Popov	ce5ef7d1d5	[VE] Name instructions in test (NFC)	2022-12-23 11:43:01 +01:00
Nikita Popov	b006b60dc9	[VE] Convert some tests to opaque pointers (NFC)	2022-12-19 13:06:34 +01:00
Ron Lieberman	38f1abef86	Revert "[SelectionDAG] Do not second-guess alignment for alloca" Breaks amdgpu buildbot https://lab.llvm.org/buildbot/#/builders/193 23491 This reverts commit ffedf47d8b793e07317f82f9c2a5f5425ebb71ad.	2022-12-15 10:55:18 -06:00
Andrew Savonichev	ffedf47d8b	[SelectionDAG] Do not second-guess alignment for alloca Alignment of an alloca in IR can be lower than the preferred alignment on purpose, but this override essentially treats the preferred alignment as the minimum alignment. The patch changes this behavior to always use the specified alignment. If alignment is not set explicitly in LLVM IR, it is set to DL.getPrefTypeAlign(Ty) in computeAllocaDefaultAlign. Tests are changed as well: explicit alignment is increased to match the preferred alignment if it changes output, or omitted when it is hard to determine the right value (e.g. for pointers, some structs, or weird types). Differential Revision: https://reviews.llvm.org/D135462	2022-12-15 18:18:12 +03:00
Kazushi (Jam) Marukawa	33dda45dde	[VE] Change the way to lower selectcc Change to use VEISD::CMPI/CMPU/CMPF/CMPQ and VEISD::CMOV in combineSelectCC for better optimization. Support VEISD::CMPI/CMPU in combineTRUNCATE also to optimize truncate. Remove obsolete lower patterns from VEInstrInfo.td. Update regression tests also. Reviewed By: efocht Differential Revision: https://reviews.llvm.org/D136049	2022-10-20 08:08:59 +09:00
Kazushi (Jam) Marukawa	0278c9ceb6	[VE] Change the way to lower select Change to use VEISD::CMOV in combineSelect for better optimization. Support VEISD::CMOV in combineTRUNCATE also to optimize truncate. Merge functions to handle condition codes to VE.h. And add basic CMOV patterns to VEInstrInfo.td. Update regression tests also. Reviewed By: efocht Differential Revision: https://reviews.llvm.org/D135878	2022-10-15 08:49:36 +09:00
Matt Arsenault	a61c3455c0	AtomicExpand: Use llvm.ptrmask instead of ptrtoint This removes the ptrtoint from the load's pointer operand, although we can't entirely eliminate these to get the LSB shift. In a future patch, this will avoid ptrtoint in the case where the atomic is overaligned to the word size.	2022-09-28 12:51:30 -04:00
Kazushi (Jam) Marukawa	de8013201f	[VE] Change to expand FPOW VE doesn't have FPOW instruction, so this patch makes llvm expand it. Reviewed By: efocht Differential Revision: https://reviews.llvm.org/D134695	2022-09-27 20:03:10 +09:00
Kazushi (Jam) Marukawa	1cef30b9d3	[VE] Disable automatic maxnum/minnum selection Disable FMAX/FMIN selection from select_cc in VEInstrInfo.td because of the lack of NaN consideration. This patch removes such selection from VEInstrInfo.td and lets llvm work on it in combineMinNumMaxNum. Reviewed By: efocht Differential Revision: https://reviews.llvm.org/D134595	2022-09-26 22:04:02 +09:00
Kazushi (Jam) Marukawa	76c76e9ab4	[VE] Support smax/smin Support smax/smin in VEInstrInfo.td. Remove obsolete patterns for smax/smin. Add regression tests for smax/smin/umax/umin. Reviewed By: efocht Differential Revision: https://reviews.llvm.org/D134583	2022-09-26 22:02:57 +09:00
Kazushi (Jam) Marukawa	337e54ec95	[VE] Add maxnum and minnum Add maxnum and minnum for float and double. Lowering is already implemented, so this patch changes them legal and adds regression tests. Reviewed By: efocht Differential Revision: https://reviews.llvm.org/D134108	2022-09-21 18:03:49 +09:00
Kazushi (Jam) Marukawa	3ee64ea5cf	[VE] Change to expand FMA VE has fused multiply-add instruction for only vector calculations. This patch forces to expand scalar FMA to multiply and add instructions. This patch also adds regression test. Reviewed By: efocht Differential Revision: https://reviews.llvm.org/D134107	2022-09-21 18:02:55 +09:00
Matt Arsenault	230dbe0857	VE: Use generated checks for a copy-pasted output test	2022-09-20 16:51:04 -04:00
Craig Topper	38ffa2bb96	[LegalizeTypes] Improve splitting for urem/udiv by constant for some constants. For remainder: If (1 << (BitWidth / 2)) % Divisor == 1, we can add the high and low halves together and use a (BitWidth / 2) urem. If (BitWidth / 2) is a legal integer type, this urem will be expanded by DAGCombiner using multiply by magic constant. We do have to take into account that adding high and low together can produce a carry, making it a (BitWidth / 2)+1 bit number. So we need to also add back in the carry from the first addition. For division: We can use the above trick to compute the remainder, subtract that remainder from the dividend, then multiply by the multiplicative inverse of the Divisor modulo (1 << BitWidth). This is based on the section "Remainder by Summing Digits" in Hacker's Delight. The remainder trick is similar to a trick you may have learned for determining if a decimal number is divisible by 3. You can add all the digits together and see if the sum is divisible by 3. If you're not sure if the sum is divisible by 3, you can add its digits together. This can be repeated until you have a single decimal digit. If that digit is 3, 6, or 9, then the original number is divisible by 3. This works because 10 % 3 == 1. gcc already does this same trick. There are additional tricks gcc does for urem as well as srem, udiv, and sdiv that I plan to add in future patches. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D130862	2022-09-12 10:34:52 -07:00
Alex Richardson	2616e00949	[update_llc_test_checks][VE] Handle .Lfoo$local in function regex While working on https://reviews.llvm.org/D131429, I got a test diff in one of the VE tests and running update_llc_test_checks.py deleted all the code for that function. This updates the regex to handle this new output. Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D131431	2022-08-24 14:16:20 +00:00
Kazushi (Jam) Marukawa	b88aba9d7d	[VE] Support inlineasm memory operand Support inline asm memory operand for VE. Add regression tests also. Reviewed By: efocht Differential Revision: https://reviews.llvm.org/D132380	2022-08-23 13:44:03 +09:00
Amaury Séchet	06da353748	[NFC] Automatically generate CodeGen/VE/Scalar/atomic.ll	2022-07-27 23:52:00 +00:00
Kazushi (Jam) Marukawa	da5a6b2bf5	[VE] Restructure eliminateFrameIndex Restructure the current implementation of eliminateFrameIndex function in order to support more instructions. Reviewed By: efocht Differential Revision: https://reviews.llvm.org/D129034	2022-07-05 20:00:19 +09:00
Kazushi (Jam) Marukawa	9ad38e5288	Revert "[VE] Restructure eliminateFrameIndex" This reverts commit 98e52e8bff525b1fb2b269f74b27f0a984588c9c.	2022-07-05 19:35:12 +09:00
Kazushi (Jam) Marukawa	98e52e8bff	[VE] Restructure eliminateFrameIndex Restructure the current implementation of eliminateFrameIndex function in order to support more instructions. Reviewed By: efocht Differential Revision: https://reviews.llvm.org/D129034	2022-07-05 19:28:11 +09:00
Sanjay Patel	c2592c374e	[SDAG] simplify bitwise logic with repeated operand We do not have general reassociation here (and probably do not need it), but I noticed these were missing in patches/tests motivated by D111530, so we can at least handle the simplest patterns. The VE test diff looks correct, but we miss that pattern in IR currently: https://alive2.llvm.org/ce/z/u66_PM	2022-03-13 11:12:30 -04:00
Simon Moll	bb5e35833f	[VE][NFC] correct bitmasking in popcnt expansion test	2021-10-25 13:55:58 +02:00
Simon Moll	4e9dbee1a3	[VE][Test] Make Scalar/va_arg test generic Make match patterns more permissive to be invariant to register allocation choices. Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D111312	2021-10-08 08:07:51 +02:00
Craig Topper	9132299836	[LegalizeTypes][VE] Don't Expand BITREVERSE/BSWAP during type legalization promotion if they will be promoted for NVT in op legalization. We were trying to expand these if they were going to be expanded in op legalization so that we generated the minimum number of operations. We failed to take into account that NVT could be promoted to another legal type in op legalization. Hoping this fixes the issue on the VE target reported as a follow up to D96681. The check line changes were taken from before 1e46b6f4012399a2fef5fbbb4ed06fc919835414 so this patch does appear to improve some cases that had previously regressed.	2021-06-29 11:00:11 -07:00
Fangrui Song	1e46b6f401	[test] Fix CodeGen/VE/Scalar tests	2021-03-02 15:30:44 -08:00
Kazushi (Jam) Marukawa	f784be0777	[VE] Support SJLJ exception related instructions Support EH_SJLJ_LONGJMP, EH_SJLJ_SETJMP, and EH_SJLJ_SETUP_DISPATCH for SjLj exception handling. NC++ uses SjLj exception handling, so implement it first. Add regression tests also. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D94071	2021-01-05 20:19:15 +09:00
Kazushi (Jam) Marukawa	2654f33c47	[VE] Support llvm.eh.sjlj.lsda In order to support SJLJ exception, implement llvm.eh.sjlj.lsda first. Add regression test also. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D93811	2021-01-05 18:06:14 +09:00
Kazushi (Jam) Marukawa	c287f90ccd	[VE] Change default CPU name to "generic" Change default CPU name of SX-Aurora VE from "ve" to "generic" similar to other architectures. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D93836	2021-01-04 20:09:57 +09:00
Kazushi (Jam) Marukawa	a3a896d1cd	[VE] Optimize LEA combinations Change to optimize references of elements of aggregate data. Also add regression tests. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D93627	2020-12-21 22:21:10 +09:00
Kazushi (Jam) Marukawa	5e273b845b	[VE] Support STACKSAVE and STACKRESTORE Change to use default expanded code. Add regression tests also. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D93539	2020-12-21 20:15:50 +09:00
Kazushi (Jam) Marukawa	d99e4a4840	[VE] Support RETURNADDR Implement RETURNADDR for VE. Add a regression test also. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D93545	2020-12-21 20:06:03 +09:00
Kazushi (Jam) Marukawa	697226550e	[VE] Support FRAMEADDR Implement FRAMEADDR for VE. Add a regression test also. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D93295	2020-12-15 23:31:19 +09:00
Kazushi (Jam) Marukawa	a2eb07aa55	[VE] Support atomic exchange instructions Support atomic exchange and atomic compare and exchange instructions. Change CAS and TS1AM instructions for ISel patterns. Add selectADDRzi pattern for them. Add TS1AM pseudo instruction also for better ISel. Add shouldExpandAtomicRMWInIR() function to expand all atomicrmw instructions except atomicrmw xchg. Add custom lower for i8/i16 atomicrmw xchg. Modify replaceFI to support CAS/TS1AM instructions which use "reg+disp" operands instead of "reg+imm+disp" operands. And, add several regression tests to check the correctness. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D93161	2020-12-15 17:43:11 +09:00
Kazushi (Jam) Marukawa	6834b3d6d5	[VE] Optimize prologue/epilogue instructions about GOT Optimize prologue/epilogue instructions if a given function uses GOT but does not call other functions by eliminating FP. Previously, we had wrong implementations taken from other architectures. Update regression tests also. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92313	2020-12-01 02:22:31 +09:00
Kazushi (Jam) Marukawa	6fe610535f	[VE] Clean check routines of branch types Previously, these check routines accepted non-generatable instructions. This time, I clean them and add assert for those non-generatable instructions. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92254	2020-12-01 02:19:37 +09:00
Kazushi (Jam) Marukawa	686988a50f	[VE] Optimize prologue/epilogue instructions Optimize eliminate FP mechanism. This time optimize a function which has no call but fixed stack objects. LLVM eliminates FP on such functions now. Also, optimize GOT/PLT registers save/restore instructions if a given function doesn't use them. In addition, remove generating mechanism of `.cfi` instructions since those are taken from other architectures and not inspected yet. Update regression tests, also. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92251	2020-11-30 22:22:33 +09:00
Kazushi (Jam) Marukawa	44a679eaa4	[VE] Change the behaviour of truncate Change the way to truncate i64 to i32 in I64 registers. VE assumed sext values previously. Change it to zext values this time to make it match to the LLVM behaviour. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92226	2020-11-30 22:12:45 +09:00
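
The remainder and division tricks described in the message of commit 38ffa2bb96 above can be sketched in Python. This is only an illustration of the arithmetic, not the LLVM code: the function names `urem64_by_halves` and `udiv64_by_halves` are hypothetical, and a 64-bit value split into two 32-bit halves is assumed in place of the general BitWidth.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def urem64_by_halves(x, d):
    """Compute x % d for a 64-bit x by summing its 32-bit halves.

    Assumes (1 << 32) % d == 1, the precondition named in the commit message.
    """
    assert 0 <= x <= MASK64
    assert (1 << 32) % d == 1
    lo = x & MASK32
    hi = x >> 32
    total = lo + hi              # may need 33 bits
    s = total & MASK32           # low 32 bits of the sum
    carry = total >> 32          # 0 or 1
    folded = s + carry           # add the carry back in; fits in 32 bits
    return folded % d            # a 32-bit urem


def udiv64_by_halves(x, d):
    """Compute x // d from the remainder and a modular inverse of d."""
    r = urem64_by_halves(x, d)
    # d is odd here (it divides 2**32 - 1), so it is invertible mod 2**64.
    inv = pow(d, -1, 1 << 64)    # requires Python 3.8 or newer
    # x - r is an exact multiple of d, so multiplying by the inverse
    # modulo 2**64 recovers the quotient.
    return ((x - r) * inv) & MASK64
```

Adding the carry back cannot overflow 32 bits: when the carry is 1, the low 32 bits of the sum are at most 2**32 - 2. Divisors that satisfy the precondition include 3, 5, 17, 257 and 65537.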

66 Commits