llvm-project

Author	SHA1	Message	Date
Bjorn Pettersson	a89d751fb4	Add intrinsics for saturating float to int casts This patch adds support for the fptoui.sat and fptosi.sat intrinsics, which provide basically the same functionality as the existing fptoui and fptosi instructions, but will saturate (or return 0 for NaN) on values unrepresentable in the target type, instead of returning poison. Related mailing list discussion can be found at: https://groups.google.com/d/msg/llvm-dev/cgDFaBmCnDQ/CZAIMj4IBAAJ The intrinsics have overloaded source and result type and support vector operands: i32 @llvm.fptoui.sat.i32.f32(float %f) i100 @llvm.fptoui.sat.i100.f64(double %f) <4 x i32> @llvm.fptoui.sat.v4i32.v4f16(half %f) // etc On the SelectionDAG layer two new ISD opcodes are added, FP_TO_UINT_SAT and FP_TO_SINT_SAT. These opcodes have two operands and one result. The second operand is an integer constant specifying the scalar saturation width. The idea here is that initially the second operand and the scalar width of the result type are the same, but they may change during type legalization. For example: i19 @llvm.fptsi.sat.i19.f32(float %f) // builds i19 fp_to_sint_sat f, 19 // type legalizes (through integer result promotion) i32 fp_to_sint_sat f, 19 I went for this approach, because saturated conversion does not compose well. There is no good way of "adjusting" a saturating conversion to i32 into one to i19 short of saturating twice. Specifying the saturation width separately allows directly saturating to the correct width. There are two baseline expansions for the fp_to_xint_sat opcodes. If the integer bounds can be exactly represented in the float type and fminnum/fmaxnum are legal, we can expand to something like: f = fmaxnum f, FP(MIN) f = fminnum f, FP(MAX) i = fptoxi f i = select f uo f, 0, i # unnecessary if unsigned as 0 = MIN If the bounds cannot be exactly represented, we expand to something like this instead: i = fptoxi f i = select f ult FP(MIN), MIN, i i = select f ogt FP(MAX), MAX, i i = select f uo f, 0, i # unnecessary if unsigned as 0 = MIN It should be noted that this expansion assumes a non-trapping fptoxi. Initial tests are for AArch64, x86_64 and ARM. This exercises all of the scalar and vector legalization. ARM is included to test float softening. Original patch by @nikic and @ebevhan (based on D54696). Differential Revision: https://reviews.llvm.org/D54749	2020-12-18 11:09:41 +01:00
Yvan Roux	923ca0b411	[ARM][MachineOutliner] Fix costs model. Fix candidates calls costs models allocation and prepare stack fixups handling. Differential Revision: https://reviews.llvm.org/D92933	2020-12-17 16:08:23 +01:00
David Green	c100d7ba36	[NFC] Chec[^k] -> Check Some test updates all appearing to use the wrong spelling of CHECK.	2020-12-08 11:54:39 +00:00
Martin Storsjö	78a57069b5	[CodeGen] Restore accessing __stack_chk_guard via a .refptr stub on mingw after 2518433f861fcb87 Add tests for this particular detail for x86 and arm (similar tests already existed for x86_64 and aarch64). The libssp implementation may be located in a separate DLL, and in those cases, the references need to be in a .refptr stub, to avoid needing to touch up code in the text section at runtime (which is supported but inefficient for x86, and unsupported for arm). Differential Revision: https://reviews.llvm.org/D92738	2020-12-07 09:35:12 +02:00
Layton Kifer	ac522f8700	[DAGCombiner] Fold (sext (not i1 x)) -> (add (zext i1 x), -1) Move fold of (sext (not i1 x)) -> (add (zext i1 x), -1) from X86 to DAGCombiner to improve codegen on other targets. Differential Revision: https://reviews.llvm.org/D91589	2020-12-06 11:52:10 -05:00
Fangrui Song	109e70d357	[TargetMachine] Drop implied dso_local for an edge case (extern_weak + non-pic + hidden) This does not deserve special handling. The code should be added to Clang instead if deemed useful. With this simplification, we can additionally delete the PIC extern_weak special case.	2020-12-05 15:52:33 -08:00
Fangrui Song	a084c0388e	[TargetMachine] Don't imply dso_local on function declarations in Reloc::Static model for ELF/wasm clang/lib/CodeGen/CodeGenModule sets dso_local on applicable function declarations, we don't need to duplicate the work in TargetMachine:shouldAssumeDSOLocal. (Actually the long-term goal (started by r324535) is to drop TargetMachine::shouldAssumeDSOLocal.) By not implying dso_local, we will respect dso_local/dso_preemptable specifiers set by the frontend. This allows the proposed -fno-direct-access-external-data option to work with -fno-pic and prevent a canonical PLT entry (SHN_UNDEF with non-zero st_value) when taking the address of a function symbol. This patch should be NFC in terms of the Clang emitted assembly because the case we don't set dso_local is a case Clang sets dso_local. However, some tests don't set dso_local on some function declarations and expose some differences. Most tests have been fixed to be more robust in the previous commit.	2020-12-05 14:54:37 -08:00
Fangrui Song	6dbd0eac02	[test] Split some tests which test both static and pic relocation models TargetMachine::shouldAssumeDSOLocal currently implies dso_local for Static. Split some tests so that these `external dso_local global` will align with the Clang behavior.	2020-12-04 18:11:35 -08:00
Evgeny Leviant	f69936f529	Attempt to fix buildbot after rG993eaf2d69d8	2020-12-04 22:10:36 +03:00
Evgeny Leviant	993eaf2d69	Recommit [TableGen][SchedModels] Fix read/write variant substitution Original commit rG112b3cb6ba49 introduced non-determinism in subtarget generator due to iteration over DenseMap. New patch fixes this changing ProcModelMapTy from DenseMap to std::map.	2020-12-04 21:50:34 +03:00
Fangrui Song	86fa896363	Revert D90844 "[TableGen][SchedModels] Fix read/write variant substitution" This reverts commit 112b3cb6ba49aacd821440d0913f15b32131480e. D90844 made lib/Target/AArch64/AArch64GenSubtargetInfo.inc non-deterministic.	2020-12-03 14:24:29 -08:00
David Blaikie	615f63e149	Revert "[FastISel] Flush local value map on ever instruction" and dependent patches This reverts commit cf1c774d6ace59c5adc9ab71b31e762c1be695b1. This change caused several regressions in the gdb test suite - at least a sample of which was due to line zero instructions making breakpoints un-lined. I think they're worth investigating/understanding more (& possibly addressing) before moving forward with this change. Revert "[FastISel] NFC: Clean up unnecessary bookkeeping" This reverts commit 3fd39d3694d32efa44242c099e923a7f4d982095. Revert "[FastISel] NFC: Remove obsolete -fast-isel-sink-local-values option" This reverts commit a474657e30edccd9e175d92bddeefcfa544751b2. Revert "Remove static function unused after cf1c774." This reverts commit dc35368ccf17a7dca0874ace7490cc3836fb063f. Revert "[lldb] Fix TestThreadStepOut.py after "Flush local value map on every instruction"" This reverts commit 53a14a47ee89dadb8798ca8ed19848f33f4551d5.	2020-12-01 14:26:23 -08:00
Hans Wennborg	2ca4785ac7	Remove rm -f cortex-a57-misched-mla.s; hopefully the bots have all cycled past it now	2020-12-01 13:50:49 +01:00
Paul Robinson	a474657e30	[FastISel] NFC: Remove obsolete -fast-isel-sink-local-values option This option is not used for anything after #dc35368 (D91734).	2020-11-30 10:55:49 -08:00
Dmitri Gribenko	234a5297aa	Add 'asserts' requiremnt to test/CodeGen/ARM/cortex-a57-misched-mla.mir '-debug-only=machine-scheduler' only works when asserts are enabled.	2020-11-30 15:19:27 +01:00
Hans Wennborg	25d54abca5	Try harder to get rid off cortex-a57-misched-mla.s	2020-11-30 14:30:50 +01:00
Hans Wennborg	273641fedc	Try to fix bots after 112b3cb by removing cortex-a57-misched-mla.s	2020-11-30 14:15:56 +01:00
Evgeny Leviant	129523588f	Fix test case	2020-11-30 12:35:28 +03:00
Evgeny Leviant	112b3cb6ba	[TableGen][SchedModels] Fix read/write variant substitution Patch fixes multiple issues related to expansion of variant sched reads and writes. Differential revision: https://reviews.llvm.org/D90844	2020-11-30 11:55:55 +03:00
Paul Robinson	cf1c774d6a	[FastISel] Flush local value map on ever instruction Local values are constants or addresses that can't be folded into the instruction that uses them. FastISel materializes these in a "local value" area that always dominates the current insertion point, to try to avoid materializing these values more than once (per block). https://reviews.llvm.org/D43093 added code to sink these local value instructions to their first use, which has two beneficial effects. One, it is likely to avoid some unnecessary spills and reloads; two, it allows us to attach the debug location of the user to the local value instruction. The latter effect can improve the debugging experience for debuggers with a "set next statement" feature, such as the Visual Studio debugger and PS4 debugger, because instructions to set up constants for a given statement will be associated with the appropriate source line. There are also some constants (primarily addresses) that could be produced by no-op casts or GEP instructions; the main difference from "local value" instructions is that these are values from separate IR instructions, and therefore could have multiple users across multiple basic blocks. D43093 avoided sinking these, even though they were emitted to the same "local value" area as the other instructions. The patch comment for D43093 states: Local values may also be used by no-op casts, which adds the register to the RegFixups table. Without reversing the RegFixups map direction, we don't have enough information to sink these instructions. This patch undoes most of D43093, and instead flushes the local value map after() every IR instruction, using that instruction's debug location. This avoids sometimes incorrect locations used previously, and emits instructions in a more natural order. This does mean materialized values are not re-used across IR instruction boundaries; however, only about 5% of those values were reused in an experimental self-build of clang. () Actually, just prior to the next instruction. It seems like it would be cleaner the other way, but I was having trouble getting that to work. Differential Revision: https://reviews.llvm.org/D91734	2020-11-25 13:05:00 -05:00
Craig Topper	4252f7773a	[SelectionDAG][ARM][AArch64][Hexagon][RISCV][X86] Add SDNPCommutative to fma and fmad nodes in tablegen. Remove explicit commuted patterns from targets. X86 was already specially marking fma as commutable which allowed tablegen to autogenerate commuted patterns. This moves it to the target independent definition and fix up the targets to remove now unneeded patterns. Unfortunately, the tests change because the commuted version of the patterns are generating operands in a different than the explicit patterns. Differential Revision: https://reviews.llvm.org/D91842	2020-11-23 10:09:20 -08:00
Matt Arsenault	1d1234b2a4	OpaquePtr: Update more tests to use typed sret	2020-11-20 20:08:43 -05:00
Matt Arsenault	20c43d6bd5	OpaquePtr: Bulk update tests to use typed sret	2020-11-20 17:58:26 -05:00
Matt Arsenault	06c192d454	OpaquePtr: Bulk update tests to use typed byval Upgrade of the IR text tests should be the only thing blocking making typed byval mandatory. Partially done through regex and partially manual.	2020-11-20 14:00:46 -05:00
David Green	32556a9832	[ARM] Remove more unused check prefixes, NFC	2020-11-14 15:37:53 +00:00
Pirama Arumuga Nainar	8262e94a6d	[ARM] Fix PR 47980: Use constrainRegClass during foldImmediate opt. Previously we used setRegClass to rgpr, which may expand the register domain if the result was already in a constrained class (tcgpr in the above PR). Differential Revision: https://reviews.llvm.org/D91192	2020-11-10 13:38:11 -08:00
David Green	08d1c2d470	[ARM] Introduce t2DoLoopStartTP This introduces a new pseudo instruction, almost identical to a t2DoLoopStart but taking 2 parameters - the original loop iteration count needed for a low overhead loop, plus the VCTP element count needed for a DLSTP instruction setting up a tail predicated loop. The idea is that the instruction holds both values and the backend ARMLowOverheadLoops pass can pick between the two, depending on whether it creates a tail predicated loop or falls back to a low overhead loop. To do that there needs to be something that converts a t2DoLoopStart to a t2DoLoopStartTP, for which this patch repurposes the MVEVPTOptimisationsPass as a "tail predication and vpt optimisation" pass. The extra operand for the t2DoLoopStartTP is chosen based on the operands of VCTP's in the loop, and the instruction is moved as late in the block as possible to attempt to increase the likelihood of making tail predicated loops. Differential Revision: https://reviews.llvm.org/D90591	2020-11-10 18:08:12 +00:00
David Green	b2ac9681a7	[ARM] Alter t2DoLoopStart to define lr This changes the definition of t2DoLoopStart from t2DoLoopStart rGPR to GPRlr = t2DoLoopStart rGPR This will hopefully mean that low overhead loops are more tied together, and we can more reliably generate loops without reverting or being at the whims of the register allocator. This is a fairly simple change in itself, but leads to a number of other required alterations. - The hardware loop pass, if UsePhi is set, now generates loops of the form: %start = llvm.start.loop.iterations(%N) loop: %p = phi [%start], [%dec] %dec = llvm.loop.decrement.reg(%p, 1) %c = icmp ne %dec, 0 br %c, loop, exit - For this a new llvm.start.loop.iterations intrinsic was added, identical to llvm.set.loop.iterations but produces a value as seen above, gluing the loop together more through def-use chains. - This new instrinsic conceptually produces the same output as input, which is taught to SCEV so that the checks in MVETailPredication are not affected. - Some minor changes are needed to the ARMLowOverheadLoop pass, but it has been left mostly as before. We should now more reliably be able to tell that the t2DoLoopStart is correct without having to prove it, but t2WhileLoopStart and tail-predicated loops will remain the same. - And all the tests have been updated. There are a lot of them! This patch on it's own might cause more trouble that it helps, with more tail-predicated loops being reverted, but some additional patches can hopefully improve upon that to get to something that is better overall. Differential Revision: https://reviews.llvm.org/D89881	2020-11-10 15:57:58 +00:00
Momchil Velikov	937ab6a785	[ARM][MachineOutliner] Emit more CFI instructions This patch make the outliner emit CFI instructions in a few more places: * after LR is restored, but before the return in an outlined function * around save/restore of LR to/from a register at calls to outlined functions * around save/restore of LR to/from the stack at calls to outlined functions The latter two only when the function does NOT spill LR. If the function spills LR, then outliner generated saves/restores around calls are not considered interesting for unwinding the frame. Differential Revision: https://reviews.llvm.org/D89483	2020-11-09 15:26:18 +00:00
Francesco Petrogalli	fc2fe6817e	[llvm][AArch64] Simplify (and (sign_extend..) #bitmask). Fold VT = (and (sign_extend NarrowVT to VT) #bitmask) into VT = (zero_extend NarrowVT) With this combine, the test replaces a sign extended load + an unsigned extention with a zero extended load to render one of the operands of the last multiplication. BEFORE \| AFTER f_i16_i32: \| f_i16_i32: .fnstart \| .fnstart ldrsh r0, [r0] \| ldrh r1, [r1] ldrsh r1, [r1] \| ldrsh r0, [r0] smulbb r0, r1, r0 \| smulbb r0, r0, r1 uxth r1, r1 \| mul r0, r0, r1 mul r0, r0, r1 \| bx lr bx lr \| Reviewed By: resistor Differential Revision: https://reviews.llvm.org/D90605	2020-11-09 12:53:36 +00:00
Momchil Velikov	5b30d9adc0	[MachineOutliner] Do not outline debug instructions The debug location is removed from any outlined instruction. This causes the MachineVerifier to crash on outlined DBG_VALUE instructions. Then, debug instructions are "invisible" to the outliner, that is, two ranges of instructions from different functions are considered identical if the only difference is debug instructions. Since a debug instruction from one function is unlikely to provide sensible debug information about all functions, sharing an outlined sequence, this patch just removes debug instructions from the outlined functions. Differential Revision: https://reviews.llvm.org/D89485	2020-11-05 19:26:51 +00:00
Cameron McInally	c126eb7529	[SelectionDAG] Add legalizations for VECREDUCE_SEQ_FMUL Hook up legalizations for VECREDUCE_SEQ_FMUL. This is following up on the VECREDUCE_SEQ_FADD work from D90247. Differential Revision: https://reviews.llvm.org/D90644	2020-11-04 14:20:31 -06:00
Michael Liao	4b11201592	[MachineInstr] Add support for instructions with multiple memory operands. - Basically iterate each pair of memory operands from both instructions and return true if any of them may alias. - The exception are memory instructions without any memory operand. They may touch everything and could alias to any memory instruction. Differential Revision: https://reviews.llvm.org/D89447	2020-11-03 20:44:40 -05:00
Momchil Velikov	7360d6d921	[ARM][MachineOutliner] Do not overestimate LR liveness in return block The `LiveRegUnits` utility (as well as `LivePhysRegs`) considers callee-saved registers to be alive at the point after the return instruction in a block. In the ARM backend, the `LR` register is classified as callee-saved, which is not really correct (from an ARM eABI or just common sense point of view). These two conditions cause the `MachineOutliner` to overestimate the liveness of `LR`, which results in unnecessary saves/restores of `LR` around calls to outlined sequences. It also causes the `MachineVerifer` to crash in some cases, because the save instruction reads a dead `LR`, for example when the following program: int h(int, int); int f(int a, int b, int c, int d) { a = h(a + 1, b - 1); b = b + c; return 1 + (2 * a + b) * (c - d) / (a - b) * (c + d); } int g(int a, int b, int c, int d) { a = h(a - 1, b + 1); b = b + c; return 2 + (2 * a + b) * (c - d) / (a - b) * (c + d); } is compiled with `-target arm-eabi -march=armv7-m -Oz`. This patch computes the liveness of `LR` in return blocks only, while taking into account the few ARM instructions, which read `LR`, but nevertheless the register is not mentioned (explicitly or implicitly) in the instruction operands. Differential Revision: https://reviews.llvm.org/D89189	2020-11-02 16:47:22 +00:00
Nikita Popov	fa48ff3fc9	[CodeGen] Fix neutral value of vecreduce fadd in tests (NFC) The neutral value is -0.0, not 0.0. This doesn't matter for "fast" reductions due to nsz, but does matter for reassoc-only and seq reductions. Change tests to mostly use -0.0 where the neutral value was intended, and add some additional test coverage in some places. Also update LangRef to use the right value.	2020-10-29 21:26:14 +01:00
Fangrui Song	2819631914	[PrologEpilogInserter] Reduce PR16393 test and fix a prologue parameter in a debuginfo test	2020-10-18 22:18:42 -07:00
Roman Lebedev	7ee6c40247	Revert "Reland "[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown"" and it's follow-ups While we haven't encountered an earth-shattering problem with this yet, by now it is pretty evident that trying to model the ptr->int cast implicitly leads to having to update every single place that assumed no such cast could be needed. That is of course the wrong approach. Let's back this out, and re-attempt with some another approach, possibly one originally suggested by Eli Friedman in https://bugs.llvm.org/show_bug.cgi?id=46786#c20 which should hopefully spare us this pain and more. This reverts commits 1fb610429308a7c29c5065f5cc35dcc3fd69c8b1, 7324616660fc0995fa8c166e3c392361222d5dbc, aaafe350bb65dfc24c2cdad4839059ac81899fbe, e92a8e0c743f83552fac37ecf21e625ba3a4b11e. I've kept&improved the tests though.	2020-10-14 16:09:18 +03:00
Roman Lebedev	1fb6104293	Reland "[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown" This relands commit 1c021c64caef83cccb719c9bf0a2554faa6563af which was reverted in commit 17cec6a11a12f815052d56a17ef738cf246a2d9a because an assertion was being triggered, since `BuildConstantFromSCEV()` wasn't updated to handle the case where the constant we want to truncate is actually a pointer. I was unsuccessful in coming up with a test case where we'd end there with constant zext/sext of a pointer, so i didn't handle those cases there until there is a test case. Original commit message: While we indeed can't treat them as no-ops, i believe we can/should do better than just modelling them as `unknown`. `inttoptr` story is complicated, but for `ptrtoint`, it seems straight-forward to model it just as a zext-or-trunc of unknown. This may be important now that we track towards making inttoptr/ptrtoint casts not no-op, and towards preventing folding them into loads/etc (see D88979/D88789/D88788) Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D88806	2020-10-12 23:02:55 +03:00
Hans Wennborg	17cec6a11a	Revert 1c021c64c "[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown" > While we indeed can't treat them as no-ops, i believe we can/should > do better than just modelling them as `unknown`. `inttoptr` story > is complicated, but for `ptrtoint`, it seems straight-forward > to model it just as a zext-or-trunc of unknown. > > This may be important now that we track towards > making inttoptr/ptrtoint casts not no-op, > and towards preventing folding them into loads/etc > (see D88979/D88789/D88788) > > Reviewed By: mkazantsev > > Differential Revision: https://reviews.llvm.org/D88806 It caused the following assert during Chromium builds: llvm/lib/IR/Constants.cpp:1868: static llvm::Constant llvm::ConstantExpr::getTrunc(llvm::Constant , llvm::Type *, bool): Assertion `C->getType()->isIntOrIntVectorTy() && "Trunc operand must be integer"' failed. See code review for a link to a reproducer. This reverts commit 1c021c64caef83cccb719c9bf0a2554faa6563af.	2020-10-12 18:39:35 +02:00
Simon Pilgrim	c252200e4d	[DAG][ARM][MIPS][RISCV] Improve funnel shift promotion to use 'double shift' patterns Based on a discussion on D88783, if we're promoting a funnel shift to a width at least twice the size as the original type, then we can use the 'double shift' patterns (shifting the concatenated sources). Differential Revision: https://reviews.llvm.org/D89139	2020-10-12 14:11:02 +01:00
Roman Lebedev	1c021c64ca	[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown While we indeed can't treat them as no-ops, i believe we can/should do better than just modelling them as `unknown`. `inttoptr` story is complicated, but for `ptrtoint`, it seems straight-forward to model it just as a zext-or-trunc of unknown. This may be important now that we track towards making inttoptr/ptrtoint casts not no-op, and towards preventing folding them into loads/etc (see D88979/D88789/D88788) Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D88806	2020-10-12 11:04:03 +03:00
Simon Pilgrim	191fbda5d2	[ARM][MIPS] Add funnel shift test coverage Based on offline discussions regarding D89139 and D88783 - we want to make sure targets aren't doing anything particularly dumb Tests copied from aarch64 which has a mixture of general, legalization and special case tests	2020-10-09 19:19:47 +01:00
Amara Emerson	322d0afd87	[llvm][mlir] Promote the experimental reduction intrinsics to be first class intrinsics. This change renames the intrinsics to not have "experimental" in the name. The autoupgrader will handle legacy intrinsics. Relevant ML thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140729.html Differential Revision: https://reviews.llvm.org/D88787	2020-10-07 10:36:44 -07:00
Simon Pilgrim	6625892d7c	[ARM] Regenerate vldlane tests To help make the diffs in D88569 clearer	2020-10-07 11:47:03 +01:00
Jay Foad	1aa8e6a51a	[SDag] SimplifyDemandedBits: simplify to FP constant if all bits known We were already doing this for integer constants. This patch implements the same thing for floating point constants. Differential Revision: https://reviews.llvm.org/D88570	2020-10-07 09:24:38 +01:00
Sanjay Patel	2ccbf3dbd5	[SDAG] fold x * 0.0 at node creation time In the motivating case from https://llvm.org/PR47517 we create a node that does not get constant folded before getNegatedExpression is attempted from some other node, and we crash. By moving the fold into SelectionDAG::simplifyFPBinop(), we get the constant fold sooner and avoid the problem.	2020-10-04 11:31:57 -04:00
Meera Nakrani	f7c0e2b8f2	[ARM] Prevent constants from iCmp instruction from being hoisted if part of a min(max()) pattern Marks constants of an ICmp instruction as free if it's only user is a select instruction that is part of a min(max()) pattern. Ensures that in loops, in particular when loop unrolling is turned on, SSAT will still be correctly generated. Differential Revision: https://reviews.llvm.org/D88662	2020-10-02 09:28:35 +00:00
Matt Arsenault	89baeaef2f	Reapply "RegAllocFast: Rewrite and improve" This reverts commit 73a6a164b84a8195defbb8f5eeb6faecfc478ad4.	2020-09-30 10:35:25 -04:00
Meera Nakrani	675431b987	[ARM] Added more patterns to generate SSAT/USAT with shift Added patterns to generate an SSAT or USAT with shift for SSAT/USAT instructions that are matched from IR patterns. Differential Revision: https://reviews.llvm.org/D88145	2020-09-28 14:50:19 +00:00
Eli Friedman	3f739f736b	[SelectionDAG][GISel] Make LegalizeDAG lower FNEG using integer ops. Previously, if a floating-point type was legal, but FNEG wasn't legal, we would use FSUB. Instead, we should use integer ops, to preserve the semantics. (Alternatively, there's a compiler-rt call we could use, but there isn't much reason to use that.) It turns out we actually are still using this obscure codepath in a few cases: on some targets, we have "legal" floating-point types that don't actually support any floating-point operations. In particular, ARM and AArch64 are using this path. The implementation for SelectionDAG is pretty simple because we can reuse the infrastructure from FCOPYSIGN. See also 9a3dc3e, the corresponding change to type legalization. Also includes a "bonus" change to STRICT_FSUB legalization, so we can lower a STRICT_FSUB to a float libcall. Includes the changes to both LegalizeDAG and GlobalISel so we don't have inconsistent results in the future. Fixes https://bugs.llvm.org/show_bug.cgi?id=46792 . Differential Revision: https://reviews.llvm.org/D84287	2020-09-23 14:10:33 -07:00

1 2 3 4 5 ...

4216 Commits