llvm-project

Author	SHA1	Message	Date
Roman Lebedev	0aef747b84	[NFC][X86][Codegen] Megacommit: mass-regenerate all check lines that were already autogenerated The motivation is that the update script has at least two deviations (`<...>@GOT`/`<...>@PLT`/ and not hiding pointer arithmetics) from what pretty much all the checklines were generated with, and most of the tests are still not updated, so each time one of the non-up-to-date tests is updated to see the effect of the code change, there is a lot of noise. Instead of having to deal with that each time, let's just deal with everything at once. This has been done via: ``` cd llvm-project/llvm/test/CodeGen/X86 grep -rl "; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py" | xargs -L1 <...>/llvm-project/llvm/utils/update_llc_test_checks.py --llc-binary <...>/llvm-project/build/bin/llc ``` Not all tests were regenerated, however.	2021-06-11 23:57:02 +03:00
Jim Lin	4456805938	[X86] Don't fold (fneg (fma (fneg X), Y, (fneg Z))) to (fma X, Y, Z) Check if it has no signed zeros flag (nsz) in getNegatedExpression for x86. This patch fixed miscompilation: https://alive2.llvm.org/ce/z/XxwBAJ Reviewed By: RKSimon, spatel Differential Revision: https://reviews.llvm.org/D90901	2021-05-21 23:02:19 +08:00
Wang, Pengfei	c22dc71b12	[CodeGen][X86] Remove unused trivial check-prefixes from all CodeGen/X86 directory. I had manually removed unused prefixes from the CodeGen/X86 directory for more than 100 tests. I checked the change history for each of them at the beginning, and then I mainly focused on the format since I found all of the unused prefixes resulted from either careless copying or residue left after functional updates. I think it's OK to remove them from the remaining X86 tests by script now. I wrote a rough script which works for me in most tests. I put it in llvm/utils temporarily for review and hope it may help other component owners. The tests in this patch are all generated by the tool and checked by the update tool for the autogenerated tests. I skimmed them and checked about 30 tests and didn't find any unexpected changes. Reviewed By: mtrofin, MaskRay Differential Revision: https://reviews.llvm.org/D91496	2020-11-16 09:45:55 +08:00
Sanjay Patel	39009a8245	[DAGCombiner] tighten fast-math constraints for fma fold fadd (fma A, B, (fmul C, D)), E --> fma A, B, (fma C, D, E) This is only allowed when "reassoc" is present on the fadd. As discussed in D80801, this transform goes beyond what is allowed by "contract" FMF (-ffp-contract=fast). That is because we are fusing the trailing add of 'E' with a multiply, but without "reassoc", the code mandates that the products AB and CD are added together before adding in 'E'. I've added this example to the LangRef to try to clarify the meaning of "contract". If that seems reasonable, we should probably do something similar for the clang docs because there does not appear to be any formal spec for the behavior of -ffp-contract=fast. Differential Revision: https://reviews.llvm.org/D82499	2020-07-12 08:51:49 -04:00
Sanjay Patel	26fd3ffa78	[x86][AArch64] add tests for fmul-fma combine; NFC As discussed in D80801, there's a possible overstep in what is allowed by the 'contract' fast-math-flag.	2020-06-24 15:56:32 -04:00
Sanjay Patel	702cf93356	[DAGCombiner] allow more folding of fadd + fmul into fma If fmul and fadd are separated by an fma, we can fold them together to save an instruction: fadd (fma A, B, (fmul C, D)), N1 --> fma(A, B, fma(C, D, N1)) The fold implemented here is actually a specialization - we should be able to peek through >1 fma to find this pattern. That's another patch if we want to try that enhancement though. This transform was guarded by the TLI hook enableAggressiveFMAFusion(), so it was done for some in-tree targets like PowerPC, but not AArch64 or x86. The hook is protecting against forming a potentially more expensive computation when fma takes longer to execute than a single fadd. That hook may be needed for other transforms, but in this case, we are replacing fmul+fadd with fma, and the fma should never take longer than the 2 individual instructions. 'contract' FMF is all we need to allow this transform. That flag corresponds to -ffp-contract=fast in Clang, so we are allowed to form fma ops freely across expressions. Differential Revision: https://reviews.llvm.org/D80801	2020-06-09 10:41:27 -04:00
Sanjay Patel	912502e8ef	[AArch64][x86] add tests for FMA combines; NFC	2020-05-29 08:58:37 -04:00
Simon Pilgrim	4aa7b9cc96	[X86] X86InstComments - add FMA4 comments These typically match the FMA3 equivalents, although the multiply operands sometimes get flipped due to the FMA3 permute variants.	2020-02-08 17:02:00 +00:00
Simon Pilgrim	c96001035d	[X86] isNegatibleForFree - allow pre-legalized FMA negation As long as the FMA operation is legal (which we can proxy for the FMA3/FMA4 variants as well), we don't have to wait for the LegalOperations stage.	2020-02-07 17:04:17 +00:00
Cameron McInally	5d9271802b	Revert "[NFC][CodeGen] Add unary fneg tests to X86/fma_patterns.ll" This reverts commit 06de52674da73f30751f3ff19fdf457f87077c65. llvm-svn: 363314	2019-06-13 19:25:06 +00:00
Simon Pilgrim	287e78c82b	[DAGCombine] GetNegatedExpression - constant float vector support (PR42105) Add support for negation of constant build vectors. Differential Revision: https://reviews.llvm.org/D62963 llvm-svn: 363040	2019-06-11 09:44:33 +00:00
Cameron McInally	06de52674d	[NFC][CodeGen] Add unary fneg tests to X86/fma_patterns.ll llvm-svn: 362730	2019-06-06 18:41:18 +00:00
Craig Topper	aa5eb2fbaa	[X86] Force floating point values in constant pool decoding to print in scientific notation so they can't be confused with integers. When the floating point constants are whole numbers they have no decimal point so look like integers, but mean something very different in something like an 'and' instruction. Ideally we would just print a decimal point and a 0, but I couldn't see how to make APFloat::toString do that. llvm-svn: 345488	2018-10-29 04:52:04 +00:00
Sanjay Patel	9e7e0fd828	[DAGCombiner] allow undef elts in vector fma matching llvm-svn: 344528	2018-10-15 15:56:39 +00:00
Sanjay Patel	475a53649e	[x86] add tests for fma with undef elts; NFC llvm-svn: 344527	2018-10-15 15:47:37 +00:00
Sanjay Patel	4e970ff022	[DAGCombiner] allow undef elts in vector fma matching llvm-svn: 344525	2018-10-15 15:38:38 +00:00
Sanjay Patel	b06ac18ee9	[x86] add tests for fma with undef elts; NFC llvm-svn: 344523	2018-10-15 15:28:44 +00:00
Simon Pilgrim	ad23f270db	[X86] Standardize floating point assembly comments Consistently try to use APFloat::toString for floating point constant comments to get rid of differences between Constant / ConstantDataSequential values - it should help stop some of the linux-windows buildbot failures matching NaN/INF etc. as well. Differential Revision: https://reviews.llvm.org/D52702 llvm-svn: 343562	2018-10-02 09:08:51 +00:00
Simon Pilgrim	43e4e648ef	[X86] Regenerate fma comments. llvm-svn: 343376	2018-09-29 14:31:00 +00:00
Francis Visoiu Mistrih	25528d6de7	[CodeGen] Unify MBB reference format in both MIR and debug output As part of the unification of the debug format and the MIR format, print MBB references as '%bb.5'. The MIR printer prints the IR name of a MBB only for block definitions. * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g' * find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g' * grep -nr 'BB#' and fix Differential Revision: https://reviews.llvm.org/D40422 llvm-svn: 319665	2017-12-04 17:18:51 +00:00
Dinar Temirbulatov	a0beedef1c	[X86] SET0 to use XMM registers where possible PR26018 PR32862 Differential Revision: https://reviews.llvm.org/D35965 llvm-svn: 309926	2017-08-03 08:50:18 +00:00
Dinar Temirbulatov	aead31a36f	[X86] SET0 to use XMM registers where possible PR26018 PR32862 Differential Revision: https://reviews.llvm.org/D35839 llvm-svn: 309298	2017-07-27 17:47:01 +00:00
Simon Pilgrim	ddf407dec9	[X86][FMA] Regenerate test with broadcast comments. llvm-svn: 309093	2017-07-26 10:20:49 +00:00
Craig Topper	2caa97c891	[AVX-512] Fix the execution domain for scalar FMA instructions. llvm-svn: 296271	2017-02-25 19:36:28 +00:00
Nicolai Haehnle	3c67a08d1b	X86: Add checks for fma_patterns[_wide].ll with -enable-no-infs-fp-math This re-adds checks for the patterns that were disabled with r288506. Reviewers: spatel, delena, craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27346 llvm-svn: 289049	2016-12-08 14:08:08 +00:00
Nicolai Haehnle	33ca182c91	[DAGCombiner] do not fold (fmul (fadd X, 1), Y) -> (fmad X, Y, Y) by default Summary: When X = 0 and Y = inf, the original code produces inf, but the transformed code produces nan. So this transform (and its relatives) should only be used when the no-infs-fp-math flag is explicitly enabled. Also disable the transform using fmad (intermediate rounding) when unsafe-math is not enabled, since it can reduce the precision of the result; consider this example with binary floating point numbers with two bits of mantissa: x = 1.01 y = 111 x * (y + 1) = 1.01 * 1000 = 1010 (this is the exact result; no rounding occurs at any step) x * y + x = 1000.11 + 1.01 =r 1000 + 1.01 = 1001.01 =r 1000 (with rounding towards zero) The example relies on rounding towards zero at least in the second step. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98578 Reviewers: RKSimon, tstellarAMD, spatel, arsenm Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D26602 llvm-svn: 288506	2016-12-02 16:06:18 +00:00
Craig Topper	713085e60a	[X86] Don't lower FABS/FNEG masking directly to a ConstantPool load. Just create a ConstantFPSDNode and let that be lowered. This allows broadcast loads to be used when available. llvm-svn: 279958	2016-08-29 04:49:31 +00:00
Craig Topper	c48c029610	[AVX-512] Fix duplicate column in AVX512 execution dependency table that was preventing VMOVDQU32/VMOVDQA32 from being recognized. Fix a bug in the code that stops execution dependency fix from turning operations on 32-bit integer element types into operations on 64-bit integer element types. llvm-svn: 277327	2016-08-01 07:55:33 +00:00
Craig Topper	ce415ff9c5	[AVX512] Add load folding support for the unmasked forms of the FMA instructions. llvm-svn: 276615	2016-07-25 07:20:35 +00:00
Craig Topper	b6519db90d	[AVX512] Implement commuting support for EVEX encoded FMA3 instructions. llvm-svn: 276521	2016-07-23 07:16:56 +00:00
Craig Topper	5c913e84df	[AVX512] Use VMOVAPSZ128rr/VMOVAPS256rr for VR128X/VR256X physreg moves when VLX is supported. Ideally we would use VEX encoded moves instead of EVEX if the high 16 registers aren't referenced, but this is a good first step. llvm-svn: 275763	2016-07-18 06:14:34 +00:00
Craig Topper	e5ce84a33c	[AVX512] Add VLX 128/256-bit SET0 operations that encode to 128/256-bit EVEX encoded VPXORD so all 32 registers can be used. llvm-svn: 268884	2016-05-08 21:33:53 +00:00
Vyacheslav Klochkov	a3cd08b05c	X86-FMA3: Defined the ExeDomain property for Scalar FMA3 opcodes. Reviewer: Simon Pilgrim. Differential Revision: http://reviews.llvm.org/D15317 llvm-svn: 255080	2015-12-09 00:12:13 +00:00
Simon Pilgrim	5a64d98303	[X86][FMA4] Explicitly set the domain of FMA4 float/double scalar instructions Both were defaulting to the float domain - now matches the packed instructions. llvm-svn: 254841	2015-12-05 07:07:42 +00:00
Simon Pilgrim	3fc3454a0c	[X86][FMA] Optimize FNEG(FMUL) Patterns On FMA targets, we can avoid having to load a constant to negate a float/double multiply by instead using a FNMSUB (-(X*Y)-0) Fix for PR24366 Differential Revision: http://reviews.llvm.org/D14909 llvm-svn: 254495	2015-12-02 09:07:55 +00:00
Simon Pilgrim	db26b3ddfa	[X86][FMA4] Prefer FMA4 to FMA We currently output FMA instructions on targets which support both FMA4 + FMA (i.e. later Bulldozer CPUs bdver2/bdver3/bdver4). This patch flips this so FMA4 is preferred; this is for several reasons: 1 - FMA4 is non-destructive, reducing the need for mov instructions. 2 - It's more straightforward to commute and fold inputs (although the recent work on FMA has reduced this difference). 3 - All supported targets have FMA4 performance equal to or better than FMA - Piledriver (bdver2) in particular has half the throughput when executing FMA instructions. It looks like no future AMD processor lines will support FMA4 after the Bulldozer series so we're not causing problems for later CPUs. Differential Revision: http://reviews.llvm.org/D14997 llvm-svn: 254339	2015-11-30 22:22:06 +00:00
Simon Pilgrim	82f663d755	[X86][FMA] More thorough FMA tests Added FMADD/FMSUB/FNMADD/FNMSUB tests for all types Added load folding tests for 512-bit vectors NOTE: Many of the AVX512 FMA instructions don't yet commute/fold correctly As discussed on D14909 llvm-svn: 254232	2015-11-28 14:28:44 +00:00
Simon Pilgrim	1d881ae225	[X86][FMA] Begun adding AVX512 FMA tests As discussed on D14909 llvm-svn: 254180	2015-11-26 20:53:28 +00:00
Simon Pilgrim	1b4fecb098	[X86][FMA] Optimize FNEG(FMA) Patterns X86 needs to use its own FMA opcodes, preventing the standard FNEG(FMA) pattern table recognition method used by other platforms. This patch adds support for lowering FNEG(FMA(X,Y,Z)) into a single suitably negated FMA instruction. Fix for PR24364 Differential Revision: http://reviews.llvm.org/D14906 llvm-svn: 254016	2015-11-24 20:31:46 +00:00
Simon Pilgrim	806c42a747	[X86][FMA] Regenerate tests. Fixes some broken checks. llvm-svn: 253830	2015-11-22 19:05:53 +00:00
Andrew Kaylor	4731bea3e5	Improved the operands commute transformation for X86-FMA3 instructions. All 3 operands of FMA3 instructions are commutable now. Patch by Slava Klochkov Reviewers: Quentin Colombet(qcolombet), Ahmed Bougacha(ab). Differential Revision: http://reviews.llvm.org/D13269 llvm-svn: 252335	2015-11-06 19:47:25 +00:00
Simon Pilgrim	d45c88bbb5	[DAGCombiner] Improved FMA combine support for vectors Enabled constant canonicalization for all constants. Improved combining of constant vectors. llvm-svn: 249993	2015-10-11 19:48:12 +00:00
Simon Pilgrim	4003ed2da3	[DAGCombiner] Improve FMA support for interpolation patterns This patch adds support for combining patterns such as (FMUL(FADD(1.0, x), y)) and (FMUL(FSUB(x, 1.0), y)) to their FMA equivalents. This is useful in particular for linear interpolation cases such as (FADD(FMUL(x, t), FMUL(y, FSUB(1.0, t)))) Differential Revision: http://reviews.llvm.org/D13003 llvm-svn: 248210	2015-09-21 20:32:48 +00:00
Simon Pilgrim	779bcf3e3d	[X86][FMA] Refreshed fma tests llvm-svn: 247508	2015-09-12 15:33:05 +00:00
David Blaikie	a79ac14fa6	[opaque pointer type] Add textual IR support for explicit type parameter to load instruction Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278) import fileinput import sys import re pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)") for line in sys.stdin: sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line)) Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7649 llvm-svn: 230794	2015-02-27 21:17:42 +00:00
Craig Topper	12f0d9ef2c	Improve logic that decides if it's profitable to commute when some of the virtual registers involved have uses/defs chains connecting them to a physical register. Fix up the tests that this change improves. llvm-svn: 221336	2014-11-05 06:43:02 +00:00
Stephen Lin	98cbca2e4d	Disambiguate function names in some CodeGen tests. (Some tests were using function names that also were names of instructions and/or doing other unusual things that were making the test not amenable to otherwise scriptable pattern matching.) No functionality change. llvm-svn: 186621	2013-07-18 22:29:15 +00:00
Craig Topper	908e685102	Mark FMA4 instructions as commutable and add them to the folding tables. llvm-svn: 163035	2012-08-31 23:10:34 +00:00
Craig Topper	c0387f6b23	Mark FMA3 instructions as commutable so that the operands to the multiply part can be commuted. llvm-svn: 163001	2012-08-31 16:31:13 +00:00
Craig Topper	c30fdbc46c	Add support for converting llvm.fma to fma4 instructions. llvm-svn: 162999	2012-08-31 15:40:30 +00:00
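
The argument in commit 33ca182c91 (X = 0 and Y = inf gives inf before the fold and nan after it) can be checked numerically. The following is a minimal Python sketch, assuming ordinary IEEE-754 doubles and modelling fmad as a separately rounded multiply and add. The function names `original` and `transformed` are illustrative only and do not come from LLVM.

```python
import math


def original(x, y):
    # Models (fmul (fadd X, 1), Y): add first, then multiply.
    return (x + 1.0) * y


def transformed(x, y):
    # Models (fmad X, Y, Y) with intermediate rounding:
    # multiply first, then add Y.
    return x * y + y


x, y = 0.0, math.inf
before = original(x, y)      # (0 + 1) * inf
after = transformed(x, y)    # 0 * inf + inf, where 0 * inf is already nan

print("original:   ", before)
print("transformed:", after)
```

The two results differ only because Y is infinite, which is why the commit restricts the fold to cases where the no-infs-fp-math flag is explicitly enabled.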

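Commit 39009a8245 argues that rewriting fadd (fma A, B, (fmul C, D)), E into fma A, B, (fma C, D, E) needs "reassoc", because it changes the order in which E is added. A small Python sketch of that effect follows. It assumes IEEE-754 doubles and uses a hypothetical `fma` helper that computes the exact rational result and rounds once; the helper, the function names, and the input values are chosen for illustration and are not taken from the LLVM tests.

```python
from fractions import Fraction


def fma(a, b, c):
    # Hypothetical helper: a*b + c computed exactly, then rounded once to a double.
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def before_fold(a, b, c, d, e):
    # fadd (fma A, B, (fmul C, D)), E: the products are summed before E is added.
    return fma(a, b, c * d) + e


def after_fold(a, b, c, d, e):
    # fma A, B, (fma C, D, E): E is added to C*D first.
    return fma(a, b, fma(c, d, e))


# A*B = 2**54 and C*D = -2**54 cancel exactly; E = 1 is below the
# rounding granularity of a double at magnitude 2**54.
A = B = D = 2.0 ** 27
C = -(2.0 ** 27)
E = 1.0

print("before fold:", before_fold(A, B, C, D, E))
print("after fold: ", after_fold(A, B, C, D, E))
```

With these inputs the unfolded form keeps E, while the folded form loses it when C*D + E is rounded, so the two expressions are not interchangeable without reassociation being permitted.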

54 Commits