llvm-project

Author	SHA1	Message	Date
Shengchen Kan	8e77390c06	[X86][CodeGen] Support folding memory broadcast in X86InstrInfo::foldMemoryOperandImpl (#79761)	2024-01-31 12:51:03 +08:00
Shengchen Kan	169553688c	[X86][NFC] Remove TB_FOLDED_BCAST and format code in X86InstrFoldTables.cpp	2024-01-30 00:27:16 +08:00
Shengchen Kan	d133ada946	[X86][CodeGen] Add missing BroadcastTable1 in X86MemUnfoldTable	2024-01-30 00:02:18 +08:00
Shengchen Kan	cfb702676c	[X86][NFC] Rename lookupBroadcastFoldTable to lookupBroadcastFoldTableBySize. Address RKSimon's comments in #79761	2024-01-29 23:23:07 +08:00
Shengchen Kan	7c3ee7cbe6	[X86][tablgen] Fix the broadcast tables (#79675)	2024-01-28 09:06:27 +08:00
Simon Pilgrim	b8bbd5fe6f	[X86] X86InstrFoldTables.cpp - add Op4 Broadcast Fold/Unfold table entries Prep work for #73509 (missed in #73654)	2023-11-30 15:20:42 +00:00
Shengchen Kan	a4e1aa256b	[X86][tablgen] Auto-gen broadcast tables (#73654) 1. Add TB_BCAST_SH for FP16 2. Auto-gen 4 broadcast tables BroadcastTable[1-4] issue: https://github.com/llvm/llvm-project/issues/66360	2023-11-30 22:24:31 +08:00
Shengchen Kan	bafa51c8a5	[X86] Rename X86MemoryFoldTableEntry to X86FoldTableEntry, NFCI b/c it's used for element that folds a load, store or broadcast.	2023-11-28 19:49:14 +08:00
Shengchen Kan	c66c15a76d	[X86] Rename some variables for memory fold and format code, NFCI 1. Rename the names of tables to simplify the print 2. Align the abbreviation in the same file Instr -> Inst 3. Clang-format 4. Capitalize the first char of the variable name	2023-11-28 19:07:44 +08:00
Shengchen Kan	1b1f3c20b5	[X86][CodeGen] Remove duplicated code for the table checks, NFCI	2023-11-28 14:07:32 +08:00
Simon Pilgrim	0b91de5ea3	[X86] Add X86FixupVectorConstantsPass to re-fold AVX512 vector load folds as broadcast folds. This patch analyzes AVX512 instructions for full vector width folded loads from the constant pool and attempts to determine if it can be replaced with a smaller broadcast folded variant. Typically the broadcast opportunities were missed by type-width mismatches or multiuse limitations which have been removed in later passes. As well as introducing broadcast fold tables (which can hopefully be extended/automated in the future), this also handles mismatches in the AND/ANDN/OR/XOR/TERNLOG type-widths, catching additional missed opportunities. This patch is pulled from the ongoing work based on D150143, but without removing the existing DAG constant broadcast lowering code - this patch is currently a late stage cleanup only. The intention is to add additional broadcast/extension handling of constants in future patches, but it turned out that AVX512 broadcast handling was the easiest to start with. Differential Revision: https://reviews.llvm.org/D150526	2023-05-23 10:58:17 +01:00
Shengchen Kan	f3d9abf1f8	[X86][mem-fold] Use the generated memory folding table Reviewed By: yubing Differential Revision: https://reviews.llvm.org/D147527	2023-04-06 19:49:39 +08:00
Bing1 Yu	4f4b2161ec	[X86][NFC] Move MemoryFoldTable2Addr MemoryFoldTable0~4 into X86InstrFoldTables.def Reviewed By: pengfei, skan Differential Revision: https://reviews.llvm.org/D142083	2023-02-02 16:38:50 +08:00
serge-sans-paille	38818b60c5	Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ part. Use deduction guides instead of helper functions. The only non-automatic changes have been: 1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t), (uint8_t)) 2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There were a few similar situations across the codebase. 3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated. 4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that). Per reviewers' comment, some useless makeArrayRef have been removed in the process. This is a follow-up to https://reviews.llvm.org/D140896 that introduced the deduction guides. Differential Revision: https://reviews.llvm.org/D140955	2023-01-05 14:11:08 +01:00
Freddy Ye	23f02693ec	[X86] Add AVX-VNNI-INT8 instructions. For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Reviewed By: pengfei, skan Differential Revision: https://reviews.llvm.org/D135938	2022-10-28 10:39:54 +08:00
Freddy Ye	0e720e6ada	[X86] Add AVX-IFMA instructions. For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Reviewed By: pengfei, skan Differential Revision: https://reviews.llvm.org/D135932	2022-10-28 09:42:30 +08:00
Nicolai Hähnle	ede600377c	ManagedStatic: remove many straightforward uses in llvm (Reapply after revert in e9ce1a588030d8d4004f5d7e443afe46245e9a92 due to Fuchsia test failures. Removed changes in lib/ExecutionEngine/ other than error categories, to be checked in more detail and reapplied separately.) Bulk remove many of the more trivial uses of ManagedStatic in the llvm directory, either by defining a new getter function or, in many cases, moving the static variable directly into the only function that uses it. Differential Revision: https://reviews.llvm.org/D129120	2022-07-10 10:29:15 +02:00
Nicolai Hähnle	e9ce1a5880	Revert "ManagedStatic: remove many straightforward uses in llvm" This reverts commit e6f1f062457c928c18a88c612f39d9e168f65a85. Reverting due to a failure on the fuchsia-x86_64-linux buildbot.	2022-07-10 09:54:30 +02:00
Nicolai Hähnle	e6f1f06245	ManagedStatic: remove many straightforward uses in llvm Bulk remove many of the more trivial uses of ManagedStatic in the llvm directory, either by defining a new getter function or, in many cases, moving the static variable directly into the only function that uses it. Differential Revision: https://reviews.llvm.org/D129120	2022-07-10 09:15:08 +02:00
Shengchen Kan	4d21497006	[X86] Remove TB_NO_REVERSE for 2 memory folding entries: `X86::MMX_MOVD64from64rr -> X86::MMX_MOVQ64mr` and `X86::MMX_MOVD64grr -> X86::MMX_MOVD64mr`. These two entries were added in llvm-svn: 372770. I think these two should be reversible. Reviewed By: RKSimon, pengfei Differential Revision: https://reviews.llvm.org/D122217	2022-04-06 17:21:12 +08:00
Craig Topper	9933015fdd	[X86] Fold MMX_MOVD64from64rr + store to MMX_MOVQ64mr instead of MMX_MOVD64from64mr. MMX_MOVD64from64rr moves an MMX register to a 64-bit GPR. MMX_MOVD64from64mr is the memory version of moving a MMX register to a 64-bit GPR. It requires the REX.W bit to be set. There are no isel patterns that use this instruction. MMX_MOVQ64mr is the MMX register store instruction. It doesn't require a REX.W prefix. This makes it one byte shorter to encode than MMX_MOVD64from64mr in many cases. Both store instructions output the same mnemonic string. The assembler would choose MMX_MOVQ64mr if it was to parse the output. Which is another reason using it is the correct thing to do. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D122241	2022-03-22 14:21:55 -07:00
Shengchen Kan	021b42367a	[X86] Rename MMX_MOVD64from64rm to MMX_MOVD64from64mr b/c it stores sth, NFC Reviewed By: pengfei, RKSimon Differential Revision: https://reviews.llvm.org/D122216	2022-03-22 17:59:28 +08:00
Simon Pilgrim	41052fd699	[X86][MMX] Remove superfluous 'i' from MMX cvt opnames. NFCI. This is a very old copy+paste typo - none of these cvt ops have an immediate operand. Noticed while trying to merge MMX instructions into some existing SSE instruction scheduler instregex patterns.	2021-12-12 17:59:16 +00:00
Simon Pilgrim	0a08813cad	[X86][MMX] Remove superfluous 'i' from MMX binop opnames. NFCI. This is a very old copy+paste typo - none of these binops have an immediate operand. Noticed while trying to merge MMX instructions into some existing SSE instruction scheduler instregex patterns.	2021-12-12 17:59:16 +00:00
Wang, Pengfei	ab40dbfe03	[X86] AVX512FP16 instructions enabling 6/6 Enable FP16 complex FMA instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105269	2021-08-30 13:08:45 +08:00
Wang, Pengfei	c728bd5bba	[X86] AVX512FP16 instructions enabling 5/6 Enable FP16 FMA instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105268	2021-08-24 09:07:19 +08:00
Wang, Pengfei	b088536ce9	[X86] AVX512FP16 instructions enabling 4/6 Enable FP16 unary operator instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105267	2021-08-22 08:59:35 +08:00
Wang, Pengfei	2379949aad	[X86] AVX512FP16 instructions enabling 3/6 Enable FP16 conversion instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105265	2021-08-18 09:03:41 +08:00
Wang, Pengfei	f1de9d6dae	[X86] AVX512FP16 instructions enabling 2/6 Enable FP16 binary operator instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105264	2021-08-15 08:56:33 +08:00
Liu, Chen3	756f597841	[X86] Support Intel avxvnni. This patch mainly made the following changes: 1. Support AVX-VNNI instructions; 2. Introduce ExplicitVEXPrefix flag so that vpdpbusd/vpdpbusds/vpdpwssd/vpdpwssds instructions only use vex-encoding when the user explicitly adds the {vex} prefix. Differential Revision: https://reviews.llvm.org/D89105	2020-10-31 12:39:51 +08:00
Craig Topper	e28376ec28	[X86] Add i32->float and i64->double bitcast pseudo instructions to store folding table. We have pseudo instructions we use for bitcasts between these types. We have them in the load folding table, but not the store folding table. This adds them there so they can be used for stack spills. I added an exact size check so that we don't fold when the stack slot is larger than the GPR. Otherwise the upper bits in the stack slot would be garbage. That would be fine for Eli's test case in PR47874, but I'm not sure it's safe in general. A step towards fixing PR47874. Next steps are to change the ADDSSrr_Int pseudo instructions to use FR32 as the second source register class instead of VR128. That will keep the coalescer from promoting the register class of the bitcast instruction which will make the stack slot 4 bytes instead of 16 bytes. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D89656	2020-10-19 12:53:14 -07:00
Craig Topper	912cd8a37f	[X86] Add vpternlog to the broadcast unfolding table.	2020-07-02 13:43:44 -07:00
Georgii Rymar	1647ff6e27	[ADT/STLExtras.h] - Add llvm::is_sorted wrapper and update callers. It can be used to avoid passing the begin and end of a range. This makes the code shorter and it is consistent with the other wrappers we already have. Differential revision: https://reviews.llvm.org/D78016	2020-04-14 14:11:02 +03:00
Craig Topper	7badb38918	[X86] Fix copy/paste mistake in comment. NFC	2020-02-14 09:47:50 -08:00
Craig Topper	0152b106ae	[X86] Add the recently added (V)CVTSS2SI/CVTSD2SI instructions used for LRINT/LLRINT to the load folding tables.	2020-02-08 17:54:48 -08:00
Simon Pilgrim	10417ad2e4	[X86] Standardize BROADCAST enum names (PR31079) Tweak EVEX implementation names so it matches the other variants by adding the 'r' prefix. Oddly some of the subvec broadcast ops already matched.	2020-02-08 16:55:00 +00:00
Simon Pilgrim	0ed79e9b8f	[X86] Standardize VPSLLDQ/VPSRLDQ enum names (PR31079) Tweak EVEX implementation names so it matches the other variants	2020-02-08 14:54:44 +00:00
Craig Topper	8b5ad3d16e	[X86] Add broadcast load unfolding support for VPTESTMD/Q and VPTESTNMD/Q. llvm-svn: 373138	2019-09-28 01:56:36 +00:00
Simon Pilgrim	a7f27f357d	[X86] Add MMX MOVD/MOVQ stores to folding tables to support stack folding llvm-svn: 372770	2019-09-24 16:15:32 +00:00
Craig Topper	0e533ca4bb	[X86] Add broadcast load unfolding support for VCMPPS/PD. llvm-svn: 371487	2019-09-10 05:49:53 +00:00
Craig Topper	a88f58ff0e	[X86] Add broadcast load unfolding support for vpcmpeq/vpcmpgt/vpcmp/vpcmpu. llvm-svn: 371368	2019-09-09 07:46:11 +00:00
Craig Topper	8c2ab1c4cb	[X86] Add broadcast load unfold support for smin/umin/smax/umax. llvm-svn: 371366	2019-09-09 06:32:24 +00:00
Craig Topper	ad7822329f	[X86] Add broadcast load unfolding support for VMAXPS/PD and VMINPS/PD. llvm-svn: 371363	2019-09-09 04:25:01 +00:00
Craig Topper	1829a09bea	[X86] Add support for unfold broadcast loads from FMA instructions. llvm-svn: 371323	2019-09-07 21:54:40 +00:00
Craig Topper	3ab210862a	[X86] Add initial support for unfolding broadcast loads from arithmetic instructions to enable LICM hoisting of the load. MachineLICM can hoist an invariant load, but if that load is folded it needs to be unfolded. On AVX512 sometimes this load is a broadcast load which we were previously unable to unfold. This patch adds initial support for that with a very basic list of supported instructions as a starting point. Differential Revision: https://reviews.llvm.org/D67017 llvm-svn: 370620	2019-09-01 22:14:36 +00:00
Craig Topper	46f2b583a2	[X86] Add MOVSDrr->MOVLPDrm entry to load folding table. Add custom handling to turn UNPCKLPDrr->MOVHPDrm when load is underaligned. If the load is aligned we can turn UNPCKLPDrr into UNPCKLPDrm. llvm-svn: 365287	2019-07-08 02:10:20 +00:00
Fangrui Song	dc8de6037c	Simplify std::lower_bound with llvm::{bsearch,lower_bound}. NFC llvm-svn: 364006	2019-06-21 05:40:31 +00:00
Craig Topper	587427716c	[X86] Remove MOVDI2SSrm/MOV64toSDrm/MOVSS2DImr/MOVSDto64mr CodeGenOnly instructions. The isel patterns for these use a bitcast and load/store, but DAG combine should have canonicalized those away. For the purposes of the memory folding table these opcodes can be replaced by the MOVSSrm_alt/MOVSDrm_alt and MOVSSmr/MOVSDmr opcodes. llvm-svn: 363644	2019-06-18 03:23:15 +00:00
Craig Topper	f3f968adcd	[X86] Add TB_NO_REVERSE to some memory folding table entries where the register form requires 64-bit mode, but the memory form does not. We don't know if it's safe to unfold if we're in 32-bit mode. This is similar to what was done to some load opcodes in r363523. I think it's pretty unlikely we will try to unfold these anyway so I don't think this is testable. llvm-svn: 363595	2019-06-17 18:38:07 +00:00
Craig Topper	9f2f127009	[X86] Add TB_NO_REVERSE to some folding table entries where the register form uses the REX prefix, but the memory form does not. It would not be safe to unfold the memory form to the register form without checking that we are compiling for 64-bit mode. This probably isn't a real functional issue since we are unlikely to unfold any of these instructions since they don't have any tied registers, aren't commutable, and don't have any inputs other than the address. llvm-svn: 363523	2019-06-16 22:33:09 +00:00

62 Commits