llvm-project

Author	SHA1	Message	Date
Phoebe Wang	5c73c5c9bf	[X86][NFC] Add missing immediate qualifier to VSM3RNDS2 instruction (#131576 )	2025-03-17 17:59:39 +08:00
Phoebe Wang	ee2722fc88	[X86][AVX10.2-BF16] Remove [NE]P from intrinsic and instruction name (#123335 ) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965	2025-01-24 15:49:28 +08:00
Phoebe Wang	24f177df61	[X86][AVX10.2-BF16] Update VCOMISBF16 intrinsics and instructions (#123307 ) - Add `I` to intrinsics and instructions - Add `_` before sbf16 in intrinsics Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965	2025-01-24 08:37:29 +08:00
Mikołaj Piróg	25653e558c	[AVX10.2] Update convert chapter intrinsic and mnemonics names (#123656 ) Intel spec for avx10.2 (https://cdrdv2.intel.com/v1/dl/getContent/828965) has been updated. This PR changes relevant names from the "AVX10 CONVERT INSTRUCTIONS" chapter .	2025-01-23 22:23:56 +08:00
Simon Pilgrim	90e9895a93	[X86] Handle BSF/BSR "zero-input pass through" behaviour (#123623 ) Intel docs have been updated to be similar to AMD and now describe BSF/BSR as not changing the destination register if the input value was zero, which allows us to support CTTZ/CTLZ zero-input cases by setting the destination to support a NumBits result (BSR is a bit messy as it has to be XOR'd to create a CTLZ result). VIA/Zhaoxin x86_64 CPUs have also been confirmed to match this behaviour. This patch adjusts the X86ISD::BSF/BSR nodes to take a "pass through" argument for zero-input cases. By default this is set to UNDEF to match existing behaviour, but it can be set to a suitable value if supported. There are still some limits to this - it's only supported for x86_64 capable processors (and I've only enabled it for x86_64 codegen), and Intel CPUs sometimes zero the upper 32 bits of a pass through register when used for BSR32/BSF32 with a zero source value (i.e. the whole 64 bits may not get passed through). Fixes #122004	2025-01-23 12:59:59 +00:00
Phoebe Wang	4f40b07533	[X86][AVX10.2-SATCVT][NFC] Remove NE from intrinsic and instruction name (#123275 ) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965	2025-01-22 22:53:47 +08:00
Phoebe Wang	13c6abfac8	[X86][AVX10.2-MINMAX][NFC] Remove NE[P] from intrinsic and instruction (#123272 ) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965	2025-01-21 19:55:09 +08:00
Phoebe Wang	9cd774d1e4	[X86][NFC] Move "_Int" after "k"/"kz" (#121450 ) Address comment at https://github.com/llvm/llvm-project/pull/121373#discussion_r1900402932	2025-01-02 21:02:19 +08:00
Phoebe Wang	23ec9ee17e	[X86][AVX10.2] Lower fmininum/fmaximum to VMINMAX* (#121373 )	2025-01-02 11:30:26 +08:00
Simon Pilgrim	29f11f0a32	[X86] Add missing reg/imm attributes to VRNDSCALES instruction names (#117203 ) More canonicalization of the instruction names to make them predictable - more closely matches VRNDSCALEP / VROUND equivalent instructions	2024-11-22 17:45:30 +00:00
Simon Pilgrim	3a5cf6d99b	[X86] Rename AVX512 VEXTRACT/INSERT??x? to VEXTRACT/INSERT??X? (#116826 ) Use uppercase in the subvector description ("32x2" -> "32X4" etc.) - matches what we already do in VBROADCAST??X?, and we try to use uppercase for all x86 instruction mnemonics anyway (and lowercase just for the arg description suffix).	2024-11-20 08:25:01 +00:00
Simon Pilgrim	7dcefb37a4	[X86] Tidy up AVX512 FPCLASS instruction naming (#116661 ) FPCLASS is a unary instruction with an immediate operand - update the naming to match similar instructions (e.g. VPSHUFD) by only using the source reg/mem and immediate in the instruction name	2024-11-19 11:26:46 +00:00
Simon Pilgrim	d4f2b71c3f	[X86] Fix position of immediate argument in AVX512 VPCMP comparisons (#116646 ) The 'i' arg was being put between the 'm' and 'b' args instead of afterwards like other avx512 instructions (VCMPPS/D, VPERMILPS/D etc.).	2024-11-19 10:00:24 +00:00
Mahesh-Attarde	e61a7dc256	[X86][AVX512] Use comx for compare (#113567 ) We added the AVX10.2 COMEF ISA to LLVM, but it does not optimize correctly in the scenario mentioned below. Summary Input ``` define i1 @oeq(float %x, float %y) { %1 = fcmp oeq float %x, %y ret i1 %1 } define i1 @une(float %x, float %y) { %1 = fcmp une float %x, %y ret i1 %1 } define i1 @ogt(float %x, float %y) { %1 = fcmp ogt float %x, %y ret i1 %1 } // Prior to AVX10.2, default code generation oeq: # @oeq cmpeqss xmm0, xmm1 movd eax, xmm0 and eax, 1 ret une: # @une cmpneqss xmm0, xmm1 movd eax, xmm0 and eax, 1 ret ogt: # @ogt ucomiss xmm0, xmm1 seta al ret ``` This patch will remove `cmpeqss` and `cmpneqss`. For the complete transform, check the unit test. This continues what PR https://github.com/llvm/llvm-project/pull/113098 added. Earlier, legalization and combine expanded the `setcc oeq:ch` node into an `and` of `setcc eq` and `setcc o`. Following suggestions from the community, the new internal transform is: ``` Optimized type-legalized selection DAG: %bb.0 'hoeq:' SelectionDAG has 11 nodes: t0: ch,glue = EntryToken t2: f16,ch = CopyFromReg t0, Register:f16 %0 t4: f16,ch = CopyFromReg t0, Register:f16 %1 t14: i8 = setcc t2, t4, setoeq:ch t10: ch,glue = CopyToReg t0, Register:i8 $al, t14 t11: ch = X86ISD::RET_GLUE t10, TargetConstant:i32<0>, Register:i8 $al, t10:1 Optimized legalized selection DAG: %bb.0 'hoeq:' SelectionDAG has 12 nodes: t0: ch,glue = EntryToken t2: f16,ch = CopyFromReg t0, Register:f16 %0 t4: f16,ch = CopyFromReg t0, Register:f16 %1 t15: i32 = X86ISD::UCOMX t2, t4 t17: i8 = X86ISD::SETCC TargetConstant:i8<4>, t15 t10: ch,glue = CopyToReg t0, Register:i8 $al, t17 t11: ch = X86ISD::RET_GLUE t10, TargetConstant:i32<0>, Register:i8 $al, t10:1 ``` The earlier transform is mentioned here https://github.com/llvm/llvm-project/pull/113098#discussion_r1810307663 --------- Co-authored-by: mattarde <mattarde@intel.com>	2024-10-30 16:17:25 +08:00
Freddy Ye	5aa1275d03	[X86] Support SM4 EVEX version intrinsics/instructions. (#113402 ) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368	2024-10-28 10:46:16 +08:00
Mahesh-Attarde	311e4e3245	[X86][AVX10.2] Support AVX10.2 MOVZXC new Instructions. (#108537 ) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965 Chapter 14 INTEL® AVX10 ZERO-EXTENDING PARTIAL VECTOR COPY INSTRUCTIONS --------- Co-authored-by: mattarde <mattarde@intel.com>	2024-09-18 21:01:51 +08:00
Mahesh-Attarde	f5ad9e1ca5	[X86][AVX10.2] Support AVX10.2-COMEF new instructions. (#108063 ) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965 Chapter 8 AVX10 COMPARE SCALAR FP WITH ENHANCED EFLAGS INSTRUCTIONS --------- Co-authored-by: mattarde <mattarde@intel.com>	2024-09-18 17:55:29 +08:00
Simon Pilgrim	1e33bd2031	[X86] Add missing immediate qualifier to the (V)PINSR/PEXTR instruction names Makes it easier to algorithmically recreate the instruction name in various analysis scripts I'm working on	2024-09-15 14:12:55 +01:00
Simon Pilgrim	7048857f52	[X86] Add missing immediate qualifier to the (V)EXTRACTPS instruction names Makes it easier to algorithmically recreate the instruction name in various analysis scripts I'm working on	2024-09-15 13:41:46 +01:00
Simon Pilgrim	614a064cac	[X86] Add missing immediate qualifier to the (V)INSERT/EXTRACT/PERM2 instruction names (#108593 ) Makes it easier to algorithmically recreate the instruction name in various analysis scripts I'm working on	2024-09-15 11:42:13 +01:00
Malay Sanghi	a409ebc1fc	[X86][AVX10.2] Support AVX10.2-SATCVT-DS new instructions. (#102592 ) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965	2024-09-12 22:45:20 +08:00
Freddy Ye	83ad644afa	[X86][AVX10.2] Support AVX10.2-BF16 new instructions. (#101603 ) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965	2024-09-04 08:13:24 +08:00
Freddy Ye	7c4cadfc43	[X86][AVX10.2] Support AVX10.2-CONVERT new instructions. (#101600 ) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965	2024-08-21 15:44:06 +08:00
Freddy Ye	80721e0d6c	[X86][AVX10.2] Support AVX10.2-SATCVT new instructions. (#101599 ) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965	2024-08-06 19:37:49 +08:00
Phoebe Wang	b0329206db	[X86][AVX10.2] Support AVX10.2 VNNI FP16/INT8/INT16 new instructions (#101783 ) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965	2024-08-05 18:57:42 +08:00
Freddy Ye	3d5cc7e1e6	[X86][AVX10.2] Support AVX10.2-MINMAX new instructions. (#101598 ) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965	2024-08-05 11:06:02 +08:00
Phoebe Wang	259ca9ee9c	Reland "[X86][AVX10.2] Support AVX10.2 option and VMPSADBW/VADDP[D,H,S] new instructions (#101452 )" (#101616 ) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965	2024-08-03 09:26:07 +08:00
Phoebe Wang	2e0588d5e1	Revert "[X86][AVX10.2] Support AVX10.2 option and VMPSADBW/VADDP[D,H,S] new instructions" (#101612 ) Reverts llvm/llvm-project#101452 Several buildbots failed. Revert first.	2024-08-02 13:04:10 +08:00
Phoebe Wang	10bad2c8d7	[X86][AVX10.2] Support AVX10.2 option and VMPSADBW/VADDP[D,H,S] new instructions (#101452 ) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965	2024-08-02 12:10:50 +08:00
Freddy Ye	13b265c7b5	[X86][MC] Support Intel FRED and LKGS instructions. (#91909 ) Spec reference: https://cdrdv2.intel.com/v1/dl/getContent/678938	2024-05-15 10:40:16 +08:00
Freddy Ye	de3e4a9dfe	[X86][APX] Remove KEYLOCKER and SHA promotions from EVEX MAP4. (#89173 ) APX spec: https://cdrdv2.intel.com/v1/dl/getContent/784266 Change happened in version 4.0. Removed instructions' Opcodes: AESDEC128KL AESDEC256KL AESDECWIDE128KL AESDECWIDE256KL AESENC128KL AESENC256KL AESENCWIDE128KL AESENCWIDE256KL ENCODEKEY128 ENCODEKEY256 SHA1MSG1 SHA1MSG2 SHA1NEXTE SHA1RNDS4 SHA256MSG1 SHA256MSG2 SHA256RNDS2	2024-04-19 10:56:59 +08:00
Freddy Ye	f4509cf284	[X86][MC] Support enc/dec for SETZUCC and promoted SETCC. (#86473 ) apx-spec: https://cdrdv2.intel.com/v1/dl/getContent/784266 apx-syntax-recommendation: https://cdrdv2.intel.com/v1/dl/getContent/817241	2024-04-11 10:18:29 +08:00
Simon Pilgrim	ecb34599bd	[X86] Add missing immediate qualifier to the (V)ROUND instructions (#87636 ) Makes it easier to algorithmically recreate the instruction name in various analysis scripts I'm working on	2024-04-04 15:20:16 +01:00
Freddy Ye	db7d243978	[X86][MC] Support enc/dec for IMULZU. (#86653 ) apx-spec: https://cdrdv2.intel.com/v1/dl/getContent/784266 apx-syntax-recommendation: https://cdrdv2.intel.com/v1/dl/getContent/817241	2024-03-29 15:52:41 +08:00
XinWang10	7b766a6f50	[X86] Support APX CMOV/CFCMOV instructions (#82592 ) This patch supports ND CMOV instructions and CFCMOV instructions. RFC: https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031/4	2024-03-17 20:18:56 +08:00
Simon Pilgrim	0858c906db	[X86] Add missing register qualifier to the VBLENDVPD/VBLENDVPS/VPBLENDVB instruction names Matches the SSE variants (which have a 0 qualifier to indicate the xmm0 explicit dependency)	2024-03-11 15:48:07 +00:00
Simon Pilgrim	1ec5b1f483	[X86] Add missing immediate qualifier to the (V)PCLMULQDQ instruction names	2024-03-11 13:39:25 +00:00
Simon Pilgrim	2b8f1daf78	[X86] Add missing immediate qualifier to the SSE42 (V)PCMPEST/PCMPIST string instruction names	2024-03-11 13:02:48 +00:00
Simon Pilgrim	92d7aca441	[X86] Add missing immediate qualifier to the (V)CMPSS/D instructions (#84496 ) Matches (V)CMPPS/D and makes it easier to algorithmically recreate the instruction name in various analysis scripts I'm working on	2024-03-09 16:21:25 +00:00
Shengchen Kan	1ca8092e87	[X86][MC] Support encoding/decoding for APX CCMP/CTEST (#83863 ) APX assembly syntax recommendations: https://cdrdv2.intel.com/v1/dl/getContent/817241 NOTE: The change in llvm/tools/llvm-exegesis/lib/X86/Target.cpp is for test LLVM :: tools/llvm-exegesis/X86/latency/latency-SETCCr-cond-codes-sweep.s For `SETcc`, llvm-exegesis randomly chooses one other instruction to test with `SETcc`. After selecting the instruction, llvm-exegesis checks whether the operand is initialized and valid; if not, `randomizeTargetMCOperand` chooses a value for the invalid operand. It lacked support for the condition code operand, which caused the flaky failure once `CCMP` was supported. llvm-exegesis can choose `CCMP` without specifying the ccmp feature because it uses `MCSubtarget` and only 16/32/64 bit is considered. llvm-exegesis doesn't choose other instructions because of the requirement in `hasAliasingRegistersThrough`: the instruction should use GPR (defined by `SETcc`) and define `EFLAGS` (used by `SETcc`).	2024-03-08 20:54:33 +08:00
XinWang10	d9e875dcc1	[X86][MC] Support encoding/decoding for APX variant LZCNT/TZCNT/POPCNT instructions (#79954 ) Two variants: promoted legacy, NF (no flags update). The syntax of NF instructions is aligned with GNU binutils. https://sourceware.org/pipermail/binutils/2023-September/129545.html	2024-01-31 21:10:02 +08:00
Shengchen Kan	7c3ee7cbe6	[X86][tablgen] Fix the broadcast tables (#79675 )	2024-01-28 09:06:27 +08:00
XinWang10	02d56801ee	[X86] Support APX promoted RAO-INT and MOVBE instructions (#77431 ) R16-R31 was added into GPRs in https://github.com/llvm/llvm-project/pull/70958, This patch supports the promoted RAO-INT and MOVBE instructions in EVEX space. RFC: https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031/4	2024-01-26 14:33:45 +08:00
XinWang10	816cc9d24b	[X86][MC] Support Enc/Dec for NF BMI instructions (#76709 ) Promoted BMI instructions were supported in #73899	2024-01-25 10:33:14 +08:00
Shengchen Kan	5c68c6d70f	[X86] Support encoding/decoding and lowering for APX variant SHL/SHR/SAR/ROL/ROR/RCL/RCR/SHLD/SHRD (#78853 ) Four variants: promoted legacy, ND (new data destination), NF (no flags update) and NF_ND (NF + ND). The syntax of NF instructions is aligned with GNU binutils. https://sourceware.org/pipermail/binutils/2023-September/129545.html	2024-01-23 10:23:27 +08:00
Shengchen Kan	4daea501c4	[X86][MC] Support encoding/decoding for APX variant MUL/IMUL/DIV/IDIV instructions (#76919 ) Four variants: promoted legacy, ND (new data destination), NF (no flags update) and NF_ND (NF + ND). The syntax of NF instructions is aligned with GNU binutils. https://sourceware.org/pipermail/binutils/2023-September/129545.html	2024-01-05 17:16:55 +08:00
Shengchen Kan	dd9681f839	[X86][MC] Support encoding/decoding for APX variant INC/DEC/ADCX/ADOX instructions (#76721 ) Four variants: promoted legacy, ND (new data destination), NF (no flags update) and NF_ND (NF + ND). The syntax of NF instructions is aligned with GNU binutils. https://sourceware.org/pipermail/binutils/2023-September/129545.html	2024-01-04 10:12:12 +08:00
XinWang10	d8db2733c8	[X86][MC] Support Enc/Dec for EGPR for promoted CRC32 (#76434 ) R16-R31 was added into GPRs in https://github.com/llvm/llvm-project/pull/70958, This patch supports the encoding/decoding for promoted CRC32 instruction in EVEX space. RFC: https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031/4	2024-01-02 10:48:42 +08:00
Shengchen Kan	d3ddb93d04	[X86] Fix typo about the internal name of instructions 64ri -> 64ri32	2023-12-29 12:18:34 +08:00
Shengchen Kan	d79ccee8dc	[X86][MC] Support encoding/decoding for APX variant ADD/SUB/ADC/SBB/OR/XOR/NEG/NOT instructions (#76319 ) Four variants: promoted legacy, ND (new data destination), NF (no flags update) and NF_ND (NF + ND). The syntax of NF instructions is aligned with GNU binutils. https://sourceware.org/pipermail/binutils/2023-September/129545.html	2023-12-28 21:22:03 +08:00

66 Commits