llvm-project

Author	SHA1	Message	Date
Simon Pilgrim	ecb34599bd	[X86] Add missing immediate qualifier to the (V)ROUND instructions (#87636 ) Makes it easier to algorithmically recreate the instruction name in various analysis scripts I'm working on	2024-04-04 15:20:16 +01:00
Freddy Ye	36b4b9d988	[X86] Support immediate folding for CCMP/CTEST (#86616 ) E.g. %0:gr32 = MOV32ri 81 CTEST32rr %0, %1, 2, 10, implicit-def $eflags, implicit $eflags => CTEST32ri %1, 81, 2, 10, implicit-def $eflags, implicit $eflags	2024-03-28 18:54:32 +08:00
XinWang10	7b766a6f50	[X86] Support APX CMOV/CFCMOV instructions (#82592 ) This patch supports ND CMOV instructions and CFCMOV instructions. RFC: https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031/4	2024-03-17 20:18:56 +08:00
Ganesh	61fadd0b09	[X86] Fast AVX-512-VNNI vpdpwssd tuning (#85375 ) Adding a tuning feature to fix https://github.com/llvm/llvm-project/issues/84182 Generates vpdpwssd (instead of vpmaddwd + vpaddd sequence)	2024-03-15 16:45:41 +05:30
Simon Pilgrim	1ec5b1f483	[X86] Add missing immediate qualifier to the (V)PCLMULQDQ instruction names	2024-03-11 13:39:25 +00:00
Simon Pilgrim	92d7aca441	[X86] Add missing immediate qualifier to the (V)CMPSS/D instructions (#84496 ) Matches (V)CMPPS/D and makes it easier to algorithmically recreate the instruction name in various analysis scripts I'm working on	2024-03-09 16:21:25 +00:00
David Green	44be5a7fdc	[Codegen] Make Width in getMemOperandsWithOffsetWidth a LocationSize. (#83875 ) This is another part of #70452 which makes getMemOperandsWithOffsetWidth use a LocationSize for Width, as opposed to the unsigned it currently uses. The advantages on its own are not super high if getMemOperandsWithOffsetWidth usually uses known sizes, but if the values can come from an MMO it can help be more accurate in case they are Unknown (and in the future, scalable).	2024-03-06 17:40:13 +00:00
AtariDreams	3e40c96d89	[X86] Resolve FIXME: Add FPCW as a rounding control register (#82452 ) To prevent tests from breaking, another fix had to be made: Now, we check if the instruction after a waiting instruction is a call, and if so, we insert the wait.	2024-03-05 08:47:05 +08:00
Simon Pilgrim	448fe73428	[X86] Add X86::getVectorRegisterWidth helper. NFC. Replaces internal helper used by addConstantComments to allow reuse in a future patch.	2024-02-08 12:42:33 +00:00
Shengchen Kan	e270ec47cd	[X86] X86InstrInfo.cpp - Remove dead code for memory folding, NFCI `commuteInstruction(MI, false, OpNum, CommuteOpIdx2)` should never create any new instruction, so we don't need to check and erase it.	2024-02-02 11:14:07 +08:00
Philip Reames	3ff7caea33	[TTI] Use Register in isLoadFromStackSlot and isStoreToStackSlot [nfc] (#80339 )	2024-02-01 17:52:35 -08:00
Shengchen Kan	c82a645ef2	[X86][NFC] Simplify the code for memory fold	2024-02-01 13:43:25 +08:00
Shengchen Kan	e3c9327bc4	[X86][CodeGen] Set isReMaterializable = 1 for AVX broadcast load Broadcast of a single float should not be any slower than loading 32B using vmovaps. So remat it can help reduce register spill when there is big register pressure.	2024-01-31 20:55:56 +08:00
Kazu Hirata	5d7a0a734a	[X86] Use a range-based for loop (NFC)	2024-01-30 22:12:05 -08:00
Shengchen Kan	8e77390c06	[X86][CodeGen] Support folding memory broadcast in X86InstrInfo::foldMemoryOperandImpl (#79761 )	2024-01-31 12:51:03 +08:00
Shengchen Kan	2960656eb9	[X86][NFC] Extract code for commute in foldMemoryOperandImpl into functions To share code for folding broadcast in #79761	2024-01-31 00:09:08 +08:00
Shengchen Kan	02a275cca1	[X86][CodeGen] Add entries for TB_BCAST_SH in getBroadcastOpcode	2024-01-30 21:01:31 +08:00
Shengchen Kan	f28430d577	[X86][CodeGen] Add entries for TB_BCAST_W in getBroadcastOpcode and fix typo	2024-01-30 01:03:32 +08:00
Shengchen Kan	169553688c	[X86][NFC] Remove TB_FOLDED_BCAST and format code in X86InstrFoldTables.cpp	2024-01-30 00:27:16 +08:00
Shengchen Kan	7089c012ec	[X86][NFC] Replace if-else with switch-case in X86InstrInfo::foldMemoryOperandImpl	2024-01-28 10:30:26 +08:00
Shengchen Kan	6754b5428e	[X86][NFC] AnalyzeBranchImpl -> analyzeBranchImpl and remove duplicated comments in X86InstrInfo.h	2024-01-28 09:54:31 +08:00
Shengchen Kan	035f33bf41	[X86][CodeGen] Add NDD entries for X86InstrInfo::foldImmediate	2024-01-26 22:11:57 +08:00
Shengchen Kan	550f0eb2ce	[NFC] Rename TargetInstrInfo::FoldImmediate to TargetInstrInfo::foldImmediate and simplify implementation for X86	2024-01-26 20:50:58 +08:00
Shengchen Kan	821dee9852	[X86][CodeGen] Add NDD entries for isAssociativeAndCommutative	2024-01-26 18:39:52 +08:00
Shengchen Kan	33ecef9812	[X86][CodeGen] Fix crash when commute operands of Instruction for code size (#79245 ) Reported in 134fcc62786d31ab73439201dce2d73808d1785a Incorrect opcode is used b/c there is a `[[fallthrough]]` at line 2386.	2024-01-24 17:10:28 +08:00
Shengchen Kan	71d64ed80f	[X86][Peephole] Add NDD entries for EFLAGS optimization	2024-01-24 15:47:58 +08:00
Shengchen Kan	f7b61f81b5	[X86][CodeGen] Transform NDD SUB to CMP if dest reg is dead (#79135 )	2024-01-24 13:58:48 +08:00
Anatoly Trosinenko	10bd69a4f7	[MachineOutliner] Refactor iterating over Candidate's instructions (#78972 ) Make Candidate's front() and back() functions return references to MachineInstr and introduce begin() and end() returning iterators, the same way it is usually done in other container-like classes. This makes it possible to iterate over the instructions contained in Candidate the same way one can iterate over MachineBasicBlock (note that begin() and end() return bundled iterators, just like MachineBasicBlock does, but no instr_begin() and instr_end() are defined yet).	2024-01-23 17:21:40 +03:00
Shengchen Kan	66237d647e	[X86][CodeGen] Add entries for NDD SHLD/SHRD to the commuteInstructionImpl	2024-01-23 17:05:09 +08:00
Shengchen Kan	134fcc6278	[X86][NFC] Simplify function X86InstrInfo::commuteInstructionImpl	2024-01-23 16:32:32 +08:00
Simon Pilgrim	4e64ed9780	[X86] Update X86::getConstantFromPool to take base OperandNo instead of Displacement MachineOperand This allows us to check the entire constant address calculation, and ensure we're not performing any runtime address math into the constant pool (noticed in an upcoming patch).	2024-01-22 15:40:45 +00:00
XinWang10	dd6fec5d4f	[X86][APX]Support lowering for APX promoted AMX-TILE instructions (#78689 ) The enc/dec of promoted AMX-TILE instructions have been supported in https://github.com/llvm/llvm-project/pull/76210. This patch supports lowering for promoted AMX-TILE instructions and integrates the tests into existing tests.	2024-01-22 11:33:23 +08:00
Simon Pilgrim	d12dffacaa	[X86] Add X86::getConstantFromPool helper function to replace duplicate implementations. We had the same helper function in shuffle decode / vector constant code - move this to X86InstrInfo to avoid duplication.	2024-01-18 11:59:46 +00:00
Shengchen Kan	199117ae09	[X86] Fix error: unused variable 'isMemOp' after #78019 , NFCI BTW, I adjust the code by LLVM coding standards.	2024-01-16 13:14:55 +08:00
Jie Fu	d338d15243	[X86] Fix -Wunused-variable in X86InstrInfo.cpp (NFC) llvm-project/llvm/lib/Target/X86/X86InstrInfo.cpp:3467:14: error: unused variable 'isMemOp' [-Werror,-Wunused-variable] 3467 \| const auto isMemOp = [](const MCOperandInfo &OpInfo) -> bool { \| ^~~~~~~ 1 error generated.	2024-01-16 11:57:13 +08:00
Nicholas Mosier	855e863004	[X86] Add MI-layer routine for getting the index of the first address operand, NFC (#78019 ) Add the MI-layer routine X86::getFirstAddrOperandIdx(), which returns the index of the first address operand of a MachineInstr (or -1 if there is none). X86II::getMemoryOperandNo(), the existing MC-layer routine used to obtain the index of the first address operand in a 5-operand X86 memory reference, is incomplete: it does not handle pseudo-instructions like TCRETURNmi, resulting in security holes in the mitigation passes that use it (e.g., x86-slh and x86-lvi-load). X86::getFirstAddrOperandIdx() handles both pseudo and real instructions and is thus more suitable for most use cases than X86II::getMemoryOperandNo(), especially in mitigation passes like x86-slh and x86-lvi-load. For this reason, this patch replaces all uses of X86II::getMemoryOperandNo() with X86::getFirstAddrOperandIdx() in the aforementioned mitigation passes.	2024-01-16 10:55:00 +08:00
Kazu Hirata	a041da3109	[X86] Use range-based for loops (NFC)	2023-12-24 15:56:36 -08:00
Simon Pilgrim	bcee4a9363	[X86] Rename VPERMI2/VPERMT2 to VPERMI2Z/VPERMT2Z (#75192 ) Add missing AVX512 Z prefix to conform to the standard naming convention and simplify matching in X86FoldTablesEmitter::addBroadcastEntry etc.	2023-12-14 09:55:18 +00:00
Arthur Eubanks	843ea98437	[X86] Allow constant pool references under medium code model in X86InstrInfo::foldMemoryOperandImpl() (#75011 ) The medium code model assumes that the constant pool is referenceable with 32-bit relocations.	2023-12-11 19:00:56 -08:00
Arthur Eubanks	687e63a2bd	[X86] Allow accessing large globals in small code model (#74785 ) This removes some assumptions that the small code model will only reference "near" globals. There are still some missing optimizations and wrong code sequences, but I'd like to address those separately. This will require auditing any checks of the code model in the X86 backend.	2023-12-08 11:09:54 -08:00
Matt Arsenault	546a9ce80c	CodeGen: Fix bypassing legality checks for IMPLICIT_DEF rematerialization (#73934 ) It's permitted to have extra implicit-def operands of the same main register after the main register def. If there are implicit operands, use the standard legality checks which verify the operand contents. Depends #73933	2023-12-06 21:43:19 +07:00
Simon Pilgrim	56eb3e738a	[X86] Set x87 fld1/fldz pseudo instructions as rematerializable (#74592 ) No need to generate/spill/restore to cpu stack Cleanup work to allow us to properly use isFPImmLegal and fix some regressions encountered while looking at #74304	2023-12-06 14:36:42 +00:00
Shengchen Kan	68d6fe508c	[X86][CodeGen] Prefer KMOVkk_EVEX than KMOVkk when EGPR is supported (#74048 ) In memory fold table, we have ``` {X86::KMOVDkk, X86::KMOVDkm, 0}, {X86::KMOVDkk_EVEX, X86::KMOVDkm_EVEX, 0} ``` where `KMOVDkm_EVEX` can use EGPR as base and index registers, while `KMOVDkm` can't. Hence, though `KMOVkk` does not have any GPR operands, we prefer to use `KMOVDkk_EVEX` to help register allocation. It will be compressed to `KMOVDkk` in EVEX2VEX pass if memory folding does not happen.	2023-12-02 22:43:02 +08:00
Shengchen Kan	e017169dbd	[X86][NFC] Extract ReplaceableInstrs to a separate file and clang-format X86InstrInfo.cpp	2023-12-01 15:21:38 +08:00
Shengchen Kan	511ba45a47	[X86][MC][CodeGen] Support EGPR for KMOV (#73781 ) KMOV is essential for copy between k-registers and GPRs. R16-R31 was added into GPRs in #70958, so we extend KMOV for these new registers first. This patch 1. Promotes KMOV instructions from VEX space to EVEX space 2. Emits prefix {evex} for the EVEX variants 3. Prefers EVEX variant than VEX variant in ISEL and optimizations for better RA EVEX variants will be compressed to VEX variants by existing EVEX2VEX pass if no EGPR is used. RFC: https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031/4 TAG: llvm-test-suite && CPU2017 can be built with feature egpr successfully.	2023-11-30 16:13:51 +08:00
Nick Desaulniers	b053359892	[X86InstrInfo] support memfold on spillable inline asm (#70832 ) This enables -regalloc=greedy to memfold spillable inline asm MachineOperands. Because no instruction selection framework marks MachineOperands as spillable, no language frontend can observe functional changes from this patch. That will change once instruction selection frameworks are updated. Link: https://github.com/llvm/llvm-project/issues/20571	2023-11-29 08:18:51 -08:00
Shengchen Kan	bafa51c8a5	[X86] Rename X86MemoryFoldTableEntry to X86FoldTableEntry, NFCI b/c it's used for element that folds a load, store or broadcast.	2023-11-28 19:49:14 +08:00
Craig Topper	a845061935	[AArch64] Use the same fast math preservation for MachineCombiner reassociation as X86/PowerPC/RISCV. (#72820 ) Don't blindly copy the original flags from the pre-reassociated instructions. This copied the integer poison flags which are not safe to preserve after reassociation. For the FP flags, I think we should only keep the intersection of the flags. Override setSpecialOperandAttr to do this. Fixes #72777.	2023-11-22 14:17:45 -08:00
Alex Bradbury	5b3eb1bc22	[ARM][X86][NFC] Use lambda to avoid duplicate switches in areLoadsFromSameBasePtr (#72376 ) Both the Arm and X86 implementations of areLoadsFromSameBasePtr use a switch over the machine opcode, and repeat the same logic for both SDNode operands. We can avoid the duplicated logic (especially lengthy in the X86 case) by just using a lambda. This could obviously be a candidate for moving out to a separate helper function if there were other users, but I've made the minimal change in this patch.	2023-11-15 12:35:35 +00:00
Shengchen Kan	c9017bc793	[X86] Support EGPR (R16-R31) for APX (#70958 ) 1. Map R16-R31 to DWARF registers 130-145. 2. Make R16-R31 caller-saved registers. 3. Make R16-31 allocatable only when feature EGPR is supported 4. Make R16-31 available for instructions in legacy maps 0/1 and EVEX space, except XSAVE*/XRSTOR RFC: https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031/4 Explanations for some seemingly unrelated changes: inline-asm-registers.mir, statepoint-invoke-ra-enter-at-end.mir: The immediate (TargetInstrInfo.cpp:1612) used for the regdef/reguse is the encoding for the register class in the enum generated by tablegen. This encoding will change any time a new register class is added. Since the number is part of the input, this means it can become stale. seh-directive-errors.s: R16-R31 makes ".seh_pushreg 17" legal musttail-varargs.ll: It seems some LLVM passes use the number of registers rather than the number of allocatable registers as heuristic. This PR is to reland #67702 after #70222 in order to reduce some compile-time regression when EGPR is not used.	2023-11-09 23:39:40 +08:00

1667 Commits