llvm-project

Author	SHA1	Message	Date
Phoebe Wang	da1eb886c4	[X86] Do not check alignment for VINSERTPS (#65721) We don't have alignment constraints in AVX instructions.	2023-09-08 19:23:43 +08:00
Daniel Hoekwater	ca72b0a709	[CodeGen] Use the TII hook for Noop insertion in BBSections (NFC) Refactor BasicBlockSections to use the target-specific noop insertion hook from TargetInstrInfo instead of building it ourselves. Using the TII hook is both cleaner and makes it easier to extend BBSections to non-X86 targets. Differential Revision: https://reviews.llvm.org/D158303	2023-08-18 19:40:11 +00:00
Shengchen Kan	fda9a9c61e	[X86][Codegen] Remove dead code for ADCX/ADOX There is no pattern for ADCX/ADOX and they are never selected during ISEL. So we remove the cases in some MIR optimizations in this patch. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D157717	2023-08-14 10:23:42 +08:00
Sander de Smalen	bbb95893de	[TII] NFCI: Simplify the interface for isTriviallyReMaterializable Currently `isTriviallyReMaterializable` calls `isReallyTriviallyReMaterializable` and `isReallyTriviallyReMaterializableGeneric`. The two interfaces are confusing, but there are also some real issues with this. The documentation of this function (see below) suggests that `isReallyTriviallyRematerializable` allows the target to override the default behaviour. /// For instructions with opcodes for which the M_REMATERIALIZABLE flag is /// set, this hook lets the target specify whether the instruction is actually /// trivially rematerializable, taking into consideration its operands. It however implements something different. The default behaviour is the analysis done in `isReallyTriviallyReMaterializableGeneric`, which is testing if it is safe to rematerialize the MachineInstr. The result of `isReallyTriviallyReMaterializable` is only considered if `isReallyTriviallyReMaterializableGeneric` returns `false`. That means there is no way to override the default behaviour if `isReallyTriviallyReMaterializableGeneric` returns true (i.e. it is safe to rematerialize, but we'd rather not). By making this a single interface, we can override the interface to do either. Reviewed By: craig.topper, nemanjai Differential Revision: https://reviews.llvm.org/D156520	2023-08-07 13:01:06 +00:00
Matt Arsenault	c26dfc81e2	[HACK] X86: Disable isCopyInstrImpl for undef subregister defs This is a workaround for a coalescer bug where coalescing SUBREG_TO_REG ends up losing the liveness of the high bits of the source register. The result is an incorrect undef subregister def instead of preserving the high values. Work around the observed failure after the resulting mov is eliminated during allocation until a proper fix is ready. I believe the proper fix is to make SUBREG_TO_REG use a tied operand. The test should catch a regression originally observed after b7836d856206ec39509d42529f958c920368166b and should not show a difference after a496c8be6e638ae58bb45f13113dbe3a4b7b23fd is reverted. https://reviews.llvm.org/D156164	2023-07-28 13:33:28 -04:00
Freddy Ye	1c154bd755	[X86] Add AVX-VNNI-INT16 instructions. For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Reviewed By: pengfei, skan Differential Revision: https://reviews.llvm.org/D155145	2023-07-20 14:31:16 +08:00
XinWang10	2d6a5ab5eb	[X86]Recommit D154193 - Remove TEST in AND32ri+TEST16rr in peephole-opt Previously we removed a pattern like: %reg = and32ri %in_reg, 5 ... // EFLAGS not changed. %src_reg = subreg_to_reg 0, %reg, %subreg.sub_index test64rr %src_reg, %src_reg, implicit-def $eflags We can remove test64rr since it has the same functionality as the and; subreg_to_reg blocked the optimization in the earlier code, so we handle this case specially. The following case can be optimized for the same reason: %reg = and32ri %in_reg, 5 ... // EFLAGS not changed. %src_reg = copy %reg.sub_16bit:gr32 test16rr %src_reg, %src_reg, implicit-def $eflags The COPY from gr32 to gr16 blocked the optimization in the earlier code too, so we handle it specially, as we did for test64rr. Reviewed By: skan Differential Revision: https://reviews.llvm.org/D154193	2023-07-14 03:42:42 -04:00
Wang, Xin10	284a059b33	Revert "[X86]Remove TEST in AND32ri+TEST16rr in peephole-opt" This reverts commit 2c64226d84174dd1d9f93e1884c1b0bd432f89b5. revert first due to buildbot fail https://lab.llvm.org/buildbot/#/builders/85/builds/17571	2023-07-10 03:20:11 -04:00
XinWang10	2c64226d84	[X86]Remove TEST in AND32ri+TEST16rr in peephole-opt Previously we removed a pattern like: %reg = and32ri %in_reg, 5 ... // EFLAGS not changed. %src_reg = subreg_to_reg 0, %reg, %subreg.sub_index test64rr %src_reg, %src_reg, implicit-def $eflags We can remove test64rr since it has the same functionality as the and; subreg_to_reg blocked the optimization in the earlier code, so we handle this case specially. The following case can be optimized for the same reason: %reg = and32ri %in_reg, 5 ... // EFLAGS not changed. %src_reg = copy %reg.sub_16bit:gr32 test16rr %src_reg, %src_reg, implicit-def $eflags The COPY from gr32 to gr16 blocked the optimization in the earlier code too, so we handle it specially, as we did for test64rr. Reviewed By: skan Differential Revision: https://reviews.llvm.org/D154193	2023-07-09 23:21:32 -04:00
David Green	2802739dfd	[NFC] Replace ;; with ;	2023-06-11 10:25:24 +01:00
Dávid Bolvanský	09515f2c20	[SDAG] Preserve unpredictable metadata, teach X86CmovConversion to respect this metadata Sometimes a developer would like to have more control over cmov vs branch. We have unpredictable metadata in LLVM IR, but currently it is ignored by the X86 backend. Propagate this metadata and avoid cmov->branch conversion in X86CmovConversion for cmov with this metadata. Example: ``` int MaxIndex(int n, int *a) { int t = 0; for (int i = 1; i < n; i++) { // cmov is converted to branch by X86CmovConversion if (a[i] > a[t]) t = i; } return t; } int MaxIndex2(int n, int *a) { int t = 0; for (int i = 1; i < n; i++) { // cmov is preserved if (__builtin_unpredictable(a[i] > a[t])) t = i; } return t; } ``` Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D118118	2023-06-01 20:56:44 +02:00
Shengchen Kan	f603809637	[X86] Move encoding optimization for PUSH32i, PUSH64i to MC lowering, NFCI	2023-05-20 17:59:43 +08:00
Shengchen Kan	89ca4eb002	[X86][NFC] Correct the instruction names for PUSH16i, PUSH32i Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D151012	2023-05-20 17:33:42 +08:00
Shengchen Kan	0d9b36ce7d	[X86] Remove patterns for IMUL with immediate 8 and optimize during MC lowering, NFCI	2023-05-20 11:14:03 +08:00
Shengchen Kan	c81a121f3f	Revert "Revert "[X86] Remove patterns for ADC/SBB with immediate 8 and optimize during MC lowering, NFCI"" This reverts commit cb16b33a03aff70b2499c3452f2f817f3f92d20d. In fact, the test https://bugs.chromium.org/p/chromium/issues/detail?id=1446973#c2 already passed after 5586bc539acb26cb94e461438de01a5080513401	2023-05-19 22:21:56 +08:00
Hans Wennborg	cb16b33a03	Revert "[X86] Remove patterns for ADC/SBB with immediate 8 and optimize during MC lowering, NFCI" This caused compiler assertions, see comment on https://reviews.llvm.org/D150107. This also reverts the dependent follow-up change: > [X86] Remove patterns for ADD/AND/OR/SUB/XOR/CMP with immediate 8 and optimize during MC lowering, NFCI > > This is follow-up of D150107. > > In addition, the function `X86::optimizeToFixedRegisterOrShortImmediateForm` can be > shared with project bolt and eliminates the code in X86InstrRelaxTables.cpp. > > Differential Revision: https://reviews.llvm.org/D150949 This reverts commit 2ef8ae134828876ab3ebda4a81bb2df7b095d030 and 5586bc539acb26cb94e461438de01a5080513401.	2023-05-19 14:43:33 +02:00
Shengchen Kan	5586bc539a	[X86] Remove patterns for ADD/AND/OR/SUB/XOR/CMP with immediate 8 and optimize during MC lowering, NFCI This is follow-up of D150107. In addition, the function `X86::optimizeToFixedRegisterOrShortImmediateForm` can be shared with project bolt and eliminates the code in X86InstrRelaxTables.cpp. Differential Revision: https://reviews.llvm.org/D150949	2023-05-19 18:22:30 +08:00
Shengchen Kan	2ef8ae1348	[X86] Remove patterns for ADC/SBB with immediate 8 and optimize during MC lowering, NFCI This is follow-up of D150107.	2023-05-19 10:33:52 +08:00
Shengchen Kan	77589e945f	[X86] Remove patterns for shift/rotate with immediate 1 and optimize during MC lowering It was first suggested by @craig.topper in D150068. I think there are at least three pros 1. This can reduce the patterns during ISEL, as a result, reducing the bytes in X86GenDAGISel.inc 2. The patterns for shift/rotate with immediate 1 look quite similar to shift/rotate with immediate 8. So this can be seen as eliminating "duplicate" code. 3. Delay the optimization from imm8 to imm1, so that the previous optimization passes do not need to handle the version of imm1 It improves fast isel code and makes X86DomainReassignment work for shifts by 1, but regressed global isel, though no one should care. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D150107	2023-05-17 19:55:44 +08:00
Matthias Braun	b8817825b9	Support critical edge splitting for jump tables Add support for splitting critical edges coming from an indirect jump using a jump table ("switch jumps"). This introduces the `TargetInstrInfo::getJumpTableIndex` callback to allow targets to return an index into `MachineJumpTableInfo` for a given indirect jump. It also updates `MachineBasicBlock::SplitCriticalEdge` to allow splitting of critical edges by rewriting jump table entries. This is largely based on work done by Zhixuan Huan in D132202. Differential Revision: https://reviews.llvm.org/D140975	2023-05-10 20:30:52 -07:00
Sami Tolvanen	e9569748de	[CodeGen][KCFI] Move cfi-type lowering to TargetLowering KCFI machine function passes transform indirect calls with a cfi-type attribute into architecture-specific type checks bundled together with the calls. Instead of having a separate pass for each architecture, add a generic machine function pass for KCFI and move the architecture-specific code that emits the actual check to TargetLowering. This avoids unnecessary duplication and makes it easier to add KCFI support to other architectures. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D149915	2023-05-09 18:38:54 +00:00
Akshay Khadse	5c7c3af1d0	Reapply [Coverity] Fix explicit null dereferences This change fixes static code analysis errors Reviewed By: skan Differential Revision: https://reviews.llvm.org/D149506	2023-05-08 21:19:40 +08:00
Luo, Yuanke	40222ddcf8	[X86] Fix the vnni machine combine issue. The previous patch (D148980) didn't set the InstrIdxForVirtReg correctly in genAlternativeDpCodeSequence(). It causes vnni lit test failure when LLVM_ENABLE_EXPENSIVE_CHECKS is on.	2023-04-29 13:51:08 +08:00
Jie Fu	563e3028c9	[X86] Fix -Wstring-conversion in X86InstrInfo.cpp (NFC) /Users/jiefu/llvm-project/llvm/lib/Target/X86/X86InstrInfo.cpp:9794:12: error: implicit conversion turns string literal into bool: 'const char[25]' to 'bool' [-Werror,-Wstring-conversion] assert("It should not reach here"); ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~ /Applications/Xcode13.1/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX12.0.sdk/usr/include/assert.h:99:25: note: expanded from macro 'assert' (__builtin_expect(!(e), 0) ? __assert_rtn(__func__, __ASSERT_FILE_NAME, __LINE__, #e) : (void)0) ~ ^ 1 error generated.	2023-04-27 17:52:57 +08:00
Jie Fu	8de16131cb	[X86] Fix -Wsometimes-uninitialized in X86InstrInfo.cpp (NFC) /data/llvm-project/llvm/lib/Target/X86/X86InstrInfo.cpp:9793:3: error: variable 'MaddOpc' is used uninitialized whenever switch default is taken [-Werror,-Wsometimes-uninitialized] default: ^~~~~~~ /data/llvm-project/llvm/lib/Target/X86/X86InstrInfo.cpp:9854:25: note: uninitialized use occurs here Madd->setDesc(TII.get(MaddOpc)); ^~~~~~~ /data/llvm-project/llvm/lib/Target/X86/X86InstrInfo.cpp:9791:19: note: initialize the variable 'MaddOpc' to silence this warning unsigned MaddOpc; ^ = 0 /data/llvm-project/llvm/lib/Target/X86/X86InstrInfo.cpp:9793:3: error: variable 'AddOpc' is used uninitialized whenever switch default is taken [-Werror,-Wsometimes-uninitialized] default: ^~~~~~~ /data/llvm-project/llvm/lib/Target/X86/X86InstrInfo.cpp:9862:46: note: uninitialized use occurs here BuildMI(*MF, MIMetadata(Root), TII.get(AddOpc), DstReg) ^~~~~~ /data/llvm-project/llvm/lib/Target/X86/X86InstrInfo.cpp:9790:18: note: initialize the variable 'AddOpc' to silence this warning unsigned AddOpc; ^ = 0 2 errors generated.	2023-04-27 17:08:24 +08:00
Luo, Yuanke	8f7f9d86a7	[X86] Machine combine vnni instruction. "vpmaddwd + vpaddd" can be combined to vpdpwssd and the latency is reduced after combination. However when vpdpwssd is in a critical path the combination gets less ILP. It happens when vpdpwssd is in a loop, the vpmaddwd can be executed in parallel in multi-iterations while vpdpwssd has data dependency for each iteration. If vpaddd is in a critical path while vpmaddwd is not, it is profitable to split vpdpwssd into "vpmaddwd + vpaddd". This patch is based on the machine combiner framework to achieve a decision on "vpmaddwd + vpaddd" combination. The typical example code is as below. ``` __m256i foo(int cnt, __m256i c, __m256i b, __m256i *p) { for (int i = 0; i < cnt; ++i) { __m256i a = p[i]; __m256i m = _mm256_madd_epi16 (b, a); c = _mm256_add_epi32(m, c); } return c; } ``` Differential Revision: https://reviews.llvm.org/D148980	2023-04-27 16:42:04 +08:00
Tom Weaver	b63c08c773	Revert "[Coverity] Fix explicit null dereferences" This reverts commit 22b23a5213b57ce1834f5b50fbbf8a50297efc8a. This commit caused the following two build bots to start failing: https://lab.llvm.org/buildbot/#/builders/216/builds/20322 https://lab.llvm.org/buildbot/#/builders/123/builds/18511	2023-04-24 11:14:10 +01:00
Akshay Khadse	22b23a5213	[Coverity] Fix explicit null dereferences This change fixes static code analysis errors Reviewed By: skan Differential Revision: https://reviews.llvm.org/D148912	2023-04-23 12:07:11 +08:00
Shengchen Kan	92af50f41c	[X86][NFC] Fix for warning C4334: '<<': result of 32-bit shift implicitly converted to 64 bits	2023-04-06 17:13:10 +08:00
Shengchen Kan	ea91acda05	[X86][mem-fold] Simplify the logic and correct the comments for TB_ALIGN, NFCI	2023-04-06 16:38:30 +08:00
Amara Emerson	41e9c4b88c	[NFC][Outliner] Delete default ctors for Candidate & OutlinedFunction. I think it's good practice to avoid having default ctors unless they're really valid/useful. For OutlinedFunction the default ctor was used to represent a bail-out value for getOutliningCandidateInfo(), so I changed the API to return an optional<OutlinedFunction> instead which seems a tad cleaner. Differential Revision: https://reviews.llvm.org/D146375	2023-03-20 11:17:10 -07:00
duk	d61d591411	[MachineOutliner] Make getOutliningType partially target-independent The motivation behind this patch is to unify some of the outliner logic across architectures. This looks nicer in general and makes fixing [issues like this](https://reviews.llvm.org/D124707#3483805) easier. There are some notable changes here: 1. `isMetaInstruction()` is used directly instead of checking for specific meta-instructions like `IMPLICIT_DEF` or `KILL`. This was already done in the RISC-V implementation, but other architectures still did hardcoded checks. - As an exception to this, CFI instructions are explicitly delegated to the target because RISC-V has different handling for those. 2. `isTargetIndex()` checks are replaced with an assert; none of the architectures supported actually use `MO_TargetIndex` at this point in time. 3. `isCFIIndex()` and `isFI()` checks are also replaced with asserts, since these operands should not exist in [any context](https://reviews.llvm.org/D122635#3447214) at this stage in the pipeline. Reviewed by: paquette Differential Revision: https://reviews.llvm.org/D125072	2023-02-09 14:35:00 -05:00
Shengchen Kan	011e4abb49	[X86][MC][bugfix] Report error for mismatched modifier in inline asm and remove function getX86SubSuperRegisterOrZero ``` MCRegister getX86SubSuperRegister(MCRegister Reg, unsigned Size, bool High = false); ``` A strange behavior of the functions `getX86SubSuperRegister` was introduced by llvm-svn:145579: The returned register may not match the parameters when a 8-bit high register is required. And llvm-svn: 175762 refined the code and dropped the comments, then we knew nothing happened there from the code :-( These two functions are only called with `Size=8` and `High=true` in two places. One is in `X86FixupBWInsts.cpp` for liveness of registers and the other is in `X86AsmPrinter.cpp` for inline asm. For the first one, we provide an alternative in this patch. For the second one, the strange behaviour caused a bug where an error was not reported for a mismatched modifier. ``` void f() { char x; asm volatile ("mov %%ah, %h0" :"=r"(x)::"%eax", "%ebx", "%ecx", "%edx", "edi", "esi"); } ``` ``` $ gcc -S test.c error: extended registers have no high halves ``` ``` $ clang -S test.c no error ``` so we fix the bug in this patch. `getX86SubSuperRegister` is just a wrapper of `getX86SubSuperRegisterOrZero` with an `assert`. I believe we should remove the latter. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D142834	2023-02-02 10:08:56 +08:00
Bill Wendling	7d626e7cbb	[X86] Move RDFLAGS/WRFLAGS expansion until after RA The register allocator may introduce reloads in the middle of reading and writing the EFLAGS register, due to the RDFLAGS & WRFLAGS pseudos being expanded before RA. This may cause an issue where the stack pointer was adjusted but the stack offset for the reload wasn't accounted for (see [1]). To avoid this, expand these pseudos after register allocation. [1] https://github.com/llvm/llvm-project/issues/59102 Reviewed By: craig.topper, nickdesaulniers, pengfei Differential Revision: https://reviews.llvm.org/D140045	2023-01-30 15:32:16 -08:00
Kazu Hirata	5d3462162e	[X86] Use llvm::countr_zero instead of findFirstSet (NFC) At the call site of findFirstSet, ZMask | (1 << DstIdx) always has exactly 3 bits set, and they are all among the 4 least significant bits, so (ZMask | (1 << DstIdx)) ^ 15 has exactly one bit set. Since the argument to findFirstSet is nonzero, we can safely switch to llvm::countr_zero.	2023-01-24 23:26:08 -08:00
Kazu Hirata	caa99a01f5	Use llvm::popcount instead of llvm::countPopulation(NFC)	2023-01-22 12:48:51 -08:00
Craig Topper	79858d1908	[CodeGen][Target] Remove uses of Register::isPhysicalRegister/isVirtualRegister. NFC Use isPhysical/isVirtual methods.	2023-01-13 23:12:48 -08:00
serge-sans-paille	38818b60c5	Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ part Use deduction guides instead of helper functions. The only non-automatic changes have been: 1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t), (uint8_t)) 2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There were a few similar situations across the codebase. 3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated. 4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that). Per reviewers' comment, some useless makeArrayRef have been removed in the process. This is a follow-up to https://reviews.llvm.org/D140896 that introduced the deduction guides. Differential Revision: https://reviews.llvm.org/D140955	2023-01-05 14:11:08 +01:00
Christudasan Devadasan	b5efec4b27	[CodeGen] Additional Register argument to storeRegToStackSlot/loadRegFromStackSlot With D134950, targets get notified when a virtual register is created and/or cloned. Targets can do the needful with the delegate callback. AMDGPU propagates the virtual register flags maintained in the target file itself. They are useful to identify a certain type of machine operands while inserting spill stores and reloads. Since RegAllocFast spills the physical register itself, there is no way its virtual register can be mapped back to retrieve the flags. It can be solved by passing the virtual register as an additional argument. This argument has no use when the spill interfaces are called during the greedy allocator or even the PrologEpilogInserter and can pass a null register in such cases. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D138656	2022-12-17 11:55:34 +05:30
Anton Sidorenko	f8ed709345	[MachineCombiner] Extend reassociation logic to handle inverse instructions Machine combiner supports generic reassociation only of associative and commutative instructions, for example (A + X) + Y => (X + Y) + A. However, we can extend this generic support to handle patterns like (X + A) - Y => (X - Y) + A), where `-` is the inverse of `+`. This patch adds interface functions to process reassociation patterns of associative/commutative instructions and their inverse variants with minimal changes in backends. Differential Revision: https://reviews.llvm.org/D136754	2022-12-07 13:50:28 +03:00
Fangrui Song	b0df70403d	[Target] llvm::Optional => std::optional The updated functions are mostly internal with a few exceptions (virtual functions in TargetInstrInfo.h, TargetRegisterInfo.h). To minimize changes to LLVMCodeGen, GlobalISel files are skipped. https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-04 22:43:14 +00:00
Kazu Hirata	20cde15415	[Target] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 20:36:06 -08:00
Shengchen Kan	861f5dd688	[X86][NFC] Minor improvement in X86InstrInfo::optimizeCompareInstr Before this patch, the code enumerated `getCondFromBranch`, `getCondFromSETCC` and `getCondFromFromCMov` to get the condition code of a `MachineInstr`, and assigned the result to variable `OldCC` when `MI || IsSwapped || ImmDelta != 0` was satisfied. After this patch, the `if-else` structure is eliminated by using `getCondFromMI`. Since `OldCC` is only used when `MI || IsSwapped || ImmDelta != 0` is true, it is initialized with `getCondFromMI` directly outside the scope of `if` now. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D138349	2022-11-21 21:00:07 +08:00
Sami Tolvanen	7c96f61aaa	[X86][KCFI] Don't fold loads into indirect calls that need a KCFI check Avoid unnecessary folding as X86KCFIPass would have to unfold these anyway when emitting the KCFI_CHECK.	2022-11-18 21:55:41 +00:00
Bing1 Yu	5bc36c8cb4	[X86] Add necessary check isReg() when updating LiveVariables in convertToThreeAddress Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D137388	2022-11-10 21:12:00 +08:00
Freddy Ye	23f02693ec	[X86] Add AVX-VNNI-INT8 instructions. For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Reviewed By: pengfei, skan Differential Revision: https://reviews.llvm.org/D135938	2022-10-28 10:39:54 +08:00
Freddy Ye	0e720e6ada	[X86] Add AVX-IFMA instructions. For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Reviewed By: pengfei, skan Differential Revision: https://reviews.llvm.org/D135932	2022-10-28 09:42:30 +08:00
Guozhi Wei	d24c93cc41	[X86] Enable reassociation for ADD instructions ADD is an associative and commutative operation, so we can do reassociation for it. Differential Revision: https://reviews.llvm.org/D136396	2022-10-26 00:46:13 +00:00
Jay Foad	325927ffb9	[X86] Update LiveVariables in more cases in convertToThreeAddress Following on from D129634, this patch fixes more X86 CodeGen test failures with D129213 applied, which adds verification of LiveIntervals after the TwoAddressInstruction pass runs. These failures only showed up with LLVM_ENABLE_EXPENSIVE_CHECKS=ON which adds the equivalent of an implicit -verify-machineinstrs on all tests. Differential Revision: https://reviews.llvm.org/D136596	2022-10-25 09:21:51 +01:00
Joao Moreira	eac3e5c3fb	[X86] Do not emit JCC to __x86_indirect_thunk Clang may optimize conditional tailcall blocks with the following layout: cmp <condition> je tailcall_target ret When retpoline is in place, indirect calls are converted into direct calls to a retpoline thunk. When these indirect calls are tail calls, they may be subject to the above described optimization (there is no indirect JCC, but since now the jump is direct it can be made conditional). The above layout is non-ideal for the Linux kernel scenario because the branches into thunks may be patched back into indirect branches during runtime depending on the underlying CPU features, which would not be feasible if the binary is emitted with the optimized layout above. Thus, prevent clang from emitting it if CodeModel is Kernel. Feature request from the respective kernel mailing list: https://lore.kernel.org/llvm/Yv3uI%2FMoJVctmBCh@worktop.programming.kicks-ass.net/ Reviewed By: nickdesaulniers, pengfei Differential Revision: https://reviews.llvm.org/D134915	2022-10-06 11:09:24 -07:00


1609 Commits