llvm-project

Author	SHA1	Message	Date
CHIANG, YU-HSUN (Tommy Chiang, oToToT)	4a31af88a2	[MC][AArch64] Enable '+v8a' when nothing specified for MCSubtargetInfo Since D110065, the 'R' profile support is added to LLVM. It turns the `generic` cpu into the intersection of v8-a and v8-r. However, this causes some backward compatibility problems. The original patch makes the clang driver implicitly pass -march=armv8-a when only the triple is specified. Since it only applies to clang, other tools like llvm-objdump still face the backward compatibility problem. This patch applies the same idea to MC related tools by enabling the '+v8a' feature when nothing is specified (both CPU and FS are empty) for MCSubtargetInfo creation. This patch should fix PR53956. Reviewed by: labrinea Differential Revision: https://reviews.llvm.org/D124319	2022-04-29 04:53:22 +08:00
David Penry	fa49021c68	Revert "[CodeGen][ARM] Enable Swing Module Scheduling for ARM" This reverts commit 28d09bbbc3d09c912b54a4d5edb32cab7de32a6f while I investigate a buildbot failure.	2022-04-28 13:29:27 -07:00
Simon Pilgrim	ab17ed0723	[X86] Don't fold AND(SRL(X,Y),1) -> SETCC(BT(X,Y)) on BMI2 targets With BMI2 we have SHRX which is a lot quicker than regular x86 shifts. Fixes #55138	2022-04-28 21:28:16 +01:00
David Penry	28d09bbbc3	[CodeGen][ARM] Enable Swing Module Scheduling for ARM This patch permits Swing Modulo Scheduling for ARM targets and turns it on by default for the Cortex-M7. The t2Bcc instruction is recognized as a loop-ending branch. MachinePipeliner is extended by adding support for "unpipelineable" instructions. These instructions are those which contribute to the loop exit test; in the SMS papers they are removed before creating the dependence graph and then inserted into the final schedule of the kernel and prologues. Support for these instructions was not previously necessary because current targets supporting SMS have only supported it for hardware loop branches, which have no loop-exit-contributing instructions in the loop body. The current structure of the MachinePipeliner makes it difficult to remove/exclude these instructions from the dependence graph. Therefore, this patch leaves them in the graph, but adds a "normalization" method which moves them in the schedule to stage 0, which causes them to appear properly in kernel and prologues. It was also necessary to be more careful about boundary nodes when iterating across successors in the dependence graph because the loop exit branch is now a non-artificial successor to instructions in the graph. In addition, schedules with physical use/def pairs in the same cycle should be treated as creating an invalid schedule because the scheduling logic doesn't respect physical register dependence once scheduled to the same cycle. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D122672	2022-04-28 13:01:18 -07:00
David Tenty	8042699a30	[LLVM] Add exported visibility style for XCOFF For the AIX linker, under default options, global or weak symbols which have visibility bits set to zero (i.e. no visibility, similar to ELF default) are only exported if specified on an export list provided to the linker. So AIX has an additional visibility style called "exported" which indicates to the linker that the symbol should be explicitly globally exported. This change maps "dllexport" in the LLVM IR to correspond to XCOFF exported as we feel this best models the intended semantic (discussion on the discourse RFC thread: https://discourse.llvm.org/t/rfc-adding-exported-visibility-style-to-the-ir-to-model-xcoff-exported-visibility/61853) and allows us to enable writing this visibility for the AIX target in the assembly path. Reviewed By: DiggerLin Differential Revision: https://reviews.llvm.org/D123951	2022-04-28 14:56:00 -04:00
Alan Zhao	3333c28fc0	[llvm-ml] Improve indirect call parsing In MASM, if a QWORD symbol is passed to a jmp or call instruction in 64-bit mode or a DWORD or WORD symbol is passed in 32-bit mode, then MSVC's assembler recognizes that as an indirect call. Additionally, if the operand is qualified as a ptr, then that should also be an indirect call. Furthermore, in 64-bit mode, such operands are implicitly rip-relative (in fact, MSVC's assembler ml64.exe does not allow explicitly specifying rip as a base register.) To keep this patch manageable, this patch does not include: * error messages for wrong operand types (e.g. passing a QWORD in 32-bit mode) * resolving indirect calls if the symbol is declared after its first use (llvm-ml currently only runs a single pass). * implementing the extern keyword (required to resolve https://crbug.com/762167.) This patch is likely missing a bunch of edge cases, so please do point them out in the review. Reviewed By: epastor, hans, MaskRay Committed By: epastor (on behalf of ayzhao) Differential Revision: https://reviews.llvm.org/D124413	2022-04-28 13:17:19 -04:00
Simon Pilgrim	a9215ed9cc	[InstCombine][X86] simplifyDemandedVectorEltsIntrinsic - handle avx2 per-element vector shifts	2022-04-28 18:14:54 +01:00
Alexey Bataev	75e1cf4a6a	[COST]Improve cost model for shuffles in SLP. Introduced masks where they are not added and improved target dependent cost models to avoid returning incorrect cost results after adding masks. Differential Revision: https://reviews.llvm.org/D100486	2022-04-28 10:04:41 -07:00
Craig Topper	ec11fbb1d6	[RISCV] Use default promotion for (i32 (shl 1, X)) on RV64 when Zbs is enabled. This improves opportunities to use bset/bclr/binv. Unfortunately, there are no W versions of these instructions so this isn't always a clear win. If we use SLLW we get free sign extend and shift masking, but need to put a 1 in a register and can't remove an or/xor. If we use bset/bclr/binv we remove the immediate materialization and logic op, but might need a mask on the shift amount and sext.w. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D124096	2022-04-28 09:58:30 -07:00
Simon Pilgrim	9e3b7e8e65	[X86] getTargetVShiftByConstNode - use SelectionDAG::FoldConstantArithmetic to perform constant folding. NFCI. Remove some unnecessary code duplication.	2022-04-28 17:10:20 +01:00
Craig Topper	8631a5e712	[RISCV] Fix alias printing for vmnot.m By clearing the HasDummyMask flag from mask register binary operations and mask load/store. HasDummyMask was causing an extra operand to get appended when converting from MachineInstr to MCInst. This extra operand doesn't appear in the assembly string so was mostly ignored, but it prevented the alias instruction printing from working correctly. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D124424	2022-04-28 08:33:52 -07:00
Alexey Bataev	9861ca0c23	Revert "[COST]Improve cost model for shuffles in SLP." This reverts commit 29a470e3804ca216d4e76c88a38086eb61c200f9 to fix a crash reported in https://reviews.llvm.org/D100486#3479989.	2022-04-28 08:11:56 -07:00
Simon Pilgrim	de7cee24b6	[X86] getBT - attempt to peek through aext(and(trunc(x),c)) mask/modulo Ideally we'd fold this with generic DAGCombiner, but that only works for !isTruncateFree cases - we might be able to adapt IsDesirableToPromoteOp to find truncated src ops in the future, but for now just use this peephole. Noticed in Issue #55138	2022-04-28 16:10:26 +01:00
Simon Pilgrim	ed8dffef4c	[X86] getFauxShuffle - don't assume an UNDEF src element for AND/ANDNP results in an UNDEF shuffle mask index The other src element might be zero, guaranteeing zero. Fixes #55157	2022-04-28 12:32:58 +01:00
Ties Stuij	051deb2d9d	[ARM] add Armv9 build attribute The build attribute number can be found in the Arm ABI addenda32 document: https://github.com/ARM-software/abi-aa/blob/main/addenda32/addenda32.rst#335target-related-attributes Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D124090	2022-04-28 10:48:26 +01:00
Lian Wang	dc0ae8ce18	[RISCV] Support VP_SETCC mask operations Support VP_SETCC mask operations by turning them into logical operations. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D124438	2022-04-28 08:52:29 +00:00
Luo, Yuanke	942ec5c36d	[X86][AMX] combine tile cast and load/store instruction. The `llvm.x86.cast.tile.to.vector` intrinsic is lowered to `llvm.x86.tilestored64.internal` and `load <256 x i32>`. The `llvm.x86.cast.vector.to.tile` intrinsic is lowered to `store <256 x i32>` and `llvm.x86.tileloadd64.internal`. When `llvm.x86.cast.tile.to.vector` is used by `store <256 x i32>`, or `load <256 x i32>` is used by `llvm.x86.cast.vector.to.tile`, they can be combined into `llvm.x86.tilestored64.internal` and `llvm.x86.tileloadd64.internal`, respectively. Differential Revision: https://reviews.llvm.org/D124378	2022-04-28 14:55:21 +08:00
Liqin.Weng	6365bde658	[XCORE][CodeGen][NFC] Use ArrayRef in TargetLowering functions Reviewed By: nigelp-xmos Differential Revision: https://reviews.llvm.org/D123661	2022-04-28 02:06:46 +00:00
Shengchen Kan	6a6b0e4a63	[X86] Check the address in machine verifier 1. The scale factor must be 1, 2, 4, or 8. 2. The displacement must fit in a 32-bit signed integer. Noticed by: https://github.com/llvm/llvm-project/issues/55091 Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D124455	2022-04-28 10:05:39 +08:00
Bill Wendling	8f2ec974d1	[X86] Move target-generic code into CodeGen [NFC] This code is the same for all platforms. Differential Revision: https://reviews.llvm.org/D124566	2022-04-27 15:37:28 -07:00
Simon Pilgrim	e378577524	[X86] Use is128BitLaneRepeatedShuffleMask wrapper. NFC. We don't need to know the actual repeated mask.	2022-04-27 21:09:57 +01:00
Craig Topper	c2614b31d9	[RISCV] Add isCommutable to scalar FMA instructions. The default implementation of findCommutedOpIndices picks the first two source operands. That's exactly what we want for the scalar FMA instructions. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D124463	2022-04-27 11:07:18 -07:00
Alexey Bataev	29a470e380	[COST]Improve cost model for shuffles in SLP. Introduced masks where they are not added and improved target dependent cost models to avoid returning incorrect cost results after adding masks. Differential Revision: https://reviews.llvm.org/D100486	2022-04-27 10:56:26 -07:00
Chris Bieneman	05b765ff69	[DXIL] [NFC] Remove dead attribute code paths DXIL doesn't support attributes added after LLVM 3.7. The DXILPrepare pass removes those attributes so they should never be present by the time we reach the DXIL bitcode writer. In the event that we somehow try to write a newer attribute in the DXIL writer, we should fail hard (crash), because the output would be invalid. This case should only be possible if the DXIL writer were called without DXILPrepare being run first, which shouldn't be possible. This patch also adds a default case to the switch statement over the attribute list which covers all the removed cases and any new attribute kinds that may be added in the future. The default case is handled like other unsupported cases by a call to llvm_unreachable.	2022-04-27 10:46:59 -05:00
Simon Pilgrim	03482bccad	[X86] collectConcatOps - add ability to collect from vector 'widening' patterns Recognise insert_subvector(undef, x, lo/hi) patterns where we double the width of a vector - creating an UNDEF subvector on the fly.	2022-04-27 15:38:58 +01:00
David Green	46cef9a82d	[AArch64] Attempt to fix bots by ensuring legalized type is a vector	2022-04-27 15:36:15 +01:00
Ivan Kosarev	6ddf2a824d	[AMDGPU] Adjust wave priority based on VMEM instructions to avoid duty-cycling. As older waves execute long sequences of VALU instructions, this may prevent younger waves from calculating addresses and then issuing their VMEM loads, which in turn leaves the VALU unit idle. This patch tries to prevent this by temporarily raising the wave's priority. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D124246	2022-04-27 14:37:18 +01:00
David Green	8e2a0e61f5	[AArch64] Break up larger shuffle-masks into legal sizes in getShuffleCost Given a larger-than-legal shuffle mask, the final codegen will split into multiple sub-vectors. This attempts to model that in AArch64TTIImpl::getShuffleCost, splitting masks up according to the size of the legalized vectors. If the sub-masks have at most 2 input sources we can call getShuffleCost on them and sum the costs, to get a more accurate final cost for the entire shuffle. The call to improveShuffleKindFromMask helps to improve the shuffle kind for the sub-mask cost call. Differential Revision: https://reviews.llvm.org/D123414	2022-04-27 13:51:50 +01:00
David Green	d6327050e0	[AArch64] Use PerfectShuffle costs in AArch64TTIImpl::getShuffleCost Given a shuffle with 4 elements size 16 or 32, we can use the costs directly from the PerfectShuffle tables to get a slightly more accurate cost for the resulting shuffle. Differential Revision: https://reviews.llvm.org/D123409	2022-04-27 12:09:01 +01:00
Liqin.Weng	ca3cd345a0	[MIPS][SelectionDAG] Enable TargetLowering::hasBitTest for masks that fit in ANDI. Reviewed By: sdardis Differential Revision: https://reviews.llvm.org/D123577	2022-04-27 09:03:14 +00:00
Jim Lin	9de7b93bc0	[RISCV][NFC] Update and add missing closed curly bracket comment in RISCVInstrInfoZb.td	2022-04-27 15:08:51 +08:00
ShihPo Hung	6b55f133fb	[RISCV][RVV] Select unmasked TU RVV pseudos in a DAG post-process Following D118810 that reduced the size of ISel table, this patch optimizes all-ones-masked RVV pseudos with TU policy and swaps them out for their unmasked TU pseudos. Since the UNDEF merge operand is not preserved, we turn it into a TA pseudo regardless of the policy operand. Reviewed By: craig.topper, frasercrmck Differential Revision: https://reviews.llvm.org/D121881	2022-04-26 20:14:54 -07:00
Stanislav Mekhanoshin	6a24e37219	[AMDGPU] Remove now unused variable HasLdsModifier. NFC.	2022-04-26 17:49:30 -07:00
Stanislav Mekhanoshin	0274811b5a	[AMDGPU] Add both mayLoad and mayStore to MUBUF LDS opcodes Differential Revision: https://reviews.llvm.org/D124483	2022-04-26 17:30:24 -07:00
Stanislav Mekhanoshin	00d84a9f92	[AMDGPU] Remove vdata from buffer to lds load Differential Revision: https://reviews.llvm.org/D124485	2022-04-26 17:16:26 -07:00
Stanislav Mekhanoshin	a9ccc7bc54	[AMDGPU] Properly mark MUBUF and FLAT LDS DMA instructions. NFC. Add these bits to the MUBUF and FLAT LDS DMA instructions: - LGKM_CNT - these operate on LDS; - VALU - SPG 3.9.8: This instruction acts as both a MUBUF and VALU instruction; Codegen currently does not produce any of this, so the change is NFC. Differential Revision: https://reviews.llvm.org/D124472	2022-04-26 14:20:26 -07:00
Vasileios Porpodas	fa8a9fea47	Recommit "[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`" This reverts commit 6a9bbd9f20dcd700e28738788bb63a160c6c088c. Code review: https://reviews.llvm.org/D124202	2022-04-26 14:02:40 -07:00
Andrew Savonichev	0a27622a1d	[NVPTX] Disable DWARF .file directory for PTX Default behavior for .file directory was changed in D105856, but ptxas (CUDA 11.5 release) refuses to parse it: $ llc -march=nvptx64 llvm/test/DebugInfo/NVPTX/debug-file-loc.ll $ ptxas debug-file-loc.s ptxas debug-file-loc.s, line 42; fatal : Parsing error near '"foo.h"': syntax error Added a new field to MCAsmInfo to control default value of UseDwarfDirectory. This value is used if -dwarf-directory command line option is not specified. Differential Revision: https://reviews.llvm.org/D121299	2022-04-26 21:40:36 +03:00
Vasileios Porpodas	6a9bbd9f20	Revert "[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`" This reverts commit 55ce296d6f217fd0defed2592ff7b74b79b2c1f0.	2022-04-26 11:25:26 -07:00
Vasileios Porpodas	55ce296d6f	[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost` Before this patch `Args` was used to pass a broadcast's arguments by SLP. This patch changes this. `Args` is now used for passing the operands of the shuffle. Differential Revision: https://reviews.llvm.org/D124202	2022-04-26 11:11:29 -07:00
Shao-Ce SUN	c59473aacc	[NFC][RISCV][CodeGen] Use ArrayRef in TargetLowering functions Based on D123467. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D123653	2022-04-26 23:53:00 +08:00
Chris Bieneman	69c66bb211	[SPIRV][NFC] Remove unused variable This removes an unused local variable that was causing a warning to be emitted.	2022-04-26 10:17:36 -05:00
Chris Bieneman	8631c11590	[SPIRV][NFC] Fix warnings for switch cases Switch statements that cover all cases should not have a `default` case. When a switch covers all cases and includes a `default` case, clang emits a diagnostic. Omitting the `default` case allows the compiler to instead emit a diagnostic on unhandled enum values. This change removes default cases from all the places that they shouldn't be, and adds a missing enum case for one switch statement that wasn't covering all values.	2022-04-26 09:57:18 -05:00
Chris Bieneman	500d677f1d	[SPIRV][NFC] Fix warning on class/struct mismatch Clang issues a warning on class/struct mismatch because the MSVC ABI varies for classes and structs.	2022-04-26 09:51:46 -05:00
Xiang1 Zhang	c430f0f532	[X86] Add use condition for combineSetCCMOVMSK Reviewed by RKSimon, LuoYuanke Differential Revision: https://reviews.llvm.org/D123652	2022-04-26 16:42:50 +08:00
Luo, Yuanke	f3ad7ea03a	[X86][AMX] Report error when shapes are not pre-defined. Instead of reporting a fatal error, this patch emits an error message and exits when shapes are not pre-defined. This causes compilation to fail rather than crash. Differential Revision: https://reviews.llvm.org/D124342	2022-04-26 14:57:25 +08:00
Chris Bieneman	3143840f21	NFC. Add missing DXILPointerTyID case This resolves -Werror Hexagon build failures.	2022-04-25 20:08:33 -05:00
Chris Bieneman	e6f44a3cd2	Add PointerType analysis for DirectX backend As implemented this patch assumes that Typed pointer support remains in the llvm::PointerType class, however this could be modified to use a different subclass of llvm::Type that could be disallowed from use in other contexts. This does not rely on inserting typed pointers into the Module, it just uses the llvm::PointerType class to track and unique types. Fixes #54918 Reviewed By: kuhar Differential Revision: https://reviews.llvm.org/D122268	2022-04-25 17:49:43 -05:00
Jakub Chlanda	76d1f5eaa8	[NVPTX] Support float <-> 2 x half bitcasts Make sure NVPTX backend can handle bitcasting between `float` and `<2 x half>` types. This was discovered through: https://github.com/intel/llvm/issues/5969 I'm not suggesting that such bitcasts make much sense, but it feels like the compiler should not hard crash on them. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D124171	2022-04-25 14:37:41 -07:00
Craig Topper	40f1af4760	[RISCV] Add isCommutable to ADD/ADDW/MUL/AND/OR/XOR/MIN/MAX/CLMUL Reviewed By: reames Differential Revision: https://reviews.llvm.org/D123970	2022-04-25 10:53:41 -07:00

67064 Commits