llvm-project

Author	SHA1	Message	Date
Craig Topper	7fee58acf4	[RISCV] Update relax-per-target-feature.ll to use hexadecimal constants. NFC Needed after 3dde0d02568d31ae48b557c486a3ff4edb24199b	2023-12-14 21:08:01 -08:00
Saiyedul Islam	e21b7e2143	[AMDGPU][NFC] Check more autogenerated llc tests for COV5 (#75219 ) Regenerate a few more llc tests to check for COV5 instead of the default ABI version.	2023-12-15 10:27:49 +05:30
Craig Topper	2a21260ea8	[SelectionDAG] Use getVectorElementPointer in DAGCombiner::replaceStoreOfInsertLoad. (#74249) This ensures we clip the index to be in bounds of the vector we are inserting into. If the index is out of bounds, the result of the insert element is poison. If we don't clip the index, we can write memory that was not part of the original store. Fixes #74248 #75557.	2023-12-14 20:25:16 -08:00
Matt Arsenault	f4b5be1ecd	Reapply "RegisterCoalescer: Add implicit-def of super register when coalescing SUBREG_TO_REG" This reverts commit 69c4930aad9659ec6ab846c8e7124d6afe044b1e. See if this sticks after a few more coalescer assertions are fixed.	2023-12-15 10:51:47 +07:00
Jianjian Guan	3fe81410b2	[clang][RISCV] Change default abi with f extension but without d extension (#73489) Now we have default abi lp64 for rv64if and ilp32 for rv32if, which is different from riscv-gnu-toolchain. In `8e9fb09a0c/configure (L3385)`, when f is present but d is not, it prefers lp64f/ilp32f, not soft float. This patch tries to make their behaviors consistent.	2023-12-15 11:16:05 +08:00
Youngsuk Kim	f9304974cc	[llvm][NVPTX] Inform that 'DYNAMIC_STACKALLOC' is unsupported (#74684) Catch the unsupported path early and emit an informative error. Motivated by the following threads: * https://discourse.llvm.org/t/nvptx-problems-with-dynamic-alloca/70745 * #64017	2023-12-14 22:06:22 -05:00
Wang Yaduo	3dde0d0256	[RISCV] Support printing immediate of RISCV MCInst in hexadecimal format (#74053 ) Enable the llvm-objdump to disassemble the immediate of RISCV instruction in hexadecimal format with --print-imm-hex flag.	2023-12-15 10:13:20 +08:00
Arthur Eubanks	239a41e8f2	Re-Reland [X86] Respect code models more when determining if a global reference can fit in 32 bits (#75386 ) For non-GlobalValue references, the small and medium code models can use 32 bit constants. For GlobalValue references, use TargetMachine::isLargeGlobalObject(). Look through aliases for determining if a GlobalValue is small or large. Even the large code model can reference small objects with 32 bit constants as long as we're in no-pic mode, or if the reference is offset from the GOT. Original commit broke the build... First reland broke large PIC builds referencing small data since it was using GOTOFF as a 32-bit constant.	2023-12-14 14:12:37 -08:00
Jon Roelofs	b071b70317	[GlobalISel] Always direct-call IFuncs and Aliases (#74902 ) This is safe because for both cases, the use must be in the same TU as the definition, and they cannot be forward declared.	2023-12-14 14:58:20 -07:00
Jon Roelofs	640c1d3dd1	[llvm] Support IFuncs on Darwin platforms (#73686 ) ... by lowering them as lazy resolve-on-first-use symbol resolvers. Note that this is subtly different timing than on ELF platforms, where ifunc resolution happens at load time. Since ld64 and ld-prime don't support all the cases we need for these, we lower them manually in the AsmPrinter.	2023-12-14 14:40:52 -07:00
Jay Foad	3e6da3252f	[AMDGPU] Add GFX12 s_sleep_var instruction and intrinsic (#75499 )	2023-12-14 21:11:39 +00:00
Arthur Eubanks	15617d14f7	Revert "Reland [X86] Respect code models more when determining if a global reference can fit in 32 bits (#75386 )" This reverts commit ec92d74a0ef89b9dd46aee6ec8aca6bfd3c66a54. Breaks some compiler-rt tests, e.g. https://lab.llvm.org/buildbot/#/builders/37/builds/28834	2023-12-14 12:28:50 -08:00
Natalie Chouinard	e75f37fd64	[SPIR-V][NFC] Require asserts on 2 tests (#75087 ) These tests currently fail on asserts, so adding a REQUIRES to make sure they're skipped on builds with asserts disabled. Follow-up from #74849	2023-12-14 15:17:11 -05:00
Mirko Brkusanin	c6351b4cc9	[AMDGPU][NFC] Regenerate .mir test	2023-12-14 18:58:43 +01:00
Arthur Eubanks	ec92d74a0e	Reland [X86] Respect code models more when determining if a global reference can fit in 32 bits (#75386 ) For non-GlobalValue references, the small and medium code models can use 32 bit constants. For GlobalValue references, use TargetMachine::isLargeGlobalObject(). Look through aliases for determining if a GlobalValue is small or large. Even the large code model can reference small objects with 32 bit constants as long as we're in no-pic mode, or if the reference is offset from the GOT. Original commit broke the build...	2023-12-14 09:49:35 -08:00
Philip Reames	7537c3c452	[RISCV] Precommit test coverage for VLMAX encodable via vsetivli	2023-12-14 09:42:54 -08:00
Arthur Eubanks	f0c03da63c	Revert "[X86] Respect code models more when determining if a global reference can fit in 32 bits" (#75500 ) Reverts llvm/llvm-project#75386 Breaks build.	2023-12-14 09:32:55 -08:00
Simon Pilgrim	88f1a2c50d	[X86] combineLoad - allow constant loads to share matching 'lower constant bits' with larger VBROADCAST_LOAD/SUBV_BROADCAST_LOAD nodes We already had separate support for VBROADCAST_LOAD - merge this with the generic load handling and add SUBV_BROADCAST_LOAD support as well.	2023-12-14 17:31:07 +00:00
Arthur Eubanks	5e38ba26d2	[X86] Respect code models more when determining if a global reference can fit in 32 bits (#75386 ) For non-GlobalValue references, the small and medium code models can use 32 bit constants. For GlobalValue references, use TargetMachine::isLargeGlobalObject(). Look through aliases for determining if a GlobalValue is small or large. Even the large code model can reference small objects with 32 bit constants as long as we're in no-pic mode, or if the reference is offset from the GOT.	2023-12-14 09:28:27 -08:00
Philip Reames	b7ebba3d8a	[riscv] Consolidate a set of load-add-store tests into one file	2023-12-14 09:08:04 -08:00
Philip Reames	1fdbdb84a1	[riscv] Convert a set of tests to opaque pointers	2023-12-14 08:57:13 -08:00
Philip Reames	46d1f30882	[RISCV][InsertSETVTLI] Handle large immediates in backwards walk (#75409 ) When doing our backwards walk, we were not handling the case where the AVL was defined by a register whose definition was an ADDI xN, x0, <imm>. Doing so (as we already do in the forward pass) allows us to prune a few more transitions.	2023-12-14 07:36:07 -08:00
Simon Pilgrim	3c423722cf	[X86] combineLoad - improve constant pool matches by ignoring undef elements When trying to share constant pool entries, we can ignore the undef elements of the entry that is being removed	2023-12-14 15:13:39 +00:00
Jonas Paulsson	01061ed370	[SystemZ] Improve shouldCoalesce() for i128. (#74942 ) The SystemZ implementation of shouldCoalesce() is merely a workaround for the fact that regalloc can run out of registers when extending 128-bit intervals with subreg (GPR64/GPR32) COPYs. This patch adds more freedom to the coalescer as it now only checks that the subreg interval is local to MBB and does not have too many physreg clobbers.	2023-12-14 15:55:27 +01:00
ostannard	4888218d03	[ARM] Do not emit unwind tables when saving LR around outlined call (#69611) In some cases, the machine outliner needs to preserve LR across an outlined call by pushing it onto the stack. Previously, this also generated unwind table instructions, which is incorrect because EHABI unwind tables cannot represent different stack frames at different points in the function, so the extra unwind info applied to the entire function. The outliner code already avoided generating CFI instructions, but EHABI unwind data is generated later from the actual instructions, so we need to avoid using the FrameSetup and FrameDestroy flags to prevent unwind data being generated.	2023-12-14 14:46:13 +00:00
Shih-Po Hung	b97c5a9554	[VPlan] Add a test for testing unused interleave recipes (#75026) - Precommit of tests from #71360. - Replace `undef` pointer operands and add stores to avoid the loads being optimized away.	2023-12-14 21:16:11 +08:00
Valery Pykhtin	dd051295bc	[AMDGPU] Enable GCNRewritePartialRegUses pass by default. (#72975 ) Let's try once again after #69957 has landed.	2023-12-14 14:10:27 +01:00
Simon Pilgrim	2141a51be1	[X86] broadcast-elm-cross-splat-vec.ll - drop constant pool check This is handled in the assembly comments.	2023-12-14 12:10:33 +00:00
DianQK	7649d22306	[AArch64] ORRWrs is copy instruction when there's no implicit def of the X register (#75184 ) Follows https://github.com/llvm/llvm-project/pull/74682#issuecomment-1850268782. Fixes #74680.	2023-12-14 19:19:55 +08:00
Simon Pilgrim	a0c7a29655	[GlobalISel] IRTranslator::translateGetElementPtr - don't assume a gep constant offset is representable as i64 Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=65052	2023-12-14 11:02:38 +00:00
Simon Pilgrim	b7fc78255e	Revert rG2047ab00eaf0a17e71ce5e8a5b27a8c90f034c3d "[VPlan] Add a test for testing unused interleave recipes (#75026 )" vplan-unused-interleave-group.ll is causing buildbot failures	2023-12-14 10:25:41 +00:00
Shih-Po Hung	2047ab00ea	[VPlan] Add a test for testing unused interleave recipes (#75026) - Precommit of tests from #71360. - Replace `undef` pointer operands and add stores to avoid the loads being optimized away.	2023-12-14 17:36:58 +08:00
Philip Reames	1f3d13c415	[riscv] Fix build due to missing test update My 632f1c appears to have missed a test update, sorry for the breakage.	2023-12-13 18:18:41 -08:00
Philip Reames	632f1c5d18	[RISCV] When VLEN is exactly known, prefer VLMAX encoding for vsetvli (#75412 ) If we know the exact VLEN, then we can tell if the AVL for particular operation is equivalent to the vsetvli xN, zero, <vtype> encoding. Using this encoding is better than having to materialize an immediate in a register, but worse than being able to use the vsetivli zero, imm, <type> encoding.	2023-12-13 17:51:03 -08:00
Arthur Eubanks	c64334fb30	[X86][FastISel] Support medium code model in more places (#75375 ) The medium code model is basically identical to the small code model except that large objects cannot be referenced with 32-bit offsets.	2023-12-13 13:44:37 -08:00
Arthur Eubanks	e8f43883a0	[X86][test] Use separate check prefix in code-model-elf.ll Since these will produce different results in upcoming changes.	2023-12-13 13:05:59 -08:00
Philip Reames	29bb7f762b	[RISCV] Add test coverage for profitable vsetvli a0, zero, <vtype> cases Test coverage for an upcoming change: we can avoid generating an immediate in a register if we know the immediate is equal to vlmax.	2023-12-13 12:58:25 -08:00
Stanislav Mekhanoshin	c6ecbcb48b	[AMDGPU] Fix no waitcnt produced between LDS DMA and ds_read on gfx10 (#75245 ) BUFFER_LOAD_DWORD_LDS was incorrectly touching vscnt instead of the vmcnt. This is VMEM load and DS store, so it shall use vmcnt.	2023-12-13 10:49:36 -08:00
Craig Topper	2c185709bc	[RISCV] Remove setJumpIsExpensive(). (#74647) Middle-end optimizations can speculate away the short-circuit behavior of C/C++ && and ||, using i1 and/or or logical select instructions and a single branch. SelectionDAGBuilder can turn i1 and/or/select back into multiple branches, but this is disabled when jump is expensive. RISC-V can use slt(u)(i) to evaluate a condition into any GPR, which makes us better than other targets that use a flag register. RISC-V also has single instruction compare and branch. So it's not clear from a code size perspective that using compare+and/or is better. If the full condition is dependent on multiple loads, using logic delays the branch resolution until all the loads are resolved even if there is a cheap condition that makes the loads unnecessary. PowerPC and Lanai are the only CPU targets that use setJumpIsExpensive. NVPTX and AMDGPU also use it but they are GPU targets. PowerPC appears to have a MachineIR pass that turns AND/OR of CR bits into multiple branches. I don't know anything about Lanai and their reason for using setJumpIsExpensive. I think the decision to use logic vs branches is much more nuanced than this big hammer. So I propose to make RISC-V match other CPU targets. Anyone who wants the old behavior can still pass -mllvm -jump-is-expensive=true.	2023-12-13 09:37:25 -08:00
CarolineConcatto	f2464ca317	[SVE2.1][Clang][LLVM]Int/FP reduce builtin in Clang and LLVM intrinsic (#69926) This patch implements the builtins in Clang and the LLVM-IR intrinsic for the following: // Variants are also available for: // _s8, _s16, _u16, _s32, _u32, _s64, _u64, // _f16, _f32, _f64 uint8x16_t svaddqv[_u8](svbool_t pg, svuint8_t zn); // Variants are also available for: // _s8, _u16, _s16, _u32, _s32, _u64, _s64 uint8x16_t svandqv[_u8](svbool_t pg, svuint8_t zn); uint8x16_t sveorqv[_u8](svbool_t pg, svuint8_t zn); uint8x16_t svorqv[_u8](svbool_t pg, svuint8_t zn); // Variants are also available for: // _s8, _u16, _s16, _u32, _s32, _u64, _s64; uint8x16_t svmaxqv[_u8](svbool_t pg, svuint8_t zn); uint8x16_t svminqv[_u8](svbool_t pg, svuint8_t zn); // Variants are also available for _f32, _f64 float16x8_t svmaxnmqv[_f16](svbool_t pg, svfloat16_t zn); float16x8_t svminnmqv[_f16](svbool_t pg, svfloat16_t zn); According to PR#257 [1], the reduction instruction uses scalable vectors as input and fixed vectors as output, therefore we changed SVEEmitter to emit fixed vector types in case the NEON header (arm_neon.h) is not present. [1] https://github.com/ARM-software/acle/pull/257 Co-author: Dinar Temirbulatov <dinar.temirbulatov@arm.com>	2023-12-13 15:45:59 +00:00
Petar Avramovic	6892c175c5	AMDGPU/GlobalISel: add AMDGPUGlobalISelDivergenceLowering pass (#75340 ) Add empty AMDGPUGlobalISelDivergenceLowering pass. This pass will implement - selection of divergent i1 phis as lane mask phis, requires lane mask merging in some cases - lower uses of divergent i1 values outside of the cycle using lane mask merging - lowering of all cases of temporal divergence: - lower uses of uniform i1 values outside of the cycle using lane mask merging - lower uses of uniform non-i1 values outside of the cycle using a copy to vgpr inside of the cycle Add very detailed set of regression tests for cases mentioned above. patch 1 from: https://github.com/llvm/llvm-project/pull/73337	2023-12-13 16:42:56 +01:00
Yingwei Zheng	3564c85b0e	[RISCV] Eliminate dead li after emitting VSETVLIs (#65934 ) This patch tracks li instructions that set AVL operands and does DCE after emitting VSETVLIs.	2023-12-13 23:18:48 +08:00
Mariusz Sikora	7f55d7de1a	[AMDGPU] GFX12: Add Split Workgroup Barrier (#74836 ) Co-authored-by: Vang Thao <Vang.Thao@amd.com>	2023-12-13 15:01:13 +01:00
Nikita Popov	9c093cbb5e	Revert "[StackColoring] Delete dead stack slots (#72633 )" This reverts commit a29457844bf0c4b2eb5c0f3877b6e8ef30cdef52. Causes an assertion failure in llvm/test/DebugInfo/COFF/lexicalblock.ll.	2023-12-13 14:31:09 +01:00
Piotr Sobczak	6eec80133b	[AMDGPU] Min/max changes for GFX12 (#75214 ) Co-authored-by: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>	2023-12-13 14:18:10 +01:00
Piotr Sobczak	fac093dd08	[AMDGPU] Update IEEE and DX10_CLAMP for GFX12 (#75030 ) Co-authored-by: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>	2023-12-13 13:52:40 +01:00
mohammed-nurulhoque	a29457844b	[StackColoring] Delete dead stack slots (#72633 ) Deletes slots that have lifetime markers and the lifetime ranges are empty.	2023-12-13 13:01:21 +01:00
Momchil Velikov	d7ee99a4fc	[MachineSink] Clear kill flags of sunk addressing mode registers (#75072 ) When doing sink-and-fold, the MachineSink clears the "killed" flags of the operands of the sunk (and deleted) instruction. However, this is not always sufficient. In some cases we can create the new load/store instruction with operands other than the ones present in the deleted instruction. One such example is folding a zero word extend into a memory load on AArch64. The zero-extend is represented by a pair of instructions - `MOV` (i.e. `ORRwrs`) followed by a `SUBREG_TO_REG`. The `SUBREG_TO_REG` is deleted (it is the sunk instruction), but the new load instruction mentions operands "killed" in the `MOV`, which is no longer correct. To fix this, clear the "killed" flags of the registers participating in the addressing mode.	2023-12-13 09:15:28 +00:00
paperchalice	60eca674b1	[CodeGen] Port `ExpandMemCmp` to new pass manager (#74050 )	2023-12-13 16:18:24 +08:00
Matt Arsenault	300a55003c	RegisterCoalescer: Fix implicit operand handling during rematerialize (#75271 ) If the rematerialize was placing a subregister into a super register, and implicit operands referenced the original register, we need to add undef flags to the now-subregister indexed implicit operands. Depends #75152	2023-12-13 15:15:36 +07:00

52796 Commits