llvm-project

Author	SHA1	Message	Date
Dmitriy Smirnov	e13bed4c5f	[PATCH] [llvm] [InstCombine] Canonicalise ADD+GEP This patch tries to canonicalise add + gep to gep + gep. Co-authored-by: Paul Walker <paul.walker@arm.com> Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D155688	2023-10-06 12:29:06 +01:00
Momchil Velikov	a9d0ab2ee5	Re-apply "[AArch64] Enable "sink-and-fold" in MachineSink by default (#67432 )" This re-applies commit ace20e24287b, which was reverted in eff4ef25b3dc. The issues were fixed in: * b30765caf874 [AArch64] Fix an incorrect handling of debug values in MachineSink (#68107) * b454b04d6869 [AArch64] Fix a compiler crash in MachineSink (#67705)	2023-10-06 09:34:42 +01:00
Diana Picus	2e1718adc8	Reland "AMDGPU: Duplicate instead of COPY constants from VGPR to SGPR (#66882 )" Teach the si-fix-sgpr-copies pass to deal with REG_SEQUENCE, PHI or INSERT_SUBREG where the result is an SGPR, but some of the inputs are constants materialized into VGPRs. This may happen in cases where for instance several instructions use an immediate zero and SelectionDAG chooses to put it in a VGPR to satisfy all of them. This however causes the si-fix-sgpr-copies to try to switch the whole chain to VGPR and may lead to illegal VGPR-to-SGPR copies. Rematerializing the constant into an SGPR fixes the issue. This was originally reverted because it triggered an unrelated bug in PEI on one of the OpenMP buildbots. That bug has been fixed in #68299, so it should be ok to try again.	2023-10-06 10:03:50 +02:00
Diana	be382de059	[AMDGPU] Use correct operand order for shifts (#68299 ) In a special case in frame index elimination (when the offset is 0), we generate either a S_LSHR_B32 or a V_LSHRREV_B32 using the same code. However, they don't expect their operands in the same order - S_LSHR_B32 takes the value to be shifted first and then the shift amount, whereas V_LSHRREV_B32 has the operands reversed (hence the REV in its name). Update the code & tests to take this into account. Also remove an outdated comment (this code is definitely reachable now that non-entry functions no longer have a fixed emergency scavenge slot).	2023-10-06 09:43:04 +02:00
Matt Arsenault	5082e827c1	AMDGPU/GlobalISel: Add test for packed sub selection Mirror of the add test, I've had this lying around for a long time.	2023-10-05 10:07:57 -07:00
Matt Arsenault	b5ebf07499	AMDGPU/GlobalISel: Add global-isel run lines to shrink add/sub test	2023-10-05 10:07:57 -07:00
Matt Harding	bd7ca98b66	Ensure NoTrapAfterNoreturn is false for the wasm backend (#65876 ) In the WebAssembly back end, the TrapUnreachable option is currently load-bearing for correctness, inserting wasm `unreachable` instructions where needed to create valid wasm. There is another option, NoTrapAfterNoreturn, that removes some of those traps and causes incorrect wasm to be emitted. This turns off `NoTrapAfterNoreturn` for the Wasm backend and adds new tests.	2023-10-05 09:17:45 -07:00
Matt Arsenault	2ca30eb8fd	AMDGPU/GlobalISel: Handle mubuf load/store for more types (#68268) Fixes MUBUF path for most vectors and pointers, which unblocks fixing the gfx6/7 run lines in assorted tests. Also fixes inconsistent behavior for -flat-for-global.	2023-10-05 05:36:16 -07:00
Ivan Kosarev	f04aa1f814	[AMDGPU][CodeGen] Fold immediates in src1 operands of V_MAD/MAC/FMA/FMAC. (#68002)	2023-10-05 14:22:29 +03:00
Kirill Stoimenov	0a776996af	Revert "[DAG] Attempt shl narrowing in SimplifyDemandedBits" This reverts commit 7a8c04ef84ecdab4390b451d4c2fe17bc45a7b63.	2023-10-04 22:15:41 +00:00
Jeffrey Byrnes	7794e16b49	[AMDGPU]: Allow combining into v_dot4 Differential Revision: https://reviews.llvm.org/D155995 Change-Id: Id15d232629a32a3549b13d47bf84d7a61b28b928	2023-10-04 13:31:36 -07:00
Luke Lau	3b0b84fd00	[RISCV] Fix illegal build_vector when lowering double id buildvec on RV32 (#67017 ) When lowering a constant build_vector sequence of doubles on RV32, if the addend wasn't zero, or the step/denominator wasn't one, it would crash trying to emit an illegal build_vector of <n x i64> with i32 operands, e.g: t15: v2i64 = BUILD_VECTOR Constant:i32<1>, Constant:i32<1> This patch fixes this by lowering the splats with SelectionDAG::getConstant with the vector type, which handles making it legal via splat_vector_parts.	2023-10-04 21:28:44 +01:00
Nitin John Raj	da9f9082ea	[RISCV][GlobalISel] Legalize G_FRAME_INDEX (#67746) G_FRAME_INDEX is legal for pointers.	2023-10-04 13:27:23 -07:00
Alex Richardson	e86d6a43f0	Regenerate test checks for tests affected by D141060	2023-10-04 10:51:35 -07:00
Alex Richardson	83c4227ab7	Auto-generate test checks for tests affected by D141060 These files had manual CHECK lines which make the diff from D141060 very difficult to review.	2023-10-04 10:51:35 -07:00
Philip Reames	45a334d31c	[RISCV] Generaize reduction tree matching to all integer reductions (#68014) (reapply) This was reverted in 824251c9b349d859a9169196cd9533c619a715ce because of a bug in a previous patch that this change exposed. Fixed in 199cbec987ee68d70611db8e7961b43c3dbad83e. Original commit message follows. This builds on the transform introduced in https://github.com/llvm/llvm-project/pull/67821, and generalizes it for all integer reduction types. A couple of notes: * This will only form smax/smin/umax/umin reductions when zbb is enabled. Otherwise, we lower the min/max expressions early. I don't care about this case, and don't plan to address this further. * This excludes floating point. Floating point introduces concerns about associativity. I may or may not do a follow-up patch for that case. * The explodevector test change is mildly undesirable from a clarity perspective. If anyone sees a good way to rewrite that to stabilize the test, please suggest.	2023-10-04 10:41:29 -07:00
Philip Reames	199cbec987	[RISCV] Don't try to form VECREDUCE without vector instructions This fixes a bug in f0505c which wasn't noticed until 7a0b9da had landed. This triggered a revert of 7a0b9da, which will be reapplied after this fix.	2023-10-04 10:29:27 -07:00
Anatoly Trosinenko	d32cce5b75	[AArch64][PAC] Specify Defs and Uses of PAUTH_(PROLOGUE\|EPILOGUE) This is a follow-up to eb02ee44d32531931af5312cd450779011664eef.	2023-10-04 18:20:52 +03:00
Evgenii Kudriashov	0dcc65359b	[X86] Add combine tests for pointers of mixed sizes (NFC) (#68219) Precommit for #67168 to solve #66873	2023-10-04 16:31:24 +02:00
Alex Bradbury	824251c9b3	Revert "[RISCV] Generaize reduction tree matching to all integer reductions (#68014 )" This reverts commit 7a0b9daac9edde4293d2e9fdf30d8b35c04d16a6 and 63bbc250440141b1c51593904fba9bdaa6724280. I'm seeing issues (e.g. on the GCC torture suite) where combineBinOpOfExtractToReduceTree is called when the V extensions aren't enabled and triggers a crash due to RISCVSubtarget::getElen asserting. I'll aim to follow up with a minimal reproducer. Although it's pretty obvious how to avoid this crash with some extra gating, there are a few options as to where that should be inserted so I think it's best to revert and agree the appropriate fix separately.	2023-10-04 12:51:01 +01:00
Ivan Kosarev	cf80defae2	[AMDGPU][GFX11] Do not rewrite V_FMA/FMAC_* to V_FMAAK_F16_t16 on operand legalization. (#66202) V_FMAAK_F16_t16 takes VGPR_32_Lo128 operands whereas the original instructions would have VGPR_32 operands. Switching the opcodes without updating operands' register classes leads to MachineVerifier complaining about the classes not matching instruction definitions. The problem only shows up in builds with expensive checks enabled because of missing -verify-machineinstrs in the test. This is the third attempt to update CodeGen/AMDGPU/fma.f16.ll to run for GFX11, following the second attempt in a1e38e0b8e3e, partially reverted in eaf737a4e004.	2023-10-04 12:41:46 +01:00
Simon Pilgrim	7a8c04ef84	[DAG] Attempt shl narrowing in SimplifyDemandedBits If a shl node leaves the upper half bits zero / undemanded, then see if we can profitably perform this with a half-width shl and a free trunc/zext. Followup to D146121 Differential Revision: https://reviews.llvm.org/D155472	2023-10-04 10:23:02 +01:00
Momchil Velikov	b30765caf8	[AArch64] Fix an incorrect handling of debug values in MachineSink (#68107)	2023-10-04 10:11:47 +01:00
David Green	20fc2ffb15	[AArch64][GlobalISel] Handle fp constant splats This changes the DUP(constant) -> MOVI code to handle either integer or fp types, allowing more constants to be selected, and fixes up some cases where fp constants were being incorrectly selected.	2023-10-04 08:50:21 +01:00
Rahman Lavaee	61785ffcfc	Do not remove empty basic blocks which have address taken. (#67740 ) This PR replaces `isMachineBlockAddressTaken` by `hasAddressTaken` to include blocks which have their IR address taken as well. These blocks are also not removable since their predecessors' terminators do not directly point to the block.	2023-10-03 16:09:31 -07:00
Yuta Saito	da0ca5dee4	[WebAssembly] Define local sp if `llvm.stacksave` is used (#68133) Usually `llvm.stacksave/stackrestore` are used together with `alloca`, but they can appear without it (e.g. `alloca` can be optimized away). WebAssembly's function-local physical user sp register, which is referenced by `llvm.stacksave`, is created during frame lowering and replaced with a virtual register. However, the sp register was not created when `llvm.stacksave` was used without `alloca`, which led to a MIR verification failure about a use-before-def of the sp virtual register. Resolves https://github.com/llvm/llvm-project/issues/62235	2023-10-03 14:51:35 -07:00
Alex Bradbury	eae1e28cc2	[RISCV] Mark the Zfa extension as non-experimental (#68113 ) Following the version bump in #67964 and the bug fix in #68026 I believe we're ready to mark Zfa as non-experimental. I'll note the GCC torture suite passes now with Zfa enabled (though it's more of a litmus test than anything else).	2023-10-03 18:16:13 +01:00
Alex Bradbury	18c3c46858	[RISCV] Update Zfa extension version to 1.0 (#67964 ) The Zfa specification was recently ratified <https://wiki.riscv.org/display/HOME/Recently+Ratified+Extensions>. This commit bumps the version to 1.0, but leaves it as an experimental extension (to be done in a follow-on patch), so reviews can focus on confirming there haven't been spec changes we have missed (which as noted below, is more difficult than usual). Because the development of the Zfa spec overlapped with the transition of riscv-isa-manual from LaTeX to AsciiDoc, it's more difficult than usual to confirm version changes. The linked PDF in RISCVUsage is for some reason a 404. Key commit histories to review are: * Changes to zfa.adoc on the main branch <https://github.com/riscv/riscv-isa-manual/commits/main/src/zfa.adoc> * Changes to zfa.tex on the now defunct latex branch <https://github.com/riscv/riscv-isa-manual/commits/latex/src/zfa.tex> From reviewing these, I believe there have been no changes to the spec since version 0.1/0.2 (sadly the AsciiDoc and LaTeX versions of the spec are inconsistent about version numbering).	2023-10-03 17:54:29 +01:00
Luke Lau	169c20584d	[RISCV] Relax some Zvbb patterns and lowerings to Zvkb (#68115 ) vandn, vrev8 and vro{l,r} are now part of Zvkb, which Zvbb now implies. This patch updates the predicates to check for Zvkb instead of Zvbb in the tablegen patterns for the SD and VL nodes, as well as some of the lowering logic in RISCVISelLowering.	2023-10-03 17:42:40 +01:00
Simon Pilgrim	4c37372dae	[X86] promoteExtBeforeAdd - determine if an addition is implicitly NSW/NUW Pulled out of D155472	2023-10-03 17:32:28 +01:00
Simon Pilgrim	d97f49b7e0	[X86] Add pointer mask test coverage for implicit NSW/NUW adds promoteExtBeforeAdd currently relies on a NSW/NUW flag, which have been lost by previous folds.	2023-10-03 17:32:28 +01:00
Kai Nacke	42de2b7e99	[SystemZ/z/OS] Add library names for intrinsics (#68114) On z/OS, many library functions have a non-standard name. This change initializes the table of runtime functions that result from lowering intrinsics to library calls.	2023-10-03 18:53:52 +03:00
Dinar Temirbulatov	8232ab76d0	[AArch64][SVE][SVE2] Enable tbl, tbl2 for shuffle lowering for fixed vector types. This change enables some shuffle lowering with the TBL instruction: the SVE and SVE2 form when indexing into one register, and the SVE2 TBL form when indexing into both registers. Differential Revision: https://reviews.llvm.org/D152205	2023-10-03 15:19:00 +00:00
Philip Reames	7a0b9daac9	[RISCV] Generaize reduction tree matching to all integer reductions (#68014) This builds on the transform introduced in https://github.com/llvm/llvm-project/pull/67821, and generalizes it for all integer reduction types. A couple of notes: * This will only form smax/smin/umax/umin reductions when zbb is enabled. Otherwise, we lower the min/max expressions early. I don't care about this case, and don't plan to address this further. * This excludes floating point. Floating point introduces concerns about associativity. I may or may not do a follow-up patch for that case. * The explodevector test change is mildly undesirable from a clarity perspective. If anyone sees a good way to rewrite that to stabilize the test, please suggest.	2023-10-03 07:34:39 -07:00
Mikhail Gudim	9b5120050f	[RISCV] A test demonstrating missed opportunity to combine `addi` into load / store offset. (#67022) The patch to address this will be in a separate PR. A possible fix: https://github.com/llvm/llvm-project/pull/67024/files	2023-10-03 10:18:31 -04:00
Simon Pilgrim	77c43e1489	[X86][FastISel] X86SelectIntToFP - don't assume value type is simple. Fixes #68068	2023-10-03 11:05:14 +01:00
Martin Storsjö	6ae36c0127	[AArch64] Disable loop alignment for Windows targets (#67894 ) This should fix #66912. When emitting SEH unwind info, we need to be able to calculate the exact length of functions before alignments are fixed. Until that limitation is overcome, just disable all loop alignment on Windows targets.	2023-10-02 23:55:23 +03:00
Alex Bradbury	0152e1f2d5	[RISCV] Fix incorrect codegen for Zfa with negated forms of constants in the lookup table (#68026 ) The logic in `RISCVLoadFPImm::getLoadFPImm` recognises that the only supported negative value is -1.0, but due to a typo returns `false` otherwise (entry 0, which is -1.0) rather than returning -1 (indicating no match found).	2023-10-02 21:20:38 +01:00
Craig Topper	3c0990c188	[RISCV] Generalize the (ADD (SLLI X, 32), X) special case in constant materialization. (#66931 ) We don't have to limit ourselves to a shift amount of 32. We can support other shift amounts that make the upper 32 bits line up.	2023-10-02 13:03:06 -07:00
Alex Bradbury	451255b207	[RISCV][test] Extend test coverage for Zfa's fli instructions to cover miscompile There's a miscompile currently for negative numbers (other than -1) that are the negated form of numbers in the fli lookup table. This adds tests that capture the issue, with a fix to follow in a separate commit/PR.	2023-10-02 20:48:30 +01:00
Matt Arsenault	f79379398d	Revert "CodeGen: Disable isCopyInstrImpl if there are implicit operands" This reverts commit bc7d88faf1a595ab59952a2054418cdd0d9eeee8. This is broken with 414ff812d6241b728754ce562081419e7fc091eb reverted.	2023-10-02 22:43:24 +03:00
Matt Arsenault	d4fb503f83	CodeGen: Add regressions from subreg_to_reg implicit-defs These catch assertions hit after 414ff812d6241b728754ce562081419e7fc091eb	2023-10-02 22:38:31 +03:00
Kirill Stoimenov	e0f86ca200	Revert "RegisterCoalescer: Add implicit-def of super register when coalescing SUBREG_TO_REG" This reverts commit 414ff812d6241b728754ce562081419e7fc091eb.	2023-10-02 18:19:52 +00:00
Simon Pilgrim	b4f591363c	[DAG] visitSHL - move SimplifyDemandedBits after all standard folds to give them a chance to match Pulled out of D155472	2023-10-02 16:09:35 +01:00
Felipe de Azevedo Piovezan	a41ce98064	[FastISel][DebugInfo] Handle dbg.value targeting allocas (#67187 ) FastISel currently drops dbg.values targeting allocas. It may seem surprising that a simple case would fail to be lowered, but dbg.values targeting allocas are not common; we usually have dbg.declares doing that, and those are handled by the common code between FastISel and SelectionDAGISel. This patch addresses the issue by querying the static alloca map from FuncInfo. If we have a frame index for it, we create a DBG_VALUE intrinsic from it.	2023-10-02 07:10:11 -07:00
Ben Shi	5db0a450be	[AVR] Fix a crash in AVRInstrInfo::insertIndirectBranch (#67324) Fixes https://github.com/llvm/llvm-project/issues/67042	2023-10-02 21:14:22 +08:00
Matt Arsenault	bc7d88faf1	CodeGen: Disable isCopyInstrImpl if there are implicit operands This is a conservative workaround for broken liveness tracking of SUBREG_TO_REG to speculatively fix all targets. The current reported failures are on X86 only, but this issue should appear for all targets that use SUBREG_TO_REG. The next minimally correct refinement would be to disallow only implicit defs. The coalescer now introduces implicit-defs of the super register to track the dependency on other subregisters. If we see such an implicit operand, we cannot simply treat the subregister def as the result operand in case downstream users depend on the implicitly defined parts. Really target implementations should be considering the implicit defs and trying to interpret them appropriately (maybe with some generic helpers). The full implicit def could possibly be reported as the move result, rather than the subregister def but that requires additional work. Hopefully fixes #64060 as well. This needs to be applied to the release branch. https://reviews.llvm.org/D156346	2023-10-02 15:16:40 +03:00
Simon Pilgrim	2984e3529b	[X86] matchIndexRecursively - fold zext(addlike(shl_nuw(x,c1),c2) patterns into LEA Pulled out of D155472 - handle zeroextended scaled address indices	2023-10-02 12:38:25 +01:00
Simon Pilgrim	2908142089	[X86] Add test coverage for zext(or(shl_nuw(x,c1),c2)) pointer math Additional test coverage for D155472	2023-10-02 12:38:25 +01:00
JP Lehr	e816c89c84	Revert "InlineSpiller: Consider if all subranges are the same when avoiding redundant spills" This reverts commit d8127b2ba8a87a610851b9a462f2fc2526c36e37.	2023-10-02 06:26:33 -05:00
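
The Zfa fix listed above (0152e1f2d5) comes down to a sentinel mix-up: a lookup that reports "no match" as -1 must not return `false`, because `false` converts to 0, and index 0 is itself a valid entry (-1.0). The Python sketch below illustrates that failure mode only. The table is a shortened, hypothetical stand-in for the real fli table, and every name in it is invented for illustration; it is not the code in `RISCVLoadFPImm::getLoadFPImm`.

```python
# Hypothetical, shortened stand-in for the fli lookup table.
# The only property kept from the commit message: entry 0 is -1.0.
FLI_TABLE = [-1.0, 0.25, 0.5, 1.0, 2.0]

NO_MATCH = -1  # sentinel meaning "value is not in the table"


def _index_of(value):
    """Return the table index of value, or NO_MATCH if absent."""
    try:
        return FLI_TABLE.index(value)
    except ValueError:
        return NO_MATCH


def lookup_buggy(value):
    """Model of the bug: unsupported negatives yield False, i.e. 0."""
    if value < 0:
        if value == -1.0:
            return 0
        return int(False)  # bug: 0 is the index of -1.0, not "no match"
    return _index_of(value)


def lookup_fixed(value):
    """Model of the fix: unsupported negatives yield NO_MATCH."""
    if value < 0:
        return 0 if value == -1.0 else NO_MATCH
    return _index_of(value)


if __name__ == "__main__":
    # -0.5 is the negated form of a table entry but is not itself supported.
    print(FLI_TABLE[lookup_buggy(-0.5)])  # silently selects -1.0
    print(lookup_fixed(-0.5))             # reports no match
```

In this model the buggy lookup maps -0.5 to the entry for -1.0, which matches the kind of miscompile the test commit 451255b207 describes for negated forms of table constants.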


52796 Commits
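
The operand-order fix in be382de059 can be pictured with a small model. The commit message states that S_LSHR_B32 takes the value first and the shift amount second, while V_LSHRREV_B32 takes them reversed. The Python sketch below is an illustration under that description only: the functions are hypothetical models of the two instructions (plain right shifts on 32-bit values), and `emit_shift` is an invented helper, not the frame index elimination code.

```python
MASK32 = 0xFFFFFFFF


def s_lshr_b32(value, amount):
    """Model of the scalar form: value first, then shift amount."""
    return (value & MASK32) >> amount


def v_lshrrev_b32(amount, value):
    """Model of the reversed vector form: shift amount first, then value."""
    return (value & MASK32) >> amount


def emit_shift(opcode, value, amount):
    """Hypothetical helper that orders operands per opcode."""
    if opcode == "S_LSHR_B32":
        return s_lshr_b32(value, amount)
    if opcode == "V_LSHRREV_B32":
        return v_lshrrev_b32(amount, value)
    raise ValueError(f"unexpected opcode: {opcode}")


if __name__ == "__main__":
    # Both opcodes give the same result once operands are ordered correctly.
    print(emit_shift("S_LSHR_B32", 0x100, 6))
    print(emit_shift("V_LSHRREV_B32", 0x100, 6))
    # Passing scalar-style operand order to the reversed form shifts the
    # amount by the value instead, which is the class of bug being fixed.
    print(v_lshrrev_b32(0x100, 6))
```

Sharing one code path for both opcodes is only safe if the operand order is chosen per opcode, as `emit_shift` does here.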