llvm-project

Author	SHA1	Message	Date
Gleb Popov	0356d0cfdc	Print more descriptive error message when trying to link a global with appending linkage (#69613 ) This is a proper fix for https://github.com/llvm/llvm-project/issues/40308	2024-04-03 12:26:12 +01:00
Chen Zheng	29c7d1a60c	[PPC] [NFC] add testcase for more store forwarding	2024-04-03 04:46:29 -04:00
David Green	6288f36c16	[AArch64][GlobalISel] Basic add_sat and sub_sat vector handling. (#80650 ) This tries to fill in the basic vector handling for sadd_sat/uadd_sat and ssub_sat/usub_sat. It just handles the basics, marking legal types and clamping illegally sized vectors to legal ones.	2024-04-03 08:44:51 +01:00
Ryotaro KASUGA	ea4a11926b	Reapply "[CodeGen] Fix register pressure computation in MachinePipeliner (#87030)" (#87312 ) Fix broken test. This reverts commit b8ead2198f27924f91b90b6c104c1234ccc8972e.	2024-04-03 09:28:09 +09:00
Craig Topper	a9af66a90e	[RISCV] Lower (vector_interleave X, undef) to (vzext_vl X). (#87283 ) If the odd vector is undef or poison, the widening add and multiply trick doesn't work unless we freeze the odd vector. Unfortunately, freezing doesn't work when the operand is provably undef/poison. MIR doesn't have a representation for freeze so it just becomes a COPY from IMPLICIT_DEF which freely propagates undef to each operand independently. To work around this, check for undef explicitly and lower to a VZEXT_VL of the even vector. This produces better code than we'd get from a freeze anyway. I've left a FIXME for adding a freeze. I'll do that as a separate patch as it affects other tests and doesn't help with the new test.	2024-04-02 11:58:41 -07:00
Craig Topper	8c1dc5dd58	[RISCV] Add test for miscompile of vector.interleave when odd vector is literal poison. The interleave lowering relies on a math trick that requires passing the odd vector to two math instructions. In order to be correct these instructions must see the same value. If the odd vector is provably poison or undef, SelectionDAG will create a vwadd and vwmaccu where the operand is a copy from IMPLICIT_DEF. Later this will become just the undef flag on the operand. This gives the register allocator freedom to pick a different register for each instruction.	2024-04-02 11:49:08 -07:00
Simon Pilgrim	8bc2d19c13	[X86] canonicalizeShuffleWithOp - don't fold VPERMI(BINOP(X,Y)) -> BINOP(VPERMI(X),VPERMI(Y)) VPERMI (VPERMQ/PD) is nearly always lane-crossing and poorly merges with target shuffles (other than itself). For now, I've restricted VPERMI to only merge with itself, constants, loads and splats. We might be able to merge with a few other special cases (AND/ANDNP with constant?), which could help the shuffle-vs-trunc-256.ll AVX512VL regression, but since that now gives similar codegen to the other AVX512 variants, I'd prefer to improve the shuffle lowering for that properly.	2024-04-02 18:38:37 +01:00
Michael Maitland	153b8431bb	[RISCV][GISEL] Legalize G_BITCAST for scalable vectors (#85970 ) SelectionDAG marks ISD::BITCAST as legal between scalable vector types and ISelDAGToDAG deletes them. We mark G_BITCAST between scalable vectors as legal in GISel. A future patch will handle what to do with them after the legalizer (likely either drop them in an isel-preprocess or convert them to COPYs). BITCAST is needed for legalization of G_INSERT and G_EXTRACT. This is a precommit for legalization of G_INSERT and G_EXTRACT.	2024-04-02 12:30:51 -04:00
Bevin Hansson	cd6434f9ec	[ExpandLargeDivRem] Scalarize vector types. (#86959 ) expand-large-divrem cannot handle vector types. If overly large vector element types survive into isel, they will likely be scalarized there, but since isel cannot handle scalar integer types of that size, it will assert. Handle vector types in expand-large-divrem by scalarizing them and then expanding the scalar type operation. For large vectors, this results in a massive code expansion, but it's better than asserting.	2024-04-02 16:37:36 +02:00
Farzon Lotfi	82d8a95611	[SPIRV][HLSL] Add HLSL intrinsic tests (#86844 ) This PR is part of bookkeeping for #83882. It also brings the SPIRV hlsl intrinsic tests in parity with where the testing is on the DXIL backend.	2024-04-02 10:21:21 -04:00
Kevin P. Neal	737fc353d2	[FPEnv][AArch64] Correct strictfp test. Correct strictfp tests to follow the rules documented in the LangRef: https://llvm.org/docs/LangRef.html#constrained-floating-point-intrinsics These tests needed the strictfp attribute added to some function definitions and some function calls. Test changes verified with D146845.	2024-04-02 09:35:44 -04:00
Il-Capitano	0ef7437780	[SelectionDAG][Statepoint] Fix truncation of `gc.statepoint` ID argument (#85908 ) The ID argument of `gc.statepoint` gets incorrectly truncated to 32 bits during code generation. This is fixed by using `uint64_t` instead of `unsigned` for the `ID` member in `SelectionDAGBuilder::StatepointLoweringInfo`, and a `patchpoint` test case is extended to check for 64 bit ID generation in stackmaps.	2024-04-02 09:28:19 -04:00
Vyacheslav Levytskyy	6cce67a8f9	[SPIR-V] Fix validity of atomic instructions (#87051 ) This PR fixes validity of atomic instructions and improves type inference. More tests are able now to be accepted by `spirv-val`.	2024-04-02 10:59:18 +02:00
Thorsten Schütt	8bb9443333	[GlobalIsel] Combine G_EXTRACT_VECTOR_ELT (#85321 ) preliminary steps	2024-04-02 09:01:24 +02:00
Luke Lau	59dd10faf8	[RISCV] Add tests for fixed vector vwsll. NFC We are missing patterns for fixed vectors, where the sexts and zexts are legalized to _vl nodes.	2024-04-02 13:02:03 +08:00
Gulfem Savrun Yeniceri	b8ead2198f	Revert "[CodeGen] Fix register pressure computation in MachinePipeliner (#87030 )" This reverts commit a4dec9d6bc67c4d8fbd4a4f54ffaa0399def9627 because the test failed in the following builder: https://luci-milo.appspot.com/ui/p/fuchsia/builders/prod/clang-linux-x64/b8751864477467126481/overview	2024-04-01 18:27:41 +00:00
Ryotaro KASUGA	a4dec9d6bc	[CodeGen] Fix register pressure computation in MachinePipeliner (#87030 ) `RegisterClassInfo::getRegPressureSetLimit` has been changed to return a smaller value than before so the limit may become negative in later calculations. As a workaround, change to use `TargetRegisterInfo::getRegPressureSetLimit`. Also improve tests.	2024-04-01 17:04:44 +09:00
Vitaly Buka	cbb27bef3e	[CodeGen] Fix test after #86049	2024-04-01 00:44:27 -07:00
Vitaly Buka	d76a1233f7	[CodeGen] Fix test after #86049	2024-03-31 23:48:23 -07:00
Vitaly Buka	b890c17892	[CodeGen] Fix test after #86049	2024-03-31 23:22:07 -07:00
Vitaly Buka	289d2cc3f3	[CodeGen] Fix test after #86049	2024-03-31 23:10:21 -07:00
Sameer Sahasrabuddhe	421557974a	[AMDGPU] Use glue for convergence tokens at call-like operations (#86766 ) The earlier implementation on AMDGPU used explicit token operands at SI_CALL and SI_CALL_ISEL. This is now replaced with CONVERGENCECTRL_GLUE operands, with the following effects: - The treatment of tokens at call-like operations is now consistent with the treatment at intrinsics. - Support for tail calls using implicit tokens at SI_TCRETURN "just works". - The extra parameter at call-like instructions is eliminated, thus restoring those instructions and their handling to the original state. The new glue node is placed after the existing glue node for the outgoing call parameters, which seems to not interfere with selection of the call-like nodes.	2024-04-01 10:51:13 +05:30
Vitaly Buka	20f56e1f8e	[CodeGen] Add default lowering for llvm.allow.{runtime,ubsan}.check() (#86049 ) RFC: https://discourse.llvm.org/t/rfc-add-llvm-experimental-hot-intrinsic-or-llvm-hot/77641	2024-03-31 22:19:33 -07:00
Yingchi Long	70deb7bfe9	[BPF] expand cttz, ctlz for i32, i64 (#73668 ) Fixes: https://github.com/llvm/llvm-project/issues/62252 Depends on: #73667	2024-04-01 10:57:54 +08:00
Ruiling, Song	216b5e9666	[AMDGPU] Expose RTZ version of f16 interpolation for gfx11+ (#86614 )	2024-04-01 09:48:37 +08:00
Austin Kerbow	b5b34dbb27	[AMDGPU] Use directive for kernarg preload header padding (#86004 )	2024-03-31 11:03:03 -07:00
Austin Kerbow	0234d90d81	[AMDGPU] Extend MFMA padding option to gfx90a+ (#86768 ) It was shown experimentally that this may have some benefit on newer HW.	2024-03-31 10:46:05 -07:00
Jacek Caban	799e1d6a12	[IR] Use EXPORTAS for ARM64EC mangled symbols with dllexport attribute. (#81940 ) We currently just use mangled name. This works fine, because linker should detect that and demangle it for the export table. However, on MSVC, the compiler is more specific and passes demangled name as well, with EXPORTAS. This PR aims to match that. MSVC doesn't use quotes in this case, so I added '#' to the list of characters that don't need it.	2024-03-30 16:48:39 +01:00
Brandon Wu	29e8bfc13c	[RISCV] RISCV vector calling convention (2/2) (#79096 ) This commit handles vector arguments/return for function definition/call, the new class RVVArgDispatcher is added for doing all vector register assignment including mask types, data types as well as tuple types. It precomputes the register number for each argument as per https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc#standard-vector-calling-convention-variant and it's passed to calling convention function to handle all vector arguments. Depends on: #78550	2024-03-30 21:05:33 +08:00
Jay Foad	95258419f6	[AMDGPU] Use AMDGPU::isIntrinsicAlwaysUniform in isSDNodeAlwaysUniform (#87085 ) This is mostly just a simplification, but tests show a slight codegen improvement in code using the deprecated amdgcn.icmp/fcmp intrinsics.	2024-03-30 08:01:18 +00:00
Shilei Tian	3a106e5b2c	[GlobalISel] Fold G_ICMP if possible (#86357 ) This patch tries to fold `G_ICMP` if possible.	2024-03-29 15:59:50 -04:00
Helena Kotas	b42fa8645c	[DXIL] Add lowering for `ceil` (#87043 ) Add lowering of llvm.ceil intrinsics to DXIL ops. Fixes #86984	2024-03-29 15:09:44 -04:00
Alex MacLean	7daa65a088	Reland "[NVPTX] Use .common linkage for common globals" (#86824 ) Switch from `.weak` to `.common` linkage for common global variables where possible. The `.common` linkage is described in [PTX ISA 11.6.4. Linking Directives: .common](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#linking-directives-common): "Declares identifier to be globally visible but “common”. Common symbols are similar to globally visible symbols. However multiple object files may declare the same common symbol and they may have different types and sizes and references to a symbol get resolved against a common symbol with the largest size. Only one object file can initialize a common symbol and that must have the largest size among all other definitions of that common symbol from different object files. .common linking directive can be used only on variables with .global storage. It cannot be used on function symbols or on symbols with opaque type." I've updated the logic and tests to only use `.common` for PTX 5.0 or greater and verified that the new tests now pass with `ptxas`.	2024-03-29 11:58:41 -07:00
Farzon Lotfi	e74332a266	[HLSL][DXIL] HLSL's `round` should follow `roundeven` behavior (#87078 ) fixes #86999	2024-03-29 13:19:28 -04:00
Shilei Tian	661bb9daae	[GlobalISel] Handle div-by-pow2 (#83155 ) This patch adds similar handling of div-by-pow2 as in `SelectionDAG`.	2024-03-29 12:41:47 -04:00
Marc Auberer	d3bc9cc99b	[AArch64][GISEL] Regenerate select tests with inline register classes (#87013 ) Use inline register class syntax for select test file.	2024-03-29 15:45:06 +01:00
Luke Lau	3f69d90351	[RISCV] Add missing RISCVMaskedPseudo for TIED pseudos (#86787 ) This was preventing us from folding away the vmerge into its mask.	2024-03-29 22:21:22 +08:00
Thorsten Schütt	84299df301	[GlobalIsel] add trunc flags (#87045 ) https://github.com/llvm/llvm-project/pull/85592	2024-03-29 13:38:08 +01:00
Luke Lau	76ba3c8e64	[RISCV] Add test case for vmerge fold for tied pseudos with rounding mode. NFC	2024-03-29 19:47:09 +08:00
Luke Lau	2a315d800b	[RISCV] Combine (or disjoint ext, ext) -> vwadd (#86929 ) DAGCombiner (or InstCombine) will convert an add to an or if the bits are disjoint, which can prevent what was originally an (add {s,z}ext, {s,z}ext) from being selected as a vwadd. This teaches combineBinOp_VLToVWBinOp_VL to recover it by treating it as an add.	2024-03-29 19:45:24 +08:00
Luke Lau	131be5de90	[RISCV] Add more disjoint or tests for vwadd[u].{w,v}v. NFC	2024-03-29 19:11:26 +08:00
Wang Pengcheng	610b9e23c5	[SDAG] Use shifts if ISD::MUL is illegal when lowering ISD::CTPOP (#86505 ) We can avoid libcalls. Fixes #86205	2024-03-29 15:38:39 +08:00
Sudharsan Veeravalli	e005a09df5	[RISCV][TypePromotion] Don't generate truncs if PromotedType is greater than Source Type (#86941 ) We currently check if the source and promoted types are not equal before generating truncate instructions. This does not work for RV64, where the promoted type is i64, and this led to a crash due to the generation of truncate instructions from i32 to i64. Fixes #86400	2024-03-28 21:22:05 -07:00
Philip Reames	9ea0396f16	[RISCV] Extend pattern matches involving shNadd to support disjoint or (#87001 ) I tried to add representative tests while not duplicating complete coverage. If there's other tests you'd like to see, let me know.	2024-03-28 16:34:04 -07:00
Marc Auberer	c482fad2c1	[AArch64][GISEL] Consider fcmp true and fcmp false in cond code selection (#86972 ) Fixes #86917 `FCMP_TRUE` and `FCMP_FALSE` were previously not considered and we ended up in an llvm_unreachable assertion.	2024-03-28 23:08:38 +01:00
Luke Lau	a3c2d8c072	[RISCV] Combine ({s,u}{div,rem} (zext, zext)) -> (zext ({s,u}{div,rem} (zext, zext))) (#86779 ) This narrows unsigned and signed div and rem nodes via combineBinOpOfZExt. Unlike other binary ops, there are no widening div or rem instructions. So we will end up with an extra vzext.vf2. However I'm assuming that div/rem are expensive enough that by reducing their EMUL we will gain back the cost. Alive2 proof: https://alive2.llvm.org/ce/z/Et_L6y	2024-03-29 05:55:38 +08:00
Craig Topper	23d45e55ed	[MCP] Remove dead copies from basic blocks with successors. (#86973 ) Previously we wouldn't remove dead copies from basic blocks with successors. The comment said we didn't want to trust the live-in lists. The comment is very old so I'm not sure if that's still a concern today. This patch checks the live-in lists and removes copies from MaybeDeadCopies if they are referenced by any live-ins in any successors. We only do this if the tracksLiveness property is set. If that property is not set, we retain the old behavior.	2024-03-28 14:43:49 -07:00
Helena Kotas	62d6beba97	[DXIL] Add lowering for `reversebits` and `trunc` (#86909 ) Add lowering of `llvm.bitreverse` and `llvm.trunc` intrinsics to DXIL ops. Fixes #86582 Fixes #86581	2024-03-28 17:41:33 -04:00
Zaara Syeda	6582509daa	[AIX] Handle toc-data offset overflowing 16-bits (#80092 ) When the toc-data offset overflows the 16-bits, we can truncate the value to the 16-bit value as the linker will handle overflow through fixup code.	2024-03-28 13:55:13 -04:00
Jonas Paulsson	16b7cc69ef	[SystemZ] Eliminate call sequence instructions early. (#77812 ) On SystemZ, the outgoing argument area which is big enough for all calls in the function is created once during the prolog, as opposed to adjusting the stack around each call. The call-sequence instructions are therefore not really useful any more than to compute the maximum call frame size, which has so far been done by PEI, but can just as well be done at an earlier point. This patch removes the mapping of the CallFrameSetupOpcode and CallFrameDestroyOpcode and instead computes the MaxCallFrameSize directly after instruction selection and then removes the ADJCALLSTACK pseudos. This removes the confusing pseudos and also avoids the problem of having to keep the call frame size accurate when creating new MBBs. This fixes #76618 which exposed the need to maintain the call frame size when splitting blocks (which was not done).	2024-03-28 18:26:38 +01:00
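
Commits a9af66a90e and 8c1dc5dd58 above describe the RISC-V interleave lowering as a widening add followed by a widening multiply-accumulate, and note that both instructions must see the same odd value. The following is a minimal scalar sketch of one lane, not code from LLVM. It assumes 8-bit elements widened to 16 bits, the even element in the low half, and a multiplier of 2^8 - 1 (the multiplier is an assumption, as the commit messages do not state it). The helper names `interleaveLane`, `interleaveLaneSplit`, and `zextLane` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): one lane of the interleave trick.
constexpr std::uint16_t interleaveLane(std::uint8_t even, std::uint8_t odd) {
  // Widening add (the role of vwadd): even + odd, held in 16 bits.
  std::uint16_t sum = static_cast<std::uint16_t>(even + odd);
  // Widening multiply-accumulate (the role of vwmaccu): sum += odd * (2^8 - 1).
  // Total is even + odd * 2^8: even in the low byte, odd in the high byte.
  return static_cast<std::uint16_t>(sum + odd * 0xFF);
}

// Same computation, but the two instructions read different values for the
// odd operand, as can happen when each use is an independent undef.
constexpr std::uint16_t interleaveLaneSplit(std::uint8_t even,
                                            std::uint8_t oddForAdd,
                                            std::uint8_t oddForMul) {
  std::uint16_t sum = static_cast<std::uint16_t>(even + oddForAdd);
  return static_cast<std::uint16_t>(sum + oddForMul * 0xFF);
}

// Zero-extending the even element (the role of VZEXT_VL) corresponds to
// choosing 0 for the undefined odd element.
constexpr std::uint16_t zextLane(std::uint8_t even) { return even; }

static_assert(interleaveLane(0x12, 0x34) == 0x3412, "even low, odd high");
static_assert(zextLane(0xAB) == interleaveLane(0xAB, 0x00), "zext == odd of 0");
// With mismatched odd values the low byte no longer holds the even element.
static_assert(interleaveLaneSplit(0x01, 0x02, 0x03) == 0x0300, "even is lost");

// Runtime version of the same checks.
inline void checkInterleaveExamples() {
  assert((interleaveLane(0xFF, 0xFF) & 0xFF) == 0xFF);
  assert((interleaveLaneSplit(0x01, 0x02, 0x03) & 0xFF) != 0x01);
}
```

The last `static_assert` illustrates the miscompile described in 8c1dc5dd58: once the two uses of the odd operand disagree, even the defined even element is corrupted, which is why lowering to a zero-extend of the even vector is a safe choice when the odd vector is undef or poison.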

52796 Commits