llvm-project

Author	SHA1	Message	Date
Stephen Tozer	da0faa0594	[DebugInfo] Produce variadic DBG_INSTR_REFs from ISel This patch modifies SelectionDAG and FastISel to produce DBG_INSTR_REFs with variadic expressions, and produce DBG_INSTR_REFs for debug values with variadic location expressions. The former essentially means just prepending DW_OP_LLVM_arg, 0 to the existing expression. The latter is achieved in MachineFunction::finalizeDebugInstrRefs and InstrEmitter::EmitDbgInstrRef. Reviewed By: jmorse, Orlando Differential Revision: https://reviews.llvm.org/D133929	2023-01-09 08:58:33 +00:00
Serguei Katkov	fd64bd94ed	[Inline Spiller] Extend the snippet by statepoint uses A snippet is a tiny live interval which has a copy- or fill-like def and a copy- or spill-like use at the end (any of them might be absent). A snippet has only one use/def inside the interval, and the interval is located in one basic block. When the inline spiller spills some reg around uses, it also forces the spilling of connected snippets: those which were obtained by splitting the same original reg and whose def is a full copy of our reg or whose last use is a full copy to our reg. The definition of a snippet is extended to allow not only one use/def but more. However, all other uses are statepoint instructions which will fold the fill into their operand. That way we do not introduce new fills/spills. Reviewed By: qcolombet, dantrushin Differential Revision: https://reviews.llvm.org/D138093	2023-01-09 13:30:57 +07:00
Simon Pilgrim	ddab12d118	[X86] Add shuffle test coverage for Issue #59860	2023-01-08 19:06:06 +00:00
Stephen Tozer	c383f4d655	[DebugInfo] Allow non-stack_value variadic expressions and use in DBG_INSTR_REF Prior to this patch, variadic DIExpressions (i.e. ones that contain DW_OP_LLVM_arg) could only be created by salvaging debug values to create stack value expressions, resulting in a DBG_VALUE_LIST being created. As of the previous patch in this patch stack, DBG_INSTR_REF's syntax has been changed to match DBG_VALUE_LIST in preparation for supporting variadic expressions. This patch adds some minor changes needed to allow variadic expressions that aren't stack values to exist, and allows variadic expressions that are trivially reducible to non-variadic expressions to be handled similarly to non-variadic expressions. Reviewed by: jmorse Differential Revision: https://reviews.llvm.org/D133926	2023-01-06 19:31:10 +00:00
James Y Knight	1ae36b1387	Remove special cases for invoke of non-throwing inline-asm. Non-throwing inline asm infers the nounwind attribute in instcombine. Thus, it can be handled in the same manner as non-throwing target functions are generally. Further special casing is unnecessary complexity.	2023-01-06 13:53:10 -05:00
Stephen Tozer	e10e936315	[DebugInfo][NFC] Add new MachineOperand type and change DBG_INSTR_REF syntax This patch makes two notable changes to the MIR debug info representation, which result in different MIR output but identical final DWARF output (NFC w.r.t. the full compilation). The two changes are: * The introduction of a new MachineOperand type, MO_DbgInstrRef, which consists of two unsigned numbers that are used to index an instruction and an output operand within that instruction, having a meaning identical to the first two operands of the current DBG_INSTR_REF instruction. This operand is only used in DBG_INSTR_REF (see below). * A change in syntax for the DBG_INSTR_REF instruction, shuffling the operands to make it resemble DBG_VALUE_LIST instead of DBG_VALUE, and replacing the first two operands with a single MO_DbgInstrRef-type operand. This patch is the first of a set that will allow DBG_INSTR_REF instructions to refer to multiple machine locations in the same manner as DBG_VALUE_LIST. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D129372	2023-01-06 18:03:48 +00:00
Sanjay Patel	bf82070ea4	[SDAG] try to avoid multiply for X*Y==0 Forking this off from D140850 - https://alive2.llvm.org/ce/z/TgBeK_ https://alive2.llvm.org/ce/z/STVD7d We could almost justify doing this in IR, but consideration for "minsize" requires that we only try it in codegen -- the transform is not reversible. In all other cases, avoiding multiply should be a win because a mul is more expensive than simple/parallelizable compares. AArch even has a trick to keep instruction count even for some types. Differential Revision: https://reviews.llvm.org/D141086	2023-01-06 09:06:11 -05:00
Sanjay Patel	f58eedeeee	[x86] add tests for x*y == 0; NFC	2023-01-06 08:37:04 -05:00
Noah Goldstein	960bf8a454	[X86] Add tests for atomic bittest with register/memory operands Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D140938	2023-01-06 17:55:38 +08:00
Nikita Popov	e3c2faa64a	Revert "[X86] Revert -fno-plt __tls_get_addr workaround for old GNU ld" This reverts commit 2679e8bba3e166e3174971d040b9457ec7b7d768. This change is a significant backwards-compatibility break, which does in fact break the entire Rust ecosystem, which uses an -fno-plt -mrelax-relocations=0 default. Please go through pre-commit review for this change in order to gain broader consensus.	2023-01-06 09:43:47 +01:00
Noah Goldstein	a698790c51	[X86] Add additional tests to no-shift.ll Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D141076	2023-01-06 14:44:45 +08:00
Craig Topper	11e92bd61f	[SelectionDAG] Improve codegen for udiv by constant if any divisors are 1. If the divisor is 1, the magic algorithm does not return a correct result and we end up using a select to pick the numerator for those elements at the end. Therefore we can use undef for that element of the earlier operations when the divisor is 1. We sometimes get this through SimplifyDemandedVectorElts, but not always. Definitely seems like we don't if the NPQ fixup is used. Unfortunately, DAGCombiner is unable to fold srl X, <0, undef> to X so I had to add flags to avoid emitting the srl unless one of the shift amounts is non-zero. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D141022	2023-01-05 08:41:44 -08:00
Freddy Ye	27b8f54f51	[X86] Support -march=emeraldrapids Reviewed By: pengfei, skan Differential Revision: https://reviews.llvm.org/D140950	2023-01-05 20:27:32 +08:00
Nikita Popov	60442f0d44	[CodeGen] Convert some tests to opaque pointers (NFC) These are mostly MIR tests, which I did not handle during previous conversions.	2023-01-05 13:21:20 +01:00
Roman Lebedev	068033a2f7	[NFC][X86] Make vec_anyext.ll test non-useless	2023-01-05 01:12:30 +03:00
Philip Reames	9768a71a5e	[X86] Regen a couple tests so they are autogen clean [nfc] These appear to have had 32 bit check lines manually deleted - presumably since the checks are verbose. Please don't do this! Split the test file if you want, but manually deleting test lines makes the diffs for later autogen changes really confusing.	2023-01-04 12:17:32 -08:00
Philip Reames	56a40cd4ab	[X86] Autogen tests for ease of update in upcoming change [nfc]	2023-01-04 12:00:20 -08:00
Roman Lebedev	846d06c707	[DAG] `tryToFoldExtendOfConstant()`: `sext undef` is not `undef` https://alive2.llvm.org/ce/z/cLGpWV, but https://alive2.llvm.org/ce/z/TGNH4P	2023-01-04 22:42:43 +03:00
Philip Reames	5226077b21	[X86] Autogen tests for ease of update in upcoming change [nfc]	2023-01-04 11:30:49 -08:00
Roman Lebedev	e4b260efb2	[Codegen][X86] `LowerBUILD_VECTOR()`: improve lowering w/ multiple FREEZE-UNDEF ops While we have great handling for UNDEF operands, FREEZE-UNDEF operands are effectively normal operands. We are better off "interleaving" such BUILD_VECTORS into a blend between a splat of FREEZE-UNDEF, and "thawed" source BUILD_VECTOR, both of which are more natural for us to handle. Refs. `f738ab9075 (r95017306)`	2023-01-04 21:16:11 +03:00
Roman Lebedev	91f1c59fcd	[NFC][X86] Add few more tests for freezing BUILD_VECTOR	2023-01-04 21:16:11 +03:00
Amaury Séchet	ac17b6b963	[NFC] Autogenerate CodeGen/X86/sdiv-pow2.ll	2023-01-04 16:43:47 +00:00
Roman Lebedev	4fc417ec37	[DAGCombiner] `convertBuildVecZextToBuildVecWithZeros()`: rework split factor calculation The original computation was both making assumptions that do not hold in practice, and being overly pessimistic. We should just check every possible split factor, and pick the best one. Fixes https://github.com/llvm/llvm-project/issues/59781	2023-01-02 18:34:35 +03:00
Roman Lebedev	1337821f11	[DAGCombiner][X86] Fold a CONCAT_VECTORS of SHUFFLE_VECTOR and its operand into wider SHUFFLE_VECTOR This was showing as a source of many regressions with more aggressive ZERO_EXTEND_VECTOR_INREG recognition.	2023-01-01 23:18:42 +03:00
Roman Lebedev	a190b40861	[NFC][X86] Add tests for concatenation of shuffle's operand to the shuffle	2023-01-01 23:12:21 +03:00
Fangrui Song	2679e8bba3	[X86] Revert -fno-plt __tls_get_addr workaround for old GNU ld ENABLE_X86_RELAX_RELOCATIONS has defaulted to on since 2020. This workaround has not been exercised for a long time.	2022-12-31 22:39:20 -08:00
Roman Lebedev	16facf1ca6	[DAGCombiner][TLI] Do not fuse bitcast to <1 x ?> into a load/store of a vector Single-element vectors are legalized by splitting, so the memory operations would also get scalarized. While we do have some support to reconstruct scalarized loads, we clearly don't catch everything. The comment for the affected AArch64 store suggests that having two stores was the desired outcome in the first place. This was showing as a source of many regressions with more aggressive ZERO_EXTEND_VECTOR_INREG recognition.	2022-12-31 03:49:43 +03:00
Roman Lebedev	2480164247	[NFC][Codegen][x86] Add tests for load/store of single-element vectors	2022-12-31 03:23:24 +03:00
Roman Lebedev	e4d25a9c23	[DAG] BUILD_VECTOR: absorb ZERO_EXTEND of a single first operand if all other ops are zeros This kind of pattern seems to come up as regressions with better ZERO_EXTEND_VECTOR_INREG recognition. For initial implementation, this is quite restricted to the minimal viable transform, otherwise there are too many regressions to be dealt with.	2022-12-31 00:58:11 +03:00
Roman Lebedev	a35b216290	[NFC][X86] Add exhaustive-ish coverage for broadcast of implicitly aext/zext element Some of these even crash instruction selection for AVX512. This is one of the patterns that comes up as regressions with more aggressive ZERO_EXTEND_VECTOR_INREG recognition. https://godbolt.org/z/x88aqfrT5	2022-12-30 22:40:20 +03:00
Roman Lebedev	c823517ef5	[NFC][Codegen][X86] zero_extend_vector_inreg.ll: add SSE4.2 runline	2022-12-30 01:44:15 +03:00
Roman Lebedev	248567a327	[DAGCombiner] Try to partition ISD::EXTRACT_VECTOR_ELT to accommodate its ISD::BUILD_VECTOR users This mainly cleans up a few patterns that are legalized by scalarization from a wide-element vector, but then are further split apart to build a more narrow-sized-element vector. In particular this happens in some cases for illegal ISD::ZERO_EXTEND_VECTOR_INREG. Given an ISD::EXTRACT_VECTOR_ELT, which is a glorified bit sequence extract, recursively analyse all of its users, and try to model them as bit sequence extractions. If all of them agree on the new, narrower element type, and all of them can be modelled as ISD::EXTRACT_VECTOR_ELTs of that new element type, do that, but only if unmodelled users are ISD::BUILD_VECTOR.	2022-12-30 01:15:53 +03:00
Craig Topper	8abd70081f	[TargetLowering] Teach BuildUDIV to take advantage of leading zeros in the dividend. If the dividend has leading zeros, we can use them to reduce the size of the multiplier and avoid the fixup cases. This patch is for scalars only, but we might be able to do this for vectors in a follow up. Differential Revision: https://reviews.llvm.org/D140750	2022-12-29 13:58:46 -08:00
Roman Lebedev	778a7df50e	[NFC][Codegen][X86] Add exhaustive-ish test coverage for ZERO_EXTEND_VECTOR_INREG It should be possible to deduplicate AVX2 and AVX512F checklines, but I'm not sure which combination of check prefixes would do that. https://godbolt.org/z/sndT9n1nz	2022-12-29 03:18:01 +03:00
Thomas Köppe	82be8a1d2b	[X86] Emit RIP-relative access to local function in PIC medium code model Currently, the medium code model for x86_64 emits position-dependent relocations (R_X86_64_64) for local functions, regardless of PIC or no-PIC mode. (This means generically that code compiled with the medium model cannot be linked into a position-independent executable.) Example: ``` static int g(int n) { return 2 * n + 3; } void f(int(**p)(int)) { *p = g; } ``` This results in: ``` Disassembly of section .text: 0000000000000000 <f>: 0: 48 b8 00 00 00 00 00 00 00 00 movabs rax, 0x0 a: 48 89 07 mov qword ptr [rdi], rax d: c3 ret ``` ``` Relocation section '.rela.text' at offset 0xf0 contains 1 entries: Offset Info Type Symbol's Value Symbol's Name + Addend 0000000000000002 0000000200000001 R_X86_64_64 0000000000000000 .text + 10 ``` This patch changes the behaviour to unconditionally emit a RIP-relative access, both in PIC and non-PIC mode. This fixes PIC mode, and is perhaps an improvement in non-PIC mode, too, since it results in a shorter instruction. A 32-bit relocation should suffice since the medium memory model demands that all code fit within 2GiB. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D140593	2022-12-28 11:14:39 -08:00
Roman Lebedev	c4f815d705	[DAGCombine] `combineShuffleToZeroExtendVectorInReg()`: widen shuffle elements before trying to match We might have sunk a bitcast into shuffle, and now it might be operating on more fine-grained elements than what we'd match, so we must not be dependent on whatever the granularity the shuffle happened to be in, but transform it into the one canonical for us - with widest elements.	2022-12-27 00:47:45 +03:00
Roman Lebedev	cc051b0730	[NFC][X86] Add some tests that can be matched as ZERO_EXTEND_VECTOR_INREG	2022-12-27 00:41:59 +03:00
Roman Lebedev	e26e7ed69a	[DAG] `combineShuffleToZeroExtendVectorInReg()`: try to match w/ commuted operands We don't have any reason to expect that the operand we will match is on any particular hand of the shuffle, so we should try both.	2022-12-26 22:54:03 +03:00
Roman Lebedev	ec99bf2480	[NFC][Codegen][X86] Autogenerate check lines in shift-i256.ll	2022-12-24 19:26:42 +03:00
Roman Lebedev	110c5442b8	[NFC][Codegen] Add tests with oversized shifts by non-byte-multiple	2022-12-24 19:26:41 +03:00
Roman Lebedev	a9fbf25a14	[NFC][Codegen] Rename tests for oversized shifts by byte multiple	2022-12-24 19:26:41 +03:00
Roman Lebedev	387c1573f8	[NFC][Codegen] Tests with wide scalar shifts, for new potential legalization strategy	2022-12-24 00:47:25 +03:00
Roman Lebedev	aad725928d	[NFC][Codegen][X86] Add codegen test coverage for the variably-indexed load of alloca w/zero upper half	2022-12-23 20:16:41 +03:00
Roman Lebedev	03e848293e	[DAGCombiner] `visitFREEZE()`: fix cycle breaking Depending on the particular DAG, we might either create a `freeze`, or not. And only in the former case, the cycle would be formed. It would be nicer to have `ReplaceAllUsesOfValueWithIf()`, like we have in IR, but we don't have that. Fixes https://github.com/llvm/llvm-project/issues/59677	2022-12-23 18:16:22 +03:00
Roman Lebedev	d8f541efe7	[DAGCombiner] `visitFREEZE()`: fix handling of no maybe-poison ops The original code was confusing. It was stripping poison-generating flags, but the comments were saying that doing so was a TODO. If the poison-generating flags are present, then even if all operands are guaranteed not to be undef or poison, the whole operation may still produce undef or poison. We can still deal with that case, and we already do deal with it in fact, by also dropping those flags. Refs. https://github.com/llvm/llvm-project/issues/59676	2022-12-23 17:26:05 +03:00
Roman Lebedev	d7a63a0421	[DAGCombiner] `visitFREEZE()`: restore previous behaviour on no maybe-poison operands Lack of such operands implies that the op might be poison-producing due to its flags. We seem to drop them already, but the comments are confusing. Fixes https://github.com/llvm/llvm-project/issues/59676	2022-12-23 17:26:05 +03:00
Roman Lebedev	e7f21d750c	[NFC][Codegen][X86] Tests w/ final optimized IR of SROA-with-variably-indexed-loads (D140493) 32-byte ones are for consistency only, we really only care about up to 16-byte on 64-bit and maybe up to 8-byte on 32-bit. In 16byte ones, we are still having some redundant vec<->scalar traffic. https://reviews.llvm.org/D140493	2022-12-23 04:41:32 +03:00
Roman Lebedev	6fea27662d	[DAGCombiner] `visitFREEZE()`: be less greedy with replacing other uses of undef	2022-12-23 02:26:36 +03:00
Roman Lebedev	f738ab9075	[DAGCombiner] `visitFREEZE()`: allow multiple maybe-poison operands for `BUILD_VECTOR`	2022-12-23 02:26:36 +03:00
Roman Lebedev	1234754bbc	[DAGCombine] `BUILD_VECTOR` can not create undef or poison	2022-12-23 02:26:36 +03:00
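
Commit da0faa0594 above describes turning a non-variadic debug expression into a variadic one as "just prepending DW_OP_LLVM_arg, 0 to the existing expression". The following is a minimal sketch of that idea only, not LLVM's implementation: it assumes an expression can be modelled as a flat Python list of opcode names and integer operands, and the helper name `to_variadic` is hypothetical.

```python
# Hypothetical model: a debug expression is a flat list of opcode names
# (strings) and integer operands. This is NOT how LLVM stores DIExpressions.

def to_variadic(expr):
    """Return a variadic form of `expr` by prepending DW_OP_LLVM_arg, 0.

    An expression that already contains DW_OP_LLVM_arg is treated as
    variadic and returned as an unchanged copy (an assumption of this sketch).
    """
    if "DW_OP_LLVM_arg" in expr:
        return list(expr)
    return ["DW_OP_LLVM_arg", 0] + list(expr)


# Example: "value plus 8, then dereference" in the toy representation.
non_variadic = ["DW_OP_plus_uconst", 8, "DW_OP_deref"]
variadic = to_variadic(non_variadic)
print(variadic)
```

In this toy model the added `DW_OP_LLVM_arg, 0` pair selects the first (and only) location operand, so the rest of the expression keeps its original meaning.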


18577 Commits