llvm-project

Author	SHA1	Message	Date
Amara Emerson	95ac3d15e9	[AArch64][GlobalISel] Add G_VECREDUCE fewerElements support for full scalarization. For some reductions like G_VECREDUCE_OR on AArch64, we need to scalarize completely if the source is <= 64b. This change adds support for that in the legalizer. If the source has a pow-2 num elements, then we can do a tree reduction using the scalar operation in the individual elements. Otherwise, we just create a sequential chain of operations. For AArch64, we only need to scalarize if the input is <64b. If it's great than 64b then we can first do a fewElements step to 64b, taking advantage of vector instructions until we reach the point of scalarization. I also had to relax the verifier checks for reductions because the intrinsics support <1 x EltTy> types, which we lower to scalars for GlobalISel. Differential Revision: https://reviews.llvm.org/D108276	2021-08-19 16:38:52 -07:00
Fangrui Song	fbb8e772ec	[InstrProfiling] Make COFF use the ELF comdat scheme (drop link.exe compatibility) The COFF specific `DataReferencedByCode` complexity (D103372 D103717) is due to a link.exe limitation: an external symbol in IMAGE_COMDAT_SELECT_ASSOCIATIVE is not really dropped, so it can cause duplicate definition error.	2021-08-19 16:38:32 -07:00
Fangrui Song	2f7ea054f3	[test] Split icall.ll into comdat/nocomdat variants darwin/aix don't support comdat. Using IR comdat is incorrect.	2021-08-19 16:36:59 -07:00
Amara Emerson	a0051f7149	[AArch64][GlobalISel] Fix miscompile of <16 x s8> G_EXTRACT_VECTOR_ELT. When support for copying vector s8 lanes was added recently, this also had the side effect of fixing a fallback for <16 x s8> extracts since both used the same helper. However, there was a bug in another helper to get the regclass for a specific FPR-native type, which was assigning FPR16 to s8 instead of FPR8.	2021-08-19 16:22:32 -07:00
Thomas Lively	be6c49e743	[WebAssembly] Add explicit casts to silence -Wc++11-narrowing	2021-08-19 16:00:07 -07:00
Yonghong Song	c1169b8bd3	Revert "[DebugInfo] generate btf_tag annotations for DIComposite types" This reverts commit 2fded193e7a8fb5bd8fb339f00fd9de686390530. Builtbot reports some test failures. Revert now so I can take time to fix the issues.	2021-08-19 15:54:38 -07:00
Yonghong Song	2fded193e7	[DebugInfo] generate btf_tag annotations for DIComposite types Clang patch D106614 added attribute btf_tag support. This patch generates btf_tag annotations for DIComposite types. A field "annotations" is introduced to DIComposite, and the annotations are represented as an DINodeArray, similar to DIComposite elements. The following example illustrates how annotations are encoded in IR: distinct !DICompositeType(..., annotations: !10) !10 = !{!11, !12} !11 = !{!"btf_tag", !"a"} !12 = !{!"btf_tag", !"b"} Each btf_tag annotation is represented as a 2D array of meta strings. Each record may have more than one btf_tag annotations, as in the above example. Differential Revision: https://reviews.llvm.org/D106615	2021-08-19 15:37:44 -07:00
Thomas Lively	fd0557dbf1	[WebAssembly] More convert_low and promote_low codegen The convert_low and promote_low instructions can widen the lower two lanes of a four-lane vector, but we were previously scalarizing patterns that widened lanes besides the low two lanes. The commit adds a shuffle to move the widened lanes into the low lane positions so the convert_low and promote_low instructions can be used instead of scalarizing. Depends on D108266. Differential Revision: https://reviews.llvm.org/D108341	2021-08-19 15:37:12 -07:00
Thomas Lively	b311a040ef	[WebAssembly] Pattern match SIMD convert_low and promote_low during ISel Since the simplest DAG patterns for convert_low and promote_low instructions involved v2i32, v2f32, v4i64, and v4f64 types, which are not legal in the WebAssembly backend and would be eliminated by type legalization, we were previously matching those patterns in a DAG combine before the type legalization stage. However in cases where the vectors were wider than 128 bits, the patterns we matched were not created until the type legalization stage when the wide vectors were split up. Type legalization would continue to eliminate the illegal types we were matching as well, so the code ended up scalarized. To make the ISel for these instructions more robust, match the scalarized patterns rather than the patterns containing illegal types. Add tests with double-wide vectors to show that this works as intended. Fixes PR51098. Depends on D107502. Differential Revision: https://reviews.llvm.org/D108266	2021-08-19 15:24:28 -07:00
Akira Hatanaka	898dc4590c	Refactor inlineRetainOrClaimRVCalls. NFC This is in preparation for committing https://reviews.llvm.org/D103000.	2021-08-19 14:55:45 -07:00
Arthur Eubanks	7c8206cd2a	[NFC] Cleanup AttributeList::getStackAlignment() So that we don't use a confusing index.	2021-08-19 14:21:40 -07:00
Alexandre Rames	cd28003336	[Support] Update `MD5` to follow other hashes. Introduce `StringRef final()` and `StringRef result()`. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D107781	2021-08-19 14:13:14 -07:00
Arthur Eubanks	44a3241f10	[NFC] Replace some attribute methods that use confusing indexes	2021-08-19 14:10:26 -07:00
Alexandre Rames	10a126325d	[NFC][Support] Move `MD5` members in `InternalState`. This prepares an update to follow other hashes. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D108388	2021-08-19 14:04:14 -07:00
Florian Mayer	73323c6eaa	[hwasan] re-enable stack safety by default. The failed assertion was fixed in D108337. Reviewed By: vitalybuka, eugenis Differential Revision: https://reviews.llvm.org/D108381	2021-08-19 21:11:24 +01:00
Adrian Prantl	1e586bcc3e	Move function definition out-of-line to fix the modularized build (NFC)	2021-08-19 12:26:23 -07:00
Thomas Lively	b69374ca58	[WebAssembly] Legalize vector types by widening The default legalization of unsupported vector types is to promote the integers in each lane, which leads to extra sign or zero extending and masking when moving data into and out of vectors. Switch our preferred type legalization from the default to vector widening, which keeps the data in the low lanes of the vector rather than in the low bits of each lane. The unused high lanes can be ignored. Half-wide vectors are now loaded from memory into the low 64 bits of the v128 rather than spread out among the lanes. As a result, v128.load64_splat is a much more common operation, so add new patterns to support it. Differential Revision: https://reviews.llvm.org/D107502	2021-08-19 12:07:33 -07:00
Philip Reames	17b9cb1817	[runtimeunroll] Support multiple exits to latch exit w/prolog loop This patch extends the runtime unrolling infrastructure to support unrolling a loop with multiple exiting blocks branching to the same exit block used by the latch. It intentionally does not include a cost model change to enable this functionality unless appropriate force flags are used. This is the prolog companion to D107381. Since this was LGTMed, a problem with DT updating was reported against that patch. I roled in the analogous fix here as it seemed obvious, and not worth re-review. As an aside, our prolog form leaves a lot of potential value on the floor when there is an invariant load or invariant condition in the loop being runtime unrolled. We should probably consider a "required prolog" heuristic. (Alternatively, maybe we should be peeling these cases more aggressively?) Differential Revision: https://reviews.llvm.org/D108262	2021-08-19 11:43:52 -07:00
Stanislav Mekhanoshin	8d7d89b081	[AMDGPU] Add alias.scope metadata to lowered LDS struct Alias analysis is unable to disambiguate accesses to the structure fields without it unlike distinct variables. As a result we cannot combine ds_read and ds_write operations in a case of any store in between which always considered clobbering. Differential Revision: https://reviews.llvm.org/D108315	2021-08-19 11:40:30 -07:00
Nikita Popov	8cf5b69f69	[GuardWidening] Preserve MemorySSA As reported on https://bugs.llvm.org/show_bug.cgi?id=51020, the guard widening pass doesn't preserve MemorySSA, so it can no longer be scheduled in the same loop pass manager as LICM. However, the loop-schedule.ll test indicates that this is supposed to work. Fix this by preserving MemorySSA if available, as this seems to be trivial in this case (we only need to drop the memory access for the removed guards). Differential Revision: https://reviews.llvm.org/D108386	2021-08-19 20:23:17 +02:00
Philip Reames	447256f22b	[runtimeunroll] Fix reported DT verification error after 94d0914 In 94d0914, I added support for unrolling of multiple exit loops which have multiple exits reaching the latch. Per reports on the review post commit, I'd missed updating the domtree for one case. This fix addresses that ommission. There's no new test as this is covered by existing tests with expensive verification turned on.	2021-08-19 11:06:17 -07:00
Simon Pilgrim	5fa6039a5f	[SLP][X86] Add llvm.isnan intrinsic test coverage We still need to tag the llvm.isnan.? intrinsic as vectorizable	2021-08-19 18:56:23 +01:00
Simon Pilgrim	26ed14f413	[SLP][X86] Regenerate intrinsic.ll test checks	2021-08-19 18:56:22 +01:00
Tim Northover	edab411ee6	AArch64: copy all parts of the mem operand across when combining a store In particular we were dropping volatility, which can lead to unwanted transformations.	2021-08-19 18:26:39 +01:00
Simon Pilgrim	72ebcd3198	[CostModel][X86] Add isnan half/float/double costs tests	2021-08-19 18:07:06 +01:00
Chang-Sun Lin, Jr	9cae598f8b	[InstCombine] Avoid folding GEPs across loop boundaries Folding a GEP from outside to inside a loop will materialize an add where there wasn't an equivalent operation before. Check the containing loops before making this fold. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D107935	2021-08-19 20:03:44 +03:00
Chang-Sun Lin, Jr	0f0905605e	[NFC][InstCombine] Add test for one-use one-index geps in different loops	2021-08-19 20:03:44 +03:00
Arthur Eubanks	33d44b762e	[OpaquePtr][Inline] Use byval type instead of pointee type Reviewed By: #opaque-pointers, dblaikie Differential Revision: https://reviews.llvm.org/D105711	2021-08-19 09:56:08 -07:00
Owen Anderson	06a4c85890	Use v16i8 rather than v2i64 as the VT for memset expansion on AArch64. This allows the instruction selector to realize that it can directly broadcast the low byte of the memset value, rather than replicating it to a 64-bit GPR before broadcasting. This fixes PR50985. Differential Revision: https://reviews.llvm.org/D108354	2021-08-19 16:54:07 +00:00
Simon Pilgrim	94e1442d78	Fix unknown parameter Wdocumentation warnings. NFC.	2021-08-19 17:49:32 +01:00
Augie Fackler	e59c88294b	MemoryBuiltins: trailing , on collection literal This was probably bugging more than is reasonable, but it makes merging changes in this file slightly less annoying to have the trailing comma here. I only noticed this because Rust is currently carrying a patch to this file and it kept making life a little difficult.	2021-08-19 17:59:23 +02:00
Thomas Preud'homme	9d476f0af9	Fix CodeGen/X86/fsafdo_test2.ll fail in release Require debug build for CodeGen/X86/fsafdo_test2.ll since it checks for messages only printed in debug mode. Reviewed By: wenlei, hoy Differential Revision: https://reviews.llvm.org/D108364	2021-08-19 16:54:04 +01:00
Simon Pilgrim	ff69c65b05	Fix empty paragraph passed to parameter Wdocumentation warning. NFC.	2021-08-19 16:48:28 +01:00
Craig Topper	84cea602f9	Revert "[SelectionDAGBuilder] Compute and cache PreferredExtendType on demand." This reverts commit add08c874147638e52d89eb07e40797dbc98d73b. There was a compile time jump on tramp3d-v4 on https://llvm-compile-time-tracker.com/ Want to see if it goes away with this reverted.	2021-08-19 08:42:05 -07:00
Yaron Keren	1987eb9e9c	[docs] Document how to install sphinx and recommonmark on Ubuntu Differential Revision: https://reviews.llvm.org/D108374	2021-08-19 18:24:17 +03:00
David Green	d10f23a25d	[ISel] Expand saddsat and ssubsat via asr and xor This changes the lowering of saddsat and ssubsat so that instead of using: r,o = saddo x, y c = setcc r < 0 s = c ? INTMAX : INTMIN ret o ? s : r into using asr and xor to materialize the INTMAX/INTMIN constants: r,o = saddo x, y s = ashr r, BW-1 x = xor s, INTMIN ret o ? x : r https://alive2.llvm.org/ce/z/TYufgD This seems to reduce the instruction count in most testcases across most architectures. X86 has some custom lowering added to compensate for cases where it can increase instruction count. Differential Revision: https://reviews.llvm.org/D105853	2021-08-19 16:08:07 +01:00
Jinsong Ji	a9cc662722	[AIX] Remove XFAIL from macro-same-context We have enabled inline asm intergrated assembler support, this test is passing now.	2021-08-19 14:52:58 +00:00
Simon Pilgrim	87c8c8ae97	Fix unknown parameter Wdocumentation warnings. NFC.	2021-08-19 15:40:10 +01:00
Simon Pilgrim	9419729b6a	[CostModel][X86] Add VPOPCNTDQ/BITALG ctpop costs VPOPCNTDQ + BITALG add ctpop instructions for vXi64/vXi32 + vXi16/vXi8 vector types respectively	2021-08-19 15:40:09 +01:00
Craig Topper	add08c8741	[SelectionDAGBuilder] Compute and cache PreferredExtendType on demand. Previously we pre-calculated this and cached it for every instruction in the function. Most of the calculated results will never be used. So instead calculate it only on the first use, and then cache it. The cache was originally added to fix a compile time issue which caused r216066 to be reverted. This change exposed that we weren't pre-computing the Value for Arguments. I've explicitly disabled that for now as it seemed to regress some tests on AArch64 which has sext built into its compare instructions. Spotted while investigating how to improve heuristics to work better with RISCV preferring sign extend for unsigned compares for i32 on RV64. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D107976	2021-08-19 07:18:33 -07:00
Craig Topper	c60a4c1ba5	[TypePromotion] Use Instruction* instead of Value* for a couple functions. NFC This matches how they are called and allows some isa/cast/dyn_cast to be removed. Differential Revision: https://reviews.llvm.org/D108333	2021-08-19 07:09:38 -07:00
Craig Topper	36d8316cc8	[RISCV] Reduce duplicate code for calling SimplifyDemandedBits. This encapsulates the APInt creation and worklist management into a helper function. To keep one common interface I've use Log2_32 in places that previously created a mask by subtracting 1 from a power of 2. Differential Revision: https://reviews.llvm.org/D108324	2021-08-19 07:09:38 -07:00
David Green	765a421276	[ARM] Add MVE min/max intrinsic tests. NFC	2021-08-19 14:33:34 +01:00
Alexey Lapshin	ab9d506be3	[DWARF][Verifier][NFC] Use reference to DWARFAddressRangesVector to avoid copying. Avoid copying while access to RangesOrError.get().	2021-08-19 16:23:05 +03:00
Simon Pilgrim	2d60fdd7aa	[CostModel][X86] Add VPOPCNT/BITALG test coverage for ctpop/cttz costs	2021-08-19 14:05:58 +01:00
Ben Shi	b10e74389e	[RISCV][test] Improve tests for (add (mul x, c1), c2) Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D107710	2021-08-19 21:04:35 +08:00
Sanjay Patel	ec54e275f5	Revert "[CVP] processSwitch: Remove default case when switch cover all possible values." This reverts commit 9934a5b2ed5aa6e6bbb2e55c3cd98839722c226e. This patch may cause miscompiles because it missed a constraint as shown in the examples from: https://llvm.org/PR51531	2021-08-19 08:43:51 -04:00
Sanjay Patel	eee0ded337	[InstCombine] add min/max intrinsics as freely invertible candidates In the optimized test, we are able to peak through the min/max that has 2 min/max operands and invert them all: https://alive2.llvm.org/ce/z/7gYMN5	2021-08-19 08:41:38 -04:00
Sanjay Patel	610d3d512a	[InstCombine] add tests for min/max with inverts; NFC	2021-08-19 08:41:38 -04:00
Sanjay Patel	e10c3beca5	[InstCombine] add one-use check for min/max fold with not operands; NFC This makes the intrinsic logic match the cmp+select idiom folds just below. It's not clearly a win either way unless we think that a 'not' op costs more than min/max. The cmp+select folds on these patterns are more extensive than the intrinsics currently and may have some complicated interactions, so I'm trying to make those line up and bring the optimizations for intrinsics up to parity.	2021-08-19 08:41:38 -04:00

1 2 3 4 5 ...

220248 Commits