llvm-project

Author	SHA1	Message	Date
Jyotsna Verma	65a2f6ad9c	[Hexagon] Create an intrinsic to profile using a custom handler The intrinsic is lowered into a hexagon pseudo instruction which after register allocation is expanded into A2_tfrsi and J2_call.	2022-03-28 10:31:41 -05:00
Arthur Eubanks	16823adf2a	[test] Modify some tests to remove implicit -basic-aa in legacy PM RUN lines	2022-03-08 14:35:06 -08:00
Krzysztof Parzyszek	108910c667	[Hexagon] Handle v2f16 in build_vector in isel	2022-03-07 11:54:24 -08:00
Krzysztof Parzyszek	0792161c00	[Hexagon] Fix operation actions for v128f16 There were more cases of operations that should have been "Custom" for v128f16, but ended up "Legal" (e.g. load and store).	2022-02-08 15:28:37 -08:00
Krzysztof Parzyszek	7403c02f06	[Hexagon] Fix crash with shuffle_vector of v128f16	2022-02-08 13:05:22 -08:00
Krzysztof Parzyszek	c935f6e048	[Hexagon] Punt on registers without reaching defs in addr mode opt This fixes https://github.com/llvm/llvm-project/issues/52636.	2022-02-01 09:52:59 -08:00
Fangrui Song	e6cdef187e	[XRay][test] Clean up llc RUN lines	2022-01-21 17:00:03 -08:00
Pranav Bhandarkar	bde1032588	[Hexagon] Fix optimize address mode pass only handle BaseImmOffset mode This is a fix for a crash in the HexagonOptAddrMode pass that was looking for the third operand (offset) in the following instruction that does not, in fact, have a third operand: $r1 = L2_loadw_locked $r1 Additionally, this patch also adds an addrMode value to vgather pseudos in the Hexagon backend. Differential Revision: https://reviews.llvm.org/D117133	2022-01-14 15:45:23 -08:00
Nadav Rotem	e2cc091a7d	Fix a missed opportunity to merge stores. This commit fixes a missed opportunity in merging consecutive stores. The code that searches for stores skipped the case of stores that directly connect to the root. The comment above the implementation lists this case but the code did not handle it. I found this pattern when looking into the shared_ptr destructor. GCC generates the right sequence. Here is a small repro: int foo(int* buff) { buff[0] = 0; int x = buff[1]; buff[1] = 0; return x; } Differential Revision: https://reviews.llvm.org/D116895	2022-01-10 13:49:02 -08:00
Nikita Popov	f430c1eb64	[Tests] Add elementtype attribute to indirect inline asm operands (NFC) This updates LLVM tests for D116531 by adding elementtype attributes to operands that correspond to indirect asm constraints.	2022-01-06 14:23:51 +01:00
Ikhlas Ajbar	2819e5de42	[Hexagon] Handle instruction selection for select(I1,Q,Q) Lower select(I1,Q,Q) by converting vector predicate Q to vector register V, doing select(I1,V,V), and then converting the resulting V back to Q. Also, try to avoid creating such situations in the first place.	2022-01-05 14:50:12 -08:00
Shubham Pawar	41085357df	[Hexagon] Extend OptAddrMode pass to vgather This change extends the addressing mode optimization pass to HVX vgather. This is specifically intended to resolve compiler not generating indexed addresses for vgather stores to vtcm. Changed the vgather pseudo instructions to accept an immediate operand and handled addition of appropriate immediate operand in addressing mode optimization pass.	2022-01-05 08:44:21 -08:00
Sumanth Gundapaneni	822448635e	[Hexagon] Fix MachineSink not to hoist FP instructions that update USR. Ideally we should make USR as Def for these floating point instructions. However, it violates some assembler MCChecker rules. This patch fixes the issue by marking these FP instructions as non-sinkable.	2022-01-04 15:55:22 -08:00
SANTANU DAS	52f347010a	[Hexagon] Make A2_tfrsi not cheap for operands exceeding 16 bits This patch helps reduce code size, since it removes generation of back-to-back A2_tfrsi instructions. It is enabled only at -Os/-Oz.	2022-01-04 15:46:26 -08:00
Krzysztof Parzyszek	60944d132f	[Hexagon] Convert codegen testcase from .ll to .mir	2022-01-04 15:41:32 -08:00
Harsha Jagasia	2b1c6df5a6	[Hexagon] Performance regression with b2b For code below: { r7 = addasl(r3,r0,#2) r8 = addasl(r3,r2,#2) r5 = memw(r3+r0<<#2) r6 = memw(r3+r2<<#2) } { p1 = cmp.gtu(r6,r5) if (p1.new) memw(r8+#0) = r5 if (p1.new) memw(r7+#0) = r6 } { r0 = mux(p1,r2,r4) } In the packetizer, a new packet is created for the cmp instruction since there aren't enough resources in the previous packet. Also, it is determined that the cmp stalls by 2 cycles since it depends on the prior load of r5. In the current packetizer implementation, the predicated store is evaluated for whether it can go in the same packet as the compare, and since the compare stalls, the stall of the predicated store does not matter and it can go in the same packet as the cmp. However, the predicated store will stall for more cycles because of its dependence on the addasl instruction, and to avoid that stall we can put it in a new packet. Improve the packetizer to check if an instruction being added to a packet will stall longer than an instruction already in the packet and, if so, create a new packet.	2022-01-04 14:09:47 -08:00
Brendon Cahoon	db5b791595	[Hexagon] Fix an instruction move in HexagonVectorCombine The HexagonVectorCombine pass was moving an instruction incorrectly, which caused a use in a GEP that was not yet defined. HexagonVectorCombine removes a load from a group due to its dependences, but in realignGroup, the load is processed anyways. In realignGroup, when determining the maximum alignment, only those instructions still in the group should be considered.	2022-01-04 11:41:42 -08:00
Tasmia Rahman	e88eb6443f	[Hexagon] Fix buildVector32 for v4i8 constants The code for constructing a 32-bit constant from 4 8-bit constants has a typo and uses one of the constants twice	2022-01-04 11:19:15 -08:00
Krzysztof Parzyszek	78f5014fea	[Hexagon] Conversions to/from FP types, HVX and scalar Co-authored-by: Anirudh Sundar Subramaniam <quic_sanirudh@quicinc.com> Co-authored-by: Sumanth Gundapaneni <sgundapa@quicinc.com>	2022-01-04 11:03:51 -08:00
Krzysztof Parzyszek	db83e3e507	[Hexagon] Generate HVX/FP arithmetic instructions Co-authored-by: Anirudh Sundar Subramaniam <quic_sanirudh@quicinc.com> Co-authored-by: Sumanth Gundapaneni <sgundapa@quicinc.com> Co-authored-by: Joshua Herrera <joshherr@quicinc.com>	2021-12-30 12:47:30 -08:00
Krzysztof Parzyszek	9e6afbedb0	[Hexagon] Generate HVX/FP compare instructions Co-authored-by: Anirudh Sundar Subramaniam <quic_sanirudh@quicinc.com>	2021-12-30 12:17:22 -08:00
Krzysztof Parzyszek	e107374e40	[Hexagon] Explicitly use integer types when rescaling a mask	2021-12-30 10:14:00 -08:00
Krzysztof Parzyszek	eb574259b6	[Hexagon] Handle HVX/FP {masked,wide} loads/stores Co-authored-by: Rahul Utkoor <quic_rutkoor@quicinc.com> Co-authored-by: Anirudh Sundar Subramaniam <quic_sanirudh@quicinc.com>	2021-12-30 10:14:00 -08:00
Krzysztof Parzyszek	cd997689f2	[Hexagon] Fix isTypeForHVX to recognize floating point types Co-authored-by: Sumanth Gundapaneni <sgundapa@quicinc.com>	2021-12-30 10:01:05 -08:00
Krzysztof Parzyszek	23423638cc	[Hexagon] Handle HVX/FP shuffles, insertion and extraction Co-authored-by: Anirudh Sundar Subramaniam <quic_sanirudh@quicinc.com>	2021-12-30 08:44:10 -08:00
Krzysztof Parzyszek	95c7dd8810	Revert "[Hexagon] Don't build two halves of HVX vector in parallel" This reverts commit ba07f300c6d67a2c6dde8eef216b7a77ac4600bb. A build-vector sequence is made of pairs: rotate+insert. When constructing a single vector, this results in a chain of 2*N instructions. The rotate operation is a permute operation, but the insert uses a multiplication resource: insert and rotate can execute in the same cycle, but obviously they cannot operate on the same vector. The original halving idea is still beneficial since it does allow for insert/rotate overlap, and for hiding insert's latency.	2021-12-30 07:57:11 -08:00
Krzysztof Parzyszek	ba07f300c6	[Hexagon] Don't build two halves of HVX vector in parallel There can only be one permute operation per packet, so this actually pessimizes the code (due to the extra "or").	2021-12-29 11:00:01 -08:00
Joshua Herrera	505d57486e	[Hexagon] Improve BUILD_VECTOR codegen For vectors with repeating values, old codegen would rotate and insert every duplicate element. This patch replaces that behavior with a splat of the most common element; vinsert/vror occur only when needed.	2021-12-29 10:18:21 -08:00
Krzysztof Parzyszek	4df2aba294	[Hexagon] Calling conventions for floating point vectors They are the same as for the other HVX vectors, but types need to be listed explicitly. Also, add a detailed codegen testcase. Co-authored-by: Abhikrant Sharma <quic_abhikran@quicinc.com>	2021-12-29 09:01:07 -08:00
Krzysztof Parzyszek	2ce586bc49	[Hexagon] Handle floating point splats Co-authored-by: Anirudh Sundar Subramaniam <quic_sanirudh@quicinc.com>	2021-12-29 06:52:24 -08:00
Krzysztof Parzyszek	33fc675e16	[Hexagon] Handle floating point vector loads/stores	2021-12-29 05:52:39 -08:00
Krzysztof Parzyszek	6a6ac3b36f	[Hexagon] Support BUILD_VECTOR of floating point HVX vectors Co-authored-by: Anirudh Sundar Subramaniam <quic_sanirudh@quicinc.com> Co-authored-by: Ankit Aggarwal <aankit@quicinc.com>	2021-12-28 14:59:08 -08:00
Brian Cain	1e68c79987	Reapply [xray] add support for hexagon Adds x-ray support for hexagon to llvm codegen, clang driver, compiler-rt libs. Differential Revision: https://reviews.llvm.org/D113638 Reapplying this after 543a9ad7c460bb8d641b1b7c67bbc032c9bfdb45, which fixes the leak introduced there.	2021-12-10 05:32:28 -08:00
Brian Cain	ab28cb1c5c	Revert "[xray] add support for hexagon" This reverts commit 543a9ad7c460bb8d641b1b7c67bbc032c9bfdb45.	2021-12-09 07:30:40 -08:00
Brian Cain	543a9ad7c4	[xray] add support for hexagon Adds x-ray support for hexagon to llvm codegen, clang driver, compiler-rt libs. Differential Revision: https://reviews.llvm.org/D113638	2021-12-09 05:47:53 -08:00
Zarko Todorovski	dc9b5550b2	[NFC][llvm][Hexagon] Inclusive Terms remove uses of sanity in Hexagon target Most changes are rewording comments but there are some assertions that I rephrased. Reviewed By: kparzysz Differential Revision: https://reviews.llvm.org/D114132	2021-11-22 10:08:01 -05:00
Jay Foad	6abbc3a420	[LiveIntervals] Update subranges in processTiedPairs In TwoAddressInstructionPass::processTiedPairs when updating live intervals after moving the last use of RegB back to the newly inserted copy, update any affected subranges as well as the main range. Differential Revision: https://reviews.llvm.org/D110411	2021-11-11 12:24:59 +00:00
Jay Foad	9951d437d3	[Hexagon] Add machine verification to some tests	2021-11-02 15:41:30 +00:00
Max Kazantsev	8daf76935d	[Test] Regenerate some of llc test checks using auto updater	2021-10-28 16:18:30 +07:00
Jay Foad	20c0280733	[LiveIntervals] Repair subreg ranges in processTiedPairs In TwoAddressInstructionPass::processTiedPairs, update subranges of the live interval for RegB as well as the main range. This is a small step towards switching TwoAddressInstructionPass over from LiveVariables to LiveIntervals. Currently this path is only tested if you explicitly enable -early-live-intervals. Differential Revision: https://reviews.llvm.org/D110526	2021-09-28 08:10:16 +01:00
Jay Foad	e4e95f14f1	[LiveIntervals] Repair live intervals that gain subranges In repairIntervalsInRange, if the new instructions refer to subregs but the old instructions did not, make sure any existing live interval for the superreg is updated to have subranges. Also skip repairing any range that we have recalculated from scratch, partly for efficiency but also to avoid some cases that repairOldRegInRange can't handle. The existing test/CodeGen/AMDGPU/twoaddr-regsequence.mir provides some test coverage for this change: when TwoAddressInstructionPass converts REG_SEQUENCE into subreg copies, the live intervals will now get subranges and MachineVerifier will verify that the subranges are correct. Unfortunately MachineVerifier does not complain if the subranges are not present, so the test also passed before this patch. This patch also fixes ~800 of the ~1500 failures in the whole CodeGen lit test suite when -early-live-intervals is forced on. Differential Revision: https://reviews.llvm.org/D110328	2021-09-24 11:58:08 +01:00
Matt Arsenault	54d755a034	DAG: Fix incorrect folding of fmul -1 to fneg The fmul is a canonicalizing operation, and fneg is not so this would break denormals that need flushing and also would not quiet signaling nans. Fold to fsub instead, which is also canonicalizing.	2021-09-14 21:25:02 -04:00
Matt Arsenault	4a36e96c3f	RegAllocGreedy: Account for reserved registers in num regs heuristic This simple heuristic uses the estimated live range length combined with the number of registers in the class to switch which heuristic to use. This was taking the raw number of registers in the class, even though not all of them may be available. AMDGPU heavily relies on dynamically reserved numbers of registers based on user attributes to satisfy occupancy constraints, so the raw number is highly misleading. There are still a few problems here. In the original testcase that made me notice this, the live range size is incorrect after the scheduler rearranges instructions, since the instructions don't have the original InstrDist offsets. Additionally, I think it would be more appropriate to use the number of disjointly allocatable registers in the class. For the AMDGPU register tuples, there are a large number of registers in each tuple class, but only a small fraction can actually be allocated at the same time since they all overlap with each other. It seems we do not have a query that corresponds to the number of independently allocatable registers. Relatedly, I'm still debugging some allocation failures where overlapping tuples seem to not be handled correctly. The test changes are mostly noise. There are a handful of x86 tests that look like regressions with an additional spill, and a handful that now avoid a spill. The worst looking regression is likely test/Thumb2/mve-vld4.ll which introduces a few additional spills. test/CodeGen/AMDGPU/soft-clause-exceeds-register-budget.ll shows a massive improvement by completely eliminating a large number of spills inside a loop.	2021-09-14 21:00:29 -04:00
Brendon Cahoon	42dace9c5b	[Hexagon] Use getTypeAllocSize to compute difference between objects The code was using getTypeStoreSize to calculate the difference between consecutive objects. The calculation was incorrect due to padding that is added between consecutive objects. The getTypeAllocSize includes the padding amount. For example, if the type is [19 x i8], the difference between consecutive objects is 32 bytes, not 19 bytes. A second case for getTypeAllocSize is needed when computing the pointer values for the vector accesses. The calculation needs to account for the padding as well. Differential Revision: https://reviews.llvm.org/D109403	2021-09-13 19:04:59 -05:00
Ankit Aggarwal	a72763af67	[Hexagon] Handle bitcast of i64/i128 -> v64i1/v128i1	2021-09-13 18:52:30 -05:00
Nikita Popov	90ec6dff86	[OpaquePtr] Forbid mixing typed and opaque pointers Currently, opaque pointers are supported in two forms: The -force-opaque-pointers mode, where all pointers are opaque and typed pointers do not exist. And as a simple ptr type that can coexist with typed pointers. This patch removes support for the mixed mode. You either get typed pointers, or you get opaque pointers, but not both. In the (current) default mode, using ptr is forbidden. In -opaque-pointers mode, all pointers are opaque. The motivation here is that the mixed mode introduces additional issues that don't exist in fully opaque mode. D105155 is an example of a design problem. Looking at D109259, it would probably need additional work to support mixed mode (e.g. to generate GEPs for typed base but opaque result). Mixed mode will also end up inserting many casts between i8* and ptr, which would require significant additional work to consistently avoid. I don't think the mixed mode is particularly valuable, as it doesn't align with our end goal. The only thing I've found it to be moderately useful for is adding some opaque pointer tests in between typed pointer tests, but I think we can live without that. Differential Revision: https://reviews.llvm.org/D109290	2021-09-10 15:18:23 +02:00
Krzysztof Parzyszek	64d5b6e373	[Hexagon] Fix resetting dead registers in DBG_VALUE_LISTs This fixes https://llvm.org/PR51229.	2021-07-27 18:36:28 -05:00
Johannes Doerfert	25a3130d89	[Local] Do not introduce a new `llvm.trap` before `unreachable` This is the second attempt to remove the `llvm.trap` insertion after https://reviews.llvm.org/rGe14e7bc4b889dfaffb7180d176a03311df2d4ae6 reverted the first one. It is not clear what the exact issue was back then and it might already be gone by now, it has been >5 years after all. Replaces D106299. Differential Revision: https://reviews.llvm.org/D106308	2021-07-26 23:33:36 -05:00
David Green	4ce26deac2	[DAG] Reassociate Add with Or We already have reassociation code for Adds and Ors separately in DAG combiner, this adds it for the combination of the two where Ors act like Adds. It reassociates (add (or x, c), y) -> (add (add x, y), c) where we know that the Or's operands have no common bits set, and the Or has one use. Differential Revision: https://reviews.llvm.org/D104765	2021-07-07 10:21:07 +01:00
Krzysztof Parzyszek	94e01d579c	[Hexagon] Generate trap/undef if misaligned access is detected This applies to memory accesses to (compile-time) constant addresses (such as memory-mapped registers). Currently when a misaligned access to such an address is detected, a fatal error is reported. This change will emit a remark, and the compilation will continue with a trap, and "undef" (for loads) emitted. This fixes https://llvm.org/PR50838. Differential Revision: https://reviews.llvm.org/D50524	2021-07-06 14:52:23 -05:00
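
The rewrite described in the "[DAG] Reassociate Add with Or" entry rests on an arithmetic identity: when x and c share no set bits, x | c equals x + c, so (x | c) + y can be regrouped as (x + y) + c. The following is a minimal C sketch of that identity only, not the DAG combiner code; the function names add_of_or and reassociated are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Original shape: (add (or x, c), y). */
uint32_t add_of_or(uint32_t x, uint32_t c, uint32_t y) {
    return (x | c) + y;
}

/* Reassociated shape: (add (add x, y), c).
 * The two shapes agree only when x and c have no common bits set,
 * so that precondition is asserted here (illustrative check only). */
uint32_t reassociated(uint32_t x, uint32_t c, uint32_t y) {
    assert((x & c) == 0);
    return (x + y) + c;
}
```

For example, with x = 0xF0, c = 0x0F and y = 5, both functions return 0x104. Unsigned arithmetic is used so that wraparound behaves identically in both forms.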

993 Commits