llvm-project

Author	SHA1	Message	Date
Ilya Leoshkevich	8e810dc7d9	[SystemZ] Support builtin_{frame,return}_address() with non-zero argument (#69405) When the code is built with -mbackchain, it is possible to retrieve the caller's frame and return addresses. GCC already can do this, add this support to Clang as well. Use RISCVTargetLowering and GCC's s390_return_addr_rtx() as inspiration. Add tests based on what GCC is emitting.	2023-10-18 19:05:31 +02:00
Yusra Syeda	6cf41ada44	[SystemZ][z/OS] Add vararg support to z/OS (#68834) This PR adds vararg support to z/OS and updates the call-zos-vararg.ll lit test. Co-authored-by: Yusra Syeda <yusra.syeda@ibm.com>	2023-10-12 12:42:55 +02:00
Jay Foad	7b3bbd83c0	Revert "[CodeGen] Really renumber slot indexes before register allocation (#67038)" This reverts commit 2501ae58e3bb9a70d279a56d7b3a0ed70a8a852c. Reverted due to various buildbot failures.	2023-10-09 12:31:32 +01:00
Jay Foad	2501ae58e3	[CodeGen] Really renumber slot indexes before register allocation (#67038) PR #66334 tried to renumber slot indexes before register allocation, but the numbering was still affected by list entries for instructions which had been erased. Fix this to make the register allocator's live range length heuristics even less dependent on the history of how instructions have been added to and removed from SlotIndexes's maps.	2023-10-09 11:44:41 +01:00
Kai Nacke	42de2b7e99	[SystemZ/z/OS] Add library names for intrinsics (#68114) On z/OS, many library functions have a non-standard name. This change initializes the table of runtime function which results from lowering intrinsics to library calls.	2023-10-03 18:53:52 +03:00
Noah Goldstein	de7881ebf5	[DAGCombiner] Combine `(select c, (and X, 1), 0)` -> `(and (zext c), X)` The middle end canonicalizes: `(and (zext c), X)` -> `(select c, (and X, 1), 0)` But the `and` + `zext` form gets better codegen.	2023-09-28 13:46:46 -05:00
Jay Foad	44e997a158	[TwoAddressInstruction] Use isPlainlyKilled in processTiedPairs (#65976) Calling isPlainlyKilled instead of directly checking for a kill flag should make processTiedPairs behave the same with LiveIntervals (i.e. when compiling with -early-live-intervals) as it does with LiveVariables.	2023-09-19 16:44:20 +01:00
Jay Foad	e0919b189b	[CodeGen] Renumber slot indexes before register allocation (#66334) RegAllocGreedy uses SlotIndexes::getApproxInstrDistance to approximate the length of a live range for its heuristics. Renumbering all slot indexes with the default instruction distance ensures that this estimate will be as accurate as possible, and will not depend on the history of how instructions have been added to and removed from SlotIndexes's maps. This also means that enabling -early-live-intervals, which runs the SlotIndexes analysis earlier, will not cause large amounts of churn due to different register allocator decisions.	2023-09-19 11:18:12 +01:00
Jay Foad	0528dbfe5c	Add some -early-live-intervals RUN lines (#66058) This adds test coverage for an upcoming change to TwoAddressInstructionPass::processTiedPairs.	2023-09-12 13:06:10 +01:00
Neumann Hon	d00f59893e	[SystemZ][z/OS] Fix the entry point marker for leaf functions The function emitFunctionEntryLabel does not look at whether or not a function is a leaf when setting the entry flags, and instead blindly marks all functions as non-leaf routines. Differential Revision: https://reviews.llvm.org/D157701 Reviewed By: uweigand	2023-08-23 09:50:01 -04:00
Neumann Hon	43207225b6	Revert "[SystemZ][z/OS] Fix the entry point marker for leaf functions" This reverts commit 8af297bbb8e97de8908b857eae1a44f46a0d5afe. Testcase LLVM :: MC/GOFF/ppa1.ll needs to be updated to account for this.	2023-08-20 22:04:02 -04:00
Neumann Hon	8af297bbb8	[SystemZ][z/OS] Fix the entry point marker for leaf functions The function emitFunctionEntryLabel does not look at whether or not a function is a leaf when setting the entry flags, and instead blindly marks all functions as non-leaf routines. Change it to check if a function is a leaf function and mark it accordingly.	2023-08-20 21:53:13 -04:00
Neumann Hon	3e139be29f	[SystemZ][z/OS] Add support for function name field of PPA1 This PR causes the PPA1 to emit the function's name if it exists. This field is not emitted for unnamed functions. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D157494	2023-08-10 04:40:19 -04:00
Jay Foad	56d92c1758	[MachineScheduler] Track physical register dependencies per-regunit Change the scheduler's physical register dependency tracking from registers-and-their-aliases to regunits. This has a couple of advantages when subregisters are used: - The dependency tracking is more accurate and creates fewer useless edges in the dependency graph. An AMDGPU example, edited for clarity: SU(0): $vgpr1 = V_MOV_B32 $sgpr0 SU(1): $vgpr1 = V_ADDC_U32 0, $vgpr1 SU(2): $vgpr0_vgpr1 = FLAT_LOAD_DWORDX2 $vgpr0_vgpr1, 0, 0 There is a data dependency on $vgpr1 from SU(0) to SU(1) and from SU(1) to SU(2). But the old dependency tracking code also added a useless edge from SU(0) to SU(2) because it thought that SU(0)'s def of $vgpr1 aliased with SU(2)'s use of $vgpr0_vgpr1. - On targets like AMDGPU that make heavy use of subregisters, each register can have a huge number of aliases - it can be quadratic in the size of the largest defined register tuple. There is a much lower bound on the number of regunits per register, so iterating over regunits is faster than iterating over aliases. The LLVM compile-time tracker shows a tiny overall improvement of 0.03% on X86. I expect a larger compile-time improvement on targets like AMDGPU. Recommit after fixing AggressiveAntiDepBreaker in D156880. Differential Revision: https://reviews.llvm.org/D156552	2023-08-07 15:41:40 +01:00
Jay Foad	e2e3f06813	Revert "[MachineScheduler] Track physical register dependencies per-regunit" This reverts commit 1a54671d5405a39de362e9692ce963c0638023bc. It was causing lit test failures in a LLVM_ENABLE_EXPENSIVE_CHECKS build.	2023-07-29 18:05:25 +01:00
Jay Foad	1a54671d54	[MachineScheduler] Track physical register dependencies per-regunit Change the scheduler's physical register dependency tracking from registers-and-their-aliases to regunits. This has a couple of advantages when subregisters are used: - The dependency tracking is more accurate and creates fewer useless edges in the dependency graph. An AMDGPU example, edited for clarity: SU(0): $vgpr1 = V_MOV_B32 $sgpr0 SU(1): $vgpr1 = V_ADDC_U32 0, $vgpr1 SU(2): $vgpr0_vgpr1 = FLAT_LOAD_DWORDX2 $vgpr0_vgpr1, 0, 0 There is a data dependency on $vgpr1 from SU(0) to SU(1) and from SU(1) to SU(2). But the old dependency tracking code also added a useless edge from SU(0) to SU(2) because it thought that SU(0)'s def of $vgpr1 aliased with SU(2)'s use of $vgpr0_vgpr1. - On targets like AMDGPU that make heavy use of subregisters, each register can have a huge number of aliases - it can be quadratic in the size of the largest defined register tuple. There is a much lower bound on the number of regunits per register, so iterating over regunits is faster than iterating over aliases. The LLVM compile-time tracker shows a tiny overall improvement of 0.03% on X86. I expect a larger compile-time improvement on targets like AMDGPU. Differential Revision: https://reviews.llvm.org/D156552	2023-07-29 15:34:53 +01:00
Kevin P. Neal	58ad5699e7	[FPEnv][SystemZ] Correct strictfp tests. Correct a SystemZ strictfp test to follow the rules documented in the LangRef: https://llvm.org/docs/LangRef.html#constrained-floating-point-intrinsics This test, like many others, just needed the function definition corrected. Test changes verified with D146845.	2023-07-26 09:08:46 -04:00
Amaury Séchet	872276de4b	[NFC] Autogenerate CodeGen/SystemZ/int-{uadd,sub}-0*.ll	2023-07-05 20:14:43 +00:00
Yusra Syeda	163aad6bcb	[SystemZ][z/OS] z/OS ADA codegen and emission This patch adds support for the ADA (associated data area), doing the following: -Creates the ADA table to handle displacements -Emits the ADA section in the SystemZAsmPrinter -Lowers the ADA_ENTRY node into the appropriate load instruction Differential Revision: https://reviews.llvm.org/D153788	2023-07-05 13:21:52 -04:00
Yusra Syeda	1bfdc534aa	Revert "[SystemZ][z/OS] This patch adds support for the ADA (associated data area), doing the following:" This reverts commit 9df0f66af5462e23216eae31aedbd4d2f459cc3d.	2023-06-28 11:18:12 -04:00
Yusra Syeda	9df0f66af5	[SystemZ][z/OS] This patch adds support for the ADA (associated data area), doing the following: - Creates the ADA table to handle displacements - Emits the ADA section in the SystemZAsmPrinter - Lowers the ADA_ENTRY node into the appropriate load instruction Differential Revision: https://reviews.llvm.org/D153788	2023-06-28 10:13:10 -04:00
Matt Arsenault	80e2c26dfd	RegisterCoalescer: Fix name of pass I finally snapped and fixed this inconsistency.	2023-06-21 10:30:43 -04:00
Amaury Séchet	0a76f7d9d8	[NFC] Autogenerate numerous SystemZ tests	2023-06-14 21:47:31 +00:00
Neumann Hon	8a7a2da18f	[SystemZ][z/OS] Correct value of length/4 of params field in PPA1. The Length/4 of Params field in the PPA1 ought to be the length of the parameters for the current function. Currently we are storing the length of the parameter area in the current function's stack frame, which represents the length of the params of the longest callee in the current function. Differential Revision: https://reviews.llvm.org/D152920 Reviewed By: uweigand	2023-06-14 13:37:46 -04:00
Neumann Hon	049324ac5e	Revert "[SystemZ][z/OS] Correct value of length/4 of params field in PPA1." This reverts commit e0f7b0e0f704dc3759925602e474b9e669270fcb.	2023-06-14 13:34:16 -04:00
Neumann Hon	e0f7b0e0f7	[SystemZ][z/OS] Correct value of length/4 of params field in PPA1. The Length/4 of Params field in the PPA1 ought to be the length of the parameters for the current function. Currently we are storing the length of the parameter area in the current function's stack frame, which represents the length of the params of the longest callee in the current function. Differential revision: https://reviews.llvm.org/D119049 Reviewed By: uweigand	2023-06-14 13:20:45 -04:00
Amaury Séchet	a70d5e25f3	[DAGCombine] Make sure combined nodes are added back to the worklist in topological order. Currently, a node and its users are added back to the worklist in reverse topological order after it is combined. This diff changes that order to be topological. This is part of a larger migration to get the DAGCombiner to process nodes in topological order. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D127115	2023-06-13 09:14:37 +00:00
JP Lehr	c9998ec145	Revert "[DAGCombine] Make sure combined nodes are added back to the worklist in topological order." This reverts commit e69fa03ddd85812be3143d79a0359c3e8d43bd45. This patch led to build timeouts on the AMDGPU OpenMP runtime buildbot.	2023-06-05 10:55:58 -04:00
Amaury Séchet	e69fa03ddd	[DAGCombine] Make sure combined nodes are added back to the worklist in topological order. Currently, a node and its users are added back to the worklist in reverse topological order after it is combined. This diff changes that order to be topological. This is part of a larger migration to get the DAGCombiner to process nodes in topological order. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D127115	2023-06-05 11:09:18 +00:00
Tobias Hieta	f84bac329b	[NFC][Py Reformat] Reformat lit.local.cfg python files in llvm This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0 since I forgot the lit.local.cfg files in that one. Reformatting is done with `black`. If you end up having problems merging this commit because you have made changes to a python file, the best way to handle that is to run git checkout --ours <yourfile> and then reformat it with black. If you run into any problems, post to discourse about it and we will try to help. RFC Thread below: https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style Reviewed By: barannikov88, kwk Differential Revision: https://reviews.llvm.org/D150762	2023-05-17 17:03:15 +02:00
Tobias Hieta	b71edfaa4e	[NFC][Py Reformat] Reformat python files in llvm This is the first commit in a series that will reformat all the python files in the LLVM repository. Reformatting is done with `black`. See more information here: https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style Reviewed By: jhenderson, JDevlieghere, MatzeB Differential Revision: https://reviews.llvm.org/D150545	2023-05-17 10:48:52 +02:00
Jonas Paulsson	64599ac97e	[MachineSink] Don't reject sinking because of dead def in isProfitableToSinkTo(). An instruction should be sunk (if otherwise legal and profitable) regardless of if it has a dead def of a physreg or not. Physreg defs are checked in other places and sinking is only done with dead defs of regs that are not live into the target MBB. Differential Revision: https://reviews.llvm.org/D150447 Reviewed By: sebastian-ne, arsenm	2023-05-16 10:00:44 +02:00
Neumann Hon	80c643c464	[SystemZ][z/OS] Save (and restore) R3 to avoid clobbering parameter when call stack frame extension is invoked When the stack frame extension routine is used, the contents of r3 are overwritten. However, if r3 is live in the prologue (i.e., one of the function's parameters resides in r3), it needs to be saved. We save r3 in r0 if r0 is available (i.e., r0 is not used as temporary storage for r4), and in the corresponding stack slot for the third parameter otherwise. Differential Revision: https://reviews.llvm.org/D150332 Reviewed By: uweigand	2023-05-12 09:32:04 -04:00
Neumann Hon	39b8af47fc	Revert "[SystemZ][z/OS] Save (and restore) R3 to avoid clobbering parameter when call stack frame extension is invoked" This reverts commit 1aec3d15aaa25c39fae026688708d7353d488974.	2023-05-11 22:32:16 -04:00
Neumann Hon	1aec3d15aa	[SystemZ][z/OS] Save (and restore) R3 to avoid clobbering parameter when call stack frame extension is invoked When the stack frame extension routine is used, the contents of r3 are overwritten. However, if r3 is live in the prologue (i.e., one of the function's parameters resides in r3), it needs to be saved. We save r3 in r0 if r0 is available (i.e., r0 is not used as temporary storage for r4), and in the corresponding stack slot for the third parameter otherwise. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D150332	2023-05-11 21:25:05 -04:00
Zequan Wu	3977b77a6b	[CodeGen] Fix nomerge attribute not working in tail calls. In D79537, `nomerge` was made to only apply to non-tail calls. This fixes it by also applying it to tail calls. For ARM, I only made the new MI to inherit the flag under `TCRETURNdi` and `TCRETURNri`, because that's the place tail calls got replaced. Not sure if there's any other place needed. Fixes #61545. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D146749	2023-05-10 14:25:11 -04:00
Jonas Paulsson	655f0fc4b9	Reapply "[SystemZ] Bugfix in expansion of memmem operations." The new test case showed that the NoPHIs flag needs to be cleared. Original commit message: [SystemZ] Bugfix in expansion of memmem operations. Since NC, OC, and XC clobber CC, the EXRL_Pseudo targeting these must also be marked to do so. Original patch by uweigand. Reviewed by: uweigand Differential Revision: https://reviews.llvm.org/D150251 Fixes: https://github.com/llvm/llvm-project/issues/62572	2023-05-10 12:40:57 +02:00
Jonas Paulsson	dfa42a69b8	Revert "[SystemZ] Bugfix in expansion of memmem operations." Sorry - mir test fails with expensive checks on build bot. Seems to relate to the fact that there are no PHIs in the .mir input, but after they are created the verifier reports "Found PHI instruction with NoPHIs property set". This reverts commit 00454a17f361d677d5423905c888daca1a80661a.	2023-05-10 11:34:55 +02:00
Jonas Paulsson	00454a17f3	[SystemZ] Bugfix in expansion of memmem operations. Since NC, OC, and XC clobber CC, the EXRL_Pseudo targeting these must also be marked to do so. Original patch by uweigand. Reviewed by: uweigand Differential Revision: https://reviews.llvm.org/D150251 Fixes: https://github.com/llvm/llvm-project/issues/62572	2023-05-10 11:05:13 +02:00
Zequan Wu	321d02cc6b	[NFC] Update CodeGen/*/nomerge.ll tests with utils/update_llc_test_checks.py. Precommit this patch for better diff view on D146749. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D147454	2023-04-03 19:52:39 -04:00
Jonas Paulsson	b4b4950f7f	[SystemZ] Allow fp/int casting with inline assembly operands. Support bitcasting between int/fp/vector values and 'r'/'f'/'v' inline assembly operands. This is intended to match GCC's behavior. Reviewed By: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D146059	2023-03-24 19:57:25 +01:00
Carl Ritson	e7c1b4b64c	[SystemZ] Fix modelling of composed subreg indices. A rare case where coalescing resulted in a hh32 (high32 of high64 of vector register) subreg usage caused getSubReg() to fail as the vector reg does not have that subreg in its subregs list, but rather h32 which was expected to also act as hh32. See link below for the discussion when solving this. Patch By: critson Reviewed By: uweigand Fixes: https://github.com/llvm/llvm-project/issues/61390	2023-03-21 16:39:22 +01:00
Nikita Popov	687b5b9a0c	[SCEVExpander] Always use scevgep as name With opaque pointers the scevgep / uglygep distinction no longer makes sense -- GEPs are always emitted in offset-based representation.	2023-03-17 14:27:03 +01:00
Jonas Paulsson	f8803919ad	[SystemZ] Clear NW flags on an ISD::SUB when reused as comparison. The SystemZ backend will try to reuse an existing subtraction of two values whenever they are to be compared for equality. This depends on the SystemZ subtraction instruction setting the condition code, which can also signal overflow. A later pass will remove the compare and reuse the CC from the subtraction directly. However, if that subtraction has the NSW flag set it will not include the overflow bit in the updated CC user. That was a bug which can lead to wrong results, as shown by a csmith program. Fixes: https://github.com/llvm/llvm-project/issues/61268 Reviewed By: nikic, uweigand Differential Revision: https://reviews.llvm.org/D145811	2023-03-14 19:46:41 +01:00
Simon Pilgrim	723b6cf7a8	[DAG] visitFREEZE - handle case where the folded node merges with another existing node Fixes #60413	2023-02-04 20:53:47 +00:00
Sanjay Patel	fb3e3ef62e	[SDAG] fix miscompiles caused by using ValueTracking matchSelectPattern to create FMINIMUM/FMAXIMUM ValueTracking attempts to match compare+select patterns to FP min/max operations, but it was created before the newer IEEE-754-2019 minimum/maximum ops were defined. Ie, matchSelectPattern() does not account for the -0.0/+0.0 behavior that is specified in the newer standard. FMINIMUM/FMAXIMUM nodes were created to map to the newer standard: /// FMINIMUM/FMAXIMUM - NaN-propagating minimum/maximum that also treat -0.0 /// as less than 0.0. While FMINNUM_IEEE/FMAXNUM_IEEE follow IEEE 754-2008 /// semantics, FMINIMUM/FMAXIMUM follow IEEE 754-2018 draft semantics. We could adjust ValueTracking to deal with signed zero, but it seems like a moot point given the divergent NaN behavior discussed in D143056, so just delete this possibility to avoid bugs when converting IR to SDAG. Differential Revision: https://reviews.llvm.org/D143106	2023-02-03 09:53:47 -05:00
Jonas Paulsson	0ece2050da	[SystemZ] Implement isGuaranteedNotToBeUndefOrPoisonForTargetNode(). Returning true from this method for PCREL_WRAPPER and PCREL_OFFSET avoids problems when a PCREL_OFFSET node ends up with a freeze operand, which is not handled or expected by the backend. Fixes #60107 Reviewed By: uweigand, RKSimon Differential Revision: https://reviews.llvm.org/D142971	2023-02-01 13:28:18 +01:00
Jonas Paulsson	7fd3ed9ad7	[SystemZ] Add atomicrmw tests for i128 (NFC). Review: Ulrich Weigand	2023-01-26 19:20:59 +01:00
Jonas Paulsson	a9c5a98f81	[SystemZ] Improvement in tryRxSBG(). Only allow replacements of nodes that have a single user. This is better as simple instructions (e.g. XGRK) are one cycle faster, and it helps in cases where both inputs share a common node. Review: Ulrich Weigand	2023-01-19 10:43:52 -06:00
Tulio Magno Quites Machado Filho	1136cf1721	[SystemZ] Implement lowering of GET_ROUNDING Add support for _FLT_ROUNDS_ in SystemZ. Patch by Tulio Magno Quites Machado Filho. Reviewed By: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D140988	2023-01-18 14:41:19 -06:00

877 Commits