llvm-project

Author	SHA1	Message	Date
Tony Tao	019ecdf7bb	[SystemZ][GOFF] Implement emitGlobalAlias for GOFF/HLASM (#180041 ) HLASM has a requirement where aliasing labels need to be emitted at the same time as the aliasee label, similar to AIX. I used their implementation for reference with some modifications as we can only alias functions and we must emit all symbol attributes before the label is emitted to ensure the XATTR instruction contains the correct attributes. --------- Co-authored-by: Tony Tao <tonytao@ca.ibm.com>	2026-02-06 14:49:25 -05:00
Kai Nacke	f3bd1b9526	[SystemZ][z/OS] Use the text section for jump tables (#179793 ) Jump tables are read only data, and the text section is the best choice for them.	2026-02-05 08:18:17 -05:00
sujianIBM	8b28f5229e	[SystemZ][z/OS] Reverse the order of instructions to save and restore CSRs (#179540 ) Reverse the order of instructions to save and restore CSRs so instruction on small numbered reg goes first.	2026-02-04 11:48:09 -05:00
sujianIBM	bc80d1ac0c	[SystemZ][z/OS] Set R5 as not restored. (#179666 ) R5 (environment register) should not be restored. This is missing in the code. Add it back and also add a test to verify it.	2026-02-04 10:57:24 -05:00
Tony Tao	637a038c04	[SystemZ][GOFF] Implement lowerConstant (#179394 ) Implement lowerConstants for SystemZ and handle special cases where entries need to be created in the ADA for static functions or VCon for externals. --------- Co-authored-by: Tony Tao <tonytao@ca.ibm.com>	2026-02-04 10:03:34 -05:00
Kai Nacke	8a895b3151	[GOFF] Add emission of debug sections (#178677 ) This PR adds the definition of the debug sections for emission into GOFF files. Currently, there is no debugger available which supports all the sections. However, they all must be defined to avoid regression in LIT test cases.	2026-02-03 14:38:24 -05:00
Jonas Paulsson	09f9a2892a	[SystemZ] Bugfix: Add VLR16 to SystemZInstrInfo::copyPhysReg(). (#178932 ) Support COPYs involving higher FP16 regs (like F24H) with a new pseudo instruction 'VLR16'. This is needed with -O0/regalloc=fast, and probably in more cases as well. Fixes #178788.	2026-01-30 14:55:07 -06:00
Anikesh Parashar	fd45140ed6	[DAG] SimplifyDemandedBits - ICMP_SLT(X,0) - only sign mask of X is required (#164946 ) Resolves #164589	2026-01-28 17:30:23 +00:00
Dominik Steenken	355898a6ce	[SystemZ] Enable -fpatchable-function-entry=M,N (#178191 ) This PR enables the option `-fpatchable-function-entry` for SystemZ. It utilizes existing common code and just adds the emission of nops after the function label in the backend. SystemZ provides multiple nop options of varying length, making the semantics of this option somewhat ambiguous. In order to align with what `gcc` does with that same option, we're choosing `nopr` as the canonical nop for this purpose. For testing, this adapts an existing test file from aarch64.	2026-01-28 10:42:54 +01:00
Jonas Paulsson	c999e9a4fe	[SystemZ] Support fp16 vector ABI and basic codegen. (#171066 ) - Make v8f16 a legal type so that arguments can be passed in vector registers. Handle fp16 vectors so that they have the same ABI as other fp vectors. - Set the preferred vector action for fp16 vectors to "split". This will scalarize all operations, which is not always necessary (like with memory operations), but it avoids the superfluous operations that result after first widening and then scalarizing a narrow vector (like v4f16). Fixes #168992	2026-01-26 13:42:25 -06:00
Amy Kwan	41567d8ec2	[SystemZ] Implement ctor/dtor emission via @@SQINIT and .xtor sections (#171476 ) This patch implements support for constructors/destructors by introducing the `@@SQINIT` section and emitting `.xtor.<priority>` sections within the SystemZ AsmPrinter and in the GOFF object lowering layer.	2026-01-23 13:29:08 -05:00
Jonas Paulsson	8eccda10d2	[SystemZ] Add SP alignment to the DataLayout string. (#176041 ) Add '-S64' to the SystemZ datalayout string, to avoid overalignment of stack objects. Fixes #173402	2026-01-20 09:54:47 -06:00
Mikhail Gudim	40a28769a4	[ReachingDefAnalysis] Print basic blocks. (#175568 )	2026-01-14 06:29:31 -08:00
Kai Nacke	84bbaa097c	[SystemZ][z/OS] Handle labels for parts (#175665 ) Global data is emitted into parts, which are modelled as an MCSection. A label (symbol of type LD) is not allowed in a part, which requires special handling. The approach is to not emit the label at all, and to use the part symbol in relocations.	2026-01-13 09:15:27 -05:00
Trevor Gross	e7f23b410b	[SystemZ] Remove the `softPromoteHalfType` override (#175410 ) `softPromoteHalfType` is being phased out because it is prone to miscompilations (further context at [1]). SystemZ is one of the few remaining platforms to override the default, so remove it here. This only affects SystemZ when the `soft-float` option is used. [1]: https://github.com/llvm/llvm-project/pull/175149	2026-01-11 16:43:40 +01:00
Amy Kwan	1671bb67e7	[SystemZ] Change default backend to ASCII (#174470 ) The current (and default) backend on z/OS is EBCDIC. This patch updates the default backend to be ASCII, which is beneficial when porting new languages. With this change, ASCII is the default when no special metadata nodes (such as `zos_le_char_mode`) are present.	2026-01-07 14:27:25 -05:00
Matt Arsenault	56ce7ed72b	llvm: Convert some assorted lit tests to opaque pointers (#174564 ) Some of the MIR tests hit a bug where it errors if there is a raw global reference as the referenced value. Worked around some of those by just keeping a no-op bitcast constant expression.	2026-01-06 11:41:27 +00:00
Kai Nacke	611a271e8d	[GOFF] Write out relocations in the GOFF writer (#167054 ) Add support for writing relocations. Since the symbol numbering is only available after the symbols are written, the relocations are collected in a vector. At write time, the relocations are converted using the symbols ids, compressed and written out. A relocation data record is limited to 32K-1 bytes, which requires making sure that larger relocation data is written into multiple records.	2025-12-20 15:51:46 -05:00
Kai Nacke	37545b80f7	[GOFF] Emit symbols for functions. (#144437 ) A function entry is mapped to a LD symbol with an offset to the beginning of the section. HLASM syntax is emitted accordingly.	2025-12-20 13:22:04 -05:00
Anikesh Parashar	7b101d2198	[SystemZ] Update CodeGen/SystemZ/tdc-05.ll test file (#172437 ) This PR updates `llvm/test/CodeGen/SystemZ/tdc-05.ll` using `llvm/utils/update_llc_test_checks.py` to refresh the expected output. The updated checks reflect the current output of llc and reduce noise in future diffs.	2025-12-19 21:33:32 +01:00
KRM7	c9aea6248a	[RegisterCoalescer] Don't commute two-address instructions which only define a subregister (#169031 ) Currently, the register coalescer may try to commute an instruction like: ``` %0.sub_lo32:gpr64 = AND %0.sub_lo32:gpr64(tied-def 0), %1.sub_lo32:gpr64 USE %0:gpr64 ``` resulting in: ``` %1.sub_lo32:gpr64 = AND %1.sub_lo32:gpr64(tied-def 0), %0.sub_lo32:gpr64 USE %1:gpr64 ``` However, this is not correct if the instruction doesn't define the entire register, as the value of the upper 32-bits of the register used in `USE` will not be the same.	2025-12-18 23:24:44 +01:00
Folkert de Vries	a587ccd87d	fix `llvm.fma.f16` double rounding issue when there is no native support (#171904 ) fixes https://github.com/llvm/llvm-project/issues/98389 As the issue describes, promoting `llvm.fma.f16` to `llvm.fma.f32` does not work, because there is not enough precision to handle the repeated rounding. `f64` does have sufficient space. So this PR explicitly promotes the 16-bit fma to a 64-bit fma. I could not find examples of a libcall being used for fma, but that's something that could be looked into separately to work around code size issues.	2025-12-17 22:03:01 +01:00
Nikita Popov	5a24dfa339	[SDAG] Remove most non-canonical libcall handing (#171288 ) This is a followup to https://github.com/llvm/llvm-project/pull/171114, removing the handling for most libcalls that are already canonicalized to intrinsics in the middle-end. The only remaining one is fabs, which has more test coverage than the others.	2025-12-10 11:45:26 +01:00
Dominik Steenken	ca12d1d8f1	[SystemZ] Improve CCMask optimization (#171137 ) This commit addresses a shortcoming in the implementation of `combineBR_CCMASK` and `combineSELECT_CCMASK`. In cases where `combineCCMask` was able to reduce the ccmask going into the select or branch to either true (`ccvalid`) or false (`0`), a trivial instruction would be emitted (i.e. either a select that would only ever select one side, or a conditional branch with `true` or `false` as the branch condition). This led under certain circumstances to, e.g., `BRC` instructions being emitted that triggered an assert in the AsmPrinter meant to exclude such branch conditions. For the select case, this commit introduces an early bailout that simply returns the value that would "always" be selected. For the branch case, the commit introduces an additional guard that prevents the DAGCombine from taking effect, thereby preventing the illegal instruction from being emitted.	2025-12-09 11:20:40 +01:00
Nikita Popov	c15a3dd932	[SystemZ] Generate test checks (NFC)	2025-12-09 10:49:49 +01:00
Jonas Paulsson	0b252daf64	[SystemZ] Handle IR struct arguments correctly. (#169583 ) - The size of the stack slot was previously computed in LowerCall() by using the original type, but that didn't work for a struct. Compute the size by looking at the VT of each part and the number of them instead. - All the members of a struct have the same OrigArgIndex, so it doesn't work to assume that following parts belong to a split argument until another OrigArgIndex is encountered. Use the isSplit() and isSplitEnd() flags instead. - Detect any scalar integer argument >64 bits in CanLowerReturn() instead of just i128, in order to let all of them be passed on stack. Fixes #168460	2025-12-04 13:14:31 -06:00
Kai Nacke	66ca3f1367	[SystemZ] Serialize ada entry flags (#169395 ) Adding support for serializing the ada entry flags helps with MIR-based test cases. Without this change, the flags are simply displayed as being "unknown".	2025-11-27 08:14:43 -05:00
Kai Nacke	47efff777d	[SystemZ] Emit optional argument area length field (#169679 ) The Language Environment (LE) reserves 128 bytes for the argument area when the optional field is not present. If the argument area is larger, then the field must be present to guarantee that the space is reserved on stack extension. Creating this field when alloca() is used may reduce the needed stack space in case alloca() causes a stack extension.	2025-11-26 16:16:13 -05:00
Matt Arsenault	dfdada1b78	CodeGen: Remove target hook for terminal rule (#165962 ) Enables the terminal rule for remaining targets	2025-11-12 21:12:19 +00:00
Nicolai Hähnle	d1387ed272	CodeGen: More accurate mayAlias for instructions with multiple MMOs (#166211 ) There can only be meaningful aliasing between the memory accesses of different instructions if at least one of the accesses modifies memory. This check is applied at the instruction-level earlier in the method. This change merely extends the check on a per-MMO basis. This affects a SystemZ test because PFD instructions are both mayLoad and mayStore but may carry a load-only MMO which is now no longer treated as aliasing loads. The PFD instructions are from llvm.prefetch generated by loop-data-prefetch.	2025-11-06 09:19:37 -08:00
Vigneshwar Jayakumar	b5f200129a	[CodeGen] Register-coalescer remat fix subreg liveness (#165662 ) This is a bugfix in rematerialization where the liveness of subreg mask was incorrectly updated causing crash in scheduler.	2025-11-04 22:40:40 -06:00
Craig Topper	d310693bde	[SelectionDAG] Use GetPromotedInteger when promoting integer operands of PATCHPOINT/STACKMAP. (#165926 ) This is consistent with other promotion, but causes negative constants to be sign extended instead of zero extended in some cases. I guess getNode and type legalizer are inconsistent about what ANY_EXTEND of a constant does.	2025-10-31 22:11:13 +00:00
anoopkg6	242c716c68	Fix Linux kernel build failure for SystemZ. (#165274 ) Linux kernel build fails for SystemZ as output of INLINEASM was GR32Bit general-purpose register instead of SystemZ::CC. --------- Co-authored-by: anoopkg6 <anoopkg6@github.com> Co-authored-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>	2025-10-27 18:22:01 +01:00
paperchalice	3656f6f226	[CodeGen] Remove `-enable-unsafe-fp-math` option (#164559 ) Hope this can unblock #105746.	2025-10-22 15:40:31 +08:00
Simon Pilgrim	1360aecb01	[SystemZ] Avoid trunc(add(X,X)) patterns (#164378 ) Replace with trunc(add(X,Y)) to avoid premature folding in upcoming patch #164227	2025-10-21 09:35:16 +00:00
anoopkg6	6712e20c52	Add support for flag output operand "=@cc" for SystemZ. (#125970 ) Added support for the flag output operand "=@cc" inline assembly constraint for SystemZ. - Clang now accepts "=@cc" assembly operands, and sets a 2-bit condition code for the output operand for SystemZ. - Clang currently emits an assertion that flag output operands are boolean values, i.e. in the range [0, 2). Generalize this mechanism to allow targets to specify arbitrary range assertions for any inline assembly output operand. This will be used to assert that SystemZ two-bit condition-code values are in the range [0, 4). - SystemZ backend lowers "@cc" targets by using an ipm sequence to extract the condition code from the PSW. - DAGCombine tries to optimize the lowered ipm sequence by combining CCReg and computing the effective CCMask and CCValid in combineCCMask for select_ccmask and br_ccmask. - Cost computation is done for merging conditionals for branch instructions in SelectionDAG, as a split may cause branch condition evaluation to cross basic blocks, making it difficult to combine. --------- Co-authored-by: anoopkg6 <anoopkg6@github.com> Co-authored-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>	2025-10-14 11:53:42 +02:00
Luke Lau	795a115d19	[RegAlloc] Remove default restriction on non-trivial rematerialization (#159211 ) In the register allocator we define non-trivial rematerialization as the rematerialization of an instruction with virtual register uses. We have been able to perform non-trivial rematerialization for a while, but it has been prevented by default unless specifically overridden by the target in `TargetTransformInfo::isReMaterializableImpl`. The original reasoning for this given by the comment in the default implementation is because we might increase a live range of the virtual register, but we don't actually do this. LiveRangeEdit::allUsesAvailableAt makes sure that we only rematerialize instructions whose virtual registers are already live at the use sites. https://reviews.llvm.org/D106408 had originally tried to remove this restriction but it was reverted after some performance regressions were reported. We think it is likely that the regressions were caused by the fact that the old isTriviallyReMaterializable API sometimes returned true for non-trivial rematerializations. However https://github.com/llvm/llvm-project/pull/160377 recently split the API out into a separate non-trivial and trivial version and updated the call-sites accordingly, and https://github.com/llvm/llvm-project/pull/160709 and #159180 fixed heuristics which weren't accounting for the difference between non-trivial and trivial. With these fixes in place, this patch proposes to again allow non-trivial rematerialization by default which reduces a significant amount of spills and reloads across various targets. For llvm-test-suite built with -O3 -flto, we get the following geomean reduction in reloads: - arm64-apple-darwin: 11.6% - riscv64-linux-gnu: 8.1% - x86_64-linux-gnu: 6.5%	2025-10-04 22:50:44 +00:00
Matt Arsenault	3537e8abfa	RegAllocGreedy: Check if copied lanes are live in trySplitAroundHintReg (#160424 ) For subregister copies, do a subregister live check instead of checking the main range. Doesn't do much yet, the split analysis still does not track live ranges.	2025-10-02 12:21:02 +00:00
Mikhail Gudim	562146499c	[CodeGen][NewPM] Port `ReachingDefAnalysis` to new pass manager. (#159572 ) In this commit: (1) Added new pass manager support for `ReachingDefAnalysis`. (2) Added printer pass. (3) Make old pass manager use `ReachingDefInfoWrapperPass`	2025-09-19 09:38:34 -04:00
Folkert de Vries	8a9e3333dd	s390x: optimize 128-bit fshl and fshr by high values (#154919 ) Turn a funnel shift by N in the range `121..128` into a funnel shift in the opposite direction by `128 - N`. Because there are dedicated instructions for funnel shifts by values smaller than 8, this emits fewer instructions. This additional rule is useful because LLVM appears to canonicalize `fshr` into `fshl`, meaning that the rules for `fshr` on values less than 8 would not match on organic input.	2025-08-27 09:31:49 +02:00
Folkert de Vries	558657298a	s390x: pattern match saturated truncation (#155377 ) Simplify min/max instruction matching by making the related SelectionDAG operations legal. Add patterns to match (signed and unsigned) saturated truncation based on open-coded min/max patterns. Fixes https://github.com/llvm/llvm-project/issues/153655	2025-08-26 17:19:58 +02:00
Nikita Popov	63e7766047	[SystemZ] Allow forming overflow op for i128 (#153557 ) Allow matching i128 overflow pattern into UADDO, which then allows use of vaccq.	2025-08-14 16:15:22 +02:00
KRM7	ee47427386	[RegisterCoalescer] Fix subrange update when rematerialization widens a def (#151974 ) Currently, when an instruction rematerialized by the register coalescer defines more subregs of the destination register than the original COPY instruction did, we only add dead defs for the newly defined subregs if they were not defined anywhere else. For example, consider something like this before rematerialization: ``` %0:reg64 = CONSTANT 1 %1:reg128.sub_lo64_lo32 = COPY %0.lo32 %1:reg128.sub_lo64_hi32 = ... ... ``` that would look like this after rematerializing `%0`: ``` %0:reg64 = CONSTANT 2 %1:reg128.sub_lo64 = CONSTANT 2 %1:reg128.sub_lo64_hi32 = ... ... ``` A dead def would not be added for `%1.sub_lo64_hi32` at the 2nd instruction because its subrange wasn't empty beforehand.	2025-08-05 22:32:31 +09:00
Matt Arsenault	12568b6a4f	SystemZ: Add sincos intrinsic test (#147473 ) The ZOS run line is mostly broken. update_test_checks seems to not work on it and I have no idea what I'm looking at here. It's not obvious to me what the calls are. I added some checks for the references to the libcalls printed at the end of the module, but didn't check anything in the function body. half also just asserts somewhere.	2025-08-05 12:55:26 +09:00
sujianIBM	fc12fc635b	[SystemZ] Fix code in widening vector multiplication (#150836 ) Commit cdc7864 has an error which would wrongly fold widening multiplications into an even/odd widening operation. This PR fixes it and adds tests to check scenarios which should not be folded into an even/odd widening operation are actually not.	2025-07-31 13:18:23 -04:00
Simon Pilgrim	c37942df00	[DAG] visitFREEZE - limit freezing of multiple operands (#149797 ) This is a partial revert of #145939 (I've kept the BUILD_VECTOR(FREEZE(UNDEF), FREEZE(UNDEF), elt2, ...) canonicalization) as we're getting reports of infinite loops (#148084). The issue appears to be due to deep chains of nodes and how visitFREEZE replaces all instances of an operand with a common frozen version - other users of the original frozen node then get added back to the worklist but might no longer be able to confirm a node isn't poison due to recursion depth limits on isGuaranteedNotToBeUndefOrPoison. The issue still exists with the old implementation but by only allowing a single frozen operand it helps prevent cases of interdependent frozen nodes. I'm still working on supporting multiple operands as it's critical for topological DAG handling but need to get a fix in for trunk and 21.x. Fixes #148084	2025-07-22 15:40:55 +01:00
Trevor Gross	0db197adef	[Test] Mark a number of libcall tests `nounwind` (#148329 ) Many tests for floating point libcalls include CFI directives, which isn't needed for the purpose of these tests. Mark some of the relevant test functions `nounwind` in order to remove this noise.	2025-07-12 11:57:28 +02:00
Vikram Hegde	fcd4a2fe7a	[CodeGen][NewPM] Port "PostRAMachineSink" pass to NPM (#129690 )	2025-07-10 13:10:46 +05:30
Fangrui Song	68494ae072	[XRay] xray_fn_idx: fix alignment directive Use `emitValueToAlignment` as the section does not contain code. `emitCodeAlignment` would lead to ALIGN relocations on RISC-V and LoongArch with linker relaxation. In addition, change the alignment to wordsize, sufficient for the runtime requirement (`XRayFunctionSledIndex`). Related to #147322	2025-07-08 21:52:53 -07:00
Matt Arsenault	026307958b	SystemZ: Remove unnecessary requires asserts from test (#147477 )	2025-07-09 09:28:57 +09:00
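
The funnel-shift rewrite described in commit 8a9e3333dd above can be sanity-checked with a small model. This is a sketch only, not code from the patch: it assumes the usual concatenate-and-shift definition of `fshl` and `fshr`, and the helper names `fshl`, `fshr`, and `rewrite_holds` are made up for illustration. It covers shift amounts 121..127; the endpoint 128 mentioned in the commit message is not modeled here.

```python
BITS = 128
MASK = (1 << BITS) - 1


def fshl(a: int, b: int, n: int) -> int:
    """Model of a 128-bit funnel shift left: high half of (a:b) << n."""
    n %= BITS
    concat = (a << BITS) | b
    return ((concat << n) >> BITS) & MASK


def fshr(a: int, b: int, n: int) -> int:
    """Model of a 128-bit funnel shift right: low half of (a:b) >> n."""
    n %= BITS
    concat = (a << BITS) | b
    return (concat >> n) & MASK


def rewrite_holds(a: int, b: int) -> bool:
    """Check fshl by N == fshr by (128 - N) for N in 121..127 (hypothetical helper)."""
    return all(fshl(a, b, n) == fshr(a, b, BITS - n) for n in range(121, 128))


if __name__ == "__main__":
    a = 0x0123456789ABCDEF_FEDCBA9876543210
    b = 0xDEADBEEFCAFEBABE_0BADF00D8BADF00D
    print(rewrite_holds(a, b))
```

Under this model, a left funnel shift by 125 becomes a right funnel shift by 3, which falls into the "less than 8" range the commit message says has dedicated instructions.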


1075 Commits