593 Commits

Author SHA1 Message Date
yingopq
754ed95b66
[Mips] Fix compiler crash when returning fp128 after calling a function returning { i8, i128 } (#117525)

Fixes https://github.com/llvm/llvm-project/issues/96432.
2025-01-20 16:47:40 +08:00
Craig Topper
e6b2495545
[SelectionDAG] Split SDNode::use_iterator into user_iterator and use_iterator. (#120531)
SDNode::use_iterator now returns an SDUse& when dereferenced.
SDNode::user_iterator returns SDNode*. SDNode::use_begin/use_end/uses
work on use_iterator. SDNode::user_begin/user_end/users work on
user_iterator.

We can now write range based for loops using SDUse& and SDNode::uses().
I've converted many of these in this patch. I didn't update loops that
have additional variables updated in their for statement.

Some loops use SDNode::use_iterator::getOperandNo() which also prevents
using range based for loops. I plan to move this into SDUse in a follow
up patch.
2024-12-19 08:35:32 -08:00
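The use/user distinction described in e6b2495545 can be sketched with a toy model. This is not the SelectionDAG API: `ToyNode`, `ToyUse`, and their members are hypothetical names, and the operand number is stored directly on the use, which the message above describes only as a planned follow-up.

```cpp
#include <cassert>
#include <vector>

struct ToyNode;

// One edge: which node uses the value, and in which operand slot.
struct ToyUse {
  ToyNode *User;
  unsigned OperandNo;
};

struct ToyNode {
  std::vector<ToyUse> UseList;

  // Iterating uses() yields a ToyUse&, so the operand slot is reachable
  // from inside a range-based for loop.
  const std::vector<ToyUse> &uses() const { return UseList; }

  // Iterating users() yields a ToyNode*, dropping the edge information.
  std::vector<ToyNode *> users() const {
    std::vector<ToyNode *> Result;
    for (const ToyUse &U : UseList) {
      assert(U.User != nullptr && "every use must have a user");
      Result.push_back(U.User);
    }
    return Result;
  }
};
```

A loop such as `for (const ToyUse &U : N.uses())` can read `U.OperandNo` directly, while `for (ToyNode *User : N.users())` only sees the using node.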
Craig Topper
bd261ecc5a
[SelectionDAG] Add SDNode::user_begin() and use it in some places (#120509)
Most of these are just places that want the first user and aren't
iterating over the whole list.

While there I changed some use_size() == 1 to hasOneUse() which
is more efficient.

This is part of an effort to rename use_iterator to user_iterator
and provide a use_iterator that dereferences to SDUse&. This patch
helps reduce the diff on later patches.
2024-12-18 22:13:04 -08:00
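A minimal sketch of why a one-use check can be cheaper than counting all uses, assuming the use list is a singly linked list. `UseNode`, `useSize`, `hasOneUse`, and `agree` are toy stand-ins written for this note, not the SDNode members of similar names.

```cpp
#include <cassert>
#include <cstddef>

// Toy use-list node.
struct UseNode {
  UseNode *Next;
};

// Walks the whole list: cost grows with the number of uses.
inline std::size_t useSize(const UseNode *Head) {
  std::size_t N = 0;
  for (const UseNode *U = Head; U; U = U->Next)
    ++N;
  return N;
}

// Inspects at most two nodes regardless of list length.
inline bool hasOneUse(const UseNode *Head) {
  return Head && !Head->Next;
}

// Both formulations give the same answer; only the cost differs.
inline bool agree(const UseNode *Head) {
  return hasOneUse(Head) == (useSize(Head) == 1);
}
```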
Craig Topper
104ad9258a
[SelectionDAG] Rename SDNode::uses() to users(). (#120499)
This function is most often used in range based loops or algorithms
where the iterator is implicitly dereferenced. The dereference returns
an SDNode * of the user rather than SDUse * so users() is a better name.

I've long been annoyed that we can't write a range based loop over
SDUse when we need getOperandNo. I plan to rename use_iterator to
user_iterator and add a use_iterator that returns SDUse& on dereference.
This will make it more like IR.
2024-12-18 20:09:33 -08:00
anoopkg6
dc04d414df
SystemZ: Add support for __builtin_setjmp and __builtin_longjmp. (#119257)
This PR includes fixes for the original PR #116642.
Implementation for __builtin_setjmp and __builtin_longjmp for SystemZ.
2024-12-10 19:50:51 +01:00
Ulrich Weigand
8787bc72a6 Revert "[SystemZ] Add support for __builtin_setjmp and __builtin_longjmp (#116642)"
This reverts commit 030bbc92a705758f1131fb29cab5be6d6a27dd1f.
2024-12-07 00:55:54 +01:00
Ulrich Weigand
9f430bd415 Revert "[SystemZ] Fix a warning"
This reverts commit 3c47e63723b1aa9e76f30fc8d1acef9caf4ea783.
2024-12-07 00:55:41 +01:00
Kazu Hirata
3c47e63723 [SystemZ] Fix a warning
This patch fixes:

  llvm/lib/Target/SystemZ/SystemZISelLowering.cpp:953:30: error:
  unused variable 'TRI' [-Werror,-Wunused-variable]
2024-12-06 14:52:22 -08:00
anoopkg6
030bbc92a7
[SystemZ] Add support for __builtin_setjmp and __builtin_longjmp (#116642)
Implementation for __builtin_setjmp and __builtin_longjmp for SystemZ.
2024-12-06 23:33:33 +01:00
Craig Topper
b076fbb844
[TargetLowering] Use Type* instead of EVT in shouldSignExtendTypeInLibCall. (#118587)
I want to use this function for GISel too so Type * is a better common
interface. All of the callers already convert EVT to Type * as needed
by calling lowering anyway.
2024-12-03 22:06:55 -08:00
Nikita Popov
815a1bb53a
[SystemZ] Use getSignedConstant() where necessary (#117181)
This will avoid assertion failures once we disable implicit truncation
in getConstant().

Inside adjustSubwordCmp() I ended up suppressing the issue with an
explicit cast, because this code deals with a mix of unsigned and signed
immediates.
2024-11-25 09:47:49 +01:00
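The signed/unsigned distinction behind 815a1bb53a can be illustrated with two range checks. This is a sketch under the assumption that a constant is rejected when it does not fit the target width without truncation; `fitsUnsigned` and `fitsSigned` are hypothetical helpers, not LLVM functions.

```cpp
#include <cassert>
#include <cstdint>

// True if V is representable as an unsigned value of the given bit width.
inline bool fitsUnsigned(std::uint64_t V, unsigned Bits) {
  return Bits >= 64 || (V >> Bits) == 0;
}

// True if V is representable as a two's complement value of the given width.
inline bool fitsSigned(std::int64_t V, unsigned Bits) {
  assert(Bits >= 1 && "width must be at least one bit");
  if (Bits >= 64)
    return true;
  const std::int64_t Min = -(std::int64_t(1) << (Bits - 1));
  const std::int64_t Max = (std::int64_t(1) << (Bits - 1)) - 1;
  return V >= Min && V <= Max;
}
```

Under this model, -1 passed as a 64-bit unsigned payload does not fit a 32-bit type, while the same value treated as signed does.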
Jonas Paulsson
77ddcf7cbf
[SystemZ] Fix bitwidth problem in FindReplicatedImm(). (#115383)
A test case emerged with an i32 truncating store of an i64 constant
operand, where the i64 constant did not fit in 32 bits, which caused
FindReplicatedImm() to crash.

Make sure to truncate the APInt in these cases.
2024-11-11 22:16:20 +01:00
Yingwei Zheng
cf9d1c1486
[SDAG] Simplify SDNodeFlags with bitwise logic (#114061)
This patch allows using enumeration values directly and simplifies the
implementation with bitwise logic. It addresses the comment in
https://github.com/llvm/llvm-project/pull/113808#discussion_r1819923625.
2024-10-31 08:10:07 +08:00
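A rough illustration of flags stored as a bitmask and manipulated with bitwise logic, in the spirit of cf9d1c1486. `ToyFlag` and `ToyFlags` are hypothetical; the enumerator names are borrowed for readability and the real SDNodeFlags layout and interface may differ.

```cpp
#include <cassert>
#include <cstdint>

// Each flag occupies one bit, so enumerators can be combined with '|'.
enum ToyFlag : std::uint32_t {
  NoUnsignedWrap = 1u << 0,
  NoSignedWrap = 1u << 1,
  Exact = 1u << 2,
};

class ToyFlags {
  std::uint32_t Bits = 0;

public:
  ToyFlags() = default;
  ToyFlags(std::uint32_t B) : Bits(B) {}

  // Set or clear a single flag.
  void set(ToyFlag F, bool On) {
    assert(F != 0 && "flag must name at least one bit");
    Bits = On ? (Bits | F) : (Bits & ~static_cast<std::uint32_t>(F));
  }

  bool has(ToyFlag F) const { return (Bits & F) != 0; }

  // Keep only the flags present in both sets.
  void intersectWith(ToyFlags Other) { Bits &= Other.Bits; }

  std::uint32_t raw() const { return Bits; }
};
```

With this layout a set can be built directly from enumerators, for example `ToyFlags F(NoUnsignedWrap | Exact);`.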
Jonas Paulsson
09160a9821
[SystemZ] Silence compiler warning (#113894)
Use SystemZ::NoRegister instead of 0 in
SystemZTargetLowering::getRegisterByName().
2024-10-28 11:32:39 +01:00
Alex Rønne Petersen
5785cbb405
[llvm] Ensure that soft float targets don't emit fma() libcalls. (#106615)
The previous behavior could be harmful in some edge cases, such as
emitting a call to `fma()` in the `fma()` implementation itself.

Do this by just being more accurate in `isFMAFasterThanFMulAndFAdd()`.
This was already done for PowerPC; this commit just extends that to Arm,
z/Arch, and x86. MIPS and SPARC already got it right, but I added tests
for them too, for good measure.

Note: I don't have commit access.
2024-10-19 06:13:15 -07:00
Jonas Paulsson
5059059c7b
[SystemZ] Add missing newline character in verifyNarrowIntegerArgs_Call(). (#112499) 2024-10-16 10:39:28 +02:00
Kazu Hirata
abb594b965
[SystemZ] Avoid repeated hash lookups (NFC) (#112072) 2024-10-12 08:01:26 -07:00
Kazu Hirata
df691ca74b [SystemZ] Fix a warning
This patch fixes:

  llvm/lib/Target/SystemZ/SystemZISelLowering.cpp:9858:18: error:
  using the result of an assignment as a condition without parentheses
  [-Werror,-Wparentheses]
2024-09-30 09:12:23 -07:00
Jonas Paulsson
f9fbfc587d
[SystemZ] Dump function signature on missing arg extension. (#109699)
Make it easier to handle detected problems by providing the function
signature(s) involved in cases of missing argument extensions.
2024-09-30 17:03:18 +02:00
Jonas Paulsson
0ef24aa549
Fix for logic in combineExtract() (#108208)
A (csmith) test case appeared where combineExtract() crashed when the
input vector was a bitcast into a vector of i1 elements. Fix this by adding a check
with canTreatAsByteVector() before the call.
2024-09-25 12:12:27 +02:00
Kazu Hirata
72b04b9f16 [SystemZ] Fix a warning
This patch fixes:

  llvm/lib/Target/SystemZ/SystemZISelLowering.cpp:9857:21: error:
  unused variable 'Flags' [-Werror,-Wunused-variable]
2024-09-19 09:03:47 -07:00
Jonas Paulsson
14120227a3
Target ABI: improve call parameters extensions handling (#100757)
For the purpose of verifying proper argument extensions per the target's ABI,
introduce the NoExt attribute that may be used by a target when neither sign
nor zero extension is required (e.g. with a struct in register). The purpose of
doing so is to be able to verify that one of these attributes is always
present, thereby detecting cases where sign/zero extension is actually
missing.

As a first step, this patch has the verification step done for the SystemZ
backend only, but left off by default until all known issues have been
addressed.

Other targets/front-ends can now also add the NoExt attribute where needed and do
this check in the backend.
2024-09-19 16:59:31 +02:00
Nikita Popov
7d1a68178e [SystemZ] Use APInt::getAllOnes()
This was using -1 without setting the signed flag.

Split off from https://github.com/llvm/llvm-project/pull/80309.
2024-09-05 15:25:05 +02:00
Abhina Sree
a0be7053d7
[SystemZ][z/OS] Continuation of __ptr32 support (#103393)
This is a continuation of the __ptr32 support added here
135fecd444
2024-08-14 13:26:30 -04:00
Jonas Paulsson
22bc9db92b
[SystemZ] Use the EVT version of getVectorVT() in combineTruncateExtract(). (#100150)
A test case showed up where the new vector type is v24i16, which is not a simple
MVT. In order to get an extended value type for cases like this, EVT::getVectorVT()
needs to be called instead of MVT::getVectorVT(), otherwise the following call
to getVectorElementType() in combineExtract() will fail.
2024-07-26 14:33:40 +02:00
Amara Emerson
f270a4dd66
[AArch64] Don't tail call memset if it would convert to a bzero. (#98969)
Well, not quite that simple. We can tail call memset since it returns the first
argument but bzero doesn't do that and therefore we can end up
miscompiling.

This patch also refactors the logic out of isInTailCallPosition() into the callers.
As a result memcpy and memmove are also modified to do the same thing
for consistency.

rdar://131419786
2024-07-17 01:31:52 -07:00
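The return-value difference that f270a4dd66 relies on can be shown at the source level. This sketch only demonstrates the calling shapes; `toyBzero` is a hypothetical stand-in for a bzero-like routine that returns nothing, and the wrapper names are invented for this note.

```cpp
#include <cassert>
#include <cstring>

// memset returns its destination, so the wrapper can hand that result
// straight back to its caller (the shape that permits a tail call).
inline void *clearAndReturn(void *Dst, std::size_t N) {
  assert(Dst != nullptr);
  return std::memset(Dst, 0, N);
}

// Stand-in for a bzero-like routine: same effect, no return value.
inline void toyBzero(void *Dst, std::size_t N) { std::memset(Dst, 0, N); }

// With a void callee the wrapper must keep Dst alive across the call and
// return it itself, so the call is no longer the final action.
inline void *clearViaBzero(void *Dst, std::size_t N) {
  assert(Dst != nullptr);
  toyBzero(Dst, N);
  return Dst;
}
```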
Ulrich Weigand
e8e406041e Fix sext_in_reg from i1 to i128
The combineSIGN_EXTEND_INREG routine was using
DAG.getConstant(-1, DL, VT), which does not result in
the expected value when VT has more than 64 bits.

Fix this by using DAG.getAllOnesConstant(DL, VT) instead.

Also add test cases for v1i128 comparisons (which triggers
the bug).
2024-07-15 11:26:37 +02:00
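A minimal sketch of the problem fixed in e8e406041e, assuming a 64-bit payload is zero-extended when it is widened to 128 bits. `U128` and the helpers are toy definitions for this note, not LLVM types.

```cpp
#include <cstdint>

// Toy 128-bit value made of two 64-bit halves.
struct U128 {
  std::uint64_t Hi;
  std::uint64_t Lo;
};

// Models building a 128-bit constant from a 64-bit payload without sign
// extension: the upper half stays zero.
inline U128 fromU64ZeroExtended(std::uint64_t V) { return {0, V}; }

// Models asking explicitly for an all-ones constant of the full width.
inline U128 allOnes() { return {~std::uint64_t{0}, ~std::uint64_t{0}}; }

inline bool isAllOnes(U128 V) {
  return V.Hi == ~std::uint64_t{0} && V.Lo == ~std::uint64_t{0};
}
```

In this model, widening a 64-bit -1 sets only the low 64 bits, so the result is not the all-ones value a sign extension from i1 needs.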
Kazu Hirata
5e22a53698
[Target] Use range-based for loops (NFC) (#98705) 2024-07-13 17:40:51 -07:00
Joseph Huber
3f1a767572
[LLVM] Factor disabled Libcalls into the initializer (#98421)
Summary:
These Libcalls represent which functions are available to the backend.
If a runtime call is not available, the target sets the name to
`nullptr`. Currently, this logic is spread around the various targets.
This patch pulls all of the locations that disable libcalls into the
initializer. This patch is effectively NFC.

The motivation behind this patch is that currently the LTO handling uses
the list of all runtime calls to determine which functions cannot be
internalized and must be extracted from static libraries. We do not want
this to happen for libcalls that are not emitted by the backend. A
follow-up patch will move out this logic so the LTO pass can know which
rtlib calls are actually used by the backend.
2024-07-11 12:59:25 -05:00
Nikita Popov
4169338e75
[IR] Don't include Module.h in Analysis.h (NFC) (#97023)
Replace it with a forward declaration instead. Analysis.h is pulled in
by all passes, but not all passes need to access the module.
2024-06-28 14:30:47 +02:00
Matt Arsenault
ddb87e0f96
SystemZ: Use REG_SEQUENCE for PAIR128 (#90640)
PAIR128 should probably just be removed entirely

Depends #90638
2024-05-17 13:16:34 +02:00
Ulrich Weigand
0a0cac6dbd
[SystemZ] Simplify f128 atomic load/store (#90977)
Change definition of expandBitCastI128ToF128 and expandBitCastF128ToI128
to allow for simplified use in atomic load/store.

Update logic to split 128-bit loads and stores in DAGCombine to also
handle the f128 case where appropriate. This fixes the regressions
introduced by recent atomic load/store patches.
2024-05-06 12:17:19 +02:00
Matt Arsenault
edbe6ebb4d
SystemZ: Don't promote atomic store in IR (#90899)
This is the mirror to the recent atomic load change. The same
bitcast-back-to-integer case is a small code quality regression for the
same reason. This would disappear with a bitcastable legal 128-bit type.
2024-05-03 10:04:12 +02:00
Matt Arsenault
38f9c013a0
SystemZ: Stop casting fp typed atomic loads in the IR (#90768)
shouldCastAtomicLoadInIR is a hack that should be removed. Simple
bitcasting of operations should be in the domain of ordinary type
legalization and does not need to be done in the IR.

This introduces a code quality regression due to the hack currently used
to avoid using 128-bit values in the case where the floating point value
is ultimately used as an integer. This would be avoidable if there were
always a legal 128-bit type (like v2i64). This is a pretty niche
situation so I assume it's not important.

I implemented about 85% of the work necessary to make v2i64 legal, but
it was taking too long and I lack the necessary familiarity with systemz
to complete it. I've pushed it here for someone to pick up:
https://github.com/arsenm/llvm-project/pull/new/systemz-legal-v2i64

Depends #90861
2024-05-02 21:31:29 +02:00
Fangrui Song
5a12f2867a LLVM_FALLTHROUGH => [[fallthrough]]. NFC 2024-04-25 17:50:59 -07:00
Xu Zhang
f6d431f208
[CodeGen] Make the parameter TRI required in some functions. (#85968)
Fixes #82659

There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI parameters, as shown in issue #82411.

Following @RKSimon's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have also been updated correctly to ensure no additional impact.

After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.
2024-04-24 14:24:14 +01:00
Kai Nacke
cce4dc7b7a
[SystemZ][z/OS] Implement llvm.returnaddress for XPLINK (#89440)
The implementation follows the ELF implementation.
2024-04-22 11:01:22 -04:00
Kai Nacke
7e2c2981fb
[SystemZ][z/OS] Implement llvm.frameaddr for XPLINK (#89284)
The implementation follows the ELF implementation.
2024-04-19 08:09:49 -04:00
Jonas Paulsson
7e4c6e98fa
[SystemZ] Bugfix in getDemandedSrcElements(). (#88623)
For the intrinsic s390_vperm, all of the elements are demanded, so use
an APInt with the value of '-1' for them (not '1').

Fixes https://github.com/llvm/llvm-project/issues/88397
2024-04-15 16:32:14 +02:00
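The difference between a demanded-elements mask of '-1' and '1' in 7e4c6e98fa can be seen in a toy bitmask, where bit i set means element i is demanded. The helpers are hypothetical and limited to 32 elements; the real code uses APInt.

```cpp
#include <cassert>
#include <cstdint>

// Mask with the low NumElts bits set: every element is demanded.
inline std::uint32_t allElementsDemanded(unsigned NumElts) {
  return NumElts >= 32 ? ~0u : ((1u << NumElts) - 1u);
}

// True if element Idx is marked as demanded in Mask.
inline bool isDemanded(std::uint32_t Mask, unsigned Idx) {
  assert(Idx < 32 && "toy mask holds at most 32 elements");
  return ((Mask >> Idx) & 1u) != 0;
}
```

A mask with the value 1 marks only element 0 as demanded, whereas the all-ones mask marks every element.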
Dominik Steenken
b794dc2325
[SystemZ] Add custom handling of legal vectors with reduce-add. (#88495)
This commit skips the expansion of the `vector.reduce.add` intrinsic on
vector-enabled SystemZ targets in order to introduce custom handling of
`vector.reduce.add` for legal vector types using the VSUM instructions.
This is limited to full vectors with scalar types up to `i32` due to
performance concerns.

It also adds testing for the generation of such custom handling, and
adapts the related cost computation, as well as the testing for that.

The expected result is a performance boost in certain benchmarks that
make heavy use of `vector.reduce.add` with other benchmarks remaining
constant.

For instance, the assembly for `vector.reduce.add<4 x i32>` changes from
```hlasm
        vmrlg   %v0, %v24, %v24
        vaf     %v0, %v24, %v0
        vrepf   %v1, %v0, 1
        vaf     %v0, %v0, %v1
        vlgvf   %r2, %v0, 0
```
to
```hlasm
        vgbm    %v0, 0
        vsumqf  %v0, %v24, %v0
        vlgvf   %r2, %v0, 3
```
2024-04-12 18:05:30 +02:00
Kazu Hirata
17c3f102be [SystemZ] Fix an unused variable warning
This patch fixes:

  llvm/lib/Target/SystemZ/SystemZISelLowering.cpp:8181:9: error:
  unused variable 'TFL' [-Werror,-Wunused-variable]
2024-03-28 14:19:39 -07:00
Jonas Paulsson
16b7cc69ef
[SystemZ] Eliminate call sequence instructions early. (#77812)
On SystemZ, the outgoing argument area which is big enough for all calls
in the function is created once during the prolog, as opposed to
adjusting the stack around each call. The call-sequence instructions
therefore serve no purpose other than computing the maximum call
frame size, which has so far been done by PEI, but can just as well be
done at an earlier point.

This patch removes the mapping of the CallFrameSetupOpcode and
CallFrameDestroyOpcode and instead computes the MaxCallFrameSize
directly after instruction selection and then removes the ADJCALLSTACK
pseudos. This removes the confusing pseudos and also avoids the problem
of having to keep the call frame size accurate when creating new MBBs.

This fixes #76618 which exposed the need to maintain the call frame size
when splitting blocks (which was not done).
2024-03-28 18:26:38 +01:00
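The quantity described in 16b7cc69ef is the largest outgoing-argument area needed by any call in the function. A minimal sketch, assuming the per-call sizes have already been collected into a list; `maxCallFrameSize` here is a hypothetical free function, not the MachineFrameInfo interface.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Returns the largest per-call frame size, or 0 for a function with no calls.
inline std::uint64_t
maxCallFrameSize(const std::vector<std::uint64_t> &PerCallSizes) {
  std::uint64_t Max = 0;
  for (std::uint64_t Size : PerCallSizes)
    Max = std::max(Max, Size);
  return Max;
}
```

Because a single area of this size is reserved once in the prolog, no per-call stack adjustment has to be tracked afterwards.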
Jonas Paulsson
94b5c118b3
[ISel] Move handling of atomic loads from SystemZ to DAGCombiner (NFC). (#86484)
The folding of sign/zero extensions into an atomic load by specifying an
extension type is not target specific, and therefore belongs in the
DAGCombiner rather than in the SystemZ backend.

- Handle atomic loads similarly to regular loads by adding
AtomicLoadExtActions with set/get methods.
- Move SystemZ extendAtomicLoad() to DAGCombiner.cpp.
2024-03-28 16:14:35 +01:00
Ulrich Weigand
4b907414d2 [SystemZ] Add support for llvm.readcyclecounter
The llvm.readcyclecounter intrinsic can be implemented via the
STORE CLOCK FAST (STCKF) instruction.
2024-03-22 20:01:02 +01:00
Ulrich Weigand
335f365982 Reapply: [SystemZ] Fix overflow flag for i128 USUBO
We use the VSCBIQ/VSBIQ/VSBCBIQ family of instructions to implement
USUBO/USUBO_CARRY for the i128 data type.  However, these instructions
use an inverted sense of the borrow indication flag (a value of 1
indicates *no* borrow, while a value of 0 indicates a borrow).  This
does not match the semantics of the boolean "overflow" flag of the
USUBO/USUBO_CARRY ISD nodes.

Fix this by generating code to explicitly invert the flag.  These
cancel out if the result of USUBO feeds into a USUBO_CARRY.

To avoid unnecessary zero-extend operations, also improve the
DAGCombine handling of ZERO_EXTEND to optimize (zext (xor (trunc)))
sequences where appropriate.

Fixes: https://github.com/llvm/llvm-project/issues/83268
2024-03-19 14:07:08 +01:00
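The flag inversion described in 335f365982 can be modelled in a few lines. This sketch uses 64-bit operands instead of i128 for brevity, and `borrowIndication` and `usuboOverflow` are hypothetical helpers that model only the flag semantics, not the instructions themselves.

```cpp
#include <cstdint>

// Models the hardware-style indication: 1 means NO borrow, 0 means borrow.
inline unsigned borrowIndication(std::uint64_t A, std::uint64_t B) {
  return A >= B ? 1u : 0u;
}

// Models the USUBO-style overflow flag: 1 when A - B wraps around.
// It is the borrow indication with its sense inverted.
inline unsigned usuboOverflow(std::uint64_t A, std::uint64_t B) {
  return borrowIndication(A, B) ^ 1u;
}
```

If a consumer expects the hardware-style sense again, it applies a second inversion, and the two XORs cancel.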
Ulrich Weigand
d1c3795968 Revert "Fix overflow flag for i128 USUBO"
This reverts commit d9c31ee9568277e4303715736b40925e41503596.
2024-03-19 11:43:05 +01:00
Ulrich Weigand
d9c31ee956 Fix overflow flag for i128 USUBO
We use the VSCBIQ/VSBIQ/VSBCBIQ family of instructions to implement
USUBO/USUBO_CARRY for the i128 data type.  However, these instructions
use an inverted sense of the borrow indication flag (a value of 1
indicates *no* borrow, while a value of 0 indicates a borrow).  This
does not match the semantics of the boolean "overflow" flag of the
USUBO/USUBO_CARRY ISD nodes.

Fix this by generating code to explicitly invert the flag.  These
cancel out if the result of USUBO feeds into a USUBO_CARRY.

To avoid unnecessary zero-extend operations, also improve the
DAGCombine handling of ZERO_EXTEND to optimize (zext (xor (trunc)))
sequences where appropriate.

Fixes: https://github.com/llvm/llvm-project/issues/83268
2024-03-19 11:20:52 +01:00
Jonas Paulsson
8b8e1adbde
[SystemZ] Don't lower ATOMIC_LOAD/STORE to LOAD/STORE (#75879)
- Instead of lowering float/double ISD::ATOMIC_LOAD / ISD::ATOMIC_STORE
nodes to regular LOAD/STORE nodes, make them legal and select those nodes
properly instead. This avoids exposing them to the DAGCombiner.

- AtomicExpand pass no longer casts float/double atomic load/stores to integer
  (FP128 is still cast).
2024-03-18 17:21:50 -04:00
Jonas Paulsson
9c0e45d7f0
[SystemZ] Use VT (not ArgVT) for SlotVT in LowerCall(). (#82475)
When an integer argument is promoted and *not* split (like i72 -> i128 on
a new machine with vector support), the SlotVT should be i128, which is
stored in VT - not ArgVT.

Fixes #81417
2024-02-21 16:26:16 +01:00
Kazu Hirata
39fa304866 [llvm] Use StringRef::starts_with (NFC) 2024-01-31 23:54:07 -08:00