It is common to have ABI requirements for illegal types: For example,
two i64 argument parts that originally came from an fp128 argument may
have a different call ABI than ones that came from a i128 argument.
The current calling convention lowering does not provide access to this
information, so backends come up with various hacks to support it (like
additional pre-analysis cached in CCState, or bypassing the default
logic entirely).
This PR adds the original IR type to InputArg/OutputArg and passes it
down to CCAssignFn. It is not actually used anywhere yet, this just does
the mechanical changes to thread through the new argument.
If a copy exists between creation of a crbit and a spill, machine-cp
may delete the copy since it seems unaware of the relation between a cr
and crbit. A fix was previously made for the generic ppc64 lowering. It
should be applied to the pwr9 and pwr10 variants too.
Likewise, relax and extend the pwr8 test to verify pwr9 and pwr10
codegen too.
This fixes#143989.
This patch fixes:
llvm/lib/Target/PowerPC/PPCSelectionDAGInfo.h:25:3: error:
'EmitTargetCodeForMemcmp' overrides a member function but is not
marked 'override' [-Werror,-Winconsistent-missing-override]
AIX has "millicode" routines, which are functions loaded at boot time
into fixed addresses in kernel memory. This allows them to be customized
for the processor. The __memcmp routine is a millicode implementation;
we use millicode for the memcmp function instead of a library call to
improve performance.
The information whether a specific argument is vararg or fixed is
currently stored separately from all the other argument information in
ArgFlags. This means that it is not accessible from CCAssign, and
backends have developed all kinds of workarounds for how they can access
it after all.
Move this information to ArgFlags to make it directly available in all
relevant places.
I've opted to invert this and store it as IsVarArg, as I think that both
makes the meaning more obvious and provides for a better default (which
is IsVarArg=false).
We can expand the init intrinsic to create a descriptor for the nested
procedure by combining the entry point and TOC pointer from the global
descriptor with the nest argument. The normal indirect call sequence
then calls the nested procedure through the descriptor like all other
calls. Patch also implements support for a nest parameter by mapping it
to gpr 11.
The patch fixed a bug introduced patch [[PowePC] using MTVSRBMI
instruction instead of constant pool in
power10+](https://github.com/llvm/llvm-project/pull/144084#top).
The issue arose because the layout of vector register elements differs
between little-endian and big-endian modes — specifically, the elements
appear in reverse order. This led to incorrect behavior when loading
constants using MTVSRBMI in big-endian configurations.
This change rebases the anonymous `xxeval` patterns to use the new
XXEvalPattern class based on the [[PowerPC] Exploit xxeval instruction
for ternary patterns - ternary(A, X,
and(B,C))](https://github.com/llvm/llvm-project/pull/141733#top)
Co-authored-by: Tony Varghese <tony.varghese@ibm.com>
Compare sinking is selectable based on the result of
hasMultipleConditionRegisters. This function is too coarse grained by
not taking into account the differences between scalar and vector
compares. This PR extends the interface to take an EVT to allow finer
control.
The new interface is used by AArch64 to disable sinking of scalable
vector compares, but with isProfitableToSinkOperands updated to maintain
the cases that are specifically tested.
"unset" is not a meaningful separae state and it will be merged into
"undefined".
https://reviews.llvm.org/D72570 ported a GAS hack that moves the label
when a prefixed instruction on the same source line needs automatic
alignment. However, we can used isDefined instead of isUnset.
The object file format specific derived classes are used in context
where the type is statically known. We don't use isa/dyn_cast and we
want to eliminate MCSymbol::Kind in the base class.
The object file format specific derived classes are used in context
where the type is statically known. We don't use isa/dyn_cast and we
want to eliminate MCSymbol::Kind in the base class.
The object file format specific derived classes are used in context
where the type is statically known. We don't use isa/dyn_cast and we
want to eliminate MCSymbol::Kind in the base class.
The object file format specific derived classes are used in context
where the type is statically known. We don't use isa/dyn_cast and we
want to eliminate MCSymbol::Kind in the base class.
The object file format specific derived classes are used in context
where the type is statically known. We don't use isa/dyn_cast and we
want to eliminate MCSymbol::Kind in the base class.
`Data` now references the first byte of the fixup offset within the current fragment.
MCAssembler::layout asserts that the fixup offset is within either the
fixed-size content or the optional variable-size tail, as this is the
most the generic code can validate without knowing the target-specific
fixup size.
Many backends applyFixup assert
```
assert(Offset + Size <= F.getSize() && "Invalid fixup offset!");
```
This refactoring allows a subsequent change to move the fixed-size
content outside of MCSection::ContentStorage, fixing the
-fsanitize=pointer-overflow issue of #150846
Pull Request: https://github.com/llvm/llvm-project/pull/151724
When arbitrary sized (non-simple type, or non-power of two types)
integers are passed on the stack, these integers are not handled when
lowering formal arguments on AIX as we always assume we will encounter
simple type integers.
However, it is possible for frontends to generate arbitrary sized
immediate values in IR. Specifically in rustc, it will generate an
integer value in LLVM IR for small structures that are less than a
pointer size, which is done for optimization purposes for the Rust ABI.
For example, if a Rust structure of three characters is passed into
function on the stack,
```
struct my_struct {
field1: u8,
field2: u8,
field3: u8,
}
```
This will generate an `i24` type in LLVM IR.
Currently, it is not obvious for the backend to distinguish an integer
versus something that wasn't an integer to begin with (such as a
struct), and the latter case would not have an extend on the parameter.
Thus, this PR allows us to perform a truncation and extend on integers,
both non-simple and simple types.
## Description
<!--- Title/Description will be Subject/Body of commit message. -->
<!--- Please be concise and limit the subject line to 50 characters, -->
<!--- and wrap the Description at 72 characters. -->
<!--- Describe why this is required, what problem it solves. -->
Adds support for ternary equivalent operations of the form `ternary(A,
X, and(B,C))` where `X=[xor(B,C)| nor(B,C)| eqv(B,C)| not(B)| not(C)]`.
List of `xxeval` equivalent ternary operations added and the
corresponding `imm` value required:
Ternary Operator| Imm Value
--|--
ternary(A, xor(B,C), and(B,C)) | 22
ternary(A, nor(B,C), and(B,C)) | 24
ternary(A, eqv(B,C), and(B,C)) | 25
ternary(A, not(C), and(B,C)) | 26
ternary(A, not(B), and(B,C)) | 28
eg. `xxeval XT,XA,XB,XC,22`
- performs `XA ? xor(XB, XC) : and(XB,XC)`and places the result in `XT`.
Co-authored-by: Tony Varghese <tony.varghese@ibm.com>
The object file format specific derived classes are used in context like
MCStreamer and MCObjectTargetWriter where the type is statically known.
We don't use isa/dyn_cast and we want to eliminate
MCSection::SectionVariant in the base class.
This patch updates `overrideSchedPolicy` and `overridePostRASchedPolicy`
to take a
`SchedRegion` parameter instead of just `NumRegionInstrs`. This provides
access to both the
instruction range and the parent `MachineBasicBlock`, which enables
looking up function-level
attributes.
With this change, targets can select post-RA scheduling direction per
function using a function
attribute. For example:
```cpp
void overridePostRASchedPolicy(MachineSchedPolicy &Policy,
const SchedRegion &Region) const {
const Function &F = Region.RegionBegin->getMF()->getFunction();
Attribute Attr = F.getFnAttribute("amdgpu-post-ra-direction");
...
}
Add LLVM Context to getOptimalMemOpType and findOptimalMemOpLowering. So
that we can use EVT::getVectorVT to generate EVT type in
getOptimalMemOpType.
Related to [#146673](https://github.com/llvm/llvm-project/pull/146673).
This PR takes the work previously done by @pawan-nirpal-031 on X86 in
#106370, and makes it available in common code. This should enable all
targets to use `__builtin_canonicalize` for all `f(16|32|64|128)` data
types.
Canonicalization is implemented here as multiplication by `1.0`, as
suggested in [the
docs](https://llvm.org/docs/LangRef.html#llvm-canonicalize-intrinsic).
Avoid reliance on the MCAssembler::evaluateFixup workaround that checks
MCFixupKindInfo::FKF_IsPCRel. Additionally, standardize how fixups are
appended. This helper will facilitate future fixup data structure
optimizations.
MCFixup::Loc has been removed in favor of MCExpr::Loc through
`const MCExpr *Value` (commit 777391a2164b89d2030ca013562151ca3c3676d1).
While here, change Kind to uint16_t from MCFixupKind. Most fixup kinds
are target-specific.
The instruction MTVSRBMI set 0x00(or 0xFF) to each byte of VSR based on
the bits mask. Using the instruction instead of constant pool can reduce
the asm code size and instructions in power10.
Follow-up to #141333. Relocation generation called both addReloc and
applyFixup, with the default addReloc invoking shouldForceRelocation,
resulting in three virtual calls. This approach was also inflexible, as
targets needing additional data required extending
`shouldForceRelocation` (see #73721, resolved by #141311).
This change integrates relocation handling into applyFixup, eliminating
two virtual calls. The prior default addReloc is renamed to
maybeAddReloc. Targets overriding addReloc now call their customized
addReloc implementation.
To align with the majority of targets where these overrides are
out-of-line. The consistency helps the pending change that
merges addReloc and applyFixup.
/data/llvm-project/llvm/lib/Target/PowerPC/PPCISelLowering.cpp:5769:19: error: loop variable '[Reg, N]' creates a copy from type 'std::pair<unsigned int, llvm::SDValue> const' [-Werror,-Wrange-loop-construct]
for (const auto [Reg, N] : RegsToPass)
^
/data/llvm-project/llvm/lib/Target/PowerPC/PPCISelLowering.cpp:5769:8: note: use reference type 'std::pair<unsigned int, llvm::SDValue> const &' to prevent copying
for (const auto [Reg, N] : RegsToPass)
^~~~~~~~~~~~~~~~~~~~~
&
/data/llvm-project/llvm/lib/Target/PowerPC/PPCISelLowering.cpp:6193:19: error: loop variable '[Reg, N]' creates a
copy from type 'std::pair<unsigned int, llvm::SDValue> const' [-Werror,-Wrange-loop-construct]
for (const auto [Reg, N] : RegsToPass) {
^
/data/llvm-project/llvm/lib/Target/PowerPC/PPCISelLowering.cpp:6193:8: note: use reference type 'std::pair<unsigned int, llvm::SDValue> const &' to prevent copying
for (const auto [Reg, N] : RegsToPass) {
^~~~~~~~~~~~~~~~~~~~~
&
/data/llvm-project/llvm/lib/Target/PowerPC/PPCISelLowering.cpp:6806:19: error: loop variable '[Reg, N]' creates a copy from type 'std::pair<unsigned int, llvm::SDValue> const' [-Werror,-Wrange-loop-construct]
for (const auto [Reg, N] : RegsToPass) {
^
/data/llvm-project/llvm/lib/Target/PowerPC/PPCISelLowering.cpp:6806:8: note: use reference type 'std::pair<unsigned int, llvm::SDValue> const &' to prevent copying
for (const auto [Reg, N] : RegsToPass) {
^~~~~~~~~~~~~~~~~~~~~
&
3 errors generated.
Printing an expression is error-prone without a MCAsmInfo argument.
Remove the operator<< overload and replace callers with
MCAsmInfo::printExpr. Some callers are changed to MCExpr::print, with
the goal of eventually making it private.
so that subclasses can provide the appropriate MCAsmInfo to print
MCExpr objects.
At present, llvm/utils/TableGen/AsmMatcherEmitter.cpp constucts a
generic MCAsmInfo.
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.