Copy `nneg` flag when building `UINT_TO_FP` from `uitofp` and use
`nneg` flag in the one place we transform `UINT_TO_FP` -> `SINT_TO_FP`
if the operand is non-negative.
SVE has some non-temporal masked loads and stores. The metadata coming
from the nodes is not copied to the MMO at the moment though, meaning it
will generate a normal instruction. This patch ensures that the right
flags are set if the instruction has non-temporal metadata.
The current APInt::multiplicativeInverse takes a modulus which can be
any value, but all in-tree callers use a power of two. Moreover, most
callers want to use two to the power of the width of an existing APInt,
which is awkward because 2^N is not representable as an N-bit APInt.
Add a new overload of multiplicativeInverse which implicitly uses
2^BitWidth as the modulus.
Reverse the fold with handling inside canCreateUndefOrPoison for cases where we know that the extract index is in bounds.
This exposed a number or regressions, and required some initial freeze handling of SCALAR_TO_VECTOR, which will require us to properly improve demandedelts support to handle its undef upper elements.
There is still one outstanding regression to be addressed in the future - how do we want to handle folds involving frozen loads?
Fixes#86968
On RV64, we legalize zexts of i1s to (vselect m, (splat_vector i64 1),
(splat_vector i64 0)), where the splat_vectors are implicitly
truncating.
When the vselect is used by a binop we want to pull the vselect out via
foldSelectWithIdentityConstant. But because vectors with an element size
< i64 will truncate, isNeutralConstant will return false.
This patch handles truncating splats by getting the APInt value and
truncating it. We almost don't need to do this since most of the neutral
elements are either one/zero/all ones, but it will make a difference for
smax and smin.
I wasn't able to figure out a way to write the tests in terms of select,
since we need the i1 zext legalization to create a truncating
splat_vector.
This supercedes #87236. Fixed vectors are unfortunately not handled by
this patch (since they get legalized to _VL nodes), but they don't seem
to appear in the wild.
Hi all,
This patch is a follow-up of #79101. It migrates logic from
`visitVSELECT` to `visitVP_SELECT` to simplify `vp.select`. With this
patch we can do the following combinations:
```
vp.select undef, T, F --> T (if T is a constant), F otherwise
vp.select <condition>, undef, F --> F
vp.select <condition>, T, undef --> T
vp.select false, T, F --> F
vp.select <condition>, T, T --> T
```
I'm a total newbie to llvm and I'm sure there's room for improvements in
this patch. Please let me know if you have any advice. Thank you in
advance!
CallSiteInfo is originally used only for argument - register pairs. Make
it struct, in which we can store additional data for call sites.
Also, the variables/methods used for CallSiteInfo are named for its
original use case, e.g., CallFwdRegsInfo. Refactor these for the
upcoming
use, e.g. addCallArgsForwardingRegs() -> addCallSiteInfo().
An upcoming patch will add type ids for indirect calls to propogate them
from
middle-end to the back-end. The type ids will be then used to emit the
call
graph section.
Original RFC:
https://lists.llvm.org/pipermail/llvm-dev/2021-June/151044.html
Updated RFC:
https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html
Differential Revision: https://reviews.llvm.org/D107109?id=362888
Co-authored-by: Necip Fazil Yildiran <necip@google.com>
The ID argument of `gc.statepoint` gets incorrectly truncated to 32 bits
during code generation.
This is fixed by using `uint64_t` instead of `unsigned` for the `ID`
member in `SelectionDAGBuilder::StatepointLoweringInfo`, and a
`patchpoint` test case is extended to check for 64 bit ID generation in
stackmaps.
The earlier implementation on AMDGPU used explicit token operands at
SI_CALL and SI_CALL_ISEL. This is now replaced with CONVERGENCECTRL_GLUE
operands, with the following effects:
- The treatment of tokens at call-like operations is now consistent with
the treatment at intrinsics.
- Support for tail calls using implicit tokens at SI_TCRETURN "just
works".
- The extra parameter at call-like instructions is eliminated, thus
restoring those instructions and their handling to the original state.
The new glue node is placed after the existing glue node for the
outgoing call parameters, which seems to not interfere with selection of
the call-like nodes.
The folding of sign/zero extensions into an atomic load by specifying an
extension type is not target specific, and therefore belongs in the
DAGCombiner rather than in the SystemZ backend.
- Handle atomic loads similarly to regular loads by adding
AtomicLoadExtActions with set/get methods.
- Move SystemZ extendAtomicLoad() to DagCombiner.cpp.
We check DAG.haveNoCommonBitsSet so the operands will be known to be
disjoint.
I couldn't think of a codegen test case since most targets aren't
checking hasDisjoint yet, apart from RISCV in the or_is_add pattern, but
it also falls back to computeKnownBits.
We try clamp the index to be within the bounds of the stack object
we create, but if we don't freeze it, poison can propagate into the
clamp code. This can cause the access to leave the bounds of the
stack object.
We have other instances of this issue in type legalization and extract_elt/subvector,
but posting this patch first for direction check.
Fixes#86717
There were 3 temporaries that just renamed the 3 well name arguments to the
function to Tmp1-3. Looks like this was done when the code was extracted from
elsewhere into a separate function 15 years ago.
Before this fix, a duplicate llvm.dbg.value intrinsic referring to an
argument, after an alloca, would be generated with `$noreg`, losing
debug information. Instead, we silently drop the second debug info, so
it doesn't break the first one.
rdar://125375717
Currently patchpoints can only have two result types, `void` and `i64`.
This limits the result to general purpose registers.
This patch makes `patchpoint.i64` an overloadable intrinsic, allowing
result values that can fit in a single register (e.g. integers,
pointers, floats).
Remove getSizeOrUnknown call when MachineMemOperand is created. For Scalable
TypeSize, the MemoryType created becomes a scalable_vector.
2 MMOs that have scalable memory access can then use the updated BasicAA that
understands scalable LocationSize.
Original Patch by Harvin Iriawan
Co-authored-by: David Green <david.green@arm.com>
Fixes#84831
When matching carry pattern with `getAsCarry`, it may produce different
type of carryout. This patch checks such case and does early exit.
I'm new to DAG, any suggestion is appreciated.
This is needed to support non-intrinsic functions returning tuple types
which are represented as structs with scalable vector types in IR.
I suspect this may have been broken since
https://reviews.llvm.org/D158115
Allow targets to rely on TargetLowering::isGuaranteedNotToBeUndefOrPoisonForTargetNode to test nodes for canCreateUndefOrPoisonForTargetNode + all arguments are isGuaranteedNotToBeUndefOrPoison.
Targets can still perform this themselves for specific special case nodes (e.g. target shuffles).
Matches the fallback in SelectionDAG::isGuaranteedNotToBeUndefOrPoison
Add ones for every high bit that will cleared.
This will allow us to evaluate variables that have their bits known to
see if they have no risk of overflow despite the shift amount being
greater than the difference between the two types.
This reverts commit 353fbeb0a294d2c7cef6d88607fa0fd50ee81462. It crashes
when it encounters an UINT_TO_FP.
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:1618 in SDValue llvm::SelectionDAG::getConstant(const ConstantInt &, const SDLoc &, EVT, bool, bool): VT.isInteger() && "Cannot create FP integer constant!"
This patch renames DPLabel to DbgLabelRecord, in accordance with the
ongoing DbgRecord rename. This rename was fairly trivial, since DPLabel
isn't as widely used as DPValue and has no real conflicts in either its
full or abbreviated name. As usual, the entire replacement was done
automatically, with `s/DPLabel/DbgLabelRecord/` and `s/DPL/DLR/`.