This is a more general form of the recently added isel pattern
(seteq (i64 (and GPR:$rs1, 0x8000000000000000)), 0)
-> (XORI (i64 (SRLI GPR:$rs1, 63)), 1)
We can use a shift right for any AND mask that is a negated power
of 2. But for every other constant we need to use seqz instead of
xori. I don't think there is a benefit to xori over seqz as neither
are compressible.
We already do this transform from target independent code when the setcc
constant is a non-zero subset of the AND mask that is not a legal icmp
immediate.
I don't believe any of these patterns comparing MSBs to 0 are
canonical according to InstCombine. The canonical form is (X < 4096).
I'm curious if these appear during SelectionDAG and if so, how.
My goal here was just to remove the special case isel patterns.
This helps the 3 vendor extensions that make sext_inreg i1 legal.
I'm delaying this until after LegalizeDAG since we normally have
sext_inreg i1 up until LegalizeDAG turns it into and+neg.
I also delayed the recently added (sext_inreg (xor (setcc), -1), i1)
combine. Though the xor isn't likely to appear before LegalizeDAG anyway.
If we end up with a extract_element VPInstruction where both operands
are live-ins, we will try to fold the live-ins even though the first
operand is a vector whilst the live-in is scalar.
This fixes it by just returning the vector live-in instead of calling
the folder, and removes the handling for insertelement where we aren't
able to do the fold. From some quick testing we previously never hit
this fold anyway, and were probably just missing test coverage.
Fixes#154045
In FoldArithToVectorOuterProduct pattern, static cast to vector type
causes assertion when a scalar type was encountered. It seems the author
meant to have a dyn_cast instead.
This NFC patch handles it by using dyn_cast.
Previously, HW mode name was appended to decoder namespace name when
enumerating encodings, and then emitTable appended the bit width to it
to form the final table name. Let's do this all in one place.
A nice side effect is that this allows us to avoid having to deal with
std::string.
The changes in the tests are caused by the different order of tables.
This patch introduces `AAAMDGPUUniformArgument` that can infer `inreg` function
argument attribute. The idea is, for a function argument, if the corresponding
call site arguments are always uniform, we can mark it as `inreg` thus pass it
via SGPR.
In addition, this AA is also able to propagate the inreg attribute if feasible.
In some cases, such as using `lto` or `llc`, relax feature is not
available from this `SubtargetInfo` (`LoongArchAsmBackend` is
instantiated too early), causing loss of relocations.
This commit modifiy the condition to check whether the section which
contains the two symbols is relaxable. If not relaxable, no need to
record relocations.
Also starts pruning out these calls if the exception model is
forced to none.
I worked backwards from the logic in addPassesToHandleExceptions
and the pass content. There appears to be some tolerance
for mixing and matching exception modes inside of a single module.
As far as I can tell _Unwind_CallPersonality is only relevant for
wasm, so just add it there.
As usual, the arm64ec case makes things difficult and is
missing test coverage. The set of calls in list form is necessary
to use foreach for the duplication, but in every other context a
dag is more convenient. You cannot use foreach over a dag, and I
haven't found a way to flatten a dag into a list.
This removes the last manual setLibcallImpl call in generic code.
There are several libcall choices for MUL_I64 which depend on the
subtarget, but this is the base case. The manual custom ISelLowering
is still overriding the decision until we have a way to control
lowering choices, but we can still get the calling convention
set for now.
Adds support for accessing individual resources from fixed-size global resource arrays.
Design proposal:
https://github.com/llvm/wg-hlsl/blob/main/proposals/0028-resource-arrays.md
Enables indexing into globally scoped, fixed-size resource arrays to retrieve individual resources. The initialization logic is primarily handled during codegen. When a global resource array is indexed, the
codegen translates the `ArraySubscriptExpr` AST node into a constructor call for the corresponding resource record type and binding.
To support this behavior, Sema needs to ensure that:
- The constructor for the specific resource type is instantiated.
- An implicit binding attribute is added to resource arrays that lack explicit bindings (#152452).
Closes#145424
It's a followup to https://github.com/llvm/llvm-project/pull/154189 ,
which broke test on android bot. Making sure XFAIL only happen on darwin
bots
rdar://158543555
Co-authored-by: Mariusz Borsa <m_borsa@apple.com>
Includes CMake files and placeholder header, library, test tool, regression
test and unit test.
The aim for this project is to create a replacement for the existing ORC
Runtime that currently resides in `llvm-project/compiler-rt/lib/orc`. The new
project will provide a superset of the original features, and the old runtime
will be removed once the new runtime is sufficiently developed.
See discussion at
https://discourse.llvm.org/t/rfc-move-orc-executor-support-into-top-level-project/81049
Currently, VPInterleaveRecipe::execute does not support generating LLVM
IR for interleaved accesses that require a gap mask for scalable VFs.
It would be better to detect and prevent such groups from being
vectorized as interleaved accesses in
LoopVectorizationCostModel::interleavedAccessCanBeWidened, rather than
relying on the TTI function getInterleavedMemoryOpCost to return an
invalid cost.
Same thing as #149584, but for more elements of cc_rules (e.g.
`CcInfo`), and applying it to some files that were added since then
(build files under mlir/examples).
Command: `buildifier --lint=fix
--warnings=native-cc-binary,native-cc-import,native-cc-library,native-cc-objc-import,native-cc-objc-library,native-cc-shared-library,native-cc-test,native-cc-toolchain,native-cc-toolchain-suite,native-cc-fdo-prefetch-hints,native-cc-fdo-profile,native-cc-memprof-profile,native-cc-propeller-optimize,native-cc-common,native-cc-debug-package-info,native-cc-info,native-cc-shared-library-info,native-cc-shared-library-hint-info`
This updates the pointer authentication documentation to include a
complete description of the existing functionaliy and behaviour, details
of the more complex aspects of the semantics and security properties,
and the Apple arm64e ABI design.
Co-authored-by: Ahmed Bougacha
Co-authored-by: Akira Hatanaka
Co-authored-by: John Mccall
---------
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
Co-authored-by: Akira Hatanaka <ahatanak@gmail.com>
Co-authored-by: John Mccall <rjmccall@apple.com>
This change adds support for calling virtual functions. This includes
adding the cir.vtable.get_virtual_fn_addr operation to lookup the
address of the function being called from an object's vtable.
Poisoning of C++ array redzones is only implemented for x86 CPUs. New
arm64 bots are being brought up, where these test fail.
Original C++ arrays shadow poisoning change:
http://reviews.llvm.org/D4774
rdar://158025391
Co-authored-by: Mariusz Borsa <m_borsa@apple.com>
The "ARRAY=" argument to these intrinsics cannot be scalar, whether
"DIM=" is present or not. (Allowing the "ARRAY=" argument to be scalar
when "DIM=" is absent would be a conceivable extension returning an
empty result array, like SHAPE() does with extents, but it doesn't seem
useful in a programming language without compilation-time rank
polymorphism apart from assumed-rank dummy arguments, and those are
supported.)
Fixes https://github.com/llvm/llvm-project/issues/154044.
NAMELIST input in child I/O is rare, and it's not clear in the standard
whether it should be allowed to advance to later records in the parent
unit. But GNU Fortran supports it, and there's no good reason not to do
so since a NAMELIST input group that isn't terminated on the same line
is otherwise going to be a fatal error.
Fixes https://github.com/llvm/llvm-project/issues/153416.
Ensure that when a connected unit is reopened with POSITION='REWIND' or
'APPEND', and a STATUS='OLD' or unspecified, that it is actually
repositioned as requested.
Fixes https://github.com/llvm/llvm-project/issues/153426.
Because the per-dimension information in a descriptor holds an extent
and a lower bound, but not an upper bound, the calculation of the upper
bound sometimes requires that the extent and lower bound be extracted
from a descriptor and added together, minus 1. This shouldn't be
attempted when the NamedEntity of the descriptor is something that
shouldn't be duplicated and used twice; specifically, it shouldn't apply
to NamedEntities containing references to impure functions as parts of
subscript expressions.
Fixes https://github.com/llvm/llvm-project/issues/153031.
- Add `cl::cat(ProfGenCategory)` to non-hidden options so they show up
in `--help` output.
- Introduce `Options.h` for options referenced in multiple files.
When the left-hand side of an assignment statement is an allocatable
that has a monomorphic derived type, and the right-hand side of the
assignment has a type that is an extension of that type, *don't* change
the incoming type or element size of the descriptor before allocating
it.
Fixes https://github.com/llvm/llvm-project/issues/152758.