If the zext has the nneg flag and we can prove the promoted input
is sign extended, we can avoid generating an AND that we might not
be able to remove. RISC-V emits a lot of sext_inreg operations during
i32->i64 promotion that makes this likely.
I've restricted this to the case where the promoted type is the same
as the result type so we don't need to create an additional extend.
I've also restricted it to cases where the target has stated a
preference for sext like i32->i64 on RV64. This is largely to avoid
wasting time in computeNumSignBits until we have a test case that
benefits.
Patch fixes the sync-on-timeout logic in lldb and switches to qEcho
based ping, instead of qC. This fixes vRun message case, when there is
no process yet and qC returns an error.
The adds a check that replaces specific numeric literals like `32767`
with the equivalent call to `std::numeric_limits` (such as
`std::numeric_limits<int16_t>::max())`.
Partially addresses #34434, but notably does not handle cases listed in
the title post such as `~0` and `-1`.
This PR fixes the feature detection for RISC-V floating-point support in
LLVM's libc implementation.
The `__riscv_flen` macro represents the floating-point register width in
bits (32, 64, or 128). Since Extension D is specifically documented as
implying F, we can use simple >= comparisons to detect them.
For half-precision support, the implementation checks for the Zfhmin
extension as RVA22 and RVA23 profiles only require Zfhmin rather than
the full Zfh extension. Zfh also implies Zfhmin, so checking for Zfhmin
should cover all cases.
With proper co-author.
Original message:
We need to pass the operand of LLA to GetSupportedConstantPool.
This replaces #142292 with test from there added as a pre-commit
for both medlow and pic.
Co-authored-by: Carl Nettelblad carl.nettelblad@rapidity-space.com
We need to pass the operand of LLA to GetSupportedConstantPool.
This replaces #142292 with test from there added as a pre-commit
for both medlow and pic.
ArrayRef has a constructor that accepts std::nullopt. This
constructor dates back to the days when we still had llvm::Optional.
Since the use of std::nullopt outside the context of std::optional is
kind of abuse and not intuitive to new comers, I would like to move
away from the constructor and eventually remove it.
This patch takes care of the mlir side of the migration, starting with
straightforward places like "return std::nullopt;" and ternally
expressions involving std::nullopt.
Explicitly pass the operand we are checking to canNarrowLoad. This
simplifies the check if the operands match across recipes and enables
future optimizations.
A shuffle will take two input vectors and a mask, to produce a new
vector of size <MaskElts x SrcEltTy>. Historically it has been assumed
that the SrcTy and the DstTy are the same for getShuffleCost, with that
being relaxed in recent years. If the Tp passed to getShuffleCost is the
SrcTy, then the DstTy can be calculated from the Mask elts and the src
elt size, but the Mask is not always provided and the Tp is not reliably
always the SrcTy. This has led to situations notably in the SLP
vectorizer but also in the generic cost routines where assumption about
how vectors will be legalized are built into the generic cost routines -
for example whether they will widen or promote, with the cost modelling
assuming they will widen but the default lowering to promote for integer
vectors.
This patch attempts to start improving that - it originally tried to
alter more of the cost model but that too quickly became too many
changes at once, so this patch just plumbs in a DstTy to getShuffleCost
so that DstTy and SrcTy can be reliably distinguished. The callers of
getShuffleCost have been updated to try and include a DstTy that is more
accurate. Otherwise it tries to be fairly non-functional, keeping the
SrcTy used as the primary type used in shuffle cost routines, only using
DstTy where it was in the past (for InsertSubVector for example).
Some asserts have been added that help to check for consistent values
when a Mask and a DstTy are provided to getShuffleCost. Some of them
took a while to get right, and some non-mask calls might still be
incorrect. Hopefully this will provide a useful base to build more
shuffles that alter size.
This PR is 2nd part of
[P1857R3](https://github.com/llvm/llvm-project/pull/107168)
implementation, and mainly implement the restriction `A module directive
may only appear as the first preprocessing tokens in a file (excluding
the global module fragment.)`:
[cpp.pre](https://eel.is/c++draft/cpp.pre):
```
module-file:
pp-global-module-fragment[opt] pp-module group[opt] pp-private-module-fragment[opt]
```
We also refine tests use `split-file` instead of conditional macro.
Signed-off-by: yronglin <yronglin777@gmail.com>
…eFromDylib.
When comparing CPU subtypes from slices in a MachO universal binary we
need to apply the MachO::CPU_SUBTYPE_MASK to mask out any flags in the
high bits, otherwise we might fail to correctly match a slice and return
a spurious "does not contain slice" error.
rdar://153913779
Split off EMIT-SCALAR printing changes from already approved
https://github.com/llvm/llvm-project/pull/140623.
Currently all casts are single scalars, this brings printing in line
with printing for other VPInstructions.
Add missing listener notifications when erasing nested
blocks/operations.
This commit also moves some of the functionality from
`ConversionPatternRewriter` to `ConversionPatternRewriterImpl`. This is
in preparation of the One-Shot Dialect Conversion refactoring: The
implementations in `ConversionPatternRewriter` should be as simple as
possible, so that a switch between "rollback allowed" and "rollback not
allowed" can be inserted at that level. (In the latter case,
`ConversionPatternRewriterImpl` can be bypassed to some degree, and
`PatternRewriter::eraseBlock` etc. can be used.)
Depends on #145018.
This does a few small things:
- inline `__libcpp_compute_min`, since we can don't have to put the
arithmetic behind a constraint. Simple arithmetic also tends to be
faster to compile than instantiating a type.
- Remove an unused include (and add missing includes elsewhere)
- Remove `__min` and `__max` from the `bool` specialization
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
Since we've removed allocator support, we can remove a few support
structures. This only affects the policy implementation, so this
shouldn't even be ABI sensitive.
If the pointer is aligned to more than the size of the vector, we can
widen the load up to next power of 2 size, as SDAG performs.
Some of the v3 tests are currently worse - those should be addressed in
other issues.
We can, through legalization of certain operations, end up generating
G_INDEXED_LOAD into FPR registers that require entensions. SExt and ZExt
will always opt for GPR, but anyext/noext can curently be set to FPR
registers in regbankselect. As writing a subregister will set higher
bits in the same register to 0, we can successfully handle zext and
anyext on FPR registers, which is what this patch attempts to add.
String table size is too big for large binary when symbol table is
enabled. Some strings in strtab is same so it can be reused.
This patch revives 9ffeaaa authored by mstorsjo with the prioritized
string table builder to fix debug section name issue (see 4d2eda2
for more details).
---------
Co-authored-by: Wen Haohai <whh108@live.com>
Co-authored-by: James Henderson <James.Henderson@sony.com>