The patch handles fixed type strict-fp by new RISCVISD::STRICT_ prefixed
isd nodes.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D145900
D132262 tried to simplify `IsMetadataOrEHFrameSection` originally introduced in
D127549 but caused a regression as `.quad` directives in
```
.section .note,"a",@note; note:
.quad extern-note # extern is undefined
.section .rodata,"a",@progbits; rodata:
.quad extern-rodata # extern is undefined
.section .nonalloc,"",@progbits; nw:
.quad extern-nw
```
are incorrectly rejected: these differences may be link-time constants and
are allowed in GNU assembler and LLVM MC's non-RISC-V ports.
Relax the conditions to allow these cases. For A-B, A may be defined later, but
this requiresFixups call has to eagerly make a decision. For now, emit ADD/SUB
unless A is `.L*`. This euristic handles many temporary label differences for
.debug_* and .apple_types sections. Ideally we should delay the decision of
PC-relative vs ADD/SUB until A is defined.
Reviewed By: compnerd
Differential Revision: https://reviews.llvm.org/D145474
Follow-up to D143226
Currently we incorrectly emit R_RISCV_ADD32/R_RISCV_SUB32.
Emit R_RISCV_PLT32 instead. The new behavior matches x86-64 and AArch64.
The FP16 broadcast and transpose can always use the same instructions as are
used for i16 vectors, with or without +fullfp16. This fills in some extra costs
to make sure we get them right.
Differential Revision: https://reviews.llvm.org/D146035
We currently have 3 functions and 3 lookup tables. This was the
most expediant and obvious way to fix several bugs.
This patch uses a single function and single lookup
table. It uses APFloat::convert to convert from the half or double
to single precision. If the conversion doesn't have any errors or
lose any information we use the f32 table to finish the lookup.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D145897
Use the predicate condition instead of checkFeatures in *GenDAGISel.inc.
This makes the code similar to isel pattern predicates.
checkFeatures is still used by code created by SubtargetEmitter so
we can't remove the string. Backends need to be careful to keep
the string and predicates in sync, but I don't think that's a big issue.
I haven't measured it, but this should be a compile time improvement
for isel since we don't have to do any of the string processing that's
inside checkFeatures.
Reviewed By: kparzysz
Differential Revision: https://reviews.llvm.org/D146012
The SystemZ backend will try to reuse an existing subtraction of two values
whenever they are to be compared for equality. This depends on the SystemZ
subtraction instruction setting the condition code, which can also signal
overflow.
A later pass will remove the compare and reuse the CC from the subtraction
directly. However, if that subtraction has the NSW flag set it will not
include the overflow bit in the updated CC user. That was a bug which can
lead to wrong results, as shown by a csmith program.
Fixes: https://github.com/llvm/llvm-project/issues/61268
Reviewed By: nikic, uweigand
Differential Revision: https://reviews.llvm.org/D145811
D144535 enabled machine copy propagation for RISC-V and added it to the
pass pipeline in addPreEmitPass2 (after the MachineOutliner).
Unfortunately, the MachineCopyPropagation pass is unable to correctly
analyse outlined functions, and will delete copy instructions where a
register is set that is intended to be live-out.
RISCVInstrInfo::buildOutlinedFrame will directly insert a JALR, while a
similar function going through the normal codegen path would have a
PseudoRet with operands indicating registers that are live-out.
This patch does the simplest fix, which is to run MachineCopyPropagation
before the MachineOutliner.
Differential Revision: https://reviews.llvm.org/D146037
Switch DAGISel over to UniformityAnalysis, which was one of the last remaining users of the DivergenceAnalysis.
No explosions seen during internal testing so this looks like a smooth transition.
Reviewed By: sameerds
Differential Revision: https://reviews.llvm.org/D145918
i8 and i16 are not using overflow.
Reduce the number of zero extension instructions.
To reduce the uncertainty of the unknown,
most of the checks of the virtual function are kept
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D143646
This adds two new methods to ShuffleVectorInst, isInterleave and
isInterleaveMask, so that the logic to check if a shuffle mask is an
interleave can be shared across the TTI, codegen and the interleaved
access pass.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D145971
Switch DAGISel over to UniformityAnalysis, which was one of the last remaining users of the DivergenceAnalysis.
No explosions seen during internal testing so this looks like a smooth transition.
Reviewed By: sameerds
Differential Revision: https://reviews.llvm.org/D145918
After this patch all arbitrary size integers (smaller than 64 bits) in
LLVM IR will be promoted to regular size type in SPIR-V (OpTypeInt
8/16/32/64).
Differential Revision: https://reviews.llvm.org/D145137
This is an initial attempt at preserving debug information in the pseudo
instruction expansion of the AArch backend. In particular, we preserve
the instruction number required by the InstrRef implementation of live
debug values.
There are many other expansions that need to be considered, but the ones
addressed in this commit should be extremely common, as they handle most
arithmetic and logical instructions.
Differential Revision: https://reviews.llvm.org/D145943
The existing scalable costing was just bad. No LMUL cost, no i1 specific costing, etc.. We had updated the fixed cost model, but none of the code is actually fixed length specific. Moving it down handles the scalable cases too.
Fixed vector costs may be more precise, but the actual lowering will use scalable vectors if nothing better is available. During review, we noticed a case where fixed vector reverse can be improved cost model wise, that will follow seperately.
Differential Revision: https://reviews.llvm.org/D145953
Commit 3671bdbcd214("[BPF] Fix a BTF type pruning bug") fixed a
pruning bug to allow generate more types. But the commit has a bug
which permits to generate more types than necessary. The following
is an example to illustrate the problem.
struct t1 {
int a;
};
struct t2 {
struct t1 *p1;
struct t1 *p2;
int b;
};
int foo(struct t2 *arg) {
return arg->b;
}
The following is the part of BTF generation sequence:
(1). 'struct t2 *arg' -> 'struct t1 *p1'
In this step, the type 'struct t1' will be generated as
a forward decl and the ptr type (to 'struct t1') will
be stored in the internal type table.
(2). now the second field 'struct t1 *p2' will be processed.
Since the ptr type (to 'struct t1') already in the type
table, the existing logic strips out ptr modifier and
is able to generate BTF type for 'struct t1'.
In the above step (2), if CheckPointer is true (the type traversal
chain including a struct member), 'ptr' modifier should be checked
and the subsequent type generation should be skipped since
the same case has been processed in visitDerivedType().
The issue is exposed when I am trying to use llvm15 to compile
some internal bpf programs. The bpf skeleton put the whole
ELF section (after striping some sections like dwarf) as a string.
The large BTF section triggered the following error:
bpf_object_with_struct_ops_test_prog_bpf/BpfObjectWithStructOpsTestProg.skel.h:222:23:
error: string literal of length 140144 exceeds maximum length 65536 that C++ compilers
are required to support [-Werror,-Woverlength-strings]
return (const void *)"\
^~
1 error generated.
Although adding -Wno-overlength-strings could workaround the issue,
improving llvm BTF generation sounds better esp. for users using vmlinux.h.
Differential Revision: https://reviews.llvm.org/D145816
This should be a typo in `emitConstantSizeRepmov`. Both its caller and
callee store the alignment in a 64-bit variables, no reason to truncate
it to 32-bit. It results in alignment turns into 0 when larger than
0x100000000.
Fixes#61348
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D145863
This slightly increases the costs of InsertElement instructions that are part
of a vector splat sequence, i.e. a load, InsertElement and a shuffle (load +
dup). The resulting LD1R is a high latency instruction, and this slight
increase in costs avoids SLP vectorisation for a couple of cases where this
isn't profitable.
Fixes: https://github.com/llvm/llvm-project/issues/61047
Differential Revision: https://reviews.llvm.org/D145578
Without the single use restriction we may replace the and with a
more costly duplicated compare.
Differential Revision: https://reviews.llvm.org/D145755
I don't have a test case that fails for this, but it seemed like
we should only handle legal types. The callers I looked at in
DAGCombine either check the type is legal or don't even call
isFPImmLegal unless LegalOperations is true.
Written in a slightly odd way because switches on EVT require
an additional isSimple check so an if/else chain is easier. Used a bool
to shorten the code instead of having multiple ifs and returns.
AArch64 uses a similarish structure.
Fix mistake in f16 table in previous patch.
Original commit message:
Use separate lookup tables instead of trying to reuse the fli.s
table.
We were missing the 2 denormal cases for fli.h. We also had an issue
where fli.d was only checking 8 bits of the 11 bit exponent.
Use separate lookup tables instead of trying to reuse the fli.s
table.
We were missing the 2 denormal cases for fli.h. We also had an issue
where fli.d was only checking 8 bits of the 11 bit exponent.
D145471 added overrides of the other signature to return MemBytes,
but shouldn't have removed these overrides.
These signatures will now call the MemBytes signature and ignore
the MemBytes. This matches X86.
D102894 introduced common code for the emission of ELF attributes. Our
implementation in RISC-V predates this, and basically copies the Arm
logic at the time. This patch removes that duplication and uses the
shared logic instead.
Differential Revision: https://reviews.llvm.org/D145570
Post ISel, LDS variables are absolute values. Representing them as
such is simpler than the frame recalculation currently used to build assembler
tables from their addresses.
This is a precursor to lowering dynamic/external LDS accesses from non-kernel
functions.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D144221
D143767 will change the intrinsics used to lower floating-point
svadd_x, svmul_x and svsub_x builtins. This will result in the
combines added as part of D140200 to no longer fire in all cases.
This patch extends the existing combines for contraction to cover
fadd_u, fmul_u and fsub_u intrinsics.
Differential Revision: https://reviews.llvm.org/D144413
This came about while investigating ways to handle D145468 in a more generic manner, which involves trying harder to fold and(zext(x),c) -> zext(and(x,c))
Alive2: https://alive2.llvm.org/ce/z/7fXtDt (generic fold)
Differential Revision: https://reviews.llvm.org/D145855
On 64-bit target, when doing i64 BR_CC where one of the comparison operands is a
constant zero, try to fold the compare and BPcc into a BPr instruction.
For all integers, EQ and NE comparison are available, additionally for signed
integers, GT, GE, LT, and LE is also available.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D142461