There's a long comment explaining this approach in RISCVInstrInfoXqci.td
This change also fixes some problems when fixups are able to be resolved for `qc.e.li` and `qc.li`.
Instead of allowing a parsed MCInst to have a either uimm10 or simm10,
always render as simm10. This avoids a mismatch between parsed MCInst
and disassembled MCInst when a uimm10 value is used.
getNode has logic to intersect flags correctly if the new node happens
to CSE with an existing node. Setting node flags after getNode bypasses
this logic and may change the node for other uses where the flags don't
hold.
Merge the attributes of object files being linked together. The
`.hexagon.attributes` section can be used by loaders and analysis tools.
This is similar to the .riscv.attributes, introduced in
8a900f2438b4a167b98404565ad4da2645cc9330 /
https://reviews.llvm.org/D138550.
Adding an extra argument before a `fp128` only changes the stack offset
by four bytes, while it should instead go in the next 16-aligned slot.
Add a test demonstrating the current behavior.
`no_x86_scrub_sp` is added because offset from the stack pointer is
needed to show the problem.
Relevant issue: https://github.com/llvm/llvm-project/issues/77401
LLDB uses the LLVM disassembler to determine the size of instructions and
to do the actual disassembly. Currently, if the LLVM disassembler can't
disassemble an instruction, LLDB will ignore the instruction size, assume
the instruction size is the minimum size for that device, print no useful
opcode, and print nothing for the instruction.
This patch changes this behavior to separate the instruction size and
"can't disassemble". If the LLVM disassembler knows the size, but can't
dissasemble the instruction, LLDB will use that size. It will print out
the opcode, and will print "<unknown>" for the instruction. This is much
more useful to both a user and a script.
The impetus behind this change is to clean up RISC-V disassembly when
the LLVM disassembler doesn't understand all of the instructions.
RISC-V supports proprietary extensions, where the TD files don't know
about certain instructions, and the disassembler can't disassemble them.
Internal users want to be able to disassemble these instructions.
With llvm-objdump, the solution is to pipe the output of the disassembly
through a filter program. This patch modifies LLDB's disassembly to look
more like llvm-objdump's, and includes an example python script that adds
a command "fdis" that will disassemble, then pipe the output through a
specified filter program. This has been tested with crustfilt, a sample
filter located at https://github.com/quic/crustfilt .
Changes in this PR:
- Decouple "can't disassemble" with "instruction size".
DisassemblerLLVMC::MCDisasmInstance::GetMCInst now returns a bool for
valid disassembly, and has the size as an out paramter.
Use the size even if the disassembly is invalid.
Disassemble if disassemby is valid.
- Always print out the opcode when -b is specified.
Previously it wouldn't print out the opcode if it couldn't disassemble.
- Print out RISC-V opcodes the way llvm-objdump does.
Code for the new Opcode Type eType16_32Tuples by Jason Molenda.
- Print <unknown> for instructions that can't be disassembled, matching
llvm-objdump, instead of printing nothing.
- Update max riscv32 and riscv64 instruction size to 8.
- Add example "fdis" command script.
- Added disassembly byte test for x86 with known and unknown instructions.
- Added disassembly byte test for riscv32 with known and unknown instructions,
with and without filtering.
- Added test from Jason Molenda to RISC-V disassembly unit tests.
This PR introduces the use of pointer authentication to objective-c[++].
This includes:
* __ptrauth qualifier support for ivars
* protection of isa and super fields
* protection of SEL typed ivars
* protection of class_ro_t data
* protection of methodlist pointers and content
Alas reflection pushed p2719 out of C++26, so this PR changes the
diagnostics to reflect that for now type aware allocation is
functionally a clang extension.
If the remote process died with a signal, this will be exposed by ssh
as an exit code in the range 128 < rc < 160. We may be running under
`not --crash` which will expect us to also die with a signal, so send
the signal to ourselves so that wait4() in `not` will detect the signal.
Speculative fix for failing buildbot:
https://lab.llvm.org/buildbot/#/builders/193/builds/9070
Fixes#145924 and #140416
Depends on #146173 being merged first.
This PR moves the scalarizer pass to immediately before the
dxil-flatten-arrays pass to allow the dxil-flatten-arrays pass to turn
scalar GEPs (including i8 GEPs) into flattened array GEPs where
applicable.
A number of LLVM DirectX CodeGen tests have been edited to remove scalar
GEPs and also correct previously uncaught incorrectly-transformed GEPs.
No more validation errors of the form `Access to out-of-bounds memory is
disallowed` or `TGSM pointers must originate from an unambiguous TGSM
global variable` appear anymore after this PR when compiling DML
shaders.
`__has_builtin` is not available on all compilers. Make sure it works
when not defined.
Also fix some formatting issues:
- Use brace initialization where possible
- Fix wrong capitalization of variables.
- Add `std::` for `unit64_t` and `int64_t` as it is mostly done in this
part of the codebase.
Unlike `verbose_abort`, this function merely logs the error but does not
terminate execution. It is intended to make it possible to implement the
`observe` semantic for Hardening.
In tandem with #146800, this PR fixes#145370
This PR simplifies the logic for collapsing GEP chains and replacing
GEPs to multidimensional arrays with GEPs to flattened arrays. This
implementation avoids unnecessary recursion and more robustly computes
the index to the flattened array by using the GEPOperator's
collectOffset function, which has the side effect of allowing "i8 GEPs"
and other types of GEPs to be handled naturally in the flattening /
collapsing of GEP chains.
Furthermore, a handful of LLVM DirectX CodeGen tests have been edited to
fix incorrect GEP offsets, mismatched types (e.g., loading i32s from a
an array of floats), and typos.
Refactored IR2Vec vocabulary handling to improve code organization and error handling. This would help in upcoming PRs related to the IR2Vec tool.
(Tracking issue - #141817)
This was going out of its way to explicitly mark these as
ARM_AAPCS_VFP. This has been explicitly set since 8b40366b54bd4,
where the commit message states that "sincos" (not sincos_stret)
has a special calling convention. However, that commit also sets
the calling convention for all libcalls to ARM_AAPCS_VFP, and
getEffectiveCallingConv returns the same for CCC anyway in tests
using isWatchABI triples.
The net result of this appears to be a change in behavior when
using -float-abi=soft with isWatchABI, which have no tests so
I assume this is a theoretical combination.
If I assert
```
if (getTargetMachine().getTargetTriple().isWatchABI()) {
assert(!useSoftFloat());
assert(getEffectiveCallingConv(CallingConv::C, false) == CallingConv::ARM_AAPCS_VFP);
}
```
Only 2 tests fail the second condition, which look like copy paste
accidents
using v7k triples with linux and only needed a filler triple. This is a
consequence
of strangely using the target architecture in place of the OS ABI check,
as was done in 042a6c1fe19caf48af7e287dc8f6fd5fec158093
We have some 48-bit instructions in the `Xqci` spec that currently
cannot be compressed to their 32-bit variants due to the constraint in
`CompressInstEmitter` on destination instruction operands not being
allowed to mismatch with the DAG operands.
For eg. the` QC_E_ADDI` instruction can be compressed to the `ADDI`
instruction when the immediate is signed-12 bit but this is currently
not possible since the `QC_E_ADDI` instruction has `GPRNoX0` register
operands while the `ADDI` instruction has `GPR` register operands
leading to an operand type validation error.
I think we can remove the check that only source instruction operands
can mismatch with the corresponding DAG operands and rely on the fact
that we check if the DAG register operand type is a subclass of the
instruction register operand type.
This enables producing a "variable is uninitialized" warning when a
value is passed to a pointer-to-const argument:
```
void foo(const int *);
void test() {
int *v;
foo(v);
}
```
Fixes#37460
This option is similar to -Wuninitialized-const-reference, but diagnoses
the passing of an uninitialized value via a const pointer, like in the
following code:
```
void foo(const int *);
void test() {
int v;
foo(&v);
}
```
This is an extract from #147221 as suggested in [this
comment](https://github.com/llvm/llvm-project/pull/147221#discussion_r2190998730).
Reapply attempt for : https://github.com/llvm/llvm-project/pull/148291
Fix for the build failure reported in :
https://lab.llvm.org/buildbot/#/builders/116/builds/15477
-----
This crash is caused by mismatch of distributed type returned by
`getDistributedType` and intended distributed type for forOp results.
Solution diff:
20c2cf6766
Example:
```
func.func @warp_scf_for_broadcasted_result(%arg0: index) -> vector<1xf32> {
%c128 = arith.constant 128 : index
%c1 = arith.constant 1 : index
%c0 = arith.constant 0 : index
%2 = gpu.warp_execute_on_lane_0(%arg0)[32] -> (vector<1xf32>) {
%ini = "some_def"() : () -> (vector<1xf32>)
%0 = scf.for %arg3 = %c0 to %c128 step %c1 iter_args(%arg4 = %ini) -> (vector<1xf32>) {
%1 = "some_op"(%arg4) : (vector<1xf32>) -> (vector<1xf32>)
scf.yield %1 : vector<1xf32>
}
gpu.yield %0 : vector<1xf32>
}
return %2 : vector<1xf32>
}
```
In this case the distributed type for forOp result is `vector<1xf32>`
(result is not distributed and broadcasted to all lanes instead).
However, in this case `getDistributedType` will return NULL type.
Therefore, if the distributed type can be recovered from warpOp, we
should always do that first before using `getDistributedType`
Enabled AArch64 runs for these tests where it made sense.
Also removed the "temporary" suffixes filter that was added over 10
years ago, I believe the "misched-copy.s output file" has been cleaned
from the runners by now...
Addresses
<https://github.com/llvm/llvm-project/pull/147421#discussion_r2191234968>
for x86
If more than one `catchpad ` uses the same `alloca` for their catch
objects, then we will allocate more than one object in the fixed area
resulting in wasted stack space.
As a follow up, Clang could be updated to re-use the same `alloca` for
all by-reference and by-pointer catch objects.