This is a follow up change to eliminating uselists for ConstantData.
In the previous revision, ConstantData had a replacement reference count
instead of a uselist. This reference count was misleading, and not useful
in the same way as it would be for another value. The references may not
have even been in the current module, since these are shared throughout
the LLVMContext.
This doesn't space leak any more than we previously did; nothing was
attempting to garbage collect unused constants.
Previously the use_empty, and hasNUses type of APIs were supported through
the reference count. These now behave as if the uses are always empty.
Ideally it would be illegal to inspect these, but this forces API complexity
into quite a few places. It may be doable to make it illegal to check these
counts, but I would like there to be a targeted fuzzing effort to make sure
every transform properly deals with a constant in every operand position.
All tests pass if I turn the hasNUses* and getNumUses queries into assertions,
only hasOneUse in particular appears to hit in some set of contexts. I've
added unit tests to ensure logical consistency between these cases
This is a resurrected version of the patch attached to this RFC:
https://discourse.llvm.org/t/rfc-constantdata-should-not-have-use-lists/42606
In this adaptation, there are a few differences. In the original patch, the Use's
use list was replaced with an unsigned* to the reference count in the value. This
version leaves them as null and leaves the ref counting only in Value.
Remove use-lists from instances of ConstantData (which are shared
across modules and have no operands).
To continue supporting most of the use-list API, store a ref-count in
place of the use-list; this is for API like Value::use_empty and
Value::hasNUses. Operations that actually need the use-list -- like
Value::use_begin -- will assert.
This change has three benefits:
1. The compiler output cannot in any way depend on the use-list order
of instances of ConstantData.
2. There's no use-list traffic when adding and removing simple
constants from operand lists (although there is ref-count traffic;
YMMV).
3. It's cheaper to serialize use-lists (since we're no longer
serializing the use-list order of things like i32 0).
The downside is that you can't look at all the users of ConstantData,
but traversals of users of i32 0 are already ill-advised.
Possible follow-ups:
- Track if an instance of a ConstantVector/ConstantArray/etc. is known
to have all ConstantData arguments, and drop the use-lists to
ref-counts in those cases. Callers need to check Value::hasUseList
before iterating through the use-list.
- Remove even the ref-counts. I'm not sure they have any benefit
besides minimizing the scope of this commit, and maintaining the
counts is not free.
Fixes#58629
Co-authored-by: Duncan P. N. Exon Smith <dexonsmith@apple.com>
Something to do with control code handling in Windows terminals breaks
the statusline in various ways. It makes LLDB unusable and even if you
set the setting to disable statusline, it's too late, and the terminal
session is now in a weird state.
See https://github.com/llvm/llvm-project/issues/134846 for more details.
Until we figure this out, don't allow it to be used on Windows.
This patch adds support for LLVM IR atomicrmw `fmaximum` and `fminimum`
instructions.
These mirror the `llvm.maximum.*` and `llvm.minimum.*` instructions, but
are atomic and use IEEE754 2019 handling for NaNs, which is different to
`fmax` and `fmin`. See:
https://llvm.org/docs/LangRef.html#llvm-minimum-intrinsic
for more details.
Future changes will allow this LLVM IR to be lowered to specialised
assembler instructions on suitable targets, such as AArch64.
We currently accept label arguments to inline asm calls. This support
predates both blockaddresses and callbr and is only covered by one X86
test. Remove it in favor of callbr (or at least blockaddress, though
that cannot guarantee correct codegen, just like using block labels
directly can't).
I didn't bother implementing bitcode upgrade support for this, but I can
add it if desired.
This patch adds support for LLVM IR atomicrmw `fmaximum` and `fminimum`
instructions.
These mirror the `llvm.maximum.*` and `llvm.minimum.*` instructions, but
are atomic and use IEEE754 2019 handling for NaNs, which is different to
`fmax` and `fmin`. See:
https://llvm.org/docs/LangRef.html#llvm-minimum-intrinsic
for more details.
Future changes will allow this LLVM IR to be lowered to specialised
assembler instructions on suitable targets, such as AArch64.
Since `LLVMDIBuilderCreateEnumerator` only supports up to 64 bits, this
PR adds a new `LLVMDIBuilderCreateEnumeratorOfArbitraryPrecision`
function that takes an arbitrary number of words, based on
`LLVMConstIntOfArbitraryPrecision`. This allows even larger enumeration
types to represent their values exactly. (It seems LLVM should already
support i128 enums since 13.0.0.)
This Change adds support for two SiFive vendor attributes in clang:
- "SiFive-CLIC-preemptible"
- "SiFive-CLIC-stack-swap"
These can be given together, and can be combined with "machine", but
cannot be combined with any other interrupt attribute values.
These are handled primarily in RISCVFrameLowering:
- "SiFive-CLIC-stack-swap" entails swapping `sp` with `sf.mscratchcsw`
at function entry and exit, which holds the trap stack pointer.
- "SiFive-CLIC-preemptible" entails saving `mcause` and `mepc` before
re-enabling interrupts using `mstatus`. To save these, `s0` and `s1`
are first spilled to the stack, and then the values are read into
these registers. If these registers are used in the function, their
values will be spilled a second time onto the stack with the generic
callee-saved-register handling. At the end of the function interrupts
are disabled again before `mepc` and `mcause` are restored.
This Change also adds support for the following two experimental
extensions, which only contain CSRs:
- XSfsclic - for SiFive's CLIC Supervisor-Mode CSRs
- XSfmclic - for SiFive's CLIC Machine-Mode CSRs
The latter is needed for interrupt support.
The CFI information for this implementation is not correct, but I'd
prefer to correct this in a follow-up. While it's unlikely anyone wants
to unwind through a handler, the CFI information is also used by
debuggers so it would be good to get it right.
Co-authored-by: Ana Pazos <apazos@quicinc.com>
Resolves#129439.
The addition to `echo.ll` is for testing `ConstantArray`, because every
other array in that file is in fact a `ConstantDataArray` and now takes
the new code path in `echo.cpp`.
This introduces the options "-F/--forward" and "-R/--reverse" to
`process continue`.
These only work if you're running with a gdbserver backend that supports
reverse execution, such as rr. For testing we rely on the fake
reverse-execution functionality in `lldbreverse.py`.
Static analysis detected that Demangler::demangleStringLiteral had a
potential overflow if not checking StringByteSize properly.
Added check to ensure that for wide string it is always even and that
there were the byte count did not mismatch the actual size of the
literal.
Fixes: https://github.com/llvm/llvm-project/issues/129970
Since lldb 20, these have had no effect:
https://releases.llvm.org/20.1.0/docs/ReleaseNotes.html#changes-to-lldb
> lldb-server now listens to a single port for gdbserver connections and
> provides that port to the connection handler processes. This means
that
> only 2 ports need to be opened in the firewall (one for the
lldb-server
> platform, one for gdbserver connections). In addition, due to this
work,
lldb-server now works on Windows in the server mode.
Remove them.
This extension adds two external input output instructions for
non-memory-mapped device.
The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/tag/Xqci-0.7.0
This patch adds assembler only support.
Co-authored-by: Sudharsan Veeravalli <quic_svs@quicinc.com>
This implements [the `.option exact` and `.option noexact`
proposal](https://github.com/riscv-non-isa/riscv-asm-manual/pull/122)
for RISC-V.
`.option exact` turns off:
- Compression
- Branch Relaxation
- Linker Relaxation
`.option noexact` turns these back on, and is also the default, matching
the current behaviour.
This is required to support `li`, but the code is also shared with
CodeGen so the compiler will now emit instructions from Xqcili when that
extension is enabled during compilation.
Also implemented some missed verifiers in
`RISCVInstrInfo::verifyInstruction`, some of which are required for this
change.
With a minor fix for the build failures.
Original message:
This extension adds nine instructions, eight for non-memory-mapped devices synchronization and delay instruction.
The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/tag/Xqci-0.7.0
This patch adds assembler only support.
Co-authored-by: Sudharsan Veeravalli quic_svs@quicinc.com
This extension adds nine instructions, eight for non-memory-mapped
devices synchronization and delay instruction.
The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/tag/Xqci-0.7.0
This patch adds assembler only support.
Co-authored-by: Sudharsan Veeravalli <quic_svs@quicinc.com>
Set the default processor version to v68 when the user does not specify
one in the command line. This includes changes in the LLVM backed and
linker (lld). Since lld normally sets the version based on inputs, this
change will only affect cases when there are no inputs.
Fixes#127558
The i6400 and i6500 are high performance multi-core microprocessors from
MIPS that provide best in class power efficiency for use in
system-on-chip (SoC) applications. i6400 and i6500 implements Release 6
of the MIPS64 Instruction Set Architecture with full hardware
multithreading and hardware virtualization support.
This extension adds 10 instructions that provide hints to the interface
simulation environment.
The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/
This patch adds assembler only support.
This extension adds twelve conditional branch instructions that use an
immediate operand for the source.
The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/tag/Xqci-0.7.0
This patch adds assembler only support.
Co-authored-by: Sudharsan Veeravalli <quic_svs@quicinc.com>
From GFX10 onwards it is possible to employ benevolent scheduling of
waves. This patch unconditionally enables, for the `amdhsa` OS, the bit
which controls that capability, as it is beneficial for algorithms that
rely on more complex concurrent coordination and it is generally
performance neutral otherwise.
…elative
The semantics of `llvm.type.checked.load.relative` seem to be a little
different from that of `llvm.load.relative`. It looks like the semantics
for `llvm.type.checked.load.relative` is `ptr + offset + *(ptr +
offset)` whereas the semantics for `llvm.load.relative` is `ptr + *(ptr
+ offset)`. That is, the offset for the former is added to the offset
address whereas the later has the offset added to the original pointer.
It really feels like the checked intrinsic was meant to match the
semantics of the non-checked intrinsic, but I think for all cases the
checked intrinsic is used (swift being the only use I know of), the
calculation just happens to be the same because swift always uses an
offset of zero. Likewise, all llvm tests for this intrinsic happen to
use an offset of zero.
Relative vtables in clang happens to be the first time where we're using
this intrinsic and using it with non-zero values. This updates the
semantics of the checked intrinsic to match the non-checked one.
Effectively this shouldn't change any codegen by any users of this since
all current users seem to use a zero offset.
This PR also updates some tests with non-zero offsets.
The Xqcili extension includes a two instructions that load large
immediates than is available with the base RISC-V ISA.
The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/tag/Xqci-0.7.0
This patch adds assembler only support.
This change means that llvm-strip no longer exits immediately upon
encountering an error when modifying a file and will instead continue
modifying the other inputs. Fixes#129412
LLVM currently expects `__float128` to be both passed and returned in
xmm registers on Windows. However, this disagrees with the Windows
x86-64 calling convention [1], which indicates values larger than 64
bits should be passed indirectly.
Update LLVM's default Windows calling convention to pass `fp128`
directly. Returning in xmm0 is unchanged since this seems like a
reasonable extrapolation of the ABI. With this patch, the calling
convention for `i128` and `f128` is the same.
GCC passes `__float128` indirectly, which this also matches. However, it
also returns indirectly, which is not done here. I intend to attempt a
GCC change to also return in `xmm0` rather than making that change here,
given the consistency with `i128`.
This corresponds to the frontend change in [2], see more details there.
[1]:
https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-170
[2]: https://github.com/llvm/llvm-project/pull/115052
This extension adds thirty eight bit manipulation instructions.
The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/tag/Xqci-0.6
This patch adds assembler only support.
Co-authored-by: Sudharsan Veeravalli <quic_svs@quicinc.com>
This adds support for Xqccmp to the following passes:
- Prolog Epilog Insertion - reusing much of the existing push/pop logic,
but extending it to cope with frame pointers and reorder the CFI
information correctly.
- Move Merger - extending it to support the `qc.` variants of the
double-move instructions.
- Push/Pop Optimizer - extending it to support the `qc.` variants of the
pop instructions.
The testing is based on existing Zcmp tests, but I have put them in
separate files as some of the Zcmp tests were getting quite long.
Xqccmp is a new spec by Qualcomm that makes a vendor-specific effort to
solve the push/pop + frame pointers issue. Broadly, it takes the Zcmp
instructions and reverse the order they push/pop registers in, which
ends up matching the frame pointer convention.
This extension adds a new instruction not present in Zcmp,
`qc.cm.pushfp`, which will set `fp` to the incoming `sp` value after it
has pushed the registers.
This change duplicates the Zcmp implementation, with minor changes to
mnemonics (for the `qc.` prefix), predicates, and the addition of
`qc.cm.pushfp`. There is also new logic to prevent combining Xqccmp and
Zcmp. Xqccmp is kept separate to Xqci for decoding/encoding etc, as the
specs are separate today.
Specification:
https://github.com/quic/riscv-unified-db/releases/tag/Xqccmp_extension-0.1.0