5 Commits

Author SHA1 Message Date
Reid Kleckner
f3efbce4a7
[llvm] Move data layout string computation to TargetParser (#157612)
Clang and other frontends generally need the LLVM data layout string in
order to generate LLVM IR modules for LLVM. MLIR clients often need it
as well, since MLIR users often lower to LLVM IR.

Before this change, the LLVM datalayout string was computed in the
LLVM${TGT}CodeGen library in the relevant TargetMachine subclass.
However, none of the logic for computing the data layout string requires
any details of code generation. Clients who want to avoid duplicating
this information were forced to link in LLVMCodeGen and all registered
targets, leading to bloated binaries. This happened in PR #145899,
which measurably increased binary size for some of our users.

By moving this information to the TargetParser library, we
can delete the duplicate datalayout strings in Clang, and retain the
ability to generate IR for unregistered targets.

This is intended to be a very mechanical LLVM-only change, but there is
an immediately obvious follow-up to clang, which will be prepared
separately.

The vast majority of data layouts are computable with two inputs: the
triple and the "ABI name". There is only one exception, NVPTX, which has
a cl::opt to enable short device pointers. I invented a "shortptr" ABI
name to pass this option through the target independent interface.
Everything else fits. Mips is a bit awkward because it uses a special
MipsABIInfo abstraction, which includes members with codegen-like
concepts like ABI physical registers that can't live in TargetParser. I
think the string logic of looking for "n32" "n64" etc is reasonable to
duplicate. We have plenty of other minor duplication to preserve
layering.

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
Co-authored-by: Sergei Barannikov <barannikov88@gmail.com>
2025-09-11 11:05:29 -07:00
Matt Arsenault
3e5d8a1439 Reapply "RuntimeLibcalls: Generate table of libcall name lengths (#153… (#153864)
This reverts commit 334e9bf2dd01fbbfe785624c0de477b725cde6f2.

Check if llvm-nm exists before building the benchmark.
2025-08-16 09:53:50 +09:00
gulfemsavrun
334e9bf2dd
Revert "RuntimeLibcalls: Generate table of libcall name lengths (#153… (#153864)
…210)"

This reverts commit 9a14b1d254a43dc0d4445c3ffa3d393bca007ba3.

Revert "RuntimeLibcalls: Return StringRef for libcall names (#153209)"

This reverts commit cb1228fbd535b8f9fe78505a15292b0ba23b17de.

Revert "TableGen: Emit statically generated hash table for runtime
libcalls (#150192)"

This reverts commit 769a9058c8d04fc920994f6a5bbb03c8a4fbcd05.

Reverted three changes because of a CMake error while building llvm-nm
as reported in the following PR:
https://github.com/llvm/llvm-project/pull/150192#issuecomment-3192223073
2025-08-15 13:32:27 -07:00
Matt Arsenault
cb1228fbd5
RuntimeLibcalls: Return StringRef for libcall names (#153209)
Does not yet fully propagate this down into the TargetLowering
uses, many of which are relying on null checks on the returned
value.
2025-08-15 09:55:39 +09:00
Matt Arsenault
769a9058c8
TableGen: Emit statically generated hash table for runtime libcalls (#150192)
a96121089b9c94e08c6632f91f2dffc73c0ffa28 reverted a change
to use a binary search on the string name table because it
was too slow. This replaces it with a static string hash
table based on the known set of libcall names. Microbenchmarking
shows this is similarly fast to using DenseMap. It's possibly
slightly slower than using StringSet, though these aren't an
exact comparison. This also saves on the one time use construction
of the map, so it could be better in practice.

This search isn't simple set check, since it does find the
range of possible matches with the same name. There's also
an additional check for whether the current target supports
the name. The runtime constructed set doesn't require this,
since it only adds the symbols live for the target.

Followed algorithm from this post
http://0x80.pl/notesen/2023-04-30-lookup-in-strings.html

I'm also thinking the 2 special case global symbols should
just be added to RuntimeLibcalls. There are also other global
references emitted in the backend that aren't tracked; we probably
should just use this as a centralized database for all compiler
selected symbols.
2025-08-15 09:02:56 +09:00