114 Commits

Author SHA1 Message Date
Kai Nacke
c9f573463e
[Analysis] Move TargetLibraryInfo data to TableGen (#165009)
The collection of library function names in TargetLibraryInfo faces
similar challenges as RuntimeLibCalls in the IR component. The number of
function names is large, there are numerous customizations based on the
triple (including alternate names), and there is a lot of replicated
data in the signature table.

The ultimate goal would be to capture all library function related
information in a .td file. This PR brings the current .def file to
TableGen, almost as a 1:1 replacement. However, there are some
improvements which are not possible in the current implementation:

- the function names are now stored as a long string together with an
offset table.
- the table of signatures is now deduplicated, using an offset table for
access.
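
A minimal sketch of the two layouts described in the list above, assuming a tiny hypothetical data set. The identifiers `NameData`, `NameOffsets`, `SignatureData`, `SignatureOffsets`, `getName`, and `getSignature` are illustrative and are not taken from the generated code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical miniature tables; names and layout are illustrative only.
// All function names live in one string, each terminated by '\0'.
inline constexpr char NameData[] = "cos\0sin\0memcpy\0";

// NameOffsets[I] is where name I starts; the last entry marks the end.
inline constexpr std::array<std::uint16_t, 4> NameOffsets = {0, 4, 8, 15};

inline std::string_view getName(std::size_t Index) {
  assert(Index + 1 < NameOffsets.size() && "name index out of range");
  std::size_t Begin = NameOffsets[Index];
  std::size_t Length = NameOffsets[Index + 1] - Begin - 1; // skip the '\0'
  return std::string_view(NameData + Begin, Length);
}

enum ArgKind : std::uint8_t { Dbl, Ptr, SizeT, EndOfSig };

// Each distinct signature is stored once.
inline constexpr ArgKind SignatureData[] = {
    Dbl, Dbl, EndOfSig,            // offset 0: double(double)
    Ptr, Ptr, Ptr, SizeT, EndOfSig // offset 3: void *(void *, const void *, size_t)
};

// cos and sin point at the same signature entry; memcpy has its own.
inline constexpr std::array<std::uint8_t, 3> SignatureOffsets = {0, 0, 3};

inline const ArgKind *getSignature(std::size_t Index) {
  assert(Index < SignatureOffsets.size() && "signature index out of range");
  return SignatureData + SignatureOffsets[Index];
}
```

With this layout the per-function cost is one offset into each table, and identical signatures cost nothing extra.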

The size of the object file decreases about 34kB with these changes. The
hash table of all function names is still constructed dynamically. A
static table like for RuntimeLibCalls is the next logical step.

The main motivation for this change is that I have to add a large number
of custom names for z/OS (like in RuntimeLibCalls.td), and the current
infrastructure does not support this very well.
2025-11-19 16:05:00 -05:00
Rahul Joshi
8f67759585
[NFC][TableGen] Remove close member from various CodeGenHelpers (#167904)
Always rely on local scopes to enforce the lifetime of these helper
objects and by extension where the "closing" of various C++ code
constructs happens.
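
The scope-based approach can be sketched as follows. `NamespaceScope` and `emitExample` are hypothetical names for illustration, not the actual CodeGenHelpers classes; the point is only that the destructor emits the closing text, so no explicit close call is needed.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper, not LLVM's actual class: opens a namespace on
// construction and closes it when the object goes out of scope.
class NamespaceScope {
  std::ostream &OS;
  std::string Name;

public:
  NamespaceScope(std::ostream &Out, std::string NS)
      : OS(Out), Name(std::move(NS)) {
    assert(!Name.empty() && "namespace needs a name");
    OS << "namespace " << Name << " {\n";
  }
  ~NamespaceScope() { OS << "} // namespace " << Name << "\n"; }

  NamespaceScope(const NamespaceScope &) = delete;
  NamespaceScope &operator=(const NamespaceScope &) = delete;
};

inline std::string emitExample() {
  std::ostringstream OS;
  {
    NamespaceScope NS(OS, "llvm");
    OS << "int X;\n";
  } // The closing brace is emitted here, when NS is destroyed.
  return OS.str();
}
```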
2025-11-18 11:11:19 -08:00
Ivan Kosarev
3c87119a91
[TableGen][NFCI] Change TableGenMain() to take function_ref. (#167888)
It was switched from a function pointer to std::function in

TableGen: Make 2nd arg MainFn of TableGenMain(argv0, MainFn) optional.
f675ec6165ab6add5e57cd43a2e9fa1a9bc21d81

but there's no mention of any particular reason for that.
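
For readers unfamiliar with the distinction, here is a minimal sketch of a non-owning callable reference. `FunctionRef` and `runTool` are hypothetical stand-ins written for this example (supporting lambdas and function objects only); they are not the LLVM declarations. Unlike `std::function`, the wrapper never copies or heap-allocates the callable, so it is only valid while the callable it refers to is alive.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

template <typename Fn> class FunctionRef;

// Non-owning reference to a callable: one function pointer plus one
// type-erased object pointer.
template <typename Ret, typename... Params>
class FunctionRef<Ret(Params...)> {
  Ret (*Callback)(void *, Params...);
  void *Callable;

  template <typename C> static Ret invoke(void *Obj, Params... Ps) {
    return (*static_cast<C *>(Obj))(std::forward<Params>(Ps)...);
  }

public:
  // Excluded for FunctionRef itself so that copies stay plain copies.
  template <typename C,
            typename = std::enable_if_t<!std::is_same_v<
                std::remove_cv_t<std::remove_reference_t<C>>, FunctionRef>>>
  FunctionRef(C &&Fn)
      : Callback(&invoke<std::remove_reference_t<C>>),
        Callable(const_cast<void *>(static_cast<const void *>(&Fn))) {}

  Ret operator()(Params... Ps) const {
    return Callback(Callable, std::forward<Params>(Ps)...);
  }
};

// Hypothetical entry point taking a callback by FunctionRef.
inline int runTool(int Argc, FunctionRef<bool(int)> MainFn) {
  assert(Argc >= 1 && "argv[0] is always present");
  return MainFn(Argc) ? 1 : 0;
}
```

A caller can pass a lambda directly, for example `runTool(1, [](int Argc) { return Argc > 1; })`, because the temporary lambda outlives the call.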
2025-11-18 12:43:10 +00:00
CarolineConcatto
200793ac21
Extend MemoryEffects to Support Target-Specific Memory Locations (#148650)
This patch introduces preliminary support for two additional memory
locations, target_mem0 and target_mem1, which model memory locations
that cannot be represented with the existing ones.

It was a solution suggested in:
https://discourse.llvm.org/t/rfc-improving-fpmr-handling-for-fp8-intrinsics-in-llvm/86868/6

Currently, these locations are not yet target-specific. The goal is to
enable the compiler to express read/write effects on these resources.
2025-11-18 11:10:58 +00:00
Dharuni R Acharya
39e7712ac5
[LLVM-Tablegen] Pretty Printing Arguments in LLVM Intrinsics (#162629)
This patch adds LLVM infrastructure to support pretty printing of
intrinsic arguments. The motivation is to improve the readability of
LLVM intrinsics and to facilitate easy modification and debugging of
LLVM IR.

This feature adds a property `ArgInfo<ArgIndex, [ArgName<"argName">,
ImmArgPrinter<"functionName">]>`
to the intrinsic arguments to print self-explanatory inline comments for
the arguments.

Pretty-print support is a simple, low-overhead feature that enhances
the usability of LLVM intrinsics without disrupting existing workflows.

Link to the RFC, where this feature was discussed:

https://discourse.llvm.org/t/rfc-pretty-printing-immediate-arguments-in-llvm-intrinsics/88536

---------

Signed-off-by: Dharuni R Acharya <dharunira@nvidia.com>
Co-authored-by: Rahul Joshi <rjoshi@nvidia.com>
2025-11-17 09:00:40 -08:00
Sergei Barannikov
e413343ca7
[SelectionDAG] Verify SDTCisVT and SDTCVecEltisVT constraints (#150125)
Teach `SDNodeInfoEmitter` TableGen backend to process `SDTypeConstraint`
records and emit tables for them. The tables are used by
`SDNodeInfo::verifyNode()` to validate a node being created.

This PR only adds validation code for `SDTCisVT` and `SDTCVecEltisVT`
constraints to keep it smaller.

Pull Request: https://github.com/llvm/llvm-project/pull/150125
2025-11-16 18:26:03 +03:00
Ivan Kosarev
71eaf14094
[TableGen] Split *GenRegisterInfo.inc. (#167700)
Reduces memory usage compiling backend sources, most notably for
AMDGPU by ~98 MB per source on average.

AMDGPUGenRegisterInfo.inc is tens of megabytes in size now, and
is even larger downstream. At the same time, it is included in
nearly all backend sources, typically just for a small portion of
its content, resulting in compilation being unnecessarily
memory-hungry, which in turn stresses buildbots and wastes their
resources.

Splitting .inc files also helps avoid extra ccache misses
where changes in .td files don't cause changes in all parts of
what previously was a single .inc file.

It is thought that rather than building on top of the current
single-output-file design of TableGen, e.g., using `split-file`,
it would be preferable to recognise the need for multi-file
outputs and give them proper first-class support directly in
TableGen.
2025-11-14 16:30:51 +00:00
Kazu Hirata
02976f5ffa
[TableGen] Use "using" instead of "typedef" (NFC) (#167168)
Identified with modernize-use-using.
2025-11-08 13:09:03 -08:00
Matt Arsenault
0469ff0a21
TableGen: Split RuntimeLibcallsEmitter into separate utility header (#166583)
This information will be needed in more emitters, so start factoring
it to be more reusable.
2025-11-05 11:55:59 -08:00
Matt Arsenault
056d2c12f7
RuntimeLibcalls: Split lowering decisions into LibcallLoweringInfo (#164987)
Introduce a new class for the TargetLowering usage. This tracks the
subtarget specific lowering decisions for which libcall to use.
RuntimeLibcallsInfo is a module level property, which may have multiple
implementations of a particular libcall available. This attempts to be
a minimum boilerplate patch to introduce the new concept.

In the future we should have a tablegen way of selecting which
implementations should be used for a subtarget. Currently we
do have some conflicting implementations added; it just happens
to work out that the default case to prefer is the alphabetically
first one (plus some of these are still using manual overrides
in TargetLowering constructors).
2025-11-05 17:10:36 +00:00
Rahul Joshi
d568601d5a
[NFC][TableGen] Adopt NamespaceEmitter in DirectiveEmitter (#165600) 2025-11-05 07:46:12 -08:00
Rahul Joshi
a2495ff991
[NFC][TableGen] Emit empty lines after/before namespace scope (#166217)
Emit an empty line after a namespace scope is opened and before it is
closed. Adjust empty-line emission in the DirectiveEmitter code in
response, to avoid a lot of unit test changes.
2025-11-04 07:11:26 -08:00
Jay Foad
f037f41350
[IR] Add new function attribute nocreateundeforpoison (#164809)
Also add a corresponding intrinsic property that can be used to mark
intrinsics that do not introduce poison, for example simple arithmetic
intrinsics that propagate poison just like a simple arithmetic
instruction.

As a smoke test this patch adds the new property to
llvm.amdgcn.fmul.legacy.
2025-11-04 12:00:44 +00:00
Kazu Hirata
fe01594a65
[TableGen] Use std::move properly (NFC) (#166104)
This patch removes const to allow std::move a couple of lines below to
perform a move operation as intended.

Identified with performance-move-const-arg.
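
The effect of the stray const can be reproduced with a small sketch. `Counted`, `transferFromConst`, `transferFromNonConst`, and `demo` are hypothetical names for this example; the type simply counts which constructor runs.

```cpp
#include <cassert>
#include <utility>

// Counts copy and move constructions.
struct Counted {
  static inline int Copies = 0;
  static inline int Moves = 0;
  Counted() = default;
  Counted(const Counted &) { ++Copies; }
  Counted(Counted &&) noexcept { ++Moves; }
};

inline void transferFromConst() {
  const Counted Src{};
  // std::move yields const Counted &&, which cannot bind to Counted &&,
  // so overload resolution falls back to the copy constructor.
  Counted Dst(std::move(Src));
  (void)Dst;
}

inline void transferFromNonConst() {
  Counted Src;
  Counted Dst(std::move(Src)); // Binds to Counted &&: a real move.
  (void)Dst;
}

inline void demo() {
  Counted::Copies = Counted::Moves = 0;
  transferFromConst();
  assert(Counted::Copies == 1 && Counted::Moves == 0);
  transferFromNonConst();
  assert(Counted::Copies == 1 && Counted::Moves == 1);
}
```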
2025-11-02 22:42:32 -08:00
Kazu Hirata
83195d9541
[TableGen] Use "= default" (NFC) (#165968)
Identified with modernize-use-equals-default.
2025-11-01 09:25:04 -07:00
Rahul Joshi
2bb4226c7c
[LLVM][Intrinsics] Print note if manual name matches default name (#164716)
Print a note when the manually specified name in an intrinsic matches
the default name it would have been assigned based on the record name,
in which case the manual specification is redundant and can be
eliminated.

Also remove existing redundant manual names.
2025-10-23 12:00:03 -07:00
Matt Arsenault
b74801ad87
TableGen: Use IfDefEmitter (#164209) 2025-10-21 10:09:05 +09:00
Craig Topper
422c0f37af
[TableGen][RISCV] Don't use auto where the type isn't obvious. NFC (#163611) 2025-10-15 19:17:24 +00:00
Rahul Joshi
43f9017745
[NFC][TableGen] Add IncludeGuardEmitter to emit header include guards (#163283)
Add a RAII class `IncludeGuardEmitter` which is similar to
`IfDefEmitter` but emits header include guards and adopt it in
DirectiveEmitter.
2025-10-14 12:51:38 -07:00
Matt Arsenault
aaf5493fd3
TableGen: Account for Unsupported LibcallImpl in bitset size (#163083)
The Unsupported case is special and doesn't have an entry in the
vector, and is directly emitted as the 0 case. This should be
harmless as it is, but could break if the right number of new
libcalls is added.
2025-10-13 11:32:25 +09:00
Matt Arsenault
aa406aaa67
RuntimeLibcalls: Add bitset for available libcalls (#150869)
This is a step towards separating the set of available libcalls
from the lowering decision of which call to use. Libcall recognition
now directly checks availability instead of indirectly checking through
the lowering table.
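
A minimal sketch of an availability bitset, assuming a hypothetical three-entry enum in which value 0 is reserved for the unsupported case. `AvailableSet`, `NumRealImpls`, and the enumerators are illustrative names; the only point is that availability is a direct bit test and that the bitset needs one slot more than the number of real implementations.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical implementation IDs; 0 is reserved for "Unsupported".
enum LibcallImpl : unsigned { Unsupported = 0, Impl_memcpy, Impl_sinf, Impl_cosf };

constexpr std::size_t NumRealImpls = 3;
// One extra bit so that the highest ID is still a valid index.
constexpr std::size_t NumLibcallImpls = NumRealImpls + 1;

class AvailableSet {
  std::bitset<NumLibcallImpls> Bits;

public:
  void setAvailable(LibcallImpl Impl) {
    assert(Impl != Unsupported && "Unsupported is never available");
    Bits.set(Impl);
  }
  // Availability is checked directly, without consulting a lowering table.
  bool isAvailable(LibcallImpl Impl) const { return Bits.test(Impl); }
};
```

Sizing the bitset with `NumRealImpls` alone would make `Bits.test(Impl_cosf)` an out-of-range access in this sketch.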
2025-10-10 08:27:30 +00:00
Matt Arsenault
ccae485f2b
TableGen: Go back to using range loop over runtime libcall sets (#162221)
This reverts 9c361cc and replaces f490dbdc. Instead of using the lambda
to try to avoid naming the variables, just disambiguate the different
AlwaysAvailable sets with the calling convention name.
2025-10-07 20:46:01 +09:00
Nadharm
f490dbdc32
[TableGen] Reduce stack usage of setTargetRuntimeLibcallSets (#162194)
This change resolves a stack usage issue seen in the TableGen'd function
`setTargetRuntimeLibcallSets` when compiled with MSVC. This change
reduces the frame size from 47720 bytes to 48 bytes.
2025-10-07 09:12:29 +09:00
Craig Topper
c830c843ab
[RISCV][TableGen] Correct vTtoGetLlvmTyString for RISC-V tuples. (#162152)
RISC-V tuples use "NF" not "nElem" to store the number of fields in the
segment.

This fixes a crash when lowering a function with tuple return.
getReturnInfo in CallLowering.cpp does Type*->EVT->Type* and we were
incorrectly converting EVT to Type*.
2025-10-06 21:09:14 +00:00
Rahul Joshi
e9330fd244
[NFC][TableGen] Migrate IfDef/Namespace emitter from MLIR to LLVM (#161744) 2025-10-04 05:28:56 -07:00
Rahul Joshi
0d2b404a35
[LLVM] Fix a bug in Intrinsic::getFnAttributes (#161248) 2025-09-30 05:29:34 -07:00
Michael Liao
9d7628de87
[Intrinsic] Unify IIT_STRUCT{2-9} into IIT_STRUCT to support up to 257 return values
- Currently, Intrinsic can only have up to 9 return values. In case new
  intrinsics require more than 9 return values, additional IIT_STRUCTxxx
  values need to be added to support > 9 return values.  Instead, this
  patch unifies them into a single IIT_STRUCT followed by a BYTE
  specifying the minimal 2 (encoded as 0) and maximal 257 (encoded as
  255) return values.
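
The count byte described above amounts to storing the number of return values minus two. A small sketch, with hypothetical helper names `encodeStructCount` and `decodeStructCount`:

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned MinStructElts = 2;   // encoded as 0
constexpr unsigned MaxStructElts = 257; // encoded as 255

// Hypothetical helpers: map a return-value count to the byte and back.
inline std::uint8_t encodeStructCount(unsigned NumElts) {
  assert(NumElts >= MinStructElts && NumElts <= MaxStructElts &&
         "count does not fit in one byte");
  return static_cast<std::uint8_t>(NumElts - MinStructElts);
}

constexpr unsigned decodeStructCount(std::uint8_t Byte) {
  return static_cast<unsigned>(Byte) + MinStructElts;
}
```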
2025-09-26 13:35:44 -04:00
Elvin Wang
d41bc6834b
[IntrinsicEmitter] Make AttributesMap PackedID type-adaptive (#158383) 2025-09-18 18:33:42 -07:00
Elvin Wang
9b681ea50d
[IntrinsicEmitter] Make AttributesMap bound inclusive (#158714)
This is a minor fix from comment
https://github.com/llvm/llvm-project/pull/157965/files#r2347317186
introduced in #157965.
2025-09-18 08:10:08 -07:00
Elvin Wang
6af94c566e
[IntrinsicEmitter] Make AttributesMap bits adaptive (#157965)
Make IntrinsicsToAttributesMap's func. and arg. fields have adaptive
sizes based on the input rather than the hardcoded 8 bits/8 bits.
This will ease the pressure for adding new intrinsics in private
downstreams.

func. attr bitsize will become 7(127/128) vs 8(255/256)
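
One way to picture adaptive field widths is the sketch below. It assumes a 16-bit packed value and derives each field width from the largest ID it must hold; `bitsNeeded`, `PackedLayout`, `pack`, `unpackFn`, and `unpackArg` are hypothetical names, and the emitter's real layout may differ.

```cpp
#include <cassert>
#include <cstdint>

// Number of bits required to represent every value in [0, MaxID].
constexpr unsigned bitsNeeded(unsigned MaxID) {
  unsigned Bits = 1;
  while ((MaxID >>= 1) != 0)
    ++Bits;
  return Bits;
}

// Assumed layout: function-attribute ID in the low bits, argument-attribute
// ID in the bits above it, both inside a 16-bit value.
struct PackedLayout {
  unsigned FnBits;
  unsigned ArgBits;
};

inline std::uint16_t pack(PackedLayout L, unsigned FnID, unsigned ArgID) {
  assert(L.FnBits + L.ArgBits <= 16 && "fields do not fit in 16 bits");
  assert(FnID < (1u << L.FnBits) && ArgID < (1u << L.ArgBits));
  return static_cast<std::uint16_t>((ArgID << L.FnBits) | FnID);
}

inline unsigned unpackFn(PackedLayout L, std::uint16_t V) {
  return V & ((1u << L.FnBits) - 1);
}

inline unsigned unpackArg(PackedLayout L, std::uint16_t V) {
  return V >> L.FnBits;
}
```

With a largest function ID of 127 the function field takes 7 bits, which leaves 9 bits for argument IDs instead of 8.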
2025-09-12 20:42:08 +02:00
Owen Anderson
0f13cae7ff
[CodeGen, CHERI] Add capability types to MVT. (#156616)
This adds value types for representing capability types, enabling their use in instruction selection and other parts of the backend.

These types are distinguished from each other only by size. This is sufficient, at least today, because no existing CHERI configuration supports multiple capability sizes simultaneously. Hybrid configurations supporting intermixed integral pointers and capabilities do exist, and are one of the reasons why these value types are needed beyond existing integral types.

Co-authored-by: David Chisnall <theraven@theravensnest.org>
Co-authored-by: Jessica Clarke <jrtc27@jrtc27.com>
2025-09-11 17:44:30 +08:00
Rahul Joshi
bcb1a896d8
[NFC][IntrinsicEmitter] Include source location with enum definition (#156800) 2025-09-04 05:57:37 -07:00
Daniel Paoliello
f99b0f3de4
[NFC] RuntimeLibcalls: Prefix the impls with 'Impl_' (#153850)
As noted in #153256, TableGen is generating reserved names for
RuntimeLibcalls, which resulted in a build failure for Arm64EC since
`vcruntime.h` defines `__security_check_cookie` as a macro.

To avoid using reserved names, all impl names will now be prefixed with
`Impl_`.

`NumLibcallImpls` was lifted out as a `constexpr size_t` instead of
being an enum field.

While I was churning the dependent code, I also removed the TODO to move
the impl enum into its own namespace and use an `enum class`: I
experimented with using an `enum class` and adding a namespace, but we
decided it was too verbose so it was dropped.
2025-09-02 09:57:33 -07:00
Matt Arsenault
41aba9ef3b
RuntimeLibcalls: Fix building hash table with duplicate entries (#153801)
We were sizing the table appropriately for the number of LibcallImpls,
but many of those have identical names which were pushing up the
collision count unnecessarily. This ends up decreasing the table size
slightly, and makes it a bit faster.

BM_LookupRuntimeLibcallByNameRandomCalls improves by ~25% and
BM_LookupRuntimeLibcallByNameSampleData by ~5%.

As a secondary change, align the table size up to the next
power of 2. This makes the table larger than before, but improves
the sample data benchmark by an additional 5%.
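
The sizing change can be sketched roughly as follows, under the assumption that the table has one slot per unique name before rounding (the real emitter may apply a different load factor). `tableSizeFor`, `nextPowerOf2`, and `bucketFor` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Smallest power of two that is >= N (an exact power is kept as is).
inline std::size_t nextPowerOf2(std::size_t N) {
  std::size_t P = 1;
  while (P < N)
    P <<= 1;
  return P;
}

// Size the table from the number of distinct names, not the number of
// implementations, since several implementations can share one name.
inline std::size_t tableSizeFor(const std::vector<std::string> &ImplNames) {
  std::set<std::string> Unique(ImplNames.begin(), ImplNames.end());
  return nextPowerOf2(Unique.size());
}

// With a power-of-two size the bucket index is a mask, not a modulo.
inline std::size_t bucketFor(std::size_t Hash, std::size_t TableSize) {
  assert(TableSize != 0 && (TableSize & (TableSize - 1)) == 0 &&
         "table size must be a power of two");
  return Hash & (TableSize - 1);
}
```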
2025-08-25 20:56:43 +09:00
Matt Arsenault
19ebfa6d0b
RuntimeLibcalls: Move exception call config to tablegen (#151948)
Also starts pruning out these calls if the exception model is
forced to none.

I worked backwards from the logic in addPassesToHandleExceptions
and the pass content. There appears to be some tolerance
for mixing and matching exception modes inside of a single module.
As far as I can tell _Unwind_CallPersonality is only relevant for
wasm, so just add it there.

As usual, the arm64ec case makes things difficult and is
missing test coverage. The set of calls in list form is necessary
to use foreach for the duplication, but in every other context a
dag is more convenient. You cannot use foreach over a dag, and I
haven't found a way to flatten a dag into a list.

This removes the last manual setLibcallImpl call in generic code.
2025-08-19 10:35:59 +09:00
Matt Arsenault
3e5d8a1439
Reapply "RuntimeLibcalls: Generate table of libcall name lengths (#153… (#153864)
This reverts commit 334e9bf2dd01fbbfe785624c0de477b725cde6f2.

Check if llvm-nm exists before building the benchmark.
2025-08-16 09:53:50 +09:00
gulfemsavrun
334e9bf2dd
Revert "RuntimeLibcalls: Generate table of libcall name lengths (#153… (#153864)
…210)"

This reverts commit 9a14b1d254a43dc0d4445c3ffa3d393bca007ba3.

Revert "RuntimeLibcalls: Return StringRef for libcall names (#153209)"

This reverts commit cb1228fbd535b8f9fe78505a15292b0ba23b17de.

Revert "TableGen: Emit statically generated hash table for runtime
libcalls (#150192)"

This reverts commit 769a9058c8d04fc920994f6a5bbb03c8a4fbcd05.

Reverted three changes because of a CMake error while building llvm-nm
as reported in the following PR:
https://github.com/llvm/llvm-project/pull/150192#issuecomment-3192223073
2025-08-15 13:32:27 -07:00
Matt Arsenault
9a14b1d254
RuntimeLibcalls: Generate table of libcall name lengths (#153210)
Avoids strlen when constructing the returned StringRef. We were emitting
these in the libcall name lookup anyway, so split out the offsets for
general use.

Currently emitted as a separate table, not sure if it would be better
to change the string offset table to store pairs of offset and width
instead.
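
A minimal sketch of the separate-table variant, using a hypothetical three-name table. `LibcallNameTable`, `LibcallNameOffsets`, `LibcallNameLengths`, and `getLibcallName` are illustrative names; the lookup builds the string view from a stored length instead of scanning for the terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <string_view>

// Hypothetical generated data: names stored back to back, '\0' terminated.
inline constexpr char LibcallNameTable[] =
    "memcpy\0sinf\0__cxa_end_cleanup\0";
inline constexpr std::uint16_t LibcallNameOffsets[] = {0, 7, 12};
inline constexpr std::uint8_t LibcallNameLengths[] = {6, 4, 17};

inline std::string_view getLibcallName(std::size_t Index) {
  assert(Index < std::size(LibcallNameOffsets) && "index out of range");
  // The length comes from the table, so no strlen is needed.
  return std::string_view(LibcallNameTable + LibcallNameOffsets[Index],
                          LibcallNameLengths[Index]);
}
```

The alternative mentioned above would merge the two arrays into one array of offset and length pairs, trading a second table for a wider entry.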
2025-08-15 23:29:10 +09:00
Matt Arsenault
769a9058c8
TableGen: Emit statically generated hash table for runtime libcalls (#150192)
a96121089b9c94e08c6632f91f2dffc73c0ffa28 reverted a change
to use a binary search on the string name table because it
was too slow. This replaces it with a static string hash
table based on the known set of libcall names. Microbenchmarking
shows this is similarly fast to using DenseMap. It's possibly
slightly slower than using StringSet, though these aren't an
exact comparison. This also saves on the one time use construction
of the map, so it could be better in practice.

This search isn't a simple set check, since it does find the
range of possible matches with the same name. There's also
an additional check for whether the current target supports
the name. The runtime constructed set doesn't require this,
since it only adds the symbols live for the target.

Followed algorithm from this post
http://0x80.pl/notesen/2023-04-30-lookup-in-strings.html

I'm also thinking the 2 special case global symbols should
just be added to RuntimeLibcalls. There are also other global
references emitted in the backend that aren't tracked; we probably
should just use this as a centralized database for all compiler
selected symbols.
2025-08-15 09:02:56 +09:00
Matt Arsenault
bbcac029db
ARM: Move more aeabi libcall config into tablegen (#152109) 2025-08-14 15:43:15 +09:00
Matt Arsenault
0a9ca5d7da
TableGen: Add failing function to libcall emitter error message (#153390) 2025-08-13 23:15:47 +09:00
Benjamin Maxwell
9c361cc068
[TableGen] Avoid duplicate variable names in RuntimeLibcallsEmitter (partial reland of #152505) (#153398) 2025-08-13 12:43:21 +00:00
Nikita Popov
48beed5b71
Revert "[AArch64][SME] Port all SME routines to RuntimeLibcalls" (#153392)
This introduced a 5% compile-time regression on AArch64, see
https://llvm-compile-time-tracker.com/compare.php?from=b9138bde3562de5c28a239dbd303caf2406678c6&to=271688b87abe7cf45aceaff8266270a25eb7b436&stat=instructions:u.

Reverts llvm/llvm-project#152505.
2025-08-13 11:54:39 +00:00
Benjamin Maxwell
271688b87a
[AArch64][SME] Port all SME routines to RuntimeLibcalls (#152505)
This updates everywhere we emit/check an SME routine to use
RuntimeLibcalls to get the function name and calling convention.

Note: RuntimeLibcallEmitter had some issues with emitting non-unique
variable names for sets of libcalls, so I tweaked the output to avoid
the need for variables.
2025-08-13 08:48:59 +01:00
Rahul Joshi
89ea9df6a2
[NFCI][TableGen] Minor improvements to Intrinsic::getAttributes (#152761)
This change implements several small improvements to
`Intrinsic::getAttributes`:

1. Use `SequenceToOffsetTable` to emit `ArgAttrIdTable`. This enables
reuse of entries when they share a common prefix. This reduces the size
of this table from 546 to 484 entries, which is 248 bytes.
2. Fix `AttributeComparator` to purely compare argument attributes and
not look at function attributes. This avoids unnecessary duplicates in
the uniqueing process and eliminates 2 entries from
`ArgAttributesInfoTable`, saving 8 bytes.
3. Improve `Intrinsic::getAttributes` code to not initialize all entries
of `AS` always. Currently, we initialize all entries of the array `AS`
even if we may not use all of them. In addition to the runtime cost, for
Clang release builds, since the initialization loop is unrolled, it
consumes ~330 bytes of code to initialize the `AS` array. Address this
by declaring the storage for AS using just a char array with appropriate
`alignas` (similar to how `SmallVectorStorage` defines its inline
elements).
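
The third item can be illustrated with a sketch of raw aligned storage in which only the used entries are constructed. `Entry` and `sumAttrIDs` are hypothetical, and the element type is assumed to be trivially destructible so that no destructor calls are required.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>

struct Entry {
  unsigned ArgNo;
  unsigned AttrID;
};
static_assert(std::is_trivially_destructible_v<Entry>,
              "no destructor calls are needed for the used entries");

inline unsigned sumAttrIDs(unsigned Used) {
  constexpr std::size_t MaxEntries = 8;
  assert(Used <= MaxEntries && "too many entries");

  // Raw, correctly aligned bytes: no Entry is constructed or zeroed here.
  alignas(Entry) char Storage[sizeof(Entry) * MaxEntries];

  // Construct only the entries that are actually used.
  for (unsigned I = 0; I != Used; ++I)
    new (Storage + I * sizeof(Entry)) Entry{I, I * 10};

  unsigned Sum = 0;
  for (unsigned I = 0; I != Used; ++I)
    Sum += std::launder(
               reinterpret_cast<Entry *>(Storage + I * sizeof(Entry)))
               ->AttrID;
  return Sum;
}
```

Declaring `Entry AS[MaxEntries] = {};` instead would initialize all eight entries on every call, which is the cost the change avoids.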
2025-08-12 07:15:08 -07:00
Rahul Joshi
7f0e4079c8
[NFCI][TableGen] Make Intrinsic::getAttributes table driven (#152349)
This is a follow-on to https://github.com/llvm/llvm-project/pull/152219 to
reduce both code and frame size of `Intrinsic::getAttributes` further.

Currently, this function consists of several switch cases (one per
unique argument attributes) that populates the local `AS` array with all
non-empty argument attributes for that intrinsic by calling
`getIntrinsicArgAttributeSet`. This change makes this code table driven
and implements `Intrinsic::getAttributes` without any switch cases,
which reduces the code size of this function on all platforms and in
addition reduces the frame size by a factor of 10 on Windows.

This is achieved by:
1. Emitting table `ArgAttrIdTable` containing a concatenated list of
`<ArgNo, AttrID>` entries across all unique arguments.
2. Emitting table `ArgAttributesInfoTable` (indexed by unique
arguments-ID) to store the starting index and number of non-empty arg
attributes.
3. Reserving unique function-ID 255 to indicate that the intrinsic has
no function attributes (to replace `HasFnAttr` setup in each switch
case).
4. Using a simple table lookup and for loop to build the list of
argument and function attributes for a given intrinsic.
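
The four steps above can be sketched with toy tables. The table names follow the description, but their contents, the struct layouts, and the `AttributeSummary` result type are assumptions made for this example; the real function builds attribute sets rather than returning raw pairs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

struct ArgAttr {
  std::uint8_t ArgNo;
  std::uint8_t AttrID;
};
struct ArgAttrRange {
  std::uint16_t Start; // First entry in ArgAttrIdTable.
  std::uint8_t Count;  // Number of non-empty argument attributes.
};

// Step 1: concatenated <ArgNo, AttrID> entries for all unique argument sets.
inline constexpr ArgAttr ArgAttrIdTable[] = {{0, 3}, {1, 5}, {0, 7}};
// Step 2: indexed by unique arguments-ID.
inline constexpr ArgAttrRange ArgAttributesInfoTable[] = {
    {0, 0}, {0, 2}, {2, 1}};
// Step 3: reserved function-ID meaning "no function attributes".
inline constexpr std::uint8_t NoFnAttrsID = 255;

struct AttributeSummary {
  std::vector<ArgAttr> ArgAttrs;
  bool HasFnAttrs;
};

// Step 4: a table lookup plus a loop, with no per-intrinsic switch cases.
inline AttributeSummary getAttributes(std::size_t ArgSetID,
                                      std::uint8_t FnSetID) {
  assert(ArgSetID < std::size(ArgAttributesInfoTable) && "bad arguments-ID");
  const ArgAttrRange &Range = ArgAttributesInfoTable[ArgSetID];
  AttributeSummary Result{{}, FnSetID != NoFnAttrsID};
  for (unsigned I = 0; I != Range.Count; ++I)
    Result.ArgAttrs.push_back(ArgAttrIdTable[Range.Start + I]);
  return Result;
}
```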

Experimental data shows that with release builds and assertions
disabled, this change reduces the code size for GCC and Clang builds on
Linux by ~9KB for a modest (80/152 byte) increase in frame size. For
Windows, it reduces the code size by 20KB and frame size from 4736 bytes
to 461 bytes which is a 10x reduction. Actual data is as follows:

```
 Current trunk:
  Compiler                              gcc-13.3.0      clang-18.1.3      MSVC 19.43.34810.0
  code size                             0x35a9          0x370c            0x5581
  frame size                            0x120           0x118             0x1280

 table driven Intrinsic::getAttributes:
  code size                             0xcfb           0xcd0             0x1cf
  frame size                            0x1b8           0x188             0x1A0
  Total savings (code + data)           9212 bytes      9790 bytes        20119 bytes
```

Total savings above accounts for the additional data size for the 2 new
tables, which in this experiment was: `ArgAttributesInfoTable` = 314
bytes and `ArgAttrIdTable` = 888 bytes. Coupled with the earlier
https://github.com/llvm/llvm-project/pull/152219, this achieves a 46x
reduction in frame size for this function in Windows release builds.
2025-08-08 06:02:43 -07:00
Rahul Joshi
22af0cd6f9
[LLVM][Intrinsics] Reduce stack size for Intrinsic::getAttributes (#152219)
This change fixes a stack size regression that got introduced in
0de0354aa8.
That change did 2 independent things:

1. Uniquify argument and function attributes separately so that we
generate a smaller number of unique sets as opposed to uniquifying them
together. This is beneficial for code size.
2. Eliminate the fixed size array `AS` and `NumAttrs` variable and
instead build the returned AttributeList in each case using an
initializer list.

The second part seems to have caused a regression in the stack size
usage of this function for Windows. This change essentially undoes part
2 and reinstates the use of the fixed size array `AS` which fixes this
stack size regression. The actual measured stack frame size for this
function before/after this change is as follows:

```
  Current trunk data for release build (x86_64 builds for Linux, x86 build for Windows):
  Compiler                              gcc-13.3.0      clang-18.1.3      MSVC 19.43.34810.0
  DLLVM_ENABLE_ASSERTIONS=OFF           0x120           0x110             0x54B0
  DLLVM_ENABLE_ASSERTIONS=ON            0x2880          0x110             0x5250

  After applying the fix:
  Compiler                              gcc-13.3.0      clang-18.1.3      MSVC 19.43.34810.0
  DLLVM_ENABLE_ASSERTIONS=OFF           0x120           0x118             0x1240
  DLLVM_ENABLE_ASSERTIONS=ON            0x120           0x118             0x1240
```
Note that for Windows builds with assertions disabled, the stack frame
size for this function reduces from 21680 to 4672 which is a 4.6x
reduction. Stack frame size for GCC build with assertions also improved
and clang builds are unimpacted. The speculation is that clang and gcc
are able to reuse the stack space across these switch cases better with
existing code, but MSVC is not, and re-introducing the `AS` variable
forces all cases to use the same local variable, addressing the stack
space regression.
2025-08-06 07:09:52 -07:00
Matt Arsenault
b2f0ffd659
RuntimeLibcalls: Really move default libcall handling to tablegen (#148780)
Hack in the default setting so it's consistently generated like
the other cases. Maintain a list of targets where this applies.
The alternative would require new infrastructure to sort the system
library initialization in some way.

I wanted the unhandled target case to be treated as a fatal
error, but it turns out there's a hack in IRSymtab using
RuntimeLibcalls, which will fail out in many tests that
do not have a triple set. Many of the failures are simply
running llvm-as with no triple, which probably should not
depend on knowing an accurate set of calls.
2025-08-04 08:32:00 +09:00
Matt Arsenault
bd7db75489
TableGen: Sort RuntimeLibcallImpls secondarily by enum names (#150728)
Extracted from #150192, this hopefully fixes occasional EXPENSIVE_CHECKS
failures.
2025-07-26 11:22:59 +09:00
Matt Arsenault
0b7a95a6fd
Partially Reapply "RuntimeLibcalls: Add methods to recognize libcall names (#149001)"
This partially reverts commit a96121089b9c94e08c6632f91f2dffc73c0ffa28.

Drop the IRSymtab changes for now
2025-07-18 18:06:26 +09:00