llvm-project

Author	SHA1	Message	Date
Nerixyz	91b90652bb	Reland "[CodeView] Generate `S_DEFRANGE_REGISTER_REL_INDIR`" (#189401 ) Initially added in #187709. It was reverted in #188833, because [llvm-clang-x86_64-sie-win](https://lab.llvm.org/buildbot/#/builders/46/builds/32873) was failing in `cross-project-tests/debuginfo-tests/dexter-tests/nrvo.cpp`. The test passed for me locally. After checking on another machine, I found that `S_DEFRANGE_REGISTER_REL_INDIR` is only supported by dbgeng/WinDbg from Windows 10.0 Build 19041 (released 2020) onwards. SDKs before this will fail to read the value. That buildbot is on Windows 10.0 Build 17763. I'm not sure if we should make the generation of that record conditional. Debuggers that can't read the record will skip it. They'll still see that there's some local variable, but won't be able to display the value. As far as I know, users of older Windows 10 builds should be able to install a newer Windows SDK and use the WinDbg from that version. But I haven't tested that.	2026-04-02 12:15:11 +02:00
Nerixyz	48e9c76d88	Revert "[CodeView] Generate `S_DEFRANGE_REGISTER_REL_INDIR` (#187709 )" (#188833 ) This reverts commit 08a4085. The change breaks `nrvo.cpp` in the debugging tests on the buildbot (https://lab.llvm.org/buildbot/#/builders/46/builds/32873) but works locally for me. It might be because the buildbot is using an older Windows SDK. In addition, it reverts parts of #188769 (using `.` over `->`).	2026-03-26 20:20:36 +00:00
Nerixyz	08a408500e	[CodeView] Generate `S_DEFRANGE_REGISTER_REL_INDIR` (#187709 ) In CodeView we had the limitation that we couldn't express locations like `DW_OP_deref, DW_OP_plus_uconst 8` (i.e. indirect loads with an offset). `S_DEFRANGE_REGISTER_REL_INDIR` allows us to represent this. It's essentially `S_DEFRANGE_REGISTER_REL` (`Register + Offset`) with an additional load afterward (`*(Register + Offset) + OffsetInUdt`). These indirect locations are used in C++ 17 structured bindings and the compiler generated C++ 20 coroutine stubs. Before, locations that would only do a dereference without an added offset afterward were represented by `S_DEFRANGE_(REGISTER\|FRAMEPOINTER)_REL` where the local had a reference type: ```cpp struct Foo { int a; int b; }; int main() { Foo f{1, 2}; auto &[a, b] = f; // │ ╰─ Not present // ╰─ S_LOCAL{ type: int&, FRAMEPOINTER_REL{ offset=0 } } return a + b; } ``` With this PR, both `a` and `b` will be present as non-reference types: ```cpp // ... int main() { Foo f{1, 2}; auto &[a, b] = f; // │ ╰─ S_LOCAL{ type: int, REGISTER_REL_INDIR{ register: RSP, offset: 0, offset in udt: 4 } } // ╰─ S_LOCAL{ type: int, REGISTER_REL_INDIR{ register: RSP, offset: 0, offset in udt: 0 } } return a + b; } ``` One downside of this is that all variables like `a` now need a larger record. If it used `FRAMEPOINTER_REL` before, it now takes 8 bytes more (there's no `FRAMEPOINTER_REL_INDIR` where we could omit the register). I removed the `UseReferenceType` workaround. If we need three dereferences, that could be added back, but I don't know any construct that uses this. Closes #34392.	2026-03-26 15:01:57 +01:00
Vladislav Dzhidzhoev	b9cecee3fb	Reland "[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)" (#165032 ) This is an attempt to merge https://reviews.llvm.org/D144006 with an LTO fix. The last merge attempt was https://github.com/llvm/llvm-project/pull/75385. The issue with it was investigated in https://github.com/llvm/llvm-project/pull/75385#issuecomment-2386684121. The problem happens when 1. Several modules are being linked. 2. There are several DISubprograms that initially belong to different modules but represent the same source code function (for example, a function included from the same source code file). 3. Some of such DISubprograms survive IR linking. It may happen if one of them is inlined somewhere or if the functions that have these DISubprograms attached have internal linkage. 4. Each of these DISubprograms has a local type that corresponds to the same source code type. These types are initially from different modules, but have the same ODR identifier. If the same (in the sense of ODR identifier/ODR uniquing rules) local type is present in two modules, and these modules are linked together, the type gets uniqued. The DIType that happens to be loaded first survives linking, and the references to other types with the same ODR identifier from the modules loaded later are replaced with references to the DIType loaded first. Since definition subprograms, in the scope of which these types are located, are not deduplicated, the linker output may contain multiple DISubprograms having the same (uniqued) type in their retainedNodes lists. Further compilation of such modules causes crashes. To tackle that, * the previous solution to handle LTO linking with local types in retainedNodes is removed (cloneLocalTypes() function), * for each loaded distinct (definition) DISubprogram, its retainedNodes list is scanned after loading, and DITypes with a scope of another subprogram are removed. 
If something from a Function corresponding to the DISubprogram references a uniqued type, we rely on cross-CU links. Additionally: * a check is added to Verifier to report local types located in the wrong retainedNodes list. Original commit message follows. --------- RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544 Similar to imported declarations, the patch tracks function-local types in DISubprogram's 'retainedNodes' field. DwarfDebug is adjusted in accordance with the aforementioned metadata change and given support for function-local types scoped within a lexical block. The patch assumes that DICompileUnit's 'enums' field no longer tracks local types, and DwarfDebug would assert if any locally-scoped types get placed there. Authored-by: Kristina Bessonova <kbessonova@accesssoftek.com> Co-authored-by: Jeremy Morse <jeremy.morse@sony.com>	2026-02-04 00:34:52 +01:00
Tom Tromey	efbbca62d1	[llvm][DebugInfo] Allow DIDerivedType as a bound in DISubrangeType (#165880 ) Consider this Ada type: ``` type Array_Type is array (Natural range <>) of Integer; type Record_Type (L1, L2 : Natural) is record I1 : Integer; A1 : Array_Type (1 .. L1); I2 : Integer; A2 : Array_Type (1 .. L2); I3 : Integer; end record; ``` Here, the array fields have lengths that depend on the discriminants of the record type. However, in this case the array lengths cannot be expressed as DWARF location expressions, with the issue being that "A2" has a non-constant offset, but an expression involving DW_OP_push_object_address will push the address of the field -- with no way to find the location of "L2". In a case like this, I believe the correct DWARF is to emit the array ranges using a direct reference to the discriminant, like: ``` <3><1156>: Abbrev Number: 1 (DW_TAG_member) <1157> DW_AT_name : l1 ... <3><1177>: Abbrev Number: 6 (DW_TAG_array_type) <1178> DW_AT_name : (indirect string, offset: 0x1a0b): vla__record_type__T4b <117c> DW_AT_type : <0x1287> <1180> DW_AT_sibling : <0x118e> <4><1184>: Abbrev Number: 7 (DW_TAG_subrange_type) <1185> DW_AT_type : <0x1280> <1189> DW_AT_upper_bound : <0x1156> ``` (FWIW this is what GCC has done for years.) This patch makes this possible in LLVM, by letting a DISubrangeType refer to a DIDerivedType. gnat-llvm can then arrange for the DIE reference to be correct by setting the array type's scope to be the record.	2025-12-04 09:38:14 +09:00
Vladislav Dzhidzhoev	e2a2c03eef	[DebugInfo] Add Verifier check for incorrectly-scoped retainedNodes (#166855 ) These checks ensure that retained nodes of a DISubprogram belong to the subprogram. Tests with incorrect IR are fixed. We should not have variables of one subprogram present in retained nodes of other subprograms. Also, interface for accessing DISubprogram's retained nodes is slightly refactored. `DISubprogram::visitRetainedNodes` and `DISubprogram::forEachRetainedNode` are added to avoid repeating checks like ``` if (const auto LV = dyn_cast<DILocalVariable>(N)) ... else if (const auto L = dyn_cast<DILabel>(N)) ... else if (const auto *IE = dyn_cast<DIImportedEntity>(N)) ... ```	2025-11-10 13:13:49 +01:00
Laxman Sole	b1bd74e1cc	[LLVM][DebugInfo] Allow ExtraData field to be a node reference (#165023 ) This change enhances debug info metadata handling to support node references in the `extraData` field for `DW_TAG_member`, `DW_TAG_variable`, and `DW_TAG_inheritance` tags. The change enables LLVM to handle both direct constant values (e.g., extraData: i8 1) and node references (e.g., extraData: !18 where !18 = !{ i8 1 }).	2025-11-06 07:38:51 -08:00
Orlando Cazalet-Hyams	aa5fe56db4	[DebugInfo] Add dataSize to DIBasicType to add DW_AT_bit_size to _BitInt types (#164372 ) DW_TAG_base_type DIEs are permitted to have both byte_size and bit_size attributes "If the value of an object of the given type does not fully occupy the storage described by a byte size attribute" * Add DataSizeInBits to DIBasicType (`DIBasicType(... dataSize: n ...)` in IR). * Change Clang to add DataSizeInBits to _BitInt type metadata. * Change LLVM to add DW_AT_bit_size to base_type DIEs that have non-zero DataSizeInBits. TODO: Do we need to emit DW_AT_data_bit_offset for big endian targets? See discussion on the PR. Fixes [#61952](https://github.com/llvm/llvm-project/issues/61952) --------- Co-authored-by: David Stenberg <david.stenberg@ericsson.com>	2025-10-29 15:23:46 +00:00
Michael Buch	6cba572d9e	[llvm][DebugInfo][NFC] Abstract DICompileUnit::SourceLanguage to allow alternate DWARF SourceLanguage encoding (#162255 ) This patch sets up `DICompileUnit` to support the DWARFv6 `DW_AT_language_name` and `DW_AT_language_version` attributes (which are set to replace `DW_AT_language`). This patch changes the `DICompileUnit::SourceLanguage` field type to a `DISourceLanguageName` that encapsulates the notion of "versioned vs. unversioned name". A "versioned" name is one that has an associated version stored separately in `DISourceLanguageName::Version`. This patch just changes all the clients of the `getSourceLanguage` API to expect a `DISourceLanguageName`. Currently they all just `assert` (via `DISourceLanguageName::getUnversionedName`) that we're dealing with "unversioned names" (i.e., the pre-DWARFv6 language codes). In follow-up patches (e.g., draft is at https://github.com/llvm/llvm-project/pull/162261), when we start emitting versioned language codes, the `getUnversionedName` calls can then be adjusted to `getName`. Implementation considerations * We could have added a new member to `DICompileUnit` alongside the existing `SourceLanguage` field. I don't think this would have made the transition any simpler (clients would still need to be aware of "versioned" vs. "unversioned" language names). I felt that encapsulating this inside a `DISourceLanguageName` was easier to reason about for maintainers. * Currently DISourceLanguageName is a `12` byte structure. We could probably pack all the info inside a `uint64_t` (16-bits for the name, 32-bits for the version, 1-bit for answering the `hasVersionedName`). Just to keep the prototype simple I used a `std::optional`. But since the guts of the structure are hidden, we can always change the layout to a more compact representation instead. How to review * The new `DISourceLanguageName` structure is defined in `DebugInfoMetadata.h`. 
All the other changes fall out from changing the `DICompileUnit::SourceLanguage` from `unsigned` to `DISourceLanguageName`.	2025-10-08 18:27:22 +01:00
Tom Tromey	296fddc89e	Allow DW_OP_rot, DW_OP_neg, and DW_OP_abs in DIExpression (#160757 ) The Ada front end can emit somewhat complicated DWARF expressions for the offset of a field. While working in this area I found that I needed DW_OP_rot (to implement a branch-free computation -- it looked more difficult to add support for branching); and DW_OP_neg and DW_OP_abs (just basic functionality).	2025-10-03 14:36:17 +01:00
Stephen Tozer	3946c5061d	Add DebugSSAUpdater class to track debug value liveness (#135349 ) This patch adds a class that uses SSA construction, with debug values as definitions, to determine whether and which debug values for a particular variable are live at each point in an IR function. This will be used by the IR reader of llvm-debuginfo-analyzer to compute variable ranges and coverage, although it may be applicable to other debug info IR analyses.	2025-09-16 11:22:02 +01:00
Orlando Cazalet-Hyams	1778669739	[KeyInstr] Remove LLVM_EXPERIMENTAL_KEY_INSTRUCTIONS CMake flag (#152735 ) The CMake flag has been on by default for a month without any issues. This makes the feature support in LLVM unconditional (but does not enable the feature by default).	2025-08-08 17:03:28 +01:00
Kazu Hirata	228e96b28a	[llvm] Use std::make_optional (NFC) (#151627 ) std::make_optional<T> is a lot like std::make_unique<T> in that it performs perfect forwarding of arguments for T's constructor. As a result, we don't have to repeat type names twice.	2025-08-01 00:24:40 -07:00
Jeremy Morse	2a1869b981	[DebugInfo] Shave even more users of DbgVariableIntrinsic from LLVM (#149136 ) At this stage I'm just opportunistically deleting any code using debug-intrinsic types, largely adjacent to calls to findDbgUsers. I'll get to deleting that in probably one or two more commits.	2025-07-18 08:25:10 +01:00
Adrian Vogelsgesang	de3c8410d8	[debuginfo][coro] Emit debug info labels for coroutine resume points (#141937 ) RFC on discourse: https://discourse.llvm.org/t/rfc-debug-info-for-coroutine-suspension-locations-take-2/86606 With this commit, we add `DILabel` debug info to the resume points of a coroutine. Those labels can be used by debugging scripts to figure out the exact line and column at which a coroutine was suspended by looking up the current `__coro_index` value inside the coroutine's frame, and then searching for the corresponding label inside the coroutine's resume function. The DWARF information generated for such a label looks like: ``` 0x00000f71: DW_TAG_label DW_AT_name ("__coro_resume_1") DW_AT_decl_file ("generator-example.cpp") DW_AT_decl_line (5) DW_AT_decl_column (3) DW_AT_artificial (true) DW_AT_LLVM_coro_suspend_idx (0x01) DW_AT_low_pc (0x00000000000019be) ``` The labels can be mapped to their corresponding `__coro_idx` values either via their naming convention `__coro_resume_<N>` or using the new `DW_AT_LLVM_coro_suspend_idx` attribute. In gdb, those line numbers can be looked up using `info line -function my_coroutine -label __coro_resume_1`. LLDB unfortunately does not understand DW_TAG_label debug information yet. Given this is an artificial compiler-generated label, I did apply the DW_AT_artificial tag to it. The DWARFv5 standard only allows that tag on type and variable definitions, but this is a natural extension and was also blessed in the RFC on discourse. Also, this commit adds `DW_AT_decl_column` to labels, not only for coroutines but also for normal C and C++ labels. While not strictly necessary, I am doing so now because it would be harder to do so later without breaking the binary LLVM-IR format. Drive-by fixes: While reading the existing test cases to understand how to write my own test case, I did a couple of small typo fixes and comment improvements.	2025-07-04 10:44:35 +02:00
Orlando Cazalet-Hyams	140e1894f2	[KeyInstr] Add DISubprogram::keyInstructions bit (#144107 ) Patch 1/4 adding bitcode support. Store whether or not a function is using Key Instructions in its DISubprogram so that we don't need to rely on the -mllvm flag -dwarf-use-key-instructions to determine whether or not to interpret Key Instructions metadata to decide is_stmt placement at DWARF emission time. This makes bitcode support simple and enables well defined mixing of non-key-instructions and key-instructions functions in an LTO context. This patch adds the bit (using DISubprogram::SubclassData1). PR 144104 and 144103 use it during DWARF emission. PR 44102 adds bitcode support. See pull request for overview of alternative attempts.	2025-06-30 08:01:55 +01:00
Tom Tromey	3b90597c2c	Non constant size and offset in DWARF (#141106 ) In Ada, a record type can have a non-constant size, and a field can appear at a non-constant bit offset in a record. To support this, this patch changes DIType to record the size and offset using metadata, rather than plain integers. In addition to a constant offset, both DIVariable and DIExpression are now supported here. One thing of note in this patch is the choice of how exactly to represent a non-constant bit offset, with the difficulty being that DWARF 5 does not support this. DWARF 3 did have a way to support a non-constant byte offset, combined with a constant bit offset within the byte, but this was deprecated in DWARF 4 and removed from DWARF 5. This patch takes a simple approach: a DWARF extension allowing the use of an expression with DW_AT_data_bit_offset. There is a corresponding DWARF issue, see https://dwarfstd.org/issues/250501.1.html. The main reason for this approach is that it keeps API simplicity: just a single value is needed, rather than having separate data describing the byte offset and the bit within the byte.	2025-06-25 11:20:35 -07:00
Andrew Rogers	7dc5dc986a	[llvm] annotate interfaces in llvm/IR for DLL export (#141650 ) ## Purpose This patch is one in a series of code-mods that annotate LLVM’s public interface for export. This patch annotates the `llvm/IR`, `llvm/IRPrinter`, and `llvm/IRReader` libraries. These annotations currently have no meaningful impact on the LLVM build; however, they are a prerequisite to support an LLVM Windows DLL (shared library) build. ## Background This effort is tracked in #109483. Additional context is provided in [this discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307), and documentation for `LLVM_ABI` and related annotations is found in the LLVM repo [here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst). The bulk of these changes were generated automatically using the [Interface Definition Scanner (IDS)](https://github.com/compnerd/ids) tool, followed by formatting with `git clang-format`. The following manual adjustments were also applied after running IDS on Linux: - Add `#include "llvm/Support/Compiler.h"` to files where it was not auto-added by IDS due to no pre-existing block of include statements. - Add `LLVM_ABI_FRIEND` to friend member functions declared with `LLVM_ABI` - Add `LLVM_TEMPLATE_ABI` and `LLVM_EXPORT_TEMPLATE` to exported instantiated templates - Add `LLVM_ABI` to a subset of private class methods and fields that require export - Add `LLVM_ABI` to a small number of symbols that require export but are not declared in headers - Reorder `LLVM_ABI` with `[[deprecated]]` and `[[nodiscard]]` attributes. ## Validation Local builds and tests to validate cross-platform compatibility. This included llvm, clang, and lldb on the following configurations: - Windows with MSVC - Windows with Clang - Linux with GCC - Linux with Clang - Darwin with Clang	2025-06-02 15:58:24 -07:00
Orlando Cazalet-Hyams	ee7f6a5c6f	[KeyInstr] Merge atoms in DILocation::getMergedLocation (#133480 ) NFC for builds with LLVM_EXPERIMENTAL_KEY_INSTRUCTIONS=OFF (default). In an ideal world we would be able to track that the merged location is used in multiple source atoms. We can't do this though, so instead we arbitrarily but deterministically pick one. In cases where the InlinedAt field is unchanged we keep the atom with the lowest non-zero rank (highest precedence). If the ranks are equal we choose the smaller non-zero group number (arbitrary choice). In cases where the InlinedAt field is adjusted we generate a new atom group. Keeping the group wouldn't make sense (a source atom is identified by the group number and InlinedAt pair) but discarding the atom info could result in missed is_stmts. Add unittest in MetadataTest.cpp. RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668	2025-05-06 13:48:14 +01:00
Orlando Cazalet-Hyams	43a9d5dfd5	[KeyInstr] Add Atom Group waterline to LLVMContext (#133478 ) Source location atoms are identified by a function-local number and the DILocation's InlinedAt field. The front end is responsible for assigning source atom numbers, but certain optimisations need to assign new atom numbers to some instructions. Most often code duplication optimisations like loop unroll. Tracking a global maximum value (waterline) means we can easily (cheaply) get new numbers that don't clash in any function. The waterline is managed through DILocation creation, LLVMContext::incNextAtomGroup, and LLVMContext::updateAtomGroupWaterline. Add unittest. RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668	2025-05-06 11:42:50 +01:00
Snehasish Kumar	5d16a18c4b	[Metadata] Return the valid DebugLoc if one of them is null with -pick-merged-source-locations. (#138148 ) Previously when getMergedLocation was passed nullptr as one of the parameters we returned nullptr. Change that behaviour to instead return a valid DebugLoc. This is beneficial for binaries from which SamplePGO profiles are obtained.	2025-05-02 09:36:52 -07:00
Orlando Cazalet-Hyams	0c7c82af23	[KeyInstr] Add fields to DILocation behind compile time flag (#133477 ) Add AtomGroup and AtomRank to DILocation behind compile time flag EXPERIMENTAL_KEY_INSTRUCTIONS which is controlled by cmake flag LLVM_EXPERIMENTAL_KEY_INSTRUCTIONS. Add IR read-write roundtrip test in a directory that is unsupported unless the CMake flag is enabled. RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668	2025-05-01 15:40:39 +01:00
Vladislav Dzhidzhoev	6462fad3d0	[DebugInfo] getMergedLocation: match scopes based on their location (#132286 ) getMergedLocation uses a common parent scope of the two input locations for an output location. It doesn't consider the case when the common parent scope is from a file other than L1's and L2's files. In that case, it produces a merged location with an erroneous scope (https://github.com/llvm/llvm-project/issues/122846). In some cases, such as https://github.com/llvm/llvm-project/pull/125780#issuecomment-2651657856, L1, L2 having a common parent scope from another file indicate that the code at L1 and L2 is included from the same source location. With this commit, getMergedLocation detects that L1, L2, or their common parent scope files are different. If so, it assumes that L1 and L2 were included from some source location, and tries to attach the output location to a scope with the nearest common source location with regard to L1 and L2. If the nearest common location is also from another file, getMergedLocation returns it as a merged location, assuming that L1 and L2 belong to files that were both included in the nearest common location. Fixes https://github.com/llvm/llvm-project/issues/122846.	2025-04-18 13:57:28 +02:00
Kazu Hirata	c4e9901b5b	[llvm] Use llvm::append_range (NFC) (#135931 )	2025-04-16 12:28:47 -07:00
Snehasish Kumar	f9193f3b18	[DebugInfo] Preserve line and column number when merging debug info. (#129960 ) This patch introduces a new option `-preserve-merged-debug-info` to preserve an arbitrary but deterministic version of debug information when DILocations are merged. This is intended to be used in production environments from which sample-based profiles are derived, such as AutoFDO and MemProf. With this patch we have seen a 0.2% improvement on an internal workload at Google when generating AutoFDO profiles. It also significantly helps MemProf by preserving debug info for merged call instructions used in the contextual profile. --------- Co-authored-by: Krzysztof Pszeniczny <kpszeniczny@google.com>	2025-04-04 09:37:25 -07:00
Tom Tromey	68947342b7	Add support for fixed-point types (#129596 ) This adds DWARF generation for fixed-point types. This feature is needed by Ada. Note that a pre-existing GNU extension is used in one case. This has been emitted by GCC for years, and is needed because standard DWARF is otherwise incapable of representing these types.	2025-03-31 07:42:21 -07:00
Tom Tromey	f89129af8a	Add bit stride to DICompositeType (#131680 ) In Ada, an array can be packed and the elements can take less space than their natural object size. For example, for this type: type Packed_Array is array (4 .. 8) of Boolean; pragma pack (Packed_Array); ... each element of the array occupies a single bit, even though the "natural" size for a Boolean in memory is a byte. In DWARF, this is represented by putting a DW_AT_bit_stride onto the array type itself. This patch adds a bit stride to DICompositeType so that gnat-llvm can emit DWARF for these sorts of arrays.	2025-03-25 17:14:07 -07:00
Tom Tromey	e298fc2da9	Add DISubrangeType (#126772 ) An Ada program can have types that are subranges of other types. This patch adds a new DIType node, DISubrangeType, to represent this concept. I considered extending the existing DISubrange to do this, but as DISubrange does not derive from DIType, that approach seemed more disruptive. A DISubrangeType can be used both as an ordinary type, but also as the type of an array index. This is also important for Ada. Ada subrange types can also be stored using a bias. Representing this in the DWARF required the use of an extension. GCC has been emitting this extension for years, so I've reused it here.	2025-02-24 10:11:53 -08:00
Michael Buch	eb8901bda1	[llvm][DebugInfo] Add new DW_AT_APPLE_enum_kind to encode enum_extensibility (#124752 ) When creating `EnumDecl`s from DWARF for Objective-C `NS_ENUM`s, the Swift compiler tries to figure out if it should perform "swiftification" of that enum (which involves renaming the enumerator cases, etc.). The heuristic by which it determines whether to swiftify an enum is to check the `enum_extensibility` attribute (because that's pretty much what `NS_ENUM`s are). Currently LLDB fails to attach the `EnumExtensibilityAttr` to `EnumDecl`s it creates (because there's not enough info in DWARF to derive it), which means we have to fall back to re-building Swift modules on-the-fly, slowing down expression evaluation substantially. This happens around `4b3931c8ce/lib/ClangImporter/ImportEnumInfo.cpp (L37-L59)` To speed up Swift expression evaluation, this patch proposes encoding the C/C++/Objective-C `enum_extensibility` attribute in DWARF via a new `DW_AT_APPLE_enum_kind`. This would currently only be used by the LLDB Swift plugin, but may be of interest to other language plugins as well (though I haven't come up with a concrete use-case for it outside of Swift). I'm open to naming suggestions of the various new attributes/attribute constants proposed here. I tried to be as generic as possible if we wanted to extend it to other kinds of enum properties (e.g., flag enums). 
The new attribute would look as follows: ``` DW_TAG_enumeration_type DW_AT_type (0x0000003a "unsigned int") DW_AT_APPLE_enum_kind (DW_APPLE_ENUM_KIND_Closed) DW_AT_name ("ClosedEnum") DW_AT_byte_size (0x04) DW_AT_decl_file ("enum.c") DW_AT_decl_line (23) DW_TAG_enumeration_type DW_AT_type (0x0000003a "unsigned int") DW_AT_APPLE_enum_kind (DW_APPLE_ENUM_KIND_Open) DW_AT_name ("OpenEnum") DW_AT_byte_size (0x04) DW_AT_decl_file ("enum.c") DW_AT_decl_line (27) ``` Absence of the attribute means the extensibility of the enum is unknown and abides by whatever the language rules of that CU dictate. This does feel like a big hammer for quite a specific use-case, so I'm happy to discuss alternatives. Alternatives considered: * Re-using an existing DWARF attribute to express extensibility. E.g., a `DW_TAG_enumeration_type` could have a `DW_AT_count` or `DW_AT_upper_bound` indicating the number of enumerators, which could imply closed-ness. I felt like a dedicated attribute (which could be generalized further) seemed more applicable. But I'm open to re-using existing attributes. * Encoding the entire attribute string (i.e., `DW_TAG_LLVM_annotation ("enum_extensibility((open))")`) on the `DW_TAG_enumeration_type`. Then in LLDB somehow parse that out into a `EnumExtensibilityAttr`. I haven't found a great API in Clang to parse arbitrary strings into AST nodes (the ones I've found required fully formed C++ constructs). Though if someone knows of a good way to do this, happy to consider that too.	2025-02-06 08:58:35 +00:00
Augusto Noronha	67fb2686fb	[DebugInfo] Add a specification attribute to LLVM DebugInfo (#115362 ) Add a specification attribute to LLVM DebugInfo, which is analogous to DWARF's DW_AT_specification. According to the DWARF spec: "A debugging information entry that represents a declaration that completes another (earlier) non-defining declaration may have a DW_AT_specification attribute whose value is a reference to the debugging information entry representing the non-defining declaration." This patch allows types to be specifications of other types. This is used by Swift to represent generic types. For example, given this Swift program: ``` struct MyStruct<T> { let t: T } let variable = MyStruct<Int>(t: 43) ``` The Swift compiler emits (roughly) an unsubstituted type for MyStruct<T>: ``` DW_TAG_structure_type DW_AT_name ("MyStruct") // "$s1w8MyStructVyxGD" is a Swift mangled name roughly equivalent to // MyStruct<T> DW_AT_linkage_name ("$s1w8MyStructVyxGD") // other attributes here ``` And a specification for MyStruct<Int>: ``` DW_TAG_structure_type DW_AT_specification (<link to "MyStruct">) // "$s1w8MyStructVySiGD" is a Swift mangled name equivalent to // MyStruct<Int> DW_AT_linkage_name ("$s1w8MyStructVySiGD") DW_AT_byte_size (0x08) // other attributes here ```	2024-11-13 09:55:37 -08:00
Augusto Noronha	f6617d65e4	[DebugInfo] Add num_extra_inhabitants to debug info (#112590 ) An extra inhabitant is a bit pattern that does not represent a valid value for instances of a given type. The number of extra inhabitants is the number of those bit configurations. This is used by Swift to save space when composing types. For example, because Bool only needs 2 bit patterns to represent all of its values (true and false), an Optional<Bool> only occupies 1 byte in memory by using a bit configuration that is unused by Bool. Which bit patterns are unused are part of the ABI of the language. Since Swift generics are not monomorphized, by using dynamic libraries you can have generic types whose size, alignment, etc, are known only at runtime (which is why this feature is needed). This patch adds num_extra_inhabitants to LLVM-IR debug info and in DWARF as an Apple extension.	2024-11-06 15:48:04 -08:00
Jay Foad	e03f427196	[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133 ) It is almost always simpler to use {} instead of std::nullopt to initialize an empty ArrayRef. This patch changes all occurrences I could find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor could be deprecated or removed.	2024-09-19 16:16:38 +01:00
Kazu Hirata	7df9da7d78	[llvm] Construct SmallVector with ArrayRef (NFC) (#101872 )	2024-08-04 08:54:23 -07:00
Orlando Cazalet-Hyams	7a7d370742	[NFC] Add DIExpression::calculateFragmentIntersect (#97738 ) Patch [3/x] to fix structured bindings debug info in SROA. This function computes a fragment, bit-extract operation if needed, and new constant offset to describe a part of a variable covered by some memory. This generalises, simplifies, and replaces at::calculateFragmentIntersect. That version is still used as a wrapper for now though to keep this change NFC. The new version doesn't take a DbgRecord parameter, instead using an explicit address and address offset. The old version only operates on dbg_assigns and this change means it can also operate on dbg_declare records easily, which it will do in a subsequent patch. The new version has a new out-param OffsetFromLocationInBits which is set to the difference between the first bit of the variable location and the first bit of the memory slice. This will be used in a subsequent patch in SROA to determine the new offset to use in the address expression after splitting an alloca.	2024-07-12 08:28:36 +01:00
Orlando Cazalet-Hyams	f50f7a7aa0	[NFC] Add DIExpression::extractLeadingOffset (#97719 ) Patch [2/x] to fix structured bindings debug info in SROA. It extracts a constant offset from the DIExpression if there is one and fills RemainingOps with the ops that come after it. This function will be used in a subsequent patch.	2024-07-08 10:14:59 +01:00
eddyz87	01ce74fe14	Revert "[DebugInfo][BPF] Add 'annotations' field for DIBasicType & DISubroutineType (#91422)" (#96172 ) This reverts commit 3ca17443ef4af21bdb1f3b4fbcfff672cbc6176c. As reported in [1,2] the commit above causes CI failure for powerpc-aix target. There is also a performance regression reported in [3]. Reverting to comply with the developer policy. [1] https://github.com/llvm/llvm-project/pull/91422#issuecomment-2179425473 [2] https://lab.llvm.org/buildbot/#/builders/64/builds/62 [3] https://github.com/llvm/llvm-project/pull/91422#issuecomment-2175631443	2024-06-20 21:28:02 +03:00
eddyz87	3ca17443ef	[DebugInfo][BPF] Add 'annotations' field for DIBasicType & DISubroutineType (#91422 ) Extend `DIBasicType` and `DISubroutineType` with additional field `annotations`, e.g. as below: ``` !5 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed, annotations: !6) !6 = !{!7} !7 = !{!"btf:type_tag", !"tag1"} ``` The field would be used by BPF backend to generate DWARF attributes corresponding to `btf_type_tag` type attributes, e.g.: ``` 0x00000029: DW_TAG_base_type DW_AT_name ("int") DW_AT_encoding (DW_ATE_signed) DW_AT_byte_size (0x04) 0x0000002d: DW_TAG_LLVM_annotation DW_AT_name ("btf:type_tag") DW_AT_const_value ("tag1") ``` Such DWARF entries would be used to generate BTF definitions by tools like [pahole](https://github.com/acmel/dwarves). Note: similar fields with similar purposes are already present in DIDerivedType and DICompositeType. Currently "btf_type_tag" attributes are represented in debug information as 'annotations' fields in DIDerivedType with DW_TAG_pointer_type tag. The annotation on a pointer corresponds to pointee having the attributes in the final BTF. The discussion in [thread](https://lore.kernel.org/bpf/87r0w9jjoq.fsf@oracle.com/) came to conclusion, that such annotations should apply to the annotated type itself. Hence the necessity to extend `DIBasicType` & `DISubroutineType` types with 'annotations' field to represent cases like below: ``` int __attribute__((btf_type_tag("foo"))) bar; ``` This was previously tracked as differential revision: https://reviews.llvm.org/D143966	2024-06-18 10:23:25 +03:00
John Brawn	f84056c38f	[DebugInfo] Handle DW_OP_LLVM_extract_bits in SROA (#94638 ) This doesn't need any work to be done in SROA itself, but rather in functions that it uses. Specifically: * DIExpression::createFragmentExpression is made to understand DW_OP_LLVM_extract_bits * valueCoversEntireFragment is made to check the active bits instead of the fragment size, so that it handles extract_bits correctly	2024-06-17 12:01:08 +01:00
John Brawn	1721c14e8e	[DebugInfo] Add DW_OP_LLVM_extract_bits (#93990 ) This operation extracts a number of bits at a given offset and sign or zero extends them, which is done by emitting it as a left shift followed by a right shift. This is being added for use in clang for C++ structured bindings of bitfields that have offset or size that aren't a byte multiple. A new operation is being added, instead of shifts being used directly, as it makes correctly handling it in optimisations (which will be done in a later patch) much easier.	2024-06-07 10:38:23 +01:00
Shubham Sandeep Rastogi	69969c725b	Use DIExpression::foldConstantMath() at the result of an append() (#71719 ) This patch uses `DIExpression::foldConstantMath()` at the end of a `DIExpression::append()`, which should help reduce the size of DIExpressions that grow because of salvaging debug info. This is part of a stack of patches and comes after: https://github.com/llvm/llvm-project/pull/69768 https://github.com/llvm/llvm-project/pull/71717 https://github.com/llvm/llvm-project/pull/71718	2024-05-29 16:19:53 -07:00
Shubham Sandeep Rastogi	b12f81b53a	Introduce DIExpression::foldConstantMath() (#71718 ) DIExpressions can get very long and have a lot of redundant operations. This function uses simple pattern matching to fold constant math that can be evaluated at compile time. The hope is that other people can contribute other patterns as well. I also couldn't see a good way of combining this with `DIExpression::constantFold` so it stands alone. This is part of a stack of patches and comes after https://github.com/llvm/llvm-project/pull/69768 https://github.com/llvm/llvm-project/pull/71717	2024-05-29 16:09:59 -07:00
Stephen Tozer	ffd08c7759	[RemoveDIs][NFC] Rename DPValue -> DbgVariableRecord (#85216 ) This is the major rename patch that prior patches have built towards. The DPValue class is being renamed to DbgVariableRecord, which reflects the updated terminology for the "final" implementation of the RemoveDI feature. This is a pure string substitution + clang-format patch. The only manual component of this patch was determining where to perform these string substitutions: `DPValue` and `DPV` are almost exclusively used for DbgRecords, except for: - llvm/lib/target, where 'DP' is used to mean double-precision, and so appears as part of .td files and in variable names. NB: There is a single existing use of `DPValue` here that refers to debug info, which I've manually updated. - llvm/tools/gold, where 'LDPV' is used as a prefix for symbol visibility enums. Outside of these places, I've applied several basic string substitutions, with the intent that they only affect DbgRecord-related identifiers; I've checked them as I went through to verify this, with reasonable confidence that there are no unintended changes that slipped through the cracks. The substitutions applied are all case-sensitive, and are applied in the order shown: ``` DPValue -> DbgVariableRecord DPVal -> DbgVarRec DPV -> DVR ``` Following the previous rename patches, it should be the case that there are no instances of any of these strings that are meant to refer to the general case of DbgRecords, or anything other than the DPValue class. The idea behind this patch is therefore that pure string substitution is correct in all cases as long as these assumptions hold.	2024-03-19 20:07:07 +00:00
Daniil Kovalev	924a1dceb5	[Dwarf] Support `__ptrauth` qualifier in metadata nodes (#83862 ) Reland #82363 after fixing build failure https://lab.llvm.org/buildbot/#/builders/5/builds/41428. Memory sanitizer detects usage of `RawData` union member which is not filled directly. Instead, the code relies on filling `Data` union member, which is a struct consisting of signing schema parameters. According to https://en.cppreference.com/w/cpp/language/union, this is UB: "It is undefined behavior to read from the member of the union that wasn't most recently written". Instead of relying on compiler allowing us to do dirty things, do not use union and only store `RawData`. Particular ptrauth parameters are obtained on demand via bit operations. Original PR description below. Emit `__ptrauth`-qualified types as `DIDerivedType` metadata nodes in IR with tag `DW_TAG_LLVM_ptrauth_type`, baseType referring to the type which has the qualifier applied, and the following parameters representing the signing schema: - `ptrAuthKey` (integer) - `ptrAuthIsAddressDiscriminated` (boolean) - `ptrAuthExtraDiscriminator` (integer) - `ptrAuthIsaPointer` (boolean) - `ptrAuthAuthenticatesNullValues` (boolean) Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>	2024-03-19 09:13:17 +03:00
David Stenberg	61671e2500	[DebugInfo] Fix faulty DIExpression::appendToStack assert (#85255 ) The appendToStack() function asserts that no DW_OP_stack_value or DW_OP_LLVM_fragment operations are present in the operations to be appended. The function did that by iterating over all elements in the array rather than just the operations, leading it to falsely assert on the following input produced by getExt(), since 159 (0x9f) is the DWARF code for DW_OP_stack_value: {dwarf::DW_OP_LLVM_convert, 159, dwarf::DW_ATE_signed} Fix this by using expr_op iterators.	2024-03-15 12:51:06 +01:00
Daniil Kovalev	bf08d02868	Revert "[Dwarf] Support `__ptrauth` qualifier in metadata nodes" (#83672 ) Reverts llvm/llvm-project#82363 See a build failure related to an issue discovered by memory sanitizer (use of uninitialized value): https://lab.llvm.org/buildbot/#/builders/37/builds/31965	2024-03-02 14:48:46 +03:00
Daniil Kovalev	8f65e7b917	[Dwarf] Support `__ptrauth` qualifier in metadata nodes (#82363 ) Emit `__ptrauth`-qualified types as `DIDerivedType` metadata nodes in IR with tag `DW_TAG_LLVM_ptrauth_type`, baseType referring to the type which has the qualifier applied, and the following parameters representing the signing schema: - `ptrAuthKey` (integer) - `ptrAuthIsAddressDiscriminated` (boolean) - `ptrAuthExtraDiscriminator` (integer) - `ptrAuthIsaPointer` (boolean) - `ptrAuthAuthenticatesNullValues` (boolean) Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>	2024-03-01 19:48:08 +03:00
Orlando Cazalet-Hyams	f34418c73b	[HWASAN] Remove DW_OP_LLVM_tag_offset from DIExpression::isImplicit (#79816 ) According to its doc-comment `isImplicit` is meant to return true if the expression is an implicit location description (describes an object or part of an object which has no location by computing the value from available program state). There's a brief entry for `DW_OP_LLVM_tag_offset` in the LangRef and there's some info in the original commit fb9ce100d19be130d004d03088ccd4af295f3435. From what I can tell it doesn't look like `DW_OP_LLVM_tag_offset` affects whether or not the location is implicit; the opcode doesn't get included in the final location description but instead is added as an attribute to the variable. This was tripping an assertion in the latest application of the fix to #76545, #78606, where an expression containing a `DW_OP_LLVM_tag_offset` is split into a fragment (i.e., describe a part of the whole variable).	2024-02-01 10:29:08 +00:00
Jeremy Morse	f0b5527b79	[DebugInfo][RemoveDIs] Instrument loop-rotate for DPValues (#72997 ) Loop-rotate manually maintains dbg.value intrinsics -- it also needs to manually maintain the replacement for dbg.value intrinsics, DPValue objects. For the most part this patch adds parallel implementations using the new type. Some extra juggling is needed when loop-rotate hoists loop-invariant instructions out of the loop: the DPValues attached to such an instruction need to get rotated but not hoisted. Exercised by the new test function invariant_hoist in dbgvalue.ll. There's also a "don't insert duplicate debug intrinsics" facility in LoopRotate. The value and correctness of this isn't clear, but to continue preserving behaviour, it is now tested in the "tak_dup" function in dbgvalue.ll. Other things in this patch include a helper DebugVariable constructor for DPValues, an insertDebugValuesForPHIs handler for RemoveDIs (exercised by the new tests), and beefing up the dbg.value checking in dbgvalue.ll to ensure that each record is tested (and that there's an implicit check-not).	2023-11-26 22:57:40 +00:00
Stephen Tozer	f99a020059	Reapply "[DebugInfo] Make DIArgList inherit from Metadata and always unique" This reverts commit 0fd5dc94380d5fe666dc6c603b4bb782cef743e7. The original commit removed DIArgLists from being in an MDNode map, but did not insert a new `delete` in the LLVMContextImpl destructor. This reapply adds that call to delete, preventing a memory leak.	2023-11-17 17:55:41 +00:00
Stephen Tozer	0fd5dc9438	Revert "[DebugInfo] Make DIArgList inherit from Metadata and always unique" (#72682 ) Reverts llvm/llvm-project#72147 Reverted due to buildbot failure: https://lab.llvm.org/buildbot/#/builders/5/builds/38410	2023-11-17 17:44:19 +00:00

320 Commits
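
The `DW_OP_LLVM_extract_bits` entry above (1721c14e8e) describes the operation as a left shift followed by a right shift. A minimal sketch of that arithmetic on a 64-bit value follows. The function name `extractBits` and its parameters are hypothetical and for illustration only; this is not the LLVM implementation or any LLVM API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): extract `size` bits starting at bit
// `offset` (counted from the least significant bit) and sign- or zero-extend
// the result to 64 bits, using a left shift followed by a right shift.
std::uint64_t extractBits(std::uint64_t value, unsigned offset, unsigned size,
                          bool isSigned) {
  assert(size > 0 && offset + size <= 64 && "bit range must fit in 64 bits");
  // Move the top bit of the field up to bit 63, discarding the bits above it.
  const unsigned leftShift = 64 - offset - size;
  // Move the field back down so that its lowest bit lands at bit 0.
  const unsigned rightShift = 64 - size;
  const std::uint64_t shifted = value << leftShift;
  if (isSigned) {
    // An arithmetic right shift replicates the sign bit. This is guaranteed
    // since C++20; earlier standards leave it implementation-defined.
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(shifted) >>
                                      rightShift);
  }
  // A logical right shift fills the upper bits with zeros.
  return shifted >> rightShift;
}
```

For example, extracting the 4-bit field at offset 4 from `0xB0` yields 11 when zero-extended and -5 (after casting the result to `std::int64_t`) when sign-extended, because the field `1011` has its top bit set.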