llvm-project

Author	SHA1	Message	Date
Nikita Popov	4d37bbc429	[Bitcode] Store function type IDs rather than function types This resolves one of the type ID propagation TODOs.	2022-02-23 16:36:20 +01:00
Nikita Popov	b6eafba296	[Bitcode] Store type IDs for values This is the next step towards supporting bitcode auto upgrade with opaque pointers. The ValueList now stores the Value* together with its associated type ID, which allows inspecting the original pointer element type of arbitrary values. This is a largely mechanical change threading the type ID through various places. I've left TODOTypeID placeholders in a number of places where determining the type ID is either non-trivial or requires allocating a new type ID not present in the original bitcode. For this reason, the new type IDs are also not used for anything yet (apart from propagation). They will get used once the TODOs are resolved. Differential Revision: https://reviews.llvm.org/D119821	2022-02-22 17:27:06 +01:00
Nikita Popov	093e9489d5	[BitcodeReader] Change order of assignValue() arguments (NFC) Other methods in ValueList generally pass Idx first, and it is more convention for assignment methods to have the target on the LHS rather than RHS.	2022-02-15 10:11:01 +01:00
Nikita Popov	1c456a8220	[Bitcode] Improve support for opaque-pointer bitcode upgrade This is step two of supporting autoupgrade of old bitcode to opaque pointers. Rather than tracking the element type ID of pointers in particular, track all type IDs that a type contains. This allows us to recover the element type in more complex situations, e.g. when we need to determine the pointer element type of a vector element or function type parameter. Differential Revision: https://reviews.llvm.org/D119339	2022-02-15 09:39:48 +01:00
Fangrui Song	edf4780ad1	[BitcodeReader] Fix use-after-move	2022-02-14 10:54:37 -08:00
Momchil Velikov	6398903ac8	Extend the `uwtable` attribute with unwind table kind We have the `clang -cc1` command-line option `-funwind-tables=1\|2` and the codegen option `VALUE_CODEGENOPT(UnwindTables, 2, 0) ///< Unwind tables (1) or asynchronous unwind tables (2)`. However, this is encoded in LLVM IR by the presence or the absence of the `uwtable` attribute, i.e. we lose the information whether to generate want just some unwind tables or asynchronous unwind tables. Asynchronous unwind tables take more space in the runtime image, I'd estimate something like 80-90% more, as the difference is adding roughly the same number of CFI directives as for prologues, only a bit simpler (e.g. `.cfi_offset reg, off` vs. `.cfi_restore reg`). Or even more, if you consider tail duplication of epilogue blocks. Asynchronous unwind tables could also restrict code generation to having only a finite number of frame pointer adjustments (an example of not having a finite number of `SP` adjustments is on AArch64 when untagging the stack (MTE) in some cases the compiler can modify `SP` in a loop). Having the CFI precise up to an instruction generally also means one cannot bundle together CFI instructions once the prologue is done, they need to be interspersed with ordinary instructions, which means extra `DW_CFA_advance_loc` commands, further increasing the unwind tables size. That is to say, async unwind tables impose a non-negligible overhead, yet for the most common use cases (like C++ exceptions), they are not even needed. This patch extends the `uwtable` attribute with an optional value: - `uwtable` (default to `async`) - `uwtable(sync)`, synchronous unwind tables - `uwtable(async)`, asynchronous (instruction precise) unwind tables Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D114543	2022-02-14 14:35:02 +00:00
Nikita Popov	4d477ba50f	[BitcodeReader] Rename method for element type by ID (NFC) Make it clearer that this method is specifically for pointer element types, and not other element types. This distinction will be relevant in the future. The somewhat unusual spelling is to make sure this does not show up when grepping for getPointerElementType.	2022-02-14 11:15:43 +01:00
Nikita Popov	c28b0b9d18	[Bitcode] Add partial support for opaque pointer auto-upgrade Auto-upgrades that rely on the pointer element type do not work in opaque pointer mode. The idea behind this patch is that we can instead work with type IDs, for which we can retain the pointer element type. For typed pointer bitcode, we will have a distinct type ID for pointers with distinct element type, even if there will only be a single corresponding opaque pointer type. The disclaimer here is that this is only the first step of the change, and there are still more getPointerElementType() calls to remove. I expect that two more patches will be needed: 1. Track all "contained" type IDs, which will allow us to handle function params (which are contained in the function type) and GEPs (which may use vectors of pointers) 2. Track type IDs for values, which is e.g. necessary to handle loads. Differential Revision: https://reviews.llvm.org/D118694	2022-02-11 09:32:46 +01:00
Nikita Popov	ea93ca60ef	[Bitcode] Fix size check for DIImportedEntity record This was using && instead of \|\|.	2022-02-09 14:23:30 +01:00
Nikita Popov	72248712e5	[Bitcode] Check minimum size of constant GEP record Checking this early, because we may end up reading up to two records before the operands.	2022-02-09 14:23:30 +01:00
Nikita Popov	6d52ea885f	[Bitcode] Prevent OOB read for invalid name size	2022-02-08 09:49:39 +01:00
serge-sans-paille	1237c1496f	Cleanup LLVMBitcode headers Major user-facing changes: llvm/Bitcode/BitcodeReader.h no longer includes llvm/IR/ModuleSummaryIndex.h Some statistics: clang++ -E -Iinclude -I../llvm/include ../llvm/lib/Bitcode/Reader/*.cpp -std=c++14 -fno-rtti -fno-exceptions \| wc -l after: 493335 before: 539640 Discourse thread on the topic: https://discourse.llvm.org/t/include-what-you-use-include-cleanup/ Differential Revision: https://reviews.llvm.org/D119091	2022-02-07 21:19:22 +01:00
Nikita Popov	f4fca0fbb0	[Bitcode] Replace assertion with check	2022-02-07 12:39:35 +01:00
Nikita Popov	fdf8cb978f	[Bitcode] Handle invalid data layout gracefully	2022-02-07 12:28:15 +01:00
Nikita Popov	0c553bff8e	[Bitcode] Guard against out of bounds value reference We should make sure that the value ID is in bounds, otherwise we will assert / read out of bounds.	2022-02-07 12:16:13 +01:00
Nikita Popov	89017772d9	[Bitcode] Don't assert on invalid attribute group record Report an error instead.	2022-02-07 12:16:12 +01:00
Nikita Popov	8a71854183	[Bitcode] Handle invalid abbrev number error more gracefully Avoid report_fatal_error(), propagate the error upwards instead.	2022-02-07 10:34:34 +01:00
Nikita Popov	f392e9d264	[BitcodeReader] Resolve error handling todo If possible, forward the inner error instead of creating a new one.	2022-02-04 17:14:45 +01:00
serge-sans-paille	ffe8720aa0	Reduce dependencies on llvm/BinaryFormat/Dwarf.h This header is very large (3M Lines once expended) and was included in location where dwarf-specific information were not needed. More specifically, this commit suppresses the dependencies on llvm/BinaryFormat/Dwarf.h in two headers: llvm/IR/IRBuilder.h and llvm/IR/DebugInfoMetadata.h. As these headers (esp. the former) are widely used, this has a decent impact on number of preprocessed lines generated during compilation of LLVM, as showcased below. This is achieved by moving some definitions back to the .cpp file, no performance impact implied[0]. As a consequence of that patch, downstream user may need to manually some extra files: llvm/IR/IRBuilder.h no longer includes llvm/BinaryFormat/Dwarf.h llvm/IR/DebugInfoMetadata.h no longer includes llvm/BinaryFormat/Dwarf.h In some situations, codes maybe relying on the fact that llvm/BinaryFormat/Dwarf.h was including llvm/ADT/Triple.h, this hidden dependency now needs to be explicit. $ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/Transforms/Scalar/*.cpp -std=c++14 -fno-rtti -fno-exceptions \| wc -l after: 10978519 before: 11245451 Related Discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup [0] https://llvm-compile-time-tracker.com/compare.php?from=fa7145dfbf94cb93b1c3e610582c495cb806569b&to=995d3e326ee1d9489145e20762c65465a9caeab4&stat=instructions Differential Revision: https://reviews.llvm.org/D118781	2022-02-04 11:44:03 +01:00
Chih-Ping Chen	28bfa57a73	[DebugInfo] Add stringLocationExp field to DIStringType DIStringType is used to encode the debug info of a character object in Fortran. A Fortran deferred-length character object is typically implemented as a pair of the following two pieces of info: An address of the raw storage of the characters, and the length of the object. The stringLocationExp field contains the DIExpression to get to the raw storage. This patch also enables the emission of DW_AT_data_location attribute in a DW_TAG_string_type debug info entry based on stringLocationExp in DIStringType. A test is also added to ensure that the bitcode reader is backward compatible with the old DIStringType format. Differential Revision: https://reviews.llvm.org/D117586	2022-01-26 11:56:57 -05:00
Nikita Popov	aa97bc116d	[NFC] Remove uses of PointerType::getElementType() Instead use either Type::getPointerElementType() or Type::getNonOpaquePointerElementType(). This is part of D117885, in preparation for deprecating the API.	2022-01-25 09:44:52 +01:00
minglotus-6	e95ad93e6e	[llvm-dis] Add an option `dump-thinlto-index-only` in llvm-dis to read ThinLTO minimized code only.	2022-01-19 18:07:50 -08:00
Serge Guelton	d2cc6c2d0c	Use a sorted array instead of a map to store AttrBuilder string attributes Using and std::map<SmallString, SmallString> for target dependent attributes is inefficient: it makes its constructor slightly heavier, and involves extra allocation for each new string attribute. Storing the attribute key/value as strings implies extra allocation/copy step. Use a sorted vector instead. Given the low number of attributes generally involved, this is cheaper, as showcased by https://llvm-compile-time-tracker.com/compare.php?from=5de322295f4ade692dc4f1823ae4450ad3c48af2&to=05bc480bf641a9e3b466619af43a2d123ee3f71d&stat=instructions Differential Revision: https://reviews.llvm.org/D116599	2022-01-10 14:49:53 +01:00
Kazu Hirata	435a5a3652	[llvm] Fix bugprone argument comments (NFC) Identified with bugprone-argument-comment.	2022-01-08 11:56:38 -08:00
Nikita Popov	e4d1779990	[IR] Add ConstraintInfo::hasArg() helper (NFC) Checking whether a constraint corresponds to an argument is a recurring pattern.	2022-01-07 10:44:38 +01:00
Kazu Hirata	2aed08131d	[llvm] Use true/false instead of 1/0 (NFC) Identified with modernize-use-bool-literals.	2022-01-07 00:39:14 -08:00
Nikita Popov	eddd5be1df	[BitCode] Autoupgrade inline asm elementtype attribute This is the autoupgrade part of D116531. If old bitcode is missing the elementtype attribute for indirect inline asm constraints, automatically add it. As usual, this only works when upgrading in typed mode, we haven't figured out upgrade in opaque mode yet.	2022-01-06 15:13:01 +01:00
Roman Lebedev	62b1682570	[Opaqueptrs][IR Serialization] Improve inlineasm [de]serialization The bitcode reader expected that the pointers are typed, so that it can extract the function type for the assembly so `bitc::CST_CODE_INLINEASM` did not explicitly store said function type. I'm not really sure how the upgrade path will look for existing bitcode, but i think we can easily support opaque pointers going forward, by simply storing the function type. Reviewed By: #opaque-pointers, nikic Differential Revision: https://reviews.llvm.org/D116341	2021-12-30 13:54:37 +03:00
Roman Lebedev	a5337d6a1c	[BitcodeReader] `bitc::CST_CODE_INLINEASM`: un-hardcode offsets	2021-12-30 13:50:02 +03:00
Roman Lebedev	d5a4d6a497	[BitcodeReader] propagateAttributeTypes(): fix opaque pointer handling Can't get the pointee type of an opaque pointer, but in that case said attributes must already be typed, so just don't try to rewrite them if they already are.	2021-12-28 22:06:51 +03:00
Sami Tolvanen	5dc8aaac39	[llvm][IR] Add no_cfi constant With Control-Flow Integrity (CFI), the LowerTypeTests pass replaces function references with CFI jump table references, which is a problem for low-level code that needs the address of the actual function body. For example, in the Linux kernel, the code that sets up interrupt handlers needs to take the address of the interrupt handler function instead of the CFI jump table, as the jump table may not even be mapped into memory when an interrupt is triggered. This change adds the no_cfi constant type, which wraps function references in a value that LowerTypeTestsModule::replaceCfiUses does not replace. Link: https://github.com/ClangBuiltLinux/linux/issues/1353 Reviewed By: nickdesaulniers, pcc Differential Revision: https://reviews.llvm.org/D108478	2021-12-20 12:55:32 -08:00
Nikita Popov	18ab892ff7	[Bitcode] Avoid setting invalid comdat pointer (NFC) Instead track global objects with implicit comdat in a separate set. The current approach of temporarily assigning an invalid comdat pointer is incompatible with D115864.	2021-12-17 20:27:37 +01:00
Mingming Liu	09a704c5ef	[LTO] Ignore unreachable virtual functions in WPD in hybrid LTO. Differential Revision: https://reviews.llvm.org/D115492	2021-12-14 20:18:04 +00:00
Kazu Hirata	7787a8f1b7	[llvm] Use llvm::reverse (NFC)	2021-12-13 21:54:51 -08:00
Kazu Hirata	1457e78352	[llvm] Use range-based for loops (NFC)	2021-12-05 08:33:02 -08:00
Zarko Todorovski	95875d246a	[LLVM][NFC]Inclusive language: remove occurances of sanity check/test from llvm Part of work to use more inclusive language in clang/llvm. Rewording some comments and change function and variable names.	2021-11-24 17:29:55 -05:00
Kazu Hirata	f6bce30cf9	[llvm] Use range-based for loops (NFC)	2021-11-20 18:42:10 -08:00
Kazu Hirata	d243cbf8ea	[llvm] Use isa instead of dyn_cast (NFC)	2021-11-14 19:40:46 -08:00
Luís Ferreira	665b4138d9	[DebugInfo] run clang-format on some unformatted files This trivial patch runs clang-format on some unformatted files before doing logic changes and prevent hard to review diffs. Differential Revision: https://reviews.llvm.org/D113572	2021-11-11 18:59:41 -08:00
Arthur Eubanks	05963a3d66	Revert "[DebugInfo] Enforce implicit constraints on `distinct` MDNodes" This reverts commit ee7652569854af567ba83e5255d70e80cc8619a1. Causes crashes, see comments in D104827.	2021-11-09 14:27:55 -08:00
Scott Linder	ee76525698	[DebugInfo] Enforce implicit constraints on `distinct` MDNodes Add UNIQUED and DISTINCT properties in Metadata.def and use them to implement restrictions on the `distinct` property of MDNodes: * DIExpression can currently be parsed from IR or read from bitcode as `distinct`, but this property is silently dropped when printing to IR. This causes accepted IR to fail to round-trip. As DIExpression appears inline at each use in the canonical form of IR, it cannot actually be `distinct` anyway, as there is no syntax to describe it. * Similarly, DIArgList is conceptually always uniqued. It is currently restricted to only appearing in contexts where there is no syntax for `distinct`, but for consistency it is treated equivalently to DIExpression in this patch. * DICompileUnit is already restricted to always being `distinct`, but along with adding general support for the inverse restriction I went ahead and described this in Metadata.def and updated the parser to be general. Future nodes which have this restriction can share this support. The new UNIQUED property applies to DIExpression and DIArgList, and forbids them to be `distinct`. It also implies they are canonically printed inline at each use, rather than via MDNode ID. The new DISTINCT property applies to DICompileUnit, and requires it to be `distinct`. A potential alternative change is to forbid the non-inline syntax for DIExpression entirely, as is done with DIArgList implicitly by requiring it appear in the context of a function. For example, we would forbid: !named = !{!0} !0 = !DIExpression() Instead we would only accept the equivalent inlined version: !named = !{!DIExpression()} This essentially removes the ability to create a `distinct` DIExpression by construction, as there is no syntax for `distinct` inline. If this patch is accepted as-is, the result would be that the non-canonical version is accepted, but the following would be an error and produce a diagnostic: !named = !{!0} ; error: 'distinct' not allowed for !DIExpression() !0 = distinct !DIExpression() Also update some documentation to consistently use the inline syntax for DIExpression, and to describe the restrictions on `distinct` for nodes where applicable. Reviewed By: StephenTozer, t-tye Differential Revision: https://reviews.llvm.org/D104827	2021-11-09 18:19:11 +00:00
Kazu Hirata	3c06920cd1	[llvm] Use make_early_inc_range (NFC)	2021-11-08 09:09:39 -08:00
Itay Bookstein	848812a55e	[Verifier] Add verification logic for GlobalIFuncs Verify that the resolver exists, that it is a defined Function, and that its return type matches the ifunc's type. Add corresponding check to BitcodeReader, change clang to emit the correct type, and fix tests to comply. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D112349	2021-10-31 20:00:57 -07:00
Duncan P. N. Exon Smith	b12a864c29	Bitcode: Use Expected<T>::takeError() and moveInto() more, NFC Avoid naming some Expected<T> values in the Bitcode reader by using takeError() and moveInto() more often. This follows the smaller set of changes included in 2410fb4616b2c08bbaddd44e6c11da8285fbd1d3.	2021-10-25 16:03:40 -07:00
Duncan P. N. Exon Smith	2410fb4616	Support: Use Expected<T>::moveInto() in a few places These are some usage examples for `Expected<T>::moveInto()`. Differential Revision: https://reviews.llvm.org/D112280	2021-10-22 12:40:10 -07:00
Itay Bookstein	08ed216000	[IR] Refactor GlobalIFunc to inherit from GlobalObject, Remove GlobalIndirectSymbol As discussed in: * https://reviews.llvm.org/D94166 * https://lists.llvm.org/pipermail/llvm-dev/2020-September/145031.html The GlobalIndirectSymbol class lost most of its meaning in https://reviews.llvm.org/D109792, which disambiguated getBaseObject (now getAliaseeObject) between GlobalIFunc and everything else. In addition, as long as GlobalIFunc is not a GlobalObject and getAliaseeObject returns GlobalObjects, a GlobalAlias whose aliasee is a GlobalIFunc cannot currently be modeled properly. Creating aliases for GlobalIFuncs does happen in the wild (e.g. glibc). In addition, calling getAliaseeObject on a GlobalIFunc will currently return nullptr, which is undesirable because it should return the object itself for non-aliases. This patch refactors the GlobalIFunc class to inherit directly from GlobalObject, and removes GlobalIndirectSymbol (while inlining the relevant parts into GlobalAlias and GlobalIFunc). This allows for calling getAliaseeObject() on a GlobalIFunc to return the GlobalIFunc itself, making getAliaseeObject() more consistent and enabling alias-to-ifunc to be properly modeled in the IR. I exercised some judgement in the API clients of GlobalIndirectSymbol: some were 'monomorphized' for GlobalAlias and GlobalIFunc, and some remained shared (with the type adapted to become GlobalValue). Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D108872	2021-10-20 10:29:47 -07:00
william woodruff	e7fc254875	[BitcodeAnalyzer] allow a motivated user to dump BLOCKINFO This adds the `--dump-blockinfo` flag to `llvm-bcanalyzer`, allowing a sufficiently motivated user to dump (parts of) the `BLOCKINFO_BLOCK` block. The default behavior is unchanged, and `--dump-blockinfo` only takes effect in the same context as other flags that control dump behavior (i.e., requires that `--dump` is also passed). Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D107536	2021-10-10 10:15:14 +05:30
william woodruff	778bf73d7b	[BitcodeReader] fix a logic error in vector type element validation The current code checks whether the vector's element type is a valid structure element type, rather than a valid vector element type. The two have separate implementations and but only accept very slightly different sets of types, which is probably why this wasn't caught before. Differential Revision: https://reviews.llvm.org/D109655	2021-10-09 09:42:02 +05:30
Arthur Eubanks	05392466f0	Reland [IR] Increase max alignment to 4GB Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945. This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits. We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now. The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field. Updating clang's max allowed alignment will come in a future patch. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D110451	2021-10-06 13:29:23 -07:00
Arthur Eubanks	569346f274	Revert "Reland [IR] Increase max alignment to 4GB" This reverts commit 8d64314ffea55f2ad94c1b489586daa8ce30f451.	2021-10-06 11:38:11 -07:00

... 5 6 7 8 9 ...

1717 Commits