llvm-project

Author	SHA1	Message	Date
Alexander Richardson	3a4b351ba1	[IR] Introduce the `ptrtoaddr` instruction This introduces a new `ptrtoaddr` instruction which is similar to `ptrtoint` but has two differences: 1) Unlike `ptrtoint`, `ptrtoaddr` does not capture provenance 2) `ptrtoaddr` only extracts (and then extends/truncates) the low index-width bits of the pointer For most architectures, difference 2) does not matter since index (address) width and pointer representation width are the same, but this does make a difference for architectures that have pointers that aren't just plain integer addresses such as AMDGPU fat pointers or CHERI capabilities. This commit introduces textual and bitcode IR support as well as basic code generation, but optimization passes do not handle the new instruction yet so it may result in worse code than using ptrtoint. Follow-up changes will update capture tracking, etc. for the new instruction. RFC: https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/54 Reviewed By: nikic Pull Request: https://github.com/llvm/llvm-project/pull/139357	2025-08-08 10:12:39 -07:00
Nikita Popov	0cfea5b73c	[BitcodeReader] Avoid quadratic complexity in intrinsic upgrade (#150032 ) When materializing a function, we'd upgrade all calls to all upgraded intrinsics. However, this would operate on all calls to the intrinsic (including previously materialized ones), which leads to quadratic complexity. Instead, only upgrade the calls that are in the materialized function. This fixes a compile-time regression introduced by #149310.	2025-07-23 09:52:05 +02:00
Nikita Popov	92c55a315e	[IR] Only allow lifetime.start/end on allocas (#149310 ) lifetime.start and lifetime.end are primarily intended for use on allocas, to enable stack coloring and other liveness optimizations. This is necessary because all (static) allocas are hoisted into the entry block, so lifetime markers are the only way to convey the actual lifetimes. However, lifetime.start and lifetime.end are currently allowed to be used on non-alloca pointers. We don't actually do this in practice, but just the mere fact that this is possible breaks the core purpose of the lifetime markers, which is stack coloring of allocas. Stack coloring can only work correctly if all lifetime markers for an alloca are analyzable. * If a lifetime marker may operate on multiple allocas via a select/phi, we don't know which lifetime actually starts/ends and handle it incorrectly (https://github.com/llvm/llvm-project/issues/104776). * Stack coloring operates on the assumption that all lifetime markers are visible, and not, for example, hidden behind a function call or escaped pointer. It's not possible to change this, as part of the purpose of lifetime markers is that they work even in the presence of escaped pointers, where simple use analysis is insufficient. I don't think there is any way to have coherent semantics for lifetime markers on allocas, while also permitting them on arbitrary pointer values. This PR restricts lifetimes to operate on allocas only. As a followup, I will also drop the size argument, which is superfluous if we always operate on an alloca. (This change also renders various code handling lifetime markers on non-alloca dead. I plan to clean up that kind of code after dropping the size argument as well.) In practice, I've only found a few places that currently produce lifetimes on non-allocas: * CoroEarly replaces the promise alloca with the result of an intrinsic, which will later be replaced back with an alloca. I think this is the only place where there is some legitimate loss of functionality, but I don't think this is particularly important (I don't think we'd expect the promise in a coroutine to admit useful lifetime optimization.) * SafeStack moves unsafe allocas onto a separate frame. We can safely drop lifetimes here, as SafeStack performs its own stack coloring. * Similar for AddressSanitizer, it also moves allocas into separate memory. * LSR sometimes replaces the lifetime argument with a GEP chain of the alloca (where the offsets ultimately cancel out). This is just unnecessary. (Fixed separately in https://github.com/llvm/llvm-project/pull/149492.) * InferAddrSpaces sometimes makes lifetimes operate on an addrspacecast of an alloca. I don't think this is necessary.	2025-07-21 15:04:50 +02:00
Aleksandr Urakov	c9cbd4e9d4	[lld] Fix -ObjC load behavior with LTO for section names with whitespace (#146654 ) This is a fix additional to #92162 In some cases, section names contain a whitespace between the segment name and the actual section name (e.g. `__TEXT, __swift5_proto`). It is confirmed by source code of the Swift compiler This fix allows LTO to work correctly with the `-ObjC` flag in that rare case when only a section with a whitespace in the name is present in the linked bitcode module, but there are no sections containing `__TEXT,__swift` --------- Co-authored-by: Ураков Александр Сергеевич <a.urakov@tbank.ru> Co-authored-by: Ellis Hoag <ellis.sparky.hoag@gmail.com>	2025-07-21 09:08:33 +03:00
Jeremy Morse	c995c50355	[KeyInstr] Add bitcode support (#147260 ) Serialise key-instruction fields of DILocations and DISubprograms into and outof bitcode, add tests. debug-info bitcode sizes grow, but it balances out given an earlier size optimisation in 51f4e2c. Co-authored-by: Orlando Cazalet-Hyams <orlando.hyams@sony.com>	2025-07-07 11:57:46 +01:00
Rahul Joshi	b38de6c18e	[NFCI][LLVM] Adopt `ArrayRef::consume_front()` in a few places (#146793 )	2025-07-04 10:42:14 -07:00
Antonio Frighetto	f1cc0b607b	[IR] Introduce `dead_on_return` attribute Add `dead_on_return` attribute, which is meant to be taken advantage by the frontend, and states that the memory pointed to by the argument is dead upon function return. As with `byval`, it is supposed to be used for passing aggregates by value. The difference lies in the ABI: `byval` implies that the pointer is explicitly passed as argument to the callee (during codegen the copy is emitted as per byval contract), whereas a `dead_on_return`-marked argument implies that the copy already exists in the IR, is located at a specific stack offset within the caller, and this memory will not be read further by the caller upon callee return – or otherwise poison, if read before being written. RFC: https://discourse.llvm.org/t/rfc-add-dead-on-return-attribute/86871.	2025-07-02 09:29:36 +02:00
Matt Arsenault	fff720d641	Triple: Forward declare Twine and remove include (#145685 )	2025-06-26 15:26:04 +09:00
Mircea Trofin	82cbd68504	[NFC][PGO] Use constants rather than free strings for metadata labels (#145721 )	2025-06-25 16:20:10 -07:00
Jeremy Morse	97ac6483aa	[DebugInfo][RemoveDIs] Delete debug-info-format flag (#143746 ) This flag was used to let us incrementally introduce debug records into LLVM, however everything is now using records. It serves no purpose now, so delete it.	2025-06-12 11:51:58 +01:00
Jeremy Morse	0e4b8b8f81	[DebugInfo][RemoveDIs] Rip out the UseNewDbgInfoFormat flag (#143207 ) Start removing debug intrinsics support -- starting with the flag that controls production of their replacement, debug records. This patch removes the command-line-flag and with it the ability to switch back to intrinsics. The module / function / block level "IsNewDbgInfoFormat" flags get hardcoded to true, I'll to incrementally remove things that depend on those flags.	2025-06-09 19:36:34 +01:00
Teresa Johnson	3ec2de2753	[MemProf] Optionally save context size info on largest cold allocations (#142837 ) Reapply PR142507 with fix for test: add in the same x86_64-linux requirement as other tests as the stack ids are currently computed differently on big endian systems. This will be investigated separately. In order to allow selective reporting of context hinting during the LTO link, and in the future to allow selective more aggressive cloning, add an option to specify a minimum percent of the max cold size in the profile summary. Contexts that meet that threshold will get context size info metadata (and ThinLTO summary information) on the associated allocations. Specifying -memprof-report-hinted-sizes during the pre-LTO compile step will continue to cause all contexts to receive this metadata. But specifying -memprof-report-hinted-sizes only during the LTO link will cause only those that meet the new threshold and have the metadata to get reported. To support this, because the alloc info summary and associated bitcode requires the context size information to be in the same order as the other context information, 0s are inserted for contexts without this metadata. The bitcode writer uses a more compact format for the context ids to allow better compression of the 0s. As part of this change several helper methods are added to query whether metadata contains context size info on any or all contexts.	2025-06-04 13:08:56 -07:00
Teresa Johnson	6c1091ea3f	Revert "[MemProf] Optionally save context size info on largest cold allocations" (#142688 ) Reverts llvm/llvm-project#142507 due to buildbot failures that I will look into tomorrow.	2025-06-03 16:05:16 -07:00
Teresa Johnson	f2adae5780	[MemProf] Optionally save context size info on largest cold allocations (#142507 ) In order to allow selective reporting of context hinting during the LTO link, and in the future to allow selective more aggressive cloning, add an option to specify a minimum percent of the max cold size in the profile summary. Contexts that meet that threshold will get context size info metadata (and ThinLTO summary information) on the associated allocations. Specifying -memprof-report-hinted-sizes during the pre-LTO compile step will continue to cause all contexts to receive this metadata. But specifying -memprof-report-hinted-sizes only during the LTO link will cause only those that meet the new threshold and have the metadata to get reported. To support this, because the alloc info summary and associated bitcode requires the context size information to be in the same order as the other context information, 0s are inserted for contexts without this metadata. The bitcode writer uses a more compact format for the context ids to allow better compression of the 0s. As part of this change several helper methods are added to query whether metadata contains context size info on any or all contexts.	2025-06-03 14:20:38 -07:00
Andrew Rogers	34bb642660	[llvm] annotate interfaces in AsmParser, BinaryFormat, Bitcode, and Bitstream libraries for DLL export (#141794 ) ## Purpose This patch is one in a series of code-mods that annotate LLVM’s public interface for export. This patch annotates the `llvm/AsmParser`, `llvm/BinaryFormat`, `llvm/Bitcode` and `llvm/Bitstream libraries. These annotations currently have no meaningful impact on the LLVM build; however, they are a prerequisite to support an LLVM Windows DLL (shared library) build. ## Background This effort is tracked in #109483. Additional context is provided in [this discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307), and documentation for `LLVM_ABI` and related annotations is found in the LLVM repo [here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst). The bulk of these changes were generated automatically using the [Interface Definition Scanner (IDS)](https://github.com/compnerd/ids) tool, followed formatting with `git clang-format`. The following manual adjustments were also applied after running IDS on Linux: - Add `LLVM_ABI_FRIEND` to friend member functions declared with `LLVM_ABI` - Add `LLVM_ABI` symbols that require export but are not declared in headers ## Validation Local builds and tests to validate cross-platform compatibility. This included llvm, clang, and lldb on the following configurations: - Windows with MSVC - Windows with Clang - Linux with GCC - Linux with Clang - Darwin with Clang	2025-06-02 09:52:09 -07:00
Timothy Werquin	a8e486bfc4	[Bitcode] Fix constexpr expansion creating invalid PHIs (#141560 ) Fixes errors about duplicate PHI edges when the input had duplicates with constexprs in them. The constexpr translation makes new basic blocks, causing the verifier to complain about duplicate entries in PHI nodes.	2025-05-27 15:51:48 +02:00
Kazu Hirata	13c3df9a36	[Bitcode] Remove unused includes (NFC) (#141354 ) These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.	2025-05-24 09:37:35 -07:00
Rahul Joshi	86c5112409	[NFCI][LLVM/MLIR] Adopt `TrailingObjects` convenience API (#138554 ) Adopt `TrailingObjects` convenience API that was added in https://github.com/llvm/llvm-project/pull/138970 in LLVM and MLIR code.	2025-05-12 16:17:43 -07:00
Kazu Hirata	58014a506d	[IR] Teach getConstraintString to return StringRef (NFC) (#139401 ) With this change, some callers get to use StringRef::starts_with. I'm planning to teach getAsmString to return StringRef also, but I'ld like to keep that separate from this patch.	2025-05-10 13:00:47 -07:00
Teresa Johnson	a3d027f923	[MemProf] Disable alloc context in combined summary for ndebug builds (#139161 ) Since we currently only use the context information in the alloc info summary in the LTO backend for assertion checking, there is no need to write this into the combined summary index for distributed ThinLTO for NDEBUG builds. Put this under a new -combined-index-memprof-context option which is off by default for NDEBUG. The advantage is that we save time (not having to sort in preparation for building the radix trees), and space in the generated bitcode files. We could also do so for the callsite info records, but those are smaller and less expensive to prepare.	2025-05-09 16:28:49 -07:00
Matt Arsenault	9383fb23e1	Reapply "IR: Remove uselist for constantdata (#137313 )" (#138961 ) Reapply "IR: Remove uselist for constantdata (#137313)" This reverts commit 5936c02c8b9c6d1476f7830517781ce8b6e26e75. Fix checking uselists of constants in assume bundle queries	2025-05-08 08:00:09 +02:00
Rahul Joshi	7245e21e89	[NFC][Support] Add llvm::uninitialized_copy (#138174 ) Add `llvm::uninitialized_copy` that accepts a range instead of start/end iterator for the source of the copy.	2025-05-07 17:37:38 -07:00
Kirill Stoimenov	5936c02c8b	Revert "IR: Remove uselist for constantdata (#137313 )" Possibly breaks the build: https://lab.llvm.org/buildbot/#/builders/24/builds/8119 This reverts commit 87f312aad6ede636cd2de5d18f3058bf2caf5651.	2025-05-07 00:07:55 +00:00
Matt Arsenault	87f312aad6	IR: Remove uselist for constantdata (#137313 ) This is a resurrected version of the patch attached to this RFC: https://discourse.llvm.org/t/rfc-constantdata-should-not-have-use-lists/42606 In this adaptation, there are a few differences. In the original patch, the Use's use list was replaced with an unsigned* to the reference count in the value. This version leaves them as null and leaves the ref counting only in Value. Remove use-lists from instances of ConstantData (which are shared across modules and have no operands). To continue supporting most of the use-list API, store a ref-count in place of the use-list; this is for API like Value::use_empty and Value::hasNUses. Operations that actually need the use-list -- like Value::use_begin -- will assert. This change has three benefits: 1. The compiler output cannot in any way depend on the use-list order of instances of ConstantData. 2. There's no use-list traffic when adding and removing simple constants from operand lists (although there is ref-count traffic; YMMV). 3. It's cheaper to serialize use-lists (since we're no longer serializing the use-list order of things like i32 0). The downside is that you can't look at all the users of ConstantData, but traversals of users of i32 0 are already ill-advised. Possible follow-ups: - Track if an instance of a ConstantVector/ConstantArray/etc. is known to have all ConstantData arguments, and drop the use-lists to ref-counts in those cases. Callers need to check Value::hasUseList before iterating through the use-list. - Remove even the ref-counts. I'm not sure they have any benefit besides minimizing the scope of this commit, and maintaining the counts is not free. Fixes #58629 Co-authored-by: Duncan P. N. Exon Smith <dexonsmith@apple.com>	2025-05-06 17:20:37 +02:00
Kazu Hirata	2f3067ed69	[llvm] Remove unused local variables (NFC) (#138454 )	2025-05-04 09:38:16 -07:00
Nikita Popov	4109bac330	[IR] Do not store Function inside BlockAddress (#137958 ) Currently BlockAddresses store both the Function and the BasicBlock they reference, and the BlockAddress is part of the use list of both the Function and BasicBlock. This is quite awkward, because this is not really a use of the function itself (and walks of function uses generally skip block addresses for that reason). This also has weird implications on function RAUW (as that will replace the function in block addresses in a way that generally doesn't make sense), and causes other peculiar issues, like the ability to have multiple block addresses for one block (with different functions). Instead, I believe it makes more sense to specify only the basic block and let the function be implied by the BB parent. This does mean that we may have block addresses without a function (if the BB is not inserted), but this should only happen during IR construction.	2025-05-02 09:40:50 +02:00
Jonathan Thackray	6e49f73825	Reland [llvm] Add support for llvm IR atomicrmw fminimum/fmaximum instructions (#137701 ) This patch adds support for LLVM IR atomicrmw `fmaximum` and `fminimum` instructions. These mirror the `llvm.maximum.` and `llvm.minimum.` instructions, but are atomic and use IEEE754 2019 handling for NaNs, which is different to `fmax` and `fmin`. See: https://llvm.org/docs/LangRef.html#llvm-minimum-intrinsic for more details. Future changes will allow this LLVM IR to be lowered to specialised assembler instructions on suitable targets, such as AArch64.	2025-04-30 22:06:37 +01:00
Jonathan Thackray	7ee0097b48	Revert "[llvm] Add support for llvm IR atomicrmw fminimum/fmaximum instructions" (#137657 ) Reverts llvm/llvm-project#136759 due to bad interaction with c792b25e4	2025-04-28 16:53:36 +01:00
Jonathan Thackray	ba420d8122	[llvm] Add support for llvm IR atomicrmw fminimum/fmaximum instructions (#136759 ) This patch adds support for LLVM IR atomicrmw `fmaximum` and `fminimum` instructions. These mirror the `llvm.maximum.` and `llvm.minimum.` instructions, but are atomic and use IEEE754 2019 handling for NaNs, which is different to `fmax` and `fmin`. See: https://llvm.org/docs/LangRef.html#llvm-minimum-intrinsic for more details. Future changes will allow this LLVM IR to be lowered to specialised assembler instructions on suitable targets, such as AArch64.	2025-04-28 15:31:44 +01:00
Owen Rodley	d3d856ad84	Clean up external users of GlobalValue::getGUID(StringRef) (#129644 ) See https://discourse.llvm.org/t/rfc-keep-globalvalue-guids-stable/84801 for context. This is a non-functional change which just changes the interface of GlobalValue, in preparation for future functional changes. This part touches a fair few users, so is split out for ease of review. Future changes to the GlobalValue implementation can then be focused purely on that class. This does the following: * Rename GlobalValue::getGUID(StringRef) to getGUIDAssumingExternalLinkage. This is simply making explicit at the callsite what is currently implicit. * Where possible, migrate users to directly calling getGUID on a GlobalValue instance. * Otherwise, where possible, have them call the newly renamed getGUIDAssumingExternalLinkage, to make the assumption explicit. There are a few cases where neither of the above are possible, as the caller saves and reconstructs the necessary information to compute the GUID themselves. We want to migrate these callers eventually, but for this first step we leave them be.	2025-04-28 11:09:43 +10:00
Kazu Hirata	584aefbc9b	[llvm] Use range-based for loops with llvm::drop_begin (NFC) (#136417 )	2025-04-19 09:10:43 -07:00
Jeremy Morse	40f9bb9e25	[DebugInfo][RemoveDIs] Eliminate another debug-info variation flag (#133917 ) The "preserve input debug-info format" flag allowed some tooling to opt into not seeing the new debug records yet, and to not autoupgrade. This was good at the time, but un-necessary now that we'll be ditching intrinsics shortly. It also hides errors now: verify-uselistorder was hardcoding this flag to on, and as a result it hasn't seen debug records before. Thus, we missed a uselistorder variation: constant-expressions such as GEPs can be contained within debug records and completely isolated from the value hierachy, see the metadata-use-uselistorder.ll test. These Values didn't get ordered, but were legitimate uses of constants like "i64 0", and we now run into difficulty handling that. The patch to AsmWriter seeks Values to order even through debug-info now. Finally there are a few intrinsics-tests relying on this flag that we can just delete, such as one in llvm-reduce and another few in the LocalTest unit tests. For the fast-isel test, it was added in https://reviews.llvm.org/D67703 explicitly for checking the size of blocks without debug-info and in 1525abb9c94 the codepath it tests moved towards being sunsetted. It'll be totally redundant once RemoveDIs is on permanently. Note that there's now no explicit test for the textual-IR autoupgrade path. I submit that we can rely on the thousands of .ll files where we've only been bothered to update the outputs, not the inputs, to debug records.	2025-04-09 18:00:28 +01:00
Jeremy Morse	1ebc308bba	[DebugInfo][RemoveDIs] Remove debug-intrinsic printing cmdline options (#131855 ) During the transition from debug intrinsics to debug records, we used several different command line options to customise handling: the printing of debug records to bitcode and textual could be independent of how the debug-info was represented inside a module, whether the autoupgrader ran could be customised. This was all valuable during development, but now that totally removing debug intrinsics is coming up, this patch removes those options in favour of a single flag (experimental-debuginfo-iterators), which enables autoupgrade, in-memory debug records, and debug record printing to bitcode and textual IR. We need to do this ahead of removing the experimental-debuginfo-iterators flag, to reduce the amount of test-juggling that happens at that time. There are quite a number of weird test behaviours related to this -- some of which I simply delete in this commit. Things like print-non-instruction-debug-info.ll , the test suite now checks for debug records in all tests, and we don't want to check we can print as intrinsics. Or the update_test_checks tests -- these are duplicated with write-experimental-debuginfo=false to ensure file writing for intrinsics is correct, but that's something we're imminently going to delete. A short survey of curious test changes: * free-intrinsics.ll: we don't need to test that debug-info is a zero cost intrinsic, because we won't be using intrinsics in the future. * undef-dbg-val.ll: apparently we pinned this to non-RemoveDIs in-memory mode while we sorted something out; it works now either way. * salvage-cast-debug-info.ll: was testing intrinsics-in-memory get salvaged, isn't necessary now * localize-constexpr-debuginfo.ll: was producing "dead metadata" intrinsics for optimised-out variable values, dbg-records takes the (correct) representation of poison/undef as an operand. Looks like we didn't update this in the past to avoid spurious test differences. * Transforms/Scalarizer/dbginfo.ll: this test was explicitly testing that debug-info affected codegen, and we deferred updating the tests until now. This is just one of those silent gnochange issues that get fixed by RemoveDIs. Finally: I've added a bitcode test, dbg-intrinsics-autoupgrade.ll.bc, that checks we can autoupgrade debug intrinsics that are in bitcode into the new debug records.	2025-04-01 14:27:11 +01:00
Kazu Hirata	6257621f41	[llvm] Use llvm::append_range (NFC) (#133658 )	2025-03-30 18:43:02 -07:00
Vitaly Buka	e3076c6f3d	[NFC][IR] Use emplace instead of insert (#130360 ) Preparation for for CFI Index refactoring, which will fix O(N^2) in ThinLTO indexing.	2025-03-07 14:57:02 -08:00
Vitaly Buka	356bbb0b25	[NFC][IR] Use auto instead of explicit type (#130351 ) Preparation for CFI Index refactoring, which will fix O(N^2) in ThinLTO indexing.	2025-03-07 14:47:24 -08:00
Nikita Popov	979c275097	[IR] Store Triple in Module (NFC) (#129868 ) The module currently stores the target triple as a string. This means that any code that wants to actually use the triple first has to instantiate a Triple, which is somewhat expensive. The change in #121652 caused a moderate compile-time regression due to this. While it would be easy enough to work around, I think that architecturally, it makes more sense to store the parsed Triple in the module, so that it can always be directly queried. For this change, I've opted not to add any magic conversions between std::string and Triple for backwards-compatibilty purses, and instead write out needed Triple()s or str()s explicitly. This is because I think a decent number of them should be changed to work on Triple as well, to avoid unnecessary conversions back and forth. The only interesting part in this patch is that the default triple is Triple("") instead of Triple() to preserve existing behavior. The former defaults to using the ELF object format instead of unknown object format. We should fix that as well.	2025-03-06 10:27:47 +01:00
Antonio Frighetto	ff585feacf	[IR][ModRef] Introduce `errno` memory location Model C/C++ `errno` macro by adding a corresponding `errno` memory location kind to the IR. Preliminary work to separate `errno` writes from other memory accesses, to the benefit of alias analyses and optimization correctness. Previous discussion: https://discourse.llvm.org/t/rfc-modelling-errno-memory-effects/82972.	2025-02-13 12:13:39 +01:00
Alex MacLean	de7438e472	[NVPTX] Auto-Upgrade some nvvm.annotations to attributes (#119261 ) Add a new AutoUpgrade function to convert some legacy nvvm.annotations metadata to function level attributes. These attributes are quicker to look-up so improve compile time and are more idiomatic than using metadata which should not include required information that changes the meaning of the program. Currently supported annotations are: - !"kernel" -> ptx_kernel calling convention - !"align" -> alignstack parameter attributes (return not yet supported)	2025-01-29 16:27:27 -08:00
Nikita Popov	29441e4f5f	[IR] Convert from nocapture to captures(none) (#123181 ) This PR removes the old `nocapture` attribute, replacing it with the new `captures` attribute introduced in #116990. This change is intended to be essentially NFC, replacing existing uses of `nocapture` with `captures(none)` without adding any new analysis capabilities. Making use of non-`none` values is left for a followup. Some notes: * `nocapture` will be upgraded to `captures(none)` by the bitcode reader. * `nocapture` will also be upgraded by the textual IR reader. This is to make it easier to use old IR files and somewhat reduce the test churn in this PR. * Helper APIs like `doesNotCapture()` will check for `captures(none)`. * MLIR import will convert `captures(none)` into an `llvm.nocapture` attribute. The representation in the LLVM IR dialect should be updated separately.	2025-01-29 16:56:47 +01:00
Mats Jun Larsen	416f1c465d	[IR] Replace of PointerType::get(Type) with opaque version (NFC) (#123617 ) In accordance with https://github.com/llvm/llvm-project/issues/123569 In order to keep the patch at reasonable size, this PR only covers for the llvm subproject, unittests excluded.	2025-01-21 00:32:56 +09:00
Nikita Popov	22e9024c9f	[IR] Introduce captures attribute (#116990 ) This introduces the `captures` attribute as described in: https://discourse.llvm.org/t/rfc-improvements-to-capture-tracking/81420 This initial patch only introduces the IR/bitcode support for the attribute and its in-memory representation as `CaptureInfo`. This will be followed by a patch to upgrade and remove the `nocapture` attribute, and then by actual inference/analysis support. Based on the RFC feedback, I've used a syntax similar to the `memory` attribute, though the only "location" that can be specified is `ret`. I've added some pretty extensive documentation to LangRef on the semantics. One non-obvious bit here is that using ptrtoint will not result in a "return-only" capture, even if the ptrtoint result is only used in the return value. Without this requirement we wouldn't be able to continue ordinary capture analysis on the return value.	2025-01-13 14:40:25 +01:00
Florian Hahn	a487b792e2	[TySan] Add initial Type Sanitizer (LLVM) (#76259 ) This patch introduces the LLVM components of a type sanitizer: a sanitizer for type-based aliasing violations. It is based on Hal Finkel's https://reviews.llvm.org/D32198. C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit these given TBAA metadata added by Clang. Roughly, a pointer of given type cannot be used to access an object of a different type (with, of course, certain exceptions). Unfortunately, there's a lot of code in the wild that violates these rules (e.g. for type punning), and such code often must be built with -fno-strict-aliasing. Performance is often sacrificed as a result. Part of the problem is the difficulty of finding TBAA violations. Hopefully, this sanitizer will help. For each TBAA type-access descriptor, encoded in LLVM's IR using metadata, the corresponding instrumentation pass generates descriptor tables. Thus, for each type (and access descriptor), we have a unique pointer representation. Excepting anonymous-namespace types, these tables are comdat, so the pointer values should be unique across the program. The descriptors refer to other descriptors to form a type aliasing tree (just like LLVM's TBAA metadata does). The instrumentation handles the "fast path" (where the types match exactly and no partial-overlaps are detected), and defers to the runtime to handle all of the more-complicated cases. The runtime, of course, is also responsible for reporting errors when those are detected. The runtime uses essentially the same shadow memory region as tsan, and we use 8 bytes of shadow memory, the size of the pointer to the type descriptor, for every byte of accessed data in the program. The value 0 is used to represent an unknown type. The value -1 is used to represent an interior byte (a byte that is part of a type, but not the first byte). The instrumentation first checks for an exact match between the type of the current access and the type for that address recorded in the shadow memory. If it matches, it then checks the shadow for the remainder of the bytes in the type to make sure that they're all -1. If not, we call the runtime. If the exact match fails, we next check if the value is 0 (i.e. unknown). If it is, then we check the shadow for the remainder of the byes in the type (to make sure they're all 0). If they're not, we call the runtime. We then set the shadow for the access address and set the shadow for the remaining bytes in the type to -1 (i.e. marking them as interior bytes). If the type indicated by the shadow memory for the access address is neither an exact match nor 0, we call the runtime. The instrumentation pass inserts calls to the memset intrinsic to set the memory updated by memset, memcpy, and memmove, as well as allocas/byval (and for lifetime.start/end) to reset the shadow memory to reflect that the type is now unknown. The runtime intercepts memset, memcpy, etc. to perform the same function for the library calls. The runtime essentially repeats these checks, but uses the full TBAA algorithm, just as the compiler does, to determine when two types are permitted to alias. In a situation where access overlap has occurred and aliasing is not permitted, an error is generated. Clang's TBAA representation currently has a problem representing unions, as demonstrated by the one XFAIL'd test in the runtime patch. We'll update the TBAA representation to fix this, and at the same time, update the sanitizer. When the sanitizer is active, we disable actually using the TBAA metadata for AA. This way we're less likely to use TBAA to remove memory accesses that we'd like to verify. As a note, this implementation does not use the compressed shadow-memory scheme discussed previously (http://lists.llvm.org/pipermail/llvm-dev/2017-April/111766.html). That scheme would not handle the struct-path (i.e. structure offset) information that our TBAA represents. I expect we'll want to further work on compressing the shadow-memory representation, but I think it makes sense to do that as follow-up work. It goes together with the corresponding clang changes (https://github.com/llvm/llvm-project/pull/76260) and compiler-rt changes (https://github.com/llvm/llvm-project/pull/76261) PR: https://github.com/llvm/llvm-project/pull/76259	2024-12-17 13:57:34 +00:00
John Brawn	ecbe4d1e36	[IR] Allow fast math flags on fptrunc and fpext (#115894 ) This consists of: * Make these instructions part of FPMathOperator. * Adjust bitcode/ir readers/writers to expect fast math flags on these instructions. * Make IRBuilder set the fast math flags on these instructions. * Update langref and release notes. * Update a bunch of tests. Some of these are due to InstCombineCasts incorrectly adding fast math flags to fptrunc, which will be fixed in a later patch.	2024-12-04 10:53:04 +00:00
Nikita Popov	98204a2e5b	[Bitcode] Verify types for aggregate initializers Unfortunately all the nice error messages get lost because we don't forward errors from lazy value materialization. Fixes https://github.com/llvm/llvm-project/issues/117707.	2024-11-28 11:21:38 +01:00
Teresa Johnson	776476c282	Reapply "[MemProf] Use radix tree for alloc contexts in bitcode summaries" (#117395 ) (#117404 ) This reverts commit fdb050a5024320ec29d2edf3f2bc686c3a84abaa, and restores ccb4702038900d82d1041ff610788740f5cef723, with a fix for build bot failures. Specifically, add ProfileData to the dependences of the BitWriter library, which was causing shared library builds of LLVM to fail. Reproduced the failure with a shared library build and confirmed this change fixes that build failure.	2024-11-22 16:18:30 -08:00
Teresa Johnson	fdb050a502	Revert "[MemProf] Use radix tree for alloc contexts in bitcode summaries" (#117395 ) Reverts llvm/llvm-project#117066 This is causing some build bot failures that need investigation.	2024-11-22 14:57:58 -08:00
Teresa Johnson	ccb4702038	[MemProf] Use radix tree for alloc contexts in bitcode summaries (#117066 ) Leverage the support added to represent allocation contexts in a more compact way via a radix tree in the indexed profile to similarly reduce sizes of the bitcode summaries. For a large target, this reduced the size of the per-module summaries by about 18% and in the distributed combined index files by 28%.	2024-11-22 14:49:55 -08:00
Teresa Johnson	b35f40688e	[MemProf] Change the STACK_ID record to fixed width values (#116448 ) The stack ids are hashes that are close to 64 bits in size, so emitting as a pair of 32-bit fixed-width values is more efficient than a VBR. This reduced the summary bitcode size for a large target by about 1%. Bump the index version and ensure we can read the old format.	2024-11-18 15:16:48 -08:00
Teresa Johnson	9513f2fdf2	[MemProf] Print full context hash when reporting hinted bytes (#114465 ) Improve the information printed when -memprof-report-hinted-sizes is enabled. Now print the full context hash computed from the original profile, similar to what we do when reporting matching statistics. This will make it easier to correlate with the profile. Note that the full context hash must be computed at profile match time and saved in the metadata and summary, because we may trim the context during matching when it isn't needed for distinguishing hotness. Similarly, due to the context trimming, we may have more than one full context id and total size pair per MIB in the metadata and summary, which now get a list of these pairs. Remove the old aggregate size from the metadata and summary support. One other change from the prior support is that we no longer write the size information into the combined index for the LTO backends, which don't use this information, which reduces unnecessary bloat in distributed index files.	2024-11-15 08:24:44 -08:00

1 2 3 4 5 ...

1373 Commits