llvm-project

Author	SHA1	Message	Date
Nikita Popov	71f7b972c3	[Local] Make combineAAMetadata() more principled (#122091 ) This moves combineAAMetadata() into Local and implements it via a new AAOnly flag, which will intersect only AA metadata and keep other known metadata. The existing KnownIDs list is dropped, because it is redundant with the switch in combineMetadata(), which already drops unknown metadata. I tried a few variants of this, and ultimately went with the AAOnly flag because this way we make an explicit choice for each metadata kind supported by combineMetadata(), and ignoring the flag gives you conservatively correct behavior. I checked that the memcpy tests still pass if we adjust the logic for MD_memprof/MD_callsite to drop the metadata instead of arbitrarily picking one. Fixes https://github.com/llvm/llvm-project/issues/121495.	2025-01-09 09:34:46 +01:00
Akshat Oke	f6c76d5180	[PM] Remove is_analysis label for LoopSimplify (#121433 ) This reverts part of the changes in #118779	2025-01-09 10:11:14 +05:30
Ryan Mansfield	67efbd0bf1	[LLVM] Fix various cl::desc typos and whitespace issues (NFC) (#121955 )	2025-01-08 11:07:23 +01:00
Mircea Trofin	4312075efa	[nfc][thinlto] remove unnecessary return from `renameModuleForThinLTO` (#121851 ) Same goes for `FunctionImportGlobalProcessing::run`. The return value was used, but it was always `false`.	2025-01-06 15:19:09 -08:00
Yingwei Zheng	a77346bad0	[IRBuilder] Refactor FMF interface (#121657 ) Up to now, the only way to set specified FMF flags in IRBuilder is to use `FastMathFlagGuard`. It makes the code ugly and hard to maintain. This patch introduces a helper class `FMFSource` to replace the original parameter `Instruction *FMFSource` in IRBuilder. To maximize the compatibility, it accepts an instruction or a specified FMF. This patch also removes the use of `FastMathFlagGuard` in some simple cases. Compile-time impact: https://llvm-compile-time-tracker.com/compare.php?from=f87a9db8322643ccbc324e317a75b55903129b55&to=9397e712f6010be15ccf62f12740e9b4a67de2f4&stat=instructions%3Au	2025-01-06 14:37:04 +08:00
Fangrui Song	e6f76378c2	EntryExitInstrumenter: skip available_externally linkage. gnu::always_inline functions, which lower to available_externally, may not have definitions external to the module. When the -finstrument-function family of options instruments such a function (which takes the function address), the result may be a linker error if the function is not optimized out, e.g. ``` // -std=c++17 or above with libstdc++ #include <string> std::string str; int main() {} ``` Simplified reproducer: ``` template <typename T> struct A { [[gnu::always_inline]] T bar(T a) { return a * 2; } }; extern template class A<int>; int main(int argc, char **argv) { return A<int>().bar(argc); } ``` GCC's -finstrument-function instrumentation skips such functions (https://gcc.gnu.org/PR78333). Let's skip such functions (available_externally) as well. Fix #50742. Pull Request: https://github.com/llvm/llvm-project/pull/121452	2025-01-03 09:25:08 -08:00
Teresa Johnson	3a423a10ff	[MemProf][PGO] Prevent dropping of profile metadata during optimization (#121359 ) This patch fixes a couple of places where memprof-related metadata (!memprof and !callsite) were being dropped, and one place where PGO metadata (!prof) was being dropped. All were due to instances of combineMetadata() being invoked. That function drops all metadata not in the list provided by the client, and also drops any not in its switch statement. Memprof metadata needed a case in the combineMetadata switch statement. For now we simply keep the metadata of the instruction being kept, which doesn't retain all the profile information when two calls with memprof metadata are being combined, but at least retains some. For the memprof metadata being dropped during call CSE, add memprof and callsite metadata to the list of known ids in combineMetadataForCSE. Neither memprof nor regular prof metadata were in the list of known ids for the callsite in MemCpyOptimizer, which was added to combine AA metadata after optimization of byval arguments fed by memcpy instructions, and similar types of optimizations of memcpy uses. There is one other callsite of combineMetadata, but it is only invoked on load instructions, which do not carry these types of metadata.	2025-01-02 12:11:59 -08:00
Yingwei Zheng	eafbab6fac	[EntryExitInstrumenter][AArch64][RISCV][LoongArch] Pass `__builtin_return_address(0)` into `_mcount` (#121107 ) On RISC-V, AArch64, and LoongArch, the `_mcount` function takes `__builtin_return_address(0)` as an argument since `__builtin_return_address(1)` is not available on these platforms. This patch fixes the argument passing to match the behavior of glibc/gcc. Closes https://github.com/llvm/llvm-project/issues/121103.	2025-01-01 15:02:08 +08:00
DaPorkchop_	cea738bc9a	[SimplifyCFG] Replace unreachable switch lookup table holes with poison (#94990 ) As discussed in #94468, this causes switch lookup table entries which are unreachable to be poison instead of filling them with a value from one of the reachable cases. --------- Co-authored-by: DianQK <dianqk@dianqk.net>	2024-12-26 07:47:26 +08:00
Owen Anderson	bc8fa9c443	Revert "SimplifyLibCalls: Use default globals address space when building new global strings. (#118729 )" (#119616 ) This reverts commit cfa582e8aaa791b52110791f5e6504121aaf62bf.	2024-12-21 09:33:39 +13:00
Dominik Steenken	fa9cef50b1	Only guard loop metadata that has non-debug info in it (#118825 ) This PR is motivated by a mismatch we discovered between compilation results with vs. without `-g3`. We noticed this when compiling SPEC2017 testcases. The specific instance we saw is fixed in this PR by modifying a guard (see below), but it is likely that similar instances exist elsewhere in the codebase. The specific case fixed in this PR manifests itself in the `SimplifyCFG` pass doing different things depending on whether DebugInfo is generated or not. At the end of this comment, there is reduced example code that shows the behavior in question. The differing behavior has two root causes: 1. Commit https://github.com/llvm/llvm-project/commit/c07e19b adds loop metadata including debug locations to loops that otherwise would not have loop metadata 2. Commit https://github.com/llvm/llvm-project/commit/ac28efa6c100 adds a guard to a simplification action in `SimplifyCFG` that prevents it from simplifying away loop metadata So, the change in 2. does not consider that when compiling with debug symbols, loops that otherwise would not have metadata that needs preserving, now have debug locations in their loop metadata. Thus, with `-g3`, `SimplifyCFG` behaves differently than without it. The larger issue is that while debug info is not supposed to influence the final compilation result, commits like 1. blur the line between what is and is not debug info, and not all optimization passes account for this. This PR does not address that and rather just modifies this particular guard in order to restore equivalent behavior between debug and non-debug builds in this one instance.
--- Here is a reduced version of a file from `f526.blender_r` that showcases the behavior in question: ```C struct LinkNode; typedef struct LinkNode { struct LinkNode *next; void *link; } LinkNode; void do_projectpaint_thread_ph_v_state() { int ps = do_projectpaint_thread_ph_v_state; LinkNode *node; while (do_projectpaint_thread_ph_v_state) for (node = ps; node; node = node->next) ; } ``` Compiling this with and without DebugInfo, and then disassembling the results, leads to different outcomes (tested on SystemZ and X86). The reason for this is that the `SimplifyCFG` pass does different things in either case.	2024-12-20 15:15:51 +01:00
Abhay Kanhere	cc246d4a29	[Transforms][CodeExtraction] bug fix regions with stackrestore (#118564 ) Ensure code extraction for outlining to a function does not create a function with stacksave of caller to restore stack (e.g. tail call).	2024-12-19 09:19:11 -07:00
Florian Hahn	a487b792e2	[TySan] Add initial Type Sanitizer (LLVM) (#76259 ) This patch introduces the LLVM components of a type sanitizer: a sanitizer for type-based aliasing violations. It is based on Hal Finkel's https://reviews.llvm.org/D32198. C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit these given TBAA metadata added by Clang. Roughly, a pointer of given type cannot be used to access an object of a different type (with, of course, certain exceptions). Unfortunately, there's a lot of code in the wild that violates these rules (e.g. for type punning), and such code often must be built with -fno-strict-aliasing. Performance is often sacrificed as a result. Part of the problem is the difficulty of finding TBAA violations. Hopefully, this sanitizer will help. For each TBAA type-access descriptor, encoded in LLVM's IR using metadata, the corresponding instrumentation pass generates descriptor tables. Thus, for each type (and access descriptor), we have a unique pointer representation. Excepting anonymous-namespace types, these tables are comdat, so the pointer values should be unique across the program. The descriptors refer to other descriptors to form a type aliasing tree (just like LLVM's TBAA metadata does). The instrumentation handles the "fast path" (where the types match exactly and no partial-overlaps are detected), and defers to the runtime to handle all of the more-complicated cases. The runtime, of course, is also responsible for reporting errors when those are detected. The runtime uses essentially the same shadow memory region as tsan, and we use 8 bytes of shadow memory, the size of the pointer to the type descriptor, for every byte of accessed data in the program. The value 0 is used to represent an unknown type. The value -1 is used to represent an interior byte (a byte that is part of a type, but not the first byte). 
The instrumentation first checks for an exact match between the type of the current access and the type for that address recorded in the shadow memory. If it matches, it then checks the shadow for the remainder of the bytes in the type to make sure that they're all -1. If not, we call the runtime. If the exact match fails, we next check if the value is 0 (i.e. unknown). If it is, then we check the shadow for the remainder of the bytes in the type (to make sure they're all 0). If they're not, we call the runtime. We then set the shadow for the access address and set the shadow for the remaining bytes in the type to -1 (i.e. marking them as interior bytes). If the type indicated by the shadow memory for the access address is neither an exact match nor 0, we call the runtime. The instrumentation pass inserts calls to the memset intrinsic to set the memory updated by memset, memcpy, and memmove, as well as allocas/byval (and for lifetime.start/end) to reset the shadow memory to reflect that the type is now unknown. The runtime intercepts memset, memcpy, etc. to perform the same function for the library calls. The runtime essentially repeats these checks, but uses the full TBAA algorithm, just as the compiler does, to determine when two types are permitted to alias. In a situation where access overlap has occurred and aliasing is not permitted, an error is generated. Clang's TBAA representation currently has a problem representing unions, as demonstrated by the one XFAIL'd test in the runtime patch. We'll update the TBAA representation to fix this, and at the same time, update the sanitizer. When the sanitizer is active, we disable actually using the TBAA metadata for AA. This way we're less likely to use TBAA to remove memory accesses that we'd like to verify. As a note, this implementation does not use the compressed shadow-memory scheme discussed previously (http://lists.llvm.org/pipermail/llvm-dev/2017-April/111766.html).
That scheme would not handle the struct-path (i.e. structure offset) information that our TBAA represents. I expect we'll want to further work on compressing the shadow-memory representation, but I think it makes sense to do that as follow-up work. It goes together with the corresponding clang changes (https://github.com/llvm/llvm-project/pull/76260) and compiler-rt changes (https://github.com/llvm/llvm-project/pull/76261) PR: https://github.com/llvm/llvm-project/pull/76259	2024-12-17 13:57:34 +00:00
Artem Pianykh	fbdbb13d5b	[NFC][Utils] Eliminate DISubprogram set from BuildDebugInfoMDMap (#118625 ) Summary: Previously, we'd add all SPs distinct from the cloned one into a set. Then when cloning a local scope we'd check if it's from one of those 'distinct' SPs by checking if it's in the set. We don't need to do that. We can just check against the cloned SP directly and drop the set. Test Plan: ninja check-llvm-unit check-llvm	2024-12-17 08:57:59 +00:00
Artem Pianykh	8402a0fab0	[NFC][Utils] Extract CloneFunctionBodyInto from CloneFunctionInto (#118624 ) Summary: This and previously extracted `CloneFunction*Into` functions will be used in later diffs. Test Plan: ninja check-llvm-unit check-llvm	2024-12-16 22:30:56 +00:00
Artem Pianykh	a9237b1a10	[NFC][Utils] Extract CloneFunctionMetadataInto from CloneFunctionInto (#118623 ) Summary: The new API expects the caller to populate the VMap. We need it this way for a subsequent change around coroutine cloning. Test Plan: ninja check-llvm-unit check-llvm	2024-12-16 20:50:05 +00:00
Vedant Paranjape	b21fa18b44	[LoopVersioning] Add a check to see if the input loop is in LCSSA form (#116443 ) Loop optimizations expect the input loop to be in LCSSA form, but LoopVersioning does not check whether the loop actually is in LCSSA form. As a result, if we give it a loop which is not in LCSSA form but is still semantically correct, the resulting transformation fails the verifier pass with the following error: `Instruction does not dominate all uses! %inc = add nsw i16 undef, 1 store i16 %inc, ptr @c, align 1`. Because the loop is not in LCSSA form, LoopVersioning's transformation produces invalid IR in which some instructions do not dominate all their uses. This patch checks whether a loop is in LCSSA form; if not, it calls formLCSSARecursively on the loop before passing it to LoopVersioning. Fixes: #36998	2024-12-16 11:55:19 -05:00
David Green	0032c151dc	[SROA] Optimize reloaded values in allocas that escape into readonly nocapture calls. (#116645 ) Given an alloca that potentially has many uses in big complex code and escapes into a call that is readonly+nocapture, we cannot easily split up the alloca. There are several optimizations that will attempt to take a value that is stored and a reload, and replace the load with the original stored value. Instcombine has some simple heuristics, GVN can sometimes do it, as can CSE in limited situations. They all suffer from the same issue with complex code - they start from a load/store and need to prove no-alias for all code between, which in complex cases might be a lot to look through. Especially if the ptr is an alloca with many uses that is over the normal escape capture limits. The pass that does do well with allocas is SROA, as it has a complete view of all of the uses. This patch adds a case to SROA where it can detect allocas that are passed into calls that are no-capture readonly. It can then optimize the reloaded values inside the alloca slice with the stored value knowing that it is valid no matter the location of the loads/stores from the no-escaping nature of the alloca.	2024-12-14 18:07:21 +00:00
Florian Hahn	c4a78b6fe3	[SimplifyCFG] Always allow hoisting if all instructions match. (#97158 ) Generalize hoistCommonCodeFromSuccessors's `EqTermsOnly` to `AllInstsEqOnly` and always allow hoisting if all instructions match. In that case, all instructions can be hoisted and the original branch will be replaced and selects for PHIs are added. This allows preserving metadata in more cases, using the existing hoisting logic, whereas previously FoldTwoEntryPHINode would drop the metadata. https://llvm-compile-time-tracker.com/compare.php?from=716360367fbdabac2c374c19b8746f4de49a5599&to=986b2c47df516b31d998c055400e4f62aa76edc6&stat=instructions:u PR: https://github.com/llvm/llvm-project/pull/97158	2024-12-13 21:26:27 +00:00
Ramkumar Ramachandra	4a0d53a0b0	PatternMatch: migrate to CmpPredicate (#118534 ) With the introduction of CmpPredicate in 51a895a (IR: introduce struct with CmpInst::Predicate and samesign), PatternMatch is one of the first key pieces of infrastructure that must be updated to match a CmpInst respecting samesign information. Implement this change to Cmp-matchers. This is a preparatory step in migrating the codebase over to CmpPredicate. Since no functional changes are desired at this stage, we have chosen not to migrate CmpPredicate::operator==(CmpPredicate) calls to use CmpPredicate::getMatching(), as that would have a visible impact on tests that are not yet written: instead, we call CmpPredicate::operator==(Predicate), preserving the old behavior, while also inserting a few FIXME comments for follow-ups.	2024-12-13 14:18:33 +00:00
Antonio Frighetto	d26df32255	[SimplifyCFG] Consider preds to switch in `simplifyDuplicateSwitchArms` Allow a duplicate basic block with multiple predecessors to the jump table to be simplified, by considering that the same basic block may appear in more switch cases.	2024-12-13 09:07:24 +01:00
Kirill Stoimenov	e3676aa21f	Revert "[SROA] Optimize reloaded values in allocas that escape into readonly nocapture calls. (#116645 )" Causing buffer overflow: SUMMARY: AddressSanitizer: heap-buffer-overflow llvm/lib/Transforms/Scalar/SROA.cpp:5552:35 This reverts commit 5e247d726d7a54cf0acc997bc17b50e7494e6fa3.	2024-12-12 21:32:35 +00:00
David Green	5e247d726d	[SROA] Optimize reloaded values in allocas that escape into readonly nocapture calls. (#116645 ) Given an alloca that potentially has many uses in big complex code and escapes into a call that is readonly+nocapture, we cannot easily split up the alloca. There are several optimizations that will attempt to take a value that is stored and a reload, and replace the load with the original stored value. Instcombine has some simple heuristics, GVN can sometimes do it, as can CSE in limited situations. They all suffer from the same issue with complex code - they start from a load/store and need to prove no-alias for all code between, which in complex cases might be a lot to look through. Especially if the ptr is an alloca with many uses that is over the normal escape capture limits. The pass that does do well with allocas is SROA, as it has a complete view of all of the uses. This patch adds a case to SROA where it can detect allocas that are passed into calls that are no-capture readonly. It can then optimize the reloaded values inside the alloca slice with the stored value knowing that it is valid no matter the location of the loads/stores from the no-escaping nature of the alloca.	2024-12-12 10:27:27 +00:00
Nikita Popov	5013c81b78	[GlobalOpt][Evaluator] Don't evaluate calls with signature mismatch (#119548 ) The global ctor evaluator tries to evaluate function calls where the call function type and function type do not match, by performing bitcasts. This currently causes a crash when calling a void function with non-void return type. I've opted to remove this functionality entirely rather than fixing this specific case. With opaque pointers, there shouldn't be a legitimate use case for this anymore, as we don't need to look through pointer type casts. Doing other bitcasts is very iffy because it ignores ABI considerations. We should at least leave adjusting the signatures to make them line up to InstCombine (which also does some iffy things, but is at least somewhat more constrained). Fixes https://github.com/llvm/llvm-project/issues/118725.	2024-12-12 10:44:52 +01:00
Mel Chen	b3cba9be41	[LoopVectorize] Vectorize select-cmp reduction pattern for increasing integer induction variable (#67812 ) Consider the following loop: ``` int rdx = init; for (int i = 0; i < n; ++i) rdx = (a[i] > b[i]) ? i : rdx; ``` We can vectorize this loop if `i` is an increasing induction variable. The final reduced value will be the maximum `i` for which the condition `a[i] > b[i]` is satisfied, or the start value `init` if it is never satisfied. This patch adds new RecurKind enums: IFindLastIV and FFindLastIV. --------- Co-authored-by: Alexey Bataev <5361294+alexey-bataev@users.noreply.github.com>	2024-12-12 16:48:31 +08:00
Owen Anderson	22f0ebb19c	TargetLibraryInfo: Use pointer index size to determine getSizeTSize(). (#118747 ) When using non-integral pointer types, such as on CHERI targets, size_t is equivalent to the index size, which is allowed to be smaller than the size of the pointer.	2024-12-12 15:45:44 +13:00
Owen Anderson	ab15976173	CallPromotionUtils: Correctly use IndexSize when determining the bit width of pointer offsets. (#119483 ) This reapplies #119138 with a defensive fix for the assertion failure when building libcxx. Unfortunately the failure does not reproduce on my machine, so I am not able to extract a test case. The key insight for the fix comes from Jessica Clarke, who observes that `VTablePtr` may, in fact, not be a pointer on return from `FindAvailableLoadedValue`. Co-authored-by: Alexander Richardson <alexander.richardson@cl.cam.ac.uk>	2024-12-11 16:49:48 +13:00
Owen Anderson	9b6bb83860	Revert "CallPromotionUtils: Correctly use IndexSize when determining the bit width of pointer offsets. (#119138 )" Reverting due to ASAN bootstrap failures. This reverts commit 4027e2f248044d944aaf3d9bc9c8eb6928506d44.	2024-12-11 13:20:17 +13:00
Owen Anderson	4027e2f248	CallPromotionUtils: Correctly use IndexSize when determining the bit width of pointer offsets. (#119138 ) Co-authored-by: Alexander Richardson <alexander.richardson@cl.cam.ac.uk>	2024-12-11 12:43:40 +13:00
Pedro Lobo	d7c12ea29e	[LoopRotate] Use `poison` instead of `undef` as placeholder in debug info [NFC] (#119135 ) The `poison` values are used to substitute debug information of values moved from the original header into the preheader that are no longer available in the former.	2024-12-10 15:06:48 +00:00
Artem Pianykh	eadc0c901b	[NFC][Utils] Extract BuildDebugInfoMDMap from CloneFunctionInto (#118622 ) Summary: Extract the logic to build up a metadata map to use in metadata cloning into a separate function. Test Plan: ninja check-llvm-unit check-llvm	2024-12-10 17:10:22 +09:00
Artem Pianykh	e529681ad5	[NFC][Utils] Clone basic blocks after we're done with metadata in CloneFunctionInto (#118621 ) Summary: Moving the cloning of BBs after the metadata makes the flow of the function a bit more straightforward and makes it easier to extract more into helper functions. Test Plan: ninja check-llvm-unit check-llvm	2024-12-09 21:40:04 +09:00
Artem Pianykh	a202a35e79	[NFC][Utils] Remove DebugInfoFinder parameter from CloneBasicBlock (#118620 ) Summary: There was a single usage of CloneBasicBlock with non-default DebugInfoFinder inside CloneFunctionInto, which has been refactored to be more focused. Test Plan: ninja check-llvm-unit check-llvm	2024-12-06 21:41:29 +09:00
Nikita Popov	9a24f2198e	[MergeFuncs] Handle ConstantRangeList attributes Support comparison of ConstantRangeList attributes in FunctionComparator.	2024-12-06 12:21:45 +01:00
Akshat Oke	49abcd207f	[CodeGen][PM] Initialize analyses with isAnalysis=true (#118779 ) Analyses should be marked as analyses. Otherwise they are prone to get ignored by the legacy analysis cache mechanism and get scheduled redundantly.	2024-12-06 15:25:54 +05:30
Nikita Popov	b569ec6de6	[SCCP] Infer nuw for gep nusw with non-negative offsets (#118819 ) If the GEP is nusw/inbounds and has all-non-negative offsets infer nuw as well. This doesn't have measurable compile-time impact. Proof: https://alive2.llvm.org/ce/z/ihztLy	2024-12-06 09:52:32 +01:00
Owen Anderson	cfa582e8aa	SimplifyLibCalls: Use default globals address space when building new global strings. (#118729 ) Writing a test for this transitively exposed a number of places in BuildLibCalls where we were failing to propagate address spaces properly, which are additionally fixed.	2024-12-06 10:51:14 +13:00
Florian Hahn	4226e0a0c7	[TTI] Add SCEVExpansionBudget to loop unrolling options. (#118316 ) Add an extra knob to UnrollingPreferences to let backends control the maximum budget for SCEV expansions. This gives backends more fine-grained control over the cost of the runtime checks for runtime unrolling. PR: https://github.com/llvm/llvm-project/pull/118316	2024-12-02 21:35:00 +00:00
AdityaK	39601a6e54	Bail out jump threading on indirect branches only (#117778 ) Remove check for PHI in pred as pointed out in #103688. Reduced the testcase to remove redundant phi in pred. Fixes: #102351	2024-11-26 14:57:28 -08:00
Florian Hahn	46a08579f2	[Local] Only intersect alias.scope,noalias & parallel_loop if inst moves (#117716 ) Preserve !alias.scope, !noalias and !mem.parallel_loop_access metadata on the replacement instruction, if it does not move. In that case, the program would be UB, if the aliasing property encoded in the metadata does not hold. This makes use of the clarification re aliasing metadata implying UB if the property does not hold: #116220 Same as #115868, but for !alias.scope, !noalias and !mem.parallel_loop_access. PR: https://github.com/llvm/llvm-project/pull/117716	2024-11-26 20:39:53 +00:00
Matt Arsenault	4028bb10c3	Local: Handle noalias_addrspace in combineMetadata (#103938 ) This should act like range. Previously ConstantRangeList assumed a 64-bit range. Now query from the actual entries. This also means that the empty range has no bitwidth, so move asserts to avoid checking the bitwidth of empty ranges.	2024-11-26 09:13:34 -05:00
David Green	18abc7e0c5	[PatternMatch] Introduce m_c_Select (#114328 ) This matches m_Select(m_Value(), L, R) or m_Select(m_Value(), R, L).	2024-11-25 13:47:23 +00:00
Phoebe Wang	2568e52a73	[X86,SimplifyCFG] Support hoisting load/store with conditional faulting (Part II) (#108812 ) This is a follow-up to #96878 that supports hoisting load/store from BBs that have the same predecessor, if the load/store are the only instructions and the branch is unpredictable, e.g.: ``` void test (int a, int c, int d) { if (a) c = a; else d = a; } ```	2024-11-25 15:19:28 +08:00
Jay Foad	d6fc7d3ab1	Fix typo "intead"	2024-11-21 14:48:38 +00:00
Artem Pianykh	f5002a0fae	[Utils] Extract CollectDebugInfoForCloning from CloneFunctionInto (#114537 ) Summary: Consolidate the logic in a single function. We do an extra pass over Instructions but this is necessary to untangle things and extract metadata cloning in a future diff. Test Plan: ``` $ ninja check-llvm-unit check-llvm [211/213] Running the LLVM regression tests Testing Time: 106.06s Total Discovered Tests: 62601 Skipped : 17 (0.03%) Unsupported : 2518 (4.02%) Passed : 59911 (95.70%) Expectedly Failed: 155 (0.25%) [212/213] Running lit suite Testing Time: 12.47s Total Discovered Tests: 8474 Skipped: 17 (0.20%) Passed : 8457 (99.80%) ``` Extracted from #109032 (commit 3) (there are more refactors and cleanups in subsequent commits)	2024-11-20 23:36:55 +00:00
Florian Hahn	0bb1b68330	[Local] Only intersect tbaa metadata if instr moves. (#116682 ) Preserve tbaa metadata on the replacement instruction, if it does not move. In that case, the program would be UB, if the aliasing property encoded in the metadata does not hold. This makes use of the clarification re tbaa metadata implying UB if the property does not hold: https://github.com/llvm/llvm-project/pull/116220 Same as https://github.com/llvm/llvm-project/pull/115868, but for !tbaa PR: https://github.com/llvm/llvm-project/pull/116682	2024-11-20 19:31:16 +00:00
Florian Hahn	076513646c	[Local] Only intersect llvm.access.group metadata if instr moves. (#115868 ) Preserve llvm.access.group metadata on the replacement instruction, if it does not move. In that case, the program would be UB, if the parallel property encoded in the metadata does not hold. This matches the LangRef recently updated in #116220 PR https://github.com/llvm/llvm-project/pull/115868	2024-11-19 22:01:16 +00:00
Stephen Tozer	2188a56a75	[DebugInfo][SimplifyCFG] Fully propagate merged invoke DILocations (#114235 ) Currently when we merge invokes as part of SimplifyCFG we apply a merge of the invoke DILocations to the merged invoke. We also insert an unconditional branch to the merged invoke at the positions previously occupied by the original invokes; as this branch is part of the substitution for the invoke it has replaced, we should propagate the original invoke DebugLoc to it.	2024-11-15 17:20:55 +00:00
Alex Bradbury	298127dcbe	Reapply [IR] Initial introduction of llvm.experimental.memset_pattern (#97583 ) Relands 7ff3a9acd84654c9ec2939f45ba27f162ae7fbc3 after regenerating the test case. Supersedes the draft PR #94992, taking a different approach following feedback: * Lower in PreISelIntrinsicLowering * Don't require that the number of bytes to set is a compile-time constant * Define llvm.memset_pattern rather than llvm.memset_pattern.inline As discussed in the [RFC thread](https://discourse.llvm.org/t/rfc-introducing-an-llvm-memset-pattern-inline-intrinsic/79496), the intent is that the intrinsic will be lowered to loops, a sequence of stores, or libcalls depending on the expected cost and availability of libcalls on the target. Right now, there's just a single lowering path that aims to handle all cases. My intent would be to follow up with additional PRs that add additional optimisations when possible (e.g. when libcalls are available, when arguments are known to be constant etc).	2024-11-15 15:21:39 +00:00
Alex Bradbury	0fb8fac5d6	Revert "[IR] Initial introduction of llvm.experimental.memset_pattern (#97583 )" This reverts commit 7ff3a9acd84654c9ec2939f45ba27f162ae7fbc3. Recent scheduling changes mean tests need to be re-generated. Reverting to green while I do that.	2024-11-15 14:48:32 +00:00
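
A note on the select-cmp reduction from commit b3cba9be41 above: the message says the result is the maximum `i` for which `a[i] > b[i]` holds, or `init` otherwise. The sketch below is only an illustration of why that makes the loop a reduction; it is not the vectorizer's implementation. The function names and the `INT_MIN` sentinel are assumptions made for this example, and the log does not say which sentinel or lowering the patch actually uses.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Scalar reference: the loop quoted in the commit message.
int findLastScalar(const std::vector<int> &a, const std::vector<int> &b,
                   int init) {
  assert(a.size() == b.size());
  int rdx = init;
  for (std::size_t i = 0; i < a.size(); ++i)
    rdx = (a[i] > b[i]) ? static_cast<int>(i) : rdx;
  return rdx;
}

// Same result expressed as a max-reduction over the induction variable.
// Because i only increases, "last matching i" equals "largest matching i",
// and max is associative, so partial results can be combined in any order.
int findLastAsMax(const std::vector<int> &a, const std::vector<int> &b,
                  int init) {
  assert(a.size() == b.size());
  const int kSentinel = INT_MIN; // hypothetical "no match yet" marker
  int maxIdx = kSentinel;
  for (std::size_t i = 0; i < a.size(); ++i)
    maxIdx = std::max(maxIdx, (a[i] > b[i]) ? static_cast<int>(i) : kSentinel);
  // Fall back to the start value if the condition never held.
  return (maxIdx == kSentinel) ? init : maxIdx;
}
```

Both functions agree on any pair of equal-length inputs; the second form is the one whose per-lane partial maxima could be merged at the end of a vector loop.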


7663 Commits
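
A note on the TySan commit (a487b792e2) above: its message describes the inline "fast path" check on shadow memory in prose. The following is a toy model of that description, not code from LLVM or compiler-rt. All names (`checkAccess`, `CheckResult`, `kUnknown`, `kInterior`) are hypothetical, the shadow is modelled as one slot per data byte in a plain vector, and type descriptors are assumed to be any value other than 0 and -1. Where the message is ambiguous, this sketch assumes the shadow is left untouched whenever the runtime has to be called.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::intptr_t kUnknown = 0;   // shadow value for "type unknown"
constexpr std::intptr_t kInterior = -1; // shadow value for a non-first byte

enum class CheckResult { FastPathOk, CallRuntime };

// Models the inline check for an access of `size` bytes at byte offset
// `addr` with type descriptor `typeDesc`.
CheckResult checkAccess(std::vector<std::intptr_t> &shadow, std::size_t addr,
                        std::size_t size, std::intptr_t typeDesc) {
  assert(size > 0 && addr + size <= shadow.size());
  // True if every shadow slot after the first one holds `v`.
  auto restIs = [&](std::intptr_t v) {
    for (std::size_t i = 1; i < size; ++i)
      if (shadow[addr + i] != v)
        return false;
    return true;
  };
  // Exact match: the remaining bytes must all be marked interior.
  if (shadow[addr] == typeDesc)
    return restIs(kInterior) ? CheckResult::FastPathOk
                             : CheckResult::CallRuntime;
  // Unknown: the remaining bytes must also be unknown; then record the type.
  if (shadow[addr] == kUnknown) {
    if (!restIs(kUnknown))
      return CheckResult::CallRuntime;
    shadow[addr] = typeDesc;
    for (std::size_t i = 1; i < size; ++i)
      shadow[addr + i] = kInterior;
    return CheckResult::FastPathOk;
  }
  // Neither an exact match nor unknown: defer to the runtime.
  return CheckResult::CallRuntime;
}
```

In this model, a first 4-byte access to untyped memory records the type, a repeat access with the same descriptor stays on the fast path, and an access with a different descriptor or at an interior offset falls through to the runtime, which per the commit message applies the full TBAA rules.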