llvm-project

Author	SHA1	Message	Date
Abhay Kanhere	cc246d4a29	[Transforms][CodeExtraction] bug fix regions with stackrestore (#118564 ) Ensure code extraction for outlining to a function does not create a function with stacksave of caller to restore stack (e.g. tail call).	2024-12-19 09:19:11 -07:00
Florian Hahn	a487b792e2	[TySan] Add initial Type Sanitizer (LLVM) (#76259 ) This patch introduces the LLVM components of a type sanitizer: a sanitizer for type-based aliasing violations. It is based on Hal Finkel's https://reviews.llvm.org/D32198. C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit these given TBAA metadata added by Clang. Roughly, a pointer of given type cannot be used to access an object of a different type (with, of course, certain exceptions). Unfortunately, there's a lot of code in the wild that violates these rules (e.g. for type punning), and such code often must be built with -fno-strict-aliasing. Performance is often sacrificed as a result. Part of the problem is the difficulty of finding TBAA violations. Hopefully, this sanitizer will help. For each TBAA type-access descriptor, encoded in LLVM's IR using metadata, the corresponding instrumentation pass generates descriptor tables. Thus, for each type (and access descriptor), we have a unique pointer representation. Excepting anonymous-namespace types, these tables are comdat, so the pointer values should be unique across the program. The descriptors refer to other descriptors to form a type aliasing tree (just like LLVM's TBAA metadata does). The instrumentation handles the "fast path" (where the types match exactly and no partial-overlaps are detected), and defers to the runtime to handle all of the more-complicated cases. The runtime, of course, is also responsible for reporting errors when those are detected. The runtime uses essentially the same shadow memory region as tsan, and we use 8 bytes of shadow memory, the size of the pointer to the type descriptor, for every byte of accessed data in the program. The value 0 is used to represent an unknown type. The value -1 is used to represent an interior byte (a byte that is part of a type, but not the first byte). 
The instrumentation first checks for an exact match between the type of the current access and the type for that address recorded in the shadow memory. If it matches, it then checks the shadow for the remainder of the bytes in the type to make sure that they're all -1. If not, we call the runtime. If the exact match fails, we next check if the value is 0 (i.e. unknown). If it is, then we check the shadow for the remainder of the bytes in the type (to make sure they're all 0). If they're not, we call the runtime. We then set the shadow for the access address and set the shadow for the remaining bytes in the type to -1 (i.e. marking them as interior bytes). If the type indicated by the shadow memory for the access address is neither an exact match nor 0, we call the runtime. The instrumentation pass inserts calls to the memset intrinsic to set the memory updated by memset, memcpy, and memmove, as well as allocas/byval (and for lifetime.start/end) to reset the shadow memory to reflect that the type is now unknown. The runtime intercepts memset, memcpy, etc. to perform the same function for the library calls. The runtime essentially repeats these checks, but uses the full TBAA algorithm, just as the compiler does, to determine when two types are permitted to alias. In a situation where access overlap has occurred and aliasing is not permitted, an error is generated. Clang's TBAA representation currently has a problem representing unions, as demonstrated by the one XFAIL'd test in the runtime patch. We'll update the TBAA representation to fix this, and at the same time, update the sanitizer. When the sanitizer is active, we disable actually using the TBAA metadata for AA. This way we're less likely to use TBAA to remove memory accesses that we'd like to verify. As a note, this implementation does not use the compressed shadow-memory scheme discussed previously (http://lists.llvm.org/pipermail/llvm-dev/2017-April/111766.html). 
That scheme would not handle the struct-path (i.e. structure offset) information that our TBAA represents. I expect we'll want to further work on compressing the shadow-memory representation, but I think it makes sense to do that as follow-up work. It goes together with the corresponding clang changes (https://github.com/llvm/llvm-project/pull/76260) and compiler-rt changes (https://github.com/llvm/llvm-project/pull/76261) PR: https://github.com/llvm/llvm-project/pull/76259	2024-12-17 13:57:34 +00:00
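The fast-path check described in the TySan message above can be modeled roughly as follows. This is a minimal sketch, not the actual instrumentation: `checkAccess`, `Outcome`, and the constants are hypothetical names, the shadow is a plain vector with one slot per application byte, and type descriptors are modeled as integers. The message leaves it open whether the shadow is still updated after a failed "all unknown" check; the sketch assumes it is not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical names throughout; this models the checks described in the
// commit message, not the real TySan instrumentation or runtime.
constexpr std::intptr_t kUnknown = 0;    // shadow value for "unknown type"
constexpr std::intptr_t kInterior = -1;  // shadow value for "interior byte"

enum class Outcome { FastPathOk, RecordedType, CallRuntime };

// One shadow slot per byte of application memory. Each slot holds a
// type-descriptor pointer (modeled as an integer), 0, or -1.
Outcome checkAccess(std::vector<std::intptr_t> &shadow, std::size_t addr,
                    std::size_t size, std::intptr_t desc) {
  assert(size > 0 && addr + size <= shadow.size());
  assert(desc != kUnknown && desc != kInterior);

  if (shadow[addr] == desc) {
    // Exact match: the remaining bytes must all be interior bytes.
    for (std::size_t i = 1; i < size; ++i)
      if (shadow[addr + i] != kInterior)
        return Outcome::CallRuntime;
    return Outcome::FastPathOk;
  }

  if (shadow[addr] == kUnknown) {
    // Unknown type: the remaining bytes must be unknown as well.
    for (std::size_t i = 1; i < size; ++i)
      if (shadow[addr + i] != kUnknown)
        return Outcome::CallRuntime;
    // Record the type of this access and mark the rest as interior bytes.
    shadow[addr] = desc;
    for (std::size_t i = 1; i < size; ++i)
      shadow[addr + i] = kInterior;
    return Outcome::RecordedType;
  }

  // Neither an exact match nor unknown: defer to the runtime.
  return Outcome::CallRuntime;
}
```

In this model, the first access to untyped memory records the type, a repeated access with the same descriptor takes the fast path, and any access with a different descriptor or a misaligned start falls through to the runtime.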
Artem Pianykh	fbdbb13d5b	[NFC][Utils] Eliminate DISubprogram set from BuildDebugInfoMDMap (#118625 ) Summary: Previously, we'd add all SPs distinct from the cloned one into a set. Then when cloning a local scope we'd check if it's from one of those 'distinct' SPs by checking if it's in the set. We don't need to do that. We can just check against the cloned SP directly and drop the set. Test Plan: ninja check-llvm-unit check-llvm	2024-12-17 08:57:59 +00:00
Artem Pianykh	8402a0fab0	[NFC][Utils] Extract CloneFunctionBodyInto from CloneFunctionInto (#118624 ) Summary: This and previously extracted `CloneFunction*Into` functions will be used in later diffs. Test Plan: ninja check-llvm-unit check-llvm	2024-12-16 22:30:56 +00:00
Artem Pianykh	a9237b1a10	[NFC][Utils] Extract CloneFunctionMetadataInto from CloneFunctionInto (#118623 ) Summary: The new API expects the caller to populate the VMap. We need it this way for a subsequent change around coroutine cloning. Test Plan: ninja check-llvm-unit check-llvm	2024-12-16 20:50:05 +00:00
Vedant Paranjape	b21fa18b44	[LoopVersioning] Add a check to see if the input loop is in LCSSA form (#116443 ) Loop Optimizations expect the input loop to be in LCSSA form. But it seems that LoopVersioning doesn't have any check to see if the loop is actually in LCSSA form. As a result, if we give it a loop which is not in LCSSA form but still correct semantically, the resulting transformation fails the verifier pass with the following error. Instruction does not dominate all uses! %inc = add nsw i16 undef, 1 store i16 %inc, ptr @c, align 1 As the loop is not in LCSSA form, LoopVersioning's transformations lead to invalid IR, as some instructions do not dominate all their uses. This patch checks if a loop is in LCSSA form; if not, it will call formLCSSARecursively on the loop before passing it to LoopVersioning. Fixes: #36998	2024-12-16 11:55:19 -05:00
David Green	0032c151dc	[SROA] Optimize reloaded values in allocas that escape into readonly nocapture calls. (#116645 ) Given an alloca that potentially has many uses in big complex code and escapes into a call that is readonly+nocapture, we cannot easily split up the alloca. There are several optimizations that will attempt to take a value that is stored and a reload, and replace the load with the original stored value. Instcombine has some simple heuristics, GVN can sometimes do it, as can CSE in limited situations. They all suffer from the same issue with complex code - they start from a load/store and need to prove no-alias for all code between, which in complex cases might be a lot to look through. Especially if the ptr is an alloca with many uses that is over the normal escape capture limits. The pass that does do well with allocas is SROA, as it has a complete view of all of the uses. This patch adds a case to SROA where it can detect allocas that are passed into calls that are no-capture readonly. It can then optimize the reloaded values inside the alloca slice with the stored value knowing that it is valid no matter the location of the loads/stores from the no-escaping nature of the alloca.	2024-12-14 18:07:21 +00:00
Florian Hahn	c4a78b6fe3	[SimplifyCFG] Always allow hoisting if all instructions match. (#97158 ) Generalize hoistCommonCodeFromSuccessors's `EqTermsOnly` to `AllInstsEqOnly` and always allow hoisting if all instructions match. In that case, all instructions can be hoisted and the original branch will be replaced and selects for PHIs are added. This allows preserving metadata in more cases, using the existing hoisting logic, whereas previously FoldTwoEntryPHINode would drop the metadata. https://llvm-compile-time-tracker.com/compare.php?from=716360367fbdabac2c374c19b8746f4de49a5599&to=986b2c47df516b31d998c055400e4f62aa76edc6&stat=instructions:u PR: https://github.com/llvm/llvm-project/pull/97158	2024-12-13 21:26:27 +00:00
Ramkumar Ramachandra	4a0d53a0b0	PatternMatch: migrate to CmpPredicate (#118534 ) With the introduction of CmpPredicate in 51a895a (IR: introduce struct with CmpInst::Predicate and samesign), PatternMatch is one of the first key pieces of infrastructure that must be updated to match a CmpInst respecting samesign information. Implement this change to Cmp-matchers. This is a preparatory step in migrating the codebase over to CmpPredicate. Since no functional changes are desired at this stage, we have chosen not to migrate CmpPredicate::operator==(CmpPredicate) calls to use CmpPredicate::getMatching(), as that would have visible impact on tests that are not yet written: instead, we call CmpPredicate::operator==(Predicate), preserving the old behavior, while also inserting a few FIXME comments for follow-ups.	2024-12-13 14:18:33 +00:00
Antonio Frighetto	d26df32255	[SimplifyCFG] Consider preds to switch in `simplifyDuplicateSwitchArms` Allow a duplicate basic block with multiple predecessors to the jump table to be simplified, by considering that the same basic block may appear in more switch cases.	2024-12-13 09:07:24 +01:00
Kirill Stoimenov	e3676aa21f	Revert "[SROA] Optimize reloaded values in allocas that escape into readonly nocapture calls. (#116645 )" Causing buffer overflow: SUMMARY: AddressSanitizer: heap-buffer-overflow llvm/lib/Transforms/Scalar/SROA.cpp:5552:35 This reverts commit 5e247d726d7a54cf0acc997bc17b50e7494e6fa3.	2024-12-12 21:32:35 +00:00
David Green	5e247d726d	[SROA] Optimize reloaded values in allocas that escape into readonly nocapture calls. (#116645 ) Given an alloca that potentially has many uses in big complex code and escapes into a call that is readonly+nocapture, we cannot easily split up the alloca. There are several optimizations that will attempt to take a value that is stored and a reload, and replace the load with the original stored value. Instcombine has some simple heuristics, GVN can sometimes do it, as can CSE in limited situations. They all suffer from the same issue with complex code - they start from a load/store and need to prove no-alias for all code between, which in complex cases might be a lot to look through. Especially if the ptr is an alloca with many uses that is over the normal escape capture limits. The pass that does do well with allocas is SROA, as it has a complete view of all of the uses. This patch adds a case to SROA where it can detect allocas that are passed into calls that are no-capture readonly. It can then optimize the reloaded values inside the alloca slice with the stored value knowing that it is valid no matter the location of the loads/stores from the no-escaping nature of the alloca.	2024-12-12 10:27:27 +00:00
Nikita Popov	5013c81b78	[GlobalOpt][Evaluator] Don't evaluate calls with signature mismatch (#119548 ) The global ctor evaluator tries to evaluate function calls where the call function type and function type do not match, by performing bitcasts. This currently causes a crash when calling a void function with non-void return type. I've opted to remove this functionality entirely rather than fixing this specific case. With opaque pointers, there shouldn't be a legitimate use case for this anymore, as we don't need to look through pointer type casts. Doing other bitcasts is very iffy because it ignores ABI considerations. We should at least leave adjusting the signatures to make them line up to InstCombine (which also does some iffy things, but is at least somewhat more constrained). Fixes https://github.com/llvm/llvm-project/issues/118725.	2024-12-12 10:44:52 +01:00
Mel Chen	b3cba9be41	[LoopVectorize] Vectorize select-cmp reduction pattern for increasing integer induction variable (#67812 ) Consider the following loop: ``` int rdx = init; for (int i = 0; i < n; ++i) rdx = (a[i] > b[i]) ? i : rdx; ``` We can vectorize this loop if `i` is an increasing induction variable. The final reduced value will be the maximum `i` for which the condition `a[i] > b[i]` is satisfied, or the start value `init` if it is never satisfied. This patch added new RecurKind enums - IFindLastIV and FFindLastIV. --------- Co-authored-by: Alexey Bataev <5361294+alexey-bataev@users.noreply.github.com>	2024-12-12 16:48:31 +08:00
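To illustrate why the select-cmp pattern above reduces to a maximum, here is a rough lane-wise model in plain C++. It is a sketch under assumptions, not the LoopVectorize implementation: `findLastScalar`, `findLastLanes`, the vector factor of 4, and the `INT_MIN` sentinel are all hypothetical choices made for the illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <climits>
#include <vector>

// Scalar reference: the loop from the commit message.
int findLastScalar(const std::vector<int> &a, const std::vector<int> &b,
                   int init) {
  assert(a.size() == b.size());
  int rdx = init;
  for (int i = 0; i < static_cast<int>(a.size()); ++i)
    rdx = (a[i] > b[i]) ? i : rdx;
  return rdx;
}

// Lane-wise model (hypothetical): each lane remembers the last index at which
// the condition held, and the lanes are combined with a max reduction.
int findLastLanes(const std::vector<int> &a, const std::vector<int> &b,
                  int init) {
  assert(a.size() == b.size());
  constexpr int VF = 4;              // assumed vector factor
  constexpr int kSentinel = INT_MIN; // never a valid index, since i >= 0
  std::array<int, VF> lanes;
  lanes.fill(kSentinel);

  const int n = static_cast<int>(a.size());
  int i = 0;
  for (; i + VF <= n; i += VF)
    for (int l = 0; l < VF; ++l)
      if (a[i + l] > b[i + l])
        lanes[l] = i + l; // i only increases, so overwriting keeps the max

  int best = *std::max_element(lanes.begin(), lanes.end());

  // Scalar epilogue: remaining indices are larger than any seen so far.
  for (; i < n; ++i)
    if (a[i] > b[i])
      best = i;

  // If the condition never held, fall back to the start value.
  return best == kSentinel ? init : best;
}
```

The sentinel only works because the induction variable is increasing and non-negative in this model; that is the property the commit message relies on when it requires `i` to be an increasing induction variable.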
Owen Anderson	22f0ebb19c	TargetLibraryInfo: Use pointer index size to determine getSizeTSize(). (#118747 ) When using non-integral pointer types, such as on CHERI targets, size_t is equivalent to the index size, which is allowed to be smaller than the size of the pointer.	2024-12-12 15:45:44 +13:00
Owen Anderson	ab15976173	CallPromotionUtils: Correctly use IndexSize when determining the bit width of pointer offsets. (#119483 ) This reapplies #119138 with a defensive fix for the assertion failure when building libcxx. Unfortunately the failure does not reproduce on my machine, so I am not able to extract a test case. The key insight for the fix comes from Jessica Clarke, who observes that `VTablePtr` may, in fact, not be a pointer on return from `FindAvailableLoadedValue`. Co-authored-by: Alexander Richardson <alexander.richardson@cl.cam.ac.uk>	2024-12-11 16:49:48 +13:00
Owen Anderson	9b6bb83860	Revert "CallPromotionUtils: Correctly use IndexSize when determining the bit width of pointer offsets. (#119138 )" Reverting due to ASAN bootstrap failures. This reverts commit 4027e2f248044d944aaf3d9bc9c8eb6928506d44.	2024-12-11 13:20:17 +13:00
Owen Anderson	4027e2f248	CallPromotionUtils: Correctly use IndexSize when determining the bit width of pointer offsets. (#119138 ) Co-authored-by: Alexander Richardson <alexander.richardson@cl.cam.ac.uk>	2024-12-11 12:43:40 +13:00
Pedro Lobo	d7c12ea29e	[LoopRotate] Use `poison` instead of `undef` as placeholder in debug info [NFC] (#119135 ) The `poison` values are used to substitute debug information of values moved from the original header into the preheader that are no longer available in the former.	2024-12-10 15:06:48 +00:00
Artem Pianykh	eadc0c901b	[NFC][Utils] Extract BuildDebugInfoMDMap from CloneFunctionInto (#118622 ) Summary: Extract the logic to build up a metadata map to use in metadata cloning into a separate function. Test Plan: ninja check-llvm-unit check-llvm	2024-12-10 17:10:22 +09:00
Artem Pianykh	e529681ad5	[NFC][Utils] Clone basic blocks after we're done with metadata in CloneFunctionInto (#118621 ) Summary: Moving the cloning of BBs after the metadata makes the flow of the function a bit more straightforward and makes it easier to extract more into helper functions. Test Plan: ninja check-llvm-unit check-llvm	2024-12-09 21:40:04 +09:00
Artem Pianykh	a202a35e79	[NFC][Utils] Remove DebugInfoFinder parameter from CloneBasicBlock (#118620 ) Summary: There was a single usage of CloneBasicBlock with non-default DebugInfoFinder inside CloneFunctionInto which has been refactored in more focused. Test Plan: ninja check-llvm-unit check-llvm	2024-12-06 21:41:29 +09:00
Nikita Popov	9a24f2198e	[MergeFuncs] Handle ConstantRangeList attributes Support comparison of ConstantRangeList attributes in FunctionComparator.	2024-12-06 12:21:45 +01:00
Akshat Oke	49abcd207f	[CodeGen][PM] Initialize analyses with isAnalysis=true (#118779 ) Analyses should be marked as analyses. Otherwise they are prone to get ignored by the legacy analysis cache mechanism and get scheduled redundantly.	2024-12-06 15:25:54 +05:30
Nikita Popov	b569ec6de6	[SCCP] Infer nuw for gep nusw with non-negative offsets (#118819 ) If the GEP is nusw/inbounds and has all-non-negative offsets infer nuw as well. This doesn't have measurable compile-time impact. Proof: https://alive2.llvm.org/ce/z/ihztLy	2024-12-06 09:52:32 +01:00
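The inference above can be sanity-checked on a small model. The following sketch is not SCCP code and does not replace the Alive2 proof linked in the message; it is a simplified 8-bit model of a GEP with a single offset, and the function names are hypothetical. It treats `nusw` as "unsigned base plus signed offset stays in range" and `nuw` as "unsigned base plus unsigned offset stays in range".

```cpp
#include <cassert>

// Simplified 8-bit model with hypothetical names. `base` is an unsigned
// address in [0, 255]; `offset` is a signed offset in [-128, 127].

// nusw-style condition: base (unsigned) + offset (signed) stays in [0, 255].
bool noUnsignedSignedWrap(int base, int offset) {
  assert(base >= 0 && base <= 255);
  assert(offset >= -128 && offset <= 127);
  const int sum = base + offset;
  return sum >= 0 && sum <= 255;
}

// nuw-style condition: base + offset, with the offset reinterpreted as an
// unsigned 8-bit value, stays in [0, 255].
bool noUnsignedWrap(int base, int offset) {
  assert(base >= 0 && base <= 255);
  assert(offset >= -128 && offset <= 127);
  const int unsignedOffset = offset & 0xFF;
  return base + unsignedOffset <= 255;
}

// Exhaustively checks that nusw implies nuw whenever the offset is
// non-negative, for every 8-bit base.
bool nuswImpliesNuwForNonNegativeOffsets() {
  for (int base = 0; base <= 255; ++base)
    for (int offset = 0; offset <= 127; ++offset)
      if (noUnsignedSignedWrap(base, offset) && !noUnsignedWrap(base, offset))
        return false;
  return true;
}
```

The non-negativity requirement matters: with `base = 10` and `offset = -1`, the nusw-style condition holds in this model while the nuw-style one does not.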
Owen Anderson	cfa582e8aa	SimplifyLibCalls: Use default globals address space when building new global strings. (#118729 ) Writing a test for this transitively exposed a number of places in BuildLibCalls where we were failing to propagate address spaces properly, which are additionally fixed.	2024-12-06 10:51:14 +13:00
Florian Hahn	4226e0a0c7	[TTI] Add SCEVExpansionBudget to loop unrolling options. (#118316 ) Add an extra knob to UnrollingPreferences to let backends control the maximum budget for SCEV expansions. This gives backends more fine-grained control on the cost of the runtime checks for runtime unrolling. PR: https://github.com/llvm/llvm-project/pull/118316	2024-12-02 21:35:00 +00:00
AdityaK	39601a6e54	Bail out jump threading on indirect branches only (#117778 ) Remove check for PHI in pred as pointed out in #103688. Reduced the testcase to remove redundant phi in pred. Fixes: #102351	2024-11-26 14:57:28 -08:00
Florian Hahn	46a08579f2	[Local] Only intersect alias.scope,noalias & parallel_loop if inst moves (#117716 ) Preserve !alias.scope, !noalias and !mem.parallel_loop_access metadata on the replacement instruction, if it does not move. In that case, the program would be UB, if the aliasing property encoded in the metadata does not hold. This makes use of the clarification re aliasing metadata implying UB if the property does not hold: #116220 Same as #115868, but for !alias.scope, !noalias and !mem.parallel_loop_access. PR: https://github.com/llvm/llvm-project/pull/117716	2024-11-26 20:39:53 +00:00
Matt Arsenault	4028bb10c3	Local: Handle noalias_addrspace in combineMetadata (#103938 ) This should act like range. Previously ConstantRangeList assumed a 64-bit range. Now query from the actual entries. This also means that the empty range has no bitwidth, so move asserts to avoid checking the bitwidth of empty ranges.	2024-11-26 09:13:34 -05:00
David Green	18abc7e0c5	[PatternMatch] Introduce m_c_Select (#114328 ) This matches m_Select(m_Value(), L, R) or m_Select(m_Value(), R, L).	2024-11-25 13:47:23 +00:00
Phoebe Wang	2568e52a73	[X86,SimplifyCFG] Support hoisting load/store with conditional faulting (Part II) (#108812 ) This is a follow up of #96878 to support hoisting load/store from BBs that have the same predecessor, if load/store are the only instructions and the branch is unpredictable, e.g.: ``` void test (int a, int c, int d) { if (a) c = a; else d = a; } ```	2024-11-25 15:19:28 +08:00
Jay Foad	d6fc7d3ab1	Fix typo "intead"	2024-11-21 14:48:38 +00:00
Artem Pianykh	f5002a0fae	[Utils] Extract CollectDebugInfoForCloning from CloneFunctionInto (#114537 ) Summary: Consolidate the logic in a single function. We do an extra pass over Instructions but this is necessary to untangle things and extract metadata cloning in a future diff. Test Plan: ``` $ ninja check-llvm-unit check-llvm [211/213] Running the LLVM regression tests Testing Time: 106.06s Total Discovered Tests: 62601 Skipped : 17 (0.03%) Unsupported : 2518 (4.02%) Passed : 59911 (95.70%) Expectedly Failed: 155 (0.25%) [212/213] Running lit suite Testing Time: 12.47s Total Discovered Tests: 8474 Skipped: 17 (0.20%) Passed : 8457 (99.80%) ``` Extracted from #109032 (commit 3) (there are more refactors and cleanups in subsequent commits)	2024-11-20 23:36:55 +00:00
Florian Hahn	0bb1b68330	[Local] Only intersect tbaa metadata if instr moves. (#116682 ) Preserve tbaa metadata on the replacement instruction, if it does not move. In that case, the program would be UB, if the aliasing property encoded in the metadata does not hold. This makes use of the clarification re tbaa metadata implying UB if the property does not hold: https://github.com/llvm/llvm-project/pull/116220 Same as https://github.com/llvm/llvm-project/pull/115868, but for !tbaa PR: https://github.com/llvm/llvm-project/pull/116682	2024-11-20 19:31:16 +00:00
Florian Hahn	076513646c	[Local] Only intersect llvm.access.group metadata if instr moves. (#115868 ) Preserve llvm.access.group metadata on the replacement instruction, if it does not move. In that case, the program would be UB, if the parallel property encoded in the metadata does not hold. This matches the LangRef recently updated in #116220 PR https://github.com/llvm/llvm-project/pull/115868	2024-11-19 22:01:16 +00:00
Stephen Tozer	2188a56a75	[DebugInfo][SimplifyCFG] Fully propagate merged invoke DILocations (#114235 ) Currently when we merge invokes as part of SimplifyCFG we apply a merge of the invoke DILocations to the merged invoke. We also insert an unconditional branch to the merged invoke at the positions previously occupied by the original invokes; as this branch is part of the substitution for the invoke it has replaced, we should propagate the original invoke DebugLoc to it.	2024-11-15 17:20:55 +00:00
Alex Bradbury	298127dcbe	Reapply [IR] Initial introduction of llvm.experimental.memset_pattern (#97583 ) Relands 7ff3a9acd84654c9ec2939f45ba27f162ae7fbc3 after regenerating the test case. Supersedes the draft PR #94992, taking a different approach following feedback: * Lower in PreISelIntrinsicLowering * Don't require that the number of bytes to set is a compile-time constant * Define llvm.memset_pattern rather than llvm.memset_pattern.inline As discussed in the [RFC thread](https://discourse.llvm.org/t/rfc-introducing-an-llvm-memset-pattern-inline-intrinsic/79496), the intent is that the intrinsic will be lowered to loops, a sequence of stores, or libcalls depending on the expected cost and availability of libcalls on the target. Right now, there's just a single lowering path that aims to handle all cases. My intent would be to follow up with additional PRs that add additional optimisations when possible (e.g. when libcalls are available, when arguments are known to be constant etc).	2024-11-15 15:21:39 +00:00
Alex Bradbury	0fb8fac5d6	Revert "[IR] Initial introduction of llvm.experimental.memset_pattern (#97583 )" This reverts commit 7ff3a9acd84654c9ec2939f45ba27f162ae7fbc3. Recent scheduling changes mean tests need to be re-generated. Reverting to green while I do that.	2024-11-15 14:48:32 +00:00
Michael Maitland	6b9952759f	[SimplifyCFG] Simplify switch instruction that has duplicate arms (#114262 ) I noticed that the two C functions emitted different IR: ``` int switch_duplicate_arms(int switch_val, int v, int w) { switch (switch_val) { default: break; case 0: w = v; break; case 1: w = v; break; } return w; } int if_duplicate_arms(int switch_val, int v, int w) { if (switch_val == 0) w = v; else if (switch_val == 1) w = v; return w; } ``` We generate IR that looks like this: ``` define i32 @switch_duplicate_arms(i32 %0, i32 %1, i32 %2, i32 %3) { switch i32 %1, label %7 [ i32 0, label %5 i32 1, label %6 ] 5: br label %7 6: br label %7 7: %8 = phi i32 [ %3, %4 ], [ %2, %6 ], [ %2, %5 ] ret i32 %8 } define i32 @if_duplicate_arms(i32 %0, i32 %1, i32 %2, i32 %3) { %5 = icmp ult i32 %1, 2 %6 = select i1 %5, i32 %2, i32 %3 ret i32 %6 } ``` For `switch_duplicate_arms`, taking case 0 and 1 are the same since %5 and %6 branch to the same location and the incoming values for %8 are the same from those blocks. We could remove one of the duplicate switch targets and update the switch with the single target. On RISC-V, prior to this patch, we generate the following code: ``` switch_duplicate_arms: li a4, 1 beq a1, a4, .LBB0_2 mv a0, a3 bnez a1, .LBB0_3 .LBB0_2: mv a0, a2 .LBB0_3: ret if_duplicate_arms: li a4, 2 mv a0, a2 bltu a1, a4, .LBB1_2 mv a0, a3 .LBB1_2: ret ``` After this patch, the O3 code is optimized to the icmp + select pair, which gives us the same code gen as `if_duplicate_arms`, as desired. This results in one fewer branch instruction in the final assembly. This may help with both code size and further switch simplification. I found that this patch causes no significant impact to spec2006/int/ref and spec2017/intrate/ref. --------- Co-authored-by: Min Hsu <min@myhsu.dev>	2024-11-15 15:38:34 +01:00
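The equivalence that the duplicate-arm simplification relies on can be written out at the source level. This is only an illustrative sketch: `switchDuplicateArms` mirrors the C function from the message, while `mergedArms` and `agreeOnRange` are hypothetical helpers that spell out the `icmp ult` + `select` form and compare the two over a range of inputs.

```cpp
#include <cassert>

// Source-level form with two duplicate arms, as in the commit message.
int switchDuplicateArms(int switchVal, int v, int w) {
  switch (switchVal) {
  default:
    break;
  case 0:
    w = v;
    break;
  case 1:
    w = v;
    break;
  }
  return w;
}

// Hypothetical hand-written counterpart of the `icmp ult ..., 2` + `select`
// pair: cases 0 and 1 collapse into a single unsigned range check.
int mergedArms(int switchVal, int v, int w) {
  return (static_cast<unsigned>(switchVal) < 2u) ? v : w;
}

// Checks that both forms agree for every switch value in [lo, hi].
bool agreeOnRange(int lo, int hi) {
  assert(lo <= hi);
  for (int s = lo; s <= hi; ++s)
    if (switchDuplicateArms(s, 7, 9) != mergedArms(s, 7, 9))
      return false;
  return true;
}
```

The unsigned comparison is what lets negative switch values fall through to the default: converted to `unsigned`, they become large values that fail the `< 2` test.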
Alex Bradbury	7ff3a9acd8	[IR] Initial introduction of llvm.experimental.memset_pattern (#97583 ) Supersedes the draft PR #94992, taking a different approach following feedback: * Lower in PreISelIntrinsicLowering * Don't require that the number of bytes to set is a compile-time constant * Define llvm.memset_pattern rather than llvm.memset_pattern.inline As discussed in the [RFC thread](https://discourse.llvm.org/t/rfc-introducing-an-llvm-memset-pattern-inline-intrinsic/79496), the intent is that the intrinsic will be lowered to loops, a sequence of stores, or libcalls depending on the expected cost and availability of libcalls on the target. Right now, there's just a single lowering path that aims to handle all cases. My intent would be to follow up with additional PRs that add additional optimisations when possible (e.g. when libcalls are available, when arguments are known to be constant etc).	2024-11-15 14:07:46 +00:00
Justin Fargnoli	2e9f8696e9	Reland "[LLVM] Add IRNormalizer Pass" (#113780 ) `IRNormalizer` will reorder instructions. Thus, we need to invalidate analyses. Done in cd500d28cba3177c213f2f2faf50f14ea56e230b. This should resolve the [BuildBot failure](https://github.com/llvm/llvm-project/pull/68176#issuecomment-2428243474). --- Original PR: #68176 Original commit: 1295d2e6da2fe90f3b770ab1d35bf5caecd38bed Reverted with: 8a12e0131f3d84b470fac63af042aa96a1b19f56 --- Add the llvm-canon tool. Description from the [original PR](https://reviews.llvm.org/D66029#change-wZv3yOpDdxIu): > Added a new llvm-canon tool which aims to transform LLVM Modules into a canonical form by reordering and renaming instructions while preserving the same semantics. This tool makes it easier to spot semantic differences while diffing two modules which have undergone different transformation passes. The current version of this tool can: - Reorder instructions within a function. - Rename instructions based on the operands. - Sort commutative operands. This code was originally written by @michalpaszkowski and [submitted to mainline LLVM](`14d358537f`). However, it was quickly [reverted](`335de55fa3`) due to BuildBot errors. Michal presented his version of the tool in [LLVM-Canon: Shooting for Clear Diffs](https://www.youtube.com/watch?v=c9WMijSOEUg). @AidanGoldfarb and I ported the code to the new pass manager, added more tests, and fixed some bugs related to PHI nodes that may have been the root cause of the BuildBot errors that caused the patch to be reverted. Additionally, we rewrote the implementation of instruction reordering to fix cases where the original algorithm would break use-def chains. Note that this is the first time @AidanGoldfarb and I are submitting to LLVM. Please liberally critique the PR! CC @plotfi for initial review. --------- Co-authored-by: Aidan <aidan.goldfarb@mail.mcgill.ca>	2024-11-14 09:56:22 -08:00
Haojian Wu	9b6b9d3903	Default initialize a pointer in CodeExtractor. This fixes msan failure after f6795e6b4f619cbecc59a92f7e5fad7ca90ece54	2024-11-14 11:38:02 +01:00
serge-sans-paille	dc4185fe2f	[TLI] Add support for reallocarray (#114818 ) reallocarray is available in glibc since 2.29 under _DEFAULT_SOURCE and under _GNU_SOURCE before, let's model it appropriately.	2024-11-13 20:57:29 +00:00
Michael Kruse	f6795e6b4f	[CodeExtractor] Refactor extractCodeRegion, fix alloca emission. (#114419 ) Reorganize the code into phases: * Analyze/normalize * Create extracted function prototype * Generate the new function's implementation * Generate call to new function * Connect call to original function's CFG The motivation is #114669 to optionally clone the selected code region into the new function instead of moving it. The current structure made it difficult to add such functionality since there was no obvious place to do so, not made easier by some functions doing more than their name suggests. For instance, constructFunction modifies code outside the constructed function, but also function properties such as setPersonalityFn are derived somewhere else. Another example is emitCallAndSwitchStatement, which despite its name also inserts stores for output parameters. Many operations also implicitly depend on the order they are applied which this patch tries to reduce. For instance, ExtractedFuncRetVals becomes the list of exit blocks, which also defines the return value when leaving via that block. It is computed early such that the new function's return instructions and the switch can be generated independently. Also, ExtractedFuncRetVals is combining the lists ExitBlocks and OldTargets which were not always kept consistent with each other or NumExitBlocks. The method recomputeExitBlocks() will update it when necessary. The coding style partially contradicts the current coding standard. For instance some local variables start with lower case letters. I updated some, but not all occurrences to make the diff match at least some lines as unchanged. The patch [D96854](https://reviews.llvm.org/D96854) introduced some confusion of function argument indexes; this is fixed here as well, hence the patch is not NFC anymore. Tested in modified CodeExtractorTest.cpp. 
Patch [D121061](https://reviews.llvm.org/D121061) introduced AllocationBlock, but not all allocas were inserted there. Effectively includes the following fixes: 1. `ce73b1672a` 2. `4aaa925786` 3. Missing allocas, still unfixed Originally submitted as https://reviews.llvm.org/D115218	2024-11-12 20:12:22 +01:00
Kazu Hirata	4048c64306	[llvm] Remove redundant control flow statements (NFC) (#115831 ) Identified with readability-redundant-control-flow.	2024-11-12 10:09:42 -08:00
Nikita Popov	6dc23b7009	[SCEVExpander] Don't try to reuse SCEVUnknown values (#115141 ) The expansion of a SCEVUnknown is trivial (it's just the wrapped value). If we try to reuse an existing value it might be a more complex expression that simplifies to the SCEVUnknown. This is inspired by https://github.com/llvm/llvm-project/issues/114879, because SCEVExpander replacing a constant with a phi node is just silly. (I don't consider this a fix for that issue though.)	2024-11-11 12:36:29 +01:00
Kazu Hirata	5b19ed8bb4	[llvm] Migrate away from PointerUnion::{is,get,dyn_cast} (NFC) (#115626 ) Note that PointerUnion::{is,get,dyn_cast} have been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T>	2024-11-10 07:24:06 -08:00
Harald van Dijk	ccaded2b1d	[Inliner] Prevent adding pointer attributes to non-pointer arguments (#115569 ) Fixes a crash seen after #114311	2024-11-09 16:17:16 +00:00
Stephen Tozer	92e0fb0c94	[DebugInfo][LoopUnroll] Preserve DebugLocs on optimized cond branches (#114225 ) This patch fixes a simple error where as part of loop unrolling we optimize conditional loop-exiting branches into unconditional branches when we know that they will or won't exit the loop, but do not propagate the source location of the original branch to the new one. Found using https://github.com/llvm/llvm-project/pull/107279.	2024-11-08 16:52:30 +00:00

7652 Commits