The check for ABI differences for inlined calls involves the caller, the
callee, and the nested callee. Before inlining, the ABI is determined by
the target features of the callee; after inlining, it is determined by
those of the caller. The features of the nested callee should never
actually matter.
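Roughly, the situation looks like this (illustrative IR only; the feature
names are arbitrary and not taken from the patch):
```llvm
define void @caller() #0 {
  call void @callee()
  ret void
}

define void @callee() #1 {
  ; Before inlining, this call is made from @callee, so its ABI follows
  ; @callee's target features; after inlining @callee into @caller, it
  ; follows @caller's features. @nested's own features never enter the check.
  call void @nested()
  ret void
}

declare void @nested() #1

attributes #0 = { "target-features"="+sse2" }
attributes #1 = { "target-features"="+sse2,+avx" }
```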
Add a default-off option to the inline cost calculation to always inline
all viable calls, regardless of the cost/benefit and cost/threshold
calculations.
For performance reasons, some users require that all calls be inlined.
Rather than forcing them to adjust the inlining threshold to an
arbitrarily high value, offer an option to inline all calls.
Call `recordInliningWithCalleeDeleted` before dropping the contents of
the Callee. Otherwise the handlers don't have access to, e.g., the
DebugLoc, which is why the Callee DebugLoc was missing from inlining
remarks for functions with internal linkage.
The test is the same as `optimization-remarks-passed-yaml.ll` except
that the function `foo` has internal linkage instead of external linkage.
This, along with IntrReadMem, means that the intrinsic only reads memory
through the given argument pointer and its derivatives. This allows passes
like the inliner to attach alias.scope metadata to the call instruction,
since they can see that no other memory is accessed.
Discovered via SWDEV-543741
---------
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
We'll remove the size estimator afterwards; this change is just to get the
`ml-*` build bots green after the aforementioned PR.
We never used the size estimator again after the initial DQN-based
training. Should we want to do so again, we now have IR2Vec, which the old
estimator was approximating in functionality.
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.
This removes the ability to only mark a prefix of an alloca alive/dead.
We never used that capability, so we should remove the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect this).
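For illustration, a minimal sketch of the intended IR shape after this
change (assuming the post-change intrinsic signature without the size
operand):
```llvm
declare void @llvm.lifetime.start.p0(ptr)
declare void @llvm.lifetime.end.p0(ptr)

define void @f() {
  %p = alloca i32
  ; Previously: call void @llvm.lifetime.start.p0(i64 4, ptr %p)
  ; The size is now implied by the alloca itself.
  call void @llvm.lifetime.start.p0(ptr %p)
  store i32 0, ptr %p
  call void @llvm.lifetime.end.p0(ptr %p)
  ret void
}
```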
This reverts https://github.com/llvm/llvm-project/pull/132976.
The PR incorrectly claimed that this makes inlining more liberal,
referencing the string comparison in TargetTransformInfoImpl.h.
However, the implementation that actually applies is the one in
BasicTTIImpl.h, which performs a feature subset comparison. As such,
this regressed inlining, most concerningly of functions without +vector
into functions with +vector.
Revert the change to restore the previous behavior.
Without this change, the following test would fail to compile
with `-march=armv8-a+sme`:
```
void func1(const svuint32_t *in, svuint32_t *out) {
  [&]() __arm_streaming { *out = *in; }();
}
```
But in general, it's probably better never to inline
streaming functions into non-streaming functions, because
they will have been marked as 'streaming' for a reason
by the user.
Do not omit check lines for any functions, to avoid spurious diffs
on regeneration. Also update to a newer UTC version which properly
generates the metadata checks.
lifetime.start and lifetime.end are primarily intended for use on
allocas, to enable stack coloring and other liveness optimizations. This
is necessary because all (static) allocas are hoisted into the entry
block, so lifetime markers are the only way to convey the actual
lifetimes.
However, lifetime.start and lifetime.end are currently *allowed* to be
used on non-alloca pointers. We don't actually do this in practice, but
just the mere fact that this is possible breaks the core purpose of the
lifetime markers, which is stack coloring of allocas. Stack coloring can
only work correctly if all lifetime markers for an alloca are
analyzable.
* If a lifetime marker may operate on multiple allocas via a select/phi,
we don't know which lifetime actually starts/ends and handle it
incorrectly (https://github.com/llvm/llvm-project/issues/104776); see the
sketch after this list.
* Stack coloring operates on the assumption that all lifetime markers
are visible, and not, for example, hidden behind a function call or
escaped pointer. It's not possible to change this, as part of the
purpose of lifetime markers is that they work even in the presence of
escaped pointers, where simple use analysis is insufficient.
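A minimal sketch of the previously-allowed pattern from the first bullet
(hypothetical IR; the size operand still exists at this point):
```llvm
declare void @llvm.lifetime.start.p0(i64, ptr)

define void @f(i1 %c) {
  %a = alloca i32
  %b = alloca i32
  %p = select i1 %c, ptr %a, ptr %b
  ; Which alloca's lifetime starts here depends on %c, so stack coloring
  ; cannot attribute this marker to a single alloca.
  call void @llvm.lifetime.start.p0(i64 4, ptr %p)
  ret void
}
```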
I don't think there is any way to have coherent semantics for lifetime
markers on allocas, while also permitting them on arbitrary pointer
values.
This PR restricts lifetimes to operate on allocas only. As a followup, I
will also drop the size argument, which is superfluous if we always
operate on an alloca. (This change also renders various code handling
lifetime markers on non-allocas dead. I plan to clean up that kind of
code after dropping the size argument as well.)
In practice, I've only found a few places that currently produce
lifetimes on non-allocas:
* CoroEarly replaces the promise alloca with the result of an intrinsic,
which will later be replaced back with an alloca. I think this is the
only place where there is some legitimate loss of functionality, but I
don't think this is particularly important (I don't think we'd expect
the promise in a coroutine to admit useful lifetime optimization.)
* SafeStack moves unsafe allocas onto a separate frame. We can safely
drop lifetimes here, as SafeStack performs its own stack coloring.
* Similarly, AddressSanitizer also moves allocas into separate
memory.
* LSR sometimes replaces the lifetime argument with a GEP chain of the
alloca (where the offsets ultimately cancel out). This is just
unnecessary. (Fixed separately in
https://github.com/llvm/llvm-project/pull/149492.)
* InferAddrSpaces sometimes makes lifetimes operate on an addrspacecast
of an alloca. I don't think this is necessary.
When we rebuild the call site tries after inlining of an allocation with
MD_memprof metadata, we don't want to reapply the discarding of small
non-cold contexts (under -memprof-callsite-cold-threshold=): either we
have no context size info at all (without -memprof-report-hinted-sizes
or another option that causes us to keep it as metadata), or, even with
that information in the metadata, we have imperfect information at that
point because we have already discarded some contexts during matching.
The first case was even worse because we didn't guard our check by
whether the number of cold bytes was 0, leading to very aggressive
pruning during post-inline metadata rebuilding without the context size
information.
Inliner currently treats every "call asm" IR instruction as a single
instruction regardless of how many instructions the inline assembly may
contain. This may underestimate the cost of inlining for a callee
containing long inline assembly. Besides, we may need to assign a higher
cost to instructions in inline assembly since they cannot be analyzed
and optimized by the compiler.
This PR introduces a new option, `-inline-asm-instr-cost` (zero by
default), which controls the cost of inline assembly instructions in the
inliner's cost-benefit analysis.
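As a rough sketch (hypothetical IR), a callee whose single `call asm` hides
several machine instructions; by default the cost model charges it like any
other single instruction, and the new option lets that be weighted more
heavily:
```llvm
define void @callee() {
  ; A single IR-level "call asm" that expands to four machine instructions.
  call void asm sideeffect "nop\0Anop\0Anop\0Anop", ""()
  ret void
}

define void @caller() {
  ; With the default extra cost of zero per asm instruction, @callee looks
  ; trivially cheap here; -inline-asm-instr-cost lets targets charge more.
  call void @callee()
  ret void
}
```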
Motivation: When using libc++, `std::bitset<64>::count()` doesn't
optimize to a single popcount instruction on AArch64, because we fail to
inline the library code completely. Inlining fails, because the internal
bit_iterator struct is passed as a [2 x i64] %arg value on AArch64. The
value is built using insertvalue instructions and only one of the array
entries is constant. If we know that this entry is constant, we can
prove that half the function becomes dead. However, InlineCost only
considers operands for simplification if they are Constants, which %arg
is not. Without this simplification the function is too expensive to
inline.
Therefore, we had to teach InlineCost to support non-Constant simplified values
(PR #145083). Now, we enable this for extractvalue, because we want to simplify
the extractvalue with the insertvalues from the caller function. This is enough to
get bitset::count fully optimized.
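A reduced sketch of the pattern (hypothetical IR, not the actual libc++
code): the caller builds the `[2 x i64]` aggregate with `insertvalue`, one
lane being constant, and the callee's `extractvalue` of that lane can now
be simplified across the call boundary:
```llvm
define i64 @callee([2 x i64] %arg) {
entry:
  ; With @caller below, this extractvalue simplifies to the constant 0,
  ; which makes the %slow path dead and the callee cheap enough to inline.
  %bit = extractvalue [2 x i64] %arg, 1
  %cmp = icmp eq i64 %bit, 0
  br i1 %cmp, label %fast, label %slow

fast:
  ret i64 0

slow:
  %v = extractvalue [2 x i64] %arg, 0
  ret i64 %v
}

define i64 @caller(i64 %p) {
  %a0 = insertvalue [2 x i64] poison, i64 %p, 0
  %a1 = insertvalue [2 x i64] %a0, i64 0, 1 ; the constant lane
  %r = call i64 @callee([2 x i64] %a1)
  ret i64 %r
}
```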
There are similar opportunities we can explore for BinOps in the future
(e.g. cmp eq %arg1, %arg2 when the caller passes the same value into
both arguments), but we need to be careful here, because InstSimplify
isn't completely safe to use with operands owned by different functions.
This depth limits a linear search (rather than the usual potentially
exponential one) and is not particularly important for compile-time in
practice.
The change in #137297 is going to increase the length of GEP chains, so
I'd like to increase this limit a bit to reduce the chance of
regressions (https://github.com/dtcxzyw/llvm-opt-benchmark/pull/2419
showed a 13% increase in SearchLimitReached). There is no particular
significance to the new value of 10.
Compile-time is neutral.
- Consider inlining a recursive function of depth 1 only when
the caller is the function itself, instead of inlining it
for each callsite, so that we avoid redundant work.
- Use CondContext instead of DomTree for better compilation time.
The context disambiguation code already emits remarks when hinting
allocations (by adding hotness attributes) during cloning. However,
we did not yet emit remarks when applying the hotness attributes during
building of the metadata (during matching and again after inlining).
Add remarks when we apply the hint attributes for these
non-context-sensitive allocations.
When determining whether an escape source may alias with a noalias
argument, only take provenance captures into account. If only the
address of the argument was captured, an access through the escape
source is not legal.
Currently, `InlineCostFeatureIndex::NumberOfFeatures` results in an
index in the middle of the feature vector, so the default inlining
decision is not set correctly and another feature is overwritten instead.
`FeatureIndex::NumberOfFeatures` is the last index of the feature
vector, to which the default inlining decision gets appended when
enabled.
Adds support for MSVC's undocumented `/funcoverride` flag, which marks
functions as being replaceable by the Windows kernel loader. This is
used to allow functions to be upgraded depending on the capabilities of
the current processor (e.g., the kernel can be built with the naive
implementation of a function, but that function can be replaced at boot
with one that uses SIMD instructions if the processor supports them).
For each marked function we need to generate:
* An undefined symbol named `<name>_$fo$`.
* A defined symbol `<name>_$fo_default$` that points to the `.data`
section (anywhere in the data section, it is assumed to be zero sized).
* An `/ALTERNATENAME` linker directive that points from `<name>_$fo$` to
`<name>_$fo_default$`.
This is used by the MSVC linker to generate the appropriate metadata in
the Dynamic Value Relocation Table.
Marked functions must never be inlined (otherwise those inline sites
can't be replaced).
Note that I've chosen to implement this in AsmPrinter as there was no
way to create a `GlobalVariable` for `<name>_$fo$` that would result in
a symbol being emitted (as nothing consumes it and it has no
initializer). I tried to have `llvm.used` and `llvm.compiler.used` point
to it, but this didn't help.
Within LLVM I referred to this feature as "loader replaceable" as
"function override" already has a different meaning to C++ developers...
I also took the opportunity to extract the feature symbol generation
code used by both AArch64 and X86 into a common function in AsmPrinter.
Recursive functions are generally not inlined to avoid issues
like infinite inlining or excessive code expansion. However,
this conservative approach misses opportunities for optimization in
cases where a recursive call is guaranteed to execute only once.
This patch detects a scenario where a guarding branch condition of a recursive
call will become false after the first iteration of the recursive function.
If such a condition is met, and the recursion depth is confirmed to be one,
the Inliner will now consider this recursive function for inlining.
A new test case (`test/Transforms/Inline/inline-recursive-fn.ll`)
has been added to verify this behaviour.
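A minimal sketch of the shape being detected (hypothetical IR, not the
added test itself):
```llvm
define void @f(i1 %entered) {
entry:
  br i1 %entered, label %work, label %exit

work:
  ; The guarding condition is constant-false at the recursive call site, so
  ; the recursion depth is one and the call may now be considered for inlining.
  call void @f(i1 false)
  br label %exit

exit:
  ret void
}
```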
Since e39f6c1844fab59c638d8059a6cf139adb42279a, opt will infer the
correct datalayout when given a triple. Avoid explicitly specifying it
in tests that depend on the AMDGPU target being present, to avoid the
string becoming out of sync with the TargetInfo value.
Only tests with REQUIRES: amdgpu-registered-target or a local lit.cfg
were updated to ensure that tests for non-target-specific passes that
happen to use the AMDGPU layout still pass when building with a limited
set of targets.
Reviewed By: shiltian, arsenm
Pull Request: https://github.com/llvm/llvm-project/pull/137921
Previously the inliner always produced a memcpy with alignment 1 for the
src and destination, leading to potentially suboptimal codegen.
Since the Src ptr alignment is only available through the CallBase, it
has to be passed to HandleByValArgumentInit. The Dst alignment is already
known, so it doesn't have to be passed along.
If there is no specified Src alignment, my changes cause the ptr to have
no align data attached instead of align 1 as before (see
inline-tail.ll). I believe this is fine, but since I'm a first-time
contributor, please confirm.
My changes are already covered by 4 existing regression tests, so I did
not add any additional ones.
The example from #45778 now results in:
```llvm
; opt -S -passes=inline,instcombine,sroa,instcombine test.ll
define dso_local i32 @test(ptr %t) {
entry:
  %.sroa.0.0.copyload = load ptr, ptr %t, align 8 ; this used to be align 1 in the original issue
  %arrayidx.i = getelementptr inbounds nuw i8, ptr %.sroa.0.0.copyload, i64 24
  %0 = load i32, ptr %arrayidx.i, align 4
  ret i32 %0
}
```
Fixes #45778.
During inlining, we may opportunistically simplify conditional branches
(incl. switches) to unconditional branches if, after inlining, their
destination is fixed. While we do this, we should propagate any
DILocation attached to the original branch to the simplified branch,
which this patch enables.
Found using https://github.com/llvm/llvm-project/pull/107279.
As part of inlining an invoke instruction, we may replace an inlined
resume instruction with a simple branch to the landing pad block. When
this happens, we should also propagate the resume's DILocation to this
branch, which this patch enables.
Found using https://github.com/llvm/llvm-project/pull/107279.
## What?
Implement `areInlineCompatible` for the SystemZ target using
FeatureBitset comparison.
## Why?
The default implementation in `TargetTransformInfoImpl.h` makes a string
comparison and only inlines when the target-cpu and the target-features
for caller and callee are the same. We are missing out on optimizations
when the callee has a subset of features of the caller.
## How?
Get the FeatureBitset of the caller and callee and check whether the
callee's features are a subset of, or equal to, the caller's. The
implementation is similar to ARM, PowerPC, etc.
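A rough sketch of the case that now inlines (illustrative IR; the feature
names are real SystemZ features but are not taken from the added tests):
```llvm
define void @callee() #0 {
  ret void
}

define void @caller() #1 {
  ; @callee needs only a subset of @caller's features, so areInlineCompatible
  ; now allows inlining this call.
  call void @callee()
  ret void
}

attributes #0 = { "target-features"="+vector" }
attributes #1 = { "target-features"="+vector,+vector-enhancements-1" }
```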
## Testing?
Test cases check for when the callee is a subset of the caller, when
it's not a subset, and when both are equal.
This is a follow-up to https://github.com/llvm/llvm-project/pull/130210.
The EphemeralValuesAnalysis pass used to return an EphemeralValuesCache
object, which held the ephemeral values, collected them lazily, and
supported invalidation via its `clear()` function.
This patch removes the EphemeralValuesCache class completely and instead
returns the SmallVector containing the ephemeral values.
This patch does two things:
1. It implements an ephemeral values cache analysis pass that collects the ephemeral values of a function and caches them for fast lookups. The collection of the ephemeral values is done lazily when the user calls `EphemeralValuesCache::ephValues()`.
2. It adds caching of ephemeral values using the `EphemeralValuesCache` to speed up `CallAnalyzer::analyze()`. Without caching, this can take a long time to run in cases where the function contains a large number of `@llvm.assume()` calls and a large number of callsites; the time is spent in `collectEphemeralValues()` (see the sketch below).
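For reference, a minimal sketch of what "ephemeral" means here (hypothetical
IR): values that only feed `@llvm.assume` emit no real code, so the cost
analysis must skip them, and recollecting them for every callsite is what
the cache avoids.
```llvm
declare void @llvm.assume(i1)

define void @f(i32 %x) {
  ; %cmp is ephemeral: its only use is the assume, so it produces no code
  ; and should not count towards the inline cost.
  %cmp = icmp sgt i32 %x, 0
  call void @llvm.assume(i1 %cmp)
  ret void
}
```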
These date back to when the non-intrinsic format of variable locations
was still being tested and was behind a compile-time flag, so not all
builds / bots would correctly run them. The solution at the time, to get
at least some test coverage, was to have tests opt-in to non-intrinsic
debug-info if it was built into LLVM.
Nowadays, the non-intrinsic format is the default and has been on for more
than a year, so there's no need for this flag to exist.
(I've downgraded the flag from "try" to explicitly requesting
non-intrinsic format in some places, so that we can deal with tests that
are explicitly about non-intrinsic format in their own commit).
Since the default implementation swap in
https://github.com/llvm/llvm-project/pull/117493,
`areInlineCompatible` checks whether the callee's features are a subset of
the caller's features. This is not a safe assumption in general on PPC. We
fall back to checking for strict feature set equality for now, and will see
what improvements we can make.
This PR removes the old `nocapture` attribute, replacing it with the new
`captures` attribute introduced in #116990. This change is
intended to be essentially NFC, replacing existing uses of `nocapture`
with `captures(none)` without adding any new analysis capabilities.
Making use of non-`none` values is left for a followup.
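For illustration, the attribute rewrite itself (a sketch; the declarations
are made up):
```llvm
; Old spelling, still accepted and upgraded by the IR readers:
declare void @old(ptr nocapture %p)

; New spelling emitted after this change:
declare void @new(ptr captures(none) %p)
```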
Some notes:
* `nocapture` will be upgraded to `captures(none)` by the bitcode
reader.
* `nocapture` will also be upgraded by the textual IR reader. This is to
make it easier to use old IR files and somewhat reduce the test churn in
this PR.
* Helper APIs like `doesNotCapture()` will check for `captures(none)`.
* MLIR import will convert `captures(none)` into an `llvm.nocapture`
attribute. The representation in the LLVM IR dialect should be updated
separately.
This is a followup to #117750. Currently, AlwaysInline only invalidates
analyses at the end, by returning that no analyses are preserved.
However, this means that analyses fetched during inlining may be
outdated. The aforementioned PR exposed this issue.
Instead, bring the logic closer to what the normal inliner does, by
directly invalidating the caller in FAM. This should make sure that we
don't receive any outdated analyses even if they are fetched during
inlining.
Also drop the BFI updating entirely -- there's no point in doing it if
we're going to invalidate everything anyway.
This reapplies #119138 with a defensive fix for the assertion failure
when building libcxx.
Unfortunately the failure does not reproduce on my machine, so I am not
able to extract a test case.
The key insight for the fix comes from Jessica Clarke, who observes that
`VTablePtr` may, in fact,
not be a pointer on return from `FindAvailableLoadedValue`.
Co-authored-by: Alexander Richardson <alexander.richardson@cl.cam.ac.uk>
When the size is an appropriate constant, __memcpy_chk will turn into a
memcpy that gets folded away by InstCombine. Therefore this patch avoids
counting these as calls for purposes of inlining costs.
This is only really relevant on platforms whose headers redirect memcpy
to __memcpy_chk (such as Darwin). On platforms that use intrinsics,
memcpy and similar functions are already exempt from call penalties.
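A sketch of the pattern in question (illustrative IR; the declaration
mirrors the usual __memcpy_chk signature):
```llvm
declare ptr @__memcpy_chk(ptr, ptr, i64, i64)

define void @copy16(ptr %dst, ptr %src) {
  ; The length (16) is a constant no larger than the known object size (16),
  ; so this is expected to become a plain memcpy and fold away; it is no
  ; longer charged as a call by the inline cost model.
  call ptr @__memcpy_chk(ptr %dst, ptr %src, i64 16, i64 16)
  ret void
}
```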