llvm-project

Author	SHA1	Message	Date
Mingming Liu	2d641858fa	[nfc][PGO]Factor out profile scaling into a standalone helper function (#83780 ) - Put the helper function in `ProfDataUtil.h/cpp`, which is already a dependency of `Instructions.cpp` - The helper function could be re-used to update profiles of `InvokeInst` (in a follow-up pull request)	2024-03-27 11:57:07 -07:00
Jianjian Guan	05a7b22a01	[RISCV] Add areInlineCompatible for riscv target (#86639 ) Inline a callee if its target-features are a subset of the caller's target-features.	2024-03-27 14:16:03 +08:00
Nikita Popov	0f46e31cfb	[IR] Change representation of getelementptr inrange (#84341 ) As part of the migration to ptradd (https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699), we need to change the representation of the `inrange` attribute, which is used for vtable splitting. Currently, inrange is specified as follows: ``` getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, inrange i32 1, i64 2) ``` The `inrange` is placed on a GEP index, and all accesses must be "in range" of that index. The new representation is as follows: ``` getelementptr inbounds inrange(-16, 16) ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, i32 1, i64 2) ``` This specifies which offsets are "in range" of the GEP result. The new representation will continue working when canonicalizing to ptradd representation: ``` getelementptr inbounds inrange(-16, 16) (i8, ptr @vt, i64 48) ``` The inrange offsets are relative to the return value of the GEP. An alternative design could make them relative to the source pointer instead. The result-relative format was chosen on the off-chance that we want to extend support to non-constant GEPs in the future, in which case this variant is more expressive. This implementation "upgrades" the old inrange representation in bitcode by simply dropping it. This is a very niche feature, and I don't think trying to upgrade it is worthwhile. Let me know if you disagree.	2024-03-20 10:59:45 +01:00
Nikita Popov	e84182af91	[X86][Inline] Skip inline asm in inlining target feature check (#83820 ) When inlining across functions with different target features, we perform roughly two checks: 1. The caller features must be a superset of the callee features. 2. Calls in the callee cannot use types where the target features would change the call ABI (e.g. by changing whether something is passed in a zmm or two ymm registers). The latter check is very crude right now. The latter check currently also catches inline asm "calls". I believe that inline asm should be excluded from this check, as it is independent from the usual call ABI, and instead governed by the inline asm constraint string. Fixes https://github.com/llvm/llvm-project/issues/67054.	2024-03-05 14:21:33 +01:00
Nikita Popov	cad6ad2759	[Inline] Add test for #67054 (NFC)	2024-03-04 11:32:44 +01:00
lifengxiang1025	daf3079222	[ThinLTO] Add metadata 'thinlto_src_module' and 'thinlto_src_file' (#83110 ) Originally, when `EnableImportMetadata` was enabled, `SourceFileName` was recorded as `thinlto_src_module`. Now `SourceFileName` is recorded as `thinlto_src_file` and `ModuleIdentifier` is recorded as `thinlto_src_module`.	2024-02-29 10:42:06 +08:00
Dani	6fae3e7844	[llvm][AArch64] Do not inline a function with different signing scheme. (#80642 ) If the signing schemes differ, the functions may assume different behaviours, so it is dangerous to inline them without analysing them. This should be a rare case.	2024-02-23 09:30:36 +01:00
Quentin Dian	5932fcc478	[InlineCost] Consider the default branch when calculating cost (#77856 ) First step in fixing #76772. This PR considers the default branch as a case branch. This will give the unreachable default branch fair consideration.	2024-02-11 18:24:59 +08:00
Nikita Popov	2d69827c5c	[Transforms] Convert tests to opaque pointers (NFC)	2024-02-05 11:57:34 +01:00
Sander de Smalen	d313614b60	[AArch64] Replace LLVM IR function attributes for PSTATE.ZA. (#79166 ) Since https://github.com/ARM-software/acle/pull/276 the ACLE defines attributes to better describe the use of a given SME state. Previously the attributes merely described the possibility of it being 'shared' or 'preserved', whereas the new attributes have more semantics and also describe how the data flows through the program. For ZT0 we already had to add new LLVM IR attributes: * aarch64_new_zt0 * aarch64_in_zt0 * aarch64_out_zt0 * aarch64_inout_zt0 * aarch64_preserves_zt0 We have now done the same for ZA, such that we add: * aarch64_new_za (previously `aarch64_pstate_za_new`) * aarch64_in_za (more specific variation of `aarch64_pstate_za_shared`) * aarch64_out_za (more specific variation of `aarch64_pstate_za_shared`) * aarch64_inout_za (more specific variation of `aarch64_pstate_za_shared`) * aarch64_preserves_za (previously `aarch64_pstate_za_shared, aarch64_pstate_za_preserved`) This explicitly removes 'pstate' from the name, because with SME2 and the new ACLE attributes there is a difference between "sharing ZA" (sharing the ZA matrix register with the caller) and "sharing PSTATE.ZA" (sharing either the ZA or ZT0 register, both part of PSTATE.ZA with the caller).	2024-02-01 13:37:37 +00:00
Sander de Smalen	3abf55a68c	[AArch64][SME] Fix inlining bug introduced in #78703 (#79994 ) Calling a `__arm_locally_streaming` function from a function that is not a streaming-SVE function would lead to incorrect inlining. The issue didn't surface because the tests were not testing what they were supposed to test.	2024-01-31 11:38:29 +00:00
Yingwei Zheng	1228becf7d	[FuncAttrs] Deduce `noundef` attributes for return values (#76553 ) This patch deduces `noundef` attributes for return values. IIUC, a function returns `noundef` values iff all of its return values are guaranteed not to be `undef` or `poison`. Definition of `noundef` from LangRef: ``` noundef This attribute applies to parameters and return values. If the value representation contains any undefined or poison bits, the behavior is undefined. Note that this does not refer to padding introduced by the type’s storage representation. ``` Alive2: https://alive2.llvm.org/ce/z/g8Eis6 Compile-time impact: http://llvm-compile-time-tracker.com/compare.php?from=30dcc33c4ea3ab50397a7adbe85fe977d4a400bd&to=c5e8738d4bfbf1e97e3f455fded90b791f223d74&stat=instructions:u \|stage1-O3\|stage1-ReleaseThinLTO\|stage1-ReleaseLTO-g\|stage1-O0-g\|stage2-O3\|stage2-O0-g\|stage2-clang\| \|--\|--\|--\|--\|--\|--\|--\| \|+0.01%\|+0.01%\|-0.01%\|+0.01%\|+0.03%\|-0.04%\|+0.01%\| The motivation of this patch is to reduce the number of `freeze` insts and enable more optimizations.	2023-12-31 20:44:48 +08:00
Jeremy Morse	c672ba7dde	[DebugInfo][RemoveDIs] Instrument inliner for non-instr debug-info (#72884 ) With intrinsics representing debug-info, we just clone all the intrinsics when inlining a function and don't think about it any further. With non-instruction debug-info however we need to be a bit more careful and manually move the debug-info from one place to another. For the most part, this means keeping a "cursor" during block cloning of where we last copied debug-info from, and performing debug-info copying whenever we successfully clone another instruction. There are several utilities in LLVM for doing this, all of which now need to manually call cloneDebugInfo. The testing story for this is not well covered as we could rely on normal instruction-cloning mechanisms to do all the hard stuff. Thus, I've added a few tests to explicitly test dbg.value behaviours, ahead of them becoming not-instructions.	2023-11-26 21:24:29 +00:00
Anna Thomas	4ba50a783b	Update test to consider incompatible align attribute	2023-11-10 10:50:35 -05:00
Sander de Smalen	00a831421f	[AArch64][SME] Extend Inliner cost-model with custom penalty for calls. (#68416 ) This is a stacked PR following on from #68415 This patch has two purposes: (1) It tries to make inlining more likely when it can avoid a streaming-mode change. (2) It avoids inlining when inlining causes more streaming-mode changes. An example of (1) is: ``` void streaming_compatible_bar(void); void foo(void) __arm_streaming { /* other code */ streaming_compatible_bar(); /* other code */ } void f(void) { foo(); // expensive streaming mode change } -> void f(void) { /* other code */ streaming_compatible_bar(); /* other code */ } ``` where it wouldn't have inlined the function when foo would be a non-streaming function. An example of (2) is: ``` void streaming_bar(void) __arm_streaming; void foo(void) __arm_streaming { streaming_bar(); streaming_bar(); } void f(void) { foo(); // expensive streaming mode change } -> (do not inline into) void f(void) { streaming_bar(); // these are now two expensive streaming mode changes streaming_bar(); } ```	2023-10-31 10:28:40 +00:00
Sander de Smalen	6d30bc0085	[AArch64][SME] Allow inlining when streaming-mode attributes don't match up. (#68415 ) The use-case here is to support things like: int foo(int x, int y) __arm_streaming { return std::max<int>(x, y); } where the call to non-streaming `std::max<int>(x, y)` can be safely inlined into the streaming function. This is a first step and will need further work to allow more cases (e.g. more fine-grained analysis of the function calls to ensure they don't result in any incompatible instructions for the requested mode).	2023-10-30 10:47:07 +00:00
Aiden Grossman	f39c38584e	[MLGO] Fix tests post 1a2e77c This patch switched the default value of the mandatory-inlining-first flag from true to false. This broke one of the MLGO tests that relied on the default value of this flag. This patch explicitly sets the value to fix the test and avoid future breakages.	2023-10-29 08:41:11 +00:00
Amara Emerson	1a2e77cf9e	Revert "Revert "Inlining: Run the legacy AlwaysInliner before the regular inliner."" This reverts commit 86bfeb906e3a95ae428f3e97d78d3d22a7c839f3. This is a long time coming re-application that was originally reverted due to regressions, unrelated to the actual inlining change. These regressions have since been fixed due to another long-in-the-making change of a66051c6 landing. Original commit message for reference: --- We have several situations where it's beneficial for code size to ensure that every call to an always-inline function is inlined before normal inlining decisions are made. While the normal inliner runs in a "MandatoryOnly" mode to try to do this, it only does it on a per-SCC basis, rather than the whole module. Ensuring that all mandatory inlinings are done before any heuristic based decisions are made just makes sense. Despite being referred to as the "legacy" AlwaysInliner pass, it's already necessary for -O0 because the CGSCC inliner is too expensive in compile time to run at -O0. This also fixes an exponential compile time blow up in https://github.com/llvm/llvm-project/issues/59126 Differential Revision: https://reviews.llvm.org/D143624 ---	2023-10-28 23:21:11 -07:00
Sander de Smalen	0e099faff1	[AArch64][SME] NFC: use update_test_checks.py for sme-pstate(sm\|za)-attrs.ll	2023-10-06 12:46:20 +00:00
Mircea Trofin	a4765c6a02	[mlgo] Fix state-tracking-coro.ll test Post #68263, the inline advisor printer tries to print SCC Nodes' names, but if we perform a full pipeline (like O1), there'll be some DCE-ing happening and the Node pointers kept in the advisor for this (printing) purpose are dangling. Using the more eager printer post each scc inline pass is sufficient.	2023-10-04 22:07:44 -07:00
Mircea Trofin	1b3fc40586	[mlgo][coro] Assign coro split-ed functions a `FunctionLevel` (#68263 )	2023-10-04 21:20:00 -07:00
Noah Goldstein	2da4960f20	[Inliner] Also propagate `noundef` and `align` ret attributes during inlining Both of these can potentially be lost otherwise.	2023-10-03 16:12:19 -05:00
Noah Goldstein	2d037f5aed	[Inliner] Use "best" ret attribute when propagating attributes during inlining For attributes associated with a value (like `dereferenceable(N)`) instead of always using the attribute from the to-be inlined caller, it should keep using the value at existing callsites that have the attribute if the value is higher (provides more information).	2023-10-03 16:12:16 -05:00
Noah Goldstein	733f373ebe	[Inliner] Regen checks for old test; NFC	2023-10-03 16:12:06 -05:00
Mingming Liu	aa6ee03709	[NFC][Inliner] Introduce another multiplier for cost benefit analysis and make multipliers overridable in TargetTransformInfo. - The motivation is to expose tunable knobs to control the aggressiveness of inlines for different backends (e.g., machines with different icache size, and workload with different icache/itlb PMU counters). Tuning inline aggressiveness shows a small (~+0.3%) but stable improvement on workload/hardware that is more frontend bound. - Both multipliers could be overridden from command line. Reviewed By: kazu Differential Revision: https://reviews.llvm.org/D153154	2023-10-02 21:27:07 -07:00
Noah Goldstein	2f3b7d33f4	[Inliner] Fix bug when propagating poison generating return attributes Poison generating return attributes can't be propagated the same as others, as they can change the behavior of other uses and/or create UB where it otherwise wouldn't have occurred. For example: ``` define nonnull ptr @foo() { %p = call ptr @bar() call void @use(ptr %p) ret ptr %p } ``` If we inline `@foo` and propagate `nonnull` to `@bar`, it could change the behavior of `@use` as instead of taking `null`, `@use` will now be passed `poison`. This can be even worse in a case like: ``` define nonnull ptr @foo() { %p = call noundef ptr @bar() ret ptr %p } ``` Where propagating `nonnull` to `@bar` will cause UB on `null` return of `@bar` (`noundef` + `poison`) where it previously wouldn't have occurred. To fix this, we only propagate poison generating return attributes if either 1) The only use of the callsite to propagate to is return and the callsite to propagate to doesn't have `noundef`. Or 2) the callsite to be inlined has `noundef`. The former case ensures no new UB or `poison` values will be added. The latter is UB anyways if the value is `poison` so we can go ahead without worrying about behavior changes.	2023-09-28 17:27:42 -05:00
Noah Goldstein	bf8d03921d	[Inliner] Add some additional tests for propagating attributes before inlining; NFC	2023-09-28 17:27:41 -05:00
Kazu Hirata	b4301df61f	Revert "[InlineCost] Check for conflicting target attributes early" This reverts commit d6f994acb3d545b80161e24ab742c9c69d4bbf33. Several people have reported breakage resulting from this patch: - https://github.com/llvm/llvm-project/issues/65152 - https://github.com/llvm/llvm-project/issues/65205	2023-09-21 10:29:46 -07:00
Anna Thomas	23f08af2be	[Inline] Avoid incompatible return attributes on deoptimize When updating the return type of a deoptimize call during inlining, we need to drop incompatible return attributes. This bug was exposed once we relaxed the constraint of adding the attributes through D156844. With that change, deoptimize calls (which are not willreturn) will start having return attributes added to them. Fixes https://github.com/llvm/llvm-project/issues/64804. Differential Revision: https://reviews.llvm.org/D158286	2023-08-18 12:55:51 -04:00
Sameer Sahasrabuddhe	8dce4c56dd	[Inliner] Handle convergence control when inlining a call When a convergencectrl token is passed to a convergent call, and the called function in turn calls the entry intrinsic, the intrinsic is now replaced with the convergencectrl token. The spec requires the following check: A call from function F to function G can be inlined only if: - at least one of F or G does not make any convergent calls, or, - both F and G make the same kind of convergent calls: controlled or uncontrolled. But this change does not implement this complete check. A proper implementation requires a whole new analysis that identifies convergence in every function. For now, we skip that and just do a cursory check for the entry intrinsic. The underlying assumption is that in a compiler flow that fully implements convergence control tokens, there is no mixing of controlled and uncontrolled convergent operations in the whole program. This is a reboot of the original change D85606 by Nicolai Haehnle <nicolai.haehnle@amd.com>. Reviewed By: arsenm, nhaehnle Differential Revision: https://reviews.llvm.org/D152431	2023-08-17 09:56:25 +05:30
Noah Goldstein	4d51c6258e	[Inliner] Add return attributes to callsites not marked `willreturn`/`nounwind` The actual callsite we are adding to doesn't need to be `willreturn`/`nounwind`, only the instructions between the callsite and the return do. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D156844	2023-08-16 22:43:04 -05:00
Noah Goldstein	612a7f0b15	[Inliner] Add the callsite's called function return attributes to the set of addable attributes We can do this by just querying the attribute in the callsite itself. This is both cleaner code and produces better results. Differential Revision: https://reviews.llvm.org/D156843	2023-08-16 22:43:04 -05:00
Noah Goldstein	74c4d1e422	[Inliner] Add more tests for deducing return attributes of callsites when inlining; NFC Differential Revision: https://reviews.llvm.org/D156842	2023-08-16 22:43:04 -05:00
Nikita Popov	6f3f600b2a	[Inline] Add test for simplification in loop (NFC) This would have been miscompiled by D157816.	2023-08-16 09:27:01 +02:00
Matt Arsenault	25bc999d1f	Intrinsics: Add type overload to stacksave and stackstore This allows use with non-0 address space stacks. llvm_ptr_ty should never be used. This could use some more percolation up through mlir, but this is enough to fix existing tests. https://reviews.llvm.org/D156666	2023-08-09 18:33:11 -04:00
Matt Arsenault	acc163d4ab	Inliner: Regenerate test Test claims to be autogenerated but some functions are inexplicably missing checks.	2023-07-31 08:05:12 -04:00
Matt Arsenault	d873a14e93	ValueTracking: Implement computeKnownFPClass for frexp Work around the lack of proper multiple return values by looking at the extractvalue. https://reviews.llvm.org/D150982	2023-07-21 16:04:13 -04:00
Matt Arsenault	e1ac984a10	ValueTracking: Implement computeKnownFPClass for ldexp https://reviews.llvm.org/D149590	2023-07-11 09:26:41 -04:00
Juan Manuel MARTINEZ CAAMAÑO	dd1df099ae	[InlineCost][TargetTransformInfo][AMDGPU] Consider cost of alloca instructions in the caller (2/2) Before this patch, the compiler gave a bump to the inline-threshold when the total size of the allocas passed as arguments to the callee was below 256 bytes. This heuristic ignores that some of these allocas could have been removed by SROA if inlining was applied. Ideally, this bonus would be attributed to the threshold once the size of all the allocas that could not be handled by SROA is known: at the end of the InlineCost analysis. However, we may never reach this point if the inline-cost analysis exits early when the inline cost goes over the threshold mid-analysis. This patch proposes: * Attribute the bonus in the inline-threshold when allocas are passed as arguments (regardless of their total size). * Assign a cost to each alloca proportional to its size, such that the cost of all the allocas cancels the bonus. Potential problems: * This patch assumes that removing alloca instructions with SROA is always profitable. This may not be the case if the total size of the allocas is still too big to be promoted to registers/LDS. * Redundant calls to getTotalAllocaSize * Awkwardly, the threshold attributed contributes to the single-bb and vector bonus. Reviewed By: scchan Differential Revision: https://reviews.llvm.org/D149741	2023-06-29 09:49:16 +02:00
Arthur Eubanks	ff4fcbb5f4	[test] Add test for null_pointer_is_valid and Inliner instsimplify interaction As requested in D151254 Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D153435	2023-06-21 14:00:53 -07:00
Nikita Popov	650041a7f1	[Inline] Convert tests to opaque pointers (NFC)	2023-06-21 11:32:45 +02:00
Nikita Popov	4c51f0dee5	[Inline] Regenerate test checks (NFC)	2023-06-21 11:32:45 +02:00
Arthur Eubanks	f4f826bcd4	Revert "Revert "ValueTracking: Fix nan result handling for fmul"" This reverts commit 464dcab8a6c823c9cb462bf4107797b8173de088. Going to fix forward size regression instead due to more dependent patches needing to be reverted otherwise.	2023-06-16 13:53:32 -07:00
Arthur Eubanks	3e39cfe5b4	Revert "Revert "InstSimplify: Require instruction be parented"" This reverts commit 0c03f48480f69b854f86d31235425b5cb71ac921. Going to fix forward size regression instead due to more dependent patches needing to be reverted otherwise.	2023-06-16 13:53:31 -07:00
Arthur Eubanks	0c03f48480	Revert "InstSimplify: Require instruction be parented" This reverts commit 1536e299e63d7788f38117b0212ca50eb76d7a3b. Causes large binary size regressions, see comments on https://reviews.llvm.org/rG1536e299e63d7788f38117b0212ca50eb76d7a3b.	2023-06-16 11:24:29 -07:00
Arthur Eubanks	464dcab8a6	Revert "ValueTracking: Fix nan result handling for fmul" This reverts commit a632ca4b00279baf18e72a171ec0ce526e9d80aa. Dependent commit to be reverted	2023-06-16 11:24:28 -07:00
Alan Zhao	d6b4f6786b	Revert "Revert "InstSimplify: Require instruction be parented"" This reverts commit 00264eac4d0938ae8a0826da38e4777be269124c. Reason: caused a bunch of bots to break	2023-06-16 10:58:54 -07:00
Alan Zhao	00264eac4d	Revert "InstSimplify: Require instruction be parented" This reverts commit 1536e299e63d7788f38117b0212ca50eb76d7a3b. Reason: causes a regression in the inliner (see https://crbug.com/1454531 and https://reviews.llvm.org/rG1536e299e63d7788f38117b0212ca50eb76d7a3b#1217141)	2023-06-16 10:36:49 -07:00
Matt Arsenault	a632ca4b00	ValueTracking: Fix nan result handling for fmul This was mishandling maybe 0 * inf. Fixes issue #63316	2023-06-15 09:35:12 -04:00
Matt Arsenault	19293b82c1	Inline: Fix case of not inlining with denormal-fp-math-f32 This was failing to inline the opencl libraries with daz enabled. As a modifier to the base mode, denormal-fp-math-f32 is weird and has no meaning if it's missing.	2023-06-09 19:09:48 -04:00

963 Commits