llvm-project

Author	SHA1	Message	Date
Alexey Bataev	8d933ea5ac	[SLP][NFC]Use SmallDensetSet for lookup instead of ArrayRef, NFC.	2023-09-06 13:17:30 -07:00
Mircea Trofin	24a08592bc	[nfc][thinlto] Factor common state for `computeImportForModule` (#65427 ) Added a class to hold such common state. The goal is to both reduce the argument list of other utilities used by `computeImportForModule` (which will be brought as members in a subsequent patch), and to make it easy to extend such state later.	2023-09-06 11:57:15 -07:00
Florian Hahn	785e7063b9	[VPlan] Don't rely on underlying instr in VPWidenRecipe (NFCI). VPWidenRecipe only needs the opcode to widen, all other information (flags, debug loc and operands) is already modeled directly via the recipe. This removes the remaining uses of the underlying instruction from VPWidenRecipe::execute.	2023-09-06 16:27:09 +01:00
Aleksandr Popov	0e0ff8573d	[GuardWidening] Refactor to work with the list of checks to widen/hoist Currently we hoist conditions from a widenable branch which are joined to the widenable_condition by an And operation. E.g. if we have br(WC && (c1 && c2)) we will operate with (c1 && c2) unsplit. This patch adds more flexibility to that mechanism by supporting work with the list of checks parsed from the widenable branch. At this stage the patch doesn't change the logic of check hoisting. In the example above we will either hoist both checks [c1, c2] or none of them. But in the future we would improve that logic by analyzing each check separately. Reviewed By: anna Differential Revision: https://reviews.llvm.org/D157689	2023-09-06 00:46:48 +02:00
Alexey Bataev	09b8bbd6e0	[SLP][NFC]Reorder indices instead of real values, NFC. May save some memory/compile time.	2023-09-05 08:48:52 -07:00
Florian Hahn	165e24aa2a	[VPlan] Move DebugLoc to VPRecipeBase (NFCI). Add a dedicated debug location to VPRecipeBase to remove another unneeded use of the underlying LLVM IR instruction and also consolidate various DL fields in sub classes. Each recipe can have debug location and it shouldn't rely on reference to the underlying LLVM IR instructions to retain it. See various recipes that had separate DL fields already.	2023-09-05 15:45:16 +01:00
Florian Hahn	168e23c741	[VPlan] Remove reference to Instr when setting debug loc. (NFCI) This allows untangling references to underlying IR for various recipes.	2023-09-05 10:59:13 +01:00
Mel Chen	26aed5b9a8	[VPlan][LoopUtils] Remove unused parameter TTI This patch removes the member TTI from VPReductionRecipe, as the generation of reduction operations no longer requires TTI. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D158148	2023-09-04 05:30:37 -07:00
Florian Hahn	3fa1b254b7	[VPlan] Print blend recipe as operand directly, instead of IR PHI. Update VPBlendRecipe::print() to print the result directly, instead of relying on the stored Phi pointer. This brings the recipe in line with how other recipes are printed.	2023-09-04 12:35:58 +01:00
DianQK	7ded71b1e4	[JumpThreading] Invalidate LVI after `combineMetadataForCSE`.	2023-09-04 11:50:14 +08:00
Florian Hahn	19d286bca0	[VPlan] Assert that inst isn't a debug or pseudo inst (NFCI). Debug and pseudo instructions aren't modeled in VPlan. Turn a check into an assertion. This will help removing the direct use of Inst here in the future.	2023-09-03 21:31:31 +01:00
Nuno Lopes	5a3fd5f3f5	[LoopVectorizer] Fix PR #65212 : vectorization of reduction loop wasn't respecting original store alignment	2023-09-03 16:35:05 +01:00
Florian Hahn	fd66195777	[VPlan] Manage compare predicates in VPRecipeWithIRFlags. Extend VPRecipeWithIRFlags to also manage predicates for compares. This allows removing the custom ICmpULE opcode from VPInstruction which was a workaround for missing proper predicate handling. This simplifies the code a bit while also allowing compares with any predicates. It also fixes a case where the compare predicate wasn't printed properly for VPReplicateRecipes. Discussed/split off from D150398. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D158992	2023-09-02 21:45:24 +01:00
Kazu Hirata	83e6931827	[llvm] Use llvm::is_contained (NFC)	2023-09-02 09:32:46 -07:00
Kazu Hirata	6da470d7f8	[llvm] Use range-based for loops (NFC)	2023-09-02 09:32:45 -07:00
Johannes Doerfert	9207a90be5	[Attributor] Do not expand dead indirect call sites	2023-09-01 22:14:38 -07:00
Johannes Doerfert	a8ac969b10	[Attributor][NFC] Use common helper to avoid duplication Many AAs translated callee information to the call site explicitly but they now all use the helper we already had for callee return to call site return propagation. In a follow up the helper is going to be extended to handle multiple callees.	2023-09-01 21:04:03 -07:00
Johannes Doerfert	ac0d3869c5	[Attributor][NFC] Simplify the helper APIs We have various helpers to propagate information. This patch cleans up the API to allow less template parameters and more uniform handling.	2023-09-01 21:04:02 -07:00
Johannes Doerfert	6b95126b9b	[Attributor][NFC] Rename AACallSiteReturnedFromReturned In a follow up we'll use it for more than "callee return" -> "call site return" deduction, effectively allowing "callee" -> "call site".	2023-09-01 21:04:02 -07:00
Fangrui Song	111fcb0df0	[llvm] Fix duplicate word typos. NFC Those fixes were taken from https://reviews.llvm.org/D137338	2023-09-01 18:25:16 -07:00
Johannes Doerfert	37642714ed	[Attributor][FIX] Support non-0 AS for function pointers	2023-09-01 17:17:51 -07:00
Noah Goldstein	54ec8bcaf8	Recommit "[InstCombine] Expand `foldSelectICmpAndOr` -> `foldSelectICmpAndBinOp` to work for more binops" (3rd Try) Fixed bug that assumed binop was commutative. Was re-reviewed by nikic and chapuni Differential Revision: https://reviews.llvm.org/D148414	2023-09-01 17:15:51 -05:00
Christoph Stiller	3af4590506	[InstCombine] Contracting x^2 + 2xy + y^2 to (x + y)^2 (float) Resolves https://github.com/llvm/llvm-project/issues/61296 if https://reviews.llvm.org/D156026 didn't suffice. Reviewed By: goldstein.w.n Differential Revision: https://reviews.llvm.org/D158079	2023-09-01 15:02:12 -05:00
Matt Arsenault	5ae881ff0a	InstCombine: Fold out scale-if-denormal pattern Fold select (fcmp oeq x, 0), (fmul x, y), x => x This cleans up a pattern left behind by denormal range checks under denormals are zero. The pattern starts out as something like: x = x < smallest_normal ? x * K : x; The comparison folds to an == 0 when the denormal mode treats input denormals as zero. This makes library denormal checks free after linked into DAZ enabled code. alive2 is mostly happy with this, but there are some issues. First, there are many reported failures in some of the negative tests that happen to trigger some preexisting canonicalize introducing combine. Second, alive2 is incorrectly asserting that denormals must be flushed with the DAZ modes. It's allowed to drop a canonicalize. https://reviews.llvm.org/D157030	2023-09-01 07:47:12 -04:00
Vitaly Buka	be601928e1	[HWASAN] Inline fast pass of instrumentMemAccessOutline Usually pointer tag will match tag in the shadow, so we can keep inlining this check keeping the rest in outlined part. It improves performance by about 25%, but increases code size by 30%. Existing outlining reduces performance by 30%, but saves code size by 80%. So we still significantly reduce code size with minimal performance loss. Reviewed By: fmayer Differential Revision: https://reviews.llvm.org/D159172	2023-08-31 21:26:48 -07:00
Johannes Doerfert	8dd3b4581c	[Attributor][NFC] Clean include order	2023-08-31 19:32:52 -07:00
Johannes Doerfert	209496b766	[Core] Allow `hasAddressTaken` to ignore "casted direct calls" A direct call to a function casted to a different type is still not really an address taken event. We allow the user to opt out of these now. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D159149	2023-08-31 19:32:52 -07:00
Mircea Trofin	a479dd1242	[nfc][thinlto] Mark some functions explicitly as "Test" Also removed them from the header. They are there for test-only. This simplifies further refactoring (as well as code comprehension) Differential Revision: https://reviews.llvm.org/D159308	2023-08-31 16:30:18 -07:00
wlei	f14a5ff635	[CSSPGO] Refactoring findIRAnchors Address feedback in https://reviews.llvm.org/D158817. Since `extractProbe` can be used for both callsite and BB probe, we can leverage this to unify the callsite handling code. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D159169	2023-08-31 16:25:47 -07:00
Matt Arsenault	70aede228a	InstCombine: Recognize fneg(fabs) as bitcasted integer Technically increases the number of instructions if the result isn't cast back to float. Even in this case it's still probably a better canonical form since it enables FP value tracking. https://reviews.llvm.org/D151939	2023-08-31 19:07:36 -04:00
Matt Arsenault	5c0da5839d	InstCombine: Recognize fabs as bitcasted integer In the past we sort of pretended float might be implementable as a non-IEEE type but that never realistically would work. Exotic FP types would need to be added to the IR. Turning these into FP operations enables FP tracking optimizations. https://reviews.llvm.org/D151937	2023-08-31 19:03:48 -04:00
Matt Arsenault	50a9b3d8a5	InstCombine: Recognize fneg when performed as bitcasted integer This is a resurrection of D18874. This was previously wrong with fneg conflated with fsub, but we now have a proper fneg instruction. Additionally, I think it is now clearer that IR float=IEEE float, and a different bit layout would require adding a different IR type. https://reviews.llvm.org/D151934	2023-08-31 18:59:34 -04:00
Vitaly Buka	c6aaf2e521	[NFC][HWASAN] Extract insertShadowTagCheck() Reviewed By: fmayer Differential Revision: https://reviews.llvm.org/D159165	2023-08-31 13:22:51 -07:00
Vitaly Buka	b80fa58bdc	[NFC][hwasan] Rename local variable	2023-08-31 12:25:46 -07:00
Vitaly Buka	bb637396db	[test][HWASAN] Precommit -hwasan-inline-fast-path-checks tests Reviewed By: fmayer Differential Revision: https://reviews.llvm.org/D159157	2023-08-31 11:24:36 -07:00
Igor Kirillov	ac65fb8699	[LoopVectorize] Fix incorrect order of invariant stores when there are multiple reductions. When a loop has multiple reductions, each with an intermediate invariant store, the order in which those reductions are processed is not considered. This can result in the invariant stores outside the loop not preserving the original order. This patch sorts VPReductionPHIRecipes by the order in which they have stores in the original loop before running `InnerLoopVectorizer::fixReduction` function, and it helps to maintain the correct order of stores. Fixes https://github.com/llvm/llvm-project/issues/64047 Differential Revision: https://reviews.llvm.org/D157631	2023-08-31 16:21:44 +00:00
Matt Arsenault	9536bbe464	Attributor: Don't pass ArrayRef by const reference	2023-08-31 08:41:08 -04:00
Matt Arsenault	850ec7bbb1	Attributor: Try to propagate concrete denormal-fp-math{-f32} Allow specialization of functions with "dynamic" denormal modes to a known IEEE or DAZ mode based on callers. This should make it possible to implement a is-denormal-flushing-enabled test using llvm.canonicalize and have it be free after LTO. https://reviews.llvm.org/D156129	2023-08-31 08:26:32 -04:00
Fraser Cormack	e0c60bff8c	[InferAddressSpaces][NFC] Fix code formatting	2023-08-31 12:19:13 +01:00
Jie Fu	3b51881dd5	[CSSPGO] Silence -Wunused-but-set-variable warning without asserts (NFC) /data/home/jiefu/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:2189:8: error: variable 'IsFuncHashMismatch' set but not used [-Werror,-Wunused-but-set-variable] bool IsFuncHashMismatch = false; ^ 1 error generated.	2023-08-31 09:58:29 +08:00
wlei	4bb6bbb9bf	[CSSPGO] Skip reporting staleness metrics for imported functions Accumulating the staleness metrics from pre-link is less accurate than doing it from post-link time (assuming we use the offline profile mismatch as baseline), the reason is that there are some duplicated reports for the same functions, for example, one template function could be included in multiple TUs, but in post thin link time, only one function is kept (linkonce_odr) and others are marked as available-externally functions. Hence, this change skips reporting the metrics for imported functions (available-externally). I saw the post-link number is now very close to the offline number (dump the mismatched functions and count the metrics offline based on the entire profile), slightly smaller than the offline number due to some missing inlined functions. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D156725	2023-08-30 18:00:23 -07:00
wlei	3365cd4544	[CSSPGO] Compute checksum mismatch recursively on nested profile Follow-up diff for https://reviews.llvm.org/D158891. Compute the checksum mismatch based on the original nested profile. Additionally, use a recursive way to compute the children mismatched samples in the nested tree even the top-level func checksum is matched. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D158900	2023-08-30 18:00:23 -07:00
wlei	62a3f6c96e	[CSSPGO] Retire FlattenProfileForMatching - Always use flattened profile to find the profile anchors. Since profile under different contexts may have different inlined callsites, to get more profile anchors, we use a merged profile from all the contexts(the flattened profile) to find callsite anchors. - Compute the staleness metrics based on the original nested profile, as currently once a callsite is mismatched, all its children profile are dropped.(TODO: in future, we can improve to reuse the children valid profile) Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D158891	2023-08-30 18:00:23 -07:00
wlei	062af2e763	[CSSPGO] Support stale profile matching for LTO As in pre-link time, callsites could be optimized out by inlining, we don't have those original call targets in the IR in LTO time. Additionally, the inlined code doesn't actually belong to the original function, the IR locations or pseudo probe parsed from it are incorrect and could mislead the matching later. This change adds the support to extract the original IR location info from the inlined code, specifically, it makes sure to skip all the inlined code that doesn't belong to the original function, but before that, it processes the inline frames of the debug info to extract the base frame and recover its callsite and callee target(name). Measured on some stale profile instances, all showed some perf improvements. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D156722	2023-08-30 18:00:23 -07:00
wlei	148cceb0d6	[CSSPGO] Refactoring SampleProfileMatcher::runOnFunction - rename `IRLocation` --> `IRAnchors`, `ProfileLocation` --> `ProfileAnchors` - reorganize runOnFunction, factor out the code that finds IR anchors into `findIRAnchors` - introduce a new function `findProfileAnchors` to populate the profile related anchors, the result is saved into `ProfileAnchors`, it's later used for both mismatch report and matching, this avoids parsing `getBodySamples` and `getCallsiteSamples` multiple times. - move the `MatchedCallsiteLocs` code from `findIRAnchors` to `countProfileMismatches` so that all the staleness metrics reports are computed in one function. - move all matching related code into `runStaleProfileMatching`, and move all mismatch reporting into `countProfileMismatches` Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D158817	2023-08-30 18:00:23 -07:00
Philip Reames	aada8f2e54	[slp] Tweak debug costing output to include VL This makes it much easier to understand which vector length is being considered when the same set of nodes are evaluated at multiple vector lengths.	2023-08-30 09:13:19 -07:00
Florian Hahn	e544d9cc36	[VPlan] Remove unused VPBuilder::insert member (NFC).	2023-08-30 16:35:55 +01:00
Florian Hahn	cd9563ae17	[VPlan] Remove unused VPInstruction::clone member (NFC).	2023-08-30 15:53:39 +01:00
Mikhail Goncharov	74f4daef04	fix unused variable warnings in conditionals for 92023b15099012a657da07ebf49dd7d94a260f84	2023-08-30 14:36:42 +02:00
Florian Hahn	4a5bcbd560	[ConstraintElim] Store conditional facts as (Predicate, Op0, Op1). This allows to add facts even if no corresponding ICmp instruction exists in the IR. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D158837	2023-08-30 10:54:28 +01:00


34532 Commits