llvm-project

Author	SHA1	Message	Date
Florian Hahn	2ab5c47c87	[VPlan] Don't replace scalarizing recipe with VPWidenCastRecipe. Don't replace a scalarizing recipe with a VPWidenCastRecipe. This would introduce wide (vectorizing) recipes when interleaving only. Fixes https://github.com/llvm/llvm-project/issues/76986	2024-01-04 20:39:44 +00:00
Gabriel Baraldi	a87fa7f0ca	[InstCombine] Don't throw away noalias/alias scope metadata when inlining memcpys (#74805) This was found in Julia when we changed some operations from explicit loads + stores to memcpys. While applying it to both the src and the dest seems weird, that's what we do for normal TBAA.	2024-01-04 17:04:31 +01:00
Alexey Bataev	79e62315be	[SLP]Use revectorized value for extracts from buildvector, being vectorized. When trying to reuse the extractelement instruction emitted for the insertelement instruction, we need to check whether this insertelement instruction was vectorized. If so, we need to use the vectorized value, not the original insertelement.	2024-01-04 06:45:26 -08:00
Nikita Popov	62144969bc	[ConstraintElim] Add debug output for failed preconditions Print debug output if a constraint does not get added due to a failed precondition.	2024-01-04 14:29:07 +01:00
Nikita Popov	f812251875	[ConstraintElim] Use SCEV to check for multiples (#76925 ) When adding constraints for induction variables, if the step is not one, we need to make sure that (end-start) is a multiple of step, otherwise we might step over the end value. Currently this only supports one specific pattern for pointers, where the end is a gep of the start with an appropriate offset. Generalize this by using SCEV to check for multiples, which also makes this work for integer IVs.	2024-01-04 14:04:15 +01:00
Jannik Silvanus	7954c57124	[IR] Fix GEP offset computations for vector GEPs (#75448 ) Vectors are always bit-packed and don't respect the elements' alignment requirements. This is different from arrays. This means offsets of vector GEPs need to be computed differently than offsets of array GEPs. This PR fixes many places that rely on an incorrect pattern that always relies on `DL.getTypeAllocSize(GTI.getIndexedType())`. We replace these by usages of `GTI.getSequentialElementStride(DL)`, which is a new helper function added in this PR. This changes behavior for GEPs into vectors with element types for which the (bit) size and alloc size is different. This includes two cases: * Types with a bit size that is not a multiple of a byte, e.g. i1. GEPs into such vectors are questionable to begin with, as some elements are not even addressable. * Overaligned types, e.g. i16 with 32-bit alignment. Existing tests are unaffected, but a miscompilation of a new test is fixed. --------- Co-authored-by: Nikita Popov <github@npopov.com>	2024-01-04 10:08:21 +01:00
Nilanjana Basu	cd28da390f	[LV] Change loops' interleave count computation (#73766) A set of microbenchmarks in llvm-test-suite (https://github.com/llvm/llvm-test-suite/pull/56), when tested on an AArch64 platform, demonstrates that loop interleaving is beneficial when the vector loop runs at least twice or when the epilogue loop trip count (TC) is minimal. Therefore, we choose interleaving count (IC) between TC/VF & TC/2*VF (VF = vectorization factor), such that the remainder TC for the epilogue loop is minimal, preferring the larger IC when the remainder TC is the same for both. The initial tests for this change were submitted in PRs: https://github.com/llvm/llvm-project/pull/70272 and https://github.com/llvm/llvm-project/pull/74689.	2024-01-04 12:45:22 +05:30
Yingwei Zheng	0ce193708c	[InstCombine] Refactor folding of commutative binops over select/phi/minmax (#76692 ) This patch cleans up the duplicate code for folding commutative binops over `select/phi/minmax`. Related commits: + select support: `88cc35b27e` + phi support: `8674a023bc` + minmax support: `624973806c`	2024-01-04 15:11:28 +08:00
Florian Hahn	6dda74cc51	[VPlan] Use createSelect in adjustRecipesForReductions (NFCI). Simplify the code and rename Result->NewExitingVPV as suggested by @ayalz in https://github.com/llvm/llvm-project/pull/70253.	2024-01-03 20:54:10 +00:00
Alexey Bataev	7c963fde16	[SLP]Use revectorized value for extracts from buildvector, being vectorized. If the insertelement instruction is vectorized, and the extractelement instruction from that insertelement is also vectorized as part of the same tree, we need to extract from the vectorized value corresponding to the insertelement rather than from the original insertelement instruction.	2024-01-03 10:38:09 -08:00
Wei Wang	0faf46befa	[coroutines][DPValue] Update DILocation in DPValue for hoisted dbg.declare (#76765 ) Follow up #75402 to cover DPValue	2024-01-03 08:55:38 -08:00
Nikita Popov	c17af94b96	[ConstraintElim] Use SmallDenseMap (NFC) The number of variables in the constraint is usually very small. Use SmallDenseMap to avoid allocations.	2024-01-03 17:04:04 +01:00
Alexandros Lamprineas	ec7a231b30	[TLI] Use the VFABI demangling when declaring vector variants. (#76753 ) When creating a declaration for a vector variant, in order to determine the argument types we need to consult the VFABI demangler. This will allow us to add TLI mappings with linear arguments (see #76060).	2024-01-03 14:28:52 +00:00
Quentin Dian	7d81e07271	[SimplifyCFG] When only one case value is missing, replace default with that case (#76669) When the default branch is the last case, we can transform that branch into a concrete branch with an unreachable default branch. ```llvm target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" define i64 @src(i64 %0) { %2 = urem i64 %0, 4 switch i64 %2, label %5 [ i64 1, label %3 i64 2, label %3 i64 3, label %4 ] 3: ; preds = %1, %1 br label %5 4: ; preds = %1 br label %5 5: ; preds = %1, %4, %3 %.0 = phi i64 [ 2, %4 ], [ 1, %3 ], [ 0, %1 ] ret i64 %.0 } define i64 @tgt(i64 %0) { %2 = urem i64 %0, 4 switch i64 %2, label %unreachable [ i64 0, label %5 i64 1, label %3 i64 2, label %3 i64 3, label %4 ] unreachable: ; preds = %1 unreachable 3: ; preds = %1, %1 br label %5 4: ; preds = %1 br label %5 5: ; preds = %1, %4, %3 %.0 = phi i64 [ 2, %4 ], [ 1, %3 ], [ 0, %1 ] ret i64 %.0 } ``` Alive2: https://alive2.llvm.org/ce/z/Y-PGXv After transform to a lookup table, I believe `tgt` is better code. The final instructions are as follows: ```asm src: # @src and edi, 3 lea rax, [rdi - 1] cmp rax, 2 ja .LBB0_1 mov rax, qword ptr [8*rdi + .Lswitch.table.src-8] ret .LBB0_1: xor eax, eax ret tgt: # @tgt and edi, 3 mov rax, qword ptr [8*rdi + .Lswitch.table.tgt] ret .Lswitch.table.src: .quad 1 # 0x1 .quad 1 # 0x1 .quad 2 # 0x2 .Lswitch.table.tgt: .quad 0 # 0x0 .quad 1 # 0x1 .quad 1 # 0x1 .quad 2 # 0x2 ``` Godbolt: https://llvm.godbolt.org/z/borME8znd Closes #73446.	2024-01-03 09:22:13 +08:00
Florian Hahn	3c127e83c0	[ConstraintElim] Replace NUWSub decomp with recursive decomp of ops. The current patterns for NUWSub decompositions do not handle negative constants correctly at the moment (causing #76713). Replace the incorrect pattern by more general code that recursively decomposes the operands and then combines the results. This is already done in most other places that handle operators like add/mul. This means we fall back to the general constant handling code (fixes the mis-compile) while also being able to support reasoning about decomposable expressions in the SUB operands. Fixes https://github.com/llvm/llvm-project/issues/76713.	2024-01-02 22:05:57 +00:00
Alexander Shaposhnikov	3af59cfe0b	[ConstraintElim] Add facts implied by llvm.abs (#73189 ) Add "abs(x) >=s x" fact. https://alive2.llvm.org/ce/z/gOrrU3 Test plan: ninja check-all	2024-01-02 11:00:03 -08:00
Alexandros Lamprineas	e512df3ecc	[LV] Fix crash when vectorizing function calls with linear args. (#76274) llvm/lib/IR/Type.cpp:694: Assertion `isValidElementType(ElementType) && "Element type of a VectorType must be an integer, floating point, or pointer type."' failed. Stack dump: llvm::FixedVectorType::get(llvm::Type*, unsigned int) llvm::VPWidenCallRecipe::execute(llvm::VPTransformState&) llvm::VPBasicBlock::execute(llvm::VPTransformState*) llvm::VPRegionBlock::execute(llvm::VPTransformState*) llvm::VPlan::execute(llvm::VPTransformState*) ... Happens with function calls of void return type.	2024-01-02 18:14:16 +00:00
Wei Wang	9c978c9418	[coroutines] Use DILocation from new storage for hoisted dbg.declare (#75402) Make the hoisted dbg.declare inherit the DILocation scope from the new storage. After hoisting, the dbg.declare is moved into the block that defines the new storage. This could create an inconsistency in the debug location scope hierarchy where the scope of the hoisted dbg.declare (i.e. DILexicalBlock) is enclosed within the scope of the block (i.e. DISubprogram). This misleads the LiveDebugValues pass into thinking that the hoisted dbg.declare is killed in that block, so it does not generate DBG_VALUE in other blocks. The debugger won't be able to track its value anymore. We do this for unoptimized binaries only.	2024-01-02 09:54:16 -08:00
Nikita Popov	9d5b0965c4	[InstCombine] Add helper for commutative icmp folds (NFCI) Add a common place for icmp folds that should be tried with both operand orders, so we don't have to repeat this pattern for individual folds.	2024-01-02 16:16:32 +01:00
Enna1	9943d33997	[SLP][NFC] Fix assertion in vectorizeGEPIndices() (#76660 ) The index constraints for the collected getelementptr instructions should be single and non-constant.	2024-01-02 21:32:18 +08:00
Yingwei Zheng	7e405eb722	[FuncAttrs] Don't infer `noundef` for functions with `sanitize_memory` attribute (#76691 ) MemorySanitizer assumes that the definition and declaration of a function will be consistent. If we add `noundef` for some definitions, it will break msan. Fix buildbot failure caused by #76553.	2024-01-02 06:59:56 +08:00
Florian Hahn	f18536d642	[VPlan] Model address separately. (#72164) Move vector pointer generation to a separate VPVectorPointerRecipe. This untangles address computation from the memory recipes and is also needed to enable explicit unrolling in VPlan. https://github.com/llvm/llvm-project/pull/72164	2024-01-01 19:51:15 +00:00
hstk30-hw	4b2f1184fc	Skip transformConstExprCastCall for naked function (#76496) Fixes https://github.com/llvm/llvm-project/issues/72843. For a naked function, the assembly might use an argument or otherwise rely on the frame layout, so don't apply transformConstExprCastCall.	2024-01-01 22:52:13 +08:00
Yingwei Zheng	949ec83eaf	[InstCombine] Relax the same-underlying-object constraint for the GEP canonicalization (#76583 ) `7d7001b2cb` canonicalizes `(gep i8, X, (ptrtoint Y) - (ptrtoint X))` into `bitcast Y` iff `X` and `Y` have the same underlying object. I find that the result of this pattern is usually used as an operand of an icmp in some real-world applications. I think we can do the canonicalization if the result is only used by icmps/ptrtoints. Alive2: https://alive2.llvm.org/ce/z/j4-HJZ	2024-01-01 00:35:42 +08:00
Florian Hahn	f248d5eed1	[Local] Bring back check for FP types in getExpressionForConstant. The check makes sure that the result for getZExtValue is guaranteed to fit into 64 bit.	2023-12-31 13:50:25 +00:00
Florian Hahn	b46638dc76	[Local] Handle undef FP constant in getExpressionForConstant. Check for FP constant instead of checking for floating point types, as Undef/Poison values can have floating point types while not being FPConstants. This fixes a crash introduced by #66745 (f3b20cb).	2023-12-31 13:42:47 +00:00
Yingwei Zheng	1228becf7d	[FuncAttrs] Deduce `noundef` attributes for return values (#76553 ) This patch deduces `noundef` attributes for return values. IIUC, a function returns `noundef` values iff all of its return values are guaranteed not to be `undef` or `poison`. Definition of `noundef` from LangRef: ``` noundef This attribute applies to parameters and return values. If the value representation contains any undefined or poison bits, the behavior is undefined. Note that this does not refer to padding introduced by the type’s storage representation. ``` Alive2: https://alive2.llvm.org/ce/z/g8Eis6 Compile-time impact: http://llvm-compile-time-tracker.com/compare.php?from=30dcc33c4ea3ab50397a7adbe85fe977d4a400bd&to=c5e8738d4bfbf1e97e3f455fded90b791f223d74&stat=instructions:u \|stage1-O3\|stage1-ReleaseThinLTO\|stage1-ReleaseLTO-g\|stage1-O0-g\|stage2-O3\|stage2-O0-g\|stage2-clang\| \|--\|--\|--\|--\|--\|--\|--\| \|+0.01%\|+0.01%\|-0.01%\|+0.01%\|+0.03%\|-0.04%\|+0.01%\| The motivation of this patch is to reduce the number of `freeze` insts and enable more optimizations.	2023-12-31 20:44:48 +08:00
Jie Fu	bf312263bf	[InstCombine] Remove unused variables in InstCombineSelect.cpp (NFC) llvm-project/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp:3810:14: error: unused variable 'LHS' [-Werror,-Wunused-variable] 3810 \| Value *LHS, *RHS; \| ^~~ llvm-project/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp:3810:20: error: unused variable 'RHS' [-Werror,-Wunused-variable] 3810 \| Value *LHS, *RHS; \|	2023-12-31 18:40:26 +08:00
Yingwei Zheng	b23f59a646	[InstCombine] Fold `select (A &/\| B), T, F` if `select B, T, F` is foldable (#76621) This patch does the following folds: ``` (select A && B, T, F) -> (select A, (select B, T, F), F) (select A \|\| B, T, F) -> (select A, T, (select B, T, F)) ``` if `(select B, T, F)` can be folded into a value or a canonicalized SPF. Alive2: https://alive2.llvm.org/ce/z/4Bdrbu The original motivation of this patch is to simplify the following pattern: ``` %.sroa.speculated.i = tail call i64 @llvm.umax.i64(i64 %sub.ptr.div.i.i, i64 1) %add.i = add i64 %.sroa.speculated.i, %sub.ptr.div.i.i %cmp7.i = icmp ult i64 %add.i, %sub.ptr.div.i.i %cmp9.i = icmp ugt i64 %add.i, 1152921504606846975 %or.cond.i = or i1 %cmp7.i, %cmp9.i %cond.i = select i1 %or.cond.i, i64 1152921504606846975, i64 %add.i -> %.sroa.speculated.i = tail call i64 @llvm.umax.i64(i64 %sub.ptr.div.i.i, i64 1) %add.i = add i64 %.sroa.speculated.i, %sub.ptr.div.i.i %cmp7.i = icmp ult i64 %add.i, %sub.ptr.div.i.i %max = call i64 @llvm.umax.i64(i64 %add.i, 1152921504606846975) %cond.i = select i1 %cmp7.i, i64 1152921504606846975, i64 %max ``` The latter form has better codegen for some backends. It is also more analysis-friendly than the original one. Godbolt: https://godbolt.org/z/eK6eb5jf1 Alive2: https://alive2.llvm.org/ce/z/VHlxL2 Compile-time impact: http://llvm-compile-time-tracker.com/compare.php?from=7c71d3996a72b9b024622f23bf556539b961c88c&to=638ce8666fadaca1ab2639a3c2bc52a4a8508f40&stat=instructions:u \|stage1-O3\|stage1-ReleaseThinLTO\|stage1-ReleaseLTO-g\|stage1-O0-g\|stage2-O3\|stage2-O0-g\|stage2-clang\| \|--\|--\|--\|--\|--\|--\|--\| \|+0.02%\|-0.00%\|+0.02%\|-0.03%\|-0.00%\|-0.05%\|-0.00%\| It is an alternative to #76203 and #76363 because we can simplify `select (icmp eq/ne a, b), a, b` into `b` or `a`. Fixes #75784. Fixes #76043. Thanks to @XChy for providing additional tests. Co-authored-by: XChy <xxs_chy@outlook.com>	2023-12-31 18:28:48 +08:00
Yingwei Zheng	568db84247	[InstCombine] Refactor `canonicalizeSPF` to support decomposed select. NFC. See also https://github.com/llvm/llvm-project/pull/76621	2023-12-31 16:30:24 +08:00
Mikhail Gudim	7a581c34f1	Reland "[InstCombine] Extend `foldICmpBinOp` to `add`-like `or`" (#76531 ) The original PR had a typo which was causing a bug.	2023-12-30 01:55:07 -05:00
Enna1	a51c2f39f5	[SLP] no need to generate extract for in-tree uses for original scalar instruction. (#76077) Before `77a609b556`, we always skipped in-tree uses of the vectorized scalars in `buildExternalUses()`. That commit handles the case where the in-tree use is a scalar operand in a vectorized instruction, for which we need to generate an extract. In-tree uses that remain scalar in vectorized instructions fall into 3 cases: - The pointer operand of vectorized LoadInst uses an in-tree scalar - The pointer operand of vectorized StoreInst uses an in-tree scalar - The scalar argument of vector form intrinsic uses an in-tree scalar Generating extracts for in-tree uses in vectorized instructions is implemented in `BoUpSLP::vectorizeTree()`: - https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp#L11497-L11506 - https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp#L11542-L11551 - https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp#L11657-L11667 However, `77a609b556` not only generates extracts for vectorized instructions, but also generates extracts for the original scalar instructions. There is no need to generate extracts for the original scalar instructions, as these scalar instructions will be replaced by vector instructions and erased later. When building external uses, this patch marks in-tree scalars that remain scalar in vectorized instructions as having no exact user. In this case all uses of the scalar are automatically replaced by extractelement. The patch also removes the extracts at - https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp#L11497-L11506 - https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp#L11542-L11551 - https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp#L11657-L11667.	2023-12-30 10:45:26 +08:00
Yingwei Zheng	90802e652d	[InstCombine] Handle commuted cases of the fold `((B\|C)&A)\|B -> B\|(A&C)` (#76565 ) Alive2: https://alive2.llvm.org/ce/z/Qdsqk6 The commit `f1eda23514` didn't handle other cases that commute operands.	2023-12-29 23:58:58 +08:00
XChy	dafd17895f	[InstCombine][NFC] Format code in foldCmpLoadFromIndexedGlobal	2023-12-29 17:42:38 +08:00
Yingwei Zheng	2128fca6c1	[InstCombine] Canonicalize `gep T* X, V / sizeof(T)` to `gep i8* X, V` (#76458) This patch canonicalizes `gep T* X, V / sizeof(T)` to `gep i8* X, V`. Alive2: https://alive2.llvm.org/ce/z/7XGjiB As this pattern is already handled by the backends, the motivation of this patch is to reduce the ref count of sdiv, which will enable more optimizations.	2023-12-29 11:30:00 +08:00
Florian Hahn	516cc98aff	[LV] Fix typo in comment (NFC).	2023-12-28 21:20:10 +00:00
Alexey Bataev	5096501082	[SLP][TTI][X86]Add addsub pattern cost estimation. (#76461) SLP/TTI do not know about the cost estimation for the addsub pattern supported by X86. Support for pattern detection was added previously (see TTI::isLegalAltInstr), but the cost was still not estimated properly.	2023-12-28 05:04:04 -08:00
Yingwei Zheng	7a1a476116	[InstCombine] Fold `(X & C1) \| C2` into `X & (C1 \| C2)` iff `(X & C2) == C2` (#76470 ) Alive2: https://alive2.llvm.org/ce/z/VKJYaS	2023-12-28 20:47:40 +08:00
Wei Tao	a700298b3d	[CanonicalizeFreezeInLoops] fix duplicate removal (#74716) This PR fixes #74572, where the freeze instruction could be found twice by the CanonicalizeFreezeInLoops pass, and compilation could then crash on the second removal because the instruction was already gone.	2023-12-28 09:47:31 +01:00
Douglas Yung	fb981e6b4b	Revert "[SLP][TTI][X86]Add addsub pattern cost estimation. (#76461 )" This reverts commit bc8c4bbd7973ab9527a78a20000aecde9bed652d. Change is failing to build on several bots: - https://lab.llvm.org/buildbot/#/builders/127/builds/60184 - https://lab.llvm.org/buildbot/#/builders/123/builds/23709 - https://lab.llvm.org/buildbot/#/builders/216/builds/32302	2023-12-27 23:52:04 -08:00
Alexey Bataev	bc8c4bbd79	[SLP][TTI][X86]Add addsub pattern cost estimation. (#76461) SLP/TTI do not know about the cost estimation for the addsub pattern supported by X86. Support for pattern detection was added previously (see TTI::isLegalAltInstr), but the cost was still not estimated properly.	2023-12-27 15:57:21 -05:00
Craig Topper	7f1c8fc25a	[InstCombine] Use ConstantInt::getSigned to sign extend -2 for large types. (#76464) Using ConstantInt::get will zero extend. Fixes #76441	2023-12-27 12:27:12 -08:00
Yingwei Zheng	aacff347af	[InstCombine] Simplify `icmp pred (sdiv exact X, C), (sdiv exact Y, C)` into `icmp pred X, Y` when C is positive (#76409 ) Alive2: https://alive2.llvm.org/ce/z/u49dQ9 It will improve the codegen of `std::_Vector_base<T>::~_Vector_base()` when `sizeof(T)` is not a power of 2. NOTE: We can also fold `icmp signed-pred (sdiv exact X, C), (sdiv exact Y, C)` into `icmp signed-pred (sdiv exact Y, C), (sdiv exact X, C)` when C is negative. But I don't think it enables more optimizations for real-world applications.	2023-12-27 06:06:16 +08:00
Yingwei Zheng	4358e6e0c5	[FuncAttrs] Infer `norecurse` for funcs with calls to `nocallback` callees (#76372) This patch adds missing `norecurse` attrs to funcs that only call intrinsics with `nocallback` attrs. Fixes the regression found in https://github.com/dtcxzyw/llvm-opt-benchmark/pull/45#discussion_r1436148743. The function loses `norecurse` attr because it calls `@llvm.fabs.f64`, which is not marked as `norecurse`. Since `norecurse` is not a default attribute of intrinsics and it is ambiguous for intrinsics, I decided to use the existing `nocallback` attributes. > nocallback This attribute indicates that the function is only allowed to jump back into caller’s module by a return or an exception, and is not allowed to jump back by invoking a callback function, a direct, possibly transitive, external function call, use of longjmp, or other means. It is a compiler hint that is used at module level to improve dataflow analysis, dropped during linking, and has no effect on functions defined in the current module. See also https://llvm.org/docs/LangRef.html#function-attributes.	2023-12-27 03:16:43 +08:00
Yingwei Zheng	ff76627aeb	[InstCombine] Fix type mismatch between cond and value in `foldSelectToCopysign` (#76343 ) This patch fixes the miscompilation when we try to bitcast a floating point vector into an integer scalar.	2023-12-26 00:04:06 +08:00
Yingwei Zheng	0d454d6e59	[InstCombine] Fold xor of icmps using range information (#76334 ) This patch folds xor of icmps into a single comparison using range-based reasoning as `foldAndOrOfICmpsUsingRanges` does. Fixes #70928.	2023-12-25 07:14:31 +08:00
Craig Topper	d8ddcae547	[LSR] Fix typo in debug message where backspace escape was used instead of new line.	2023-12-24 10:35:27 -08:00
Benjamin Kramer	9423e45987	[ProfileData] Copy CallTargetMaps a bit less. NFCI	2023-12-24 17:48:18 +01:00
Kazu Hirata	1daf2994de	[llvm] Use StringRef::contains (NFC)	2023-12-23 22:21:52 -08:00
Florian Hahn	fbcf8a8cbb	[ConstraintElim] Add (UGE, var, 0) to unsigned system for new vars. (#76262) The constraint system used for ConstraintElimination assumes all variables to be signed. This can cause missed optimizations in the unsigned system, due to missing the information that all variables are unsigned (non-negative). Variables can be marked as non-negative by adding Var >= 0 for all variables. This is done for arguments on ConstraintInfo construction and after adding new variables. This handles cases like the ones outlined in https://discourse.llvm.org/t/why-does-llvm-not-perform-range-analysis-on-integer-values/74341 The original example shared above is now handled without this change, but adding another variable means that instcombine won't be able to simplify examples like https://godbolt.org/z/hTnra7zdY Adding the extra variables comes with a slight compile-time increase https://llvm-compile-time-tracker.com/compare.php?from=7568b36a2bc1a1e496ec29246966ffdfc3a8b87f&to=641a47f0acce7755e340447386013a2e086f03d9&stat=instructions:u stage1-O3 stage1-ReleaseThinLTO stage1-ReleaseLTO-g stage1-O0-g +0.04% +0.07% +0.05% +0.02% stage2-O3 stage2-O0-g stage2-clang +0.05% +0.05% +0.05% https://github.com/llvm/llvm-project/pull/76262	2023-12-23 15:53:48 +01:00
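
Two of the InstCombine entries in this log state plain bitwise identities: `(X & C1) | C2` into `X & (C1 | C2)` iff `(X & C2) == C2` (7a1a476116), and `((B|C)&A)|B -> B|(A&C)` with commuted operands (90802e652d). As a rough sanity check, they can be verified exhaustively on a small bit width. The sketch below is Python, not LLVM code; the function names and the 4-bit width are assumptions chosen for illustration, and the commuted operand orders listed are examples rather than the exact set the patch handles. The Alive2 links in the commit messages remain the authoritative proofs.

```python
from itertools import product

BITS = 4  # illustrative width; the identities are bitwise, so a small width is enough
VALUES = range(1 << BITS)


def and_or_fold_holds(x, c1, c2):
    """Check (X & C1) | C2 == X & (C1 | C2) when (X & C2) == C2."""
    if (x & c2) != c2:
        return True  # precondition not met, so the fold does not apply
    return ((x & c1) | c2) == (x & (c1 | c2))


def commuted_or_fold_holds(a, b, c):
    """Check ((B | C) & A) | B == B | (A & C) for several operand orders."""
    expected = b | (a & c)
    variants = [
        ((b | c) & a) | b,
        ((c | b) & a) | b,
        (a & (b | c)) | b,
        b | (a & (c | b)),
    ]
    return all(v == expected for v in variants)


def check_all():
    """Run both checks over every triple of BITS-wide values."""
    triples = list(product(VALUES, repeat=3))
    return (all(and_or_fold_holds(*t) for t in triples)
            and all(commuted_or_fold_holds(*t) for t in triples))


if __name__ == "__main__":
    print("both folds hold for all inputs:", check_all())
```

The precondition on the first fold matters: with X = 0, C1 = 0, C2 = 1 the left side evaluates to 1 and the right side to 0, which is why the check skips inputs where `(X & C2) != C2`.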

35448 Commits