27637 Commits

Author SHA1 Message Date
Florian Hahn
fbcf8a8cbb
[ConstraintElim] Add (UGE, var, 0) to unsigned system for new vars. (#76262)
The constraint system used for ConstraintElimination assumes all
varibles to be signed. This can cause missed optimization in the
unsigned system, due to missing the information that all variables are
unsigned (non-negative).

Variables can be marked as non-negative by adding Var >= 0 for all
variables. This is done for arguments on ConstraintInfo construction and
after adding new variables. This handles cases like the ones outlined in
https://discourse.llvm.org/t/why-does-llvm-not-perform-range-analysis-on-integer-values/74341

The original example shared above is now handled without this change,
but adding another variable means that instcombine won't be able to
simplify examples like https://godbolt.org/z/hTnra7zdY

Adding the extra variables comes with a slight compile-time increase
https://llvm-compile-time-tracker.com/compare.php?from=7568b36a2bc1a1e496ec29246966ffdfc3a8b87f&to=641a47f0acce7755e340447386013a2e086f03d9&stat=instructions:u

stage1-O3    stage1-ReleaseThinLTO    stage1-ReleaseLTO-g  stage1-O0-g
 +0.04%           +0.07%                   +0.05%           +0.02%
stage2-O3    stage2-O0-g    stage2-clang
  +0.05%         +0.05%        +0.05%

https://github.com/llvm/llvm-project/pull/76262
2023-12-23 15:53:48 +01:00
Yingwei Zheng
345d7b1618
[InstCombine] Fold minmax intrinsic using KnownBits information (#76242)
This patch tries to fold minmax intrinsic by using
`computeConstantRangeIncludingKnownBits`.
Fixes regression in
[_karatsuba_rec:cpython/Modules/_decimal/libmpdec/mpdecimal.c](c31943af16/Modules/_decimal/libmpdec/mpdecimal.c (L5460-L5462)),
which was introduced by #71396.
See also
https://github.com/dtcxzyw/llvm-opt-benchmark/issues/16#issuecomment-1865875756.

Alive2 for splat vectors with undef: https://alive2.llvm.org/ce/z/J8hKWd
2023-12-23 04:41:32 +08:00
Florian Hahn
e9a56ab316
[PhaseOrdering] Add test with removable chained conditions.
Based on https://godbolt.org/z/hTnra7zdY, which is a slightly more
complicated version of the example from
https://discourse.llvm.org/t/why-does-llvm-not-perform-range-analysis-on-integer-values/74341
2023-12-22 19:44:20 +00:00
Alexandros Lamprineas
6c2ad8ac7b
[TLI][NFC] Autogenerate vectorized call tests for SLEEF/ArmPL. (#76146)
This patch prepares the ground for #76060.

* Unifies ArmPL and SLEEF tests for better coverage
* Replaces deprecated float* and double* types with ptr
* Adds noalias attribute to pointer arguments
* Adds some cmd-line options to the RUN lines to simplify output
* Removes datalayout since target triple is provided
* Removes checks for return statements
* Refactors the regex filter for autogenerated checks
* Removes redundant test file suffix (already under the AArch64 dir)
2023-12-22 16:29:18 +00:00
Nikita Popov
658b260dbf [Attributor] Don't construct pretty GEPs
Bring this in line with other transforms like ArgPromotion/SROA/
SCEVExpander and always produce canonical i8 GEPs.
2023-12-22 16:48:13 +01:00
Nikita Popov
c16559137c [IndVars] Avoid unnecessary truncate for zext nneg use
When performing sext IV widening, if one of the narrow uses is in
a zext nneg, we can treat it like an sext and avoid the insertion
of a trunc.
2023-12-22 11:30:17 +01:00
Nikita Popov
54067c5fbe [SROA] Use memcpy if type size does not match store size
The original memcpy also copies the padding, so make sure that
this is still the case after splitting.

Fixes https://github.com/llvm/llvm-project/issues/64081.
2023-12-22 10:19:22 +01:00
Nikita Popov
3f199cb14c [SROA] Add test for #64081 (NFC) 2023-12-22 10:19:21 +01:00
Shan Huang
06a9c6738a
[CVP] Fix #76058: missing debug location in processSDiv function (#76118)
This PR fixes #76058.
2023-12-22 09:26:32 +01:00
Mikhail Gudim
411cba215a
Revert "[InstCombine] Extend foldICmpBinOp to add-like or. (#71… (#76167)
…396)"

This reverts commit 8773c9be3d9868288f1f46957945d50ff58e4e91.
2023-12-21 11:41:09 -05:00
Nikita Popov
a134abf4be
[ValueTracking] Make isGuaranteedNotToBeUndef() more precise (#76160)
Currently isGuaranteedNotToBeUndef() is the same as
isGuaranteedNotToBeUndefOrPoison(). This function is used in places
where we only care about undef (due to multi-use issues), not poison.

Make it more precise by only considering instructions that can create
undef (like loads or call), and ignore those that can only create
poison. In particular, we can ignore poison-generating flags.

This means that inferring more flags has less chance to pessimize other
transforms.
2023-12-21 16:49:37 +01:00
Nikita Popov
b8df88b41c [InstCombine] Support zext nneg in gep of sext add fold
Add m_NNegZext() and m_SExtLike() matchers to make doing these kinds
of changes simpler in the future.
2023-12-21 16:38:09 +01:00
Nikita Popov
4d7112435e [InstCombine] Add zext nneg test variant for gep of sext add fold (NFC) 2023-12-21 16:38:09 +01:00
Chia
8674a023bc
[InstCombine] fold (Binop phi(a, b) phi(b, a)) -> (Binop a, b) while Binop is commutative. (#75765)
Alive2 proof: https://alive2.llvm.org/ce/z/2P8gq-
This patch closes #73905
2023-12-21 22:47:21 +08:00
Nikita Popov
38c1ff89ee [CVP] Add additional tests for undef check (NFC) 2023-12-21 15:40:14 +01:00
Z572
e6d2bb0ed8
[InstCombine] Simplifiy (-x * y * -x) into (x * y * x) (#72953)
fix https://github.com/llvm/llvm-project/issues/72259
proof: https://alive2.llvm.org/ce/z/HsrmTC
2023-12-21 19:13:09 +08:00
Florian Hahn
68fb3d596e
[ConstraintElim] Add test with select where the second op cant be poison.
Extra test for TODO from #75750.
2023-12-21 09:13:33 +00:00
Mikhail Gudim
8773c9be3d
[InstCombine] Extend foldICmpBinOp to add-like or. (#71396)
InstCombine canonicalizes `add` to `or` when possible, but this makes
some optimizations applicable to `add` to be missed because they don't
realize that the `or` is equivalent to `add`.

In this patch we generalize `foldICmpBinOp` to handle such cases.
2023-12-20 17:28:57 -05:00
Florian Hahn
18170d0f28
[ConstraintElim] Extend AND implication logic to support OR as well. (#76044)
Extend the logic check if an operand of an AND is implied by the other
to also support OR. This is done by checking if !op1 implies op2 or vice
versa.
2023-12-20 18:13:41 +01:00
Nikita Popov
8b8f2ef06e [MergeFunc] Fix comparison of constant expressions
Functions using different constant expressions were incorrectly
merged, because a lot of state was missing from the comparison,
including the opcode, the comparison predicate, the GEP element
type, as well as the inbounds, inrange and nowrap poison flags.
2023-12-20 15:59:02 +01:00
Alexey Bataev
a13148a880 [SLP]Fix PR75995: drop wrapping flags for resized wrapped binops.
If decided to resize the instruction, need to drop wrapping flags from
the resulting vector instructions to avoid incorrect
optimizations/assumptions later.
Fixes PR75995.
2023-12-20 06:51:39 -08:00
Alexey Bataev
8abf8c948c [SLP][NFC]Add a test with incorrect wrapping flags in the binops with
minbitwidth types.
2023-12-20 06:27:01 -08:00
bipmis
64987c648f
[ValueTracking] isNonZero sub of ptr2int's with recursive GEP (#68680)
When the sub arguments are ptr2int it is not possible to determine
computeKnownBits() of its arguments.
For scalar case generally sub of 2 ptr2int are converted to sub of
indexes.
However a loop with recursive GEP/PHI where the arguments to sub is of
type ptr2int, if it is possible to determine that a sub of this GEP and
another pointer with the same base is KnownNonZero we can return this.
This helps subsequent passes to optimize the loop further.
2023-12-20 14:11:58 +00:00
Nikita Popov
836e71a425 [MergeFunc] Adjust GEP indices in test (NFC)
Otherwise inbounds will be inferred, and we don't actually end
up testing the case of one gep without inbounds and one with.
2023-12-20 15:08:13 +01:00
Nikita Popov
3dd2db08a2 [MergeFunc] Add another test for incorrect constexpr merging (NFC)
Looks like we don't even check the opcode :(
2023-12-20 14:53:25 +01:00
Nikita Popov
1ff9fb78c8 [MergeFunc] Add tests for incorrect const expr merging (NFC) 2023-12-20 14:42:21 +01:00
Florian Hahn
7cf499c63b
[ConstraintElim] Check if second op implies first for And. (#75750)
Generalize checkAndSecondOpImpliedByFirst to also check if the second
operand implies the first.
2023-12-20 11:58:35 +01:00
Nikita Popov
273a0c9c07 [PhaseOrdering] Add data layout to test (NFC)
Needed for switch to lookup table optimization.
2023-12-20 11:49:34 +01:00
Nikita Popov
5ab5810054 [PhaseOrdering] Add additional test for switch with GEPs (NFC) 2023-12-20 11:41:46 +01:00
Nikita Popov
9d60e95bcd
[AMDGPU] Use poison instead of undef for non-demanded elements (#75914)
Return poison instead of undef for non-demanded lanes in the AMDGPU
demanded element simplification hook.

Also bail out of dmask is 0, as this case has special semantics:

> If DMASK==0, the TA overrides DMASK=1 and puts zeros in VGPR followed by
> LWE status if exists. TFE status is not generated since the fetch is dropped.
2023-12-20 11:01:59 +01:00
Paschalis Mpeis
2349731992
[TLI] Add SLEEFGNUABI mappings for fmod/fmodf fixed-width. (#75803)
Cleanup test sleef-calls-aarch64.ll:
- make the util update script's regex more clear
- eliminate scalar epilogues in tests
2023-12-20 09:08:17 +00:00
Mingming Liu
92f17714e8
[test]For ThinLTO icp test, require llvm-64-bit given the raw profile data is generated on 64-bit systems (#76005) 2023-12-19 20:11:39 -08:00
Mingming Liu
bdd76e691f
[test] Restrict thinlto icp IR test to little endian systems, and the compiler-rt test to three tested platforms. (#76001)
- The IR test failed to import indirect callees on big-endian systems.
The raw profiles are generated on little-endian systems. Going to
require little-endian.
- Limit the compiler-rt test to three tested platforms.
2023-12-19 19:55:27 -08:00
Mingming Liu
85525f8fb6
[test]Mark thinlto icp test as unsupported on powerpc (#75979)
Test failed on ppc
(https://lab.llvm.org/buildbot/#/builders/231/builds/18902), and logs
shows missed import.

Cannot reproduce this with machines I could access so far.
https://gcc.gnu.org/wiki/CompileFarm seems to provide ppc64 machine.
Mark the thinlto icp test as unsupported for now.
2023-12-19 14:51:18 -08:00
Mingming Liu
78a195e100
Reland the reland "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles. " (#75954)
Simplify the compiler-rt test to make it more general for different
platforms, and use `*DAG` matchers for lines that may be emitted
out-of-order.
- The compiler-rt test passed on a Windows machine. Previously name
matchers don't work for MSVC mangling
(https://lab.llvm.org/buildbot/#/builders/127/builds/59907)
- `*DAG` matchers fixed the error in
https://lab.llvm.org/buildbot/#/builders/94/builds/17924

This is the second reland and fixed errors caught in first reland
(https://github.com/llvm/llvm-project/pull/75860)

**Original commit message**
Commit fe05193 (phab D156569), IRPGO names uses format
`[<filepath>;]<linkage-name>` while prior format is
`[<filepath>:<mangled-name>`. The format change would break the use case
demonstrated in (updated)
`llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` and
`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`

This patch changes `GlobalValues::getGlobalIdentifer` to use the
semicolon.

To elaborate on the scenario how things break without this PR
1. IRPGO raw profiles stores (compressed) IRPGO names of functions in
one section, and per-function profile data in another section. The
[NameRef](fc715e4cd9/compiler-rt/include/profile/InstrProfData.inc (L72))
field in per-function profile data is the MD5 hash of IRPGO names.
2. When raw profiles are converted to indexed format profiles, the
profiled address is
[mapped](fc715e4cd9/llvm/lib/ProfileData/InstrProf.cpp (L876-L885))
to the MD5 hash of the callee.
3. In `pgo-instr-use` thin-lto prelink pipeline, MD5 hash of IRPGO names
will be
[annotated](fc715e4cd9/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (L1707))
as value profiles, and used to import indirect-call-prom candidates. If
the annotated MD5 hash is computed from the new format while import uses
the prior format, the callee cannot be imported.

*
`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`
is added to have an end-to-end test.
* `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll`
is updated to have better test coverage from another aspect (as runtime
tests are more sensitive to the environment and may be skipped by some
contributors)
2023-12-19 12:25:56 -08:00
Nikita Popov
92fc4b482f [InstCombine] Preserve poison in bitcast of insertelement fold
If the base was poison, retain the poison value.
2023-12-19 13:06:04 +01:00
Nikita Popov
f412b78ffc [InstCombine] Return poison if all lanes are poison 2023-12-19 12:43:23 +01:00
Nikita Popov
e879f44b28 [InstCombine] Regenerate test checks (NFC) 2023-12-19 12:29:22 +01:00
Nikita Popov
9d4557920f [InstCombine] Don't treat undef as poison in demanded element simplification
We can only set PoisonElts if the element is poison, not if it is
undef.
2023-12-19 12:26:48 +01:00
Eric Biggers
09058654f6
[RISCV] Remove experimental from Vector Crypto extensions (#74213)
The RISC-V vector crypto extensions have been ratified. This patch
updates the Clang and LLVM support for these extensions to be
non-experimental, while leaving the C intrinsics as experimental since
the C intrinsics are not yet standardized.

Co-authored-by: Brandon Wu <brandon.wu@sifive.com>
2023-12-18 22:04:22 -08:00
Mingming Liu
6ce23ea0ab
Revert "Reland "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles. "" (#75888)
Reverts llvm/llvm-project#75860
- Mangled name mismatch on Windows
(https://lab.llvm.org/buildbot/#/builders/127/builds/59907/steps/8/logs/stdio)
2023-12-18 19:31:18 -08:00
Mingming Liu
c5871712ae
Reland "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles. " (#75860)
Fixed build-bot failures caught by post-submit tests
1) Add the list of command line tools needed by new compiler-rt test into dependency.
2) Use `starts_with` to replace deprecated `startswith`.

**Original commit message**
Commit fe05193 (phab D156569), IRPGO names uses format
`[<filepath>;]<linkage-name>` while prior format is
`[<filepath>:<mangled-name>`. The format change would break the use case
demonstrated in (updated)
`llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` and
`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`

This patch changes `GlobalValues::getGlobalIdentifer` to use the
semicolon.

To elaborate on the scenario how things break without this PR
1. IRPGO raw profiles stores (compressed) IRPGO names of functions in
one section, and per-function profile data in another section. The
[NameRef](fc715e4cd9/compiler-rt/include/profile/InstrProfData.inc (L72))
field in per-function profile data is the MD5 hash of IRPGO names.
2. When raw profiles are converted to indexed format profiles, the
profiled address is
[mapped](fc715e4cd9/llvm/lib/ProfileData/InstrProf.cpp (L876-L885))
to the MD5 hash of the callee.
3. In `pgo-instr-use` thin-lto prelink pipeline, MD5 hash of IRPGO names
will be
[annotated](fc715e4cd9/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (L1707))
as value profiles, and used to import indirect-call-prom candidates. If
the annotated MD5 hash is computed from the new format while import uses
the prior format, the callee cannot be imported.

*
`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`
is added to have an end-to-end test.
* `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll`
is updated to have better test coverage from another aspect (as runtime
tests are more sensitive to the environment and may be skipped by some
contributors)
2023-12-18 17:43:40 -08:00
James Y Knight
137f785fa6
[AMDGPU] Set MaxAtomicSizeInBitsSupported. (#75185)
This will result in larger atomic operations getting expanded to
`__atomic_*` libcalls via AtomicExpandPass, which matches what Clang
already does in the frontend.

While AMDGPU currently disables the use of all libcalls, I've changed it
to instead disable all of them _except_ the atomic ones. Those are
already be emitted by the Clang frontend, and enabling them in the
backend allows the same behavior there.
2023-12-18 16:51:06 -05:00
Mingming Liu
3aa5d71127
Revert "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles." (#75835)
Reverts llvm/llvm-project#74008

The compiler-rt test failed due to `llvm-dis` not found
(https://lab.llvm.org/buildbot/#/builders/127/builds/59884)
Will revert and investigate how to require the proper dependency.
2023-12-18 09:39:55 -08:00
Mingming Liu
245cddae70
[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles. (#74008)
Commit fe05193 (phab D156569), IRPGO names uses format
`[<filepath>;]<linkage-name>` while prior format is
`[<filepath>:<mangled-name>`. The format change would break the use case
demonstrated in (updated)
`llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` and
`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`

This patch changes `GlobalValues::getGlobalIdentifer` to use the
semicolon.

To elaborate on the scenario how things break without this PR
1. IRPGO raw profiles stores (compressed) IRPGO names of functions in
one section, and per-function profile data in another section. The
[NameRef](fc715e4cd9/compiler-rt/include/profile/InstrProfData.inc (L72))
field in per-function profile data is the MD5 hash of IRPGO names.
2. When raw profiles are converted to indexed format profiles, the
profiled address is
[mapped](fc715e4cd9/llvm/lib/ProfileData/InstrProf.cpp (L876-L885))
to the MD5 hash of the callee.
3. In `pgo-instr-use` thin-lto prelink pipeline, MD5 hash of IRPGO names
will be
[annotated](fc715e4cd9/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (L1707))
as value profiles, and used to import indirect-call-prom candidates. If
the annotated MD5 hash is computed from the new format while import uses
the prior format, the callee cannot be imported.

*`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`
is added to have an end-to-end test.
* `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll`
is updated to have better test coverage from another aspect (as runtime
tests are more sensitive to the environment and may be skipped by some
contributors)
2023-12-18 09:10:39 -08:00
Nikita Popov
e400c59beb Revert "[InstCombine] Favour m_Poison in SimplifyDemandedVectorElts"
This reverts commit 318d5bff0b65aa7d52fc7004d49587416f0fb564.

Has incomplete test updates.
2023-12-18 18:08:57 +01:00
Antonio Frighetto
318d5bff0b [InstCombine] Favour m_Poison in SimplifyDemandedVectorElts
A miscompilation issue has been addressed with refined checking.
2023-12-18 17:28:39 +01:00
Nikita Popov
cd54c47424 [InstCombine] Match poison instead of undef in foldVectorBinop()
Some negative tests turn into positive tests, as the differences
between undef and poison propagation allow additional transforms.
2023-12-18 17:01:59 +01:00
Nikita Popov
9d25b28b9e [InstCombine] Explicitly canonicalize splat shuffles to use poison RHS
This is usually handled by demanded elements simplification. However,
as that is not supported for scalable vectors, also handle it
explicitly here.
2023-12-18 16:30:40 +01:00
Nikita Popov
a5f3415533 [InstCombine] Replace non-demanded undef vector with poison
If an operand (esp to shufflevector or insertelement) is not
demanded, canonicalize it from undef to poison.
2023-12-18 16:12:37 +01:00