174 Commits

Author SHA1 Message Date
Matt Arsenault
d4912e8050 ValueTracking: Add some tests to cover asserts in fcmpImpliesClass
Catch asserts hit after 1adce7d8e47e2438f99f91607760b825e5e3cc37
2023-11-13 15:05:52 +09:00
XChy
e471cd1d73
[EarlyCSE] Support CSE for commutative intrinsics with over 2 args (#67255)
Extends EarlyCSE to support commutative intrinsics with over 2 args.
2023-09-24 21:23:00 +08:00
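One way to picture the commutative-intrinsic change above is as a lookup key that ignores the order of the interchangeable operands. The sketch below is not EarlyCSE's implementation; `cse_key` is a hypothetical helper, and it assumes that only the first two operands commute while trailing operands keep their position. The intrinsic name is used purely as an illustrative string.

```python
from typing import Tuple


def cse_key(callee: str, args: Tuple[str, ...], commutative: bool) -> tuple:
    """Build a lookup key for a call (hypothetical helper, not LLVM's API)."""
    if commutative and len(args) >= 2:
        # Assumption: only the first two operands are interchangeable;
        # trailing operands (for example a scale) keep their position.
        first_two = tuple(sorted(args[:2]))
        return (callee, first_two + tuple(args[2:]))
    return (callee, tuple(args))


a = cse_key("llvm.smul.fix.i32", ("%x", "%y", "2"), commutative=True)
b = cse_key("llvm.smul.fix.i32", ("%y", "%x", "2"), commutative=True)
c = cse_key("llvm.smul.fix.i32", ("%x", "%y", "3"), commutative=True)
```

Here `a` and `b` share a key, so the second call could reuse the first; `c` differs in its third operand and stays distinct.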
DianQK
2d1e8a03f5
[EarlyCSE] Compare GEP instructions based on offset (#65875)
Closes #65763.
This will provide more opportunities for constant propagation for
subsequent optimizations.
2023-09-20 06:14:45 +08:00
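The idea of comparing GEP instructions by offset, as described in the entry above, can be sketched by keying each address computation on its base pointer and total byte offset. This is a simplified illustration that assumes constant indices and known per-index strides; `gep_key` is a hypothetical helper and does not mirror the pass's data structures.

```python
from typing import Sequence, Tuple


def gep_key(base: str, indices: Sequence[int], strides: Sequence[int]) -> Tuple[str, int]:
    """Key a constant-index GEP by (base pointer, total byte offset)."""
    if len(indices) != len(strides):
        raise ValueError("one stride per index is required")
    # Total offset is the sum of index * stride over all indices.
    offset = sum(i * s for i, s in zip(indices, strides))
    return (base, offset)


# An i32-typed GEP with index 2 advances 2 * 4 = 8 bytes.
k1 = gep_key("%p", [2], [4])
# An i8-typed GEP with index 8 advances 8 * 1 = 8 bytes.
k2 = gep_key("%p", [8], [1])
# An i32-typed GEP with index 3 advances 12 bytes.
k3 = gep_key("%p", [3], [4])
```

Under this keying, `k1` and `k2` are equal even though the two GEPs are written with different element types, which is the kind of match an offset-based comparison exposes.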
Jay Foad
9ff71814cb [EarlyCSE] Do not CSE convergent calls with memory effects
D149348 did this for readnone calls, which are handled by SimpleValue.
This patch does the same for all other CSEable calls, which are handled
by CallValue.

Differential Revision: https://reviews.llvm.org/D153151
2023-07-14 11:43:41 +01:00
Jay Foad
c2f8fe7cd8 [EarlyCSE] Precommit test for D153151
Differential Revision: https://reviews.llvm.org/D155210
2023-07-14 11:43:41 +01:00
Nikita Popov
edb2fc6dab [llvm] Remove explicit -opaque-pointers flag from tests (NFC)
Opaque pointers mode is enabled by default, no need to explicitly
enable it.
2023-07-12 14:35:55 +02:00
Arthur Eubanks
1876592ce3 [test] Regenerate some tests 2023-06-27 16:53:50 -07:00
Matt Arsenault
72d53e109e EarlyCSE: Add regression test for computeKnownFPClass phi handling
This was reduced from the failure that caused the revert in
e13f88d1ff5234946af6349a9a7cf56fcb6c040e
2023-05-18 12:49:20 +01:00
Tobias Hieta
f84bac329b
[NFC][Py Reformat] Reformat lit.local.cfg python files in llvm
This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0
since I forgot the lit.local.cfg files in that one.

Reformatting is done with `black`.

If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.

If you run into any problems, post to discourse about it and
we will try to help.

RFC Thread below:

https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style

Reviewed By: barannikov88, kwk

Differential Revision: https://reviews.llvm.org/D150762
2023-05-17 17:03:15 +02:00
Krzysztof Drewniak
f0415f2a45 Re-land "[AMDGPU] Define data layout entries for buffers"
Re-land D145441 with data layout upgrade code fixed to not break OpenMP.

This reverts commit 3f2fbe92d0f40bcb46db7636db9ec3f7e7899b27.

Differential Revision: https://reviews.llvm.org/D149776
2023-05-03 19:43:56 +00:00
Krzysztof Drewniak
3f2fbe92d0 Revert "[AMDGPU] Define data layout entries for buffers"
This reverts commit f9c1ede2543b37fabe9f2d8f8fed5073c475d850.

Differential Revision: https://reviews.llvm.org/D149758
2023-05-03 16:11:00 +00:00
Krzysztof Drewniak
f9c1ede254 [AMDGPU] Define data layout entries for buffers
Per discussion at
https://discourse.llvm.org/t/representing-buffer-descriptors-in-the-amdgpu-target-call-for-suggestions/68798,
we define two new address spaces for AMDGCN targets.

The first is address space 7, a non-integral address space (which was
already in the data layout) that has 160-bit pointers (which are
256-bit aligned) and uses a 32-bit offset. These pointers combine a
128-bit buffer descriptor and a 32-bit offset, and will be usable with
normal LLVM operations (load, store, GEP). However, they will be
rewritten out of existence before code generation.

The second of these is address space 8, the address space for "buffer
resources". These will be used to represent the resource arguments to
buffer instructions, and new buffer intrinsics will be defined that
take them instead of <4 x i32> as resource arguments. These ptr
addrspace(8) pointers are 128 bits long (with the same
alignment). They must not be used as the arguments to getelementptr or
otherwise used in address computations, since they can have
arbitrarily complex inherent addressing semantics that can't be
represented in LLVM. Even though, like their address space 7 cousins,
these pointers have deterministic ptrtoint/inttoptr semantics, they
are defined to be non-integral in order to prevent optimizations that
rely on pointers being a [0, addr_max] value from applying to them.

Future work includes:
- Defining new buffer intrinsics that take ptr addrspace(8) resources.
- A late rewrite to turn address space 7 operations into buffer
intrinsics and offset computations.

This commit also updates the "fallback address space" for buffer
intrinsics to the buffer resource, and updates the alias analysis
table.

Depends on D143437

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D145441
2023-05-03 15:25:58 +00:00
Mingming Liu
bb4ba96ed2 [NFC] Add a test case to make sure EarlyCSE preserves !prof when one
instruction CSE'ed another.

- This should be a part of D148877. Before that patch, !prof is not added to known-id-set [1], and it turns out that unknown types of metadata are dropped in the implementation [2].
  - This test is mainly added to make sure there won't be regressions for this kind of pattern. The pattern is observed in application code; it looks like the result of an indirect call is initially used as a function argument, and after the function is inlined a load-after-store CSE opportunity is exposed.

  [1] f478721231/llvm/lib/Transforms/Utils/Local.cpp (L2727-L2741)
  [2] ade3c6a6a8/llvm/lib/Transforms/Utils/Local.cpp (L2639)

Differential Revision: https://reviews.llvm.org/D149396
2023-05-02 17:34:53 -07:00
Mingming Liu
297c10fd17 [NFC][EarlyCSE]Modify test case to ensure branch weights are preserved with cse.
Differential Revision: https://reviews.llvm.org/D149390
2023-05-02 17:34:53 -07:00
Nikita Popov
084ca632ac [EarlyCSE] Only combine metadata for load CSE
There is no need to combine metadata if we're performing store to
load forwarding. In that case we would end up combining metadata
on an unrelated load instruction.
2023-05-02 12:51:56 +02:00
Nikita Popov
a67a21bf41 [EarlyCSE] Add additional metadata preservation test (NFC) 2023-05-02 12:51:55 +02:00
Nikita Popov
9b5ff4436e [EarlyCSE] Call combineMetadataForCSE() when CSEing loads
We may have to adjust metadata on the replacement load if the
metadata is poison-generating.
2023-04-03 16:10:19 +02:00
Nikita Popov
1b326be9e8 [EarlyCSE] Regenerate test checks (NFC) 2023-04-03 15:52:13 +02:00
Nikita Popov
2ead232626 [EarlyCSE] Add metadata preservation tests (NFC) 2023-04-03 15:52:13 +02:00
Max Kazantsev
0cbb8ec030 Revert "[AssumptionCache] caches @llvm.experimental.guard's"
This reverts commit f9599bbc7a3f831e1793a549d8a7a19265f3e504.

For some reason it caused us a huge compile time regression in downstream
workloads. Not sure whether the source of it is in upstream code or not.
Temporarily reverting until investigated.

Differential Revision: https://reviews.llvm.org/D142330
2023-02-20 18:38:07 +07:00
Joshua Cao
f9599bbc7a [AssumptionCache] caches @llvm.experimental.guard's
As discussed in https://github.com/llvm/llvm-project/issues/59901

This change is not NFC. There is one SCEV and EarlyCSE test that have an
improved analysis/optimization case. Rest of the tests are not failing.

I've mostly only added cleanup to SCEV since that is where this issue
started. As a follow up, I believe there is more cleanup opportunity in
SCEV and other affected passes.

There could be cases where there are missed registerAssumption of
guards, but this case is not so bad because there will be no
miscompilation. AssumptionCacheTracker should take care of deleted
guards.

Differential Revision: https://reviews.llvm.org/D142330
2023-01-24 20:16:46 -08:00
Fraser Cormack
d808ad822a [EarlyCSE] Fix crash when optimizing masked loads/stores
With opaque pointers, it is possible for EarlyCSE to encounter masked
load/store intrinsics which access the same pointer value but with
different incompatible types. These cannot form valid replacements
(without explicit casting, which we don't yet do even for regular
load/store instructions) so should be prevented.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D141613
2023-01-12 17:31:36 +00:00
Bjorn Pettersson
ac696ac453 Use opt -passes=<name> instead of opt -name
Updated the RUN line in several test cases to use the new PM syntax
  opt -passes=<pipeline>
instead of the deprecated syntax
  opt -pass1 -pass2
2022-11-08 12:15:42 +01:00
Arthur Eubanks
c384b20b55 [opt] Remove temporary legacy pass name translations
And update corresponding tests.
2022-10-07 11:09:46 -07:00
Sanjay Patel
2981a94902 [EarlyCSE][ConstantFolding] do not constant fold atan2(+/-0.0, +/-0.0), part 2
Follow-up to 7f1262a322c0d80f3. That patch avoided removing the
call, but it still allowed the constant-folded result. This
makes the behavior consistent with 1-arg libm folding: if the
call potentially raises an exception, then we just bail out.

It seems likely that there are other corner-cases like this,
but the tests are incomplete, so we have lived with these
discrepancies for a long time. This was untested before the
constant folding was expanded in D127964.
2022-08-20 10:16:06 -04:00
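The bail-out behavior described in the entry above can be illustrated with a small folding helper. This is a sketch under stated assumptions, not LLVM's ConstantFolding code: `try_fold_atan2` is a hypothetical name, and returning `None` stands in for "leave the call in place".

```python
import math
from typing import Optional


def try_fold_atan2(y: float, x: float) -> Optional[float]:
    """Return a folded constant, or None to keep the original call."""
    if y == 0.0 and x == 0.0:
        # Covers every sign combination, since -0.0 == 0.0 in Python.
        # The call may set errno, so neither remove it nor fold it.
        return None
    return math.atan2(y, x)
```

For example, `try_fold_atan2(0.0, -0.0)` yields `None`, while non-zero arguments fold to the value computed by the host `math.atan2`.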
Sanjay Patel
7f1262a322 [EarlyCSE][ConstantFolding] do not constant fold atan2(+/-0.0, +/-0.0)
These may raise an error (set errno) as discussed in the post-commit
comments for D127964, so we can't fold away the call and potentially
alter that behavior.
2022-08-19 12:27:29 -04:00
Sanjay Patel
4bff1037bb [EarlyCSE][ConstantFolding] add tests for atan2 with zero args; NFC 2022-08-19 12:18:53 -04:00
Kevin P. Neal
05ac82de40 [FPEnv][EarlyCSE] Support for CSE when exception behavior is "ignore" or "maytrap" and the rounding mode is known.
Previously we would only CSE constrained FP intrinsics in the default
floating point environment. Exception behavior of "strict" is still not
allowed since we are not allowed to remove any traps in that case.

There are no restrictions on CSE across function calls inside a function.

Differential Revision: https://reviews.llvm.org/D112256
2022-08-16 08:31:42 -04:00
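The rule stated in the entry above can be written as a small predicate. This is a sketch only: `may_cse_constrained_fp` is a hypothetical helper, and it assumes that a rounding mode of `round.dynamic` is what "not known" means, and that the metadata strings are passed in as plain text.

```python
def may_cse_constrained_fp(exception_behavior: str, rounding_mode: str) -> bool:
    """Decide whether a constrained FP call is a CSE candidate (illustrative)."""
    if exception_behavior == "fpexcept.strict":
        # Strict semantics: no trap may be removed, so never CSE.
        return False
    if exception_behavior not in ("fpexcept.ignore", "fpexcept.maytrap"):
        # Unrecognized behavior: stay conservative.
        return False
    # Assumption: "round.dynamic" represents an unknown rounding mode.
    return rounding_mode != "round.dynamic"
```

With this predicate, `("fpexcept.maytrap", "round.tonearest")` is accepted, while anything strict or with a dynamic rounding mode is rejected.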
Sanjay Patel
43dd567443 [EarlyCSE] allow flexibility in atan(-0.0) test
As discussed in the post-commit feedback for b53d44fe47413c87f619b,
this test was failing on AIX because atan(-0.0) results in 0.0 (positive).

Differential Revision: https://reviews.llvm.org/D131601
2022-08-10 15:02:01 -04:00
Mohammed Nurul Hoque
30abc1a6a1 [ConstantFolding] Eliminate atan and atan2 calls
From the opengroup specifications, atan2 may fail if the result
underflows and atan may fail if the argument is subnormal, but
we assume that does not happen and eliminate the calls if we
can constant fold the result at compile-time.

Differential Revision: https://reviews.llvm.org/D127964
2022-08-10 11:01:50 -04:00
Jake Egan
c1226585b3 [AIX][tests] XFAIL for system-aix instead
The Clang folding for floating-point sometimes calls out to the host.
2022-08-10 09:31:42 -04:00
Jake Egan
6da3f90195 [AIX][tests] XFAIL atan.ll test on AIX
XFAIL this newly added test for now to get the AIX bot back to green.
2022-08-09 09:58:08 -04:00
Sanjay Patel
59f3b3d796 [EarlyCSE][ConstantFolding] move test files to dir of pass in RUN line; NFC 2022-08-08 10:08:55 -04:00
Mohammed Nurul Hoque
b53d44fe47 [EarlyCSE][ConstantFolding] add tests for atan/atan2; NFC
Baseline coverage for D127964.
2022-08-08 09:24:58 -04:00
Denis Antrushin
36cc533471 [EarlyCSE][OpaquePointers]Replace assert with return for mask type check.
When EarlyCSE tries to common vector masked loads/stores, it first checks that
they have same base operand and then assumes that this is enough for mask types
to be equal. This is true for typed pointers but false for opaque ones -
two loads of different vector sizes from same base pointer '%b' are the same,
`ptr %b`. (For typed pointers, `%b` was cast to vector pointer type so bases
were different).
Change assert to return from lambda `isSubmask` so this transformation properly
works with opaque pointers.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D131251
2022-08-08 16:14:42 +03:00
Chris Bieneman
383e754072 NFC. Require DirectX backend for these tests
Should have added this when I added the test directory. This just
requires the DirectX target for running these tests.
2022-08-03 15:55:03 -05:00
Chris Bieneman
ee4d815008 [DX] Remove IntrNoMem from create handle intrinsic
The create handle intrinsic calls can't be removed, so it was incorrect
to mark them as IntrNoMem.
2022-08-02 16:57:22 -05:00
Kevin P. Neal
25a83005ef Precommit tests for D112256 "[FPEnv][EarlyCSE] Add support for CSE of constrained FP intrinsics, take 2" 2022-07-28 08:59:27 -04:00
Nikita Popov
60a32157a5 [Tests] Remove unnecessary bitcasts from opaque pointer tests (NFC)
Previously left these behind due to the required instruction
renumbering, drop them now. This more accurately represents
opaque pointer input IR.

Also drop duplicate opaque pointer check lines in one SROA test.
2022-06-22 14:15:46 +02:00
Florian Hahn
b8d728a098
[SimplifyCFG,EarlyCSE] Update 2 tests to not branch on undef (NFC). 2022-06-12 18:03:26 +01:00
Nikita Popov
3c514d31d7 [EarlyCSE] Update tests to use opaque pointers (NFC)
Update the EarlyCSE tests to use opaque pointers.

Worth noting that this leaves some bitcast ptr to ptr instructions
in the input IR behind which are no longer necessary. This is
because these use numbered instructions, so it's hard to drop them
in an automated fashion (as it would require renumbering all other
instructions as well). I'm leaving that as a problem for another day.

The test updates have been performed using
https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34.

Differential Revision: https://reviews.llvm.org/D127278
2022-06-10 09:53:35 +02:00
Artur Pilipenko
5ee0123642 [EarlyCSE] Add tests demonstrating missed opportunities
Add tests demonstrating missed opportunities around
invariant.start intrinsic.

NFC.
2022-04-26 11:58:16 -07:00
Arthur Eubanks
af6b9939aa [EarlyCSE][OpaquePtr] Check access type when performing DSE
This will bail out on target specific intrinsics. If those are deemed
important enough for EarlyCSE to handle, we can augment MemIntrinsicInfo
with an access type for TargetTransformInfo::getTgtMemIntrinsic() to
handle.

Reviewed By: #opaque-pointers, nikic

Differential Revision: https://reviews.llvm.org/D120077
2022-02-17 11:58:53 -08:00
Nikita Popov
46f9e45ef0 [Statepoint] Update gc.statepoint calls in tests with elementtype (NFC)
This updates tests for the LangRef change in D117890.
2022-02-04 14:15:41 +01:00
Nikita Popov
60147c6034 [EarlyCSE] Regenerate test checks (NFC) 2022-01-20 14:49:26 +01:00
Nikita Popov
918015c9ba [EarlyCSE] Support opaque pointers
Explicitly check the load/store value type, because this is no
longer implicitly checked through the pointer type.
2022-01-06 17:08:50 +01:00
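The explicit value-type check mentioned in the entry above can be sketched with a table of available loads. With opaque pointers the pointer alone no longer implies the accessed type, so the table stores the type alongside the value. The class and method names below are hypothetical and only illustrate the idea.

```python
from typing import Dict, Optional, Tuple


class AvailableLoads:
    """Toy table mapping a pointer to the (type, value) last seen there."""

    def __init__(self) -> None:
        self._table: Dict[str, Tuple[str, str]] = {}

    def record(self, pointer: str, value_type: str, value: str) -> None:
        self._table[pointer] = (value_type, value)

    def lookup(self, pointer: str, value_type: str) -> Optional[str]:
        entry = self._table.get(pointer)
        if entry is None:
            return None
        recorded_type, value = entry
        # Same pointer is not enough: the accessed type must match too.
        if recorded_type != value_type:
            return None
        return value


loads = AvailableLoads()
loads.record("%p", "i32", "%v")
same_type = loads.lookup("%p", "i32")
other_type = loads.lookup("%p", "float")
```

Here `same_type` is `"%v"`, while `other_type` is `None` because forwarding an `i32` value to a `float` load would need an explicit cast.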
Florian Hahn
361111906b
[EarlyCSE] Retain poison flags, if program is UB if poison.
Poison-generating flags can be retained during CSE on the earlier
instruction, *if* the earlier instruction being poison causes UB. For
now, always take AND for floating point instructions.

https://alive2.llvm.org/ce/z/4K3D7P

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D115247
2021-12-11 15:11:44 +00:00
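The flag-merging rule in the entry above can be sketched as follows. This is an illustration under assumptions: flags are modeled as sets of strings, `merged_flags` is a hypothetical helper, and whether the earlier instruction being poison implies UB is supplied by the caller rather than computed.

```python
from typing import FrozenSet


def merged_flags(
    earlier: FrozenSet[str],
    later: FrozenSet[str],
    earlier_poison_is_ub: bool,
    is_fp: bool,
) -> FrozenSet[str]:
    """Flags to keep on the earlier instruction after CSE (illustrative)."""
    if earlier_poison_is_ub and not is_fp:
        # If poison on the earlier instruction already causes UB,
        # its poison-generating flags can be retained as they are.
        return earlier
    # Otherwise, and always for floating point, take the intersection.
    return earlier & later


kept = merged_flags(frozenset({"nuw", "nsw"}), frozenset({"nuw"}), True, False)
anded = merged_flags(frozenset({"nuw", "nsw"}), frozenset({"nuw"}), False, False)
fp = merged_flags(frozenset({"nnan", "ninf"}), frozenset({"nnan"}), True, True)
```

In this sketch `kept` retains both `nuw` and `nsw`, while `anded` and `fp` keep only the flags common to both instructions.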
Florian Hahn
22e6094b20
[EarlyCSE] Add test case with inbounds gep where flags can be retained. 2021-12-07 13:46:25 +00:00
Florian Hahn
aca7a19039
[EarlyCSE] Auto-generate check lines for flags.ll.
The test already checks the full IR. To make updating easier,
auto-generate the check lines.
2021-12-07 13:46:13 +00:00
Bjorn Pettersson
d52f506192 [NewPM] Use parameterized syntax for a couple of more passes
A couple of passes that are parameterized in new-PM used different
pass names (in cmd line interface) while using the same pass class
name. This patch updates the PassRegistry to model pass parameters
more properly using PASS_WITH_PARAMS.

Reason for the change is to ensure that we have a 1-1 mapping
between class name and pass name (when disregarding the params).
With a 1-1 mapping it is more obvious which pass name to use in
options such as -debug-only, -print-after etc.

The opt -passes syntax is changed for the following passes:
  early-cse-memssa => early-cse<memssa>
  post-inline-ee-instrument => ee-instrument<post-inline>
  loop-extract-single => loop-extract<single>
  lower-matrix-intrinsics-minimal => lower-matrix-intrinsics<minimal>

This patch is not updating pass names in docs/Passes.rst. Not quite
sure what the status is for that document (e.g. when it comes to
listing pass parameters). It is only loop-extract-single that is
mentioned in Passes.rst today, out of the passes mentioned above.

Differential Revision: https://reviews.llvm.org/D108362
2021-08-20 14:59:21 +02:00
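The four renames listed in the entry above lend themselves to a simple lookup when updating old `-passes=` strings. The sketch below is a hypothetical migration helper, not part of LLVM; it assumes a flat, comma-separated pipeline and does not handle nested pipelines.

```python
# Old pass name -> parameterized pass name, taken from the commit message.
RENAMED_PASSES = {
    "early-cse-memssa": "early-cse<memssa>",
    "post-inline-ee-instrument": "ee-instrument<post-inline>",
    "loop-extract-single": "loop-extract<single>",
    "lower-matrix-intrinsics-minimal": "lower-matrix-intrinsics<minimal>",
}


def update_pipeline(pipeline: str) -> str:
    """Rewrite a flat comma-separated pipeline string (illustrative only)."""
    names = pipeline.split(",")
    # Names without a rename entry pass through unchanged.
    return ",".join(RENAMED_PASSES.get(name, name) for name in names)


updated = update_pipeline("early-cse-memssa,instcombine")
```

After this runs, `updated` holds `early-cse<memssa>,instcombine`; pass names that were not renamed are left as they were.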