30 Commits

Author SHA1 Message Date
paperchalice
60eca674b1
[CodeGen] Port ExpandMemCmp to new pass manager (#74050) 2023-12-13 16:18:24 +08:00
Igor Kirillov
849f963e31
[CodeGen] Improve ExpandMemCmp for more efficient non-register aligned sizes handling (#70469)
* Enhanced the logic of ExpandMemCmp pass to merge contiguous subsequences
  in LoadSequence, based on sizes allowed in `AllowedTailExpansions`.
* This enhancement seeks to minimize the number of basic blocks and produce
  optimized code when using memcmp with non-register aligned sizes.
* Enable this feature for AArch64 with memcmp sizes modulo 8 equal to
  3, 5, and 6.

Reapplication of #69942 after fixing a bug
2023-10-30 18:40:48 +00:00
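The tail-merging idea described in #70469 can be pictured with a minimal C++ sketch. It is not the pass's code: `load_be_tail` and `memcmp_tail3` are hypothetical names, and the sketch only assumes that a 3-byte tail (a 2-byte piece plus a 1-byte piece) is treated as one value, so a single comparison replaces two compare-and-branch steps.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not LLVM code): read `n` bytes (n <= 8) as one
// big-endian integer, so unsigned integer order matches memcmp byte order.
inline std::uint64_t load_be_tail(const unsigned char *p, std::size_t n) {
  assert(n <= 8 && "value must fit in 64 bits");
  std::uint64_t v = 0;
  for (std::size_t i = 0; i < n; ++i)
    v = (v << 8) | p[i];
  return v;
}

// Three-way compare of a 3-byte tail using one merged value per operand,
// instead of a 2-byte compare followed by a separate 1-byte compare.
inline int memcmp_tail3(const void *a, const void *b) {
  std::uint64_t x = load_be_tail(static_cast<const unsigned char *>(a), 3);
  std::uint64_t y = load_be_tail(static_cast<const unsigned char *>(b), 3);
  return (x > y) - (x < y);
}
```

Calling `memcmp_tail3(a, b)` on two 3-byte buffers returns a negative, zero, or positive value with the same sign as `memcmp(a, b, 3)`.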
Igor Kirillov
deb429e5b0 Revert "[CodeGen] Improve ExpandMemCmp for more efficient non-register aligned sizes handling (#69942)"
This reverts commit 9bcb30d31813bbdea6b65789f64aed3f0e58d507.
2023-10-27 14:12:45 +00:00
Igor Kirillov
9bcb30d318
[CodeGen] Improve ExpandMemCmp for more efficient non-register aligned sizes handling (#69942)
* Enhanced the logic of ExpandMemCmp pass to merge contiguous subsequences
  in LoadSequence, based on sizes allowed in `AllowedTailExpansions`.
* This enhancement seeks to minimize the number of basic blocks and produce
  optimized code when using memcmp with non-register aligned sizes.
* Enable this feature for AArch64 with memcmp sizes modulo 8 equal to
  3, 5, and 6.
2023-10-27 12:41:08 +01:00
Tobias Hieta
f84bac329b
[NFC][Py Reformat] Reformat lit.local.cfg python files in llvm
This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0
since I forgot the lit.local.cfg files in that one.

Reformatting is done with `black`.

If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.

If you run into any problems, post to discourse about it and
we will try to help.

RFC Thread below:

https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style

Reviewed By: barannikov88, kwk

Differential Revision: https://reviews.llvm.org/D150762
2023-05-17 17:03:15 +02:00
Bjorn Pettersson
8ee529a398 [test][ExpandMemCmp] Convert test cases to opaque pointers. NFC
Conversion performed using the script at:
https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34
2022-10-07 15:29:32 +02:00
Clement Courbet
46a13a0ef8 [ExpandMemCmp] Properly expand bcmp to an equality pattern.
Before that change, constant-size `bcmp` would miss an opportunity to generate
a more efficient equality pattern and would generate a -1/0/1 pattern
instead.

Differential Revision: https://reviews.llvm.org/D123849
2022-04-15 11:26:24 +02:00
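To illustrate why the equality pattern mentioned in 46a13a0ef8 is cheaper, here is a hedged C++ sketch for a constant size of 4 bytes. `bcmp4_equality` is a hypothetical name, not something the pass emits. Because `bcmp` only has to report zero versus non-zero, the loaded words can be compared directly in native byte order; a -1/0/1 result would additionally need byte-order handling and a sign computation.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Equality-only pattern: returns 0 when the 4 bytes match, 1 otherwise.
// Native byte order is fine because only equality matters here.
inline int bcmp4_equality(const void *a, const void *b) {
  assert(a != nullptr && b != nullptr);
  std::uint32_t x, y;
  std::memcpy(&x, a, sizeof x);  // alignment-safe 4-byte load
  std::memcpy(&y, b, sizeof y);
  return x != y;
}
```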
Clement Courbet
866bd4df47 [NFC] Add test in preparation for D123849. 2022-04-15 11:15:29 +02:00
Sanjay Patel
8721490d38 [x86] split memcmp tests for 32/64-bit targets; NFC
memcmp is defined as taking a size_t length arg,
so that differs depending on pointer size of the
target.

We casually matched non-compliant function signatures
as memcmp, but that can cause crashing as seen with
PR50850.

If we fix that bug, these tests would no longer be
testing the expected behavior for a 32-bit target,
so I have duplicated all tests and adjusted them
to match the stricter definition of memcmp/bcmp
by changing the length arg to i32 on a 32-bit target.
2021-08-15 13:51:18 -04:00
Clement Courbet
fb4aa30f27 [ExpandMemCmp] Allow overlapping loads in the zero-relational case.
Summary:
This allows doing `memcmp(p, q, 7)` with 2 loads instead of a call to
memcmp.
This fixes part of PR45147.

Reviewers: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76133
2020-04-02 11:20:47 +02:00
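The two-load expansion of `memcmp(p, q, 7)` from fb4aa30f27 can be sketched in C++ as follows. This is an illustration under assumptions, not the generated code: `equal7` is a hypothetical name, and the sketch covers only the comparison against zero. The second 4-byte load starts at offset 3, so it overlaps the first by one byte and ends exactly at byte 7.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns true when the first 7 bytes of a and b are equal, using two
// overlapping 4-byte loads per operand (bytes 0..3 and bytes 3..6).
inline bool equal7(const void *a, const void *b) {
  assert(a != nullptr && b != nullptr);
  const unsigned char *p = static_cast<const unsigned char *>(a);
  const unsigned char *q = static_cast<const unsigned char *>(b);
  std::uint32_t p0, p1, q0, q1;
  std::memcpy(&p0, p, 4);
  std::memcpy(&p1, p + 3, 4);  // overlaps byte 3
  std::memcpy(&q0, q, 4);
  std::memcpy(&q1, q + 3, 4);
  // Any differing byte leaves a set bit in one of the XOR results.
  return ((p0 ^ q0) | (p1 ^ q1)) == 0;
}
```

Byte 3 is compared twice, which is harmless for an equality test.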
Juneyoung Lee
7aecf2323c [ExpandMemCmp] Correctly set alignment of generated loads
Summary:
This is a part of the series of efforts for correcting alignment of memory operations.
(Other related bugs: https://bugs.llvm.org/show_bug.cgi?id=44388 , https://bugs.llvm.org/show_bug.cgi?id=44543 )

This fixes https://bugs.llvm.org/show_bug.cgi?id=43880 by setting the default alignment of the generated loads to 1.

The test CodeGen/AArch64/bcmp-inline-small.ll should have been changed; it was introduced by https://reviews.llvm.org/D64805 . I talked with @evandro, and confirmed that the test is okay to be changed.
Two other PowerPC tests needed changes as well, but the fixes were straightforward.

Reviewers: courbet

Reviewed By: courbet

Subscribers: nlopes, gchatelet, wuzish, nemanjai, kristof.beyls, hiraditya, steven.zhang, danielkiss, llvm-commits, evandro

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76113
2020-03-16 22:39:48 +09:00
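As a source-level analogy for the alignment-1 loads in 7aecf2323c, a C++ sketch of a load that assumes nothing about pointer alignment is shown below. `load_u32_align1` is a hypothetical name; the point is only that `memcmp` arguments carry no alignment guarantee, so a wider load must not assume one.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Loads 4 bytes from an arbitrarily aligned address. Going through memcpy
// avoids dereferencing a misaligned std::uint32_t pointer.
inline std::uint32_t load_u32_align1(const void *p) {
  assert(p != nullptr);
  std::uint32_t v;
  std::memcpy(&v, p, sizeof v);
  return v;
}
```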
Juneyoung Lee
acdcd23b7b Add tests to ExpandMemCmp/X86/memcmp.ll before submitting D76113 2020-03-16 22:19:37 +09:00
Clement Courbet
6518b72f93 [ExpandMemCmp] Properly constant-fold all compares.
Summary:
This gets rid of duplicated code and diverging behaviour w.r.t.
constants.
Fixes PR45086.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75519
2020-03-09 10:40:52 +01:00
Clement Courbet
f7e6f5f8e3 [ExpandMemCmp] Properly constant-fold all compares.
Summary:
This gets rid of duplicated code and diverging behaviour w.r.t.
constants.
Fixes PR45086.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75519
2020-03-09 09:10:34 +01:00
David Zarzycki
f68925d450
[X86] Make memcmp vector lowering handle arbitrary expansions
Teach combineVectorSizedSetCCEquality() to handle arbitrary memcmp
expansions but do not change any default policy for now.

This also fixes a bug in the memcmp expansion itself when large
displacements are needed.

https://reviews.llvm.org/D69507
2019-10-30 09:12:57 +02:00
Dmitri Gribenko
2bf8d77453 Revert "Reland "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline.""
This reverts commit r371502, it broke tests
(clang/test/CodeGenCXX/auto-var-init.cpp).

llvm-svn: 371507
2019-09-10 10:39:09 +00:00
Clement Courbet
664d9d2da2 [ExpandMemCmp] Add lit.local.cfg
To prevent AArch64 tests from running when the target is not compiled.

Fixes r371502:

/home/buildslave/ps4-buildslave4/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/llvm.src/test/Transforms/ExpandMemCmp/AArch64/memcmp.ll:11:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[TMP0:%.*]] = bitcast i8* [[S1:%.*]] to i64*

llvm-svn: 371503
2019-09-10 10:00:15 +00:00
Clement Courbet
612c260ec3 Reland "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline."
With a fix for sanitizer breakage (see explanation in D60318).

llvm-svn: 371502
2019-09-10 09:18:00 +00:00
Clement Courbet
2851248fa1 Revert "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline."
Breaks sanitizers:
    libFuzzer :: cxxstring.test
    libFuzzer :: memcmp.test
    libFuzzer :: recommended-dictionary.test
    libFuzzer :: strcmp.test
    libFuzzer :: value-profile-mem.test
    libFuzzer :: value-profile-strcmp.test

llvm-svn: 364416
2019-06-26 12:13:13 +00:00
Clement Courbet
7b3a5f0e6d [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline.
This allows later passes (in particular InstCombine) to optimize more
cases.

One that's important to us is `memcmp(p, q, constant) < 0` and `memcmp(p, q, constant) > 0`.

llvm-svn: 364412
2019-06-26 11:50:18 +00:00
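The `memcmp(p, q, constant) < 0` case named in 7b3a5f0e6d can be illustrated with a C++ sketch for a constant of 8 bytes. The helper names are hypothetical, and this shows the kind of simplification meant rather than the actual output: once both operands are read as big-endian integers, the sign test reduces to one unsigned comparison and no -1/0/1 value has to be built.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: assemble 8 bytes into a big-endian integer so that
// unsigned integer order equals lexicographic byte order.
inline std::uint64_t load_be64(const void *p) {
  assert(p != nullptr);
  const unsigned char *b = static_cast<const unsigned char *>(p);
  std::uint64_t v = 0;
  for (std::size_t i = 0; i < 8; ++i)
    v = (v << 8) | b[i];
  return v;
}

// Same truth value as `memcmp(a, b, 8) < 0`.
inline bool memcmp8_is_negative(const void *a, const void *b) {
  return load_be64(a) < load_be64(b);
}

// Same truth value as `memcmp(a, b, 8) > 0`.
inline bool memcmp8_is_positive(const void *a, const void *b) {
  return load_be64(a) > load_be64(b);
}
```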
Fangrui Song
ac14f7b10c [lit] Delete empty lines at the end of lit.local.cfg NFC
llvm-svn: 363538
2019-06-17 09:51:07 +00:00
Eric Christopher
cee313d288 Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.

Will be re-reverting again.

llvm-svn: 358552
2019-04-17 04:52:47 +00:00
Eric Christopher
a863435128 Temporarily Revert "Add basic loop fusion pass."
As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358546
2019-04-17 02:12:23 +00:00
Clement Courbet
36a3480385 Re-land r349731 "[CodeGen][ExpandMemcmp] Add an option for allowing overlapping loads."
Update PPC IR following the GEP->bitcast to bitcast->GEP->bitcast change.

llvm-svn: 349747
2018-12-20 13:01:04 +00:00
Clement Courbet
e22cf4d7cb Revert r349731 "[CodeGen][ExpandMemcmp] Add an option for allowing overlapping loads."
Forgot to update PowerPC tests for the GEP->bitcast change.

llvm-svn: 349733
2018-12-20 09:58:33 +00:00
Clement Courbet
1bb6e1b0f2 [CodeGen][ExpandMemcmp] Add an option for allowing overlapping loads.
Summary:
This allows expanding {7,11,13,14,15,21,22,23,25,26,27,28,29,30,31}-byte memcmp
in just two loads on X86. These were previously calling memcmp.

Reviewers: spatel, gchatelet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D55263

llvm-svn: 349731
2018-12-20 09:13:47 +00:00
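Why overlapping loads reduce the load count in 1bb6e1b0f2 can be seen from a small C++ sketch that plans the loads for a given size. This is a simplified model, not the pass's algorithm: `Load`, `greedy_loads`, and `overlapping_loads` are hypothetical names, and `max_load` is assumed to be the widest power-of-two load that does not exceed the compared size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Load {
  std::size_t offset;
  std::size_t size;
};

// Non-overlapping plan: repeatedly take the largest power-of-two load
// that still fits in the remaining bytes.
inline std::vector<Load> greedy_loads(std::size_t size, std::size_t max_load) {
  assert(max_load != 0 && (max_load & (max_load - 1)) == 0);
  std::vector<Load> seq;
  std::size_t offset = 0;
  for (std::size_t ls = max_load; ls != 0; ls /= 2) {
    while (size - offset >= ls) {
      seq.push_back({offset, ls});
      offset += ls;
    }
  }
  return seq;
}

// Overlapping plan: full-width loads only; the last load is shifted back
// so that it ends exactly at `size` and overlaps the previous one.
inline std::vector<Load> overlapping_loads(std::size_t size,
                                           std::size_t max_load) {
  assert(max_load != 0 && size >= max_load);
  std::vector<Load> seq;
  std::size_t offset = 0;
  for (; offset + max_load <= size; offset += max_load)
    seq.push_back({offset, max_load});
  if (offset < size)
    seq.push_back({size - max_load, max_load});
  return seq;
}
```

For 7 bytes with 4-byte loads, the greedy plan needs three loads (4 + 2 + 1) while the overlapping plan needs two (offsets 0 and 3).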
Sanjay Patel
5a48aef3f0 [x86, MemCmpExpansion] allow 2 pairs of loads per block (PR33325)
This is the last step needed to fix PR33325:
https://bugs.llvm.org/show_bug.cgi?id=33325

We're trading branch and compares for loads and logic ops. 
This makes the code smaller and hopefully faster in most cases.

The 24-byte test shows an interesting construct: we load the trailing scalar 
elements into vector registers and generate the same pcmpeq+movmsk code that 
we expected for a pair of full vector elements (see the 32- and 64-byte tests).

Differential Revision: https://reviews.llvm.org/D41714

llvm-svn: 321934
2018-01-06 16:16:04 +00:00
Clement Courbet
063bed9baf Re-land "[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass."
Fix undefined references: ExpandMemCmp belongs to CodeGen/, not Scalar/.

llvm-svn: 317318
2017-11-03 12:12:27 +00:00
Clement Courbet
82bade615b Revert "[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass."
undefined reference to `llvm::TargetPassConfig::ID' on
clang-ppc64le-linux-multistage

This reverts commit eea333c33fa73ad225ef28607795984829f65688.

llvm-svn: 317213
2017-11-02 15:53:10 +00:00
Clement Courbet
1dc37b9c3b [ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass.
Summary:
This is mostly a noop (most of the test diffs are renamed blocks).
There are a few temporary register renames (eax<->ecx) and a few blocks are
shuffled around.

See the discussion in PR33325 for more details.

Reviewers: spatel

Subscribers: mgorny

Differential Revision: https://reviews.llvm.org/D39456

llvm-svn: 317211
2017-11-02 15:02:51 +00:00