1270 Commits

Author SHA1 Message Date
Guillaume Chatelet
5e32765c15 [libc] Improve memcmp latency and codegen
This is based on ideas from @nafi to:
 - use a branchless version of 'cmp' for 'uint32_t',
 - completely resolve the lexicographic comparison through vector
   operations when wide types are available. We also get rid of byte
   reloads and serializing '__builtin_ctzll'.

I did not include the suggestion to replace comparisons of 'uint16_t'
with two 'uint8_t' as it did not seem to help the codegen. This can
be revisited in subsequent patches.

The code has been rewritten to reduce nested function calls, making the
job of the inliner easier and preventing harmful code duplication.

Reviewed By: nafi3000

Differential Revision: https://reviews.llvm.org/D148717
2023-06-12 13:47:16 +00:00
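As an illustration of the branchless 'cmp' idea mentioned in this commit message, here is a minimal sketch. It is not the code from D148717: the names `load_be32`, `cmp_u32_branchless`, and `memcmp4` are hypothetical. The sketch assembles the bytes in big-endian order by hand so that integer order matches lexicographic byte order, which is an assumption about how one might set up the comparison rather than a description of the patch.

```cpp
#include <cstdint>

// Hypothetical helper: assemble four bytes into a big-endian integer so that
// the numeric order of the result matches the lexicographic order of the bytes.
inline std::uint32_t load_be32(const unsigned char *p) {
  return (static_cast<std::uint32_t>(p[0]) << 24) |
         (static_cast<std::uint32_t>(p[1]) << 16) |
         (static_cast<std::uint32_t>(p[2]) << 8) |
         static_cast<std::uint32_t>(p[3]);
}

// Three-way comparison written without an if/else: each comparison yields
// 0 or 1, so the difference is -1, 0, or 1.
inline int cmp_u32_branchless(std::uint32_t a, std::uint32_t b) {
  return static_cast<int>(a > b) - static_cast<int>(a < b);
}

// memcmp-like comparison of exactly four bytes (illustrative only).
inline int memcmp4(const void *lhs, const void *rhs) {
  return cmp_u32_branchless(
      load_be32(static_cast<const unsigned char *>(lhs)),
      load_be32(static_cast<const unsigned char *>(rhs)));
}
```

Whether the compiler actually emits branch-free machine code for this depends on the target and optimization level, so the generated assembly should be inspected before drawing conclusions.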
Tue Ly
a982431295 [libc] Add platform independent floating point rounding mode checks.
Many math functions need to check the floating point rounding mode to
return correct values.  Currently most of them use the internal implementation
of `fegetround`, which is platform-dependent and blocks math functions from being
enabled on platforms where `fegetround` is unimplemented.  In this change, we add
platform-independent rounding mode checks and switch math functions to use
them instead. https://github.com/llvm/llvm-project/issues/63016

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D152280
2023-06-12 09:36:41 -04:00
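One way to check the rounding mode without calling `fegetround` is to perform a small floating point operation whose result depends on the active mode. The sketch below illustrates that technique; the function names are hypothetical and the code is not taken from D152280. It assumes IEEE 754 single precision and that `float` arithmetic is evaluated in `float` precision (no excess precision).

```cpp
// 2^-25 is less than half an ulp of 1.0f, so adding it to 1.0f changes the
// result only when rounding toward +infinity. The volatile qualifier keeps the
// compiler from folding the arithmetic at compile time.
inline bool is_round_up() {
  volatile float tiny = 1.0f / 33554432.0f;  // 2^-25
  return (tiny + 1.0f) != 1.0f;
}

// Mirror image of the check above: only rounding toward -infinity moves the
// result below -1.0f.
inline bool is_round_down() {
  volatile float tiny = 1.0f / 33554432.0f;  // 2^-25
  return (-1.0f - tiny) != -1.0f;
}

// 2^-24 is exactly half an ulp at 1.5f, so both sums are ties. Only
// round-to-nearest-even resolves both ties to 1.5f.
inline bool is_round_to_nearest() {
  volatile float half_ulp = 1.0f / 16777216.0f;  // 2^-24
  float y = half_ulp;
  return (1.5f + y) == (1.5f - y);
}
```

Under the default rounding mode, `is_round_to_nearest()` returns true and the other two return false.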
Guillaume Chatelet
1ec995cc1c Revert D148717 "[libc] Improve memcmp latency and codegen"
This broke aarch64 debug buildbot https://lab.llvm.org/buildbot/#/builders/223/builds/21703
This reverts commit bd4f978754758d5ef29d1f10370f45362da3de37.
2023-06-12 08:32:00 +00:00
Guillaume Chatelet
bd4f978754 [libc] Improve memcmp latency and codegen
This is based on ideas from @nafi to:
 - use a branchless version of 'cmp' for 'uint32_t',
 - completely resolve the lexicographic comparison through vector
   operations when wide types are available. We also get rid of byte
   reloads and serializing '__builtin_ctzll'.

I did not include the suggestion to replace comparisons of 'uint16_t'
with two 'uint8_t' as it did not seem to help the codegen. This can
be revisited in subsequent patches.

The code has been rewritten to reduce nested function calls, making the
job of the inliner easier and preventing harmful code duplication.

Reviewed By: nafi3000

Differential Revision: https://reviews.llvm.org/D148717
2023-06-12 07:56:23 +00:00
Tue Ly
37458f6693 [libc][math] Move str method from FPBits class to testing utils.
The str method of the FPBits class is only used for pretty printing its objects
in tests.  It brings a cpp::string dependency to the FPBits class, which is not ideal
for embedded use cases.  We move the str method to a free function in test utils and
remove this dependency from the FPBits class.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D152607
2023-06-10 02:50:58 -04:00
Jordan Rupprecht
261b693afd [bazel][NFC] Add Dialect/Func/Extensions library and deps
Added in D120368
2023-06-09 17:04:41 -07:00
Mikhail Goncharov
b28614c4fc [bazel] format bazel files NFC 2023-06-09 12:13:07 +02:00
Michael Jones
47fd67ec34 [libc][NFC] land long double table for printf
The Mega Table that printf uses for long doubles with some flags is too
large for the linters, and so has been split out from the main patch.
The main patch: https://reviews.llvm.org/D150399

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D152470
2023-06-08 16:14:56 -07:00
Michael Jones
688b9730d1 [libc] add options to printf decimal floats
This patch adds three options for printf decimal long doubles, and these
can also apply to doubles.

1. Use a giant table which is fast and accurate, but takes up ~5MB.
2. Use dyadic floats for approximations, which only gives ~50 digits of
   accuracy but is very fast.
3. Use large integers for approximations, which is accurate but very
   slow.

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D150399
2023-06-08 14:23:15 -07:00
Kun Wu
8ed59c53de [mlir][sparse][gpu] add sm8.0+ tensor core 2:4 sparsity support
Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D151775
2023-06-06 23:13:21 +00:00
Benjamin Kramer
e412650726 [bazel] Port 44268271f61e46636619623d52013c3be3e272c0 2023-06-06 22:47:30 +02:00
Benjamin Kramer
ba8c0bf37e [bazel] Port 1117b9a284aa6e4b1f3cbde31825605bd07a2384 2023-06-06 22:47:17 +02:00
Jacques Pienaar
f007bcbc3c [mlir] Convert quantized dialect bytecode to generated.
Serves as rather self-contained documentation for using the generator
from https://reviews.llvm.org/D144820.

Differential Revision: https://reviews.llvm.org/D152118
2023-06-06 11:16:07 -07:00
Aart Bik
eb5308adc4 bazel build fix
Reviewed By: Peiming, manishucsd

Differential Revision: https://reviews.llvm.org/D152214
2023-06-05 17:24:14 -07:00
Aart Bik
62a06d8224 fix build issue on bazel
Needed to fix:
53a5c3ab4d
db7cc0348c

Reviewed By: Peiming, anlunx

Differential Revision: https://reviews.llvm.org/D152202
2023-06-05 15:33:31 -07:00
Siva Chandra Reddy
2bd82c5462 [bazel][libc] Add targets for integer abs and div functions.
Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D152084
2023-06-05 22:15:12 +00:00
Johannes Reifferscheid
70bd660709 [bazel] Merge BytecodeOpInterface target into IR.
Reviewed By: akuegel

Differential Revision: https://reviews.llvm.org/D152133
2023-06-05 11:57:12 +02:00
Guillaume Chatelet
e49a608511 Revert D148717 "[libc] Improve memcmp latency and codegen"
This reverts commit 9ec6ebd3ceabb29482aa18a64b943788b65223dc.

The patch broke RISCV and aarch64 buildbots.
2023-06-05 09:50:30 +00:00
Guillaume Chatelet
9ec6ebd3ce [libc] Improve memcmp latency and codegen
This is based on ideas from @nafi to:
 - use a branchless version of 'cmp' for 'uint32_t',
 - completely resolve the lexicographic comparison through vector
   operations when wide types are available. We also get rid of byte
   reloads and serializing '__builtin_ctzll'.

I did not include the suggestion to replace comparisons of 'uint16_t'
with two 'uint8_t' as it did not seem to help the codegen. This can
be revisited in subsequent patches.

The code has been rewritten to reduce nested function calls, making the
job of the inliner easier and preventing harmful code duplication.

Reviewed By: nafi3000

Differential Revision: https://reviews.llvm.org/D148717
2023-06-05 09:46:05 +00:00
Adrian Kuegel
bc7f65cbd8 [mlir][Bazel] Adjust BUILD files for a9d003ef855ff7ed1bf4f8229ee9944b55936e6f 2023-06-05 09:57:01 +02:00
Mikhail Goncharov
34866154d6 [bazel] add missing dep for GPUTransforms 2023-06-05 09:20:20 +02:00
Benjamin Kramer
9d531c2dcf [bazel] Port 36f351098cd5 2023-06-04 21:39:52 +02:00
Tue Ly
5a4e344bd9 [libc][NFC] Add LIBC_INLINE and attribute.h header includes to targets' FMA.h.
Targets' FMA.h headers are missing LIBC_INLINE and attributes.h header.

Reviewed By: brooksmoses

Differential Revision: https://reviews.llvm.org/D152024
2023-06-02 21:15:58 -04:00
Haojian Wu
c5564a0075 [bazel] Add include-cleaner targets, fix clang-tidy build for c28506ba4b6961950849f8fdecd0cf7e503a14f9 2023-06-02 19:38:08 +02:00
Matthias Springer
000bc58b63 [mlir][transform] Utilize op interface instead of tensor::TrackingListener
Add a new interface `FindPayloadReplacementOpInterface` to specify ops that should be skipped when looking for payload replacement ops. Such ops are typically metadata-only ops.

With this change, we no longer need to maintain a custom TrackingListener in the tensor dialect.

Note: `CastOpInterface` by itself is not sufficient. Some metadata-only ops such as "tensor.reshape" are not casts, and it would be incorrect for them to implement the `CastOpInterface`.

Differential Revision: https://reviews.llvm.org/D151888
2023-06-02 14:50:43 +02:00
Matthias Springer
e66f2beba8 [mlir][IR][NFC] Move CastOpInterface helpers to mlir/Interfaces
These helpers should not be part of the IR build unit.

The interface is now implemented on `builtin.unrealized_conversion_cast` with an external model.

Also rename the CastOpInterfaces Bazel target name to CastInterfaces to be consistent with the CMake target name.

Differential Revision: https://reviews.llvm.org/D146972
2023-06-02 08:39:46 +02:00
Matthias Springer
26864d8fb4 [mlir][tensor] Add pattern to drop redundant insert_slice rank expansion
Drop insert_slice rank expansions if they are directly followed by an inverse rank reduction.

Differential Revision: https://reviews.llvm.org/D151800
2023-06-01 08:47:53 +02:00
Benjamin Chetioui
981766a3d6 [mlir][bazel] Disable Transform/test-repro-dump.mlir test in bazel build. 2023-05-31 11:50:48 +00:00
Haojian Wu
b2f4e75b66 [bazel] Port for 301eb6b68f30074ee3a90e2dfbd11dfd87076323 2023-05-31 12:49:21 +02:00
Benjamin Chetioui
96816a1249 [mlir][bazel] Follow-up fix for ce954e1cda5c9b55325903d51285cd742152a0c3. 2023-05-31 08:20:10 +00:00
Adrian Kuegel
6118cb4bd4 [mlir][Bazel] Adapt Bazel BUILD for ce954e1cda5c9b55325903d51285cd742152a0c3 2023-05-31 10:13:19 +02:00
yijia1212
d146fc8fba add missing dependency for TosaToLinalg 2023-05-31 01:37:14 +00:00
Fangrui Song
43bec3376c Remove HAVE_STRERROR
Most systems support strerror_r. For the remaining systems (e.g. MSVC) strerror_s and
strerror can be used as fallbacks. We don't have a supported operating
system/compiler that doesn't provide `strerror`.

Close https://github.com/llvm/llvm-project/issues/62804

https://github.com/flang-compiler/f18/pull/1068 added a fallback
when strerror is unavailable, but I think the code path is dead.

Reviewed By: serge-sans-paille, vzakhari

Differential Revision: https://reviews.llvm.org/D151718
2023-05-30 14:12:20 -07:00
Benjamin Kramer
c644341c2c Revert "[mlir][bazel] Port for 660f714, third attempt"
This reverts commit 421a7f814fb15dedde1b0b13a9e4ddcf7b502086. Dependency
doesn't seem to be necessary and would pull in all of LLVM's codegen
into mlir users that don't require it.
2023-05-30 11:41:24 +02:00
Haojian Wu
d9118b9eea [bazel] Port for 9f6250f591057e68c0bda564716b6918b8e39a84, part2.
The part1 was missing the generation of
arm_sme_draft_spec_subject_to_change.h, this patch adds it.
2023-05-30 08:05:39 +02:00
Haojian Wu
a0b0bf38e5 [bazel] Port for 9f6250f591057e68c0bda564716b6918b8e39a84. 2023-05-29 07:19:57 +02:00
Eugene Burmako
421a7f814f [mlir][bazel] Port for 660f714, third attempt
Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D151618
2023-05-27 13:26:06 -07:00
Eugene Burmako
35d7fa45bd [MLIR] Reformat the Bazel build
This patch normalizes formatting of the root BUILD.bazel file by: 1) adjusting indentation a little bit, 2) alphabetically ordering dependencies. These small deviations were introduced by some of yesterday's patches:
  * https://reviews.llvm.org/D151104
  * https://reviews.llvm.org/D151346
  * https://reviews.llvm.org/rG16fe2b37365c00b0c6d0ed22c2e6521f2d5de01a
  * https://reviews.llvm.org/rG4d1cd1d8caab13d6b76ce6fc4ff76a01a7931c34

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D151499
2023-05-27 13:00:52 -07:00
Benjamin Kramer
a218c99181 [bazel] Add missing dependency for ddeab07ca63235f8d952e1171b56fdb0f2d761c9 2023-05-27 12:04:36 +02:00
Haojian Wu
fe01c08424 [mlir][bazel] Port for 660f714e26999d266232a1fbb02712bb879bd34e, second
attempt.
2023-05-27 08:37:45 +02:00
Haojian Wu
5217498dc8 [mlir][bazel] Port for 660f714e26999d266232a1fbb02712bb879bd34e 2023-05-27 08:05:19 +02:00
Benjamin Kramer
198a887fcf [bazel][libc] Add another missing dependency 2023-05-26 13:12:21 +02:00
Benjamin Kramer
596887da2e [bazel][libc] Add file missing for 25174976e19b2ef916bb94f4613662646c95cd46 2023-05-26 12:24:36 +02:00
Benjamin Kramer
3d91caec4b [bazel][libc] Adjust for 4f1fe19df385445fabde47998affca50c7f1bc1e
This also required a build rule for error_to_string, so add that too.
2023-05-26 12:16:48 +02:00
Benjamin Kramer
e1d1cd43ff [bazel] Run buildifier on libc BUILD. NFC. 2023-05-26 12:16:47 +02:00
Eugene Burmako
ecc70b4474 [MLIR] Fixup Bazel build for Add a pattern for transforming gpu.global_id to thread + blockId * blockDim
This patch updates the Bazel build to catch up with changes in https://reviews.llvm.org/D148978.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D151496
2023-05-25 14:28:28 -07:00
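The rewrite named in this commit title, thread + blockId * blockDim, can be modelled for a single dimension with a short scalar function. This is only an arithmetic sketch; `global_id` and its parameter names are hypothetical and do not come from D148978.

```cpp
#include <cstdint>

// Hypothetical scalar model of one dimension: the global id is the thread's
// position inside its block plus the offset of that block in the grid.
constexpr std::int64_t global_id(std::int64_t thread_id,
                                 std::int64_t block_id,
                                 std::int64_t block_dim) {
  return thread_id + block_id * block_dim;
}
```

For example, thread 3 of block 2 with 128 threads per block gets global id 3 + 2 * 128 = 259.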
Siva Chandra Reddy
daeee56798 [libc] Add macro LIBC_THREAD_LOCAL.
It resolves to thread_local on all platforms except for the GPUs, on which
it resolves to nothing. The use of thread_local in the source code has been
replaced with the new macro.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D151486
2023-05-25 19:53:52 +00:00
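A macro of the kind described in this commit could be sketched as follows. This is not the definition from D151486: the use of the compiler-predefined `__NVPTX__` and `__AMDGPU__` macros to detect a GPU target is an assumption, and `call_count` and `bump_call_count` are hypothetical names used only to show the macro in a declaration.

```cpp
// Sketch only. GPU detection through __NVPTX__ / __AMDGPU__ is an assumption;
// the libc sources may detect GPU targets differently.
#if defined(__NVPTX__) || defined(__AMDGPU__)
#define LIBC_THREAD_LOCAL
#else
#define LIBC_THREAD_LOCAL thread_local
#endif

// Hypothetical usage: per-thread storage on CPU targets, plain static storage
// when the macro expands to nothing.
static LIBC_THREAD_LOCAL int call_count = 0;

static int bump_call_count() { return ++call_count; }
```

Routing every use through one macro means the storage class can change per target without touching each declaration.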
Sterling Augustine
023504f29a Add missing dependency for tests. 2023-05-25 10:52:18 -07:00
Matthias Springer
7d36a468aa [mlir][tensor] TrackingListener: Support cast-like InsertSliceOps with dynamic shape
When looking for payload op replacements, rank-expanding InsertSliceOps of dynamically-typed tensors are now supported.

Differential Revision: https://reviews.llvm.org/D151444
2023-05-25 19:15:13 +02:00
Matthias Springer
047e7ff253 [mlir][tensor] TrackingListener: Find replacement ops through cast-like InsertSliceOps
Certain InsertSliceOps that do not use elements from the destination are treated like casts when looking for replacement ops. Such InsertSliceOps are typically rank expansions.

Tensors with dynamic shape are not supported at the moment.

Also adds test cases for the TrackingListener.

Differential Revision: https://reviews.llvm.org/D151422
2023-05-25 18:49:24 +02:00