The separation here doesn't make much sense. I think it's a
leftover from the creation of the MC layer that has been
replicated to new targets.
By merging them we can avoid passing the AsmPrinter to the
MCInstLowering functions. We can make them member functions instead.
I think we can still do more integration of lowerSymbolOperand
and lowerRISCVVMachineInstrToMCInst, but I wanted to get feedback
on the direction first.
Reviewed By: asb, barannikov88
Differential Revision: https://reviews.llvm.org/D152311
We recently fixed a bug in "sparsifying" such reductions, since
it incorrectly changed this into reductions over stored elements
only, which only works for add/sub/or/xor. However, we still want
to be able to "sparsify" the reductions even in the general case,
and this is a first step by rewriting them into a custom reduction
that feeds in the implicit zeros. Note, however, that in the long run
we want to do this better and feed in any implicit zero only once
for efficiency.
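To illustrate why a reduction over stored elements only is not enough in general, here is a minimal C++ sketch. It is not the MLIR sparsifier code: `SparseVec`, `reduceStoredOnly`, and `reduceWithImplicitZeros` are hypothetical names used only for this example, and the sparse vector is modeled as a logical size plus a list of stored values.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sparse vector: `size` logical elements, of which only the
// entries in `values` are stored; every other element is an implicit zero.
struct SparseVec {
  std::size_t size;
  std::vector<std::int64_t> values;
};

// Reduce over the stored elements only. This is correct only when an
// implicit zero cannot change the result (e.g. add, or, xor).
template <typename Op>
std::int64_t reduceStoredOnly(const SparseVec &v, std::int64_t init, Op op) {
  std::int64_t acc = init;
  for (std::int64_t x : v.values)
    acc = op(acc, x);
  return acc;
}

// Reduce over the stored elements and additionally feed in one zero per
// implicit (unstored) element. Correct for any reduction, at the cost of
// redundant work.
template <typename Op>
std::int64_t reduceWithImplicitZeros(const SparseVec &v, std::int64_t init,
                                     Op op) {
  std::int64_t acc = reduceStoredOnly(v, init, op);
  for (std::size_t i = v.values.size(); i < v.size; ++i)
    acc = op(acc, 0);
  return acc;
}
```

For a vector of logical size 4 with stored values 2 and 3, a sum gives 5 either way, but a product gives 6 over the stored elements and 0 once the implicit zeros are fed in. For a product, feeding in a single zero would already give the same answer, which is presumably the kind of saving the long-run note has in mind.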
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D152580
Without this patch a `DW_ATE_complex_float` encoding trips an assertion in
`DebugHandlerBase::isUnsignedDIType` with the message `"Unsupported
encoding"`.
By adding a case to the `assert` for `DW_ATE_complex_float` it becomes
supported, behaving in the same way as the already supported `DW_ATE_float`
type (return false).
Note: For the reported reproducer:
    #include <complex.h>
    int main() {
      long double complex r1;
    }
The assertion isn't tripped without assignment tracking because instcombine
deletes everything, including the `dbg.declare`, without recovering any
location information. Whereas with assignment tracking we track a zeroing
memset that is emitted by clang.
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D151795
This reverts commit 859b05b02d3fd9ab6b77f2bed8df6902fe704806.
Also reverts these follow-ups:
Revert "[RDF] Remove `constexpr` from `hash"
This reverts commit 621507ce20ad8eef2986be2712631165e53b7d91.
Revert "[RDF] Do not use trailing return type after all, NFC"
This reverts commit 46e19e3a2c45e7fb5f501bdb983a7151c158304f.
Revert "[RDF] Stop looking when reached code node in getNextRef with NextOnly"
This reverts commit a049ce9d1bd5a7c1c4fcccc6a801b72b00ea8e0f.
Revert "[RDF] Use trailing return type syntax, NFC"
This reverts commit d3b34b7f3a7cbfc96aea897419f167b5ee19e61a.
Revert "[RDF] Define short type names: NodeAddr<XyzNode*> -> Xyz, NFC"
This reverts commit f8ed60b56d1948422dda924fcf450560591e8a19.
This broke building the TSan runtime on Mac, see comment on
5548843d69
> so that they get an error on other targets. This change uses let statements to
> apply `Flags = [TargetSpecific]` to options (mostly -m*) without specifying `Flags`.
> Follow-up to D151590.
>
> For some options, e.g. -mdefault-build-attributes (D31813), -mbranch-likely
> (D38168), -mfpu=/-mabi= (6890b9b71e525020ab58d436336664beede71575), a warning
> seems desired in at least certain cases. This is not the best practice, but this
> change works around them by not applying `Flags = [TargetSpecific]`.
>
> (
> For Intel CPU errata -malign-branch= family options, we also drop the unneeded
> NotXarchOption flag. This flag reports an error if the option is used with
> -Xarch_*. This error reporting does not seem very useful.
> )
This reverts commit 5548843d692a92a7840f14002debc3cebcb3cdc3.
The last use of getABITypeAlignment was removed by:
commit 26bd6476c61f08fc8c01895caa02b938d6a37221
Author: Guillaume Chatelet <gchatelet@google.com>
Date: Fri Jan 13 15:05:24 2023 +0000
Differential Revision: https://reviews.llvm.org/D152670
This patch migrates the emitOffloadingArrays and EmitNonContiguousDescriptor functions from Clang codegen to OpenMPIRBuilder.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D149872
This patch extends the Linalg vectoriser so that scalar loads are
correctly identified as scalar rather than gather loads. Below is an
example of a scalar load (note that both indices are loop invariant):
```
func.func @example(%arg0: tensor<80x16xf32>, %arg2: tensor<1x4xf32>) -> tensor<1x4xf32> {
  %c8 = arith.constant 8 : index
  %c16 = arith.constant 16 : index
  %1 = linalg.generic {
    indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>],
    iterator_types = ["parallel", "parallel"]
  } outs(%arg2 : tensor<1x4xf32>) {
  ^bb0(%out: f32):
    %2 = linalg.index 0 : index
    %extracted = tensor.extract %arg0[%2, %c16] : tensor<80x16xf32>
    linalg.yield %extracted : f32
  } -> tensor<1x4xf32>
  return %1 : tensor<1x4xf32>
}
```
This patch also makes sure that these scalar loads are indeed lowered to
a scalar load followed by a broadcast:
```
%extracted = tensor.extract %arg0[%1, %c16] : tensor<80x16xf32>
%2 = vector.broadcast %extracted : f32 to vector<1x4xf32>
```
Differential Revision: https://reviews.llvm.org/D149678
There is no need to set a big default stack size for PAL code object indirect
calls. The driver knows the max recursion depth, so it can compute a more
accurate value from the minimum scratch size.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D150609
This is based on ideas from @nafi to:
- use a branchless version of 'cmp' for 'uint32_t',
- completely resolve the lexicographic comparison through vector
operations when wide types are available. We also get rid of byte
reloads and serializing '__builtin_ctzll'.
I did not include the suggestion to replace comparisons of 'uint16_t'
with two 'uint8_t' as it did not seem to help the codegen. This can
be revisited in subsequent patches.
The code has been rewritten to reduce nested function calls, making the
job of the inliner easier and preventing harmful code duplication.
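As an illustration of the first bullet, a branchless three-way comparison for 'uint32_t' can be sketched as follows. This is not the code from the patch: `cmpU32`, `loadBE32`, and `cmpBlock4` are hypothetical names, and the big-endian load is an assumption made here so that integer order matches the byte-wise lexicographic order.

```cpp
#include <cstdint>

// Three-way comparison of two 32-bit values with no branch in the source:
// returns -1, 0, or 1. Using two comparisons avoids the overflow that a
// plain 32-bit subtraction would have.
constexpr int cmpU32(std::uint32_t a, std::uint32_t b) {
  return static_cast<int>(a > b) - static_cast<int>(a < b);
}

// Assembles four bytes as a big-endian integer, so that comparing the
// integers orders the blocks the same way a byte-by-byte comparison would.
constexpr std::uint32_t loadBE32(const unsigned char *p) {
  return (std::uint32_t{p[0]} << 24) | (std::uint32_t{p[1]} << 16) |
         (std::uint32_t{p[2]} << 8) | std::uint32_t{p[3]};
}

// Compares two 4-byte blocks lexicographically.
constexpr int cmpBlock4(const unsigned char *lhs, const unsigned char *rhs) {
  return cmpU32(loadBE32(lhs), loadBE32(rhs));
}
```

Whether the compiler actually emits branch-free machine code for `cmpU32` depends on the target and optimization level.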
Reviewed By: nafi3000
Differential Revision: https://reviews.llvm.org/D148717
Detail: Follow up to D144999, where we emitted DWARF for non-canonical personality.
Reviewed By: jyknight
Differential Revision: https://reviews.llvm.org/D152540
Many math functions need to check the floating point rounding mode to
return correct values. Currently most of them use the internal implementation
of `fegetround`, which is platform-dependent and prevents math functions from
being enabled on platforms where `fegetround` is not implemented. In this
change, we add platform-independent rounding mode checks and switch math
functions to use them instead. https://github.com/llvm/llvm-project/issues/63016
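One way to check the rounding mode without `fegetround` is to observe how a few carefully chosen additions round. The sketch below shows the idea only. The function names are hypothetical, hexadecimal float literals require C++17, and it assumes IEEE 754 single precision with `float` arithmetic evaluated in `float` (no excess precision).

```cpp
// The volatile inputs keep the compiler from folding the additions at
// compile time, so they are evaluated under the current rounding mode.

// 1 + 2^-25 lies a quarter ulp above 1; only rounding upward moves it off 1.
inline bool roundIsUp() {
  volatile float tiny = 0x1.0p-25f;
  return (1.0f + tiny) != 1.0f;
}

// -1 - 2^-25 lies a quarter ulp below -1; only rounding downward moves it.
inline bool roundIsDown() {
  volatile float tiny = 0x1.0p-25f;
  return (-1.0f - tiny) != -1.0f;
}

// 1.5 + 2^-24 and 1.5 - 2^-24 are both exactly halfway between 1.5 and a
// neighbouring float. Only round-to-nearest (ties to even) maps both of
// them back to 1.5; every directed mode makes the two results differ.
inline bool roundIsNearest() {
  volatile float halfUlp = 0x1.0p-24f;
  float y = halfUlp;
  return (1.5f + y) == (1.5f - y);
}
```

In the default rounding mode, `roundIsNearest()` is expected to return true and the other two false.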
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D152280
Fix the verification failure reported in
https://reviews.llvm.org/D141712#4413647. We need to remove the
load from the VN table as well, not just the leader table.
Also make sure that this verification always runs when assertions
are enabled, rather than only when -debug is passed.
Other such tests, of which there are many, are to be updated with
separate patches.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D152557
We would like to move the preamble index out of the critical path.
This patch is an RFC to get feedback on the correct implementation and potential pitfalls to take into consideration.
I am not entirely sure whether the lazy AST initialisation could cause problems when the preamble AST is used in parallel. I tried with a TSan-enabled clangd and it seems to work OK (at least for the cases I tried).
Reviewed By: kadircet
Differential Revision: https://reviews.llvm.org/D148088
This patch introduces an MLIR attribute to the OpenMP dialect
representing the clauses that a 'requires' directive can define.
The `OffloadModuleInterface` is also updated to provide methods to get
and set a new dialect attribute `omp.requires`, to allow storing and using this
information during the lowering stages to LLVM IR.
Differential Revision: https://reviews.llvm.org/D147214
This patch fixes the equations on the Quantization page
(https://mlir.llvm.org/docs/Quantization/).
I don't know what caused the equations to break; it
might be https://github.com/llvm/mlir-www/pull/152, but
I'm not sure. Regardless, let's just fix it and be
done with it.
I've fixed the equations by moving some subscripts to
the text. For some reason, the large number of subscripts
caused MathJax to fail. I've also tried KaTeX, which
failed at exactly the same number of subscripts.
The workflow to inspect the fix is as follows:
```
$ git clone --depth=1 https://github.com/llvm/mlir-www.git /some/path/mlir-www
$ git clone --depth=1 https://github.com/llvm/llvm-project.git /some/path/llvm-project
$ cp /some/path/llvm-project/mlir/docs/Quantization.md \
/some/path/mlir-www/website/content/Quantization.md
$ cd /some/path/mlir-www/website
$ hugo serve
[...]
Web Server is available at http://localhost:1313/ (bind address 127.0.0.1)
Press Ctrl+C to stop
```
and view the page at http://localhost:1313/Quantization/.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D152651
This fold goes against the usual approach of pushing freeze into
operands. The idea behind the fold is that if the setcc feeds into
a brcond, the freeze can be dropped entirely.
Move the fold to brcond, where we can remove the freeze directly.
This ensures that there can be no infinite combine loops due to
conflicting transforms.
Differential Revision: https://reviews.llvm.org/D152544
If we have a load/store with an illegal fixed-length vector result type that
needs to be widened, e.g. `x:v6i32 = load p`,
then instead of just widening it to `x:v8i32 = load p`,
we can widen it to the equivalent VP operation and set the EVL to the
exact number of elements needed: `x:v8i32 = vp_load a, b, mask=true, evl=6`,
provided that the target supports vp_load/vp_store on the widened type.
Scalable vectors are already widened this way where possible, so this
largely reuses the same logic.
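The effect of the EVL on the widened load can be modeled in plain C++. This is only a sketch of the semantics under an all-true mask, not SelectionDAG code, and `vpLoadModel` is a hypothetical name.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Models a widened VP load with an all-true mask: only the first `evl`
// lanes read memory. Lanes at or beyond `evl` are zero-filled here purely
// to keep the model deterministic; no particular value should be assumed
// for them in the real operation.
template <std::size_t Lanes>
std::array<std::int32_t, Lanes> vpLoadModel(const std::int32_t *p,
                                            std::size_t evl) {
  std::array<std::int32_t, Lanes> result{};
  for (std::size_t i = 0; i < Lanes && i < evl; ++i)
    result[i] = p[i];
  return result;
}
```

With `Lanes = 8` and `evl = 6`, the model touches exactly the six elements of the original `v6i32`, whereas a plain 8-lane load would also read the two elements past them.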
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D148713
While working on the ongoing migration to strict handling of value
categories (see https://discourse.llvm.org/t/70086), I ran into issues related
to losing the value associated with an optional.
This issue is hinted at in the existing comments, but the issue didn't become
sufficiently clear to me from those, so I thought it would be worth capturing
more details, along with ideas for how this issue might be fixed.
Reviewed By: ymandel
Differential Revision: https://reviews.llvm.org/D152369
This commit reworks the methods that dump traces with resource usage to take into account the StartAtCycle value added by https://reviews.llvm.org/D150310.
For each i, the values of the lists StartAtCycle and ReservedCycles are printed as the interval [StartAtCycle[i], ReservedCycles[i]):
```
... | StartAtCycle[i] | ... | ReservedCycles[i] - 1 | ReservedCycles[i] | ...
| xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx | |
```
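The half-open interval convention can be sketched as a small rendering helper. This is not the scheduler's actual trace code or output format; `renderResourceRow` and its parameters are hypothetical.

```cpp
#include <string>

// Renders one resource row of `totalCycles` columns: cycle c is marked 'x'
// when startAtCycle <= c < reservedCycles, and left blank otherwise.
std::string renderResourceRow(unsigned startAtCycle, unsigned reservedCycles,
                              unsigned totalCycles) {
  std::string row(totalCycles, ' ');
  for (unsigned c = startAtCycle; c < reservedCycles && c < totalCycles; ++c)
    row[c] = 'x';
  return row;
}
```

For example, `renderResourceRow(1, 4, 6)` marks cycles 1, 2, and 3 and leaves cycle 4 blank, since the upper bound is exclusive.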
Reviewed By: andreadb
Differential Revision: https://reviews.llvm.org/D150311