The old fir.allocmem operation returned a !fir.heap<.> type. The new
fir.alloca operation returns a !fir.ref<.> type. This patch inserts a
fir.convert so that the old type is preserved. This prevents verifier
failures when types returned from fir.if statements don't match the
expected type.
Differential Revision: https://reviews.llvm.org/D151921
Although I was convinced that the use of divide didn't produce any
divide instructions, it does in fact add more instructions than using
a plain shift operation.
This patch simply changes the divide to a shift right, with an
assert to check that the "divisor" is a power of two.
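As a rough illustration of the idea (not the code in this patch), the sketch below replaces an unsigned divide with a right shift and asserts that the divisor is a power of two. The helper names `shiftAmountFor` and `divideByShift` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns log2(divisor), asserting that the
// divisor is a non-zero power of two.
inline unsigned shiftAmountFor(std::uint64_t divisor) {
  assert(divisor != 0 && (divisor & (divisor - 1)) == 0 &&
         "divisor must be a power of two");
  unsigned shift = 0;
  while ((divisor >> shift) != 1)
    ++shift;
  return shift;
}

// Divides an unsigned value by a power-of-two divisor using a shift.
inline std::uint64_t divideByShift(std::uint64_t value,
                                   std::uint64_t divisor) {
  return value >> shiftAmountFor(divisor);
}
```

For example, `divideByShift(40, 8)` shifts right by three and yields 5.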
Reviewed By: kiranchandramohan, tblah
Differential Revision: https://reviews.llvm.org/D151880
In upstream MLIR, the dialect conversion infrastructure is used for
lowering from one dialect to another: the passes are of the form
XToYPass. Transformations within the same dialect, by contrast, tend to use
applyPatternsAndFoldGreedily.
In this case, the full complexity of applyPatternsAndFoldGreedily isn't
needed so we can get away with the simpler applyOpPatternsAndFold.
This change was suggested by @jeanPerier
The old differential revision for this patch was
https://reviews.llvm.org/D150853
Re-applying here with a fix for the issue which led to the patch being reverted. The
issue was caused by erasing uses of the allocation operation while still iterating
over those uses (leading to a use-after-free). I have added a regression
test which catches this bug for -fsanitize=address builds, but it is
hard to reliably cause a crash from the use-after-free in normal builds.
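One common way to avoid this class of bug is to collect the items to erase first and erase them in a second step. The sketch below shows that pattern on a plain `std::list`; `UseList` and `eraseMatchingUses` are hypothetical stand-ins, and this is not necessarily how the patch itself is written.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Hypothetical stand-in for an operation's list of uses.
using UseList = std::list<int>;

// Collects the uses to erase first, then erases them afterwards, so
// the list is never modified while it is being traversed.
inline void eraseMatchingUses(UseList &uses, int doomed) {
  std::vector<UseList::iterator> toErase;
  for (auto it = uses.begin(); it != uses.end(); ++it)
    if (*it == doomed)
      toErase.push_back(it);
  // std::list::erase only invalidates the iterator being erased, so
  // the remaining collected iterators stay valid.
  for (auto it : toErase)
    uses.erase(it);
  assert(toErase.size() <= toErase.capacity());
}
```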
Differential Revision: https://reviews.llvm.org/D151728
This patch makes arrays with more than two dimensions work, by fixing
the way that the loop index is calculated and removing the restriction
on the number of dimensions.
This also changes the way that the actual index is calculated, such that
the stride is used rather than the extent of the previous dimension. Some
tests failed without this fix. This was likely a latent bug in the
2D version too, but it was found in a test using 3D arrays, so it wouldn't
have been found with 2D only. This introduces a division in the index
calculation. However, the divisor should be a constant, allowing
a shift to be used for the division, or the division may otherwise be
removed by using other methods to calculate the result. In analysing code
generated with optimisation at -O3, no divides are produced.
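A minimal sketch of a stride-based index calculation follows. It assumes the strides are byte strides and that each one is a multiple of the element size; `elementOffset` is a hypothetical helper, not the code generated by the pass.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Computes the element offset of a multi-dimensional index from byte
// strides (an assumption of this sketch). Dividing each stride by the
// element size is the division mentioned above; with a power-of-two
// element size it can be done with a shift.
inline std::size_t elementOffset(const std::vector<std::size_t> &indices,
                                 const std::vector<std::size_t> &byteStrides,
                                 std::size_t elemSize) {
  assert(indices.size() == byteStrides.size());
  assert(elemSize != 0);
  std::size_t offset = 0;
  for (std::size_t d = 0; d < indices.size(); ++d)
    offset += indices[d] * (byteStrides[d] / elemSize);
  return offset;
}
```

Using the stride instead of the extent of the previous dimension means the same calculation also holds when the array is not contiguous, for example when the leading dimension is padded or sliced.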
Also includes some minor refactoring to avoid repeatedly asking for the
"rank" of the array being worked on.
This improves some of the SPEC-2017 ROMS code, in the same way as the
limited 2D array improvements - less overhead spent calculating array
indices in the inner-most loop and better use of vector-instructions.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D151140
Remove old clause operands from acc.parallel operation since
the new dataOperands is now in place.
private, firstprivate and reductions will receive some redesign but are
not part of the new dataOperands.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D150207
Since the new data operand operations have been added in D148389 and
adopted on acc.data in D149673, the old clause operands are no longer
needed.
The LegalizeDataOpForLLVMTranslation will become obsolete when all
operations have been cleaned up. For the time being, only the appropriate
parts are being removed.
processOperands will also receive some updates once all the operands
come from an acc data operand operation.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D150155
Since the new data operand operations have been added in D148389 and
adopted on acc.exit_data in D149601, the old clause operands are no longer
needed.
The LegalizeDataOpForLLVMTranslation will become obsolete when all
operations have been cleaned up. For the time being, only the appropriate
parts are being removed.
processOperands will also receive some updates once all the operands
come from an acc data operand operation.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D150145
Since the new data operand operations have been added in D148389 and
adopted on acc.enter_data in D148721, the old clause operands are no longer
needed.
The LegalizeDataOpForLLVMTranslation will become obsolete when all
operations have been cleaned up. For the time being, only the appropriate
parts are being removed.
processOperands will also receive some updates once all the operands
come from an acc data operand operation.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D150132
Since the new data operand operations have been added in D148389 and
adopted on acc.update in D149909, the old clause operands are no longer
needed. This is a first patch to start cleaning the OpenACC operations
with data clause operands.
The `LegalizeDataOpForLLVMTranslation` will become obsolete when all
operations have been cleaned up. For the time being, only the appropriate
parts are being removed.
`processOperands` will also receive some updates once all the operands
come from an acc data operand operation.
Reviewed By: razvanlupusoru, jeanPerier
Differential Revision: https://reviews.llvm.org/D150053
Another test based on review comments added late in the review.
This one confirms that the outer index is multiplied and added to the
inner index to form the 2D index.
Reviewed By: tblah
Differential Revision: https://reviews.llvm.org/D149265
These two tests were created from little snippets added late
in the review of the loop versioning work. The code was fixed
to cope with the situation and correctly compile these samples.
This adds tests to avoid regressions in this area.
Reviewed By: tblah
Differential Revision: https://reviews.llvm.org/D148649
Introduce conditional code to identify stride of "one element", and simplify the array accesses for that case.
This allows better loop performance in various benchmarks.
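The general shape of this kind of loop versioning can be sketched as below, assuming a rank-1 array of `double` described by a base address and a byte stride. `sumVersioned` is a hypothetical function written for illustration and is not the code the pass emits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Sums n doubles starting at base, where consecutive elements are
// byteStride bytes apart. A run-time check selects a simpler loop
// when the stride is exactly one element.
inline double sumVersioned(const char *base, std::size_t n,
                           std::size_t byteStride) {
  assert(base != nullptr || n == 0);
  double total = 0.0;
  if (byteStride == sizeof(double)) {
    // Fast path: the data is contiguous, so plain indexing suffices.
    const double *data = reinterpret_cast<const double *>(base);
    for (std::size_t i = 0; i < n; ++i)
      total += data[i];
  } else {
    // General path: compute each element address from the stride.
    for (std::size_t i = 0; i < n; ++i) {
      double value;
      std::memcpy(&value, base + i * byteStride, sizeof(value));
      total += value;
    }
  }
  return total;
}
```

The contiguous version gives the optimiser a loop with simple unit-stride accesses, which is typically easier to vectorise.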
Reviewed By: tblah, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D141306
Remove the custom parser and printer for the acc.parallel
operation and use the assembly format directly.
Reviewed By: PeteSteinfeld, razvanlupusoru
Differential Revision: https://reviews.llvm.org/D148183
Similar to D148039 but for the FIR to LLVM IR
conversion pass.
The inner part of the acc.loop has been removed since the rest of the
pipeline is not ready and would raise an error here. This was passing
until now because the acc.loop was discarded completely.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D148057
fir.if currently isn't treated as a 'proper' conditional, so passes are at times unable to determine which regions are executed.
This patch gives fir.if this interface, which shouldn't do too much on its own but should allow future changes to take advantage
of it for various purposes.
Reviewed By: vzakhari
Differential Revision: https://reviews.llvm.org/D145165
Previously the mask would be loaded as the appropriate integer type and cast to I1 to pass to
fir.if. However, this truncates the integer and so would cast 6 to 0. By loading values as logicals
and casting to I1 this problem is avoided.
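The difference between the two conversions can be shown with plain integers. The sketch assumes that converting a logical to I1 behaves like a comparison against zero; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Truncating to one bit keeps only the lowest bit, so 6 (0b110)
// becomes 0.
inline bool truncateToI1(std::int32_t mask) { return (mask & 1) != 0; }

// Assumed behaviour of a logical-to-I1 conversion: any non-zero
// value is true.
inline bool logicalToI1(std::int32_t mask) { return mask != 0; }

// Small demonstration of the mismatch described above.
inline void demonstrateMaskConversion() {
  assert(!truncateToI1(6)); // truncation loses the "true" value
  assert(logicalToI1(6));   // comparison against zero keeps it
}
```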
Reviewed By: Leporacanthicus
Differential Revision: https://reviews.llvm.org/D144974
Previously COUNT would cast the mask input to logical<4> before passing it
to the runtime function. This has been changed to allow different types of logical.
Reviewed By: tblah
Differential Revision: https://reviews.llvm.org/D144867
This patch adds minloc to the simplify intrinsics pass. Calls with KIND or MASK arguments are supported, while calls that have BACK or DIM arguments, or a CHARACTER input array, are rejected. This patch is targeting exchange2, and in benchmarks provides a ~11% improvement in performance.
Also included are some minor style changes / cleanup in simplifyIntrinsics.cpp.
Reviewed By: vzakhari
Differential Revision: https://reviews.llvm.org/D144103
Some functions (e.g. the main function) end with a call to the STOP
statement instead of a func.return. This is lowered as a call to the
stop runtime function followed by a fir.unreachable. fir.unreachable is
a terminator and so this can cause functions to have no func.return.
The stack arrays pass looks to see which heap allocations have always
been freed by the time a function returns. Without any returns, the pass
does not detect any freed allocations. This patch changes this behaviour
so that fir.unreachable is checked as well as func.return.
This allows 15 heap allocations for array temporaries in spec2017
exchange2's main function to be moved to the stack.
Differential Revision: https://reviews.llvm.org/D143918
When rank > 1, the initial value would be lost on inner loops, leading to the wrong
value being returned. This patch fixes this to use the correct initial value for all
cases. For example, the following would previously return T:
```
Integer :: m(0,10)
Any(m .eq. 0)
```
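A sketch of the intended behaviour for a rank-2 array follows, assuming column-major storage. The accumulator is initialised once and carried through both loops, so a zero-size array yields false. `anyEqualsZero` is a hypothetical function, not the generated code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if any element of a rows x cols column-major array is
// zero. The result starts at false (the initial value for ANY) and is
// threaded through the inner loop rather than being reset there.
inline bool anyEqualsZero(const std::vector<int> &data, std::size_t rows,
                          std::size_t cols) {
  assert(data.size() == rows * cols);
  bool result = false;
  for (std::size_t j = 0; j < cols; ++j)
    for (std::size_t i = 0; i < rows; ++i)
      result = result || (data[j * rows + i] == 0);
  return result;
}
```

With extents (0,10) the inner loop never runs, so the initial value is what gets returned.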
Reviewed By: vdonaldson
Differential Revision: https://reviews.llvm.org/D143899
This patch provides a simplified version of the Any intrinsic as well as the All intrinsic
that can be used for inlining or simpler use cases. These changes are targeting exchange2, and
provide a ~9% performance increase.
Reviewed By: Leporacanthicus, vzakhari
Differential Revision: https://reviews.llvm.org/D142977
The implementation of -fstack-arrays was added in
https://reviews.llvm.org/D140415
The new macro BoolOptionWithoutMarshalling in Options.td avoids
generating code to store the flags in clang data structures. For
example, writing something like
defm stack_arrays : BoolOption<"f", "stack-arrays",
CodeGenOpts<"StackArrays">, [...]
would generate code referring to `clang::CodeGenOpts::StackArrays`, which
does not exist.
Differential Revision: https://reviews.llvm.org/D140972
This pass implements the `-fstack-arrays` flag. See the RFC in
`flang/docs/fstack-arrays.md` for more information.
Differential revision: https://reviews.llvm.org/D140415
Simple fix to check for rank in the same way as other intrinsics to allow
runtime count to take over when dealing with unknown dimension arrays.
Fixes #60356
Reviewed By: Leporacanthicus
Differential Revision: https://reviews.llvm.org/D142877
This patch adds a simplified version of count for the simplify intrinsics pass, allowing the function to be inlined.
This was done specifically to help improve performance for exchange2, and provides a ~12% performance increase.
Reviewed By: vzakhari, Leporacanthicus
Differential Revision: https://reviews.llvm.org/D142209
This ensures that functions in included files have the correct path
in their file metadata.
Note: This patch also sets all locations to have the full path names.
Reviewed By: vzakhari, PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D142263
Previously, the DISubroutineType attribute used an optional result
parameter and an optional argument types array to model the subroutine
signature. LLVM IR debug metadata, on the other hand, has one types
list whose first entry maps to the result type. That entry may be
null to model a void result type. The type list may also be entirely
empty, specifying no type information at all. The latter is problematic
since the current DISubroutineType attribute cannot express it.
The revision changes DISubroutineTypeAttr to closely follow the
LLVM metadata design. In particular, it uses a single types parameter
array to model the subroutine signature and introduces an explicit
DIVoidResultTypeAttr to model the null entries.
Reviewed By: Dinistro
Differential Revision: https://reviews.llvm.org/D141261
This reverts commit 81f57b6
and relands commit a960547
Fixes flang build and drop_begin on an empty array ref.
Recent changes to MLIR meant that Flang does not generate any debug line
table information.
This patch adds a pass that provides some foundation work with which
basic line table debug info can be generated. A walk is performed on
all the `func` ops in the module and they are decorated with a fusedLoc
op that contains the debug metadata for the subroutine along with
location information.
Alternatives include populating this info during lowering or during FIR
to LLVM Dialect conversion.
Note: Patches in future will add
-> more realistic debug info for types and other fields.
-> driver flags to control generation of debug.
Fixes #58634.
Reviewed By: awarzynski, vzakhari
Differential Revision: https://reviews.llvm.org/D137956
In general, the meaning of fastmath flags on a call during inlining
is that the call's operation flags must be ignored. For user functions
that means that the fastmath flags used for the function definition
override any call site's fastmath flags. For intrinsic functions
we can use the call site's fastmath flags, but we have to make sure
that the call sites with different flags produce/use different
simplified versions of the same intrinsic function.
Differential Revision: https://reviews.llvm.org/D138048
Create simplified functions for each rank with "x<rank>" suffix
that implement multidimensional reductions. To enable this I had to fix
an issue with taking incorrect box shape in cases of sliced embox/rebox.
Differential Revision: https://reviews.llvm.org/D133820
The SUM function does appear to be safe to use, so remove the
experimental flag for the SUM operation.
Reviewed By: vzakhari, awarzynski
Differential Revision: https://reviews.llvm.org/D132567
Add simplification pass for MAXVAL intrinsic function.
This refactors some of the code to allow variation on the
initialization value and operation performed within the loop,
reusing the majority of code for both SUM and MAXVAL.
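The idea of sharing one loop between SUM and MAXVAL can be sketched as a reduction parameterised on its initial value and its combining operation. The names below are hypothetical, and using the lowest representable value as the MAXVAL initial value is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Generic reduction loop: only the initial value and the operation
// differ between the intrinsics that reuse it.
template <typename T, typename Op>
T reduce(const std::vector<T> &data, T init, Op op) {
  T acc = init;
  for (const T &value : data)
    acc = op(acc, value);
  return acc;
}

// SUM: start from zero and add.
inline int sumOf(const std::vector<int> &data) {
  return reduce(data, 0, [](int a, int b) { return a + b; });
}

// MAXVAL: start from the lowest value (assumed) and take the maximum.
inline int maxvalOf(const std::vector<int> &data) {
  return reduce(data, std::numeric_limits<int>::lowest(),
                [](int a, int b) { return std::max(a, b); });
}

// Small demonstration with negative inputs, where a zero initial
// value for MAXVAL would give the wrong answer.
inline void demonstrateReductions() {
  const std::vector<int> values{-5, -2, -9};
  assert(sumOf(values) == -16);
  assert(maxvalOf(values) == -2);
}
```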
Adds tests for the test-cases that produce different output
than the SUM function.
Reviewed By: vzakhari
Differential Revision: https://reviews.llvm.org/D132234
Under some conditions, the defining op may be NULL, so
accept that rather than try to use it and crash!
Adds test to prevent regression
Fixes github issue #57201
Reviewed By: vzakhari
Differential Revision: https://reviews.llvm.org/D132238
The current code may not always work correctly, e.g.:
https://github.com/llvm/llvm-project/issues/57201
I added 'enable-experimental' pass option so that SUM simplification
may be enabled in LIT tests, but it is not enabled when the pass
is added to the passes pipeline.
Differential Revision: https://reviews.llvm.org/D131640
Fix one encountered (issue #57072) and two potential scenarios where the
code would ask for an operand that isn't there.
Add test for the encountered case.
Reviewed By: vzakhari
Differential Revision: https://reviews.llvm.org/D131671
Fortran runtime supports mixed types by casting the loaded values
to the result type, so DOT_PRODUCT simplification has to do the same.
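A minimal sketch of the casting behaviour described above: each loaded value is converted to the result type before the multiply and add. The `dotProduct` template is hypothetical and only models the arithmetic, not the simplified function generated by the pass.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dot product of two vectors with possibly different element types.
// Both loaded values are cast to the result type R before use.
template <typename R, typename A, typename B>
R dotProduct(const std::vector<A> &a, const std::vector<B> &b) {
  assert(a.size() == b.size());
  R acc = R{};
  for (std::size_t i = 0; i < a.size(); ++i)
    acc += static_cast<R>(a[i]) * static_cast<R>(b[i]);
  return acc;
}
```

For an integer vector {1, 2, 3} and a real vector {0.5, 1.5, 2.0}, `dotProduct<double>` computes 0.5 + 3.0 + 6.0 in double precision.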
Differential Revision: https://reviews.llvm.org/D131726
Find calls to FortranASum{Real8,Integer4}, check for dim and mask
arguments being absent - then produce an inlineable simple
version of the sum function.
(No longer a prototype, please review for push to llvm/main - not sure how to make Phabricator update the review with actual commit message)
Reviewed By: peixin, awarzynski
Differential Revision: https://reviews.llvm.org/D125407