llvm-project

Author	SHA1	Message	Date
Slava Zakharin	71e0261fb0	[flang][runtime] Added Fortran::common::optional for use on device. This is a simplified implementation of std::optional that can be used in the offload builds for the device code. The methods are properly marked with RT_API_ATTRS so that the device compilation succeeds. Reviewers: klausler, jeanPerier Reviewed By: jeanPerier Pull Request: https://github.com/llvm/llvm-project/pull/85177	2024-03-15 14:25:47 -07:00
Slava Zakharin	76facde32c	[flang][runtime] Enable more APIs in the offload build. (#76486)	2023-12-28 13:50:43 -08:00
Slava Zakharin	b4b23ff7f8	[flang][runtime] Enable more APIs in the offload build. (#75996) This patch enables more numeric (mod, sum, matmul, etc.) APIs, and some others. I added new macros to disable warnings about using C++ STD methods like operators of std::complex, which do not have the __device__ attribute. This may result in unresolved references if the header files' implementation relies on libstdc++. I will need to follow up on this.	2023-12-20 11:52:51 -08:00
Slava Zakharin	4d9771741d	[flang] Improved performance of runtime Matmul/MatmulTranspose. This patch mostly affects performance of the code produced by HLFIR lowering. If a MATMUL argument is an array slice, then HLFIR lowering passes the slice to the runtime, whereas FIR lowering would create a contiguous temporary for the slice. Performance might be better than the generic implementation for cases where the leading dimension is contiguous. This patch improves CPU2000/178.galgel, making the HLFIR version faster than the FIR version (due to avoiding the temporary copies for MATMUL arguments). Reviewed By: klausler Differential Revision: https://reviews.llvm.org/D159134	2023-08-29 17:04:00 -07:00
Slava Zakharin	ea7d6a1bd6	[NFC][flang] Distinguish MATMUL and MATMUL-TRANSPOSE printouts. When MatmulTranspose reports incorrect shapes of the arguments, it cannot represent itself as MATMUL, because the reading of the first argument's shape will be confusing. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D155911	2023-07-21 12:56:50 -07:00
Tom Eccles	4ff8ba72b5	[flang] add fused matmul-transpose to the runtime. This fused operation should run a lot faster than first transposing the lhs array and then multiplying the matrices separately. Based on flang/runtime/matmul.cpp. Depends on D145959. Reviewed By: klausler Differential Revision: https://reviews.llvm.org/D145960	2023-03-17 09:30:04 +00:00

6 Commits