llvm-project

Author	SHA1	Message	Date
Michael Kruse	77581e2751	Reapply "[Flang] Remove FLANG_INCLUDE_RUNTIME (#124126 )" This reverts commit 27539c3f903be26c487703943d3c27d45d4542b2. Retry with new buildbot configuration after master restart. Original message: Remove the FLANG_INCLUDE_RUNTIME option which was replaced by LLVM_ENABLE_RUNTIMES=flang-rt. The FLANG_INCLUDE_RUNTIME option was added in #122336 which disables the non-runtimes build instructions for the Flang runtime so they do not conflict with the LLVM_ENABLE_RUNTIMES=flang-rt option added in #110217. In order to not maintain multiple build instructions for the same thing, this PR completely removes the old build instructions (effectively forcing FLANG_INCLUDE_RUNTIME=OFF). As per discussion in https://discourse.llvm.org/t/buildbot-changes-with-llvm-enable-runtimes-flang-rt/83571/2 we now implicitly add LLVM_ENABLE_RUNTIMES=flang-rt whenever Flang is compiled in a bootstrapping (non-standalone) build. Because it is possible to build Flang-RT separately, this behavior can be disabled using `-DFLANG_ENABLE_FLANG_RT=OFF`. Also see the discussion an implicitly adding runtimes/projects in #123964.	2025-04-30 12:32:49 +02:00
Kaviya Rajendiran	857ac4c229	[MLIR][OpenMP] Lowering nontemporal clause to LLVM IR for SIMD directive (#118751 ) This patch, - Added a new attribute `nontemporal` to fir.load and fir.store operation in the FIR dialect. - Added a pass `lower-nontemporal` which is called before FIRToLLVM conversion pass and adds the nontemporal attribute to loads and stores on the list items specified in the nontemporal clause of the SIMD directive. - Set the `UnitAttr:$nontemporal` to llvm.load and llvm.store operations during FIR to LLVM dialect conversion, if the corresponding fir.load or fir.store operations have the nontemporal attribute. - Attached the `nontemporal metadata` to load and store instructions that have the nontemporal attribute, during LLVM dialect to LLVM IR translation.	2025-04-30 11:13:20 +05:30
Alexander Richardson	e4332e4706	[flang] Fix build when examples are disabled Without this change I get a build error due to the missing Bye target when I configure my build with -DLLVM_INCLUDE_EXAMPLES=OFF. This check for LLVM_BUILD_EXAMPLES matches the checks in llvm and lld. Reviewed By: mgorny Pull Request: https://github.com/llvm/llvm-project/pull/137908	2025-04-29 22:42:12 -07:00
Valentin Clement (バレンタインクレメン)	d5272e4f41	[flang][cuda] Only copy global that have effective use (#137890 )	2025-04-29 16:52:55 -07:00
Razvan Lupusoru	1a6b0413e0	[flang][acc] Fix issue with privatization recipe for box ref (#137869 ) When privatizing allocatable/pointer arrays, the code was creating a temporary but this was a box type. This led to inconsistency between the input and output of recipe. The updated logic now creates storage when a box ref is requested.	2025-04-29 13:13:29 -07:00
Pranav Bhandarkar	7dd8122d4e	[Flang][MLIR][OpenMP] - Add support for firstprivate when translating omp.target ops from MLIR to LLVMIR (#131213 ) This patch adds support to translate `firstprivate` clauses on `omp.target` ops when translating from MLIR to LLVMIR. Presently, this PR is restricted to supporting only included tasks, i.e `#omp target nowait firstprivate(some_variable)` will likely not work correctly even if it produces object code.	2025-04-29 14:53:15 -05:00
Slava Zakharin	82c036e2de	[flang] Restructured TBAA trees in AddAliasTags pass. (#136725 ) This patch produces the following TBAA tree for a function: ``` Function root \| "any access" \| \|- "descriptor member" \|- "any data access" \| \|- "dummy arg data" \|- "target data" \| \|- "allocated data" \|- "direct data" \|- "global data" ``` The TBAA tags are assigned using the following logic: * All POINTER variables point to the root of "target data". * Dummy arguments without POINTER/TARGET point to their leafs under "dummy arg data". * Dummy arguments with TARGET point to the root of "target data". * Global variables without descriptors point to their leafs under "global data" (including the ones with TARGET). * Global variables with descriptors point to their leafs under "direct data" (including the ones with TARGET). * Locally allocated variables point to their leafs under "allocated data" (including the ones with TARGET). This change makes it possible to disambiguate globals like: ``` module data real, allocatable :: a(:) real, allocatable, target :: b(:) end ``` Indeed, two direct references to global vars cannot alias even if any/both of them have TARGET attribute. In addition, the dummy arguments without POINTER/TARGET cannot alias any other variable even with POINTER/TARGET. This was not expressed in TBAA before this change. As before, any "unknown" memory references (such as with Indirect source, as classified by FIR alias analysis) may alias with anything, as long as they point to the root of "any access". Please think of the counterexamples for which this structure may not work.	2025-04-29 10:17:11 -07:00
Daniel Chen	3f8035961f	[driver] return immediately in `addArchSpecificRPath` and `getArchSpecificLibPaths` on AIX (#134520 ) `addArchSpecificRPath` should immediately return for AIX as AIX doesn't support `rpath` option. `getArchSpecificLibPaths` should return as well as we don't want `-L/ArchSepcificLibPaths` sent to the linker on AIX.	2025-04-29 10:39:52 -04:00
khaki3	dd2a1590c3	[flang][acc] Generate constructors and destructors for common blocks (#137691 )	2025-04-28 16:11:29 -07:00
Kiran Chandramohan	1b5cd1dfb3	[Flang][OpenMP] Permit loop construct in simd regions (#137020 ) Simdizable constructs are permitted in a simd region. The loop construct is a simdizable construct. Also fixes the TODO corresponding to this.	2025-04-28 16:19:59 +01:00
Raghu Maddhipatla	63d5e64f1e	[Flang] [Semantics] [OpenMP] Add semantic checks for ALLOCATE directive (#123421 ) Add following semantic checks for ALLOCATE directive as per OpenMP 6.0 standard. - List item in ALLOCATE directive must not be a dummy argument - List item in ALLOCATE directive must not have POINTER attribute - List item in ALLOCATE directive must not be a associate name	2025-04-25 09:41:39 -05:00
Valentin Clement (バレンタインクレメン)	09b012fa2d	[flang][openacc] Fix wait clause printer (#137263 ) wait clause printer is failing with case like: ``` !$acc serial device_type(nvidia) wait !$acc end serial ```	2025-04-25 07:35:42 -07:00
Asher Mancinelli	41f1663f11	[flang] Use correct int extension flags for C-ABI calls on aarch64 (#137105 ) The AArch64 procedure call standard does not mandate that the callee extends the return value. Clang does not add signext to functions returning i8 or i16 on linux aarch64, but flang does. This means that runtime routines returning i8's will have signext on the callsite/declaration, but not on the implementation, and the call site will assume the return value has already been sign extended when it has not. This showed up in a test case calling MINVAL on an array of INTEGER*1. Adjust our integer extension flags to match clang and aarch64pcs on linux. The behavior on Darwin should be preserved. This is listed on the apple developer guide as a divergence from aarch64pcs.	2025-04-25 06:57:56 -07:00
Valentin Clement (バレンタインクレメン)	c8dc3ed9c4	[flang][cuda] Convert gpu.launch_func with result (#137231 ) We cannot use `replaceOp` because the newly created operation has not the same number of results.	2025-04-24 12:13:30 -07:00
Erich Keane	2e389cb9aa	[Flang][OpenACC] Make async clause on data consistent with elsewhere (#136866 ) in #136610 we agreed that all async clauses on compute constructs should act as 'only 1 per device-type-group'. On `data`, it has the same specification language, and the same real requirements, so it seems sensible to make it work the same way.	2025-04-23 09:03:03 -07:00
Valentin Clement (バレンタインクレメン)	f11b3decdd	[flang][cuda] Carry over the CUDA attribute in target rewrite (#136811 )	2025-04-23 08:23:02 -07:00
Krzysztof Parzyszek	b5eae19f64	[flang][OpenMP] Introduce OmpHintClause, simplify OmpAtomicClause (#136311 ) The OmpAtomicClause is a variant of a few specific clauses that are used on the ATOMIC construct. The HINT clause, however, was represented as a generic OmpClause, which somewhat complicated the analysis of an OmpAtomicClause. Introduce OmpHintClause to represent the contents of the HINT clause, and use it on OmpAtomicClause similarly to how OmpFailClause is used.	2025-04-22 14:05:32 -05:00
Razvan Lupusoru	b20e063a90	[flang][acc] Generate acc.bounds operation from FIR shape (#136637 ) This PR adds support to be able to generate `acc.bounds` operation through `MappableType`'s `generateAccBounds` when there is no fir.box entity. This is especially useful because the FIR type does not capture size information for explicit-shape arrays and current implementation relied on finding the box entity. This scenario is possible because during HLFIRtoFIR, `fir.array_coor` and `fir.box_addr` operations are often optimized to use raw address. If one tries to map the ssa value that represents such a variable, correct dimensions need extracted from the shape information held in the fir declare operation.	2025-04-22 09:30:49 -07:00
Tom Eccles	97eb416c65	[flang][Parser][OpenMP] Fix unparser for cancellation_construct_type (#136001 ) Previously the unparser would print like ``` !$OMP CANCEL CANCELLATION_CONSTRUCT_TYPE(SECTIONS) ``` This is not valid Fortran. I have fixed it to print without the clause name.	2025-04-22 10:55:50 +01:00
Valentin Clement (バレンタインクレメン)	46e734746d	[flang][cuda] Update stream type for cuf kernel op (#136627 ) Update the type of the stream operand to be similar to KernelLaunchOp.	2025-04-21 19:22:07 -07:00
Razvan Lupusoru	e79d8f6892	[flang][acc] Update stride calculation to include inner-dimensions (#136613 ) The acc.bounds operation allows specifying stride - but it did not clarify what it meant. The dialect was updated to specifically note that stride must capture inner dimension sizes when specified for outer dimensions. Flang lowering was also updated for OpenACC to adhere to this. This was already the case for descriptor-based arrays - but now this is also being done for all arrays.	2025-04-21 16:03:12 -07:00
Valentin Clement (バレンタインクレメン)	3ceb3d96ff	[flang][openacc] Make async clause behavior homogenous (#136610 ) The `async` clause was not handed in a similar way on `serial`, `parallel` and `kernels` directive. This patches updates the `ACC.td` file and the flang semantic to make it homogenous.	2025-04-21 14:48:47 -07:00
Valentin Clement (バレンタインクレメン)	d08e980065	[flang][cuda] Only convert launch from CUDA Fortran kernels (#136221 ) Make sure `gpu.launch_func` has a CUDA proc attribute and update the conversion pattern to only convert those with the attribute.	2025-04-21 10:51:48 -07:00
Daniel Chen	5133b432bc	Enable `-m32`, `-maix32` and `-maix64` for Flang on AIX. (#136202 ) This PR enables `-m32`, -`maix32` and `-maix64` for AIX only. For other platforms, the driver will issue an error that `-m32` is not supported.	2025-04-21 12:05:48 -04:00
Slava Zakharin	86a03367bf	[flang] Support fir.pack_array in FIR alias analysis. (#131946 ) `fir.pack_array` is just a pass-through op for the process of finding the source in FIR alias analysis (as defined in #127147).	2025-04-21 08:59:41 -07:00
Slava Zakharin	50db7a7d26	[flang] Fixed fir.dummy_scope generation to work for TBAA. (#136382 ) The nesting of fir.dummy_scope operations defines the roots of the TBAA forest. If we do not generate fir.dummy_scope in functions that do not have any dummy arguments, then the globals accessed in the function and the dummy arguments accessed by the callee may end up in different sub-trees of the same root. The added tbaa-with-dummy-scope2.fir demonstrates the issue.	2025-04-18 17:19:12 -07:00
Slava Zakharin	15e662bf06	[NFC][flang] Removed literal numerical references from some LIT tests. (#136346 )	2025-04-18 13:12:50 -07:00
Valentin Clement (バレンタインクレメン)	7c9f4f1128	[flang][openacc] Make num_gangs, num_workers and vector_length behavior homogenous with parallel (#136341 )	2025-04-18 13:11:37 -07:00
Peter Klausler	aac53ad4d6	[flang] Don't perform macro replacement on exponents (#136176 ) See new test. I inadvertently broke this behavior with a recent fix for another problem, because the effects of the overloaded TokenSequence::Put() member function on token merging were confusing. Rename and document the various overloads.	2025-04-18 12:52:04 -07:00
Peter Klausler	544940846d	[flang][CUDA] Add error & warning for device argument first dimension… (#136058 ) … discontiguity For dummy assumed-shape/-rank device arrays, test the associated actual argument for stride-1 contiguity, and report an error when the actual argument is known to not be stride-1 contiguous and nonempty, or a warning when when the actual argument is not known to be empty or stride-1 contiguous.	2025-04-18 12:51:38 -07:00
Peter Klausler	6a7044a7b8	[flang] Improve OpenACC SELF clause parser (#135883 ) The current parser can fail on "self(x * 2)" by recognizing just "x" as a one-element list of object names and then failing at a higher level because it never reached the right parenthesis. Add lookahead checks and error recovery. Fixes https://github.com/llvm/llvm-project/issues/135810.	2025-04-18 12:51:00 -07:00
Peter Klausler	b4ff435b84	[flang] Fix fixed-form continuations of !$ OpenMP conditional lines (#135852 ) I broke fixed-form line continuation (without !$) for OpenMP !$ conditional compilation lines. Fix it.	2025-04-18 12:50:34 -07:00
Peter Klausler	46387cd184	[flang] Compile the output of -fdebug-unparse-with-modules (#135696 ) The output of a compilation with the -fdebug-unparse-with-modules option comprises its normal unparsed output along with the regenerated contents of any modules that were required from module files. This is handy for producing stand-alone test cases. The modules' contents are generated by the same code that writes module files, so they can contain some USE associations to private entities in other modules that are necessary to complete local declarations, usually initializers. Such USE associations to private entities are not flagged as fatal errors when modules are read from module files, but they currently are caught when the output produced by this option is being read back in to the compiler. Handle this case by softening the error to a warning when one module uses a private entity from another with an alias containing the non-conforming '$' character. (I could have omitted the message altogether, but there are other valid warnings that will occur due to undefined function result variables; further, I didn't want to provide a general hole around the protection of private names.)	2025-04-18 12:48:55 -07:00
Peter Klausler	21a406c92c	[flang] Improve runtime SAME_TYPE_AS() (#135670 ) The present implementation of the intrinsic function SAME_TYPE_AS() yields false positive .TRUE. results for distinct derived types that happen to have the same name. Replace with an implementation that can now depend on derived type information records being the same type if and only if they are at the same location, or are PDT instantiations of the same uninstantiated derived type. And ensure that the derived type information includes references from instantiated PDTs to their original types. (The derived type information format supports these references already, but they were not being set, perhaps because the current faulty SAME_TYPE_AS implementation didn't need them, and nothing else does.) Fixes https://github.com/llvm/llvm-project/issues/135580.	2025-04-18 12:48:33 -07:00
Slava Zakharin	4091f4dd96	Reland [flang] Generalized simplification of HLFIR reduction ops. (#136071 ) (#136246 ) This change generalizes SumAsElemental inlining in SimplifyHLFIRIntrinsics pass so that it can be applied to ALL, ANY, COUNT, MAXLOC, MAXVAL, MINLOC, MINVAL, SUM. This change makes the special handling of the reduction operations in OptimizedBufferization redundant: once HLFIR operations are inlined, the hlfir.elemental inlining should do the rest of the job.	2025-04-18 11:56:07 -07:00
Valentin Clement (バレンタインクレメン)	d79bb93278	[flang][cuda] Carry over the stream information to kernel launch (#136217 ) In CUDA Fortran the stream is encoded in an INTEGER(cuda_stream_kind) variable. This information is carried over the GPU dialect through the `cuf.stream_cast` and the token in the GPU ops. When converting the `gpu.launch_func` to runtime call, the `cuf.stream_cast` becomes a no-op and the reference to the stream is passed to the runtime. The runtime is adapted to take integer references instead of value for stream.	2025-04-18 10:44:18 -07:00
Slava Zakharin	ce0c472791	Revert "Reland [flang] Generalized simplification of HLFIR reduction ops. (#136071 )" This reverts commit 32311a6b68d3de4642599abe14922c686bdb30fc.	2025-04-17 17:26:55 -07:00
Slava Zakharin	32311a6b68	Reland [flang] Generalized simplification of HLFIR reduction ops. (#136071 ) This change generalizes SumAsElemental inlining in SimplifyHLFIRIntrinsics pass so that it can be applied to ALL, ANY, COUNT, MAXLOC, MAXVAL, MINLOC, MINVAL, SUM. This change makes the special handling of the reduction operations in OptimizedBufferization redundant: once HLFIR operations are inlined, the hlfir.elemental inlining should do the rest of the job.	2025-04-17 16:19:47 -07:00
Slava Zakharin	f39242ceed	Revert "[flang] Generalized simplification of HLFIR reduction ops." (#136218 ) Reverts llvm/llvm-project#136071	2025-04-17 15:47:46 -07:00
Slava Zakharin	655b9db7b9	[flang] Generalized simplification of HLFIR reduction ops. (#136071 ) This change generalizes SumAsElemental inlining in SimplifyHLFIRIntrinsics pass so that it can be applied to ALL, ANY, COUNT, MAXLOC, MAXVAL, MINLOC, MINVAL, SUM. This change makes the special handling of the reduction operations in OptimizedBufferization redundant: once HLFIR operations are inlined, the hlfir.elemental inlining should do the rest of the job.	2025-04-17 15:42:48 -07:00
Razvan Lupusoru	91c2607aac	[flang][acc] Avoid implicitly privatizing IVs already privatized (#136181 ) When generating `acc.loop`, the IV was always implicitly privatized. However, if the user explicitly privatized it, the IR generated wasn't quite right. For example: ``` !$acc loop private(i) do i = 1, n a(i) = b(i) end do ``` The IR generated looked like: ``` %65 = acc.private varPtr(%19#0 : !fir.ref<i32>) -> !fir.ref<i32> {implicit = true, name = "i"} %66:2 = hlfir.declare %65 {uniq_name = "_QFEi"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>) %67 = acc.private varPtr(%66#0 : !fir.ref<i32>) -> !fir.ref<i32> {name = "i"} acc.loop private(@privatization_ref_i32 -> %65 : !fir.ref<i32>, @privatization_ref_i32 -> %67 : !fir.ref<i32>) control(%arg0 : i32) = (%c1_i32_46 : i32) to (%c10_i32_47 : i32) step (%c1_i32_48 : i32) { fir.store %arg0 to %66#0 : !fir.ref<i32> ``` In order to fix this, we first process all of the clauses. Then when attempting to generate implicit private IV, we look for an already existing data clause operation. The result is the following IR: ``` %65 = acc.private varPtr(%19#0 : !fir.ref<i32>) -> !fir.ref<i32> {name = "i"} %66:2 = hlfir.declare %65 {uniq_name = "_QFEi"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>) acc.loop private(@privatization_ref_i32 -> %65 : !fir.ref<i32>) control(%arg0 : i32) = (%c1_i32_46 : i32) to (%c10_i32_47 : i32) step (%c1_i32_48 : i32) { fir.store %arg0 to %66#0 : !fir.ref<i32> ```	2025-04-17 13:11:52 -07:00
Valentin Clement (バレンタインクレメン)	91f9f0fa1e	[flang][cuda] Update cuf.kernel_launch stream and conversion (#136179 ) Update `cuf.kernel_launch` to take the stream as a reference. Update the conversion to insert the `cuf.stream_cast` op so the stream can be set as dependency.	2025-04-17 12:55:08 -07:00
Slava Zakharin	da959c92c5	[flang] Fixed out-of-bounds access in SimplifyIntrinsics. (#136171 ) When the mask is scalar, it is incorrect to cast it to !fir.box<!fir.array<1xlogical<>>>, because the coordinate operation will try to read the dim-1 stride from the box to get the address of the first element. Even though the stride value will be multiplied by 0, and does not matter, it is still a read past the allocated box object. Instead, we should just use box_addr to get the address of the scalar mask.	2025-04-17 11:46:06 -07:00
Daniel Chen	e2c382346f	[flang] Add 32-bit AIX target specific in order to build 32-bit flang-rt (#136051 )	2025-04-17 14:26:36 -04:00
Valentin Clement (バレンタインクレメン)	9ee4fdf499	[flang][cuda] Introduce stream cast op (#136050 ) Cast a stream object reference as a GPU async token. This is useful to be able to connect the stream representation of CUDA Fortran and the async mechanism of the GPU dialect. This op will later become a no op.	2025-04-17 07:25:48 -07:00
Tom Eccles	c5e112eed7	[flang][OpenMP][Semantics] Disallow NOWAIT and ORDERED with CANCEL (#135991 ) NOWAIT was a tricky one because the clause can be on either the start or the end directive. I couldn't find a convenient way to access the end directive from the CANCEL directive nested inside of the construct, but there are convenient ways to access the start directive. I have added a list to the start directive context containing the clauses from the end directive.	2025-04-17 10:08:07 +01:00
Asher Mancinelli	f3bf844d2f	[flang] Unwrap sequence types when checking for descriptor members (#136039 ) The TBAA generation gives conservative TBAA metadata when handling an access of a record type with a descriptor member, since the access may be a regular data access OR another descriptor. Array members were being incorrectly identified as non-descriptor-members, and were giving incorrect TBAA metadata which led to bugs showing up in the optimizer when LLVM encountered mismatching TBAA. `fir::isRecordWithDescriptorMember` now unwraps sequence types before checking for descriptor members.	2025-04-16 17:15:15 -07:00
David Truby	e64305096a	[flang] Complete alignment of -x language modes with gfortran (#133775 )	2025-04-16 23:26:20 +01:00
Kareem Ergawy	30990c09c9	Revert "[flang][fir] Lower `do concurrent` loop nests to `fir.do_concurrent` (#132904 )" (#135904 ) This reverts commit 04b87e15e40f8857e29ade8321b8b67691545a50. The reasons for reverting is that the following: 1. I still need need to upstream some part of the do concurrent to OpenMP pass from our downstream implementation and taking this in downstream will make things more difficult. 2. I still need to work on a solution for modeling locality specifiers on `hlfir.do_concurrent` ops. I would prefer to do that and merge the entire stack together instead of having a partial solution. After merging the revert I will reopen the origianl PR and keep it updated against main until I finish the above.	2025-04-16 07:20:27 -05:00
Kareem Ergawy	04b87e15e4	[flang][fir] Lower `do concurrent` loop nests to `fir.do_concurrent` (#132904 ) Adds support for lowering `do concurrent` nests from PFT to the new `fir.do_concurrent` MLIR op as well as its special terminator `fir.do_concurrent.loop` which models the actual loop nest. To that end, this PR emits the allocations for the iteration variables within the block of the `fir.do_concurrent` op and creates a region for the `fir.do_concurrent.loop` op that accepts arguments equal in number to the number of the input `do concurrent` iteration ranges. For example, given the following input: ```fortran do concurrent(i=1:10, j=11:20) end do ``` the changes in this PR emit the following MLIR: ```mlir fir.do_concurrent { %22 = fir.alloca i32 {bindc_name = "i"} %23:2 = hlfir.declare %22 {uniq_name = "_QFsub1Ei"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>) %24 = fir.alloca i32 {bindc_name = "j"} %25:2 = hlfir.declare %24 {uniq_name = "_QFsub1Ej"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>) fir.do_concurrent.loop (%arg1, %arg2) = (%18, %20) to (%19, %21) step (%c1, %c1_0) { %26 = fir.convert %arg1 : (index) -> i32 fir.store %26 to %23#0 : !fir.ref<i32> %27 = fir.convert %arg2 : (index) -> i32 fir.store %27 to %25#0 : !fir.ref<i32> } } ```	2025-04-16 06:14:38 +02:00

1 2 3 4 5 ...

5724 Commits