The compiler can recognize a compiler directive when one results from a
macro expansion at the beginning of a non-comment source line, as in
"#define FOO !$OMP". But it can't recognize a compiler directive that
initially appears as a comment line, as in "!BAR" after "#define BAR
$OMP". Extend the prescanner to recognize such cases in free form
source. (Fixed form is a much more complicated case for this recognition
and will be addressed later if needed.)
This is the 2nd version of this patch; the first was reverted after
problems with continuation lines were encountered.
Fixes https://github.com/llvm/llvm-project/issues/178481.
…ent (#180062)"
This reverts commit 0d64801bc3b99a73d20032f74df3b87e0a7ed04e.
Lines like "!MACRO ... &" in which MACRO expands to a compiler directive
are now failing because they are not being recognized as having line
continuations. Will fix and try again.
The compiler can recognize a compiler directive when one results from a
macro expansion at the beginning of a non-comment source line, as in
"#define FOO !$OMP". But it can't recognize a compiler directive that
initially appears as a comment line, as in "!BAR" after "#define BAR
$OMP". Extend the prescanner to recognize such cases in free form
source. (Fixed form is a much more complicated case for this recognition
and will be addressed later if needed.)
Fixes https://github.com/llvm/llvm-project/issues/178481.
A character in a fixed-form source line (as opposed to a comment or
compiler directive) can't have a non-digit in its label field, columns 1
through 5. The prescanner presently emits only a warning for this case.
Retain the warning for -E output, but otherwise diagnose an error.
(This change affected a number of tests that relied on this situation
being a warning just so that they could test prescanner warnings, and
those tests were adjusted to use another warning.)
Fixes https://github.com/llvm/llvm-project/issues/50563.
Reapplication of #137828, changes:
* Workaround CMAKE_Fortran_PREPROCESS_SOURCE issue for CMake < 2.24: The
issue is that `try_compile` does not forward manually-defined compiler
flang variables to the test build environment; instead of just a
negative test result, it aborts the configuration step itself. To be
fair, manually defining these variables is deprecated since at least
CMake 3.6.
* Missing flang cmd line flags for CMake < 3.28 `-target=`, `-O2`, `-O3`
* It is now possible to set FLANG_RT_ENABLED_STATIC=OFF and
FLANG_RT_ENABLE_SHARED=OFF at the same and is the default for amdgpu and
nvptx targets. In this mode, only the .mod files are compiled --
necessary for module files in
lib/clang/22/finclude/flang/(nvptx64-nvidia-cuda|amdgpu-amd-amdhsa)/*.mod
to be available.
* For compiling omp_lib.mod for nvptx and amdgpu, the module build
functionality must be hoisted out if openmp's runtime/ directory which
is only included for host targets. This PR now requires #169909.
Move building the .mod files from openmp/flang to openmp/flang-rt using
a shared mechanism. Motivations to do so are:
1. Most modules are target-dependent and need to be re-compiled for each
target separately, which is something the LLVM_ENABLE_RUNTIMES system
already does. Prime example is `iso_c_binding.mod` which encodes the
target's ABI. Constants such as [`c_long_double` also have different
values](d748c81218/flang-rt/lib/runtime/iso_c_binding.f90 (L77-L81)).
Most other modules have `#ifdef`-enclosed code as well. For instance
this caused offload targets nvptx64-nvidia-cuda/amdgpu-amd-amdhsa to use
the modules files compiled for the host which may contrain uses of the
types REAL(10) or REAL(16) not available for nvptx/amdgpu.
#146876#128015#129742#158790
3. CMake has support for Fortran that we should use. Among other things,
it automatically determines module dependencies so there is no need to
hardcode them in the CMakeLists.txt.
4. It allows using Fortran itself to implement Flang-RT. Currently, only
`iso_fortran_env_impl.f90` emits object files that are needed by Fortran
applications (#89403). The workaround of #95388 could be reverted (PR
#169525).
If using Flang for cross-compilation or target-offloading, flang-rt must
now be compiled for each target not only for the library, but also to
get the target-specific module files. For instance in a bootstrapping
runtime build, this can be done by adding:
`-DLLVM_RUNTIME_TARGETS=default;nvptx64-nvidia-cuda;amdgpu-amd-amdhsa`.
Some new dependencies come into play:
* openmp depends on flang-rt for building `lib_omp.mod` and
`lib_omp_kinds.mod`. Currently, if flang-rt is not found then the
modules are not built.
* check-flang depends on flang-rt: If not found, the majority of tests
are disabled. If not building in a bootstrpping build, the location of
the module files can be pointed to using
`-DFLANG_INTRINSIC_MODULES_DIR=<path>`, e.g. in a flang-standalone
build. Alternatively, the test needing any of the intrinsic modules
could be marked with `REQUIRES: flangrt-modules`.
* check-flang depends on openmp: Not a change; tests requiring
`lib_omp.mod` and `lib_omp_kinds.mod` those are already marked with
`openmp_runtime`.
As intrinsic are now specific to the target, their location is moved
from `include/flang` to `<resource-dir>/finclude/flang/<triple>`. The
mechnism to compute the location have been moved from flang-rt
(previously used to compute the location of `libflang_rt.*.a`) to common
locations in `cmake/GetToolchainDirs.cmake` and
`runtimes/CMakeLists.txt` so they can be used by both, openmp and
flang-rt. Potentially the mechnism could also be shared by other
libraries such as compiler-rt.
`finclude` was chosen because `gfortran` uses it as well and avoids
misuse such as `#include <flang/iso_c_binding.mod>`. The search location
is now determined by `ToolChain` in the driver, instead of by the
frontend. Another subdirectory `flang` avoids accidental inclusion of
gfortran-modules which due to compression would result in
user-unfriendly errors. Now the driver adds `-fintrinsic-module-path`
for that location to the frontend call (Just like gfortran does).
`-fintrinsic-module-path` had to be fixed for this because ironically it
was only added to `searchDirectories`, but not
`intrinsicModuleDirectories_`. Since the driver determines the location,
tests invoking `flang -fc1` and `bbc` must also be passed the location
by llvm-lit. This works like llvm-lit does for finding the include dirs
for Clang using `-print-file-name=...`.
Move building the .mod files from openmp/flang to openmp/flang-rt using
a shared mechanism. Motivations to do so are:
1. Most modules are target-dependent and need to be re-compiled for each
target separately, which is something the LLVM_ENABLE_RUNTIMES system
already does. Prime example is `iso_c_binding.mod` which encodes the
target's ABI. Most other modules have `#ifdef`-enclosed code as well.
2. CMake has support for Fortran that we should use. Among other things,
it automatically determines module dependencies so there is no need to
hardcode them in the CMakeLists.txt.
3. It allows using Fortran itself to implement Flang-RT. Currently, only
`iso_fortran_env_impl.f90` emits object files that are needed by Fortran
applications (#89403). The workaround of #95388 could be reverted.
Some new dependencies come into play:
* openmp depends on flang-rt for building `lib_omp.mod` and
`lib_omp_kinds.mod`. Currently, if flang-rt is not found then the
modules are not built.
* check-flang depends on flang-rt: If not found, the majority of tests
are disabled. If not building in a bootstrpping build, the location of
the module files can be pointed to using
`-DFLANG_INTRINSIC_MODULES_DIR=<path>`, e.g. in a flang-standalone
build. Alternatively, the test needing any of the intrinsic modules
could be marked with `REQUIRES: flangrt-modules`.
* check-flang depends on openmp: Not a change; tests requiring
`lib_omp.mod` and `lib_omp_kinds.mod` those are already marked with
`openmp_runtime`.
As intrinsic are now specific to the target, their location is moved
from `include/flang` to `<resource-dir>/finclude/flang/<triple>`. The
mechnism to compute the location have been moved from flang-rt
(previously used to compute the location of `libflang_rt.*.a`) to common
locations in `cmake/GetToolchainDirs.cmake` and
`runtimes/CMakeLists.txt` so they can be used by both, openmp and
flang-rt. Potentially the mechnism could also be shared by other
libraries such as compiler-rt.
`finclude` was chosen because `gfortran` uses it as well and avoids
misuse such as `#include <flang/iso_c_binding.mod>`. The search location
is now determined by `ToolChain` in the driver, instead of by the
frontend. Now the driver adds `-fintrinsic-module-path` for that
location to the frontend call (Just like gfortran does).
`-fintrinsic-module-path` had to be fixed for this because ironically it
was only added to `searchDirectories`, but not
`intrinsicModuleDirectories_`. Since the driver determines the location,
tests invoking `flang -fc1` and `bbc` must also be passed the location
by llvm-lit. This works like llvm-lit does for finding the include dirs
for Clang using `-print-file-name=...`.
The compiler presently tokenizes the bodies of only function-like macro
definitions from the command line, and does so crudely. Tokenize
keyword-like macros too, get character literals right, and handle
numeric constants correctly. (Also delete two needless functions noticed
in characters.h.)
Fixes https://github.com/llvm/llvm-project/issues/168077.
An OpenMP, OpenACC, or CUDA conditional line should be treated as a
comment when that's what its payload contains, not as a conditional
source line that will confuse the parser when it is indeed just a
comment.
Some Fortran source-level features that work for OpenMP !$ conditional
lines, such as free form line continuation, don't work for OpenACC !@acc
or CUDA !@cuf conditional lines. Make them less particular.
Fixes https://github.com/llvm/llvm-project/issues/164470.
Some old code in the prescanner, antedating the current -E output
mechanisms, retains the !DIR$ FIXED and !DIR$ FREE directives in the
input, and will even generate them to append to the scanned source from
source and include files to restore the fixed/free source form
distinction. But these directives have not been needed since the -E
output generator began generating source form insensitive output, and
they can confuse the parser's error recovery when the appended
directives follow the END statement. Change their handling so that
they're read and respected by the prescanner but no longer retained in
either the -E output or the cooked character stream passed on to the
parser.
Fixes a regression reported by @danielcchen after PR 159834.
Predefine the `__pic__/__pie__/__PIC__/__PIE__` macros based on the
configured relocation level. This logic mirrors that of the clang
driver, where `__pic__/__PIC__` are defined for both PIC and PIE modes,
but `__pie__/__PIE__` are only defined for PIE mode.
Fixes https://github.com/llvm/llvm-project/issues/135275
Fixes#148386
The first time the line was classified (using
`Prescanner::ClassifyLine(const char *)`) the line was correctly
classified as a compiler directive. But then later on the token form is
invoked (`Prescanner::ClassifyLine(TokenSequence, Provenance)`). This
one incorrectly classified the line as a comment because there was no
whitespace token right after the sentinel. This fixes the issue by
ensuring this whitespace is added.
When blank tokens arise from macro replacement in token sequences with
token pasting (##), the preprocessor is producing some bogus tokens
(e.g., "name(") that can lead to subtle bugs later when macro names are
not recognized as such.
The fix is to not paste tokens together when the result would not be a
valid Fortran or C token in the preprocessing context.
…ion line
An obsolete flag ("insertASpace_") is being used to signal some cases in
the prescanner's implementation of continuation lines when a token
should be broken when it straddles a line break. It turns out that it's
sufficient to simply note these cases without ever actually inserting a
space, so don't do that (fixing the motivating bug). This leaves some
variables with obsolete names, so change them as well.
This patch handles the third of the three bugs reported in
https://github.com/llvm/llvm-project/issues/146362 .
`%flang` expands to `flang -isysroot <SDK location>` in Mac and probably
other OS as well. `fc1` is only accepted as the first argument and hence
in this case it fails.
Use the `%flang_fc1` option to correctly expand to `flang -fc1 -isysroot
<SDK location>`.
This commit adds support for the `__COUNTER__` preprocessor macro, which
works the same as the one found in clang.
It is useful to generate unique names at compile-time.
When processing free form source line continuation, the prescanner
treats empty keyword macros as if they were spaces or tabs. After
skipping over them, however, there's code that only works if the skipped
characters ended with an actual space or tab. If the last skipped item
was an empty keyword macro's name, the last character of that name would
end up being the first character of the continuation line. Fix.
Address failing Fujitsu test suite cases that were broken by the patch
to defer the handling of !$ lines in -fopenmp vs. normal compilation to
actual compilation rather than processing them immediately in -E mode.
Tested on the samples in the bug report as well as all of the Fujitsu
tests that I could find that use !$ lines.
Fixes https://github.com/llvm/llvm-project/issues/136845.
Fixed the issue, where the extra text on #else line (' Z' in the example
below) caused the data from the "else" clause to be processed together
with the data of "then" clause.
```
#ifndef XYZ42
PARAMETER(A=2)
#else Z
PARAMETER(A=3)
#endif
end
```
See new test. I inadvertently broke this behavior with a recent fix for
another problem, because the effects of the overloaded
TokenSequence::Put() member function on token merging were confusing.
Rename and document the various overloads.
Recent work to better handle macro replacement in literal constant kind
suffixes isn't handling fixed form well, leading to a crash in Fujitsu
test 0113/0113_0073.F. The look-ahead needs to be done with the
higher-level prescanner functions that skip over fixed form comment
fields after column 72. Rework.
See test case. When Fortran line continuation has been used, don't
insert spaces in -E formatted output to put things into the right
column, as this can break up a token.
Fixes https://github.com/llvm/llvm-project/issues/134986.
No longer require -fopenmp or -fopenacc with -E, unless specific version
number options are also required for predefined macros. This means that
most source can be preprocessed with -E and then later compiled with
-fopenmp, -fopenacc, or neither.
This means that OpenMP conditional compilation lines (!$) are also
passed through to -E output. The tricky part of this patch was dealing
with the fact that those conditional lines can also contain regular
Fortran line continuation, and that now has to be deferred when !$ lines
are interspersed.
The preprocessor can perform macro replacement within identifiers when
they are split up with Fortran line continuation, but is failing to do
macro replacement on a continued identifier when none of its parts are
replaced.
`-fd-lines-as-code` and `-fd-lines-as-comments` enables treatment for
lines beginning with `d` or `D` in fixed form sources.
Using these options in free form has no effect.
If the `-fd-lines-as-code` option is given they are treated as if the
first column contained a blank.
If the `-fd-lines-as-comments` option is given, they are treated as
comment lines.
On Darwin, the --isysroot flag must also be specified. This
happens when either %flang or %flang_fc1 is expanded. As -fc1 must
be the first argument, %flang_fc1 must be used in tests, instead of
%flang -fc1.
Commas being optional in FORMAT statements, the tokenization of things
like 3I9HHOLLERITH is tricky. After tokenizing the initial '3', we don't
want to then take apparent identifier "I9HHOLLERITH" as the next token.
So the prescanner just consumes the letter ("I") as its own token in
this context.
A recent bug report complained that this can lead to incorrect results
when (in this case) the letter is a defined preprocessing macro. I
updated the prescanner to check that the letter is actually followed by
an instance of a problematic Hollerith literal.
And this broke two tests in the Fujitsu Fortran test suite that Linaro
runs, as it couldn't detect a following Hollerith literal that wasn't on
the same source line. We can't do look-ahead line continuation
processing in NextToken(), either.
So here's a second attempt at fixing the original problem: namely, the
letter that follows a decimal integer token is checked to see whether
it's the name of a defined macro.
In order to properly expose the Hollerith editing item in something like
FORMAT(3I9HHOLLERITH) as its own token, the tokenization routine in the
prescanner has special handling for digit strings followed by letters
("3I" above). This handler's effects are too broad, and prevent a macro
name from being recognized as such in a reported bug; make the test for
a hidden Hollerith more precise.
Fixes https://github.com/llvm/llvm-project/issues/121931.
A free form source line that begins with a macro should still be
classified as a source line, and have its continuation lines work, even
if the macro expands to an empty replacement.
Fixes https://github.com/llvm/llvm-project/issues/117297.
The character 'e' or 'd' (either case) shouldn't be tokenized as part of
a real literal during preprocessing if it is not followed by an
optionally-signed digit string. Doing so prevents it from being
recognized as a macro name, or as the start of one.
Fixes https://github.com/llvm/llvm-project/issues/115676.
When a fixed form source line begins with the name of a macro, don't
emit the usual warning message about a non-decimal character in the
label field. (The check for a macro was only being applied to free form
source lines, and the label field checking was unconditional).
Fixes https://github.com/llvm/llvm-project/issues/116914.
The preprocessor implements "defined(X)" and "defined X" in if/elif
directive expressions in such a way that they only work at the top
level, not when they appear in macro expansions. Fix that, which is a
little tricky due to the need to detect the "defined" keyword before
applying any macro expansion to its argument, and add a bunch of tests.
Fixes https://github.com/llvm/llvm-project/issues/114064.
When running fixed-form source through the compiler under -E, don't
aggressively remove space characters, since the parser won't be parsing
the result and some tools might need to see the spaces in the -E
preprocessed output.
Fixes https://github.com/llvm/llvm-project/issues/112279.
…INCLUDE lines
The compiler current treats an INCLUDE line as essentially a synonym for
a preprocessing #include directive. The causes macros that have been
defined at the point where the INCLUDE line is processed to be replaced
within the text of the included file.
This behavior is surprising to users who expect an INCLUDE line to be
expanded into its contents *after* preprocessing has been applied to the
original source file, with no further macro expansion.
Change INCLUDE line processing to use a fresh instance of Preprocessor
containing no macro definitions except _CUDA in CUDA Fortran
compilations and, if the original file was being preprocessed, the
standard definitions of __FILE__, __LINE__, and so forth.
Codes using traditional C preprocessors will sometimes put a keyword
macro name in a free form continuation line in order to get macro
replacement of part of an identifier, as in
call subr_&
&N&
&(1.)
where N is a keyword macro. f18 already handles this case, but not when
there is white space between the macro name and the following
continuation marker character '&'. Allow white space to appear.
Fixes https://github.com/llvm/llvm-project/issues/106931.