1400 Commits

Author SHA1 Message Date
Wael Yehia
645f6dcd69 [ThinLTO][AIX] Enable thinlto on AIX
Starting from AIX 7.2 TL5 SP6 and AIX 7.3 TL2 the system linker supports thinLTO.

Reviewed By: ZarkoCA, MaskRay

Differential Revision: https://reviews.llvm.org/D155700
2023-07-19 17:37:15 +00:00
Joseph Huber
d2ac0069a2 [Clang] Only emit CUDA version warnings when creating the CUDA toolchain
This warning primarily applies to users of the CUDA langues as there may
be new features we rely on. The other two users of the toolchain are
OpenMP via `-fopenmp --offload-arch=sm_70` and a cross-compiled build
via `--target=nvptx64-nvida-cuda -march=sm_70`. Both of these do not
rely directly on things that would change significantly between CUDA
versions, and the way they are built can sometims make this warning
print many times.

This patch changees the behaiour to only check for the version when
building for CUDA offloading specifically, the other two will not have
this check.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D155606
2023-07-18 13:48:11 -05:00
Siu Chi Chan
cbc4bbb85c [HIP] Ignore host linker flags for device-only
When compiling in device only mode (e.g. --offload-device-only), the
host linker phase would not happen and therefore, the driver should
ignore all the host linker flags.

Differential Revision: https://reviews.llvm.org/D154881

Change-Id: I8244acef5c33108cf15b1dbb188f974f30099718
2023-07-17 16:29:15 -04:00
Paul Kirth
610fc5cbcc [clang] Preliminary fat-lto-object support
Fat LTO objects contain both LTO compatible IR, as well as generated
object code. This allows users to defer the choice of whether to use LTO
or not to link-time. This is a feature available in GCC for some time,
and makes the existing -ffat-lto-objects flag functional in the same
way as GCC's.

This patch adds support for that flag in the driver, as well as setting the
necessary codegen options for the backend. Largely, this means we select
the newly added pass pipeline for generating fat objects.

Users are expected to pass -ffat-lto-objects to clang in addition to one
of the -flto variants. Without the -flto flag, -ffat-lto-objects has no
effect.

// Compile and link. Use the object code from the fat object w/o LTO.
clang -fno-lto -ffat-lto-objects -fuse-ld=lld foo.c

// Compile and link. Select full LTO at link time.
clang -flto -ffat-lto-objects -fuse-ld=lld foo.c

// Compile and link. Select ThinLTO at link time.
clang -flto=thin -ffat-lto-objects -fuse-ld=lld foo.c

// Compile and link. Use ThinLTO  with the UnifiedLTO pipeline.
clang -flto=thin -ffat-lto-objects -funified-lto -fuse-ld=lld foo.c

// Compile and link. Use full LTO  with the UnifiedLTO pipeline.
clang -flto -ffat-lto-objects -funified-lto -fuse-ld=lld foo.c

// Link separately, using ThinLTO.
clang -c -flto=thin -ffat-lto-objects foo.c
clang -flto=thin -fuse-ld=lld foo.o -ffat-lto-objects  # pass --lto=thin --fat-lto-objects to ld.lld

// Link separately, using full LTO.
clang -c -flto -ffat-lto-objects foo.c
clang -flto -fuse-ld=lld foo.o  # pass --lto=full --fat-lto-objects to ld.lld

Original RFC: https://discourse.llvm.org/t/rfc-ffat-lto-objects-support/63977

Depends on D146776

Reviewed By: tejohnson, MaskRay

Differential Revision: https://reviews.llvm.org/D146777
2023-07-17 16:26:21 +00:00
Tobias Hieta
af744f0b84 [LLD][COFF] Add LLVM toolchain library paths by default.
We want lld-link to automatically find compiler-rt's and
libc++ when it's in the same directory as the rest of the
toolchain. This is because on Windows linking isn't done
via the clang driver - but instead invoked directly.

This prepends: <llvm>/lib <llvm>/lib/clang/XX/lib and
<llvm>/lib/clang/XX/lib/windows automatically to the library
search paths.

Related to #63827

Differential Revision: https://reviews.llvm.org/D151188
2023-07-14 14:37:24 +02:00
Akira Hatanaka
509d051606 [Driver] Warn about -mios-version-min instead of erroring out when
targeting MachO embedded architectures

Sometimes users pass this option when targeting embedded architectures
like armv7m on non-darwin platforms.

Emit a warning instead of erroring out, which restores the behavior
prior to 34d7acd444b88342fc93fca202608c1e16fa5946.
2023-07-13 06:44:54 -07:00
Jeffrey Byrnes
be8a65b598 [HIP]: Add -fhip-emit-relocatable to override link job creation for -fno-gpu-rdc
Differential Revision: https://reviews.llvm.org/D153667

Change-Id: Idcc5c7c25dc350b8dc9a1865fd67982904d06ecd
2023-06-29 08:18:28 -07:00
Haohai Wen
82dff24bde Reland [COFF] Support -gsplit-dwarf for COFF on Windows
This relands 3eee5aa528abd67bb6d057e25ce1980d0d38c445 with fixes.
2023-06-26 15:48:38 +08:00
Nico Weber
b851308b87 Revert "[COFF] Support -gsplit-dwarf for COFF on Windows"
This reverts commit 3eee5aa528abd67bb6d057e25ce1980d0d38c445.

Breaks tests on mac, see https://reviews.llvm.org/D152785#4447118
2023-06-25 14:32:36 -04:00
Haohai Wen
3eee5aa528 [COFF] Support -gsplit-dwarf for COFF on Windows
D152340 has split WinCOFFObjectWriter to WinCOFFWriter. This patch adds
another WinCOFFWriter as DwoWriter to write Dwo sections to dwo file.
Driver options are also updated accordingly to support -gsplit-dwarf in
CL mode.

e.g. $ clang-cl  -c -gdwarf -gsplit-dwarf foo.c

Like what -gsplit-dwarf did in ELF, using this option will create DWARF object
(.dwo) file. DWARF debug info is split between COFF object and DWARF object
file. It can reduce the executable file size especially for large project.

Reviewed By: skan, MaskRay

Differential Revision: https://reviews.llvm.org/D152785
2023-06-25 11:54:39 +08:00
Michael Platings
041ffc155f [Clang][Driver] Warn on invalid Arm or AArch64 baremetal target triple
A common user mistake is specifying a target of aarch64-none-eabi or
arm-none-elf whereas the correct names are aarch64-none-elf &
arm-none-eabi. Currently if a target of aarch64-none-eabi is specified
then the Generic_ELF toolchain is used, unlike aarch64-none-elf which
will use the BareMetal toolchain. This is unlikely to be intended by the
user so issue a warning that the target is invalid.

The target parser is liberal in what input it accepts so invalid triples
may yield behaviour that's sufficiently close to what the user intended.
Therefore invalid triples were used in many tests. This change updates
those tests to use valid triples.
One test (gnu-mcount.c) relies on the Generic_ELF toolchain behaviour so
change it to explicitly specify aarch64-unknown-none-gnu as the target.

Reviewed By: peter.smith, DavidSpickett

Differential Revision: https://reviews.llvm.org/D153430
2023-06-23 11:54:29 +01:00
Fangrui Song
a79995ca60 [Driver] Allow warning for unclaimed TargetSpecific options
For unclaimed target-agnostic options, we can apply clang_ignored_gcc_optimization_f_Group
to accept but warn about them.
```
% clang -c -fexpensive-optimizations a.c
clang: warning: optimization flag '-fexpensive-optimizations' is not supported [-Wignored-optimization-argument]
```

For an unclaimed target-specific option, one target may want to accept but warn
about it. Add `llvm::opt::Arg::IgnoredTargetSpecific` to support this warning
need.

Close https://github.com/llvm/llvm-project/issues/63282

Reviewed By: mstorsjo

Differential Revision: https://reviews.llvm.org/D152856
2023-06-16 08:32:25 -07:00
Joseph Huber
e96bec9cd8 [OpenMP] Correctly diagnose conflicting target identifierers for AMDGPU
There are static checks on the target identifiers allowed in a single
TU. Previously theses checks were only applied to HIP even though they
should be the same for OpenMP targeting AMDGPU. Simply enable these
checks for OpenMP.

Reviewed By: JonChesterfield, yaxunl

Differential Revision: https://reviews.llvm.org/D152965
2023-06-15 07:06:44 -05:00
Michael Platings
edc1130c0a [Driver] Enable selecting multiple multilibs
This will enable layering multilibs on top of each other.
For example a multilib containing only a no-exceptions libc++ could be
layered on top of a multilib containing C libs. This avoids the need
to duplicate the C library for every libc++ variant.

This change doesn't expose the functionality externally, it only opens
the functionality up to be potentially used by ToolChain classes.

Differential Revision: https://reviews.llvm.org/D143059
2023-06-14 06:46:41 +01:00
Michael Platings
a794ab92b4 [Driver] Add -print-multi-flags-experimental option
This option causes the flags used for selecting multilibs to be printed.
This is an experimental feature that is documented in detail in D143587.

Differential Revision: https://reviews.llvm.org/D142933
2023-06-14 06:46:41 +01:00
Fangrui Song
cd18efb61d [Driver] Make -G TargetSpecific
so that we report `unsupported option '-G' for target ...` on
unsupported targets (most targets).
This error is tested by one target in aix-err-options.c.

Follow-up to D89897 and D90063.
2023-06-08 09:02:12 -07:00
Fangrui Song
d81ce04587 [Driver] Report error for unsupported -mlarge-endian/-mlittle-endian 2023-05-30 12:45:21 -07:00
Fangrui Song
fbea5aada1 [Driver] Add ClangFlags::TargetSpecific to simplify err_drv_unsupported_opt_for_target processing
clang/lib/Driver/ToolChains/Clang.cpp has a lot of fragments like the following:
```
if (const Arg *A = Args.getLastArg(...)) {
  if (Triple is xxx)
    A->render(Args, CmdArgs);
  else
    D.Diag(diag::err_drv_unsupported_opt_for_target) << ...;
}
```

The problem is more apparent with a recent surge of AIX-specific options.

Introduce the TargetSpecific flag so that we can move the target-specific
options to ToolChains/*.cpp and ToolChains/Arch/*.cpp and overload the
warn_drv_unused_argument mechanism to give an err_drv_unsupported_opt_for_target
error.

Migrate -march=/-mcpu= and some AIX-specific options to use this simplified pattern.

Reviewed By: jansvoboda11

Differential Revision: https://reviews.llvm.org/D151590
2023-05-30 11:21:17 -07:00
Kazu Hirata
ed1539c6ad Migrate {starts,ends}with_insensitive to {starts,ends}_with_insensitive (NFC)
This patch migrates uses of StringRef::{starts,ends}with_insensitive
to StringRef::{starts,ends}_with_insensitive so that we can use names
similar to those used in std::string_view.

Note that the llvm/ directory has migrated in commit
6c3ea866e93003e16fc55d3b5cedd3bc371d1fde.

I'll post a separate patch to deprecate
StringRef::{starts,ends}with_insensitive.

Differential Revision: https://reviews.llvm.org/D150506
2023-05-16 10:12:42 -07:00
Erich Keane
b763d6a4ed Add C++26 compile flags.
Now that we've updated to C++23, we need to add C++26/C++2c command line
flags, as discussed in
https://discourse.llvm.org/t/rfc-lets-just-call-it-c-26-and-forget-about-the-c-2c-business-at-least-internally/70383

Differential Revision: https://reviews.llvm.org/D150450
2023-05-15 08:56:16 -07:00
Fangrui Song
49b87b0572 [Driver] -ftime-trace: derive trace file names from -o and -dumpdir
Inspired by D133662.
Close https://github.com/llvm/llvm-project/issues/57285

When -ftime-trace is specified and the driver performs both compilation and
linking phases, the trace files are currently placed in the temporary directory
(/tmp by default on *NIX). A more sensible behavior would be to derive the trace
file names from the -o option, similar to how GCC derives auxiliary and dump
file names. Use -dumpdir (D149193) to implement the -gsplit-dwarf like behavior.

The following script demonstrates the time trace filenames.

```
#!/bin/sh -e
PATH=/tmp/Rel/bin:$PATH                # adapt according to your build directory
mkdir -p d e f
echo 'int main() {}' > d/a.c
echo > d/b.c

a() { rm $1 || exit 1; }

clang -ftime-trace d/a.c d/b.c         # previously /tmp/[ab]-*.json
a a-a.json; a a-b.json
clang -ftime-trace d/a.c d/b.c -o e/x  # previously /tmp/[ab]-*.json
a e/x-a.json; a e/x-b.json
clang -ftime-trace d/a.c d/b.c -o e/x -dumpdir f/
a f/a.json; a f/b.json
clang -ftime-trace=f d/a.c d/b.c -o e/x
a f/a-*.json; a f/b-*.json

clang -c -ftime-trace d/a.c d/b.c
a a.json b.json
clang -c -ftime-trace=f d/a.c d/b.c
a f/a.json f/b.json

clang -c -ftime-trace d/a.c -o e/xa.o
a e/xa.json
clang -c -ftime-trace d/a.c -o e/xa.o -dumpdir f/g
a f/ga.json
```

The driver checks `-ftime-trace` and `-ftime-trace=`, infers the trace file
name, and passes `-ftime-trace=` to cc1. The `-ftime-trace` cc1 option is
removed.

With offloading, previously `-ftime-trace` is passed to all offloading
actions, causing the same trace file to be overwritten by host and
offloading actions. This patch doesn't attempt to support offloading (D133662),
but makes a sensible change (`OffloadingPrefix.empty()`) to ensure we don't
overwrite the trace file.

Minor behavior differences: the trace file is now a result file, which
will be removed upon an error. -ftime-trace-granularity=0, like
-ftime-trace, can now cause a -Wunused-command-line-argument warning.

Reviewed By: Maetveis

Differential Revision: https://reviews.llvm.org/D150282
2023-05-12 10:46:06 -07:00
Fangrui Song
dbedcfdc20 [Driver] Add -dumpdir and change -gsplit-dwarf .dwo names for linking
When the final phase is linking, Clang currently places `.dwo` files in the
current directory (like the `-c` behavior for multiple inputs).
Strangely, -fdebug-compilation-dir=/-ffile-compilation-dir= is considered, which
is untested.

GCC has a more useful behavior that derives auxiliary filenames from the final
output (-o).
```
gcc -c -g -gsplit-dwarf d/a.c d/b.c      # a.dwo b.dwo
gcc -g -gsplit-dwarf d/a.c d/b.c -o e/x  # e/x-a.dwo e/x-b.dwo
gcc -g -gsplit-dwarf d/a.c d/b.c         # a-a.dwo a-b.dwo
```

Port a useful subset of GCC behaviors that are easy to describe to Clang.

* Add a driver and cc1 option -dumpdir
* When the final phase is link, add a default -dumpdir if not specified by the user
* Forward -dumpdir to -cc1 command lines
* tools::SplitDebugName prefers -dumpdir when constructing the .dwo filename

GCC provides -dumpbase. If we use just one of -dumpdir and -dumpbase,
-dumpbase isn't very useful as it appends a dash.

```
gcc -g -gsplit-dwarf -dumpdir e d/a.c   # ea.dwo
gcc -g -gsplit-dwarf -dumpdir e/ d/a.c  # e/a.dwo
gcc -g -gsplit-dwarf -dumpbase e d/a.c  # e-a.dwo
gcc -g -gsplit-dwarf -dumpbase e/ d/a.c # e/-a.dwo
```

If we specify both `-dumpdir` and `-dumpbase`, we can avoid the influence of the
source filename when there is one input file.
```
gcc -g -gsplit-dwarf -dumpdir f/ -dumpbase x d/a.c        # f/x.dwo
gcc -g -gsplit-dwarf -dumpdir f/ -dumpbase x d/a.c d/b.c  # f/x-a.dwo f/x-b.dwo
```

Given the above examples, I think -dumpbase is not useful.

GCC -save-temps has interesting interaction with -dumpdir as -save-temps
generated files are considered auxiliary files like .dwo files.

For Clang, with this patch, -save-temps and -dumpdir are orthogonal, which is
easier to explain.

```
gcc -g -gsplit-dwarf d/a.c -o e/x -dumpdir f/ -save-temps=obj # e/a.{i,s,o,dwo}
gcc -g -gsplit-dwarf d/a.c -o e/x -save-temps=obj -dumpdir f/ # f/a.{i,s,o,dwo}

clang -g -gsplit-dwarf d/a.c -o e/x -save-temps=obj -dumpdir f/ # e/a.{i,s,o} f/a.dwo
```

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D149193
2023-05-09 14:43:46 -07:00
Mark de Wever
ba15d186e5 [clang] Use -std=c++23 instead of -std=c++2b
During the ISO C++ Committee meeting plenary session the C++23 Standard
has been voted as technical complete.

This updates the reference to c++2b to c++23 and updates the __cplusplus
macro.

Drive-by fixes c++1z -> c++17 and c++2a -> c++20 when seen.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D149553
2023-05-04 19:19:52 +02:00
Kito Cheng
da4fcb0c0b [RISCV][Driver] Allow the use of CPUs with a different XLEN than the triple.
Our downstream toolchain release got some issue is we set the default
triple by creating symbolic link of clang like `riscv64-unknown-elf-clang`,
and has lots of multi-lib config including rv32's config.

However when we trying to set arch by a 32 bit CPU like generic-rv32
but got error message below:
error: unsupported argument 'generic-rv32' to option '-mcpu='

`generic-rv32` is listed in the output of `-mcpu=help`, that
might be confusing for user since help message say supported.

So let clang driver also consider -mcpu option during computing
the target triple to archvie that.

Reviewed By: asb, craig.topper

Differential Revision: https://reviews.llvm.org/D148124
2023-04-27 14:46:01 +08:00
Fangrui Song
7f59dba564 [Driver] Remove no-op -frewrite-map-file=
This option has been a no-op since D99707.
2023-04-24 23:18:59 -07:00
Jianjian GUAN
8e3a5a965a [Driver][NFC] Simplify code.
Reviewed By: benshi001, jhuber6

Differential Revision: https://reviews.llvm.org/D148908
2023-04-23 10:56:27 +08:00
Pavel Kosov
28997feb0c [LLVM][OHOS] Clang toolchain and targets
Add a clang part of OpenHarmony target

Related LLVM part: D138202

~~~

Huawei RRI, OS Lab

Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D145227
2023-03-20 12:53:24 +03:00
Volodymyr Sapsai
a845aeb5d6 [Driver] Allow to collect -save-stats data to a file specified in the environment variable.
Using two environment variables `CC_PRINT_INTERNAL_STAT` and
`CC_PRINT_INTERNAL_STAT_FILE` to work like `CC_PRINT_PROC_STAT`.

The purpose of the change is to allow collecting the internal stats
without modifying the build scripts. Write all stats to a single file
to simplify aggregating the data.

Differential Revision: https://reviews.llvm.org/D144981
2023-03-16 11:57:59 -07:00
Kazu Hirata
ea9d404032 [clang] Use *{Set,Map}::contains (NFC) 2023-03-14 19:17:18 -07:00
Daniel Thornburgh
d505d20a62 Revert "[LLVM][OHOS] Clang toolchain and targets"
This change had tests that break whenever LLVM_ENABLE_LINKER_BUILD_ID is
set, as is the case in the Fuchsia target.

This reverts commits:
f81317a54586dbcef0c14cf512a0770e8ecaab3d
72474afa27570a0a1307f3260f0187b703aa6d84
2023-03-14 13:46:21 -07:00
Pavel Kosov
72474afa27 [LLVM][OHOS] Clang toolchain and targets
Add a clang part of OpenHarmony target

Related LLVM part: D138202

~~~

Huawei RRI, OS Lab

Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D145227
2023-03-14 12:24:44 +03:00
Ben Langmuir
fcab930cd3 [clang][deps] Handle response files in dep scanner
Extract the code the driver uses to expand response files and reuse it
in the dependency scanner.

rdar://106155880

Differential Revision: https://reviews.llvm.org/D145838
2023-03-13 15:47:35 -07:00
David Tenty
9a733e8a2c [clang][driver] accept maix32/maix64 gcc compat options
GCC on AIX primarily uses the -maix32 and -maix64 to select the bitmode
to target. In order to be compatible with existing build configurations,
clang should accept these options as well. In this patch we implement
these options for AIX targets.

Differential Revision: https://reviews.llvm.org/D145610
2023-03-13 17:05:52 -04:00
Yaxun (Sam) Liu
1f8a3ce325 [HIP] Fix temporary files
Currently HIP toolchain uses Driver::GetTemporaryDirectory to
create a temporary directory for some temporary files during
compilation. The temporary directories are not automatically
deleted after compilation. This slows down compilation
on Windows.

Switch to use GetTemporaryPath which only creates temporay
files which will be deleted automatically.

Keep the original input file name convention for Darwin host
toolchain since it is needed for deterministic binary
(https://reviews.llvm.org/D111269)

Fixes: SWDEV-386058

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D145509
2023-03-09 21:41:58 -05:00
Alex Brachet
3e57aa304f [llvm-driver] Reinvoke clang as described by llvm driver extra args
Differential Revision: https://reviews.llvm.org/D137800
2023-02-10 19:42:32 +00:00
Archibald Elliott
d768bf994f [NFC][TargetParser] Replace uses of llvm/Support/Host.h
The forwarding header is left in place because of its use in
`polly/lib/External/isl/interface/extract_interface.cc`, but I have
added a GCC warning about the fact it is deprecated, because it is used
in `isl` from where it is included by Polly.
2023-02-10 09:59:46 +00:00
Andrew Ng
0b704d9db7 [Support] Emulate SIGPIPE handling in raw_fd_ostream write for Windows
Prevent errors and crash dumps for broken pipes on Windows.

Fixes: https://github.com/llvm/llvm-project/issues/48672

Differential Revision: https://reviews.llvm.org/D142224
2023-02-09 10:39:09 +00:00
Mariya Podchishchaeva
fe082124fa [clang][driver] Emit an error for /clang:-x
`/clang:-x` emits an error instead of a warning. And if the error is suppressed,
`/clang:-x` takes no effect.
Considering that `/clang:` is a recent addition in 2018-11 and there are MSVC
style alternatives, therefore `/clang:-x` doesn't seem useful and we just reject
it since properly supporting it would add lots of complexity.

Fixes #59307

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D142757
2023-02-02 11:48:33 -05:00
Xiang Li
d5a7439e22 [HLSL] [Dirver] add dxv as a VerifyDebug Job
New option --dxv-path is  added for dxc mode to set the installation path for dxv.
If cannot find dxv, a warning will be report.

dxv will be executed with command line dxv file_name -o file_name.
It will validate and sign the file and overwrite it.

Differential Revision: https://reviews.llvm.org/D141705
2023-02-01 20:07:25 -05:00
Joseph Huber
d50dacd7c3 [Clang] Only emit textual LLVM-IR in device only mode
Currently, we embed device code into the host to perform
multi-architecture linking and handling of device code. If the user
specified `-S -emit-llvm` then the embedded output will be textual
LLVM-IR. This is a problem because it can't be used by the LTO backend
and it makes reading the file confusing.

This patch changes the behaviour to only emit textual device IR if we
are in device only mode, that is, if the device code is presented
directly to the user instead of being embedded. Otherwise we should
always embed device bitcode instead.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D141717
2023-01-24 15:11:30 -06:00
Yaxun (Sam) Liu
c487b84d75 [HIP] Change default offload arch to gfx906
Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D142246
2023-01-22 20:26:30 -05:00
serge-sans-paille
6ad1b40951
Optimize OptTable::findNearest implementation and usage
When used to find an exact match, some extra context can be used to
totally cut some computations.

This saves 1% of the instruction count when pre processing sqlite3.c
through

valgrind --tool=callgrind ./bin/clang -E sqlite3.c -o/dev/null

Differential Revision: https://reviews.llvm.org/D142026
2023-01-19 14:16:29 +01:00
Joseph Huber
0660397e68 [CUDA] Allow targeting NVPTX directly without a host toolchain
Currently, the NVPTX compilation toolchain can only be invoked either
through CUDA or OpenMP via `--offload-device-only`. This is because we
cannot build a CUDA toolchain without an accompanying host toolchain for
the offloading. When using `--target=nvptx64-nvidia-cuda` this results
in generating calls to the GNU assembler and linker, leading to errors.

This patch abstracts the portions of the CUDA toolchain that are
independent of the host toolchain or offloading kind into a new base
class called `NVPTXToolChain`. We still need to read the host's triple
to build the CUDA installation, so if not present we just assume it will
match the host's system for now, or the user can provide the path
explicitly.

This should allow the compiler driver to create NVPTX device images
directly from C/C++ code.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D140158
2023-01-18 18:18:25 -06:00
Chuanqi Xu
3e9e8d6ef4 [Driver] [C++20] [Modules] Support -fmodule-output= (2/2)
The patch implements `-fmodule-output=`. This is helpful if the build
systems want to generate these output files in other places which is not
the same with -o specified or the input file lived.

Reviewed By: dblaikie, iains

Differential Revision: https://reviews.llvm.org/D137059
2023-01-16 14:01:05 +08:00
Chuanqi Xu
f89327e28b [Driver] [Modules] Support -fmodule-output (1/2)
Patches to support the one-phase compilation model for modules.

The behavior:
(1) If -o and -c is specified , the module file is in the same path
within the same directory as the output the -o specified and with a new
suffix .pcm.
(2) Otherwise, the module file is in the same path within the working
directory directory with the name of the input file with a new suffix
.pcm

For example,

```
Hello.cppm Use.cpp
```

A trivial one and the contents are ignored. When we run:

```
clang++ -std=c++20 -fmodule-output Hello.cppm -c
```

The directory would look like:

```
Hello.cppm  Hello.o  Hello.pcm Use.cpp
```

And if we run:

```
clang++ -std=c++20 -fmodule-output Hello.cppm -c -o output/Hello.o
```

Then the `output` directory may look like:

```
Hello.o  Hello.pcm
```

Reviewed By: dblaikie, iains, tahonermann

Differential Revision: https://reviews.llvm.org/D137058
2023-01-16 11:05:33 +08:00
Joseph Huber
d5ac28efff [OpenMP] Fix unused capature and name
Summary:
This capture isn't used, get rid of it and change the name since it's
more generic now.
2023-01-11 11:05:24 -06:00
Joseph Huber
0d9afee3d1 [OpenMP] Adjust phases for AMDGPU offloading for OpenMP in save-temps mode
Currently, the behaviour of `-save-temps` changes the generated output
when offloading to AMDGPU. This is because we only have a single phase
and it contains the `-disable-llvm-passes` flags which results in
unoptimized bitcode. We need to make sure we generate another phase that
produces both the optimized and unoptimized bitcode. There used to be a
check that turned these phases into a no-op. But I believe it is more
correct to not generate them this way in the first place. Doing this
requires a bit of a hack, replacing an already generated phase action,
but it should be fine.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D141440
2023-01-11 10:31:47 -06:00
Joseph Huber
a17ab7aa3b [OpenMP] Add support for '--offload-arch=native' to OpenMP offloading
This patch adds support for '--offload-arch=native' to OpenMP
offloading. This will automatically generate the toolchains required to
fulfil whatever GPUs the user has installed. Getting this to work
requires a bit of a hack. The problem is that we need the ToolChain to
launch its searching program. But we do not yet have that ToolChain
built. I had to temporarily make the ToolChain and also add some logic
to ignore regular warnings & errors.

Depends on D141078

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D141105
2023-01-11 10:30:38 -06:00
Joseph Huber
fada902860 [CUDA][HIP] Support '--offload-arch=native' for the new driver
This patch applies the same handling for the `--offload-arch=native'
string to the new driver. The support for OpenMP will require some extra
logic to infer the triples from the derived architecture strings.

Depends on D141051

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D141078
2023-01-11 10:30:34 -06:00
Joseph Huber
56ebfca4bc [CUDA][HIP] Add support for --offload-arch=native to CUDA and refactor
This patch adds basic support for `--offload-arch=native` to CUDA. This
is done using the `nvptx-arch` tool that was introduced previously. Some
of the logic for handling executing these tools was factored into a
common helper as well. This patch does not add support for OpenMP or the
"new" driver. That will be done later.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D141051
2023-01-11 10:30:30 -06:00