282 Commits

Author SHA1 Message Date
Nikita Popov
8a72391f60 [IR] Require intrinsic struct return type to be anonymous
This is an alternative to D122376. Rather than working around the
problem, this patch requires that struct return types in intrinsics
are anonymous/literal and adds auto-upgrade code to convert
existing uses of intrinsics with named struct types.

This ensures that the mapping between intrinsic name and
intrinsic function type is actually bijective, as it is supposed
to be.

This also fixes https://github.com/llvm/llvm-project/issues/37891.

Differential Revision: https://reviews.llvm.org/D122471
2022-03-30 09:51:24 +02:00
Yaxun (Sam) Liu
d41445113b [CUDA][HIP] Fix hostness check with -fopenmp
CUDA/HIP determines whether a function can be called based on
the device/host attributes of callee and caller. Clang assumes the
caller is CurContext. This is correct in most cases, however, it is
not correct in OpenMP parallel region when CUDA/HIP program
is compiled with -fopenmp. This causes incorrect overloading
resolution and missed diagnostics.

To get the correct caller, clang needs to chase the parent chain
of DeclContext starting from CurContext until a function decl
or a lambda decl is reached. Sema API is adapted to achieve that
and used to determine the caller in hostness check.

Reviewed by: Artem Belevich, Richard Smith

Differential Revision: https://reviews.llvm.org/D121765
2022-03-24 15:19:47 -04:00
Daniil Kovalev
828b63c309 [NVPTX] Enhance vectorization of ld.param & st.param
Since function parameters and return values are passed via param space, we
can force special alignment for values hold in it which will add vectorization
options. This change may be done if the function has private or internal
linkage. Special alignment is forced during 2 phases.

1) Instruction selection lowering. Here we use special alignment for function
   prototypes (changing both own return value and parameters alignment), call
   lowering (changing both callee's return value and parameters alignment).

2) IR pass nvptx-lower-args. Here we change alignment of byval parameters that
   belong to param space (or are casted to it). We only handle cases when all
   uses of such parameters are loads from it. For such loads, we can change the
   alignment according to special type alignment and the load offset. Then,
   load-store-vectorizer IR pass will perform vectorization where alignment
   allows it.

Special alignment calculated as maximum from default ABI type alignment and
alignment 16. Alignment 16 is chosen because it's the maximum size of
vectorized ld.param & st.param.

Before specifying such special alignment, we should check if it is a multiple
of the alignment that the type already has. For example, if a value has an
enforced alignment of 64, default ABI alignment of 4 and special alignment
of 16, we should preserve 64.

This patch will be followed by a refactoring patch that removes duplicating
code in handling byval and non-byval arguments.

Differential Revision: https://reviews.llvm.org/D120129
2022-03-24 12:36:52 +03:00
Daniil Kovalev
a034878564 Revert "[NVPTX] Enhance vectorization of ld.param & st.param"
This reverts commit f854434f0f2a01027bdaad8e6fdac5a782fce291.

Placed URL to wrong differential revision in commit message.
2022-03-24 12:32:06 +03:00
Daniil Kovalev
f854434f0f [NVPTX] Enhance vectorization of ld.param & st.param
Since function parameters and return values are passed via param space, we
can force special alignment for values hold in it which will add vectorization
options. This change may be done if the function has private or internal
linkage. Special alignment is forced during 2 phases.

1) Instruction selection lowering. Here we use special alignment for function
   prototypes (changing both own return value and parameters alignment), call
   lowering (changing both callee's return value and parameters alignment).

2) IR pass nvptx-lower-args. Here we change alignment of byval parameters that
   belong to param space (or are casted to it). We only handle cases when all
   uses of such parameters are loads from it. For such loads, we can change the
   alignment according to special type alignment and the load offset. Then,
   load-store-vectorizer IR pass will perform vectorization where alignment
   allows it.

Special alignment calculated as maximum from default ABI type alignment and
alignment 16. Alignment 16 is chosen because it's the maximum size of
vectorized ld.param & st.param.

Before specifying such special alignment, we should check if it is a multiple
of the alignment that the type already has. For example, if a value has an
enforced alignment of 64, default ABI alignment of 4 and special alignment
of 16, we should preserve 64.

This patch will be followed by a refactoring patch that removes duplicating
code in handling byval and non-byval arguments.

Differential Revision: https://reviews.llvm.org/D121549
2022-03-24 12:25:36 +03:00
Changpeng Fang
dd5895cc39 AMDGPU: Use the implicit kernargs for code object version 5
Summary:
  Specifically, for trap handling, for targets that do not support getDoorbellID,
we load the queue_ptr from the implicit kernarg, and move queue_ptr to s[0:1].
To get aperture bases when targets do not have aperture registers, we load
private_base or shared_base directly from the implicit kernarg. In clang, we use
implicitarg_ptr + offsets to implement __builtin_amdgcn_workgroup_size_{xyz}.

Reviewers: arsenm, sameerds, yaxunl

Differential Revision: https://reviews.llvm.org/D120265
2022-03-17 14:12:36 -07:00
Stanislav Mekhanoshin
9eabea3968 [AMDGPU] Set noclobber metadata on loads instead of cast to constant
A load via pointer cast to constant will return true from
pointsToConstantMemory which is not necessarily so.

Fixes: SWDEV-326463

Differential Revision: https://reviews.llvm.org/D121172
2022-03-07 23:13:02 -08:00
Yaxun (Sam) Liu
9d899d8f01 [HIP] Support -fgpu-default-stream
Introduce -fgpu-default-stream={legacy|per-thread} option to
support per-thread default stream for HIP runtime.

When -fgpu-default-stream=per-thread, HIP kernels are
launched through hipLaunchKernel_spt instead of
hipLaunchKernel. Also HIP_API_PER_THREAD_DEFAULT_STREAM=1
is defined by the preprocessor to enable other per-thread stream
API's.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D120298
2022-02-23 22:28:29 -05:00
Stanislav Mekhanoshin
b0aa1946df [AMDGPU] Promote recursive loads from kernel argument to constant
Not clobbered pointer load chains are promoted to global now. That
is possible to promote these loads itself into constant address
space. Loaded pointers still need to point to global because we
need to be able to store into that pointer and because an actual
load from it may occur after a clobber.

Differential Revision: https://reviews.llvm.org/D119886
2022-02-17 11:07:03 -08:00
Sameer Sahasrabuddhe
d8f99bb6e0 [AMDGPU] replace hostcall module flag with function attribute
The module flag to indicate use of hostcall is insufficient to catch
all cases where hostcall might be in use by a kernel. This is now
replaced by a function attribute that gets propagated to top-level
kernel functions via their respective call-graph.

If the attribute "amdgpu-no-hostcall-ptr" is absent on a kernel, the
default behaviour is to emit kernel metadata indicating that the
kernel uses the hostcall buffer pointer passed as an implicit
argument.

The attribute may be placed explicitly by the user, or inferred by the
AMDGPU attributor by examining the call-graph. The attribute is
inferred only if the function is not being sanitized, and the
implictarg_ptr does not result in a load of any byte in the hostcall
pointer argument.

Reviewed By: jdoerfert, arsenm, kpyzhov

Differential Revision: https://reviews.llvm.org/D119216
2022-02-11 22:51:56 +05:30
Yaxun (Sam) Liu
1d97cb1f6e [HIP] Emit amdgpu_code_object_version module flag
code object version determines ABI, therefore should not be mixed.

This patch emits amdgpu_code_object_version module flag in LLVM IR
based on code object version (default 4).

The amdgpu_code_object_version value is code object version times 100.

LLVM IR with different amdgpu_code_object_version module flag cannot
be linked.

The -cc1 option -mcode-object-version=none is for ROCm device library use
only, which supports multiple ABI.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D119026
2022-02-08 21:58:40 -05:00
Yaxun (Sam) Liu
8428c75da1 [CUDA][HIP] Do not treat host var address as constant in device compilation
Currently clang treats host var address as constant in device compilation,
which causes const vars initialized with host var address promoted to
device variables incorrectly and results in undefined symbols.

This patch fixes that.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D118153

Fixes: SWDEV-309881

Change-Id: I0a69357063c6f8539ef259c96c250d04615f4473
2022-01-28 16:04:52 -05:00
Florian Hahn
67aa314bce
[IRGen] Do not overwrite existing attributes in CGCall.
When adding new attributes, existing attributes are dropped. While
this appears to be a longstanding issue, this was highlighted by D105169
which dropped a lot of attributes due to adding the new noundef
attribute.

Ahmed Bougacha (@ab) tracked down the issue and provided the fix in
CGCall.cpp. I bundled it up and updated the tests.
2022-01-20 13:45:19 +00:00
Yaxun (Sam) Liu
85c2bd2a0e Prevent adding module flag amdgpu_hostcall multiple times
HIP program with printf call fails to compile with -fsanitize=address
option, because of appending module flag - amdgpu_hostcall twice, one
for printf and one for sanitize option. This patch fixes that issue.

Patch by: Praveen Velliengiri

Reviewed by: Yaxun Liu, Roman Lebedev

Differential Revision: https://reviews.llvm.org/D116216
2022-01-19 12:52:33 -05:00
hyeongyu kim
1b1c8d83d3 [Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions.
I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.

Test updates are made as a separate patch: D108453

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D105169
2022-01-16 18:54:17 +09:00
Matt Arsenault
33315ef321 clang/AMDGPU: Don't set implicit arg attribute to default size
Since 2959e082e1427647e107af0b82770682eaa58fe1, we conservatively
assume all inputs are enabled by default. This isn't the best
interface for controlling these anyway, since it's not granular and
only allows trimming the last fields.
2022-01-14 18:43:30 -05:00
Yaxun (Sam) Liu
3b172f60c6 [HIP] Fix -fgpu-rdc for Windows
This patch fixes issues for -fgpu-rdc for Windows MSVC
toolchain:

Fix COFF specific section flags and remove section types
in llvm-mc input file for Windows.

Escape fatbin path in llvm-mc input file.

Add -triple option to llvm-mc.

Put __hip_gpubin_handle in comdat when it has linkonce_odr
linkage.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D115039
2021-12-06 16:42:23 -05:00
Anshil Gandhi
df0560ca00 [HIP] Add atomic load, atomic store and atomic cmpxchng_weak builtin support in HIP-clang
Introduce `__hip_atomic_load`, `__hip_atomic_store` and `__hip_atomic_compare_exchange_weak`
builtins in HIP.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D114553
2021-11-29 12:07:13 -07:00
Yaxun (Sam) Liu
38211bbab1 [HIP] Fix device stub name for Windows
This is a follow up of https://reviews.llvm.org/D68578
where device stub name is changed for Itanium
mangling but not Microsoft mangling.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D113491
2021-11-23 12:03:49 -05:00
Yaxun (Sam) Liu
e13246a2ec [HIP] Add HIP scope atomic operations
Add an AtomicScopeModel for HIP and support for OpenCL builtins
that are missing in HIP.

Patch by: Michael Liao

Revised by: Anshil Ghandi

Reviewed by: Yaxun Liu

Differential Revision: https://reviews.llvm.org/D113925
2021-11-23 10:13:37 -05:00
Yaxun (Sam) Liu
4b3881e9f3 Emit hidden hostcall argument for sanitized kernels
this patch - https://reviews.llvm.org/D110337 changes the way how hostcall
hidden argument is emitted for printf, but the sanitized kernels also use
hostcall buffer to report a error for invalid memory access, which is not
handled by the above patch and it leads to vdi runtime error:

Device::callbackQueue aborting with error : HSA_STATUS_ERROR_MEMORY_FAULT:
Agent attempted to access an inaccessible address. code: 0x2b

Patch by: Praveen Velliengiri

Reviewed by: Yaxun Liu, Matt Arsenault

Differential Revision: https://reviews.llvm.org/D112820
2021-11-10 17:05:57 -05:00
Yaxun (Sam) Liu
80072fde61 [CUDA][HIP] Allow comdat for kernels
Two identical instantiations of a template function can be emitted by two TU's
with linkonce_odr linkage without causing duplicate symbols in linker. MSVC
also requires these symbols be in comdat sections. Linux does not require
the symbols in comdat sections to be merged by linker but by default
clang puts them in comdat sections.

If a template kernel is instantiated identically in two TU's. MSVC requires
that them to be in comdat sections, otherwise MSVC linker will diagnose them as
duplicate symbols. However, currently clang does not put instantiated template
kernels in comdat sections, which causes link error for MSVC.

This patch allows putting instantiated template kernels into comdat sections.

Reviewed by: Artem Belevich, Reid Kleckner

Differential Revision: https://reviews.llvm.org/D112492
2021-11-10 16:42:23 -05:00
hsmahesha
3b9a85d10a [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas
at the start of the entry block, which in turn would aid better code transformation/optimization.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D110257
2021-11-10 08:45:21 +05:30
hyeongyu kim
fd9b099906 Revert "[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default"
This reverts commit aacfbb953eb705af2ecfeb95a6262818fa85dd92.

Revert "Fix lit test failures in CodeGenCoroutines"

This reverts commit 63fff0f5bffe20fa2c84a45a41161afa0043cb34.
2021-11-09 02:15:55 +09:00
hyeongyukim
aacfbb953e [Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions.
I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.

Test updates are made as a separate patch: D108453

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D105169

[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default (2)

This patch updates test files after D105169.
Autogenerated test codes are changed by `utils/update_cc_test_checks.py,` and non-autogenerated test codes are changed as follows:

(1) I wrote a python script that (partially) updates the tests using regex: {F18594904} The script is not perfect, but I believe it gives hints about which patterns are updated to have `noundef` attached.

(2) The remaining tests are updated manually.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D108453

Resolve lit failures in clang after 8ca4b3e's land

Fix lit test failures in clang-ppc* and clang-x64-windows-msvc

Fix missing failures in clang-ppc64be* and retry fixing clang-x64-windows-msvc

Fix internal_clone(aarch64) inline assembly
2021-11-06 19:19:22 +09:00
Juneyoung Lee
89ad2822af Revert "[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default"
This reverts commit 7584ef766a7219b6ee5a400637206d26e0fa98ac.
2021-11-06 15:39:19 +09:00
Juneyoung Lee
7584ef766a [Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions.
I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.

Test updates are made as a separate patch: D108453

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D105169
2021-11-06 15:36:42 +09:00
Anshil Gandhi
0567f03331 [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols
By default clang emits complete contructors as alias of base constructors if they are the same.
The backend is supposed to emit symbols for the alias, otherwise it causes undefined symbols.
@yaxunl observed that this issue is related to the llvm options `-amdgpu-early-inline-all=true`
and `-amdgpu-function-calls=false`. This issue is resolved by only inlining global values
with internal linkage. The `getCalleeFunction()` in AMDGPUResourceUsageAnalysis also had
to be extended to support aliases to functions. inline-calls.ll was corrected appropriately.

Reviewed By: yaxunl, #amdgpu

Differential Revision: https://reviews.llvm.org/D109707
2021-10-18 16:53:15 -06:00
Juneyoung Lee
f193bcc701 Revert D105169 due to the two-stage failure in ASAN
This reverts the following commits:
37ca7a795b277c20c02a218bf44052278c03344b
9aa6c72b92b6c89cc6d23b693257df9af7de2d15
705387c5074bcca36d626882462ebbc2bcc3bed4
8ca4b3ef19fe82d7ad6a6e1515317dcc01b41515
80dba72a669b5416e97a42fd2c2a7bc5a6d3f44a
2021-10-18 23:52:46 +09:00
Juneyoung Lee
8ca4b3ef19 [Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default (2)
This patch updates test files after D105169.
Autogenerated test codes are changed by `utils/update_cc_test_checks.py,` and non-autogenerated test codes are changed as follows:

(1) I wrote a python script that (partially) updates the tests using regex: {F18594904} The script is not perfect, but I believe it gives hints about which patterns are updated to have `noundef` attached.

(2) The remaining tests are updated manually.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D108453
2021-10-16 12:01:41 +09:00
Anshil Gandhi
1830ec94ac Revert "[HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols"
This reverts commit 03375a3fb33b11e1249d9c88070b7f33cb97802a.
2021-10-15 16:16:18 -06:00
Anshil Gandhi
f92db6d3ff [HIP] Relax conditions for address space cast in builtin args
Allow (implicit) address space casting between LLVM-equivalent
target address spaces.

Reviewed By: yaxunl, tra

Differential Revision: https://reviews.llvm.org/D111734
2021-10-15 15:35:52 -06:00
Anshil Gandhi
53fc5100e0 Revert "[HIP] Relax conditions for address space cast in builtin args"
This reverts commit 3b48e1170dc623a95ff13a1e34c839cc094bf321.
2021-10-15 14:42:28 -06:00
Anshil Gandhi
3b48e1170d [HIP] Relax conditions for address space cast in builtin args
Allow (implicit) address space casting between LLVM-equivalent
target address spaces.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D111734
2021-10-15 14:06:47 -06:00
Anshil Gandhi
03375a3fb3 [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols
By default clang emits complete contructors as alias of base constructors if they are the same.
The backend is supposed to emit symbols for the alias, otherwise it causes undefined symbols.
@yaxunl observed that this issue is related to the llvm options `-amdgpu-early-inline-all=true`
and `-amdgpu-function-calls=false`. This issue is resolved by only inlining global values
with internal linkage. The `getCalleeFunction()` in AMDGPUResourceUsageAnalysis also had
to be extended to support aliases to functions. inline-calls.ll was corrected appropriately.

Reviewed By: yaxunl, #amdgpu

Differential Revision: https://reviews.llvm.org/D109707
2021-10-15 11:39:15 -06:00
hsmahesha
393581d8a5 [CFE][Codegen] Update auto-generated check lines for few GPU lit tests
which is essentially required as a pre-commit for https://reviews.llvm.org/D110257.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D110676
2021-10-07 09:05:39 +05:30
Yaxun (Sam) Liu
c4afb5f81b [HIP] Fix linking of asanrt.bc
HIP currently uses -mlink-builtin-bitcode to link all bitcode libraries, which
changes the linkage of functions to be internal once they are linked in. This
works for common bitcode libraries since these functions are not intended
to be exposed for external callers.

However, the functions in the sanitizer bitcode library is intended to be
called by instructions generated by the sanitizer pass. If their linkage is
changed to internal, their parameters may be altered by optimizations before
the sanitizer pass, which renders them unusable by the sanitizer pass.

To fix this issue, HIP toolchain links the sanitizer bitcode library with
-mlink-bitcode-file, which does not change the linkage.

A struct BitCodeLibraryInfo is introduced in ToolChain as a generic
approach to pass the bitcode library information between ToolChain and Tool.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D110304
2021-09-27 13:25:46 -04:00
Jonas Hahnfeld
ea08c4cd1c [CUDA] Fix static device variables with -fgpu-rdc
NVPTX does not allow dots in the identifier, so ptxas errors out with
   fatal   : Parsing error near '.static': syntax error
because it parses .static as a directive. Avoid this problem by using
two underscores, similar to what OpenMP does for outlined functions.

Differential Revision: https://reviews.llvm.org/D108456
2021-08-25 09:31:22 +02:00
Anshil Gandhi
7063ac1afa [HIP] Allow target addr space in target builtins
This patch allows target specific addr space in target builtins for HIP. It inserts implicit addr
space cast for non-generic pointer to generic pointer in general, and inserts implicit addr
space cast for generic to non-generic for target builtin arguments only.

It is NFC for non-HIP languages.

Differential Revision: https://reviews.llvm.org/D102405
2021-08-19 23:51:58 -06:00
Anshil Gandhi
f5d5f17d3a Revert "[HIP] Allow target addr space in target builtins"
This reverts commit a35008955fa606487f79a050f5cc80fc7ee84dda.
2021-08-18 21:38:42 -06:00
Anshil Gandhi
f22ba51873 [Remarks] Emit optimization remarks for atomics generating CAS loop
Implements ORE in AtomicExpand pass to report atomics generating a
compare and swap loop.

Differential Revision: https://reviews.llvm.org/D106891
2021-08-16 14:56:01 -06:00
Dávid Bolvanský
49de6070a2 Revert "[Remarks] Emit optimization remarks for atomics generating CAS loop"
This reverts commit 435785214f73ff0c92e97f2ade6356e3ba3bf661. Still same compile time issues for -O0 -g, eg. +1.3% for sqlite3.
2021-08-15 11:44:13 +02:00
Anshil Gandhi
435785214f [Remarks] Emit optimization remarks for atomics generating CAS loop
Implements ORE in AtomicExpand pass to report atomics generating
a compare and swap loop.

Differential Revision: https://reviews.llvm.org/D106891
2021-08-14 23:37:23 -06:00
Anshil Gandhi
29e11a1aa3 Revert "[Remarks] Emit optimization remarks for atomics generating CAS loop"
This reverts commit c4e5425aa579d21530ef1766d7144b38a347f247.
2021-08-13 23:58:04 -06:00
Anshil Gandhi
c4e5425aa5 [Remarks] Emit optimization remarks for atomics generating CAS loop
Implements ORE in AtomicExpandPass to report atomics generating a compare
and swap loop.

Differential Revision: https://reviews.llvm.org/D106891
2021-08-13 22:44:08 -06:00
Anshil Gandhi
a35008955f [HIP] Allow target addr space in target builtins
This patch allows target specific addr space in target builtins for HIP. It inserts implicit addr
space cast for non-generic pointer to generic pointer in general, and inserts implicit addr
space cast for generic to non-generic for target builtin arguments only.

It is NFC for non-HIP languages.

Differential Revision: https://reviews.llvm.org/D102405
2021-08-09 16:38:04 -06:00
Michael Liao
6ec36d18ec [cuda] Mark builtin texture/surface reference variable as 'externally_initialized'.
- They need to be preserved even if there's no reference within the
  device code as the host code may need to initialize them based on the
  application logic.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D107718
2021-08-09 13:27:40 -04:00
Yaxun (Sam) Liu
44dbbe6106 [HIP] Preserve ASAN bitcode library functions
Address sanitizer passes may generate call of ASAN bitcode library
functions after bitcode linking in lld, therefore lld cannot add
those symbols since it does not know they will be used later.

To solve this issue, clang emits a reference to a bicode library
function which calls all ASAN functions which need to be
preserved. This basically force all ASAN functions to be
linked in.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D106315
2021-07-23 10:35:52 -04:00
Michael Liao
4e5d9c8803 [Internalize] Preserve variables externally initialized.
- ``externally_initialized`` variables would be initialized or modified
  elsewhere. Particularly, CUDA or HIP may have host code to initialize
  or modify ``externally_initialized`` device variables, which may not
  be explicitly referenced on the device side but may still be used
  through the host side interfaces. Not preserving them triggers the
  elimination of them in the GlobalDCE and breaks the user code.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D105135
2021-07-08 10:48:19 -04:00
Sameer Sahasrabuddhe
280593bd3f [Clang] [NFC] fix CHECK lines for convergent attribute tests 2021-06-29 00:21:07 +05:30