106 Commits

Author SHA1 Message Date
Justin Lebar
fab2bb8bfd
Add llvm::min/max_element and use it in llvm/ and mlir/ directories. (#84678)
For some reason this was missing from STLExtras.
2024-03-10 20:00:13 -07:00
Jay Foad
a6dabed348
[AMDGPU] Fix nondeterminism in SIFixSGPRCopies (#70644)
There are a couple of loops that iterate over V2SCopies. The iteration
order needs to be deterministic, otherwise we can call moveToVALU in
different orders, which causes temporary vregs to be allocated in
different orders, which can affect register allocation heuristics.
2023-10-31 11:47:42 +00:00
Stanislav Mekhanoshin
fe8335babb
[AMDGPU] Select 64-bit imm moves if can be encoded as 32 bit operand (#70395)
This allows folding of 64-bit operands if fit into 32-bit. Fixes
https://github.com/llvm/llvm-project/issues/67781
2023-10-30 08:12:28 -07:00
Stanislav Mekhanoshin
945e943db7
[AMDGPU] Fix subreg check in the SIFixSGPRCopies (#70007)
It checks for the copy of subregs, but it checks destination which may
never happen in SSA. It misses the subreg check and happily produces
S_MOV_B64 out of a subreg COPY.

The affected test should have never been formed in the first place
because the pass is running in SSA and copies into a subreg shall never
happen.
2023-10-24 01:44:58 -07:00
Diana Picus
2e1718adc8 Reland "AMDGPU: Duplicate instead of COPY constants from VGPR to SGPR (#66882)"
Teach the si-fix-sgpr-copies pass to deal with REG_SEQUENCE, PHI or
INSERT_SUBREG where the result is an SGPR, but some of the inputs are
constants materialized into VGPRs. This may happen in cases where for
instance several instructions use an immediate zero and SelectionDAG
chooses to put it in a VGPR to satisfy all of them. This however causes
the si-fix-sgpr-copies to try to switch the whole chain to VGPR and may
lead to illegal VGPR-to-SGPR copies. Rematerializing the constant into
an SGPR fixes the issue.

This was originally reverted because it triggered an unrelated bug in
PEI on one of the OpenMP buildbots. That bug has been fixed in #68299,
so it should be ok to try again.
2023-10-06 10:03:50 +02:00
Diana Picus
327fdcf789 Revert "AMDGPU: Duplicate instead of COPY constants from VGPR to SGPR (#66882)"
This reverts commit a04603993b43e5ebac1531293d288315f1885886 because
it broke the OpenMP buildbot.
2023-09-25 13:40:38 +02:00
Diana
a04603993b
AMDGPU: Duplicate instead of COPY constants from VGPR to SGPR (#66882)
Teach the si-fix-sgpr-copies pass to deal with REG_SEQUENCE, PHI or
INSERT_SUBREG where the result is an SGPR, but some of the inputs are
constants materialized into VGPRs. This may happen in cases where for
instance several instructions use an immediate zero and SelectionDAG
chooses to put it in a VGPR to satisfy all of them. This however causes
the si-fix-sgpr-copies to try to switch the whole chain to VGPR and may
lead to illegal VGPR-to-SGPR copies. Rematerializing the constant into
an SGPR fixes the issue.
2023-09-25 13:20:08 +02:00
David Stuttard
ce031fc17a
[AMDGPU] Fix non-deterministic iteration order in SIFixSGPRCopies (#66617)
Use of DenseSet was causing some non-deteminism in SIFixSGPRSopies. Changing
to SetVector fixes the problem.
2023-09-18 10:08:53 +01:00
Arthur Eubanks
0a1aa6cda2
[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295)
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.

This matches other nearby enums.

For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
2023-09-14 14:10:14 -07:00
skc7
b434051dc8 [AMDGPU] Introduce SIInstrWorklist to process instructions in moveToVALU
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D147168
2023-04-10 11:34:14 +05:30
Jay Foad
a07584d57d [CodeGen] Make more use of MachineOperand::getOperandNo. NFC.
Differential Revision: https://reviews.llvm.org/D143252
2023-02-07 11:50:57 +00:00
Carl Ritson
5bc703f755 [AMDGPU] Replace getPhysRegClass with getPhysRegBaseClass
Accelerate finding the base class for a physical register by
building a statically mapping table from physical registers
to base classes using TableGen.

Replace uses of SIRegisterInfo::getPhysRegClass with
TargetRegisterInfo::getPhysRegBaseClass in order to use
the computed table.

Reviewed By: arsenm, foad

Differential Revision: https://reviews.llvm.org/D139422
2022-12-20 16:22:14 +09:00
Alexander Timofeev
2877b87666 [AMDGPU] Lower VGPR to physical SGPR COPY to S_MOV_B32 if VGPR contains the compile time constant
Sometimes we have a constant value loaded to VGPR. In case we further
need to rematrerialize it in the physical scalar register we may avoid VGPR to
SGPR copy replacing it with S_MOV_B32.

Reviewed By: JonChesterfield, arsenm

Differential Revision: https://reviews.llvm.org/D139874
2022-12-16 00:38:10 +01:00
Jay Foad
6443c0ee02 [AMDGPU] Stop using make_pair and make_tuple. NFC.
C++17 allows us to call constructors pair and tuple instead of helper
functions make_pair and make_tuple.

Differential Revision: https://reviews.llvm.org/D139828
2022-12-14 13:22:26 +00:00
Alexander Timofeev
2e8817b90a [AMDGPU] SIFixSGPRCopies reworking to use one pass over the MIR for analysis and lowering.
This change finalizes the series of patches aiming to replace the old strategy of VGPR to SGPR copy lowering.

  # Following the https://reviews.llvm.org/D128252 and https://reviews.llvm.org/D130367 code parts that are no longer used were removed.
  # The first pass over the MachineFunctoin collects all the necessary information.
  # Lowering is done in 3 phases:
     - VGPR to SGPR copies analysis  lowering
     - REG_SEQUENCE, PHIs, and SGPR to VGPR copies lowering
     - SCC copies lowering is done in a separate pass over the Machine Function

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D131246
2022-09-19 23:31:45 +02:00
Matt Arsenault
7834194837 TableGen: Introduce generated getSubRegisterClass function
Currently there isn't a generic way to get a smaller register class
that can be produced from a subregister of a larger class. Replaces a
manually implemented version for AMDGPU. This will be used to improve
subregister support in the allocator.
2022-09-12 09:03:37 -04:00
Kazu Hirata
fedc59734a [llvm] Use range-based for loops (NFC) 2022-09-03 11:17:40 -07:00
Kazu Hirata
9861a68a7c [Target] Qualify auto in range-based for loops (NFC) 2022-08-28 10:41:50 -07:00
Evgenii Stepanov
8ea1cf3111 Revert "[AMDGPU] SIFixSGPRCopies refactoring"
Breaks ASan tests.

This reverts commit 3f8ae7efa866e581a16e9ccc8e29744722f13fff.
2022-08-10 11:32:46 -07:00
alex-t
3f8ae7efa8 [AMDGPU] SIFixSGPRCopies refactoring
This change finalizes the series of patches aiming to replace old
strategy of VGPR to SGPR copies loweriong.  Following the
https://reviews.llvm.org/D128252 and https://reviews.llvm.org/D130367 code
parts that are no longer used were removed.  Pass main loop is no longer used
for the MIR changes but collect information for further analysis.  Actual MIR
lowering happens further according the analysys result in the set of separate
functions. Another important change concerns the order of lowering: VGPR to
SGPR copies lowering is done first to have priority on the rest of the MIR
changes.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D131246
2022-08-10 00:51:57 +02:00
Alexander Timofeev
a321d95b59 [AMDGPU] avoid blind converting to VALU REG_SEQUENCE and PHIs
In the 2e29b0138ca243 we introduce a specific solving algorithm
that analyzes the VGPR to SGPR copies use chains and either lowers
the copy to v_readfirstlane_b32 or converts the whole chain to VALU forms.
Same time we still have the code that blindly converts to VALU REG_SEQUENCE and PHIs
in case they produce SGPR but have VGPRs input operands. In case the REG_SEQUENCE and PHIs
are in the VGPR to SGPR copy use chain, and this chain was considered long enough to convert
copy to v_readfistlane_b32, further lowering them to VALU leads to several kinds of issues.
At first, we have v_readfistlane_b32 which is completely useless because most parts of its use chain
were moved to VALU forms. Second, we may encounter subtle bugs related to the EXEC-dependent CF
because of the weird mixing of SALU and VALU instructions.
This change removes the code that moves REG_SEQUENCE and PHIs to VALU. Instead, we use the fact
that both REG_SEQUENCE and PHIs have copy semantics. That is, if they define SGPR but have VGPR inputs,
we insert VGPR to SGPR copies to make them pure SGPR. Then, the new copies are processed by the common
VGPR to SGPR lowering algorithm.
This is Part 2 in the series of commits aiming at the massive refactoring of the SIFixSGPRCopies pass.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D130367
2022-08-02 18:37:57 +02:00
Alexander Timofeev
d7ae1a9097 Revert "[AMDGPU] avoid blind converting to VALU REG_SEQUENCE and PHIs"
This reverts commit 76d9ae924cc361578ecbb5688559f7cebc78ab87.
because it causes several VK CTS tests to fail
2022-07-29 14:19:07 +02:00
Alexander Timofeev
76d9ae924c [AMDGPU] avoid blind converting to VALU REG_SEQUENCE and PHIs
In the 2e29b0138ca243 we introduce a specific solving algorithm
that analyzes the VGPR to SGPR copies use chains and either lowers
the copy to v_readfirstlane_b32 or converts the whole chain to VALU forms.
Same time we still have the code that blindly converts to VALU REG_SEQUENCE and PHIs
in case they produce SGPR but have VGPRs input operands. In case the REG_SEQUENCE and PHIs
are in the VGPR to SGPR copy use chain, and this chain was considered long enough to convert
copy to v_readfistlane_b32, further lowering them to VALU leads to several kinds of issues.
At first, we have v_readfistlane_b32 which is completely useless because most parts of its use chain
were moved to VALU forms. Second, we may encounter subtle bugs related to the EXEC-dependent CF
because of the weird mixing of SALU and VALU instructions.
This change removes the code that moves REG_SEQUENCE and PHIs to VALU. Instead, we use the fact
that both REG_SEQUENCE and PHIs have copy semantics. That is, if they define SGPR but have VGPR inputs,
we insert VGPR to SGPR copies to make them pure SGPR. Then, the new copies are processed by the common
VGPR to SGPR lowering algorithm.
This is Part 2 in the series of commits aiming at the massive refactoring of the SIFixSGPRCopies pass.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D130367
2022-07-28 14:30:29 +02:00
Alexander Timofeev
2e29b0138c [AMDGPU] Lowering VGPR to SGPR copies to v_readfirstlane_b32 if profitable.
Since the divergence-driven instruction selection has been enabled for AMDGPU,
 all the uniform instructions are expected to be selected to SALU form, except those not having one.
 VGPR to SGPR copies appear in MIR to connect values producers and consumers. This change implements an algorithm
 that evolves a reasonable tradeoff between the profit achieved from keeping the uniform instructions in SALU form
 and overhead introduced by the data transfer between the VGPRs and SGPRs.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D128252
2022-07-14 23:59:02 +02:00
Kazu Hirata
387927bbaf [Target] Use range-based for loops (NFC) 2021-11-26 21:21:17 -08:00
Christudasan Devadasan
654c89d85a [AMDGPU] Make vector superclasses allocatable
The combined vector register classes with both
VGPRs and AGPRs are currently unallocatable.
This patch turns them into allocatable as a
prerequisite to enable copy between VGPR and
AGPR registers during regalloc.

Also, added the missing AV register classes from
192b to 1024b.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D109300
2021-11-26 00:42:12 -05:00
alex-t
1a33294652 [AMDGPU] Filtering out the inactive lanes bits when lowering copy to SCC
Normally, given that the DA results are kept consistent over the selection DAG, uniform comparisons get selected to S_CMP_* but divergent to V_CMP_*.  Sometimes, for the sake of efficiency,  SSA subgraphs may be converted to VALU to avoid repeatedly copying data back and forth. Hence we have to be able to sustain the correctness passing the i1 from VALU to SALU context and vice versa.

VALU operations only process the active lanes of the VGPR and ignore inactive ones.
Active lanes correspond to 1 bit in the EXEC mask register.
SALU represents i1 as just one bit but VALU as 64bits: 0/1 and 0/(0xffffffffffffffff & EXEC) respectively.
SALU uses one-bit conditional flag SCC but VALU - VCC that is a pair of 32-bit SGPRs

To expose SCC to the VALU context we need to convert the one-bit boolean value to the appropriate 64bit.
To return back to the SALU context we need to do the opposite.

To correctly convert 64bit VALU boolean to either 0 or 1 we need to filter out the bits corresponding to the inactive lanes.

Reviewed By: piotr

Differential Revision: https://reviews.llvm.org/D109900
2021-09-21 21:19:31 +03:00
alex-t
ed0f4415f0 [AMDGPU] Divergence-driven compare operations instruction selection
Description: This change enables the compare operations to be selected to SALU/VALU form
             dependent of the SDNode divergence flag.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D106079
2021-08-25 18:30:49 +03:00
Piotr Sobczak
4672bac177 [AMDGPU] Introduce Strict WQM mode
* Add amdgcn_strict_wqm intrinsic.
* Add a corresponding STRICT_WQM machine instruction.
* The semantic is similar to amdgcn_strict_wwm with a notable difference that not all threads will be forcibly enabled during the computations of the intrinsic's argument, but only all threads in quads that have at least one thread active.
* The difference between amdgc_wqm and amdgcn_strict_wqm, is that in the strict mode an inactive lane will always be enabled irrespective of control flow decisions.

Reviewed By: critson

Differential Revision: https://reviews.llvm.org/D96258
2021-03-03 14:19:16 +01:00
Piotr Sobczak
c3ce7bae80 [AMDGPU] Rename amdgcn_wwm to amdgcn_strict_wwm
* Introduce the new intrinsic amdgcn_strict_wwm
 * Deprecate the old intrinsic amdgcn_wwm

The change is done for consistency as the "strict"
prefix will become an important, distinguishing factor
between amdgcn_wqm and amdgcn_strictwqm in the future.

The "strict" prefix indicates that inactive lanes do not
take part in control flow, specifically an inactive lane
enabled by a strict mode will always be enabled irrespective
of control flow decisions.

The amdgcn_wwm will be removed, but doing so in two steps
gives users time to switch to the new name at their own pace.

Reviewed By: critson

Differential Revision: https://reviews.llvm.org/D96257
2021-03-03 09:33:57 +01:00
dfukalov
560d7e0411 [NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets
... to reduce headers dependency.

Reviewed By: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D95036
2021-01-20 22:22:45 +03:00
Joe Nash
314e29ed2b [AMDGPU] Add _e64 suffix to VOP3 Insts
Previously, instructions which could be
expressed as VOP3 in addition to another
encoding had a _e64 suffix on the tablegen
record name, while those
only available as VOP3 did not. With this
patch, all VOP3s will have the _e64 suffix.
The assembly does not change, only  the mir.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D94341

Change-Id: Ia8ec8890d47f8f94bbbdac43745b4e9dd2b03423
2021-01-12 18:33:18 -05:00
dfukalov
6a87e9b08b [NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D93813
2021-01-07 22:22:05 +03:00
Kazu Hirata
0e219b6443 [Target] Construct SmallVector with iterator ranges (NFC) 2021-01-03 09:57:45 -08:00
Matt Arsenault
29ed846d67 AMDGPU: Fix assert when checking for implicit operand legality 2020-12-22 20:56:24 -05:00
Sebastian Neubauer
31a0b2834f [AMDGPU] Fix iterating in SIFixSGPRCopies
The insertion of waterfall loops splits the current basic block into
three blocks. So the basic block that we iterate over must be updated.

This failed assert(!NodePtr->isKnownSentinel()) in ilist_iterator for
divergent calls in branches before.

Differential Revision: https://reviews.llvm.org/D90596
2020-11-04 18:43:19 +01:00
Piotr Sobczak
8d7fd73c3a [AMDGPU] Fix merging m0 inits
Fix incorrect merges of m0 inits in loops.

It was assumed that if a clobbering instruction appears in
the same block as an init and the clobbering instruction
does not dominate the init then it does not interfere with
init.

This does not work in the presence of loops, where in this
scenario, the clobbering instruction does interfere with
the init in another iteration.

To fix this, do not check for block equality and defer the
decision to the predecessor check.

Differential Revision: https://reviews.llvm.org/D87882
2020-09-23 09:13:43 +02:00
Jay Foad
45eeb8c2a9 [AMDGPU] Remove unused variable introduced in r251860 2020-08-27 13:28:32 +01:00
Jay Foad
98de0d22f5 [AMDGPU] Apply llvm-prefer-register-over-unsigned from clang-tidy 2020-08-21 10:14:35 +01:00
Jay Foad
3497860203 [AMDGPU] Remove uses of Register::isPhysicalRegister/isVirtualRegister
... in favour of the isPhysical/isVirtual methods.
2020-08-20 17:59:11 +01:00
alex-t
b726d071b4 [AMDGPU] Reject moving PHI to VALU if the only VGPR input originated from move immediate
Summary:
PHIs result register class is set to VGPR or SGPR depending on the cross block value divergence.
         In some cases uniform PHI need to be converted to return VGPR to prevent the oddnumber of moves values from VGPR to SGPR and back.
         PHI should certainly return VGPR if it has at least one VGPR input. This change adds the exception.
         We don't want to convert uniform PHI to VGPRs in case the only VGPR input is a VGPR to SGPR COPY and definition od the
         source VGPR in this COPY is move immediate.

  bb.0:

     %0:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
     %2:sreg_32 = .....

  bb.1:
     %3:sreg_32 = PHI %1, %bb.3, %2, %bb.1
     S_BRANCH %bb.3

  bb.3:
     %1:sreg_32 = COPY %0
     S_BRANCH %bb.2

Reviewers: rampitec

Reviewed By: rampitec

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80434
2020-05-28 19:25:51 +03:00
Stanislav Mekhanoshin
0462795095 [AMDGPU] Propagate AGPR RC from PHI to its PHI operands
We can fix register class of PHI based on its all AGPR uses.
That leaves behind all PHIs which were already processed
earlier. Propagate RC back to PHI operands of a PHI.

Differential Revision: https://reviews.llvm.org/D77344
2020-04-03 11:23:02 -07:00
Matt Arsenault
d9e8b2cbcc AMDGPU/GlobalISel: Skip DAG hack passes on selected functions
The way fallback to SelectionDAG works is somewhat surprising to
me. When the fallback path is enabled, the entire set of SelectionDAG
selector passes is added to the pass pipeline, and each one needs to
check if the function was selected. This results in the surprising
behavior of running SIFixSGPRCopies for example, but only if
-global-isel-abort=2 is used.

SIAddIMGInitPass is also added in addInstSelector, but I'm not sure
why we have this pass or if it should be added somewhere else for
GlobalISel.
2020-02-17 08:33:17 -08:00
Reid Kleckner
05da2fe521 Sink all InitializePasses.h includes
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.

I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
  recompiles    touches affected_files  header
  342380        95      3604    llvm/include/llvm/ADT/STLExtras.h
  314730        234     1345    llvm/include/llvm/InitializePasses.h
  307036        118     2602    llvm/include/llvm/ADT/APInt.h
  213049        59      3611    llvm/include/llvm/Support/MathExtras.h
  170422        47      3626    llvm/include/llvm/Support/Compiler.h
  162225        45      3605    llvm/include/llvm/ADT/Optional.h
  158319        63      2513    llvm/include/llvm/ADT/Triple.h
  140322        39      3598    llvm/include/llvm/ADT/StringRef.h
  137647        59      2333    llvm/include/llvm/Support/Error.h
  131619        73      1803    llvm/include/llvm/Support/FileSystem.h

Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.

Reviewers: bkramer, asbirlea, bollu, jdoerfert

Differential Revision: https://reviews.llvm.org/D70211
2019-11-13 16:34:37 -08:00
Stanislav Mekhanoshin
0fab220eb5 [AMDGPU] move PHI nodes to AGPR class
If all uses of a PHI are in AGPR register class we should
avoid unneeded copies via VGPRs.

Differential Revision: https://reviews.llvm.org/D69200

llvm-svn: 375297
2019-10-18 22:48:45 +00:00
David Stuttard
2d6a2303f8 [AMDGPU] Fix-up cases where writelane has 2 SGPR operands
Summary:
Even though writelane doesn't have the same constraints as other valu
instructions it still can't violate the >1 SGPR operand constraint

Due to later register propagation (e.g. fixing up vgpr operands via
readfirstlane) changing writelane to only have a single SGPR is tricky.

This implementation puts a new check after SIFixSGPRCopies that prevents
multiple SGPRs being used in any writelane instructions.

The algorithm used is to check for trivial copy prop of suitable constants into
one of the SGPR operands and perform that if possible. If this isn't possible
put an explicit copy of Src1 SGPR into M0 and use that instead (this is
allowable for writelane as the constraint is for SGPR read-port and not
constant-bus access).

Reviewers: rampitec, tpr, arsenm, nhaehnle

Reviewed By: rampitec, arsenm, nhaehnle

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, mgorny, yaxunl, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D51932

Change-Id: Ic7553fa57440f208d4dbc4794fc24345d7e0e9ea
llvm-svn: 375004
2019-10-16 14:37:39 +00:00
Austin Kerbow
527e9f9a3f AMDGPU: Fix infinite searches in SIFixSGPRCopies
Summary:
Two conditions could lead to infinite loops when processing PHI nodes in
SIFixSGPRCopies.

The first condition involves a REG_SEQUENCE that uses registers defined by both
a PHI and a COPY.

The second condition arises when a physical register is copied to a virtual
register which is then used in a PHI node. If the same virtual register is
copied to the same physical register, the result is an endless loop.

%0:sgpr_64 = COPY $sgpr0_sgpr1
%2 = PHI %0, %bb.0, %1, %bb.1
$sgpr0_sgpr1 = COPY %0

Reviewers: alex-t, rampitec, arsenm

Reviewed By: rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68970

llvm-svn: 374944
2019-10-15 19:59:45 +00:00
Alexander Timofeev
c4d256a590 [AMDGPU] Come back patch for the 'Assign register class for cross block values according to the divergence.'
Detailed description:

    After https://reviews.llvm.org/D59990 submit several issues were discovered.
    Changes in common code were preserved but AMDGPU specific part was reverted to keep the backend working correctly.

    Discovered issues were addressed in the following commits:

    https://reviews.llvm.org/D67662
    https://reviews.llvm.org/D67101
    https://reviews.llvm.org/D63953
    https://reviews.llvm.org/D63731

    This change brings back AMDGPU specific changes.

  Reviewed by: rampitec, arsenm

  Differential Revision: https://reviews.llvm.org/D68635

llvm-svn: 374767
2019-10-14 12:01:10 +00:00
Matt Arsenault
e114be608f AMDGPU: Fix typos
llvm-svn: 374253
2019-10-09 22:44:47 +00:00
Austin Kerbow
cf321f48be AMDGPU: Fix bug in r371671 on some builds.
llvm-svn: 371761
2019-09-12 19:12:21 +00:00