For inline asm with memory operands, we can merge the offset into
the second operand of memory constraint operands.
Differential Revision: https://reviews.llvm.org/D158062
We can get `BlockAddress` in user code via `Labels as Values` so
we should be able to merge the access to `BlockAddress`.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D159429
For inline asm with memory operands, we can merge the offset into
the second operand of memory constraint operands.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D158062
The register being replaced might have a more restrictive register
class due to requirements of the using instruction. We should
constrain the register class to preserve any restrictions.
This was found in our downstream on a custom instruction. I don't
have a test case for upstream currently.
Differential Revision: https://reviews.llvm.org/D154920
The LUI and AUIPC share quite a few similarities. This refactors the code
to share what we can.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D140345
Primarily this allows us to fold the addi from PseudoLLA expansion
into a load.
If the linker is able to GP relax the constant pool access we'll
end up with a GP relative load.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D140341
This used to print from the ADDI where the operand number was
correct. It recently changed to print from the LUI or AUIPC which
needs to use operand 1 instead of 2.
This shows up as a crash with -debug.
It's possible we have:
  lui a0, %hi(sym)
  addi a0, a0, %lo(sym)
  addi a0, a0, <offset1>
  lw a0, <offset2>(a0)
We want to arrive at
  lui a0, %hi(sym+offset1+offset2)
  lw a0, %lo(sym+offset1+offset2)(a0)
We currently fail to do this because we only consider loads/stores
if we didn't find any arithmetic.
This patch splits arithmetic folding and load/store folding into
two separate phases. The load/store folding can no longer assume
the offset in hi/lo is 0 so we must combine the offsets. I've applied
the same simm32 limit that we applied in the arithmetic folding.
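
The offset combination described above can be sketched as follows. This is a minimal illustration, not the pass's actual code: the names isSImm32 and combineOffsets are hypothetical, and it assumes both inputs are small enough (an already-checked simm32 and a simm12 load/store immediate) that the 64-bit sum cannot overflow.

```cpp
#include <cstdint>

// True if V fits in a signed 32-bit immediate.
constexpr bool isSImm32(int64_t V) {
  return V >= INT32_MIN && V <= INT32_MAX;
}

// Hypothetical helper. HiLoOffset is the offset already folded into the
// %hi/%lo pair by the arithmetic phase; MemOffset is the immediate of the
// load/store. On success NewOffset receives the combined offset.
// On failure NewOffset is left untouched and the fold must be skipped.
inline bool combineOffsets(int64_t HiLoOffset, int64_t MemOffset,
                           int64_t &NewOffset) {
  int64_t Sum = HiLoOffset + MemOffset;
  if (!isSImm32(Sum))
    return false;
  NewOffset = Sum;
  return true;
}
```

Returning a bool and leaving the output untouched on failure keeps the caller's "did anything change" bookkeeping simple.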
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D130931
The only iterator we're holding points to HiLUI and we never
delete that so I think it is safe to delete everything else
immediately.
I want to split detectAndFoldOffset into two phases. First, combine
LUI+ADDI with any ADD/ADDI/SHXADD that comes after it. This may
open opportunities to fold the ADDI from the LUI+ADDI into a
load/store address. So the load/store folding should run as a
second phase even if the ADD/ADDI/SHXADD made changes.
In order to do this we need to eagerly delete instructions in the
first phase so that we don't have dead users of the LUI+ADDI
when we start the second phase.
Patches to split the phases will come later.
Reviewed By: asb, luismarques
Differential Revision: https://reviews.llvm.org/D130119
Builds upon D123264, adding support for merging the low part of the LLA
address into the load/store instruction offsets.
Differential Revision: https://reviews.llvm.org/D123265
The pass was previously limited to LUI+ADDI being used by a single
instruction.
This patch allows the pass to optimize multiple memory operations
that use the same offset. Each of them will receive a separate %lo
relocation. My main motivation is to handle a read-modify-write
where we have a load and store to the same address, but I didn't
restrict it to that case.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D128599
For an addition with a simm14 or simm15 immediate that has 2 or 3 trailing
zero bits, respectively, we can use a shXadd instruction and an addi to do
the addition.
This patch teaches RISCVMergeBaseOffset to see through this pattern.
I don't think the sh1add case occurs because we use two addis for that,
but I implemented it for completeness.
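
A rough sketch of the arithmetic behind this pattern, under the assumption that the constant is materialized as an addi of the shifted-down value followed by shXadd. The helper names are hypothetical and this is not the pass's actual matching code.

```cpp
#include <cstdint>

// True if V fits in the 12-bit signed immediate of ADDI.
constexpr bool isSImm12(int64_t V) { return V >= -2048 && V <= 2047; }

// Hypothetical helper: returns the smallest X in {1, 2, 3} such that Imm is
// a multiple of 2^X and Imm / 2^X fits in a simm12, i.e. Imm could be built
// with "addi tmp, x0, Imm / 2^X" feeding shXadd. Returns 0 if a single ADDI
// suffices or no such X exists.
constexpr unsigned shXaddShiftFor(int64_t Imm) {
  if (isSImm12(Imm))
    return 0;
  for (unsigned X = 1; X <= 3; ++X) {
    int64_t Scale = int64_t(1) << X;
    // Division is used instead of a right shift so negative values are
    // handled without relying on implementation-defined behavior.
    if (Imm % Scale == 0 && isSImm12(Imm / Scale))
      return X;
  }
  return 0;
}

// The offset the pass has to recover when it sees through the pattern:
// the ADDI immediate scaled back up by the shXadd shift amount.
constexpr int64_t shXaddOffset(int64_t AddiImm, unsigned X) {
  return AddiImm * (int64_t(1) << X);
}
```

Every value this sketch reports with X == 1 is also reachable with two ADDIs, which is consistent with the remark above that the sh1add case probably does not occur.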
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D127376
Adds with immediates in the range [-4096, -2049] or [2048, 4094] get
converted to two ADDIs. Teach RISCVMergeBaseOffset to recognize this
pattern as well.
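
As a sketch of why the range is bounded this way: each ADDI immediate is a simm12, so two of them can sum to at most 2047 + 2047 and at least -2048 + -2048. The helper below is hypothetical (it is one possible split, not necessarily the one instruction selection uses) and only illustrates the arithmetic the pass has to undo.

```cpp
#include <cstdint>

// True if V fits in the 12-bit signed immediate of ADDI.
constexpr bool isSImm12(int64_t V) { return V >= -2048 && V <= 2047; }

// Hypothetical helper: split Imm into two simm12 parts First and Second with
// First + Second == Imm. Returns false if Imm already fits in a single ADDI
// or cannot be covered by two ADDIs.
inline bool splitIntoTwoAddis(int64_t Imm, int64_t &First, int64_t &Second) {
  if (isSImm12(Imm))
    return false; // a single ADDI is enough
  // Saturate the first part, put the remainder in the second.
  int64_t A = Imm < 0 ? -2048 : 2047;
  int64_t B = Imm - A;
  if (!isSImm12(B))
    return false; // out of reach for two ADDIs
  First = A;
  Second = B;
  return true;
}

// When the pass sees both ADDIs after the address materialization, the
// offset to fold into the symbol is simply their sum.
constexpr int64_t foldTwoAddis(int64_t First, int64_t Second) {
  return First + Second;
}
```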
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D126843
LUI+ADDIW always produces a simm32. This allows us to always
fold it into a global offset.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D126729
The immediate for LUI is stored as a 20-bit unsigned value. We need
to sign extend it after shifting by 12 to match the instruction
behavior.
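
A minimal sketch of that computation, assuming the RV64 behavior where LUI places the 20-bit field in bits [31:12] and sign extends from bit 31. The helper name luiValue is hypothetical and this is not the pass's actual code.

```cpp
#include <cstdint>

// Hypothetical helper: turns the 20-bit unsigned LUI immediate field into
// the 64-bit value the instruction writes on RV64.
constexpr int64_t luiValue(uint32_t Imm20) {
  // Shift the 20-bit field into bits [31:12]; the low 12 bits are zero.
  uint32_t Shifted = (Imm20 & 0xFFFFFu) << 12;
  // Sign extend from bit 31 using the xor/subtract idiom, which avoids
  // implementation-defined narrowing casts.
  return static_cast<int64_t>(Shifted ^ 0x80000000u) - 0x80000000LL;
}

static_assert(luiValue(0x00001) == 0x1000, "small positive value");
static_assert(luiValue(0x7FFFF) == 0x7FFFF000LL, "largest positive value");
static_assert(luiValue(0x80000) == -0x80000000LL, "bit 31 set: negative");
static_assert(luiValue(0xFFFFF) == -4096, "all ones in the field");
```

Without the sign extension, a field such as 0xFFFFF would be treated as a large positive offset instead of -4096.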
If we find an LUI+ADDI on RV64, it means the constant isn't a
simm32. If it was, we would have emitted LUI+ADDIW from constant
materialization. Make sure the constant is a simm32 before folding.
This appears to match gcc.
A future patch will add support for LUI+ADDIW on RV64.
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.
This allows us to ensure that Support doesn't have includes from MC/*.
Differential Revision: https://reviews.llvm.org/D111454
Return false from runOnFunction if nothing changed. Curiously
we already returned a bool from detectAndFoldOffset, but didn't
use it.
Fix a couple breaks after returns that I saw while auditing
detectAndFoldOffset.
Differential Revision: https://reviews.llvm.org/D107303
Only in public interfaces that have not yet been converted should there remain
registers with unsigned type.
Differential Revision: https://reviews.llvm.org/D66252
llvm-svn: 369114
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Summary:
In r333455 we added a peephole to fix the corner cases that result
from separating base + offset lowering of global address. The
peephole didn't handle some of the cases because it only has a basic
block view instead of a function level view.
This patch replaces that logic with a machine function pass. In
addition to handling the original cases, it handles uses of the global
address across blocks in a function and folds an offset from an LW/SW
instruction. This pass won't run for OptNone compilation, so there
will be a negative impact overall vs the old approach at O0.
Reviewers: asb, apazos, mgrang
Reviewed By: asb
Subscribers: MartinMosbeck, brucehoult, the_o, rogfer01, mgorny, rbar, johnrusso, simoncook, niosHD, kito-cheng, shiva0217, zzheng, llvm-commits, edward-jones
Differential Revision: https://reviews.llvm.org/D47857
llvm-svn: 335786