For non-GlobalValue references, the small and medium code models can use
32 bit constants.
For GlobalValue references, use TargetMachine::isLargeGlobalObject().
Look through aliases for determining if a GlobalValue is small or large.
Even the large code model can reference small objects with 32 bit
constants as long as we're in no-pic mode, or if the reference is offset
from the GOT.
Original commit broke the build...
First reland broke large PIC builds referencing small data since it was using GOTOFF as a 32-bit constant.
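The rules above can be read as a small decision function. The following is only a sketch of the prose, not the code from the patch: `CodeModel`, `Reference` and `fitsIn32Bits` are hypothetical names, and treating large objects and large-code-model non-GlobalValue references as never fitting is an assumption. The GOT-relative branch is the part the relands adjusted, so it is the least certain.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; none of these are LLVM types.
enum class CodeModel { Small, Medium, Large };

struct Reference {
  bool IsGlobalValue; // false for references that are not GlobalValues
  bool IsLargeObject; // the answer TargetMachine::isLargeGlobalObject()
                      // would give, with aliases already looked through
  bool IsGOTRelative; // the reference is expressed as an offset from the GOT
};

// Returns true when the reference may be encoded as a 32-bit constant,
// following the rules listed in the commit message.
bool fitsIn32Bits(CodeModel CM, bool IsPIC, const Reference &R) {
  if (!R.IsGlobalValue)
    return CM == CodeModel::Small || CM == CodeModel::Medium;
  if (R.IsLargeObject)
    return false; // assumption: large objects always need 64-bit references
  if (CM != CodeModel::Large)
    return true;
  // Large code model, small object: fine without PIC or relative to the GOT.
  return !IsPIC || R.IsGOTRelative;
}
```

For instance, a small object referenced from large-code-model PIC code only qualifies when the reference is GOT-relative.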
For non-GlobalValue references, the small and medium code models can use
32 bit constants.
For GlobalValue references, use TargetMachine::isLargeGlobalObject().
Look through aliases for determining if a GlobalValue is small or large.
Even the large code model can reference small objects with 32 bit
constants as long as we're in no-pic mode, or if the reference is offset
from the GOT.
Original commit broke the build...
For non-GlobalValue references, the small and medium code models can use
32 bit constants.
For GlobalValue references, use TargetMachine::isLargeGlobalObject().
Look through aliases for determining if a GlobalValue is small or large.
Even the large code model can reference small objects with 32 bit
constants as long as we're in no-pic mode, or if the reference is offset
from the GOT.
This reverts commit 323451ab88866c42c87971cbc670771bd0d48692.
Code with these section names in the wild doesn't compile because
support for large globals in the small code model is not complete yet.
Using the GlobalVariable code_model property added in #72077.
code_model = "small" means the global should be treated as small
regardless of the TargetMachine code model.
code_model = "large" means the global should be treated as large
regardless of the TargetMachine code model.
Inferring small/large based on a known section name still takes
precedence for correctness.
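The precedence described above could be sketched roughly as follows. `Hint`, `GlobalInfo` and `treatAsLarge` are hypothetical names for illustration, not LLVM's GlobalVariable API, and the TargetMachine-based fallback is reduced to a single boolean supplied by the caller.

```cpp
#include <cassert>

// Hypothetical tri-state: what a given source says about a global's size.
enum class Hint { Unknown, Small, Large };

struct GlobalInfo {
  Hint FromSectionName;   // inferred from a known section name, if any
  Hint FromCodeModelAttr; // the code_model property, if present
};

// Order of precedence: known section name, then code_model, then whatever
// the TargetMachine code model would decide on its own.
bool treatAsLarge(const GlobalInfo &G, bool LargeByTargetMachine) {
  if (G.FromSectionName != Hint::Unknown)
    return G.FromSectionName == Hint::Large;
  if (G.FromCodeModelAttr != Hint::Unknown)
    return G.FromCodeModelAttr == Hint::Large;
  return LargeByTargetMachine;
}
```

With this ordering, a global with code_model = "large" placed in a section known to be small is still treated as small.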
The intention is to use this for globals that are accessed very
infrequently but also take up a lot of space in the binary to mitigate
relocation overflows. Prime examples are globals that go in
"__llvm_prf_names" for coverage/PGO instrumented builds and
"asan_globals" for ASan builds.
Globals marked with the .lbss/.ldata/.lrodata should automatically be
treated as large.
Do this regardless of the code model for consistency when mixing object
files compiled with different code models.
Basically the other half of #70748.
Example in the wild:
https://codebrowser.dev/qt5/qtbase/src/testlib/qtestcase.cpp.html#1664
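A minimal sketch of recognizing the large section names by name alone. The exact matching rule (the name equals the prefix or continues with a '.') is an assumption for illustration, and both helper names are hypothetical.

```cpp
#include <cassert>
#include <string>

// True when Name is exactly Prefix or Prefix followed by a '.' suffix,
// e.g. ".ldata" and ".ldata.foo" but not ".ldatabase".
bool hasSectionPrefix(const std::string &Name, const std::string &Prefix) {
  if (Name.compare(0, Prefix.size(), Prefix) != 0)
    return false;
  return Name.size() == Prefix.size() || Name[Prefix.size()] == '.';
}

// Globals explicitly placed in these sections are treated as large,
// independent of the code model in effect.
bool isLargeSectionName(const std::string &Name) {
  return hasSectionPrefix(Name, ".lbss") || hasSectionPrefix(Name, ".ldata") ||
         hasSectionPrefix(Name, ".lrodata");
}
```

Because the check ignores the code model, objects built with different code models agree on how such a global is accessed.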
So that when mixing small and large text, large text stays out of the
way of the rest of the binary.
This is useful for mixing precompiled small code model object files and
built-from-source large code model binaries so that the text
sections don't get merged.
The reland fixes an issue where a function in the large code model would reference small data without GOTOFF.
This was incorrectly reverted in 76f78ecc789d58baa3a88b2fe2a57428f07e5362.
This reverts commit 4bf8a688956a759b7b6b8d94f42d25c13c7af130.
This commit seems to be breaking the semantics of the
ObjectFile::isSectionText method, which breaks numba/llvmlite bindings.
So that when mixing small and large text, large text stays out of the
way of the rest of the binary.
This is useful for mixing precompiled small code model object files and
built-from-source large code model binaries so that the text
sections don't get merged.
The reland fixes an issue where a function in the large code model would reference small data without GOTOFF.
So that when mixing small and large text, large text stays out of the
way of the rest of the binary.
This is useful for mixing precompiled small code model object files and
built-from-source large code model binaries so that the text
sections don't get merged.
Commit f3ea73133f91c1c23596d45680c8f2269c1dd289 allows SHF_X86_64_LARGE
for all global variables with an explicit section. For the following
variables, their data sections will be annotated as SHF_X86_64_LARGE.
```
const char rodata[512] __attribute__((section(".rodata"))) = "a";
const char *const relro __attribute__((section(".data.rel.ro"))) = "a";
char data[512] __attribute__((section(".data"))) = "a";
```
The typical linker requirement is that we do not create more than one
output section with the same name, and the only output section should
have the bitwise OR value of all input section flags. Therefore, the
output .data section will have the SHF_X86_64_LARGE flag and be
moved away from the regular sections. This is undesired but benign.
However, .data.rel.ro having the SHF_X86_64_LARGE flag is problematic
because dynamic loaders do not support more than one PT_GNU_RELRO
program header, and LLD produces the error
`error: section: .jcr is not contiguous with other relro sections`.
I believe the most appropriate solution is to disallow SHF_X86_64_LARGE
on variables with an explicit section of certain prefixes
(.bss/.data/.rodata) and allow others (e.g. metadata sections for various
instrumentation). Fortunately, global variables with an explicit
.bss/.data/.rodata section are rare, so they should not cause excessive
relocation overflow pressure.
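To illustrate why a single large input section is enough to move a whole output section, here is a much simplified sketch of the flag merging described above. `InputSection` and `mergeOutputFlags` are hypothetical; a real linker does far more. The constants carry the standard ELF values of SHF_WRITE, SHF_ALLOC and SHF_X86_64_LARGE.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

constexpr std::uint64_t ShfWrite = 0x1;             // SHF_WRITE
constexpr std::uint64_t ShfAlloc = 0x2;             // SHF_ALLOC
constexpr std::uint64_t ShfX8664Large = 0x10000000; // SHF_X86_64_LARGE

struct InputSection {
  std::string Name;
  std::uint64_t Flags;
};

// One output section per name; its flags are the bitwise OR of the flags
// of every input section with that name.
std::map<std::string, std::uint64_t>
mergeOutputFlags(const std::vector<InputSection> &Inputs) {
  std::map<std::string, std::uint64_t> Out;
  for (const InputSection &S : Inputs)
    Out[S.Name] |= S.Flags;
  return Out;
}
```

Merging a regular .data input with a .data input carrying the large flag yields an output .data that carries the large flag as well.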
7d81813d says that this was used because functions missing certain
attributes (e.g. fast math) would inherit behavior from previous
functions with those attributes. However, later c378e52c explicitly set
those attributes if they were missing and removed the use of
DefaultOptions.
Currently clang's medium code model treats all data as large, putting it in a large data section and using more expensive instruction sequences to access it.
Follow gcc's -mlarge-data-threshold, which allows putting data under a certain size in a normal data section as opposed to a large data section. This allows using cheaper code sequences to access some portion of the data in the binary (which will be implemented in LLVM in a future patch).
Under the medium code model, only put data above the large data threshold into large data sections, not all data.
Reviewed By: MaskRay, rnk
Differential Revision: https://reviews.llvm.org/D149288
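As a rough illustration of the threshold under the medium code model, the sketch below splits a list of object sizes into normal and large data. The names are hypothetical, and treating an object exactly at the threshold as normal data is an assumption; the message only says that data above the threshold becomes large.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Medium code model only: data above the threshold is considered large.
bool isLargeUnderMedium(std::uint64_t SizeInBytes, std::uint64_t Threshold) {
  return SizeInBytes > Threshold;
}

struct Placement {
  std::uint64_t NormalBytes = 0; // bytes reachable with cheaper sequences
  std::uint64_t LargeBytes = 0;  // bytes placed in large data sections
};

// Sums how many bytes land in each kind of section for a given threshold.
Placement placeData(const std::vector<std::uint64_t> &Sizes,
                    std::uint64_t Threshold) {
  Placement P;
  for (std::uint64_t Size : Sizes) {
    if (isLargeUnderMedium(Size, Threshold))
      P.LargeBytes += Size;
    else
      P.NormalBytes += Size;
  }
  return P;
}
```

With a threshold of 0 every non-empty object is large, which corresponds to the previous behavior of treating all data as large.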
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.
This matches other nearby enums.
For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
Because the code layout is not known during compilation, the distance of
cross-section jumps is not knowable at compile-time. Because of this, we
should assume that any cross-sectional jumps are out of range. This
assumption is necessary for machine function splitting on AArch64, which
introduces cross-section branches in the middle of functions. The linker
relaxes out-of-range unconditional branches, but it clobbers X16 to do
so; it doesn't relax conditional branches, which must be manually
relaxed by the compiler.
Differential Revision: https://reviews.llvm.org/D145211
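The reasoning above can be summarized in a small classifier. This is a sketch of the argument only, not the AArch64 backend code: `Branch`, `BranchFix` and both functions are hypothetical, and the caller supplies the reach of the encoding (for example 1 MiB for B.cond and 128 MiB for B).

```cpp
#include <cassert>
#include <cstdint>

enum class BranchFix {
  None,                     // known to be in range, nothing to do
  LinkerRelaxesClobbersX16, // linker can relax it, but X16 is clobbered
  CompilerMustRelax         // the linker will not help
};

struct Branch {
  bool IsConditional;
  bool CrossesSection;
  std::int64_t Distance; // byte distance; only meaningful within a section
  std::int64_t MaxRange; // reach of the encoding in bytes (positive)
};

// Code layout is unknown at compile time, so a cross-section branch is
// never assumed to be in range.
bool isKnownInRange(const Branch &B) {
  if (B.CrossesSection)
    return false;
  return B.Distance >= -B.MaxRange && B.Distance < B.MaxRange;
}

BranchFix classify(const Branch &B) {
  if (isKnownInRange(B))
    return BranchFix::None;
  return B.IsConditional ? BranchFix::CompilerMustRelax
                         : BranchFix::LinkerRelaxesClobbersX16;
}
```

A conditional branch into a split-off part of the function therefore always ends up in the CompilerMustRelax case, whatever its eventual distance turns out to be.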
And also set the SHF_X86_64_LARGE section flag.
gcc only uses the "l" prefix and SHF_X86_64_LARGE in the medium code model for data larger than -mlarge-data-threshold. But it seems more consistent to use it in the large code model as well in case separate parts of the binary aren't compiled with the large code model and also have a .data/.bss/.rodata section.
Reviewed By: MaskRay, tkoeppe
Differential Revision: https://reviews.llvm.org/D148836
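A minimal sketch of the naming scheme, assuming the "l" goes right after the leading dot (.data becomes .ldata, matching the .lbss/.ldata/.lrodata names used elsewhere in this log). `Section` and `makeLarge` are hypothetical helpers, not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

constexpr std::uint64_t ShfX8664Large = 0x10000000; // SHF_X86_64_LARGE

struct Section {
  std::string Name;
  std::uint64_t Flags;
};

// Returns the large counterpart of a regular data section: the name gets
// the "l" prefix after the dot and the large flag is added.
Section makeLarge(const Section &S) {
  assert(!S.Name.empty() && S.Name[0] == '.');
  return Section{".l" + S.Name.substr(1), S.Flags | ShfX8664Large};
}
```

Keeping the renamed sections distinct means a .data from an object built with another code model is not merged with the large one.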
Currently clangDriver passes -femulated-tls and -fno-emulated-tls to cc1.
cc1 forwards the option to LLVMCodeGen and ExplicitEmulatedTLS is used
to decide the value. Simplify this by moving the Clang decision to
clangDriver and moving the LLVM decision to InitTargetOptionsFromCodeGenFlags.
Currently, the code-model specified in IR can't be captured by [llc].
This patch fixes that.
Reviewed By: shchenz, MaskRay
Differential Revision: https://reviews.llvm.org/D128623
Most notably, Pass.h is no longer included by TargetMachine.h
before: 1063570306
after: 1063332844
Differential Revision: https://reviews.llvm.org/D121168
There are a few relevant forward declarations in there that may require downstream
users to add explicit includes:
llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h
llvm/MC/MCObjectStreamer.h no longer includes llvm/MC/MCAssembler.h
llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h
Counting preprocessed lines required to rebuild llvm-project on my setup:
before: 1052436830
after: 1049293745
Which is significant and backs up the change in addition to the usual benefits of
decreasing coupling between headers and compilation units.
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119244
This patch introduces the conversions from math function calls
to MASS library calls. To resolve the calls generated by these conversions, one
needs to link the libxlopt.a library. This patch is tested on PowerPC Linux and AIX.
Differential: https://reviews.llvm.org/D101759
Reviewer: bmahjour
- This patch provides the initial implementation for lowering a call on z/OS according to the XPLINK64 calling convention
- A series of changes have been made to SystemZCallingConv.td to account for these additional XPLINK64 changes including adding a new helper function to shadow the stack along with allocation of a register wherever appropriate
- For the cases of copying a f64 to a gr64 and a f128 / 128-bit vector type to a gr64, a `CCBitConvertToType` has been added and has been bitcasted appropriately in the lowering phase
- Support for the ADA register (R5) will be provided in a later patch.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D111662
Intended to be NFC. ARM/AArch64 don't appear to need adjustment.
TargetMachine::shouldAssumeDSOLocal is expected to be very simple, ideally
matching isDSOLocal(). The IR producers are expected to set dso_local correctly.
(While some may think this function can make producers' work easier, the
function is really not in a good position to set dso_local. See the various
special cases we duplicate from clang CodeGenModule.cpp.)
Reviewed By: mstorsjo
Differential Revision: https://reviews.llvm.org/D108514
Such attributes can either be unset, or set to "true" or "false" (as string).
Throughout the codebase, this led to inelegant checks ranging from
if (Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true")
to
if (Fn->hasAttribute("no-jump-tables") && Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true")
Introduce a getValueAsBool that normalizes the check, with the following
behavior:
no attributes or attribute set to "false" => return false
attribute set to "true" => return true
Differential Revision: https://reviews.llvm.org/D99299
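The normalized behavior can be modelled outside LLVM as below. `StringAttrs` is a hypothetical stand-in for a function's string attributes, and rejecting any value other than "true" or "false" with an assertion is an assumption of this sketch.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for string attributes; not LLVM's Attribute API.
using StringAttrs = std::map<std::string, std::string>;

// Missing attribute or "false" gives false; "true" gives true.
bool getValueAsBool(const StringAttrs &Attrs, const std::string &Key) {
  auto It = Attrs.find(Key);
  if (It == Attrs.end())
    return false;
  assert(It->second == "true" || It->second == "false");
  return It->second == "true";
}
```

Callers then need a single call per attribute instead of combining a presence check with a string comparison.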
There are two use cases.
Assembler
We have accrued some code gated on MCAsmInfo::useIntegratedAssembler(). Some
features are supported by the latest GNU as, but we have to gate them on
MCAsmInfo::useIntegratedAssembler() because the newer versions have not been widely
adopted (e.g. SHF_LINK_ORDER 'o' and 'unique' linkage in 2.35, --compress-debug-sections= in 2.26).
Linker
We want to use features supported only by LLD or very new GNU ld, or don't want
to work around older GNU ld. We currently can't represent that "we don't care
about old GNU ld". You can find such workarounds in a few other places, e.g.
Mips/MipsAsmprinter.cpp PowerPC/PPCTOCRegDeps.cpp X86/X86MCInstrLower.cpp
AArch64 TLS workaround for R_AARCH64_TLSLD_MOVW_DTPREL_* (PR ld/18276),
R_AARCH64_TLSLE_LDST8_TPREL_LO12 (https://bugs.llvm.org/show_bug.cgi?id=36727 https://sourceware.org/bugzilla/show_bug.cgi?id=22969)
Mixed SHF_LINK_ORDER and non-SHF_LINK_ORDER components (supported by LLD in D84001;
GNU ld feature request https://sourceware.org/bugzilla/show_bug.cgi?id=16833 may take a while before available).
This feature allows garbage collecting some unused sections (e.g. fragmented .gcc_except_table).
This patch adds `-fbinutils-version=` to clang and `-binutils-version` to llc.
It changes one codegen place in SHF_MERGE to demonstrate its usage.
`-fbinutils-version=2.35` means the produced object file does not care about GNU
ld<2.35 compatibility. When `-fno-integrated-as` is specified, the produced
assembly can be consumed by GNU as>=2.35, but older versions may not work.
`-fbinutils-version=none` means that we can use all ELF features, regardless of
GNU as/ld support.
Both clang and llc need `parseBinutilsVersion`. Such command line parsing is
usually implemented in `llvm/lib/CodeGen/CommandFlags.cpp` (LLVMCodeGen),
however, ClangCodeGen does not depend on LLVMCodeGen. So I add
`parseBinutilsVersion` to `llvm/lib/Target/TargetMachine.cpp` (LLVMTarget).
Differential Revision: https://reviews.llvm.org/D85474
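A standalone sketch of parsing such a version string and testing it against a minimum. This is not the `parseBinutilsVersion` from the patch: `parseVersion` and `isAtLeast` are hypothetical, and mapping "none" to the largest representable version is one assumed way to make every feature check succeed.

```cpp
#include <cassert>
#include <climits>
#include <cstdlib>
#include <string>
#include <utility>

// Parses "MAJOR.MINOR" into a pair; missing parts are left as 0.
// "none" maps to the largest possible version (an assumption).
std::pair<int, int> parseVersion(const std::string &V) {
  if (V == "none")
    return {INT_MAX, INT_MAX};
  std::pair<int, int> R{0, 0};
  char *End = nullptr;
  R.first = static_cast<int>(std::strtol(V.c_str(), &End, 10));
  if (*End == '.')
    R.second = static_cast<int>(std::strtol(End + 1, nullptr, 10));
  return R;
}

// Lexicographic comparison: major first, then minor.
bool isAtLeast(std::pair<int, int> Have, int Major, int Minor) {
  return Have >= std::make_pair(Major, Minor);
}
```

A codegen decision such as the SHF_MERGE one mentioned above would then be gated on a check like isAtLeast(Version, 2, 35).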
TargetMachine::shouldAssumeDSOLocal currently implies dso_local for such definitions.
Since clang -fno-pic adds the dso_local specifier, we don't need to special case.
This simplifies TargetMachine::shouldAssumeDSOLocal and gives the frontend the
decision to use dso_local. For LLVM synthesized functions/globals, they may lose
inferred dso_local but such optimizations are probably not very useful.
Note: the hasComdat() condition in canBenefitFromLocalAlias (D77429) may be dead now.
(llvm/CodeGen/X86/semantic-interposition-comdat.ll)
(Investigate whether we need test coverage when Fuchsia C++ ABI is clearer)
687b83ceabafe81970cd4639e7f0c89036402081 has fixed the X86FastISel bug.
We can revert the workaround now. Actually, the commit introduced a
bug that ppc64 should be excluded.
AddressSanitizer instrumentation does not set dso_local on non-thread-local
global variables in -fno-pic and it seems to rely on implied dso_local to work.
Add a hack until we have fixed AddressSanitizer to call setDSOLocal() as
appropriate.
Thanks to Vitaly Buka for reporting the issue and suggesting the way to detect asan.
This does not deserve special handling. The code should be added to Clang
instead if deemed useful. With this simplification, we can additionally delete
the PIC extern_weak special case.
With my previous commit, X86Subtarget::classifyGlobalReference has learned to
use MO_NO_FLAG for 32-bit ELF -fno-pic code, the x86-32 special case in
TargetMachine::shouldAssumeDSOLocal can be removed. Since we no longer imply
dso_local for function declarations, we can drop the ppc64 special case as well.
This is NFC in terms of Clang emitted assembly.
clang/lib/CodeGen/CodeGenModule sets dso_local on applicable function declarations,
we don't need to duplicate the work in TargetMachine::shouldAssumeDSOLocal.
(Actually the long-term goal (started by r324535) is to drop TargetMachine::shouldAssumeDSOLocal.)
By not implying dso_local, we will respect dso_local/dso_preemptable specifiers
set by the frontend. This allows the proposed -fno-direct-access-external-data
option to work with -fno-pic and prevent a canonical PLT entry (SHN_UNDEF with non-zero st_value)
when taking the address of a function symbol.
This patch should be NFC in terms of the Clang emitted assembly because the case
we don't set dso_local is a case Clang sets dso_local. However, some tests don't
set dso_local on some function declarations and expose some differences. Most
tests have been fixed to be more robust in the previous commit.
The function accrues many `GV` nullness checks. Process `!GV`
(ExternalSymbolSDNode) early to simplify code.
Also improve a comment added in r327198 (intrinsics are a subset of
ExternalSymbolSDNode).
Intended to be NFC.
PPCMCInstLower does not actually call shouldAssumeDSOLocal for ppc32 so this is dead.
Actually Clang ppc32 does produce a pair of absolute relocations which match GCC.
This also fixes a comment (R_PPC_COPY and R_PPC64_COPY do exist).