With the new behaviour, the /MD or similar options aren't added to
e.g. CMAKE_CXX_FLAGS_RELEASE, but are added separately by CMake.
They can be changed by the cmake variable
CMAKE_MSVC_RUNTIME_LIBRARY or with the target property
MSVC_RUNTIME_LIBRARY.
LLVM has had its own custom CMake flags, e.g. LLVM_USE_CRT_RELEASE,
which affects which CRT is used for release mode builds. Deprecate
these and direct users to use CMAKE_MSVC_RUNTIME_LIBRARY directly
instead (and do a best effort attempt at setting CMAKE_MSVC_RUNTIME_LIBRARY
based on the existing LLVM_USE_CRT_ flags). This only handles the
simple cases, it doesn't handle multi-config generators with
different LLVM_USE_CRT_* variables for different configs though,
but that's probably fine - we should move over to the new upstream
CMake mechanism anyway, and push users towards that.
Change code in compiler-rt, that previously tried to override the
CRT choice to /MT, to set CMAKE_MSVC_RUNTIME_LIBRARY instead of
meddling in the old variables.
This resolves the policy issue in
https://github.com/llvm/llvm-project/issues/63286, and should
handle the issues that were observed originally when the
minimum CMake version was bumped, in
https://github.com/llvm/llvm-project/issues/62719 and
https://github.com/llvm/llvm-project/issues/62739.
Differential Revision: https://reviews.llvm.org/D155233
The next sections in GettingStarted assume you are still in the root
directory llvm-project when using ninja.
Make the `cmake --build` command match it as well.
Note: I am a new cmake user and this confused me.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D153727
According the discuss on D154953, we need to make the LangRef change
before the optimization relied on the new behaviour:
vscale_range implies vscale is a power-of-two value, parse of the
attribute to reject values that are not a power-of-two.
Thanks nikic for the wonderful summary of discussing on D154953:
To provide a bit more context here. We would like to have power of two vscale exposed in a target-independent way, so we can make use of this in places like ValueTracking, just like we currently do the vscale range. Some options that have been discussed are:
- Remove support for non-power-of-two vscales entirely. (This is my personal preference, but this is hard to undo if it turns out someone does need them.)
- Add an extra attribute vscale_pow2, or a data layout property.
- Make vscale_range imply power-of-two vscale, as a compromise solution (what this patch does). This would be relatively easy to turn into one of the two above at a later point.
Reviewed By: paulwalker-arm, nikic, efriedma
Differential Revision: https://reviews.llvm.org/D155193
This is a mostly NFC change cleaning up and clarifying components of the
in-tree CORE-V (xcv*) extensions following discussions on the remaining
extensions.
This makes the following changes to the xcbitmanip and xcvmac support:
1. Add missing extensions from RISCVISAInfo, such that they can be
supported in clang's -march option.
2. Clarify the extension version number is 1.0.0 in documentation.
3. Clarify the extensions are by OpenHW Group, and the capitilization
of the CORE-V extension family.
4. Add CORE-V to extension name in RISCVFeatures, both to be consistent
with other vendors, and also better distinguish e.g. CORE-V bit
manipulation vs RISC-V's standard Zb extensions.
Differential Revision: https://reviews.llvm.org/D155283
The test migration to opaque pointers has finished, so we can finally
drop typed pointer support from LLVM \o/
This removes the ability to disable typed pointers, as well as the
-opaque-pointers option, but otherwise doesn't yet touch any API
surface. I'll leave deprecation/removal of compatibility APIs to
future changes.
This also drops a few tests: These are either testing errors that
only occur with typed pointers, or type linking behavior that, to
the best of my knowledge, only applies to typed pointers.
Note that this will break some tests in the experimental SPIRV
backend, because the maintainers have failed to update their tests
in a reasonable time-frame, despite multiple warnings. In accordance
with our experimental target policy, this is not a blocking concern.
This issue is tracked at https://github.com/llvm/llvm-project/issues/60133.
Differential Revision: https://reviews.llvm.org/D155079
Do the LDS frame calculation once, in the IR pass, instead of repeating the work in the backend.
Prior to this patch:
The IR lowering pass sets up a per-kernel LDS frame and annotates the variables with absolute_symbol
metadata so that the assembler can build lookup tables out of it. There is a fragile association between
kernel functions and named structs which is used to recompute the frame layout in the backend, with
fatal_errors catching inconsistencies in the second calculation.
After this patch:
The IR lowering pass additionally sets a frame size attribute on kernels. The backend uses the same
absolute_symbol metadata that the assembler uses to place objects within that frame size.
Deleted the now dead allocation code from the backend. Left for a later cleanup:
- enabling lowering for anonymous functions
- removing the elide-module-lds attribute (test churn, it's not used by llc any more)
- adjusting the dynamic alignment check to not use symbol names
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D155190
Extend D127824 to the 32-bit Power architecture.
AFAICT GNU objdump -d dumps all instructions for 32-bit as well.
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D155114
This is a reboot of the original design and implementation by
Nicolai Haehnle <nicolai.haehnle@amd.com>:
https://reviews.llvm.org/D85603
This change also obsoletes an earlier attempt at restarting the work on
convergence tokens:
https://reviews.llvm.org/D104504
Changes relative to D85603:
1. Clean up the definition of a "convergent operation", a convergent
call and convergent function.
2. Clean up the relationship between dynamic instances, sets of threads and
convergence tokens.
3. Redistribute the formal rules into the definitions of the convergence
intrinsics.
4. Expand on the semantics of entering a function from outside LLVM,
and the environment-defined outcome of the entry intrinsic.
5. Replace the term "cycle" with "closed path". The static rules are defined
in terms of closed paths, and then a relation is established with cycles.
6. Specify that if a function contains a controlled convergent operation, then
all convergent operations in that function must be controlled.
7. Describe an optional procedure to infer tokens for uncontrolled convergent
operations.
8. Introduce controlled maximal convergence-before and controlled m-converged
property as an update to the original properties in UniformityAnalysis.
9. Additional constraint that a cycle heart can only occur in the header of a
reducible cycle (natural loop).
Reviewed By: nhaehnle
Differential Revision: https://reviews.llvm.org/D147116
Annotation attributes may be attached to a function to mark it with
custom data that will be contained in the final Wasm file. The
annotation causes a custom section named
"func_attr.annotate.<name>.<arg0>.<arg1>..." to be created that will
contain each function's index value that was marked with the annotation.
A new patchable relocation type for function indexes had to be created so
the custom section could be updated during linking.
Reviewed By: sbc100
Differential Revision: https://reviews.llvm.org/D150803
This implements the v1.0-rc1 draft extension.
amocas.d on RV32 and amocas.q have the restriction that rd and rs2 must
be even registers. I've opted to implement this restriction in
RISCVAsmParser::validateInstruction even though for codegen we'll need a
new register class and can then remove this validation. This also
sidesteps, for now, the issue of amocas.d being different on rv32 vs
rv64.
See <https://github.com/riscv-non-isa/riscv-c-api-doc/issues/37> for the
issue of needing an agreed asm register constraint for register pairs.
Differential Revision: https://reviews.llvm.org/D149248
Summary:
Adding a new option -traceback-table to print out the traceback info of xcoff ojbect file.
Reviewers: James Henderson, Fangrui Song, Stephen Peckham, Xing Xue
Differential Revision: https://reviews.llvm.org/D89049
Currently, our GEP specification has a special case that makes
gep inbounds (null, 0) legal. This patch proposes to expand this
special case to all gep inbounds (ptr, 0), where ptr is no longer
required to point to an allocated object.
This was previously discussed in some detail at
https://discourse.llvm.org/t/question-about-getelementptr-inbounds-with-offset-0/62533.
The motivation for this change is twofold:
* Rust relies on getelementptr inbounds with zero offset to be
legal for arbitrary pointers to support zero-sized types. The
current rules are unclear on whether this is legal or not
(saying that there is a zero-size "allocated object" at every
address may be consistent with our current rules, but more
clarity is desired here).
* The current semantics require us to drop the inbounds flag
when materializing zero-index GEPs, which is done by some
InstCombine transforms. Preserving the inbounds flag can
substantially improve optimization quality in some cases, as
illustrated in D154055.
As far as I know, the only analysis/transforms affected by this
semantics change are:
* A special-case for comparisons with null in CaptureTracking,
which is fixed by D154054. As far as I can tell, that special
case is not particularly valuable and should be recovered by
other transforms.
* Folding gep inbounds undef, idx to poison. We now need to fold
to undef instead (D154215).
Differential Revision: https://reviews.llvm.org/D154051
The library expansion has too many paths for all the permutations of
DAZ, unsafe and the 3 exp functions. It's easier to expand it in the
backend when we know all of these things. The library currently misses
the no-infinity check on the overflow, which this handles optimizing
out.
Some of the <3 x half> fast tests regress due to vector widening
dropping flags which will be fixed separately.
Apparently there is no exp10 intrinsic, but there should be. Adds some
deadish code in preparation for adding one while I'm following along
with the current library expansion.
Previously we expanded these in a fast-math way and the device
libraries were relying on this behavior. The libraries have a pending
change to switch to the new target intrinsic.
Unlike the library version, this takes advantage of no-infinities on
the result overflow check.
This adds a type_checked_load_relative intrinsic whose semantics it is to
load a relative function pointer.
A relative function pointer is a pointer to a 32bit value that when
added to its address yields the address of the function.
Differential Revision: https://reviews.llvm.org/D143204
Fat LTO objects contain both LTO compatible IR, as well as generated
object code. This allows users to defer the choice of whether to use LTO
or not to link-time. This is a feature available in GCC for some time,
and makes the existing -ffat-lto-objects flag functional in the same
way as GCC's.
Within LLVM, we add a new EmbedBitcodePass that serializes the module to
the object file, and expose a new pass pipeline for compiling fat
objects. The new pipeline initially clones the module and runs the
selected (Thin)LTOPrelink pipeline, after which it will serialize the
module into a `.llvm.lto` section of an ELF file. When compiling for
(Thin)LTO, this normally the point at which the compiler would emit a
object file containing the bitcode and metadata.
After that point we compile the original module using the
PerModuleDefaultPipeline used for non-LTO compilation. We generate
standard object files at the end of this pipeline, which contain machine
code and the new `.llvm.lto` section containing bitcode.
Since the two pipelines operate on different copies of the module, we
can be sure that the bitcode in the `.llvm.lto` section and object code
in `.text` are congruent with the existing output produced by the
default and LTO pipelines.
Original RFC: https://discourse.llvm.org/t/rfc-ffat-lto-objects-support/63977
Earlier versions of this patch were missing REQUIRES lines for llc
related tests in Transforms/EmbedBitcode. Those tests are now under
CodeGen/X86, which should avoid running the check on unsupported
platforms.
The EmbedbBitcodePass also returned PreservedAnalyses::all when adding a
metadata section, which failed expensive checks, since it modified the
module. This is now corrected.
Reviewed By: tejohnson, MaskRay, nikic
Differential Revision: https://reviews.llvm.org/D146776
Add an intrinsic which returns the two pieces as multiple return
values. Alternatively could introduce a pair of intrinsics to
separately return the fractional and exponent parts.
AMDGPU has native instructions to return the two halves, but could use
some generic legalization and optimization handling. For example, we
should be able to handle legalization of f16 on older targets, and for
bf16. Additionally antique targets need a hardware workaround which
would be better handled in the backend rather than in library code
where it is now.
The symbolizer markup syntax is structured such that fields require only
previous fields for their interpretation; this was originally intended
to make adding new fields a natural extension mechanism for existing
elements. This codifies this into the spec and makes the behavior of the
llvm-symbolizer match. Extra fields are now warned about, but ignored,
rather than ignoring the whole element.
Reviewed By: mcgrathr
Differential Revision: https://reviews.llvm.org/D153821
This updates the documentation to match the implementation. Warning and
Min interact in the same way as Warning and Max.
Differential Revision: https://reviews.llvm.org/D153012
This was discussed somewhat in D148315. As it stands, we require in
RISCVISAInfo::parseArchString (used for e.g. -march parsing in Clang)
that extensions are given in the order of z, then s, then x prefixed
extensions (after the standard single-letter extensions). However, we
recently (in D148315) moved to that order from z/x/s as the canonical
ordering was changed in the spec. In addition, recent GCC seems to
require z* extensions before s*.
My recollection of the history here is that we thought keeping -march as
close to the rules for ISA naming strings as possible would simplify
things, as there's an existing spec to point to. My feeling is that now
we've had incompatible changes, and an incompatibility with GCC there's
no real benefit to sticking to this restriction, and it risks making it
much more painful than it needs to be to copy a -march= string between
GCC and Clang.
This patch removes all ordering restrictions so you can freely mix x/s/z
extensions.
To be very explicit, this doesn't change our behaviour when emitting a
canonically ordered extension string (e.g. in build attributes). We of
course sort according to the canonical order (as we understand it) in
that case.
Differential Revision: https://reviews.llvm.org/D149246
Switch the parse of command line options fromllvm::cl to OptTable.
The motivation for this change is to continue adding llvm based tools
to the llvm driver multicall.
Differential Revision: https://reviews.llvm.org/D153665
I meant to fold this into 47601815ec3a4f31c797c75748af08acfabc46dc
but failed to do so.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D152671
Fat LTO objects contain both LTO compatible IR, as well as generated
object code. This allows users to defer the choice of whether to use LTO
or not to link-time. This is a feature available in GCC for some time,
and makes the existing -ffat-lto-objects flag functional in the same
way as GCC's.
Within LLVM, we add a new EmbedBitcodePass that serializes the module to
the object file, and expose a new pass pipeline for compiling fat
objects. The new pipeline initially clones the module and runs the
selected (Thin)LTOPrelink pipeline, after which it will serialize the
module into a `.llvm.lto` section of an ELF file. When compiling for
(Thin)LTO, this normally the point at which the compiler would emit a
object file containing the bitcode and metadata.
After that point we compile the original module using the
PerModuleDefaultPipeline used for non-LTO compilation. We generate
standard object files at the end of this pipeline, which contain machine
code and the new `.llvm.lto` section containing bitcode.
Since the two pipelines operate on different copies of the module, we
can be sure that the bitcode in the `.llvm.lto` section and object code
in `.text` are congruent with the existing output produced by the
default and LTO pipelines.
Original RFC: https://discourse.llvm.org/t/rfc-ffat-lto-objects-support/63977
Earlier versions of this patch were missing REQUIRES lines for llc
related tests in Transforms/EmbedBitcode. Those tests are now under
CodeGen/X86, which should avoid running the check on unsupported
platforms.
Reviewed By: tejohnson, MaskRay, nikic
Differential Revision: https://reviews.llvm.org/D146776
There seems to be a problem on arm buildbots. Reverting until I can
investigate. https://lab.llvm.org/buildbot#builders/245/builds/10184
This reverts commit a67208e1c697649ce432e6497f56a93675273dd8
and dependent commit e54a3112cee5ae0a9117359ecbea878e1388f51e.
Fat LTO objects contain both LTO compatible IR, as well as generated
object code. This allows users to defer the choice of whether to use LTO
or not to link-time. This is a feature available in GCC for some time,
and makes the existing -ffat-lto-objects flag functional in the same
way as GCC's.
Within LLVM, we add a new EmbedBitcodePass that serializes the module to
the object file, and expose a new pass pipeline for compiling fat
objects. The new pipeline initially clones the module and runs the
selected (Thin)LTOPrelink pipeline, after which it will serialize the
module into a `.llvm.lto` section of an ELF file. When compiling for
(Thin)LTO, this normally the point at which the compiler would emit a
object file containing the bitcode and metadata.
After that point we compile the original module using the
PerModuleDefaultPipeline used for non-LTO compilation. We generate
standard object files at the end of this pipeline, which contain machine
code and the new `.llvm.lto` section containing bitcode.
Since the two pipelines operate on different copies of the module, we
can be sure that the bitcode in the `.llvm.lto` section and object code
in `.text` are congruent with the existing output produced by the
default and LTO pipelines.
Original RFC: https://discourse.llvm.org/t/rfc-ffat-lto-objects-support/63977
Reviewed By: tejohnson, MaskRay, nikic
Differential Revision: https://reviews.llvm.org/D146776
My MC layer support patches missed adding these to RISCVUsage. Also
update the link to the most recent spec PDF (including the recently
committed encoding fix for vfwmaccbf16.
We previously directly codegened to v_log_f32, which is broken for
denormals. The lowering isn't complicated, you simply need to scale
denormal inputs and adjust the result. Note log and log10 are still
not accurate enough, and will be fixed separately.
The _e64_dpp suffix can be added to an instruction to force the
AsmParser to encode it as VOP3 with DPP if possible on GFX11+. This has
been the behavior since GFX11 was introduced; this patch only updates
the documentation.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D153564