Canonicalize on storing FP options in LangOptions instead of
redundantly in CodeGenOptions. Incorporate -ffast-math directly
into the values of those LangOptions rather than considering it
separately when building FPOptions. Build IR attributes from
those options rather than a mix of sources.
We should really simplify the driver/cc1 interaction here and have
the driver pass down options that cc1 directly honors. That can
happen in a follow-up, though.
Patch by Michele Scandale!
https://reviews.llvm.org/D80315
If we're going to assume references are dereferenceable, we should also
assume they're aligned: otherwise, we can't actually dereference them.
See also D80072.
Differential Revision: https://reviews.llvm.org/D80166
This patch adds a matrix type to Clang as described in the draft
specification in clang/docs/MatrixSupport.rst. It introduces a new option
-fenable-matrix, which can be used to enable the matrix support.
The patch adds new MatrixType and DependentSizedMatrixType types along
with the plumbing required. Loads of and stores to pointers to matrix
values are lowered to memory operations on 1-D IR arrays. After loading,
the loaded values are cast to a vector. This ensures matrix values use
the alignment of the element type, instead of LLVM's large vector
alignment.
The operators and builtins described in the draft spec will will be added in
follow-up patches.
Reviewers: martong, rsmith, Bigcheese, anemet, dexonsmith, rjmccall, aaron.ballman
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D72281
test cases
Add support for #pragma float_control
Reviewers: rjmccall, erichkeane, sepavloff
Differential Revision: https://reviews.llvm.org/D72841
This reverts commit 85dc033caccaa6ab919d57f9759290be41240146, and makes
corrections to the test cases that failed on buildbots.
This method has been commented as deprecated for a while. Remove
it and replace all uses with the equivalent getCalledOperand().
I also made a few cleanups in here. For example, to removes use
of getElementType on a pointer when we could just use getFunctionType
from the call.
Differential Revision: https://reviews.llvm.org/D78882
This reverts commit 61ba1481e200b5b35baa81ffcff81acb678e8508.
I'm reverting this because it breaks the lldb build with
incomplete switch coverage warnings. I would fix it forward,
but am not familiar enough with lldb to determine the correct
fix.
lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:3958:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]
switch (qual_type->getTypeClass()) {
^
lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4633:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]
switch (qual_type->getTypeClass()) {
^
lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4889:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]
switch (qual_type->getTypeClass()) {
Introduction/Motivation:
LLVM-IR supports integers of non-power-of-2 bitwidth, in the iN syntax.
Integers of non-power-of-two aren't particularly interesting or useful
on most hardware, so much so that no language in Clang has been
motivated to expose it before.
However, in the case of FPGA hardware normal integer types where the
full bitwidth isn't used, is extremely wasteful and has severe
performance/space concerns. Because of this, Intel has introduced this
functionality in the High Level Synthesis compiler[0]
under the name "Arbitrary Precision Integer" (ap_int for short). This
has been extremely useful and effective for our users, permitting them
to optimize their storage and operation space on an architecture where
both can be extremely expensive.
We are proposing upstreaming a more palatable version of this to the
community, in the form of this proposal and accompanying patch. We are
proposing the syntax _ExtInt(N). We intend to propose this to the WG14
committee[1], and the underscore-capital seems like the active direction
for a WG14 paper's acceptance. An alternative that Richard Smith
suggested on the initial review was __int(N), however we believe that
is much less acceptable by WG14. We considered _Int, however _Int is
used as an identifier in libstdc++ and there is no good way to fall
back to an identifier (since _Int(5) is indistinguishable from an
unnamed initializer of a template type named _Int).
[0]https://www.intel.com/content/www/us/en/software/programmable/quartus-prime/hls-compiler.html)
[1]http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2472.pdf
Differential Revision: https://reviews.llvm.org/D73967
Now compiler defines 5 sets of constants to represent rounding mode.
These are:
1. `llvm::APFloatBase::roundingMode`. It specifies all 5 rounding modes
defined by IEEE-754 and is used in `APFloat` implementation.
2. `clang::LangOptions::FPRoundingModeKind`. It specifies 4 of 5 IEEE-754
rounding modes and a special value for dynamic rounding mode. It is used
in clang frontend.
3. `llvm::fp::RoundingMode`. Defines the same values as
`clang::LangOptions::FPRoundingModeKind` but in different order. It is
used to specify rounding mode in in IR and functions that operate IR.
4. Rounding mode representation used by `FLT_ROUNDS` (C11, 5.2.4.2.2p7).
Besides constants for rounding mode it also uses a special value to
indicate error. It is convenient to use in intrinsic functions, as it
represents platform-independent representation for rounding mode. In this
role it is used in some pending patches.
5. Values like `FE_DOWNWARD` and other, which specify rounding mode in
library calls `fesetround` and `fegetround`. Often they represent bits
of some control register, so they are target-dependent. The same names
(not values) and a special name `FE_DYNAMIC` are used in
`#pragma STDC FENV_ROUND`.
The first 4 sets of constants are target independent and could have the
same numerical representation. It would simplify conversion between the
representations. Also now `clang::LangOptions::FPRoundingModeKind` and
`llvm::fp::RoundingMode` do not contain the value for IEEE-754 rounding
direction `roundTiesToAway`, although it is supported natively on
some targets.
This change defines all the rounding mode type via one `llvm::RoundingMode`,
which also contains rounding mode for IEEE rounding direction `roundTiesToAway`.
Differential Revision: https://reviews.llvm.org/D77379
The problem was reported in PR45468, applying target features to an
always_inline constructor/destructor runs afoul of GlobalDecl
construction assert when checking for target-feature compatibility.
The core problem is fixed by using the version of the check that takes a
FunctionDecl rather than the GlobalDecl. However, while writing the
test, I discovered that source locations weren't properly set for this
check on ctors/dtors. This patch also fixes constructors and CALLED destructors.
Unfortunately, it doesn't seem too possible to get a meaningful source
location for a 'cleanup' destructor, so those are still 'frontend' level
errors unfortunately. A fixme was added to the test to cover that
situation.
Summary: This allows instrumenting things like global initializers
Reviewers: dberris, MaskRay, smeenai
Subscribers: cfe-commits, johnislarry
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77191
This patch adds 'q' to mean 'scalable vector' in the builtin
type string, and for SVE will return the matching builtin
type as defined in the C/C++ language extensions for SVE.
This patch also adds some scaffolding to generate the arm_sve.h
header file, and some builtin definitions (+CodeGen) to be able
to implement some simple masked load intrinsics that use the
ACLE types, such as:
svint8_t test_svld1_s8(svbool_t pg, const int8_t *base) {
return svld1_s8(pg, base);
}
Reviewers: efriedma, rjmccall, rovka, rsandifo-arm, rengolin
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75298
norecurse function attr indicates the function is not called recursively
directly or indirectly.
Add norecurse to OpenCL functions, SYCL functions in device compilation
and CUDA/HIP kernels.
Although there is LLVM pass adding norecurse to functions, it only works
for whole-program compilation. Also FE adding norecurse can make that
pass run faster since functions with norecurse do not need to be checked
again.
Differential Revision: https://reviews.llvm.org/D73651
Reapply 8a56d64d7620b3764f10f03f3a1e307fcdd72c2f with minor fixes.
The problem was that cancellation can cause new edges to the parallel
region exit block which is not outlined. The CodeExtractor will encode
the information which "exit" was taken as a return value. The fix is to
ensure we do not return any value from the outlined function, to prevent
control to value conversion we ensure a single exit block for the
outlined region.
This reverts commit 3aac953afa34885a72df96f2b703b65f85cbb149.
In order to fix PR44560 and to prepare for loop transformations we now
finalize a function late, which will also do the outlining late. The
logic is as before but the actual outlining step happens now after the
function was fully constructed. Once we have loop transformations we
can apply them in the finalize step before the outlining.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D74372
The function attributes xray-skip-entry, xray-skip-exit, and
xray-ignore-loops were only being applied if a function had an
xray-instrument attribute, but they should apply if xray is enabled
globally too.
Differential Revision: https://reviews.llvm.org/D73842
Extend -fxray-instrumentation-bundle to split function-entry and
function-exit into two separate options, so that it is possible to
instrument only function entry or only function exit. For use cases
that only care about one or the other this will save significant overhead
and code size.
Differential Revision: https://reviews.llvm.org/D72890
XRay allows tuning by minimum function size, but also always instruments
functions with loops in them. If the minimum function size is set to a
large value the loop instrumention ends up causing most functions to be
instrumented anyway. This adds a new flag, -fxray-ignore-loops, to disable
the loop detection logic.
Differential Revision: https://reviews.llvm.org/D72873
The option will limit debug info by only emitting complete class
type information when its constructor is emitted.
This patch changes comparisons with LimitedDebugInfo to use the new
level instead.
Differential Revision: https://reviews.llvm.org/D72427
In the backend, this feature is implemented with the function attribute
"patchable-function-entry". Both the attribute and XRay use
TargetOpcode::PATCHABLE_FUNCTION_ENTER, so the two features are
incompatible.
Reviewed By: ostannard, MaskRay
Differential Revision: https://reviews.llvm.org/D72222
This feature is generic. Make it applicable for AArch64 and X86 because
the backend has only implemented NOP insertion for AArch64 and X86.
Reviewed By: nickdesaulniers, aaron.ballman
Differential Revision: https://reviews.llvm.org/D72221
The validateOutputSize and validateInputSize need to check whether
AVX or AVX512 are enabled. But this can be affected by the
target attribute so we need to factor that in.
This patch moves some of the code from CodeGen to create an
appropriate feature map that we can pass to the function.
Differential Revision: https://reviews.llvm.org/D68627
GCC's x86 and s390 ports support -mrecord-mcount. Other ports reject the
option.
aarch64-linux-gnu-gcc: error: unrecognized command line option ‘-mrecord-mcount’
Allowing this option can cause failures when building Linux kernel for
aarch64, powerpc64, etc, which will think the feature is available if
the clang command returns 0.
Recognize -mrecord-mcount from the command line and add a function attribute
"mrecord-mcount" when passed.
Only valid on SystemZ (when used with -mfentry).
Review: Ulrich Weigand
https://reviews.llvm.org/D71627
Let the "mnop-mcount" function attribute simply be present or non-present.
Update SystemZ backend as well to use hasFnAttribute() instead.
Review: Ulrich Weigand
https://reviews.llvm.org/D71669
Recognize -mpacked-stack from the command line and add a function attribute
"mpacked-stack" when passed. This is needed for building the Linux kernel.
If this option is passed for any other target than SystemZ, an error is
generated.
Review: Ulrich Weigand
https://reviews.llvm.org/D71441
The validateOutputSize and validateInputSize need to check whether
AVX or AVX512 are enabled. But this can be affected by the
target attribute so we need to factor that in.
This patch copies some of the code from CodeGen to create an
appropriate feature map that we can pass to the function. Probably
need some refactoring here to share more code with Codegen. Is
there a good place to do that? Also need to support the cpu_specific
attribute as well.
Differential Revision: https://reviews.llvm.org/D68627
This unbreaks the debuginfo-tests testsuite by replacing the assertion
with a default location. There are cleanups in helper functions that
don't have a valid source location such as block copy helpers and it's
not worth tracking each of them down.
rdar://57630879
Patch was reverted because https://bugs.llvm.org/show_bug.cgi?id=44048
The original patch is modified to set the strictfp IR attribute
explicitly in CodeGen instead of as a side effect of IRBuilder.
In the 2nd attempt to reapply there was a windows lit test fail, the
tests were fixed to use wildcard matching.
Differential Revision: https://reviews.llvm.org/D62731
AggValueSlot
This reapplies 8a5b7c35709d9ce1f44a99f0c5b084bf2696ea17 after a null
dereference bug in CGOpenMPRuntime::emitUserDefinedMapper.
Original commit message:
This is needed for the pointer authentication work we plan to do in the
near future.
a63a81bd99/clang/docs/PointerAuthentication.rst
checkTargetFeatures was incorrectly checking for cpu_specific instead of
just 'target'. While this function was never called in that situation,
it seemed correct to fix the condition. Additionally, multiversion
functions can never be always_inline, but if any function accidentially
ended up here we shouldn't diagnose.
Note that the adding of target-features to the list is unnecessary since
the getFunctionFeatureMap actually considers attribute target,
however adding it results in significantly better error messages by
putting the 'target' features first (and thus first to fail).
Otherwise, the error message would be the first feature 'implied' by the
target attribute, and not necessarily the feature listed in the
attribute itself.
and a follow-up NFC rearrangement as it's causing a crash on valid. Testcase is on the original review thread.
This reverts commits af57dbf12e54f3a8ff48534bf1078f4de104c1cd and e6584b2b7b2de06f1e59aac41971760cac1e1b79
Enumerations that describe rounding mode and exception behavior were
defined inside ConstrainedFPIntrinsic. It makes sense to use the same
definitions to represent the same properties in other cases, not only
in constrained intrinsics. It was however inconvenient as required to
include constrained intrinsics definitions even if they were not needed.
Also using long scope prefix reduced readability.
This change moves these definitioins to the namespace llvm::fp.
No functional changes.
Differential Revision: https://reviews.llvm.org/D69552
This patch is motivated by (and factored out from)
https://reviews.llvm.org/D66121 which is a debug info bugfix. Starting
with DWARF 5 all Objective-C methods are nested inside their
containing type, and that patch implements this for synthesized
Objective-C properties.
1. SemaObjCProperty populates a list of synthesized accessors that may
need to inserted into an ObjCImplDecl.
2. SemaDeclObjC::ActOnEnd inserts forward-declarations for all
accessors for which no override was provided into their
ObjCImplDecl. This patch does *not* synthesize AST function
*bodies*. Moving that code from the static analyzer into Sema may
be a good idea though.
3. Places that expect all methods to have bodies have been updated.
I did not update the static analyzer's inliner for synthesized
properties to point back to the property declaration (see
test/Analysis/Inputs/expected-plists/nullability-notes.m.plist), which
I believed to be more bug than a feature.
Differential Revision: https://reviews.llvm.org/D68108
rdar://problem/53782400