If a function's address is taken, which means it may be called via a function pointer,
we need the function descriptor for it.
Otherwise, the function descriptor can be omitted for external symbols.
Math library code has quite a few places with complex bit
logic that are ultimately fed into a copysign. This helps
avoid some regressions in a future patch.
This assumes the position in the float type, which should
at least be valid for IEEE types. Not sure if we need to guard
against ppc_fp128 or anything else weird.
There appears to be some value in simplifying the value operand
as well, but I'll address that separately.
Currently, the alignment of toc-data symbol might be changed during
instcombine
```
IC: Visiting: %global = alloca %struct.widget, align 8
Found alloca equal to global: %global = alloca %struct.widget, align 8
memcpy = call void @llvm.memcpy.p0.p0.i64(ptr nonnull align 1 %global, ptr align 1 @global, i64 3, i1 false)
```
The `alloca` is created with `PrefAlign` which is 8 and after IC, the
alignment of `@global` is enforced into `8`, same as the `alloca`. This
is not expected, since toc-data symbol has the same alignment as toc
entry and should not be increased during optimizations.
---------
Co-authored-by: Sean Fertile <sd.fertile@gmail.com>
Co-authored-by: Eli Friedman <efriedma@quicinc.com>
This removes the uses of target flags to disable subreg liveness,
relying on the `-enable-subreg-liveness` flag instead. The
`-enable-subreg-liveness` flag has been changed to take precedence over
the subtarget if set, and one use of `Subtarget->enableSubRegLiveness()`
has been changed to `MRI->subRegLivenessEnabled()` to make sure the
option properly applies.
Relanding this PR now that
https://github.com/llvm/llvm-project/pull/90503 has merged. with `FTAN`
landing in
[TargetLoweringBase.cpp:L1021](https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/TargetLoweringBase.cpp#L1020C23-L1021C63
) There is now a llvm tan intrinsic 32\64\128 Expand case for all llvm
backends.
In LLVM, the `llvm.experimental.constrained.cos` and
`llvm.experimental.constrained.sin` intrinsics are used for performing
cosine and sine calculations with additional constraints on
floating-point operations. This behavior is expected for all
floating-point math intrinsics. This change adds these constraints for
the `tan` intrinsic.
- `Builtins.td` - replace TanF128 with F16F128MathTemplate
- `CGBuiltin.cpp` - map existing tan builtins to `tan` and
`constrained_tan` intrinsic
- `ConstrainedOps.def` map tan and constrained_tan to an ISDOpcode.
resolves #91421
---------
Co-authored-by: Farzon Lotfi <farzon@farzon.com>
This pull request port `regallocfast` to new pass manager. It exposes
the parameter `filter` to handle different register classes for AMDGPU.
IIUC AMDGPU need to allocate different register classes separately so it
need implement its own `--<reg-class>-regalloc`. Now users can use e.g.
`-passe=regallocfast<filter=sgpr>` to allocate specific register class.
The command line option `--regalloc-npm` is still in work progress, plan
to reuse the syntax of passes, e.g. use
`--regalloc-npm=regallocfast<filter=sgpr>,greedy<filter=vgpr>` to
replace `--sgpr-regalloc` and `--vgpr-regalloc`.
Implementation is NOT compatible with IBM XL C 16.1 and earlier but is
compatible with GCC.
It handles all ByVals with greater alignment then pointer width the same
way IBM XL C handles Byvals
that have vector members. For overaligned objects that do not contain
vectors IBM XL C does not align them properly if they are passed in the
GPR
argument registers.
This patch was originally written by Sean Fertile @mandlebug.
Previously on Phabricator https://reviews.llvm.org/D105659
Remove support for the icmp and fcmp constant expressions.
This is part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179
As usual, many of the updated tests will no longer test what they were
originally intended to -- this is hard to preserve when constant
expressions get removed, and in many cases just impossible as the
existence of a specific kind of constant expression was the cause of the
issue in the first place.
The getValidShiftAmountConstant/getValidMinimumShiftAmountConstant/getValidMaximumShiftAmountConstant helpers only worked with constant shift amounts, which could be problematic after type legalization (e.g. v2i64 might be partially scalarized or split into v4i32 on some targets such as 32-bit x86, Thumb2 MVE).
This patch proposes we generalize these helpers to work with ConstantRange+KnownBits if a scalar/buildvector constant isn't available.
Most restrictions are the same - the helper fails if any shift amount is out of bounds, getValidShiftConstant must be a specific constant uniform etc.
However, getValidMinimumShiftAmount/getValidMaximumShiftAmount now can return bounds values that aren't values in the actual data, as they are based off the common KnownBits of every vector element.
This addresses feedback on #92096
The Prefix instruction is introduced on PowerPC ISA3_1.
In the PR,
1. The `FeaturePrefixInstrs` do not imply the `FeatureP8Vector`
,`FeatureP9Vector` .
2. `FeaturePrefixInstrs` implies only the FeatureISA3_1.
3. For the prefix instructions `paddi` and `pli` , they have `Predicates
= [PrefixInstrs] `
4. For the prefix instructions `plfs` and `plfd`, they have `Predicates
= [PrefixInstrs, HasFPU] `
5. For the prefix instructions "plxv` , "plxssp` and `plxsd` , they have
`Predicates = [PrefixInstrs, HasP10Vector]`
Fixes#62372
In #88846 I changed this code to use RAUW to perform the replacement
instead of manual updates -- but kept the outer loop, which means we try
to perform RAUW once per user. However, some of the users might be freed
by the RAUW operation, resulting in use-after-free.
The case where this happens is constant users where the replacement
might result in the destruction of the original constant.
Fixes https://github.com/llvm/llvm-project/issues/92991.
This reverts commit 8cc8e5d6c6ac9bfc888f3449f7e424678deae8c2.
This reverts commit dae55c89835347a353619f506ee5c8f8a2c136a7.
Causes major compile-time regressions for unoptimized builds.
Support the address selection for toc-data globals in fast isel. This
benefits instruction selection for fast-isel for toc data symbol for
example for load selection. This also aligns the code generation
with/without -mtocdata.
Prior to this patch, when using -fthinlto-index= the ObjCARCContractPass isn't run prior to CodeGen, and instruction selection fails on IR containing arc intrinsics. This patch is motivated by that usecase.
The pass was previously added in various places codegen is performed. This patch adds the pass to the default codegen pipepline, makes sure it bails immediately if no arc intrinsics are found, and removes the adhoc scheduling of the pass.
Co-authored-by: Nuri Amari <nuriamari@fb.com>
If we have an exit condition of the form IV <= Limit, we will first try
to convert it into IV < Limit+1 or IV-1 < Limit based on range info (in
icmp simplification). If that fails, we try to convert it to IV < Limit
+ 1 based on controlling exits in non-infinite loops.
However, if all else fails, we can still determine the exit count by
rewriting to ext(IV) < ext(Limit) + 1, where the zero/sign extension
ensures that the addition does not overflow.
Proof: https://alive2.llvm.org/ce/z/iR-iYd
This patch adds support for toc-data for 64-bit large code-model on AIX.
The sequence ADDIStocHA8/ADDItocL8 is used to access the data directly
from the TOC.
When emitting the instruction ADDIStocHA8, we check if the symbol has
toc-data attribute before creating a toc entry for it. When emitting the
instruction ADDItocL8, we use the LA8 instruction to load the address.
Perform bitcast lowering requires 64-bit to be native supported,
However this is not true on 32-bit targets. Explicitly require
64-bit target.
Fixes#92233
This allows us to handle cases where the constant has already been type legalized behind a bitcast
Despite calling ComputeKnownBits I'm not seeing any notable change in compile time.
String pool merging currently, for a reason that's not entirely clear to
me, tries to create GEP instructions instead of GEP constant expressions
when replacing constant references. It only uses constant expressions in
cases where this is required. However, it does not catch all cases where
such a requirement exists. For example, the landingpad catch clause has
to be a constant.
Fix this by always using the constant expression variant, which also
makes the implementation simpler.
Additionally, there are some edge cases where even replacement with a
constant GEP is not legal. The one I am aware of is the
llvm.eh.typeid.for intrinsic, so add a special case to forbid
replacements for it.
Fixes https://github.com/llvm/llvm-project/issues/88844.
Under some circumstance (library loaded with the main program), TLS
initial-exec model can be applied to local-dynamic access(es). We
could use some simple heuristic to decide the update at function level:
* If there is equal or less than a number of TLS local-dynamic access(es)
in the function, use TLS initial-exec model. (the threshold which default to
1 is controlled by hidden option)
If `LLVM_APPEND_VC_REV` is on, add the git revision to the `.file`
string. The revision can be set with `LLVM_FORCE_VC_REVISION`.
Before:
`.file "git_revision.cpp",,"LLVM version 19.0.0git"`
After:
`.file "git_revision.cpp",,"LLVM version 19.0.0git (LLVM_REVISION)"`
Add pre-commit MIR test for PR "[Promote Pseudo Opcode from 32-bit to
64-bit after eliminating the extsw instruction in PPCMIPeepholes
optimization](https://github.com/llvm/llvm-project/pull/85451)" which
fixes bug reported in the issue "[Inconsistent Output at -O1 and -O2
Optimization Levels on PowerPC64 Due to Complex Type Casting and Nested
Loop Structure](https://github.com/llvm/llvm-project/issues/71030)".