We can support these via libcalls in libgcc/compiler-rt or integer
operations for fneg/fabs/fcopysign. fp128 values will be passed in two
64-bit GPRs according to the psABI.
Supporting RV32 requires sret which is not supported by libcall handling
in LegalizerHelper.cpp yet. It doesn't call canLowerReturn.
For Frames, we prefer the inline notation for the brevity.
For PortableMemInfoBlock, we go through all member fields and print
out those that are populated.
* Add missing semantics to the `Semantics` enum.
* Move all documentation of the semantics to the header file.
* Also rename some functions for consistency.
If parts of a physical register for a given liverange, as assigned by
the register allocator, can be used to store other values not
represented by this liverange, then `LiveIntervals::addKillFlags`
normally avoids adding a kill flag on the use of this register
when the value's liverange ends.
However, if all the other regunits are artificial, then we can
still safely add the kill flag, since those parts of the register
can never be accessed independently.
The DAG has the same instructions: the signed and unsigned absolute
difference of it's input. For AArch64, they map to uabd and sabd for
Neon and SVE. The Neon and SVE instructions will require custom
patterns.
They are pseudo opcodes and are not imported by the IRTranslator. We
need combines to create them.
PowerPC, ARM, and AArch64 have native instructions.
/// i.e trunc(abs(sext(Op0) - sext(Op1))) becomes abds(Op0, Op1)
/// or trunc(abs(zext(Op0) - zext(Op1))) becomes abdu(Op0, Op1)
For GlobalISel, we are going to write the combines in MIR patterns.
see:
llvm/test/CodeGen/AArch64/abd-combine.ll
- [ ] combine into abd
- [ ] legalize and add td patterns
This consists of:
* Make these instructions part of FPMathOperator.
* Adjust bitcode/ir readers/writers to expect fast math flags on these
instructions.
* Make IRBuilder set the fast math flags on these instructions.
* Update langref and release notes.
* Update a bunch of tests. Some of these are due to InstCombineCasts
incorrectly adding fast math flags to fptrunc, which will be fixed in a
later patch.
The existing analysis was already a pimpl wrapper.
I have extracted legacy pass logic to a LDVImpl wrapper named
`LiveDebugVariables` which is the analysis::Result now. This controls
whether to activate the LDV (depending on `-live-debug-variables` and
DIsubprogram) itself.
The legacy and new analysis only construct the LiveDebugVariables.
VirtRegRewriter will test this.
This patch extends the PGO infrastructure with an option to prefer the
instrumentation of loop entry blocks.
This option is a generalization of
19fb5b467b,
and helps to cover cases where the loop exit is never executed.
An example where this can occur are event handling loops.
Note that change does NOT change the default behavior.
I want to use this function for GISel too so Type * is a better common
interface. All of the callers already convert EVT to Type * as needed
by calling lowering anyway.
This patch ensures that IndexedMemProfRecord::clear clears every field
of IndexedMemProfRecord.
This fix is not critical at the moment. The only use of this function
is in RecordWriterTrait::EmitData to release the memory we are done
with. That is, we never clear the data structure for the purpose of
reusing it.
This update enhances the implementation of structural hashing for global
variables, using their initial contents. Private global variables or
constants are often used for metadata, where their names are not unique.
This can lead to the creation of different hash results although they
could be merged by the linker as they are effectively identical.
- Refine the hashing of GlobalVariables for strings or certain
Objective-C metadata cases that have section names. This can be further
extended to other scenarios.
- Expose StructuralHash for GlobalVariable so that this API can be
utilized by MachineStableHashing, which is also employed in the global
function outliner.
This change significantly improves size reduction by an additional 1% on
the LLD binary when the global function outliner and merger are enabled
together. As discussed in the RFC
https://discourse.llvm.org/t/loh-conflicting-with-machineoutliner/83279/8?u=kyulee-com,
if we disable or relocate the LOH pass, the size impact could increase
to 4%.
This reverts commit acf3b1aa932b2237c181686e52bc61584a80a3ff.
Broke https://lab.llvm.org/buildbot/#/builders/76/builds/5002
tools/clang/lib/CodeGen/CMakeFiles/obj.clangCodeGen.dir/BackendUtil.cpp.o:(.toc+0x258): undefined reference to `vtable for llvm::DroppedVariableStatsIR'
This reverts commit 249755cedb17ffa707253edcef1a388f807caa35.
Broke https://lab.llvm.org/buildbot/#/builders/160/builds/9420
Note: This is test shard 99 of 154.
[==========] Running 2 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 1 test from DroppedVariableStatsMIR
[ RUN ] DroppedVariableStatsMIR.InlinedAt
--
exit: -11
Moved the IR unit test to the CodeGen folder to resolve linker errors:
`error: undefined reference to 'vtable for
llvm::DroppedVariableStatsIR'`
This patch is trying to reland
https://github.com/llvm/llvm-project/pull/115563
Introduce llvm::CmpPredicate, an abstraction over a floating-point
predicate, and a pack of an integer predicate with samesign information,
in order to ease extending large portions of the codebase that take a
CmpInst::Predicate to respect the samesign flag.
We have chosen to demonstrate the utility of this new abstraction by
migrating parts of ValueTracking, InstructionSimplify, and InstCombine
from CmpInst::Predicate to llvm::CmpPredicate. There should be no
functional changes, as we don't perform any extra optimizations with
samesign in this patch, or use CmpPredicate::getMatching.
The design approach taken by this patch allows for unaudited callers of
APIs that take a llvm::CmpPredicate to silently drop the samesign
information; it does not pose a correctness issue, and allows us to
migrate the codebase piece-wise.
Reverts llvm/llvm-project#110898
This change has caused a cyclic module dependency `fatal error: cyclic
dependency in module 'LLVM_Utils': LLVM_Utils ->
LLVM_Config_ABI_Breaking -> LLVM_Utils`. Reverting for now until we the
right fix.
This reapplies aba6bb0820b, which was reverted in 28e2a891210 due to bot
failures. It contains fixes to silence warnings for uncovered switches,
and for incorrect initializer-symbol handling on ELF and COFF.
Alternative for https://github.com/llvm/llvm-project/pull/113764
It builds on a minimalistic approach with the legality check in match
and a blind apply. The precise patterns are used for better compile-time
and modularity. It also moves the pattern check into combiner. While
unary_undef_to_zero and propagate_undef_any_op rely on custom C++ code
for pattern matching.
Is there a limit on the number of patterns?
G_ANYEXT of undef -> undef
G_SEXT of undef -> 0
G_ZEXT of undef -> 0
The combine is not a member of the post legalizer combiner for AArch64.
Test:
llvm/test/CodeGen/AArch64/GlobalISel/combine-cast.mir
SideEffectsOnly is a new jitlink::Scope value that corresponds to the
JITSymbolFlags::MaterializationSideEffectsOnly flag: Symbols with this scope
can be looked up (and form part of the initial interface of a LinkGraph) but
never actually resolve to an address (so can only be looked up with a
WeaklyReferencedSymbol lookup).
Previously ObjectLinkingLayer implicitly treated JITLink symbols as having this
scope, regardless of a Symbol's actual scope, if the
MaterializationSideEffectsOnly flag was set on the corresponding symbol in the
MaterializationResponsibility object. Using an explicit scope in JITLink for
this (1) allows JITLink plugins to identify and correctly handle
side-effects-only symbols, and (2) allows raw LinkGraphs to define
side-effects-only symbols without clients having to manually modify their
`MaterializationUnit::Interface`.
When `-import-declaration` option is enabled, declaration import is
supported for functions. https://github.com/llvm/llvm-project/pull/88024
has the context for this option.
This patch supports declaration import for global variables in
distributed ThinLTO. The motivating use case is to propagate `dso_local`
attribute of global variables across modules, to optimize global
variable access when a binary is built with
`-fno-direct-access-external-data`.
* With `-fdirect-access-external-data`, non thread-local global
variables will [have `dso_local`
attributes](fe3c23b439/clang/lib/CodeGen/CodeGenModule.cpp (L1730-L1746)).
This optimizes the global variable access as shown by
https://gcc.godbolt.org/z/vMzWcKdh3
This introduces `__builtin_hlsl_resource_getpointer`, which lowers to
`llvm.dx.resource.getpointer` and is used to implement indexing into
resources.
This will only work through the backend for typed buffers at this point,
but the changes to structured buffers should be correct as far as the
frontend is concerned.
Note: We probably want this to return a reference in the HLSL device
address space, but for now we're just using address space 0. Creating a
device address space and updating this code can be done later as
necessary.
Fixes#95956
Add an extra know to UnrollingPreferences to let backends control the
maximum budget for SCEV expansions.
This gives backends more fine-grained control on the cost of the runtime
checks for runtime unrolling.
PR: https://github.com/llvm/llvm-project/pull/118316
Again, this simplifies the semantic checks and lowering quite a bit.
Update the check for positive alignment to use a more informative
message, and to highlight the modifier itsef, not the whole clause.
Remove the checks for the allocator expression itself being positive:
there is nothing in the spec that says that it should be positive.
Remove the "simple" modifier from the AllocateT template, since both
simple and complex modifiers are the same thing, only differing in
syntax.
Reland: https://github.com/llvm/llvm-project/pull/110092
Buildbot was failing due to an unused variable warning as there were 2
implementations (with and without libedit). Compared to last version,
wrap the variable inside the `#ifdef HAVE_LIBEDIT`.
Apologies for the oversight @vgvassilev
fixes#112974
partially fixes#70103
An earlier version of this change was reverted so some issues could be fixed.
### Changes
- Added new tablegen based way of lowering dx intrinsics to DXIL ops.
- Added int_dx_group_memory_barrier_with_group_sync intrinsic in
IntrinsicsDirectX.td
- Added expansion for int_dx_group_memory_barrier_with_group_sync in
DXILIntrinsicExpansion.cpp`
- Added DXIL backend test case
### Related PRs
* [[clang][HLSL] Add GroupMemoryBarrierWithGroupSync intrinsic
#111883](https://github.com/llvm/llvm-project/pull/111883)
* [[SPIRV] Add GroupMemoryBarrierWithGroupSync intrinsic
#111888](https://github.com/llvm/llvm-project/pull/111888)