so that we can remove the global `ctx` from toString implementations.
Rename lld::toString (to lld:🧝:toStr) to simplify name lookup (we
have many llvm::toString and another lld::toString(const llvm::opt::Arg
&)).
isObviouslySafeToFold can look between a load and an instruction it can be
folded into, to check that no other memory operations prevents the fold. It
doesn't handle multiple basic blocks which we needs to guard against.
Set the iterator category for graph iterators to input_iterator_tag when
the visited set is stored externally. In that case we can't provide
multi-pass guarantee, so we should not claim to be a forward iterator.
Fixes: #116400
Recently GitHub changed something on their side so we no longer can
rebase release PR's with the API. This means that we now have to
manually rebase the PR locally and then push the results. This fixes
the script that I use to merge PRs to the release branch by changing
the rebase part to do the local rebase and also adds a new option
--rebase-only so that you can rebase the PRs easier.
Minor change is that the script now can take a URL to the pull request
as well as just the PR ID.
This commit adds support for column breakpoints to lldb-dap.
To do so, support for the `breakpointLocations` request was
added. To find all available breakpoint positions, we iterate over
the line table.
The `setBreakpoints` request already forwarded the column correctly to
`SBTarget::BreakpointCreateByLocation`. However, `SourceBreakpointMap`
did not keep track of multiple breakpoints in the same line. To do so,
the `SourceBreakpointMap` is now indexed by line+column instead of by
line only.
See http://jonasdevlieghere.com/post/lldb-column-breakpoints/ for a
high-level introduction to column breakpoints.
Update the documentation for the noalias attribute, !alias.scope and
!loop.parallel_accesses metadata to clarify they are UB on voilation the
noalias property.
PR: https://github.com/llvm/llvm-project/pull/116220
---------
Co-authored-by: Nuno Lopes <nuno.lopes@tecnico.ulisboa.pt>
This PR is one of 3 in a PR stack, this is the primary change set which
seeks to extend the current derived type explicit member mapping support
to handle descriptor member mapping at arbitrary levels of nesting. The
PR stack seems to do this reasonably (from testing so far) but as you
can create quite complex mappings with derived types (in particular when
adding allocatable derived types or arrays of allocatable derived types)
I imagine there will be hiccups, which I am more than happy to address.
There will also be further extensions to this work to handle the
implicit auto-magical mapping of descriptor members in derived types and
a few other changes planned for the future (with some ideas on
optimizing things).
The changes in this PR primarily occur in the OpenMP lowering and the
OMPMapInfoFinalization pass.
In the OpenMP lowering several utility functions were added or extended
to support the generation of appropriate intermediate member mappings
which are currently required when the parent (or multiple parents) of a
mapped member are descriptor types. We need to map the entirety of these
types or do a "deep copy" for lack of a better term, where we map both
the base address and the descriptor as without the copying of both of
these we lack the information in the case of the descriptor to access
the member or attach the pointers data to the pointer and in the latter
case we require the base address to map the chunk of data. Currently we
do not segment descriptor based derived types as we do with regular
non-descriptor derived types, we effectively map their entirety in all
cases at the moment, I hope to address this at some point in the future
as it adds a fair bit of a performance penalty to having nestings of
allocatable derived types as an example. The process of mapping all
intermediate descriptor members in a members path only occurs if a
member has an allocatable or object parent in its symbol path or the
member itself is a member or allocatable. This occurs in the
createParentSymAndGenIntermediateMaps function, which will also generate
the appropriate address for the allocatable member within the derived
type to use as a the varPtr field of the map (for intermediate
allocatable maps and final allocatable mappings). In this case it's
necessary as we can't utilise the usual Fortran::lower functionality
such as gatherDataOperandAddrAndBounds without causing issues later in
the lowering due to extra allocas being spawned which seem to affect the
pointer attachment (at least this is my current assumption, it results
in memory access errors on the device due to incorrect map information
generation). This is similar to why we do not use the MLIR value
generated for this and utilise the original symbol provided when mapping
descriptor types external to derived types. Hopefully this can be
rectified in the future so this function can be simplified and more
closely aligned to the other type mappings. We also make use of
fir::CoordinateOp as opposed to the HLFIR version as the HLFIR version
doesn't support the appropriate lowering to FIR necessary at the moment,
we also cannot use a single CoordinateOp (similarly to a single GEP) as
when we index through a descriptor operation (BoxType) we encounter
issues later in the lowering, however in either case we need access to
intermediate descriptors so individual CoordinateOp's aid this
(although, being able to compress them into a smaller amount of
CoordinateOp's may simplify the IR and perhaps result in a better end
product, something to consider for the future).
The other large change area was in the OMPMapInfoFinalization pass,
where the pass had to be extended to support the expansion of box types
(or multiple nestings of box types) within derived types, or box type
derived types. This requires expanding each BoxType mapping from one
into two maps and then modifying all of the existing member indices of
the overarching parent mapping to account for the addition of these new
members alongside adjusting the existing member indices to support the
addition of these new maps which extend the original member indices (as
a base address of a box type is currently considered a member of the box
type at a position of 0 as when lowered to LLVM-IR it's a pointer
contained at this position in the descriptor type, however, this means
extending mapped children of this expanded descriptor type to
additionally incorporate the new member index in the correct location in
its own index list). I believe there is a reasonable amount of comments
that should aid in understanding this better, alongside the test
alterations for the pass.
A subset of the changes were also aimed at making some of the utilities
for packing and unpacking the DenseIntElementsAttr containing the member
indices shareable across the lowering and OMPMapInfoFinalization, this
required moving some functions to the Lower/Support/Utils.h header, and
transforming the lowering structure containing the member index data
into something more similar to the version used in
OMPMapInfoFinalization. There we also some other attempts at tidying
things up in relation to the member index data generation in the
lowering, some of which required creating a logical operator for the
OpenMP ID class so it can be utilised as a map key (it simply utilises
the symbol address for the moment as ordering isn't particularly
important).
Otherwise I have added a set of new tests encompassing some of the
mappings currently supported by this PR (unfortunately as you can have
arbitrary nestings of all shapes and types it's not very feasible to
cover them all).
This is one of 3 PRs in a PR stack that aims to add support for explicit
mapping of allocatable members in derived types.
The primary changes in this PR are the OpenMPToLLVMIRTranslation.cpp
changes, which are small and seek to alter the current member mapping to
add an additional map insertion for pointers. Effectively, if the member
is a pointer (currently indicated by having a varPtrPtr field) we add an
additional map for the pointer and then alter the subsequent mapping of
the member (the data) to utilise the member rather than the parents base
pointer. This appears to be necessary in certain cases when mapping
pointer data within record types to avoid segfaulting on device (due to
incorrect data mapping). In general this record type mapping may be
simplifiable in the future.
There are also additions of tests which should help to showcase the
affect of the changes above.
This PR is one in a series of 3 that aim to add support for explicit
member mapping of allocatable components in derived types within
OpenMP+Fortran for Flang.
This PR provides all of the runtime tests that are currently
upstreamable, unfortunately some of the other tests would require
linking of the fortran runtime for offload which we currently do not do.
But regardless, this is plenty to ensure that the mapping is working in
most cases.
The corresponding enum members were only used by `EmitMOPS`, which
immediately translated them to machine opcodes. Just pass the machine
opcodes instead.
This is in preparation for adding a new optimization to the pass that
cares about the order of instructions. The existing optimization does
not care, so this just causes minor codegen differences.
Instead respect what the toolchain default is (or what the user sets via
CMAKE_CXX_FLAGS).
This fixes builds with libcxx, with mingw toolchains targeting
msvcrt.dll, after 5d8be4c036aa5ce4a94f1f37a9155d5c877e23db; after that
commit, the libcxx public headers reference symbols such as iswspace_l,
which are unavailable when targeting msvcrt.dll on older versions of
Windows (it's only available in msvcrt.dll since Windows Vista).
into zext x
LegalizerHelper has two padding strategies: undef or zero.
see LegalizerHelper:273
see LegalizerHelper:315
This PR is about zero sugar and Coke Zero.
; CHECK-NEXT: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES %a(s32),
[[C]](s32)
Please continue padding merge values.
// %bits_8_15:(s8) = G_CONSTANT i8 0
// %0:(s16) = G_MERGE_VALUES %bits_0_7:(s8), %bits_8_15:(s8)
%bits_8_15 is defined by zero. For optimization, we pick zext.
// %0:_(s16) = G_ZEXT %bits_0_7:(s8)
The upper bits of %0 are zero and the lower bits come from %bits_0_7.
The runtime Assign function is not meant to initialize an array from a
scalar. For that we need to use DoAssignFromSource. Update the data
transfer from scalar to descriptor to use a new entry point that use
this function underneath.
In order to support f16 load/store we need to make load/stores with s16
register type legal. If regbank selection doesn't pick the FPR bank,
we'll be left with a GPR load or store which we don't have isel patterns
for from SelectionDAG.
In order to add the patterns we need to make i16 a legal type for the
GPR register class.
Tests are currently disabling the legality check because I haven't
update the legalizer yet.
Up until now we could only support packing of scalar elements. This
patch fixes this by implementing packing of vector elements, by
generating extractelement and insertelement instruction pairs.
This patch removes MemProf format Version 0 now that version 2 and 3
seem to be working well.
I'm not touching version 1 for now because some tests still rely on
version 1.
Note that Version 0 is identical to Version 1 except that the MemProf
section of the indexed format has a MemProf version field.