One of the main users of this kind of coroutine is Swift. There, yield-once (`retcon.once`) coroutines are used to temporarily "expose" pointers to internal fields of various objects, creating borrow scopes.
However, in some cases it would also be useful to let these coroutines produce a normal result, and there is currently no convenient way to represent this (in contrast to switched-resume coroutines, where C++ `co_return`
is transformed into a member / callback call on the promise object).
The extension is simple: we allow the continuation function to have a non-void result and accept optional extra arguments via a special `llvm.coro.end.result` intrinsic that essentially forwards them as normal results.
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.
This matches other nearby enums.
For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
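For downstream trees, the two substitutions above could be scripted roughly as follows. This is a minimal sketch, not a complete migration tool: the `migrate` helper and the `RENAMES` table are hypothetical names, and only the two patterns quoted in this message are covered, so other spellings may still need manual attention.

```python
import re

# Old spelling -> new spelling, taken from the two examples in the commit
# message. Anything beyond these two patterns is out of scope for this sketch.
RENAMES = [
    (re.compile(r"\bCodeGenOpt::"), "CodeGenOptLevel::"),
    (re.compile(r"\bCGFT_"), "CodeGenFileType::"),
]


def migrate(source: str) -> str:
    """Apply each rename in turn and return the rewritten source text."""
    for pattern, replacement in RENAMES:
        source = pattern.sub(replacement, source)
    return source


if __name__ == "__main__":
    before = "auto Opt = CodeGenOpt::Aggressive; auto FT = CGFT_ObjectFile;"
    print(migrate(before))
```

The patterns are anchored on a word boundary, so already-migrated code is left alone and running the script twice is harmless.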
I recently went to merge a PR that had a merge conflict:
$ gh pr merge --squash --delete-branch
X Pull request #66003 is not mergeable: the merge commit cannot be
cleanly created.
To have the pull request merged after all the requirements have been
met, add the `--auto` flag.
Run the following to resolve the merge conflicts locally:
gh pr checkout 66003 && git fetch origin main && git merge origin/main
This is how I resolved it; we should recommend this explicitly for
fellow contributors.
Make codegen emit correctly rounded sqrt by default.
Emit the fast but only kind of fast expansion in AMDGPUCodeGenPrepare
based on !fpmath, like the fdiv case. Hack around visitation ordering
problems from AMDGPUCodeGenPrepare using forward iteration instead of
a well behaved combiner.
https://reviews.llvm.org/D158129
The FileIndex values returned from GetFileInformationByHandle are
considered stable and uniquely identifying a file, as long as the
handle is open. When handles are closed, there are no guarantees
for their stability or uniqueness. On some file systems (such as
NTFS), the indices are documented to be stable even across handles.
But with some file systems, in particular network mounts, file
indices can be reused very soon after handles are closed.
When such file indices are used for LLVM's UniqueID, files are
considered duplicates as soon as the filesystem driver happens to
have used the same file index for the handle used to inspect the
file. This caused widespread, non-obvious (seemingly random)
breakage. This can happen e.g. if running on a directory that is
shared via Remote Desktop or VirtualBox.
To avoid the issue, use a hash of the canonicalized path for the
file as unique identifier, instead of using FileIndex.
This fixes https://github.com/llvm/llvm-project/issues/61401 and
https://github.com/llvm/llvm-project/issues/22079.
Performance-wise, this adds (usually) one extra call to
GetFinalPathNameByHandleW for each call to getStatus(). A test
case such as running clang-scan-deps becomes around 1% slower
because of this, which is considered tolerable.
Change the equivalent() function to use getUniqueID instead of
checking individual file_status fields. The
equivalent(Twine,Twine,bool& result) function calls status() on
each path successively, without keeping the file handles open,
which also is prone to such false positives. This also gets rid
of checks of other superfluous fields in the
equivalent(file_status, file_status) function - the unique ID of
a file should be enough (that is what is done for Unix anyway).
This comes with one known caveat: For hardlinks, each name for
the file now gets a different UniqueID, and equivalent() considers
them different. While that's not ideal, occasional false negatives
for equivalent() are usually not that fatal (the cases where we strictly
do need to deduplicate files with different path names are quite
rare) compared to the issues caused by false positives for
equivalent() (where we'd deduplicate and omit totally distinct files).
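The idea of deriving the identifier from the canonical path can be sketched as below. This is an illustration only: it uses `os.path.realpath` as a stand-in for the path returned by GetFinalPathNameByHandleW, and the choice of SHA-256 truncated to 64 bits is an assumption, not the hash used by the patch. The names `unique_id` and `equivalent` mirror the text but are hypothetical Python helpers.

```python
import hashlib
import os


def unique_id(path: str) -> int:
    """Return a 64-bit identifier derived from the canonical form of `path`."""
    # realpath resolves symlinks and ".." components; normcase folds case on
    # platforms whose paths are case-insensitive (it is a no-op elsewhere).
    canonical = os.path.normcase(os.path.realpath(path))
    digest = hashlib.sha256(os.fsencode(canonical)).digest()
    # Keep the first 8 bytes so the result fits in an unsigned 64-bit value.
    return int.from_bytes(digest[:8], "little")


def equivalent(a: str, b: str) -> bool:
    """Treat two paths as the same file when their canonical-path hashes match."""
    return unique_id(a) == unique_id(b)
```

Two spellings of the same path compare equal, while two hard-link names for one file canonicalize to different paths and therefore compare unequal, which is the caveat described above.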
The FileIndex is documented to be stable on NTFS though, so ideally
we could maybe have used it in the majority of cases. That would
require a heuristic for whether we can rely on FileIndex or not.
We considered using the existing function is_local_internal for that;
however that caused an unacceptable performance regression
(clang-scan-deps became 38% slower in one test, even more than that
in another test).
Differential Revision: https://reviews.llvm.org/D155579
GNU readelf introduced --extra-sym-info/-X to display the section name
for --syms (https://sourceware.org/PR30684). Port the feature, which is
currently llvm-readelf only.
For STO_AARCH64_VARIANT_PCS/STO_RISCV_VARIANT_PCS, the Ndx and Name
columns may not be aligned.
The only real requirement is that entry and loop intrinsics should not
be preceded by convergent operations in the same basic block. They do
not need to be the first in the block.
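As a toy model of the relaxed rule, a basic block can be represented as a list of `(name, is_convergent)` pairs and checked in one pass. The list representation and the `check_block` helper are hypothetical and exist only for illustration; this is not how the IR verifier is written.

```python
# Intrinsics the rule applies to.
ENTRY_OR_LOOP = {
    "llvm.experimental.convergence.entry",
    "llvm.experimental.convergence.loop",
}


def check_block(block):
    """Return True if no entry/loop intrinsic follows a convergent operation.

    `block` is a list of (name, is_convergent) tuples in program order.
    """
    seen_convergent = False
    for name, is_convergent in block:
        if name in ENTRY_OR_LOOP and seen_convergent:
            return False  # preceded by a convergent operation in this block
        if is_convergent:
            seen_convergent = True
    return True
```

Under this model a non-convergent instruction (for example a COPY or an add) ahead of the intrinsic is accepted, while a convergent call ahead of it is rejected.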
Relaxing the constraint on the entry and loop intrinsics avoids having
to make changes in the construction of LLVM IR, such as
getFirstInsertionPt(). It also avoids added complexity in the lowering
to Machine IR, where COPY instructions may be added to the start of the
basic block.
Currently s_getreg_b32 is missing the possible mode use. Really we
need separate pseudos for mode-only accesses, but leave this as a
pre-existing issue.
https://reviews.llvm.org/D152710
A field `FilterClassField` is added to `GenericTable` class, which
is an optional bit field of `FilterClass`. If specified, only those
records with this field being true will have corresponding entries
in the table.
- For a long time I assumed that `inbounds` means "in-bounds of a *live*
allocation". @nikic told me that is not correct. I think this definitely
needs clarification in the docs.
- The point about successively adding the offsets to the current address
confused me because it talked about the successive addition of "an"
offset -- which one? My interpretation was, the total accumulated offset
computed in the previous step. But @nikic told me that's not correct,
adding each offset individually has to stay in-bounds for each step. I
hope by saying "each offset" this becomes more clear; I then also changed
the previous bullet to use the same terminology.
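The difference between the two readings can be shown with a simplified model in which addresses are byte offsets from the start of one allocation and the one-past-the-end address counts as in bounds. Both helper names are hypothetical, and the model ignores everything else about `getelementptr` semantics.

```python
def accumulated_only_ok(start, alloc_size, offsets):
    """The rejected reading: only the final address has to be in bounds."""
    final = start + sum(offsets)
    return 0 <= start <= alloc_size and 0 <= final <= alloc_size


def gep_inbounds_ok(start, alloc_size, offsets):
    """The clarified reading: every intermediate address must be in bounds."""
    current = start
    if not 0 <= current <= alloc_size:
        return False
    for offset in offsets:
        current += offset  # add each offset individually
        if not 0 <= current <= alloc_size:
            return False
    return True
```

For an 8-byte allocation and offsets `[16, -12]`, the final address is 4 bytes in, so the first check passes, but the intermediate address at 16 is out of bounds, so the second check fails.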
Some people may not have access to `gh` or may prefer to use `git` and
the GitHub web interface to make a PR. This patch adds an example of
making a PR using this approach.
Adds a new feature to MIR patterns: builtin instructions.
They offer some additional capabilities that currently cannot be expressed without falling back to C++ code.
There are two builtins added with this patch, but more can be added later as new needs arise:
- GIReplaceReg
- GIEraseRoot
Depends on D158714, D158713
Reviewed By: arsenm, aemerson
Differential Revision: https://reviews.llvm.org/D158975
We currently have log, log2, log10, exp and exp2 intrinsics. Add exp10
to fix this asymmetry. AMDGPU already has most of the code for f32
exp10 expansion implemented alongside exp, so the current
implementation is duplicating nearly identical effort between the
compiler and library which is inconvenient.
https://reviews.llvm.org/D157871
This adds a first version of a GitHub workflow to the documentation and marks some
sections as deprecated. We should clean up these sections ASAP. I was
just keen to get something on the documentation site as soon as
possible.
Enable color highlighting of disassembly in llvm-objdump. This patch
introduces a new flag --disassembler-color=<mode> that enables or
disables highlighting disassembly with ANSI escape codes. The default
mode is to enable color highlighting if outputting to a color-enabled
terminal.
Differential revision: https://reviews.llvm.org/D159224
Add support for syntax highlighting assembly. The patch introduces a new
RAII helper called WithMarkup that takes care of both emitting colors
and markup annotations. It makes adding markup easier and ensures colors
and annotations remain consistent.
This patch adopts the new helper in the AArch64 backend. If your backend
already uses markup annotations, adoption is as easy as using the new
MCInstPrinter::markup overload.
Differential revision: https://reviews.llvm.org/D159162
The MatchTable-based GlobalISel Combiner backend is the new default. There are no in-tree users left of the old backend.
- Removed implementation of old MatchDAG-based Combiner, including tests, the backend itself and all supporting code.
- Renamed MatchTable backend to `GlobalISelCombinerEmitter.cpp` + removed "-matchtable" from its CL option.
  - No need to have a verbose name as it's the only backend left now.
Reviewed By: aemerson
Differential Revision: https://reviews.llvm.org/D158710
There are really two rounding modes, so only return the standard
values if both modes are the same. Otherwise, return a bitmask
representing the two modes.
Annoyingly the register doesn't use the same values as FLT_ROUNDS. Use
a simple integer table we can shift into to convert.
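The "integer table we can shift into" technique can be sketched for a single rounding-mode field as below. The hardware encoding used here (0 = nearest even, 1 = +inf, 2 = -inf, 3 = toward zero) is an assumption made for illustration, and the sketch does not cover the two-mode bitmask case described above.

```python
# Assumed hardware encoding of one rounding-mode field (illustrative only).
HW_NEAREST_EVEN, HW_PLUS_INF, HW_MINUS_INF, HW_TOWARD_ZERO = 0, 1, 2, 3

# FLT_ROUNDS values as defined by C.
FLT_TOWARD_ZERO, FLT_NEAREST, FLT_PLUS_INF, FLT_MINUS_INF = 0, 1, 2, 3

BITS_PER_ENTRY = 2

_MAPPING = {
    HW_NEAREST_EVEN: FLT_NEAREST,
    HW_PLUS_INF: FLT_PLUS_INF,
    HW_MINUS_INF: FLT_MINUS_INF,
    HW_TOWARD_ZERO: FLT_TOWARD_ZERO,
}

# Pack every answer into one integer: entry i occupies bits [2*i, 2*i + 1].
TABLE = sum(flt << (hw * BITS_PER_ENTRY) for hw, flt in _MAPPING.items())


def hw_to_flt_rounds(hw_mode: int) -> int:
    """Shift the packed table right by the entry index and mask the result."""
    return (TABLE >> (hw_mode * BITS_PER_ENTRY)) & 0b11
```

The conversion becomes a shift and a mask on a constant, with no branches or memory lookups.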
https://reviews.llvm.org/D153158
If llvm-symbolizer finds a malformed command, it echoes it to the
standard output. New versions of binutils (starting from 2.39) allow
specifying an address by a symbol name. Implementing this feature in
llvm-symbolizer makes the current reaction to invalid input
inappropriate. Almost any invalid command may be treated as a symbol
name, so the right reaction in such a case should be "symbol not found".
The exceptions are commands that are recognized but have incorrect
syntax, like "FILE:FILE:". The utility must produce a descriptive
diagnostic for such input and route it to stderr.
This change implements the new reaction to invalid input and is a
prerequisite for implementing symbol lookup in llvm-symbolizer.
Differential Revision: https://reviews.llvm.org/D157210
This patch and D156954 were discussed in
<https://discourse.llvm.org/t/rfc-improving-lits-debug-output/72839>.
**Motivation**: -a shows output from all tests, and -v shows output
from just failed tests. Without this patch, that output from each
test includes a section called "Script:", which includes all shell
commands that lit has computed from RUN directives and will attempt to
run for that test. The effect of -vv (which also implies -v if
neither -a nor -v is specified) is to extend that output with shell
commands as they are executing so you can easily see which one failed.
For example, when using lit's internal shell and -vv:
```
Script:
--
: 'RUN: at line 1'; echo hello world
: 'RUN: at line 2'; 3c40 hello world
: 'RUN: at line 3'; echo hello world
--
Exit Code: 127
Command Output (stdout):
--
$ ":" "RUN: at line 1"
$ "echo" "hello" "world"
hello world
$ ":" "RUN: at line 2"
$ "3c40" "hello" "world"
'3c40': command not found
error: command failed with exit status: 127
--
```
Notice that all shell commands that actually execute appear in the
output twice, once for "Script:" and once for -vv. Especially for
tests with many RUN directives, the result is noisy. When searching
through the output for a particular shell command, it is easy to get
lost and mistake shell commands under "Script:" for shell commands
that actually executed.
**Change**: With this patch, a test's output changes in two ways.
First, the "Script:" section is never shown. Second, omitting -vv no
longer disables printing of shell commands as they execute. That is,
-a and -v imply -vv, and so -vv is deprecated as it is just an alias
for -v.
**Secondary motivation**: We are also working to introduce a PYTHON
directive, which can appear between RUN directives. How should PYTHON
directives be represented in the "Script:" section, which has
previously been just a shell script? We could probably think of
something, but adding info about PYTHON directive execution in the -vv
trace seems more straight-forward and more useful.
(This patch also removes a confusing point in the -vv documentation:
at least when using bash as an external shell, -vv echoes commands to
the shell's stderr not stdout.)
Reviewed By: awarzynski, Endill, ldionne, MaskRay
Differential Revision: https://reviews.llvm.org/D154984
This provides a uniform way to lower into the relevant instructions across all generations.
Differential Revision: https://reviews.llvm.org/D158468
Change-Id: I1f7ba4b15ee470738535cf1c7d177a11fc471e43
There is no vp.fpclass support after FCLASS_VL (D151176); this patch adds support for vp.fpclass.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D152993
The patch adds parser, MCExpr, and emitter support for the authenticated
pointer auth relocation.
In assembly, this is expressed using:
.quad <symbol>@AUTH(<key>, <discriminator> [, addr])
For example:
.quad _g3@AUTH(ib, 1234, addr)
The optional 'addr' specifier represents whether the generated pointer
authentication code will also include address diversity (by blending the
address of the storage location of the relocated pointer with the
user-specified constant discriminator).
The @AUTH expression lowers to R_AARCH64_AUTH_ABS64 ELF relocation.
The signing schema is encoded in the place of relocation to be applied
as follows:
```
| 63 | 62 | 61:60 | 59:48 | 47:32 | 31:0 |
| ----------------- | -- | ----- | ----- | ------------- | ------ |
| address diversity | 0 | key | 0 | discriminator | addend |
```
See the following for details:
https://github.com/ARM-software/abi-aa/blob/main/pauthabielf64/pauthabielf64.rst#static-relocations
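The bit layout in the table above can be exercised with a small encoder and decoder. This is a sketch under assumptions: the key numbering (`ia`=0, `ib`=1, `da`=2, `db`=3) is assumed rather than taken from this message, the addend is simply masked to its 32-bit field, and both function names are hypothetical.

```python
# Assumed numbering of the pointer authentication keys (illustrative).
KEYS = {"ia": 0, "ib": 1, "da": 2, "db": 3}


def encode_auth_schema(key, discriminator, addr_diversity=False, addend=0):
    """Pack the signing schema into the 64-bit place of the relocation."""
    if not 0 <= discriminator <= 0xFFFF:
        raise ValueError("discriminator must fit in 16 bits")
    return (
        (int(addr_diversity) << 63)   # bit 63: address diversity
        | (KEYS[key] << 60)           # bits 61:60: key (bit 62 stays 0)
        | (discriminator << 32)       # bits 47:32: discriminator
        | (addend & 0xFFFFFFFF)       # bits 31:0: addend
    )


def decode_auth_schema(value):
    """Split a 64-bit place back into its schema fields."""
    return {
        "addr_diversity": bool((value >> 63) & 1),
        "key": (value >> 60) & 0b11,
        "discriminator": (value >> 32) & 0xFFFF,
        "addend": value & 0xFFFFFFFF,
    }
```

With these assumptions, the example `.quad _g3@AUTH(ib, 1234, addr)` with a zero addend corresponds to `encode_auth_schema("ib", 1234, True)`.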
Differential Revision: https://reviews.llvm.org/D156505
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
Co-authored-by: Peter Collingbourne <peter@pcc.me.uk>
The change introduces intrinsics 'get_fpmode', 'set_fpmode' and
'reset_fpmode'. They manage all target dynamic floating-point control
modes, which include, for instance, rounding direction, precision,
treatment of denormals and so on. The intrinsics do the same
operations as the C library functions 'fegetmode' and 'fesetmode'. By
default they are lowered to calls to these functions.
Two main use cases are supported by this implementation.
1. Local modification of the control modes. In this case the code
usually has a pattern (in pseudocode):
    saved_modes = get_fpmode()
    set_fpmode(<new_modes>)
    ...
    <do operations under the new modes>
    ...
    set_fpmode(saved_modes)
In the case when it is known that the current FP environment is default,
the code may be shorter:
    set_fpmode(<new_modes>)
    ...
    <do operations under the new modes>
    ...
    reset_fpmode()
Such patterns appear not only in user code but also in implementations
of various FP controlling pragmas. In particular, the implementation of
`#pragma STDC FENV_ROUND` requires similar code if the target does not
support static rounding mode.
2. Portable control of FP modes. Usually FP control modes are set by
writing to some control register. Different targets lay out this
register differently, and the way the register is accessed may also
differ. Using a set of target-specific definitions for the control
register bits together with these intrinsic functions provides a
reasonably portable way to handle control modes across a wide range of
hardware.
This change defines only the LLVM intrinsic functions, which implement the
access required for the aforementioned use cases.
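The save / set / restore pattern from the first use case can be illustrated by analogy. Python has no binding for `fegetmode` or `fesetmode`, so this sketch swaps in the rounding mode of the `decimal` module's context as a stand-in for the hardware control modes; `local_modes` and `round_to_integer_under` are hypothetical helpers.

```python
import decimal
from contextlib import contextmanager


@contextmanager
def local_modes(new_rounding):
    """Run a block under a different rounding mode, then restore the old one."""
    context = decimal.getcontext()
    saved = context.rounding          # analogous to saved_modes = get_fpmode()
    context.rounding = new_rounding   # analogous to set_fpmode(<new_modes>)
    try:
        yield
    finally:
        context.rounding = saved      # analogous to set_fpmode(saved_modes)


def round_to_integer_under(mode, text):
    """Round the decimal string `text` to an integer under rounding `mode`."""
    with local_modes(mode):
        return decimal.Decimal(text).quantize(decimal.Decimal("1"))
```

Restoring in a `finally` clause keeps the saved modes in effect even if the guarded operations raise, which is the property the pseudocode relies on.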
Differential Revision: https://reviews.llvm.org/D82525
We no longer allow calls to functions with the `amdgpu_gfx` calling
convention from functions with the `amdgpu_cs_chain_preserve` calling
convention. See D153517.
Also mention that we can't have a chain call from
amdgpu_cs_chain_preserve using more VGPRs than it has received.
Differential Revision: https://reviews.llvm.org/D156408
I had to tighten the restrictions on PatFrags a bit to make it consistent: instructions that
define the root of a PF can only have one def.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D157700
This is a GSoC 2023 project ([discourse link](https://discourse.llvm.org/t/coverage-support-a-hierarchical-directory-structure-in-generated-coverage-html-reports/68239)).
llvm-cov currently generates a single top-level index HTML file, which causes rendering scalability issues in large projects. This patch adds support for a hierarchical directory structure in the HTML reports to solve these scalability issues by introducing the following changes:
- Added a new command line option `--show-directory-coverage` for `llvm-cov show`. It works both for `--format=html` and `--format=text`.
- Two new classes, `CoveragePrinterHTMLDirectory` and `CoveragePrinterTextDirectory`, were added to support the new option.
- A tool class `DirectoryCoverageReport` was added to support the two classes above.
- Updated the document.
- Added a new regression test for `--show-directory-coverage`.
Reviewed By: phosek, gulfem
Differential Revision: https://reviews.llvm.org/D151283