This adds CIR handling for the __builtin_flt_rounds and
__builtin_set_flt_rounds builtin functions. Because the LLVM dialect
does not have dedicated operations for these, I have chosen not to
implement them as operations in CIR either. Instead, we just call the
LLVM intrinsic.
This change introduces TableGen support for indicating CIR attributes
that require a CIRAttrToValue visitor, adds the new flag to all
attributes to which it applies, and replaces the explicit declarations
with the tablegen output.
This adds the handling for the case where the address of a static local
variable is used to initialize another static local. In this case, the
address of the first variable is emitted as a constant in the
initializer of the second variable.
When a VFS overlay YAML file contains malformed content such as tabs,
the YAML parser can produce KeyValueNode entries where `getKey` returns
nullptr. The VFS overlay parser then passes the nullptr to
`parseScalarString`, which then calls dyn_cast.
Switch to `dyn_cast_if_present` for the above callsites and a few more.
The Hexagon driver only checked for an explicit -pie flag when
constructing the link command, ignoring the toolchain's PIE default. For
linux-musl targets, isPIEDefault() returns true (via the Linux toolchain
base class), so the compiler generates PIC/PIE code (-pic-level 2
-pic-is-pie) but the linker never received -pie.
This mismatch caused LTO failures: without -pie the linker sets
Reloc::Static for the LTO backend, which generates GP-relative
(small-data) references that lld cannot resolve.
Use hasFlag() to respect the toolchain default, and guard the -pie
emission against -shared and -r (relocatable) modes.
From #185382
Lower `vduph_lane_f16` and `vduph_laneq_f16` to `cir::VecExtractOp`
Tests moved from `v8.2a-neon-instrinsics-generic.c` to a new CIR-enabled
test file.
I tried following from notes made in #185852 (BF16)
Add skip_tail_padding property to cir.copy to handle
potentially-overlapping
subobject copies directly, instead of falling back to cir.libc.memcpy.
When
set, the lowering uses the record's data size (excluding tail padding)
for
the memcpy length. This keeps typed semantics and promotability of
cir.copy.
Also fix CXXABILowering to preserve op properties when recreating
operations,
and expose RecordType::computeStructDataSize() for computing data size
of
padded record types.
Added vector intrinsics for
vshlq_n_s8
vshlq_n_s16
vshlq_n_s32
vshlq_n_s64
vshlq_n_u8
vshlq_n_u16
vshlq_n_u32
vshlq_n_u64
vshl_n_s8
vshl_n_s16
vshl_n_s32
vshl_n_s64
vshl_n_u8
vshl_n_u16
vshl_n_u32
vshl_n_u64
these cover all the vector intrinsics for constant shift
the method followed
1) the vectors for quad words are of the form `64x2`, `32x4`, `16x8`,
`8x16` and the shift is a constant value but for shift left we need both
of them to be vectors so we take the constant shift and convert it into
a vector of respective form, for `64x2` we convert the constant to
`64x2`, I have learnt that this process is also called **splat**
2) After splat we have that the lhs and rhs are of the same size hence
the shift left can be applied
3) There is one issue though, the ops[0] is not of the right size, for
quad words it falls back to the default int8*16 in the function, so I am
converting it to the required size using bit casting, `8x16` = `64x2` so
we can bitcast and get the vector array in the right form.
Wrote the test cases for all the intrinsics listed above
#185382
No real challenge to these, it is effectively a copy/paste of the
classic codegen as it just requires we properly emit the holding
variable. The rest falls out of the rest of our handling of variables.
If the pointer for a reference is constexpr-unknown, use the pointer
itself instead, instead of dereferencing it. Unfortunately, that means
constexpr-unknown pointers to reach a lot more places than before.
Use the canonical type when generating type strings to ensure sugared
(e.g. `AutoType`, `DecltypeType`) are resolved before calling
getFullyQualifiedType.
This will revert a few commits that were added to fix these assertions.
---------
Co-authored-by: Harald van Dijk <hdijk@accesssoftek.com>
Replace CIR_VisibilityAttr with
DefaultValuedProp<EnumProp<CIR_VisibilityKind>>
for global_visibility on GlobalOp and FuncOp. This removes the need for
custom
parse/print functions and simplifies callers to use direct enum values
instead
of wrapping/unwrapping VisibilityAttr.
When both outer and inner pack substitution indexes are present, we
should cache both. Otherwise we will have wrong cached result.
This is a regression fix so no release note.
Fixes https://github.com/llvm/llvm-project/issues/190169
Issue #178324 was actually fixed by #187755
We lost the "declaration does not declare anything" warning since the
regression was introduced, but that was because:
1) Since #78436 we treat __builtin_FUNCSIG in a dependent context
effectivelly as if it contained a template parameter.
2) Our decltype implementation treats eexpressions containing template
parameters as if they were completely opaque (but alas this goes against
the spec, which says in [temp.type]p4 this should be looking only at
type dependence).
3) Since the decltype is opaque, we don't know what lookup will find, so
we can't issue the warning because we don't know if we are going to end
up with a type or an expression.
Fixes#178324
It is generally inappropriate to pass -O flags to IRGen tests because
it makes them sensitive to optimizer behavior. #186548 makes a change to
optimizer behavior that would cause this test to fail without this change.
Reviewers:
Pull Request: https://github.com/llvm/llvm-project/pull/190417
After a68ae7b0cc0922b79114aabe8cf1ec8dc68524d7, calling `<<` with a
`std::string_view` gives:
```
clang/include/clang/Basic/Diagnostic.h:1319:8: error: use of overloaded operator '<<' is ambiguous (with operand types 'const StreamingDiagnostic' and 'const std::string_view')
1319 | DB << V;
| ~~ ^ ~
```
There are two related issues here. On the declaration/definition side,
we need to make sure the markings are conservative. Then on the caller
side, we need to make sure we don't access parameters that don't exist.
Fixes#187535.
Matrix loads and stores are accesses of their element types. Emit TBAA
nodes using their element type to allow more precise TBAA alias
analysis.
PR: https://github.com/llvm/llvm-project/pull/190029
Unify include and library paths by reusing common code to compute path
prefixes. First, determine the effective sysroot by choosing a
user-provided sysroot, "../target/<triple>", or "../target/hexagon",
in the order of precedence. Based on the sysroot, derive the standard
include path, C++ include path, and base library path.
Fix the default -L library paths so they are taken from the external
sysroot, when one specified. Previously, these paths were always
relative to the install directory and sysroot was ignored.
Remove certain locations from considerations, as there are never used
for the corresponding purpose in existing sysroots:
- fallback to install path, typically "../target/bin", as the base path
when other sysroot cannot be found;
- similarly, fallback to "../target/" for startup files;
- "../target/bin" for program paths as there are no program files in
current sysroots.
Other minor changes:
- use windows-correct path delimiting;
- enable hexagon-toolchain-linux.c test for windows hosts.
This updates the CIR constant emitter to use the correct destination
type when emitting a constant initializer for a structure that might be
initialized with non-prototyped function pointers. We were previously
using the type from whatever function declaration we had, but this may
not be the correct type.
This change also updates the `replaceUsesOfNonProtoTypeWithRealFunction`
to ignore global initializer uses, which do not need to be updated after
this change.
When a CIR op specifies a non-empty `llvmOp` field, the lowering
emitter now generates the `matchAndRewrite` body that converts the
result type and forwards all operands to the corresponding LLVM op.
This removes 27 boilerplate lowering patterns from LowerToLLVM.cpp.
Ops needing custom logic (FMaxNumOp/FMinNumOp for FastmathFlags::nsz)
override `llvmOp = ""` to retain hand-written implementations.
Also fixes llvmOp names (TruncOp -> FTruncOp, FloorOp -> FFloorOp)
and adds a diagnostic rejecting conflicting llvmOp + custom constructor.
Fix two bugs in CIR's handling of `[[no_unique_address]]` fields:
- Record layout: Use the base subobject type (without tail padding)
instead of the complete object type for [[no_unique_address]] fields,
allowing subsequent fields to overlap with tail padding.
- Field access: Insert bitcasts from the base subobject pointer to the
complete object pointer after cir.get_member for potentially-overlapping
fields, so downstream code sees the expected type.
- Zero-sized fields: Handle truly empty [[no_unique_address]] fields by
computing their address via byte offsets rather than cir.get_member,
since they have no entry in the record layout.
A known gap (CIR copies 8 bytes where OG copies 5 via
`ConstructorMemcpyizer`) is noted for follow-up.
This PR extracts the write to the in-memory module cache from within
`ASTWriter` into `CompilerInstance.` This brings it closer to other
module cache manipulations, making the ordering much more clear and
explicit.
Closes#189666 .
Fix incorrect printing and parsing of `cir.global` if
`global_visibility` attribute is present. Incorrect assembly format
```
(`` $global_visibility^)?
```
Resulted in keyword sticking to previous word and producing incorrect
cir like this:
```
cir.globalhidden external dso_local @hidden_var = #cir.int<10> : !s32i {alignment = 4 : i64} loc(#loc22)
cir.global "private"hidden internal dso_local @hidden_static_var = #cir.int<10> : !s32i {alignment = 4 : i64} loc(#loc24)
```
Using custom parser/printer that is used in `cir.func` parser fixes this
issue and makes printed/parsed attribute for functions and global values
consistent.
Also added tests for both global values and functions.
If a try block has a catch-all handler and one or more type-specific
catch handlers, we were failing to generate the null type specifier when
lowering from CIR to LLVM IR. This change fixes that problem.
Assisted-by: Cursor / claude-4.6-opus-high
This implements handling for cleanup of temporary variables with
automatic storage duration. This is a simplified implementation that
doesn't yet handle the possibility of exceptions being thrown within
this cleanup scope or the cleanup scope being inside a conditional
operation. Support for those cases will be added later.
Factor this similar to the ARM case for future
expansion. The difference being -mcpu is treated as
an alias for -mcpu instead of something separately
useful.
I don't understand this mutation of the triple into
spirv64. The only test where this appears to matter
does not use -mcpu. Previously this would only match
for -mcpu, but this would change the behavior to prefer
-march before falling back to -mcpu.
The NYI diagnostic in getFunctionTypeForVTable showed up a few times in
testing, so this patch is attempting to fix that up.
The reproducer here is a function type for a vtable that has an
incomplete type in it(return or parameter). Classic codegen chooses to
represent this as an opaque type.
This patch instead removes the special v-table handling here, so that we
can instead just represent the types as incomplete record types.
At the moment, this patch ends up lowering incomplete types as 'empty'
types in LLVM-IR, which we may find we need to modify in the future,
however at the moment, it seems to work.
This patch ALSO changes the definition of RecordType::isSized to only be
true for complete types, which prevents a number of other things from
attempting to add attributes/check the size of the type/etc, but those
are irrelevant for the purposes of vtable emission.
This is a pretty simple one, its just a type of decl-context. The actual
exporty-ness is handled on a per-declaration basis.
This patch just makes sure we emit them, as I suspect this will reveal
quite a bit more issues in module code I suspect.
When using `-Xclang -dump-tokens`, the lexer dump output is currently
difficult to read because the data are misaligned. The existing
implementation simply separates the token name, spelling, flags, and
location using `'\t'`, which results in inconsistent spacing.
For example, the current output looks like this on provided in this
patch example **(BEFORE THIS PR)**:
<img width="2936" height="632" alt="image"
src="https://github.com/user-attachments/assets/ad893958-6d57-4a76-8838-7fc56e37e6a7"
/>
# Changes
This small PR improves the readability of the token dump by:
+ Adding padding after the token name and after the spelling (the
padding amount was chosen empirically to produce good average
alignment).
+ Swapping the order of location and flags (since flags can take up a
lot of space and disrupt alignment).
The result is a more readable output **(AFTER THIS PR)**:
<img width="1470" height="315" alt="image"
src="https://github.com/user-attachments/assets/c24f24e5-a431-42cc-b5b6-232bac5c635e"
/>
This commit removes the class `SwitchNodeBuilder` because it just
obscured the logic of switch handling by hiding some parts of it in
another source file.