There're four possible formats to refer a register in inline assembly,
1. Numeric name without dollar sign ("f0")
2. Numeric name with dollar sign ("$f0")
3. ABI name without dollar sign ("fa0")
4. ABI name with dollar sign ("$fa0")
LoongArch GCC accepts 1 and 2 for FPRs before r15-8284[1] and all these
formats after the chagne. But Clang supports only 2 and 4 for FPRs. The
inconsistency has caused compatibility issues, such as QEMU's case[2].
This patch follows 0bbf3ddf5fea ("[Clang][LoongArch] Add GPR alias
handling without `$` prefix") and accepts FPRs without dollar sign
prefixes as well to keep aligned with GCC, avoiding future compatibility
problems.
Link:
https://gcc.gnu.org/cgit/gcc/commit/?id=d0110185eb78f14a8e485f410bee237c9c71548d
[1]
Link:
https://lore.kernel.org/qemu-devel/20250314033150.53268-3-ziyao@disroot.org/
[2]
There is a lot of redundant code that needs to be modified when new
Hexagon versions are added. Reduce the amount of this redundancy.
- compute ELF flags and attributes based on version feature names;
- simplify EnableHVX option handling by using arch features instead of
arch version enums;
- simplify completeHVXFeatures() by using features;
- delete several unused or redundant functions and constants:
isCPUValid, getCpu, getHexagonCPUSuffix;
- do not set HexagonArchVersion in initializeSubtargetDependencies, it
is set in ParseSubtargetFeatures;
Signed-off-by: Alexey Karyakin <akaryaki@quicinc.com>
Introduce a type alias for the commonly used `std::pair<FileID,
unsigned>` to improve code readability, and make it easier for future
updates (64-bit source locations).
Support the following packed BCD builtins for PowerPC.
```
__builtin_national2packed - Conversion of National format to Packed decimal format.
__builtin_packed2national - Conversion of Packed decimal format to national format.
__builtin_packed2zoned - Conversion of Packed decimal format to Zoned decimal format.
__builtin_zoned2packed - Conversion of Zoned decimal format to Packed decimal format.
```
### Prototypes:
`vector unsigned char __builtin_national2packed(vector unsigned char a,
unsigned char b);`
`vector unsigned char __builtin_packed2zoned(vector unsigned char,
unsigned char);`
`vector unsigned char __builtin_zoned2packed(vector unsigned char,
unsigned char);`
The condition for the 2nd parameter is consistent over all the 3
prototypes (0 or 1 only).
`vector unsigned char __builtin_packed2national(vector unsigned char);`
Co-authored-by: himadhith <himadhith.v@ibm.com>
Co-authored-by: Tony Varghese <tonypalampalliyil@gmail.com>
Implement parsing and semantic analysis support for the optional
'strict' modifier of the num_threads clause. This modifier has been
introduced in OpenMP 6.0, section 12.1.2.
Note: this is basically 1:1 https://reviews.llvm.org/D138328.
ArrayRef has a constructor that accepts std::nullopt. This
constructor dates back to the days when we still had llvm::Optional.
Since the use of std::nullopt outside the context of std::optional is
kind of abuse and not intuitive to new comers, I would like to move
away from the constructor and eventually remove it.
This patch takes care of the clang side of the migration.
1. The PR proceeds with a backend target hook to allow front-ends to
determine what target features are available in a compilation based on
the CPU name.
2. Fix a backend target feature bug that supports HTM for
Power8/9/10/11. However, HTM is only supported on Power8/9 according to
the ISA.
3. All target features that are hardcoded in PPC.cpp can be retrieved
from the backend target feature. I have double-checked that the
hardcoded logic for inferring target features from the CPU in the
frontend(PPC.cpp) is the same as in PPC.td.
The reland patch addressed the comment
https://github.com/llvm/llvm-project/pull/137670#discussion_r2143541120
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
1. Added the missing XMEGA2 definition. The avr64 devices use xmega2 which has SPM(X) defined.
2. The avr16/avr32 devices do have SPM and SPMX features, but the current xmega3 definition has not.
Xmega3 is also used for modern attiny series which do not have SPM(X), so that is correct.
Leave the avr16/avr32 devices unchanged (using xmega3 to be in sync with gcc definitions).
Fixes https://github.com/llvm/llvm-project/issues/116116
This PR resubmits the changes from #136098, which was previously
reverted due to a build failure during the linking stage:
```
undefined reference to `llvm::DebugInfoCorrelate'
undefined reference to `llvm::ProfileCorrelate'
```
The root cause was that `llvm/lib/Frontend/Driver/CodeGenOptions.cpp`
references symbols from the `Instrumentation` component, but the
`LINK_COMPONENTS` in the `llvm/lib/Frontend/CMakeLists.txt` for
`LLVMFrontendDriver` did not include it. As a result, linking failed in
configurations where these components were not transitively linked.
### Fix:
This updated patch explicitly adds `Instrumentation` to
`LINK_COMPONENTS` in the relevant `llvm/lib/Frontend/CMakeLists.txt`
file to ensure the required symbols are properly resolved.
---------
Co-authored-by: ict-ql <168183727+ict-ql@users.noreply.github.com>
Co-authored-by: Chyaka <52224511+liliumshade@users.noreply.github.com>
Co-authored-by: Tarun Prabhu <tarunprabhu@gmail.com>
Add a new __ARM_FEATURE_CSSC macro that can be utilized during the
preprocessing stage.
__ARM_FEATURE_CSSC is defined to 1 if there is hardware support for
CSSC.
Implements the ACLE change:
https://github.com/ARM-software/acle/pull/394
This reverts commit 9208b343e962b9f1140ee345c0050a3920bdcbf2.
TargetParser shouldn't re-run the PPC subtarget tablegen target, it
should define its own `-gen-ppc-target-def` rule like all the other
targets do in llvm/include/llvm/TargetParser/CMakeLists.txt .
One user reported that there are incorrect CMake dependencies after this
change, so I will roll this back in the meantime.
1. The PR proceeds with a backend target hook to allow front-ends to
determine what target features are available in a compilation based on
the CPU name.
2. Fix a backend target feature bug that supports HTM for
Power8/9/10/11. However, HTM is only supported on Power8/9 according to
the ISA.
3. All target features that are hardcoded in PPC.cpp can be retrieved
from the backend target feature. I have double-checked that the
hardcoded logic for inferring target features from the CPU in the
frontend(PPC.cpp) is the same as in PPC.td.
When sharing same compiler instance for multiple compilations, we reset
source manager's file id tables in between runs. Diagnostics engine
keeps a cache based on these file ids, that became dangling references
across compilations.
This patch makes sure we reset those whenever sourcemanager is trashing
its FileIDs.
It makes more sense for this functionality to be all in one place rather
than split up across two files—at least it caused me a bit of a headache
to try and find all places where we were actually forwarding the
diagnostic to the `DiagnosticConsumer`. Moreover, moving these functions
into `DiagnosticsEngine` simplifies the code quite a bit since we access
members of `DiagnosticsEngine` more frequently than those of
`DiagnosticIDs`. There was also a duplicated code snippet that I’ve
moved out into a new function.
The previous mapping we setting the hlsl_groupshared AS to 0, which
translated to either Generic or Function.
Changing this to 3, which translated to Workgroup.
Related to #142804
Handling of va_list on Cygwin environment must be matched to normal
Windows environment.
The existing test `test/CodeGen/ms_abi.c` seems relevant, but it
contains `__attribute__((sysv_abi))`, which is not supported on Cygwin.
The new test is based on the `__attribute__((ms_abi))` portion of that
test.
---------
Co-authored-by: jeremyd2019 <github@jdrake.com>
The LoongArch psABI recently added __bf16 type support. Now we can
enable this new type in clang.
Currently, bf16 operations are automatically supported by promoting to
float. This patch adds bf16 support by ensuring that load extension /
truncate store operations are properly expanded.
And this commit implements support for bf16 truncate/extend on hard FP
targets. The extend operation is implemented by a shift just as in the
standard legalization. This requires custom lowering of the truncate
libcall on hard float ABIs (the normal libcall code path is used on soft
ABIs).
We have multiple different attributes in clang representing device
kernels for specific targets/languages. Refactor them into one attribute
with different spellings to make it more easily scalable for new
languages/targets.
---------
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
This variable attribute is used in HLSL to add Vulkan specific builtins
in a shader.
The attribute is documented here:
17727e88fd/proposals/0011-inline-spirv.md
Those variable, even if marked as `static` are externally initialized by
the pipeline/driver/GPU. This is handled by moving them to a specific
address space `hlsl_input`, also added by this commit.
The design for input variables in Clang can be found here:
355771361e/proposals/0019-spirv-input-builtin.md
Co-authored-by: Justin Bogner <mail@justinbogner.com>
Enable _Float16 for LoongArch target. Additionally, this change fixes
incorrect ABI lowering of _Float16 in the case of structs containing
fp16 that are eligible for passing via GPR+FPR or FPR+FPR. Finally, it
also fixes int16 -> __fp16 conversion code gen, which uses generic LLVM
IR rather than llvm.convert.to.fp16 intrinsics.
See https://github.com/llvm/llvm-project/issues/139128
If multiple entries match the source, than the latest entry takes the
precedence.
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
OpenCL translator has a `__spirv` namespace, and defining the
`__spirv__` macro causes issues downstream on the OpenCL side. This
macro is needed to keep compatibility with HLSL/DXC, but can be avoided
for other targets/languages.
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
This patch implements IR-based Profile-Guided Optimization support in
Flang through the following flags:
- `-fprofile-generate` for instrumentation-based profile generation
- `-fprofile-use=<dir>/file` for profile-guided optimization
Resolves#74216 (implements IR PGO support phase)
**Key changes:**
- Frontend flag handling aligned with Clang/GCC semantics
- Instrumentation hooks into LLVM PGO infrastructure
- LIT tests verifying:
- Instrumentation metadata generation
- Profile loading from specified path
- Branch weight attribution (IR checks)
**Tests:**
- Added gcc-flag-compatibility.f90 test module verifying:
- Flag parsing boundary conditions
- IR-level profile annotation consistency
- Profile input path normalization rules
- SPEC2006 benchmark results will be shared in comments
For details on LLVM's PGO framework, refer to [Clang PGO
Documentation](https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization).
This implementation was developed by [XSCC Compiler
Team](https://github.com/orgs/OpenXiangShan/teams/xscc).
---------
Co-authored-by: ict-ql <168183727+ict-ql@users.noreply.github.com>
Co-authored-by: Tom Eccles <t@freedommail.info>
As discussed in https://github.com/llvm/llvm-project/issues/139128, this
PR moves =sanitize handling from `ASTContext::isTypeIgnoredBySanitizer`
to `NoSanitizeList::containsType`.
Before this PR: "=sanitize" has priority regardless of the order
After this PR: If multiple entries match the source, than the latest
entry takes the precedence.
The patch introduce __builtin_spirv_generic_cast_to_ptr_explicit which
is lowered to the llvm.spv.generic.cast.to.ptr.explicit intrinsic.
The SPIR-V builtins are now split into 3 differents file:
BuiltinsSPIRVCore.td,
BuiltinsSPIRVVK.td for Vulkan specific builtins, BuiltinsSPIRVCL.td for
OpenCL specific builtins
and BuiltinsSPIRVCommon.td for common ones.
The patch also introduces a new header defining its SPIR-V friendly
equivalent (__spirv_GenericCastToPtrExplicit_ToGlobal,
__spirv_GenericCastToPtrExplicit_ToLocal and
__spirv_GenericCastToPtrExplicit_ToPrivate). The functions are declared
as aliases to the new builtin allowing C-like languages to have a
definition to rely on as well as gaining proper front-end diagnostics.
The motivation for the header is to provide a stable binding for
applications or library (such as SYCL) and allows non SPIR-V targets to
provide an implementation (via libclc or similar to how it is done for
gpuintrin.h).