By default, `isShuffleMaskLegal` always returns true, which can result
in the expansion of `BUILD_VECTOR` into a `VECTOR_SHUFFLE` node in
certain situations. Subsequently, the `VECTOR_SHUFFLE` node is expanded
again into a `BUILD_VECTOR`, leading to an infinite loop.
To address this, we always return false, allowing the expansion of
`BUILD_VECTOR` through the stack.
```
when a=c=-0.0, b=0.0:
-(a * b + (-c)) = -0.0
-a * b + c = 0.0
(fneg (fma a, b (-c))) != (fma (fneg a), b ,c)
```
See https://reviews.llvm.org/D90901 for a similar discussion on X86.
Previously, bolt could not get FixupKind for BL correctly, because bolt
cannot get target-flags for BL. Here just add support in MCCodeEmitter.
Fixes https://github.com/llvm/llvm-project/pull/72826.
This patch is used to test whether fixupkind for bl can be returned
correctly. When BL has target-flags(loongarch-call), there is no error.
But without this flag, an assertion error will appear. So the test is
just tagged as "Expectedly Failed" now until the following patch fix it.
PR #67391 improved atomic codegen by handling memory ordering specified
by the `cmpxchg` instruction. An acquire barrier needs to be generated
when memory ordering includes an acquire operation. This PR improves the
codegen further by only handling the failure ordering.
These are treated as DBAR 0 on older uarchs, so we can start to
unconditionally emit the new hints right away.
Co-authored-by: WANG Rui <wangrui@loongson.cn>
This PR improves memory barriers generated by atomic operations.
Memory barrier semantics of LL/SC:
```
LL: <memory-barrier> + <load-exclusive>
SC: <store-conditional> + <memory-barrier>
```
Changes:
* Remove unnecessary memory barriers before LL and between LL/SC.
* Fix acquire semantics. (If the SC instruction is not executed, then
the guarantee of acquiring semantics cannot be ensured. Therefore, an
acquire barrier needs to be generated when memory ordering includes an
acquire operation.)
Prior to this change, stack realignment was achieved using the SRLI/SLLI
instructions in two steps. With this patch, stack realignment is
optimized using a single `BSTRINS` instruction.
Reviewed By: SixWeining, xen0n
Differential Revision: https://reviews.llvm.org/D158384
The testcases mainly cover three situations:
- the arguments which should be immediates are non immediates.
- the immediate is out of upper limit of the argument type.
- the immediate is out of lower limit of the argument type.
Depends on D155829
Reviewed By: SixWeining
Differential Revision: https://reviews.llvm.org/D157570
The testcases mainly cover three situations:
- the arguments which should be immediates are non immediates.
- the immediate is out of upper limit of the argument type.
- the immediate is out of lower limit of the argument type.
Depends on D155830
Reviewed By: SixWeining
Differential Revision: https://reviews.llvm.org/D157571
As described in [1][2], `-mtune=` is used to select the type of target
microarchitecture, defaults to the value of `-march`. The set of
possible values should be a superset of `-march` values. Currently
possible values of `-march=` and `-mtune=` are `native`, `loongarch64`
and `la464`.
D136146 has supported `-march={loongarch64,la464}` and this patch adds
support for `-march=native` and `-mtune=`.
A new ProcessorModel called `loongarch64` is defined in LoongArch.td
to support `-mtune=loongarch64`.
`llvm::sys::getHostCPUName()` returns `generic` on unknown or future
LoongArch CPUs, e.g. the not yet added `la664`, leading to
`llvm::LoongArch::isValidArchName()` failing to parse the arch name.
In this case, use `loongarch64` as the default arch name for 64-bit
CPUs.
Two preprocessor macros are defined based on user-provided `-march=`
and `-mtune=` options and the defaults.
- __loongarch_arch
- __loongarch_tune
Note that, to work with `-fno-integrated-cc1` we leverage cc1 options
`-target-cpu` and `-tune-cpu` to pass driver options `-march=` and
`-mtune=` respectively because cc1 needs these information to define
macros in `LoongArchTargetInfo::getTargetDefines`.
[1]: https://github.com/loongson/LoongArch-Documentation/blob/2023.04.20/docs/LoongArch-toolchain-conventions-EN.adoc
[2]: https://github.com/loongson/la-softdev-convention/blob/v0.1/la-softdev-convention.adoc
Reviewed By: xen0n, wangleiat, steven_wu, MaskRay
Differential Revision: https://reviews.llvm.org/D155824
This reverts commit c56514f21b2cf08eaa7ac3a57ba4ce403a9c8956. This
commit adds global state that is shared between clang driver and clang
cc1, which is not correct when clang is used with `-fno-integrated-cc1`
option (no integrated cc1). The -march and -mtune option needs to be
properly passed through cc1 command-line and stored in TargetInfo.
As described in [1][2], `-mtune=` is used to select the type of target
microarchitecture, defaults to the value of `-march`. The set of
possible values should be a superset of `-march` values. Currently
possible values of `-march=` and `-mtune=` are `native`, `loongarch64`
and `la464`.
D136146 has supported `-march={loongarch64,la464}` and this patch adds
support for `-march=native` and `-mtune=`.
A new ProcessorModel called `loongarch64` is defined in LoongArch.td
to support `-mtune=loongarch64`.
`llvm::sys::getHostCPUName()` returns `generic` on unknown or future
LoongArch CPUs, e.g. the not yet added `la664`, leading to
`llvm::LoongArch::isValidArchName()` failing to parse the arch name.
In this case, use `loongarch64` as the default arch name for 64-bit
CPUs.
And these two preprocessor macros are defined:
- __loongarch_arch
- __loongarch_tune
[1]: https://github.com/loongson/LoongArch-Documentation/blob/2023.04.20/docs/LoongArch-toolchain-conventions-EN.adoc
[2]: https://github.com/loongson/la-softdev-convention/blob/v0.1/la-softdev-convention.adoc
Reviewed By: xen0n, wangleiat
Differential Revision: https://reviews.llvm.org/D155824
As described in [1][2], `-mtune=` is used to select the type of target
microarchitecture, defaults to the value of `-march`. The set of
possible values should be a superset of `-march` values. Currently
possible values of `-march=` and `-mtune=` are `native`, `loongarch64`
and `la464`.
D136146 has supported `-march={loongarch64,la464}` and this patch adds
support for `-march=native` and `-mtune=`.
A new ProcessorModel called `loongarch64` is defined in LoongArch.td
to support `-mtune=loongarch64`.
`llvm::sys::getHostCPUName()` returns `generic` on unknown or future
LoongArch CPUs, e.g. the not yet added `la664`, leading to
`llvm::LoongArch::isValidArchName()` failing to parse the arch name.
In this case, use `loongarch64` as the default arch name for 64-bit
CPUs.
And these two preprocessor macros are defined:
- __loongarch_arch
- __loongarch_tune
[1]: https://github.com/loongson/LoongArch-Documentation/blob/2023.04.20/docs/LoongArch-toolchain-conventions-EN.adoc
[2]: https://github.com/loongson/la-softdev-convention/blob/v0.1/la-softdev-convention.adoc
Differential Revision: https://reviews.llvm.org/D155824
Add test case showing suboptimal codegen when zero extending.
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Reviewed By: xen0n
Differential Revision: https://reviews.llvm.org/D154918
The author of the following files is licongtian <licongtian@loongson.cn>:
- clang/lib/Basic/Targets/LoongArch.cpp
- llvm/lib/Target/LoongArch/LoongArchAsmPrinter.cpp
- llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
The files mentioned above implement InlineAsm for LSX and LASX as follows:
- Enable clang parsing LSX/LASX register name, such as $vr0.
- Support the case which operand type is 128bit or 256bit when the
constraints is 'f'.
- Support the way of specifying LSX/LASX register by using constraint,
such as "={$xr0}".
- Support the operand modifiers 'u' and 'w'.
- Support and legalize the data types and register classes involved in
LSX/LASX in the lowering process.
Reviewed By: xen0n, SixWeining
Differential Revision: https://reviews.llvm.org/D154931
This returns true for 8-bit and 16-bit loads, allowing ld.bu/ld.hu to be selected and avoiding unnecessary masks.
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Reviewed By: SixWeining, xen0n
Differential Revision: https://reviews.llvm.org/D154819
Implementing isZextFree will allow ld.bu or ld.hu to be selected rather than ld.b+mask and ld.h+mask.
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Reviewed By: SixWeining, xen0n
Differential Revision: https://reviews.llvm.org/D154818
This causes a trivial improvement in the legalicmpimm.ll test case.
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Reviewed By: SixWeining, xen0n
Differential Revision: https://reviews.llvm.org/D154811