The option -falloc-token-max=0 is supposed to be usable to override
previous settings back to the target default max tokens (SIZE_MAX).
This did not work for the builtin:
```
| executed command: clang -cc1 [..] -nostdsysteminc -triple x86_64-linux-gnu -std=c++23 -fsyntax-only -verify clang/test/SemaCXX/alloc-token.cpp -falloc-token-max=0
| clang: llvm/lib/Support/AllocToken.cpp:38: std::optional<uint64_t> llvm::getAllocToken(AllocTokenMode, const AllocTokenMetadata &, uint64_t): Assertion `MaxTokens && "Must provide non-zero max tokens"' failed.
```
Fix it by also picking the default if "0" is passed.
Improve the documentation to make clear what a value of "0" means.
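A minimal sketch of the intended rule, assuming a hypothetical helper `effectiveMaxTokens` (this is not the actual Clang code, and the 64-bit `SIZE_MAX` default is an assumption for the x86_64 target in the log above):

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: maps the value passed to -falloc-token-max to the
// limit that is actually used. A value of 0 means "use the target default".
constexpr std::uint64_t effectiveMaxTokens(std::uint64_t requested,
                                           std::uint64_t targetDefault) {
  return requested == 0 ? targetDefault : requested;
}

// Assumed default: SIZE_MAX on a 64-bit target.
constexpr std::uint64_t kTargetDefault =
    std::numeric_limits<std::uint64_t>::max();

static_assert(effectiveMaxTokens(0, kTargetDefault) == kTargetDefault);
static_assert(effectiveMaxTokens(256, kTargetDefault) == 256);
```

With this rule the value handed to `getAllocToken` is never zero, so the assertion above cannot fire.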
Recent commits (7fe069121b57a, 53ddeb493529a) marked several x86
intrinsics as constexpr in headers without providing the necessary
constant evaluation support in the compiler backend. This caused
compilation failures when attempting to use these intrinsics in constant
expressions.
Resolves #166814
Resolves #161203
This patch extends `interp__builtin_ia32_shuffle_generic` and `evalShuffleGeneric` to handle both 2-argument and 3-argument patterns, replacing specialized shuffle functions with the unified handler.
Resolves #166342
This patch enables compile-time evaluation of AVX512 permutex2var
intrinsics in constexpr contexts.
Extend shuffle generic to handle both integer immediate and vector mask
operands.
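As a rough scalar model of what a permutex2var evaluation has to compute (an illustrative sketch, not the code in this patch; the function and constant names are made up): the low bits of each index select an element, and the next bit selects which of the two sources it comes from.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Illustrative model of a two-source permute over N 32-bit elements.
// N must be a power of two.
template <std::size_t N>
constexpr std::array<std::int32_t, N>
permutex2var(const std::array<std::int32_t, N> &a,
             const std::array<std::uint32_t, N> &idx,
             const std::array<std::int32_t, N> &b) {
  static_assert(N != 0 && (N & (N - 1)) == 0, "N must be a power of two");
  std::array<std::int32_t, N> out{};
  for (std::size_t i = 0; i < N; ++i) {
    const std::size_t off = idx[i] & (N - 1); // element within the source
    const bool fromB = (idx[i] & N) != 0;     // next bit picks a or b
    out[i] = fromB ? b[off] : a[off];
  }
  return out;
}

constexpr std::array<std::int32_t, 4> kA{10, 11, 12, 13};
constexpr std::array<std::int32_t, 4> kB{20, 21, 22, 23};
constexpr std::array<std::uint32_t, 4> kIdx{0, 4, 3, 7};
static_assert(permutex2var(kA, kIdx, kB)[1] == 20); // index 4 selects b[0]
```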
Resolves #161335
Support constexpr usage for SLLDQ/SRLDQ byte shift intrinsics
This draft PR adds support for using the following SLLDQ/SRLDQ byte-shift
intrinsics in constant expressions:
- _mm_srli_si128
- _mm256_srli_si256
- _mm_slli_si128
- _mm256_slli_si256
Relevant tests are included.
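For reference, a scalar sketch of the byte-shift semantics on one 128-bit lane (an illustrative model with made-up names, not the evaluator code). Bytes are indexed little-endian, vacated positions are zero-filled, and a shift count above 15 yields all zeros. The 256-bit variants apply the same shift to each 128-bit lane independently.

```cpp
#include <array>
#include <cstdint>

using Lane = std::array<std::uint8_t, 16>;

// Model of SLLDQ on one lane: move bytes towards higher indices.
constexpr Lane byteShiftLeft(const Lane &a, unsigned n) {
  Lane out{};
  for (unsigned i = 0; i < 16; ++i)
    if (i >= n) // never true for n > 15, so the result is all zeros
      out[i] = a[i - n];
  return out;
}

// Model of SRLDQ on one lane: move bytes towards lower indices.
constexpr Lane byteShiftRight(const Lane &a, unsigned n) {
  Lane out{};
  for (unsigned i = 0; i < 16; ++i)
    if (n < 16 && i + n < 16)
      out[i] = a[i + n];
  return out;
}

constexpr Lane kBytes{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};
static_assert(byteShiftLeft(kBytes, 3)[3] == 1);
static_assert(byteShiftRight(kBytes, 3)[0] == 4);
```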
Fixes #156494
Implement the constexpr evaluation for `__builtin_infer_alloc_token()`
in Clang's constant expression evaluators (both in ExprConstant and the
new bytecode interpreter).
The constant evaluation is only supported for stateless (hash-based)
token modes. If a stateful mode like `increment` is used, the evaluation
fails, as the token value is not deterministic at compile time.
Get the zero-extended truncated desired value in that case. Add one RUN
line to the constexpr-string.cpp test case, to not increase the runtime
of that test too much.
**This PR supersedes and replaces PR #158853**
The original branch diverged too far from the main branch, resulting in
significant merge conflicts that were difficult to resolve cleanly. To
provide a clean and reviewable history, this new PR was created by
cherry-picking the necessary commits onto a fresh branch based on the
latest `main`.
---
*(Original Description)*
This patch enables the use of AVX/AVX512 subvector extraction intrinsics
within `constexpr` functions. This is achieved by implementing the
evaluation logic for these intrinsics in
`VectorExprEvaluator::VisitCallExpr` and `InterpretBuiltin`.
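A scalar sketch of subvector extraction, for illustration only (the helper name is hypothetical, and wrapping the immediate to the number of available subvectors is an assumption of this sketch, not something stated by the patch):

```cpp
#include <array>
#include <cstddef>

// Illustrative model: return the imm-th M-element chunk of an N-element
// vector. Assumption: out-of-range immediates wrap around.
template <std::size_t M, std::size_t N>
constexpr std::array<int, M> extractSubvector(const std::array<int, N> &v,
                                              unsigned imm) {
  static_assert(M != 0 && N % M == 0, "M must evenly divide N");
  const std::size_t chunk = imm % (N / M);
  std::array<int, M> out{};
  for (std::size_t i = 0; i < M; ++i)
    out[i] = v[chunk * M + i];
  return out;
}

constexpr std::array<int, 8> kVec{0, 1, 2, 3, 4, 5, 6, 7};
static_assert(extractSubvector<4>(kVec, 1)[0] == 4); // upper half
```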
The original discussion and review comments can be found in the previous
pull request for context: #158853
Fixes #157712
This PR resolves #155805 and updates the following builtins to handle
constant expressions:
```
_mm_mulhrs_pi16
_mm_mulhrs_epi16 _mm256_mulhrs_epi16 _mm512_mulhrs_epi16
```
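For context, a per-element sketch of the PMULHRSW rounding that these intrinsics perform (an illustrative scalar model; `mulhrs` is a made-up name, and it relies on arithmetic right shift of negative values, which GCC and Clang provide):

```cpp
#include <cstdint>

// Illustrative model of one 16-bit lane: multiply, keep the high bits with
// rounding, i.e. (((a * b) >> 14) + 1) >> 1 truncated to 16 bits.
constexpr std::int16_t mulhrs(std::int16_t a, std::int16_t b) {
  const std::int32_t prod = std::int32_t{a} * std::int32_t{b};
  const std::int32_t rounded = (prod >> 14) + 1;
  return static_cast<std::int16_t>(rounded >> 1);
}

static_assert(mulhrs(16384, 16384) == 8192); // 0.5 * 0.5 == 0.25 in Q15
static_assert(mulhrs(3, 8192) == 1);         // 0.75 rounds up to 1
```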
The PSHUFB instruction shuffles bytes within each 128-bit lane: for each
control byte, if bit 7 is set, the output byte is zeroed; otherwise, the
low 4 bits select a source byte (0–15) from the same lane.
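The per-lane rule described above can be sketched as a scalar model (illustrative only; the names are made up and this is not the evaluator code):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Lane = std::array<std::uint8_t, 16>;

// Illustrative model of PSHUFB on one 128-bit lane.
constexpr Lane pshufbLane(const Lane &src, const Lane &ctrl) {
  Lane out{};
  for (std::size_t i = 0; i < 16; ++i) {
    // Bit 7 set: zero the output byte. Otherwise the low 4 bits index the
    // source lane; bits 4..6 are ignored.
    out[i] = (ctrl[i] & 0x80) ? std::uint8_t{0} : src[ctrl[i] & 0x0F];
  }
  return out;
}

constexpr Lane kSrc{10, 11, 12, 13, 14, 15, 16, 17,
                    18, 19, 20, 21, 22, 23, 24, 25};
constexpr Lane kCtrl{0x00, 0x0F, 0x80, 0x1F};
static_assert(pshufbLane(kSrc, kCtrl)[1] == 25); // selects src[15]
static_assert(pshufbLane(kSrc, kCtrl)[2] == 0);  // bit 7 set, zeroed
```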
Note: the `_mm_shuffle_pi8` function had to change because `__anyext128`
used negative indices, which are invalid in a constant expression context.
Fixes #156612
Fixes #158653
Add handling for:
```
ptestz128 / ptestz256 → (a & b) == 0.
ptestc128 / ptestc256 → (~a & b) == 0
ptestnzc128 / ptestnzc256 → (a & b) != 0 AND (~a & b) != 0.
vtestzps / vtestzps256 → (S(a) & S(b)) == 0.
vtestcps / vtestcps256 → (~S(a) & S(b)) == 0.
vtestnzcps / vtestnzcps256 → (S(a) & S(b)) != 0 AND (~S(a) & S(b)) != 0.
vtestzpd / vtestzpd256 → (S(a) & S(b)) == 0.
vtestcpd / vtestcpd256 → (~S(a) & S(b)) == 0.
vtestnzcpd / vtestnzcpd256 → (S(a) & S(b)) != 0 AND (~S(a) & S(b)) != 0.
```
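The table above translates directly into a small scalar model. This is an illustrative sketch with made-up names, treating a 128-bit vector as two 64-bit words; `S(x)` is modelled by masking everything except the per-element sign bits.

```cpp
#include <array>
#include <cstdint>

using Vec128 = std::array<std::uint64_t, 2>;

constexpr bool isZero(const Vec128 &v) { return (v[0] | v[1]) == 0; }
constexpr Vec128 bitAnd(const Vec128 &a, const Vec128 &b) {
  return {a[0] & b[0], a[1] & b[1]};
}
constexpr Vec128 bitAndNot(const Vec128 &a, const Vec128 &b) { // ~a & b
  return {~a[0] & b[0], ~a[1] & b[1]};
}

constexpr int ptestz(const Vec128 &a, const Vec128 &b) {
  return isZero(bitAnd(a, b)) ? 1 : 0;
}
constexpr int ptestc(const Vec128 &a, const Vec128 &b) {
  return isZero(bitAndNot(a, b)) ? 1 : 0;
}
constexpr int ptestnzc(const Vec128 &a, const Vec128 &b) {
  return (!isZero(bitAnd(a, b)) && !isZero(bitAndNot(a, b))) ? 1 : 0;
}

// Sign-bit masks per 64-bit word: two floats or one double.
constexpr std::uint64_t kSignPS = 0x8000000080000000ULL;
constexpr std::uint64_t kSignPD = 0x8000000000000000ULL;

constexpr Vec128 signBits(const Vec128 &v, std::uint64_t mask) {
  return {v[0] & mask, v[1] & mask};
}
// vtestz*: same as ptestz, but only on the sign bits S(a) and S(b).
constexpr int vtestz(const Vec128 &a, const Vec128 &b, std::uint64_t mask) {
  return ptestz(signBits(a, mask), signBits(b, mask));
}

constexpr Vec128 kLow{0x0F, 0}, kHigh{0xF0, 0}, kMix{0x1F, 0};
static_assert(ptestz(kLow, kHigh) == 1);
static_assert(ptestnzc(kLow, kMix) == 1);
```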
Add corresponding test cases for:
```
int _mm_test_all_ones (__m128i a)
int _mm_test_all_zeros (__m128i mask, __m128i a)
int _mm_test_mix_ones_zeros (__m128i mask, __m128i a)
int _mm_testc_pd (__m128d a, __m128d b)
int _mm256_testc_pd (__m256d a, __m256d b)
int _mm_testc_ps (__m128 a, __m128 b)
int _mm256_testc_ps (__m256 a, __m256 b)
int _mm_testc_si128 (__m128i a, __m128i b)
int _mm256_testc_si256 (__m256i a, __m256i b)
int _mm_testnzc_pd (__m128d a, __m128d b)
int _mm256_testnzc_pd (__m256d a, __m256d b)
int _mm_testnzc_ps (__m128 a, __m128 b)
int _mm256_testnzc_ps (__m256 a, __m256 b)
int _mm_testnzc_si128 (__m128i a, __m128i b)
int _mm256_testnzc_si256 (__m256i a, __m256i b)
int _mm_testz_pd (__m128d a, __m128d b)
int _mm256_testz_pd (__m256d a, __m256d b)
int _mm_testz_ps (__m128 a, __m128 b)
int _mm256_testz_ps (__m256 a, __m256 b)
int _mm_testz_si128 (__m128i a, __m128i b)
int _mm256_testz_si256 (__m256i a, __m256i b)
```
The interp__builtin_ia32_pmadd implementation can also be used for
PMULDQ/PMULUDQ evaluation, since we ignore the "hi" integers in each
pair.
I've replaced the PMULDQ/PMULUDQ evaluation with callbacks and renamed
interp__builtin_ia32_pmadd to interp__builtin_ia32_pmul for consistency.
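To illustrate why the "hi" integers can be ignored, here is a scalar sketch of PMULDQ/PMULUDQ on a 128-bit vector viewed as four 32-bit elements (an illustrative model with made-up names, not the interpreter code). Only the even-indexed (low) element of each pair contributes to the 64-bit product.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Signed variant: sign-extend the low 32 bits of each 64-bit element.
constexpr std::array<std::int64_t, 2>
pmuldq(const std::array<std::int32_t, 4> &a,
       const std::array<std::int32_t, 4> &b) {
  std::array<std::int64_t, 2> out{};
  for (std::size_t i = 0; i < 2; ++i)
    out[i] = std::int64_t{a[2 * i]} * std::int64_t{b[2 * i]};
  return out;
}

// Unsigned variant: zero-extend the low 32 bits of each 64-bit element.
constexpr std::array<std::uint64_t, 2>
pmuludq(const std::array<std::uint32_t, 4> &a,
        const std::array<std::uint32_t, 4> &b) {
  std::array<std::uint64_t, 2> out{};
  for (std::size_t i = 0; i < 2; ++i)
    out[i] = std::uint64_t{a[2 * i]} * std::uint64_t{b[2 * i]};
  return out;
}

// The odd-indexed ("hi") values 99 and 77 never affect the result.
constexpr std::array<std::int32_t, 4> kSA{-2, 99, 3, 99};
constexpr std::array<std::int32_t, 4> kSB{5, 77, -4, 77};
static_assert(pmuldq(kSA, kSB)[0] == -10);
static_assert(pmuldq(kSA, kSB)[1] == -12);
```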
This PR updates the PMADDWD/PMADDUBSW builtins to support constant
expression handling, by extending the VectorExprEvaluator::VisitCallExpr
that handles interp__builtin_ia32_pmadd builtins.
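For reference, a scalar sketch of what one output element of each instruction computes (illustrative only; the helper names are made up). PMADDWD sums two signed 16-bit products into a 32-bit result that wraps on overflow; PMADDUBSW multiplies unsigned bytes by signed bytes and saturates the sum to 16 bits.

```cpp
#include <cstdint>
#include <limits>

// One PMADDWD output element: a0*b0 + a1*b1, wrapped to 32 bits.
constexpr std::int32_t pmaddwdPair(std::int16_t a0, std::int16_t b0,
                                   std::int16_t a1, std::int16_t b1) {
  const std::int64_t sum = std::int64_t{a0} * b0 + std::int64_t{a1} * b1;
  // Go through uint32_t so the wrap-around is explicit rather than an
  // overflow of signed arithmetic.
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(sum));
}

// One PMADDUBSW output element: unsigned * signed bytes, saturated to i16.
constexpr std::int16_t pmaddubswPair(std::uint8_t a0, std::int8_t b0,
                                     std::uint8_t a1, std::int8_t b1) {
  const int sum = int{a0} * int{b0} + int{a1} * int{b1};
  if (sum > 32767)
    return 32767;
  if (sum < -32768)
    return -32768;
  return static_cast<std::int16_t>(sum);
}

static_assert(pmaddwdPair(1, 2, 3, 4) == 14);
static_assert(pmaddubswPair(255, 127, 255, 127) == 32767); // saturates
```

The only PMADDWD input that overflows is all four values equal to -32768, which is why the wrap is modelled explicitly.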
Closes #155392
The i16/i32 shuffle intrinsics (`pshufw`, `pshuflw`, `pshufhw`,
`pshufd`) currently cannot be used in constant expressions. This patch
adds support for the pshuf intrinsics in both the bytecode interpreter
(InterpBuiltin.cpp) and the constant evaluator (ExprConstant.cpp),
enabling their use in constant expressions.
## Intrinsics covered
- `_mm_shuffle_pi16` (MMX `pshufw`)
- `_mm_shufflelo_epi16` / `_mm_shufflehi_epi16`
- `_mm_shuffle_epi32`
- Their AVX2/AVX512 vector-width variants
- Masked and maskz forms (handled indirectly via
`__builtin_ia32_select*`)
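A scalar sketch of how the 8-bit immediate drives these shuffles on one 128-bit lane (illustrative only; function names are made up). Each 2-bit field of the immediate selects one of four source elements; `pshuflw`/`pshufhw` shuffle only one half of the eight words and copy the other half through. The wider variants apply the same immediate to every 128-bit lane.

```cpp
#include <array>
#include <cstdint>

using Dwords = std::array<std::uint32_t, 4>;
using Words = std::array<std::uint16_t, 8>;

// pshufd: out[i] = src[imm bits (2i+1 .. 2i)].
constexpr Dwords pshufd(const Dwords &src, std::uint8_t imm) {
  Dwords out{};
  for (unsigned i = 0; i < 4; ++i)
    out[i] = src[(imm >> (2 * i)) & 3];
  return out;
}

// Shuffle the four words starting at `base`, copy the rest unchanged.
constexpr Words shuffleHalf(const Words &src, std::uint8_t imm,
                            unsigned base) {
  Words out = src;
  for (unsigned i = 0; i < 4; ++i)
    out[base + i] = src[base + ((imm >> (2 * i)) & 3)];
  return out;
}
constexpr Words pshuflw(const Words &src, std::uint8_t imm) {
  return shuffleHalf(src, imm, 0);
}
constexpr Words pshufhw(const Words &src, std::uint8_t imm) {
  return shuffleHalf(src, imm, 4);
}

constexpr Dwords kD{10, 11, 12, 13};
constexpr Words kW{0, 1, 2, 3, 4, 5, 6, 7};
static_assert(pshufd(kD, 0x1B)[0] == 13); // 0x1B reverses the elements
static_assert(pshufhw(kW, 0x1B)[4] == 7);
```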
Fixes #156611
Fixes #159753
Enable constexpr evaluation for X86 vector element extract/insert builtins and add corresponding tests.
Index is masked with `(Idx & (NumElts - 1))`, matching existing CodeGen.
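A minimal sketch of the index masking, using hypothetical helpers `vecExtract` and `vecInsert` over `std::array` (not the builtins themselves). The mask only works as a wrap-around when the element count is a power of two.

```cpp
#include <array>
#include <cstddef>

// Read element (idx & (N - 1)) of v.
template <class T, std::size_t N>
constexpr T vecExtract(const std::array<T, N> &v, unsigned idx) {
  static_assert(N != 0 && (N & (N - 1)) == 0, "N must be a power of two");
  return v[idx & (N - 1)];
}

// Return a copy of v with element (idx & (N - 1)) replaced by x.
template <class T, std::size_t N>
constexpr std::array<T, N> vecInsert(std::array<T, N> v, T x, unsigned idx) {
  static_assert(N != 0 && (N & (N - 1)) == 0, "N must be a power of two");
  v[idx & (N - 1)] = x;
  return v;
}

constexpr std::array<int, 4> kElts{10, 11, 12, 13};
static_assert(vecExtract(kElts, 5) == 11);        // 5 & 3 == 1
static_assert(vecInsert(kElts, 99, 6)[2] == 99);  // 6 & 3 == 2
```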