…tail storage" (#187410)
This reverts commit bf1db77fc87ce9d2ca7744565321b09a5d23692f.
Avoid using an `InterpFrame` member after calling its destructor this
time. I hope that was the only problem.
Instead of heap-allocating an `InterpFrame` and then immediately
heap-allocating more space for the local variables, do only one
heap-allocation and use tail storage for the local variables.
We already know how many bytes we need for the tail storage, after
all.
This also makes `InterpFrame` a little smaller since we don't need to
save an explicit pointer for the local variable memory.
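A minimal sketch of the tail-storage idea, not the actual `InterpFrame`
code. `FrameSketch`, `create()`, `destroy()` and `locals()` are
hypothetical names chosen for illustration, and the locals are treated
as plain bytes:

```c++
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical stand-in for InterpFrame; not the real class.
class FrameSketch {
  std::size_t LocalsSize; // Number of tail-storage bytes behind this object.

  explicit FrameSketch(std::size_t LocalsSize) : LocalsSize(LocalsSize) {}
  ~FrameSketch() = default;

public:
  // A single allocation holds both the frame object and its locals.
  static FrameSketch *create(std::size_t LocalsSize) {
    void *Mem = std::malloc(sizeof(FrameSketch) + LocalsSize);
    if (!Mem)
      throw std::bad_alloc();
    FrameSketch *F = new (Mem) FrameSketch(LocalsSize);
    std::memset(F->locals(), 0, LocalsSize);
    return F;
  }

  static void destroy(FrameSketch *F) {
    // Nothing may be read from *F once the destructor has run.
    F->~FrameSketch();
    std::free(F);
  }

  // The locals start right behind the object, so no pointer member is needed.
  char *locals() { return reinterpret_cast<char *>(this + 1); }
  std::size_t localsSize() const { return LocalsSize; }
};
```

Since the locals are addressed relative to `this`, the object only has
to remember how many bytes follow it, and teardown is one destructor
call plus one `free`.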
For an artificial test case doing lots of function calls with local
variables like:
```c++
constexpr int plus(int a, int b) {
  int x = a;
  int y = b;
  int z = x + y;
  return z;
}
constexpr int minus(int a, int b) {
  int x = a;
  int y = b;
  int z = x - y;
  return z;
}
constexpr int foo() {
  int a = 0;
  for (unsigned I = 0; I != 1'000'000; ++I) {
    int b = I;
    a = plus(a, b);
    a = minus(a, I);
  }
  return a;
}
static_assert(foo() == 0);
```
this saves us over 6%.
We also eliminate the per-argument `Block` heap allocation on the first
pointer-access to an argument the same way. To make this work, we change
the param ops to use the parameter index instead of the offset.
Clang will make the instance pointer be of type 'int' if it is invalid,
which trips up later logic. Mark a function as invalid if any of its
parameters is invalid, and compile and check such functions early in
`CallPtr`.
Fixes https://github.com/llvm/llvm-project/issues/175425
…types usi… (#144676)"
This reverts commit 68471d29eed2c49f9b439e505b3f24d387d54f97.
IntegralAP contains a union:
    union {
      uint64_t *Memory = nullptr;
      uint64_t Val;
    };
On 64-bit systems, `Memory` and `Val` have the same size. On 32-bit
systems, however, `Val` is 64 bits wide and `Memory` only 32, which
means the default initializer for `Memory` zeroes only half of `Val`.
We fixed this by zero-initializing `Val` explicitly in the
`IntegralAP(unsigned BitWidth)` constructor.
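A minimal sketch of that fix, assuming a simplified stand-in type.
`IntegralAPSketch` is hypothetical and carries only the union and the
bit width, not the real class's other members:

```c++
#include <cassert>
#include <cstdint>

// Hypothetical, reduced stand-in for IntegralAP.
struct IntegralAPSketch {
  union {
    uint64_t *Memory = nullptr; // Only pointer-sized: 32 bits on 32-bit targets.
    uint64_t Val;               // Always 64 bits.
  };
  unsigned BitWidth = 0;

  // Naming Val in the mem-initializer list replaces the default initializer
  // of Memory, so all 64 bits are zeroed regardless of the pointer size.
  explicit IntegralAPSketch(unsigned BitWidth) : Val(0), BitWidth(BitWidth) {}
};
```

Initializing the widest union member is what makes the result
independent of the target's pointer width.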
See also the discussion in
https://github.com/llvm/llvm-project/pull/144246
Create the Function* handles for all functions we see, but delay the
actual compilation until we really call the function. This speeds up
compile times with the new interpreter a bit.
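The handle-now, compile-later scheme can be sketched roughly as
follows. `LazyFunction`, `LazyProgram` and their members are
hypothetical names, and "compiling" is reduced to flipping a flag:

```c++
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical handle: exists as soon as a function is seen.
struct LazyFunction {
  std::string Name;
  bool Compiled = false;
  explicit LazyFunction(std::string Name) : Name(std::move(Name)) {}
};

class LazyProgram {
  std::map<std::string, std::unique_ptr<LazyFunction>> Funcs;
  unsigned NumCompiled = 0;

public:
  // Creating (or looking up) a handle is cheap and does not compile anything.
  LazyFunction *getOrCreate(const std::string &Name) {
    std::unique_ptr<LazyFunction> &Slot = Funcs[Name];
    if (!Slot)
      Slot.reset(new LazyFunction(Name));
    return Slot.get();
  }

  // Compilation happens at most once, on the first call.
  void call(LazyFunction *F) {
    if (!F->Compiled) {
      F->Compiled = true; // Stand-in for the real compile step.
      ++NumCompiled;
    }
    // ... the function body would be interpreted here ...
  }

  unsigned numCompiled() const { return NumCompiled; }
};
```

Functions that are declared but never called during constant
evaluation never pay the compilation cost in this scheme.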
Use the regular code paths for interpreting.
Add new instructions: `StartSpeculation` will reset the diagnostics
pointers to `nullptr`, which will keep us from reporting any diagnostics
during speculation. `EndSpeculation` will undo this.
The rest depends on what `Emitter` we use.
For `EvalEmitter`, we have no bytecode, so we implement `speculate()` by
simply visiting the first argument of `__builtin_constant_p`. If the
evaluation fails, we push a `0` on the stack, otherwise a `1`.
For `ByteCodeEmitter`, add another instruction called `BCP`, that
interprets all the instructions following it until the next
`EndSpeculation` instruction. If any of those instructions fails, we
jump to the `EndLabel`, which brings us right before the
`EndSpeculation`. We then push the result on the stack.
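As a rough model of the speculation protocol (not the interpreter's
real data structures), the sketch below silences diagnostics while an
operand is evaluated and then pushes `0` or `1`. `SpecState`,
`report()` and `speculate()` are hypothetical names:

```c++
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical interpreter state: a diagnostics sink and a value stack.
struct SpecState {
  std::vector<std::string> *Diags = nullptr;
  std::vector<int> Stack;
};

// Diagnostics are dropped while the sink pointer is null.
inline void report(SpecState &S, const std::string &Msg) {
  if (S.Diags)
    S.Diags->push_back(Msg);
}

// Evaluate Operand speculatively and push 1 on success, 0 on failure.
inline void speculate(SpecState &S,
                      const std::function<bool(SpecState &)> &Operand) {
  std::vector<std::string> *Saved = S.Diags;
  S.Diags = nullptr;                  // Models StartSpeculation.
  std::size_t Depth = S.Stack.size();
  bool Ok = Operand(S);
  S.Stack.resize(Depth);              // Drop whatever the operand pushed.
  S.Diags = Saved;                    // Models EndSpeculation.
  S.Stack.push_back(Ok ? 1 : 0);
}
```

A failing operand therefore yields a plain `0` on the stack and leaves
no diagnostics behind.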
Some function types are special to us, so add an enum and determine the
function kind once when creating the function, instead of looking at the
Decl every time we need the information.
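A small sketch of classifying once at construction time. The
enumerators and the `KindDeclSketch` fields are assumptions made for
illustration, not the actual enum added by this change:

```c++
#include <cassert>

// Hypothetical kinds; the real enum may differ.
enum class FunctionKindSketch { Normal, Ctor, Dtor, LambdaCallOperator };

// Hypothetical stand-in for the declaration being inspected.
struct KindDeclSketch {
  bool IsCtor = false;
  bool IsDtor = false;
  bool IsLambdaCallOperator = false;
};

class KindedFunction {
  FunctionKindSketch Kind;

  static FunctionKindSketch classify(const KindDeclSketch &D) {
    if (D.IsCtor)
      return FunctionKindSketch::Ctor;
    if (D.IsDtor)
      return FunctionKindSketch::Dtor;
    if (D.IsLambdaCallOperator)
      return FunctionKindSketch::LambdaCallOperator;
    return FunctionKindSketch::Normal;
  }

public:
  // The declaration is inspected exactly once, here.
  explicit KindedFunction(const KindDeclSketch &D) : Kind(classify(D)) {}

  FunctionKindSketch kind() const { return Kind; }
  bool isConstructor() const { return Kind == FunctionKindSketch::Ctor; }
  bool isDestructor() const { return Kind == FunctionKindSketch::Dtor; }
};
```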
FunctionDecl::getBuiltinID() is surprisingly slow and we tend to call it
quite a bit, especially when interpreting builtin functions. Caching the
BuiltinID here reduces the time I need to compile the
floating_comparison namespace from builtin-functions.cpp from 7.2s to
6.3s locally.
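The caching itself can be sketched as below. `SlowDeclSketch` is a
hypothetical stand-in whose `getBuiltinID()` merely counts how often it
is called; it does not model the real `FunctionDecl` lookup:

```c++
#include <cassert>

// Hypothetical declaration with an "expensive" builtin-ID lookup.
struct SlowDeclSketch {
  unsigned ID;
  mutable unsigned Lookups = 0;
  explicit SlowDeclSketch(unsigned ID) : ID(ID) {}
  unsigned getBuiltinID() const {
    ++Lookups; // Count every lookup so the effect of caching is visible.
    return ID;
  }
};

class CachingFunction {
  unsigned BuiltinID; // Looked up once, then reused.

public:
  explicit CachingFunction(const SlowDeclSketch &D)
      : BuiltinID(D.getBuiltinID()) {}

  unsigned getBuiltinID() const { return BuiltinID; }
  bool isBuiltin() const { return BuiltinID != 0; }
};
```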
I started out by adding a new pointer type for blocks, and I was fully
prepared to compile their AST to bytecode and later call them.
... then I found out that the current interpreter doesn't support
calling blocks at all. So we reuse `Function` to support sources other
than `FunctionDecl`s and classify `BlockPointerType` as `PT_FnPtr`.
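One way to picture a `Function` with more than one kind of source is
the sketch below. Every type here is a hypothetical stand-in, including
the two-pointer representation and the reduced `PrimTypeSketch` enum;
it is not how the real class stores its source:

```c++
#include <cassert>

// Hypothetical stand-ins for the two AST sources.
struct FunctionDeclSketch { unsigned NumParams; };
struct BlockExprSketch { unsigned NumParams; };

class SourceFunction {
  // Exactly one of these is non-null.
  const FunctionDeclSketch *FD = nullptr;
  const BlockExprSketch *BE = nullptr;

public:
  explicit SourceFunction(const FunctionDeclSketch *FD) : FD(FD) {}
  explicit SourceFunction(const BlockExprSketch *BE) : BE(BE) {}

  bool isBlock() const { return BE != nullptr; }
  unsigned numParams() const { return FD ? FD->NumParams : BE->NumParams; }
};

// Hypothetical, reduced type classification.
enum PrimTypeSketch { PT_Ptr, PT_FnPtr };
enum class TypeClassSketch { Pointer, FunctionPointer, BlockPointer };

inline PrimTypeSketch classify(TypeClassSketch T) {
  switch (T) {
  case TypeClassSketch::FunctionPointer:
  case TypeClassSketch::BlockPointer: // Blocks share the function-pointer path.
    return PT_FnPtr;
  case TypeClassSketch::Pointer:
    return PT_Ptr;
  }
  return PT_Ptr;
}
```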