This allows us to change the number of blocks stored according to the
size of BatchClass.
Also change the name `TransferBatch` to `Batch` given that it's never
the unit of transferring blocks.
Instead of explicitly disabling a feature by declaring the variable and
setting it to false, this change supports optional flags, i.e., you can
skip any flag you are not using.
Optional flags are supported in two forms:
1. Value: A parameter for a feature. E.g., EnableRandomOffset
2. Type: A C++ type implementing a feature. E.g., ConditionVariableT
The flags are accessed through one of the wrappers,
BaseConfig/PrimaryConfig/SecondaryConfig/CacheConfig
(CacheConfig is embedded in SecondaryConfig). These wrappers provide
getters for the value and the type. When adding a new feature, we
need to add it to `allocator_config.def` and mark the new variable with
either the *_REQUIRED_* or the *_OPTIONAL_* macro so that the accessor
is generated properly.
In addition, this removes the need for `UseConditionVariable` to turn
the condition variable on or off. Now we only need to define the type of
the condition variable.
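As an illustration of the two optional forms, here is a minimal sketch of
compile-time detection with a default fallback. It is not the generated
accessor code from `allocator_config.def`; `MinimalConfig`, `FullConfig`,
`FakeConditionVariable`, `ConditionVariableDummy`, `VoidT`,
`EnableRandomOffsetOf` and `ConditionVariableOf` are hypothetical names
used only for this example.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical configs: one skips the optional flags, one defines them.
struct MinimalConfig {};

struct FakeConditionVariable {};
struct FullConfig {
  static constexpr bool EnableRandomOffset = true; // optional value flag
  using ConditionVariableT = FakeConditionVariable; // optional type flag
};

// Hypothetical fallback type used when no condition variable is named.
struct ConditionVariableDummy {};

// Helper that maps any valid type to void (used for detection).
template <typename T> struct VoidT { using type = void; };

// Value form: defaults to false unless Config::EnableRandomOffset exists.
template <typename Config, typename = void> struct EnableRandomOffsetOf {
  static constexpr bool value = false;
};
template <typename Config>
struct EnableRandomOffsetOf<Config,
                            decltype((void)Config::EnableRandomOffset)> {
  static constexpr bool value = Config::EnableRandomOffset;
};

// Type form: defaults to the dummy unless Config::ConditionVariableT exists.
template <typename Config, typename = void> struct ConditionVariableOf {
  using type = ConditionVariableDummy;
};
template <typename Config>
struct ConditionVariableOf<
    Config, typename VoidT<typename Config::ConditionVariableT>::type> {
  using type = typename Config::ConditionVariableT;
};
```

With this shape, a config that omits a flag silently gets the default, and
defining only the condition variable type is enough to enable it.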
Reverts llvm/llvm-project#70390
A bug was caught by
`ScudoCombinedTestReallocateInPlaceStress_DefaultConfig.ReallocateInPlaceStress`
with GWP-ASan. It's an easy fix, but given that this is a major change, I
would like to revert it first.
Instead of always storing the same number of blocks as cached, we prefer
increasing the utilization by saving more blocks in a single
TransferBatch. This may slightly impact the performance, but it will
save a lot of memory used by BatchClassId (especially for larger
blocks).
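A rough sketch of the idea, assuming the capacity of a batch is derived
from the size of the block that holds it rather than from the number of
blocks cached. `BatchHeader`, `maxBlocksPerBatch` and the 32-bit
`CompactPtrT` are assumptions made for this example, not the layout used
in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed compact pointer representation for recorded blocks.
using CompactPtrT = uint32_t;

// Hypothetical bookkeeping stored at the start of each batch.
struct BatchHeader {
  void *Next;
  uint16_t Count;
};

// Number of block pointers that fit in one batch-class block once the
// header has been accounted for.
constexpr size_t maxBlocksPerBatch(size_t BatchClassBlockSize) {
  return BatchClassBlockSize <= sizeof(BatchHeader)
             ? 0
             : (BatchClassBlockSize - sizeof(BatchHeader)) /
                   sizeof(CompactPtrT);
}
```

A larger batch-class block then holds more pointers per batch, so fewer
batch-class blocks are needed to track the same number of free blocks.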
This change moves `TransferBatch` and `BatchGroup` out of
SizeClassAllocatorLocalCache. This allows a node in the freelist to
store more blocks instead of depending on the number of blocks cached.
That means we will be able to store more blocks in each freelist node
and therefore reduce the memory used by BatchClass (with little
performance overhead). Note that we haven't enabled that in this patch.
This is the first step of this transition.
This change is only in SizeClassAllocator32. SizeClassAllocator64 has
it implemented.
Reviewed By: cferris
Differential Revision: https://reviews.llvm.org/D158455
This change is only in SizeClassAllocator32. SizeClassAllocator64 has
it implemented.
Reviewed By: cferris
Differential Revision: https://reviews.llvm.org/D158456
This is only applied to SizeClassAllocator64, which has a single region.
In SizeClassAllocator32, the region size has to be equal to the group
size.
Differential Revision: https://reviews.llvm.org/D156740
This gives a hint of potential bytes to release. Also remove the RSS
which is not supported yet. Will add it back when it's available.
Reviewed By: cferris
Differential Revision: https://reviews.llvm.org/D154551
This CL removes the restriction that pushing blocks into BatchClassId
can only be done when freelist is not empty. Without this constraint,
BatchClassId is also available for gathering blocks into groups.
Reviewed By: cferris
Differential Revision: https://reviews.llvm.org/D153492
When all the blocks (local caches are included) are freed, the size of
free blocks should be equal to `AllocatedUser`.
Reviewed By: cferris
Differential Revision: https://reviews.llvm.org/D152769
Create a new BlocksInfo to contain a list of blocks, poppedBlocks and
pushedBlocks. This is in preparation for adding a new lock for
operations on the freelist.
Differential Revision: https://reviews.llvm.org/D149143
To use a custom allocator configuration, you only need to put the
configuration in custom_scudo_config.h and define two required aliases;
you will then be switched to the customized config and the tests will
also run with your configuration.
In this CL, we also slightly refactor the structure of the
configuration. The essential fields are now placed under the associated
hierarchy, which makes defining a new configuration easier.
Reviewed By: cferris
Differential Revision: https://reviews.llvm.org/D150481
Releasing pages for large blocks (size greater than a page) is faster
than for small blocks. Besides, large blocks are not expected to be used
as often as small blocks, which means we may hold several pages used
by large blocks and rarely get a chance to release them if there's no
explicit M_PURGE call. Therefore, relax the release-interval condition
for large blocks.
This also fixes the assumption that FORCE_ALL should always try page
release.
Differential Revision: https://reviews.llvm.org/D151290
PageMap is allocated with MAP_ALLOWNOMEM if there's no static buffer
left, so the allocation can fail and return nullptr without triggering
any assertion. Instead of crashing in the middle of releaseToOSMaybe,
just return and let the program handle the page failure.
Reviewed By: cferris
Differential Revision: https://reviews.llvm.org/D151379
In primary32, an unused region has a max/min region index of 0, which
is an invalid index. Skip releaseToOSMaybe in both primary32 and
primary64, even if it's M_PURGE_ALL.
Differential Revision: https://reviews.llvm.org/D150243
Tracking the pushed bytes between two releaseToOSMaybe calls may lead to
an overestimate: if we do malloc 2KB -> free 2KB -> malloc 2KB
-> free 2KB, we may think we have released 4KB when only 2KB is
actually released. Switching to bytes-in-freelist excludes more cases in
which the pages can't be released.
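The malloc/free sequence above can be replayed with a toy counter model.
This is only a sketch: `FreeListStats` and `runMallocFreeTwice` are
hypothetical names, and the model assumes a freed block is reused by the
next allocation of the same size.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of the two accounting schemes for a single size class.
struct FreeListStats {
  uint64_t BytesInFreeList = 0; // bytes currently sitting in the freelist
  uint64_t PushedBytes = 0;     // bytes pushed since the last release attempt

  // An allocation reuses a free block when one is available; otherwise it
  // is served from fresh memory and the freelist is untouched.
  void allocate(uint64_t Size) {
    if (BytesInFreeList >= Size)
      BytesInFreeList -= Size;
  }
  // A deallocation pushes the block back to the freelist.
  void deallocate(uint64_t Size) {
    BytesInFreeList += Size;
    PushedBytes += Size;
  }
};

// malloc 2KB -> free 2KB -> malloc 2KB -> free 2KB
inline FreeListStats runMallocFreeTwice() {
  FreeListStats Stats;
  const uint64_t Size = 2048;
  Stats.allocate(Size);
  Stats.deallocate(Size);
  Stats.allocate(Size);
  Stats.deallocate(Size);
  return Stats;
}
```

In this model the pushed-bytes counter ends at 4096 while the freelist
only holds 2048 bytes, which is the overestimate described above.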
Reviewed By: cferris
Differential Revision: https://reviews.llvm.org/D146400
The force flag to releaseToOSMaybe does not release everything
since it is an expensive operation. Modify the release flag to
have three states: normal, force, forceall. Force behaves the same
as setting Force to true from before this change. Forceall will
release everything regardless of how much time it takes, or
how much there is to release.
In addition, add a new mallopt that will call the release function
with the forceall flag set.
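A minimal sketch of how a three-state flag could gate a release attempt.
The enum mirrors the three states named above; `ReleaseState`,
`shouldRelease`, `MinBytesToRelease` and `ReleaseIntervalNs` are
hypothetical, and the thresholds are placeholders rather than the values
used by the allocator.

```cpp
#include <cassert>
#include <cstdint>

enum class ReleaseToOS { Normal, Force, ForceAll };

// Hypothetical inputs to the release decision.
struct ReleaseState {
  uint64_t BytesInFreeList;
  uint64_t NsSinceLastRelease;
};

// Assumed thresholds, for illustration only.
constexpr uint64_t MinBytesToRelease = 64 * 1024;
constexpr uint64_t ReleaseIntervalNs = 1000000000ull;

inline bool shouldRelease(const ReleaseState &S, ReleaseToOS Type) {
  if (S.BytesInFreeList == 0)
    return false; // nothing could be released at all
  if (Type == ReleaseToOS::ForceAll)
    return true; // ignore both the size and the interval checks
  if (S.BytesInFreeList < MinBytesToRelease)
    return false; // too little to be worth the cost
  if (Type == ReleaseToOS::Force)
    return true; // skip only the interval check
  return S.NsSinceLastRelease >= ReleaseIntervalNs;
}
```

The point of the third state is visible in the second check: only
ForceAll bypasses the "is it worth it" heuristics.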
Reviewed By: Chia-hungDuan
Differential Revision: https://reviews.llvm.org/D146106
On Android, the _COARSE version of clock_gettime is about twice as fast.
Therefore, add a getMonotonicTimeFast function that is used in the
releaseToOSMaybe functions.
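A sketch of what such a function could look like on a POSIX system,
assuming the coarse clock is selected when `CLOCK_MONOTONIC_COARSE` is
defined and `CLOCK_MONOTONIC` is used otherwise. The body is an
illustration, not the code from this change.

```cpp
#include <cassert>
#include <cstdint>
#include <time.h>

// Returns a monotonic timestamp in nanoseconds. The coarse clock trades
// resolution for a cheaper call, which is acceptable when the value is
// only compared against a release interval.
inline uint64_t getMonotonicTimeFast() {
#if defined(CLOCK_MONOTONIC_COARSE)
  const clockid_t Clock = CLOCK_MONOTONIC_COARSE;
#else
  const clockid_t Clock = CLOCK_MONOTONIC;
#endif
  timespec TS;
  clock_gettime(Clock, &TS);
  return static_cast<uint64_t>(TS.tv_sec) * 1000000000ull +
         static_cast<uint64_t>(TS.tv_nsec);
}
```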
Reviewed By: Chia-hungDuan
Differential Revision: https://reviews.llvm.org/D145636
Instead of going through all those trailing blocks, just count the
number and increase the counter at once.
Reviewed By: cferris
Differential Revision: https://reviews.llvm.org/D145419
With memory group, we always mark the free blocks from the same region.
Therefore, we don't need to calculate the offset from base and determine
the region index. Also improve the way we deal with the last block in
the region so that the loop body is simpler.
Reviewed By: cferris
Differential Revision: https://reviews.llvm.org/D143303
This alignment guarantee enables a simpler group range check during page
releasing and a potential optimization: now that all the pointers
from the same group are also in the same region, the complexity
of markFreeBlocks() can be reduced as well.
Reviewed By: cferris
Differential Revision: https://reviews.llvm.org/D142931
Fixed the bug in merging BatchGroups back to the FreeList. Added DCHECKs
to ensure the order of BatchGroups.
This reverts commit 387452ec591c81def6d8869b23c2ab2f1c56f999.
Reviewed By: cferris
Differential Revision: https://reviews.llvm.org/D144920
This reduces the size of PageMap and we are more likely to use the
static local buffer. Note that now this is only supported for single
region case, i.e. on SizeClassAllocator64. For SizeClassAllocator32,
it needs a different way to save the PageMap.
Differential Revision: https://reviews.llvm.org/D142659
It is preferable to use a `std::shared_mutex`-style mutex. Will switch to
using it when it's available.
Differential Revision: https://reviews.llvm.org/D144691
When all the blocks in the group are known to be used, we should just
mark the pages in the range as all counted instead of visiting each of
them. This will reduce the time of marking free blocks especially for
smaller size class.
Reviewed By: cferris
Differential Revision: https://reviews.llvm.org/D141958
While populating new blocks, we didn't always put them into their own
groups because that requires an additional sort of an almost-sorted
array of new blocks. However, ensuring that all blocks are placed in the
right group makes it possible to quickly identify unused pages in a
group by simply counting its free blocks. Therefore, this commit sets up
the invariant for future optimizations.
Differential Revision: https://reviews.llvm.org/D141957
BatchClass is used to manage the free blocks for each size class. It's a
little bit tricky when it has to manage the free blocks of BatchClass
itself. In general, a BatchClass block records the addresses of free
blocks. In order not to waste an additional block to record the blocks
in BatchClass, it's self-contained, i.e., it records its own address.
Safety is maintained by 2 preconditions:
1. If a block is used to record other BatchClass blocks, it also
records its own address
2. While allocating free blocks, all the recorded blocks are
allocated together, which means there's no partial allocation
This CL fixes the violation of 1., and then we can push the free blocks
without having to push them in batches.
Differential Revision: https://reviews.llvm.org/D141956
The implementations of those functions require the rounding target to be
a power of two. It's better to add a debugging check to avoid misuse.
Besides, add a general version of those three to accommodate
non-power-of-two cases.
Also change the names to roundUp/roundDown/isAligned.
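For reference, a sketch of the power-of-two helpers with a debug check,
plus general variants. The `Slow` suffix on the general versions and the
use of plain `assert` as the debugging check are assumptions made for
this example.

```cpp
#include <cassert>
#include <cstdint>

using uptr = uintptr_t;

constexpr bool isPowerOfTwo(uptr X) { return X != 0 && (X & (X - 1)) == 0; }

// Fast versions: bit masking is only correct for power-of-two boundaries,
// so misuse is caught by the check in debug builds.
inline uptr roundUp(uptr X, uptr Boundary) {
  assert(isPowerOfTwo(Boundary));
  return (X + Boundary - 1) & ~(Boundary - 1);
}
inline uptr roundDown(uptr X, uptr Boundary) {
  assert(isPowerOfTwo(Boundary));
  return X & ~(Boundary - 1);
}
inline bool isAligned(uptr X, uptr Alignment) {
  assert(isPowerOfTwo(Alignment));
  return (X & (Alignment - 1)) == 0;
}

// General versions: use division, so any non-zero boundary works.
inline uptr roundUpSlow(uptr X, uptr Boundary) {
  return ((X + Boundary - 1) / Boundary) * Boundary;
}
inline uptr roundDownSlow(uptr X, uptr Boundary) {
  return (X / Boundary) * Boundary;
}
inline bool isAlignedSlow(uptr X, uptr Alignment) {
  return X % Alignment == 0;
}
```

For example, `roundUp(5, 8)` yields 8, while a boundary of 6 has to go
through `roundUpSlow(10, 6)`, which yields 12.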
Reviewed By: cferris, cryptoad
Differential Revision: https://reviews.llvm.org/D142658