Vulkan Memory Allocator
User guide

Quick start

In your project code:

  1. Include "vk_mem_alloc.h" file wherever you want to use the library.
  2. In exactly one C++ file, define the following macro before the include to build the library implementation.
#define VMA_IMPLEMENTATION
#include "vk_mem_alloc.h"

At program startup:

  1. Initialize Vulkan to obtain VkPhysicalDevice and VkDevice objects.
  2. Fill VmaAllocatorCreateInfo structure and create VmaAllocator object by calling vmaCreateAllocator().
VmaAllocatorCreateInfo allocatorInfo = {};
allocatorInfo.physicalDevice = physicalDevice;
allocatorInfo.device = device;

VmaAllocator allocator;
vmaCreateAllocator(&allocatorInfo, &allocator);

When you want to create a buffer or image:

  1. Fill VkBufferCreateInfo / VkImageCreateInfo structure.
  2. Fill VmaAllocationCreateInfo structure.
  3. Call vmaCreateBuffer() / vmaCreateImage() to get VkBuffer/VkImage with memory already allocated and bound to it.
VkBufferCreateInfo bufferInfo = { VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO };
bufferInfo.size = 65536;
bufferInfo.usage = VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT;

VmaAllocationCreateInfo allocInfo = {};
allocInfo.usage = VMA_MEMORY_USAGE_GPU_ONLY;

VkBuffer buffer;
VmaAllocation allocation;
vmaCreateBuffer(allocator, &bufferInfo, &allocInfo, &buffer, &allocation, nullptr);

Don't forget to destroy your objects when no longer needed:

vmaDestroyBuffer(allocator, buffer, allocation);
vmaDestroyAllocator(allocator);

Persistently mapped memory

If you need to map memory on the host, it may happen that two allocations are assigned to the same VkDeviceMemory block. Mapping them both at the same time would then cause an error, because mapping a single memory block multiple times is illegal in Vulkan.

It is safer, more convenient and more efficient to use a special feature designed for that: persistently mapped memory. Allocations made with the VMA_ALLOCATION_CREATE_PERSISTENT_MAP_BIT flag set in VmaAllocationCreateInfo::flags are returned from device memory blocks that stay mapped all the time, so you can simply access the CPU pointer. The VmaAllocationInfo::pMappedData pointer is already offset to the beginning of the particular allocation. Example:

VkBufferCreateInfo bufCreateInfo = { VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO };
bufCreateInfo.size = 1024;
bufCreateInfo.usage = VK_BUFFER_USAGE_TRANSFER_SRC_BIT;

VmaAllocationCreateInfo allocCreateInfo = {};
allocCreateInfo.usage = VMA_MEMORY_USAGE_CPU_ONLY;
allocCreateInfo.flags = VMA_ALLOCATION_CREATE_PERSISTENT_MAP_BIT;

VkBuffer buf;
VmaAllocation alloc;
VmaAllocationInfo allocInfo;
vmaCreateBuffer(allocator, &bufCreateInfo, &allocCreateInfo, &buf, &alloc, &allocInfo);

// Buffer is immediately mapped. You can access its memory.
memcpy(allocInfo.pMappedData, myData, 1024);

Memory in Vulkan doesn't need to be unmapped before using it e.g. for transfers, but if you are not sure whether it's HOST_COHERENT (here it surely is, because it's created with VMA_MEMORY_USAGE_CPU_ONLY), you should check it. If it's not, you should call vkInvalidateMappedMemoryRanges() before reading from and vkFlushMappedMemoryRanges() after writing to mapped memory on the CPU. Example:

VkMemoryPropertyFlags memFlags;
vmaGetMemoryTypeProperties(allocator, allocInfo.memoryType, &memFlags);
if((memFlags & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT) == 0)
{
    VkMappedMemoryRange memRange = { VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE };
    memRange.memory = allocInfo.deviceMemory;
    memRange.offset = allocInfo.offset;
    memRange.size   = allocInfo.size;
    vkFlushMappedMemoryRanges(device, 1, &memRange);
}

On AMD GPUs on Windows, Vulkan memory from a type that has both the DEVICE_LOCAL and HOST_VISIBLE flags should not be mapped during any call to vkQueueSubmit() or vkQueuePresent(). Although legal, that would cause performance degradation because WDDM migrates such memory to system RAM. To ensure this, you can unmap all persistently mapped memory with just one function call. For details, see the functions vmaUnmapPersistentlyMappedMemory() and vmaMapPersistentlyMappedMemory().
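
A minimal sketch of that pattern (assuming both functions take only the VmaAllocator; the queue and submitInfo variables are hypothetical):

// Unmap all persistently mapped memory for the duration of the submit.
vmaUnmapPersistentlyMappedMemory(allocator);

vkQueueSubmit(queue, 1, &submitInfo, VK_NULL_HANDLE);

// Map it again. VmaAllocationInfo::pMappedData pointers may have changed, so
// query them again with vmaGetAllocationInfo() before accessing the memory.
vmaMapPersistentlyMappedMemory(allocator);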

Custom memory pools

The library automatically creates and manages a default memory pool for each memory type available on the device. A pool contains a number of VkDeviceMemory blocks. You can also create a custom pool and allocate memory out of it, which can be useful e.g. when you want to keep a certain kind of allocations separate from the others or control the amount of memory they use.

To use custom memory pools:

  1. Fill VmaPoolCreateInfo structure.
  2. Call vmaCreatePool() to obtain VmaPool handle.
  3. When making an allocation, set VmaAllocationCreateInfo::pool to this handle. You don't need to specify any other parameters of this structure, like usage.

Example:

// Create a pool that could have at most 2 blocks, 128 MB each.
VmaPoolCreateInfo poolCreateInfo = {};
poolCreateInfo.memoryTypeIndex = ...
poolCreateInfo.blockSize = 128ull * 1024 * 1024;
poolCreateInfo.maxBlockCount = 2;

VmaPool pool;
vmaCreatePool(allocator, &poolCreateInfo, &pool);

// Allocate a buffer out of it.
VkBufferCreateInfo bufCreateInfo = { VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO };
bufCreateInfo.size = 1024;
bufCreateInfo.usage = VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT;

VmaAllocationCreateInfo allocCreateInfo = {};
allocCreateInfo.pool = pool;

VkBuffer buf;
VmaAllocation alloc;
VmaAllocationInfo allocInfo;
vmaCreateBuffer(allocator, &bufCreateInfo, &allocCreateInfo, &buf, &alloc, &allocInfo);

You have to free all allocations made from this pool before destroying it.

vmaDestroyBuffer(allocator, buf, alloc);
vmaDestroyPool(allocator, pool);

Defragmentation

Interleaved allocations and deallocations of many objects of varying size can cause fragmentation, which can lead to a situation where the library is unable to find a contiguous range of free memory for a new allocation even though there is enough free space, just scattered across many small free ranges between existing allocations.

To mitigate this problem, you can use vmaDefragment(). Given a set of allocations, this function can move them to compact used memory, ensure more contiguous free space and possibly also free some VkDeviceMemory. It works only on allocations made from a memory type that is HOST_VISIBLE. Allocations are modified to point to the new VkDeviceMemory and offset. Data in this memory is also memmove-ed to the new place. However, if you have images or buffers bound to these allocations (and you certainly do), you need to destroy them, recreate them, and bind them to the new place in memory.

For further details and example code, see documentation of function vmaDefragment().
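
As a rough outline, a call might look like the following (a minimal sketch, assuming vmaDefragment() takes an array of allocations and can report which of them were moved; allocations and ALLOC_COUNT are hypothetical):

// Allocations we allow the library to move.
VmaAllocation allocations[ALLOC_COUNT] = { /* ... */ };
VkBool32 allocationsChanged[ALLOC_COUNT] = {};

VmaDefragmentationStats stats = {};
vmaDefragment(allocator, allocations, ALLOC_COUNT, allocationsChanged,
    nullptr, // default defragmentation limits
    &stats);

for(size_t i = 0; i < ALLOC_COUNT; ++i)
{
    if(allocationsChanged[i])
    {
        // The allocation was moved: destroy the old buffer/image, create a new one,
        // query the new VkDeviceMemory and offset with vmaGetAllocationInfo(),
        // and bind the new buffer/image there.
    }
}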

Lost allocations

If your game oversubscribes video memory, it may still work OK in previous-generation graphics APIs (DirectX 9, 10, 11, OpenGL) because resources are automatically paged to system RAM. In Vulkan you can't rely on that: when you run out of memory, an allocation just fails. If you have more data (e.g. textures) than can fit into VRAM and you don't need it all at once, you may want to upload it to the GPU on demand and "push out" data that hasn't been used for a long time to make room for new data, effectively using VRAM (or a certain memory pool) as a form of cache. Vulkan Memory Allocator can help you with that by supporting the concept of "lost allocations".

To create an allocation that can become lost, include the VMA_ALLOCATION_CREATE_CAN_BECOME_LOST_BIT flag in VmaAllocationCreateInfo::flags. Before using a buffer or image bound to such an allocation in every new frame, you need to check whether it has been lost: call vmaGetAllocationInfo() and see if VmaAllocationInfo::deviceMemory is not VK_NULL_HANDLE. If the allocation is lost, you should not use it or the buffer/image bound to it, but you still must destroy this allocation and this buffer/image.

To create an allocation that can make some other allocations lost in order to make room for it, use the VMA_ALLOCATION_CREATE_CAN_MAKE_OTHER_LOST_BIT flag. You will usually use the flags VMA_ALLOCATION_CREATE_CAN_MAKE_OTHER_LOST_BIT and VMA_ALLOCATION_CREATE_CAN_BECOME_LOST_BIT at the same time.

Warning! The current implementation uses a quite naive, brute-force algorithm, which can make allocation calls that use the VMA_ALLOCATION_CREATE_CAN_MAKE_OTHER_LOST_BIT flag quite slow. A more optimal algorithm and data structure to speed this up are planned for the future.

When interleaving creation of new allocations with usage of existing ones, how do you make sure that an allocation won't become lost while it's used in the current frame?

It is ensured because vmaGetAllocationInfo() not only returns allocation parameters and checks whether the allocation is lost, but when it's not, it also atomically marks it as used in the current frame, which makes it impossible for the allocation to become lost in that frame. It uses a lockless algorithm, so it works fast and doesn't involve locking any internal mutex.

What if my allocation may still be in use by the GPU, which is rendering a previous frame, while I already submit a new frame on the CPU?

You can make sure that allocations "touched" by vmaGetAllocationInfo() will not become lost for a number of additional frames back from the current one by specifying this number as VmaAllocatorCreateInfo::frameInUseCount (for the default memory pools) and VmaPoolCreateInfo::frameInUseCount (for a custom pool).

How do you inform the library when a new frame starts?

You need to call the function vmaSetCurrentFrameIndex().
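
A minimal sketch of this bookkeeping (frameIndex is a hypothetical counter maintained by your engine):

// At allocator creation - allocations touched in the current frame also stay
// safe for 1 additional frame back.
VmaAllocatorCreateInfo allocatorInfo = {};
allocatorInfo.physicalDevice = physicalDevice;
allocatorInfo.device = device;
allocatorInfo.frameInUseCount = 1;

VmaAllocator allocator;
vmaCreateAllocator(&allocatorInfo, &allocator);

// At the beginning of every frame:
vmaSetCurrentFrameIndex(allocator, frameIndex);
// ... check/use allocations via vmaGetAllocationInfo(), record and submit commands ...
++frameIndex;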

Example code:

struct MyBuffer
{
    VkBuffer m_Buf = VK_NULL_HANDLE;
    VmaAllocation m_Alloc = VK_NULL_HANDLE;

    // Called when the buffer is really needed in the current frame.
    void EnsureBuffer();
};

void MyBuffer::EnsureBuffer()
{
    // Buffer has been created.
    if(m_Buf != VK_NULL_HANDLE)
    {
        // Check if its allocation is not lost + mark it as used in current frame.
        VmaAllocationInfo allocInfo;
        vmaGetAllocationInfo(allocator, m_Alloc, &allocInfo);
        if(allocInfo.deviceMemory != VK_NULL_HANDLE)
        {
            // It's all OK - safe to use m_Buf.
            return;
        }
    }

    // Buffer doesn't exist yet or is lost - destroy and recreate it.

    vmaDestroyBuffer(allocator, m_Buf, m_Alloc);

    VkBufferCreateInfo bufCreateInfo = { VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO };
    bufCreateInfo.size = 1024;
    bufCreateInfo.usage = VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT;

    VmaAllocationCreateInfo allocCreateInfo = {};
    allocCreateInfo.usage = VMA_MEMORY_USAGE_GPU_ONLY;
    allocCreateInfo.flags = VMA_ALLOCATION_CREATE_CAN_BECOME_LOST_BIT |
        VMA_ALLOCATION_CREATE_CAN_MAKE_OTHER_LOST_BIT;

    vmaCreateBuffer(allocator, &bufCreateInfo, &allocCreateInfo, &m_Buf, &m_Alloc, nullptr);
}

When using lost allocations, you may see some Vulkan validation layer warnings about overlapping regions of memory bound to different kinds of buffers and images. This is still valid as long as you implement proper handling of lost allocations (like in the example above) and don't use the ones that are lost.

The library uses the following algorithm for allocation, in order:

  1. Try to find a free range of memory in existing blocks.
  2. If that fails, try to create a new block of VkDeviceMemory with the preferred block size.
  3. If that fails, try to create such a block with size/2 and size/4.
  4. If that fails and the VMA_ALLOCATION_CREATE_CAN_MAKE_OTHER_LOST_BIT flag was specified, try to find space in existing blocks, possibly making some other allocations lost.
  5. If that fails, try to allocate a separate VkDeviceMemory for this allocation, just like when you use VMA_ALLOCATION_CREATE_DEDICATED_MEMORY_BIT.
  6. If that fails, choose another memory type that meets the requirements specified in VmaAllocationCreateInfo and go to point 1.
  7. If that fails, return VK_ERROR_OUT_OF_DEVICE_MEMORY.