Vulkan Memory Allocator

Version 2.0.0-alpha.3 (2017-09-12)

Source repository: VulkanMemoryAllocator project on GitHub
Product page: Vulkan Memory Allocator on GPUOpen

Documentation of members grouped: Modules
Documentation of all members: vk_mem_alloc.h

User guide

Quick start

In your project code:

  1. Include the "vk_mem_alloc.h" file wherever you want to use the library.
  2. In exactly one C++ file, define the following macro before the include to build the library implementation.
#define VMA_IMPLEMENTATION
#include "vk_mem_alloc.h"

At program startup:

  1. Initialize Vulkan to have VkPhysicalDevice and VkDevice objects.
  2. Fill VmaAllocatorCreateInfo structure and create VmaAllocator object by calling vmaCreateAllocator().
VmaAllocatorCreateInfo allocatorInfo = {};
allocatorInfo.physicalDevice = physicalDevice;
allocatorInfo.device = device;

VmaAllocator allocator;
vmaCreateAllocator(&allocatorInfo, &allocator);

When you want to create a buffer or image:

  1. Fill VkBufferCreateInfo / VkImageCreateInfo structure.
  2. Fill VmaAllocationCreateInfo structure.
  3. Call vmaCreateBuffer() / vmaCreateImage() to get VkBuffer/VkImage with memory already allocated and bound to it.
VkBufferCreateInfo bufferInfo = { VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO };
bufferInfo.size = 65536;
bufferInfo.usage = VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT;

VmaAllocationCreateInfo allocInfo = {};
allocInfo.usage = VMA_MEMORY_USAGE_GPU_ONLY;

VkBuffer buffer;
VmaAllocation allocation;
vmaCreateBuffer(allocator, &bufferInfo, &allocInfo, &buffer, &allocation, nullptr);

Don't forget to destroy your objects when no longer needed:

vmaDestroyBuffer(allocator, buffer, allocation);
vmaDestroyAllocator(allocator);

Persistently mapped memory

If you need to map memory on the host, it may happen that two allocations are assigned to the same VkDeviceMemory block, so if you map them both at the same time, it will cause an error because mapping a single memory block multiple times is illegal in Vulkan.

It is safer, more convenient and more efficient to use the special feature designed for that: persistently mapped memory. Allocations made with the VMA_ALLOCATION_CREATE_PERSISTENT_MAP_BIT flag set in VmaAllocationCreateInfo::flags are returned from device memory blocks that stay mapped all the time, so you can just access the CPU pointer. The VmaAllocationInfo::pMappedData pointer is already offset to the beginning of the particular allocation. Example:

VkBufferCreateInfo bufCreateInfo = { VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO };
bufCreateInfo.size = 1024;
bufCreateInfo.usage = VK_BUFFER_USAGE_TRANSFER_SRC_BIT;

VmaAllocationCreateInfo allocCreateInfo = {};
allocCreateInfo.usage = VMA_MEMORY_USAGE_CPU_ONLY;
allocCreateInfo.flags = VMA_ALLOCATION_CREATE_PERSISTENT_MAP_BIT;

VkBuffer buf;
VmaAllocation alloc;
VmaAllocationInfo allocInfo;
vmaCreateBuffer(allocator, &bufCreateInfo, &allocCreateInfo, &buf, &alloc, &allocInfo);

// Buffer is immediately mapped. You can access its memory.
memcpy(allocInfo.pMappedData, myData, 1024);

Memory in Vulkan doesn't need to be unmapped before using it e.g. for transfers, but if you are not sure whether it's HOST_COHERENT (here it surely is, because it's created with VMA_MEMORY_USAGE_CPU_ONLY), you should check it. If it's not, you should call vkInvalidateMappedMemoryRanges() before reading from and vkFlushMappedMemoryRanges() after writing to mapped memory on the CPU. Example:

VkMemoryPropertyFlags memFlags;
vmaGetMemoryTypeProperties(allocator, allocInfo.memoryType, &memFlags);
if((memFlags & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT) == 0)
{
    VkMappedMemoryRange memRange = { VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE };
    memRange.memory = allocInfo.deviceMemory;
    memRange.offset = allocInfo.offset;
    memRange.size   = allocInfo.size;
    vkFlushMappedMemoryRanges(device, 1, &memRange);
}

On AMD GPUs on Windows, Vulkan memory from the type that has both DEVICE_LOCAL and HOST_VISIBLE flags should not stay mapped for the duration of any call to vkQueueSubmit() or vkQueuePresent(). Although legal, that would cause performance degradation because WDDM migrates such memory to system RAM. To ensure this, you can unmap all persistently mapped memory with just one function call. For details, see functions vmaUnmapPersistentlyMappedMemory() and vmaMapPersistentlyMappedMemory().
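
For example, around queue submission the calls could look like the following sketch (queue, submitInfo and fence are assumed to come from your own rendering code):

// Unmap all persistently mapped memory for the duration of the submit.
vmaUnmapPersistentlyMappedMemory(allocator);

vkQueueSubmit(queue, 1, &submitInfo, fence);

// Map it back so the CPU can keep writing to it afterwards.
vmaMapPersistentlyMappedMemory(allocator);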

Custom memory pools

The library automatically creates and manages a default memory pool for each memory type available on the device. A pool contains a number of VkDeviceMemory blocks. You can also create custom pools and allocate memory out of them. This can be useful e.g. if you want to keep certain kinds of allocations separate from others, enforce a particular size of the VkDeviceMemory blocks they use, or limit the maximum amount of memory allocated for them.

To use custom memory pools:

  1. Fill VmaPoolCreateInfo structure.
  2. Call vmaCreatePool() to obtain VmaPool handle.
  3. When making an allocation, set VmaAllocationCreateInfo::pool to this handle. You don't need to specify any other parameters of this structure, like usage.

Example:

// Create a pool that could have at most 2 blocks, 128 MB each.
VmaPoolCreateInfo poolCreateInfo = {};
poolCreateInfo.memoryTypeIndex = ...
poolCreateInfo.blockSize = 128ull * 1024 * 1024;
poolCreateInfo.maxBlockCount = 2;

VmaPool pool;
vmaCreatePool(allocator, &poolCreateInfo, &pool);

// Allocate a buffer out of it.
VkBufferCreateInfo bufCreateInfo = { VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO };
bufCreateInfo.size = 1024;
bufCreateInfo.usage = VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT;

VmaAllocationCreateInfo allocCreateInfo = {};
allocCreateInfo.pool = pool;

VkBuffer buf;
VmaAllocation alloc;
VmaAllocationInfo allocInfo;
vmaCreateBuffer(allocator, &bufCreateInfo, &allocCreateInfo, &buf, &alloc, &allocInfo);

You have to free all allocations made from this pool before destroying it.

vmaDestroyBuffer(allocator, buf, alloc);
vmaDestroyPool(allocator, pool);

Defragmentation

Interleaved allocations and deallocations of many objects of varying size can cause fragmentation, which can lead to a situation where the library is unable to find a continuous range of free memory for a new allocation, even though there is enough free space overall, just scattered across many small free ranges between existing allocations.

To mitigate this problem, you can use vmaDefragment(). Given a set of allocations, this function can move them to compact used memory, ensure more continuous free space and possibly also free some VkDeviceMemory. It works only on allocations made from memory types that are HOST_VISIBLE. Allocations are modified to point to the new VkDeviceMemory and offset. Data in this memory is also memmove-ed to the new place. However, if you have images or buffers bound to these allocations (and you certainly do), you need to destroy them, recreate them, and bind them to the new place in memory.

For further details and example code, see documentation of function vmaDefragment().
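
As a rough sketch of how a call might look (this follows the declaration of vmaDefragment() in the header; allocations and allocCount are assumed to describe the HOST_VISIBLE allocations you allow the library to move):

std::vector<VkBool32> allocationsChanged(allocCount);
VmaDefragmentationStats stats = {};
vmaDefragment(
    allocator,
    allocations, allocCount,   // allocations we allow to be moved
    allocationsChanged.data(), // set to VK_TRUE for each allocation that was moved
    nullptr,                   // default defragmentation parameters
    &stats);

// For every allocation where allocationsChanged[i] == VK_TRUE: destroy the old
// buffer/image, create a new one and bind it to the allocation's new memory and offset.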

Lost allocations

If your game oversubscribes video memory, it may work OK in previous-generation graphics APIs (DirectX 9, 10, 11, OpenGL) because resources are automatically paged to system RAM. In Vulkan you can't do that, because when you run out of memory, an allocation just fails. If you have more data (e.g. textures) than can fit into VRAM and you don't need it all at once, you may want to upload it to the GPU on demand and "push out" data that hasn't been used for a long time to make room for new data, effectively using VRAM (or a certain memory pool) as a form of cache. Vulkan Memory Allocator can help you with that by supporting a concept of "lost allocations".

To create an allocation that can become lost, include the VMA_ALLOCATION_CREATE_CAN_BECOME_LOST_BIT flag in VmaAllocationCreateInfo::flags. Before using a buffer or image bound to such an allocation in every new frame, you need to check whether it has been lost. To check it: call vmaGetAllocationInfo() and see if VmaAllocationInfo::deviceMemory is not VK_NULL_HANDLE. If the allocation is lost, you should not use it or the buffer/image bound to it. You must not forget to destroy this allocation and this buffer/image.

To create an allocation that can make some other allocations lost to make room for it, use VMA_ALLOCATION_CREATE_CAN_MAKE_OTHER_LOST_BIT flag. You will usually use both flags VMA_ALLOCATION_CREATE_CAN_MAKE_OTHER_LOST_BIT and VMA_ALLOCATION_CREATE_CAN_BECOME_LOST_BIT at the same time.

Warning! Current implementation uses quite naive, brute force algorithm, which can make allocation calls that use VMA_ALLOCATION_CREATE_CAN_MAKE_OTHER_LOST_BIT flag quite slow. A new, more optimal algorithm and data structure to speed this up is planned for the future.

When interleaving creation of new allocations with usage of existing ones, how do you make sure that an allocation won't become lost while it's used in the current frame?

It is ensured because vmaGetAllocationInfo() not only returns allocation parameters and checks whether the allocation is lost, but, when it's not, it also atomically marks it as used in the current frame, which makes it impossible for it to become lost in that frame. It uses a lockless algorithm, so it works fast and doesn't involve locking any internal mutex.

What if my allocation may still be in use by the GPU when it's rendering a previous frame while I already submit new frame on the CPU?

You can make sure that allocations "touched" by vmaGetAllocationInfo() will not become lost for a number of additional frames back from the current one by specifying this number as VmaAllocatorCreateInfo::frameInUseCount (for default memory pools) and VmaPoolCreateInfo::frameInUseCount (for custom pools).

How do you inform the library when a new frame starts?

You need to call function vmaSetCurrentFrameIndex().
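
For example, with one extra frame potentially still in flight on the GPU, the setup and the per-frame call could look like this sketch (allocatorInfo and frameIndex come from your own code):

// At allocator creation time: allocations touched in the current frame also
// stay valid for 1 previous frame that the GPU may still be executing.
allocatorInfo.frameInUseCount = 1;

// At the beginning of every frame:
vmaSetCurrentFrameIndex(allocator, frameIndex);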

Example code for a buffer whose allocation can become lost:

struct MyBuffer
{
    VkBuffer m_Buf = VK_NULL_HANDLE;
    VmaAllocation m_Alloc = VK_NULL_HANDLE;

    // Called when the buffer is really needed in the current frame.
    void EnsureBuffer();
};

void MyBuffer::EnsureBuffer()
{
    // Buffer has been created.
    if(m_Buf != VK_NULL_HANDLE)
    {
        // Check if its allocation is not lost + mark it as used in current frame.
        VmaAllocationInfo allocInfo;
        vmaGetAllocationInfo(allocator, m_Alloc, &allocInfo);
        if(allocInfo.deviceMemory != VK_NULL_HANDLE)
        {
            // It's all OK - safe to use m_Buf.
            return;
        }
    }

    // Buffer doesn't exist yet or is lost - destroy and recreate it.

    vmaDestroyBuffer(allocator, m_Buf, m_Alloc);

    VkBufferCreateInfo bufCreateInfo = { VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO };
    bufCreateInfo.size = 1024;
    bufCreateInfo.usage = VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT;

    VmaAllocationCreateInfo allocCreateInfo = {};
    allocCreateInfo.usage = VMA_MEMORY_USAGE_GPU_ONLY;
    allocCreateInfo.flags = VMA_ALLOCATION_CREATE_CAN_BECOME_LOST_BIT |
        VMA_ALLOCATION_CREATE_CAN_MAKE_OTHER_LOST_BIT;

    vmaCreateBuffer(allocator, &bufCreateInfo, &allocCreateInfo, &m_Buf, &m_Alloc, nullptr);
}

When using lost allocations, you may see some Vulkan validation layer warnings about overlapping regions of memory bound to different kinds of buffers and images. This is still valid as long as you implement proper handling of lost allocations (like in the example above) and don't use them.

The library uses the following algorithm for allocation, in order:

  1. Try to find free range of memory in existing blocks.
  2. If failed, try to create a new block of VkDeviceMemory, with preferred block size.
  3. If failed, try to create such a block with size/2 and size/4.
  4. If failed and the VMA_ALLOCATION_CREATE_CAN_MAKE_OTHER_LOST_BIT flag was specified, try to find space in existing blocks, possibly making some other allocations lost.
  5. If failed, try to allocate a separate VkDeviceMemory for this allocation, just like when you use VMA_ALLOCATION_CREATE_OWN_MEMORY_BIT (see the sketch after this list).
  6. If failed, choose another memory type that meets the requirements specified in VmaAllocationCreateInfo and go back to point 1.
  7. If failed, return VK_ERROR_OUT_OF_DEVICE_MEMORY.
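
For completeness, requesting a separate VkDeviceMemory block up front, as mentioned in point 5, is just a matter of setting that flag (a sketch, reusing bufferInfo from the quick start example):

VmaAllocationCreateInfo allocCreateInfo = {};
allocCreateInfo.usage = VMA_MEMORY_USAGE_GPU_ONLY;
// Ask for a dedicated VkDeviceMemory block for this allocation only.
allocCreateInfo.flags = VMA_ALLOCATION_CREATE_OWN_MEMORY_BIT;

VkBuffer buf;
VmaAllocation alloc;
vmaCreateBuffer(allocator, &bufferInfo, &allocCreateInfo, &buf, &alloc, nullptr);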

Configuration

Please check the "CONFIGURATION SECTION" in the code to find macros that you can define before each include of this file, or change directly in this file, to provide your own implementation of basic facilities like assert, min() and max() functions, mutex, etc. The C++ STL is used by default, but changing these allows you to get rid of any STL usage if you want, as many game developers tend to do.
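
For example, to plug in your own assertion macro (MY_ENGINE_ASSERT is a placeholder here), you could define the corresponding configuration macro before the implementation include:

// In the one file that builds the implementation:
#define VMA_ASSERT(expr) MY_ENGINE_ASSERT(expr)
#define VMA_IMPLEMENTATION
#include "vk_mem_alloc.h"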

Pointers to Vulkan functions

The library uses Vulkan functions straight from the vulkan.h header by default. If you want to provide your own pointers to these functions, e.g. fetched using vkGetInstanceProcAddr() and vkGetDeviceProcAddr():

  1. Define VMA_STATIC_VULKAN_FUNCTIONS 0.
  2. Provide valid pointers through VmaAllocatorCreateInfo::pVulkanFunctions, as in the sketch below.
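
A sketch of that setup (only two members of VmaVulkanFunctions are shown; the structure has one member per Vulkan function used by the library, named after that function, and how you fetch the pointers is up to you):

#define VMA_STATIC_VULKAN_FUNCTIONS 0
#define VMA_IMPLEMENTATION
#include "vk_mem_alloc.h"

// ...

VmaVulkanFunctions vulkanFunctions = {};
vulkanFunctions.vkAllocateMemory =
    (PFN_vkAllocateMemory)vkGetDeviceProcAddr(device, "vkAllocateMemory");
vulkanFunctions.vkFreeMemory =
    (PFN_vkFreeMemory)vkGetDeviceProcAddr(device, "vkFreeMemory");
// ... fill the remaining members the same way ...

VmaAllocatorCreateInfo allocatorInfo = {};
allocatorInfo.physicalDevice = physicalDevice;
allocatorInfo.device = device;
allocatorInfo.pVulkanFunctions = &vulkanFunctions;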

Custom host memory allocator

If you use a custom allocator for CPU memory rather than the default operator new and delete from C++, you can make this library use your allocator as well, by filling the optional member VmaAllocatorCreateInfo::pAllocationCallbacks. These functions will be passed to Vulkan, as well as used by the library itself to make any CPU-side allocations.
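
A sketch, assuming MyAlloc, MyRealloc and MyFree are your own functions with the standard VkAllocationCallbacks signatures:

VkAllocationCallbacks cpuAllocationCallbacks = {};
cpuAllocationCallbacks.pUserData = nullptr;
cpuAllocationCallbacks.pfnAllocation = &MyAlloc;
cpuAllocationCallbacks.pfnReallocation = &MyRealloc;
cpuAllocationCallbacks.pfnFree = &MyFree;

allocatorInfo.pAllocationCallbacks = &cpuAllocationCallbacks;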

Device memory allocation callbacks

The library makes calls to vkAllocateMemory() and vkFreeMemory() internally. You can set up callbacks to be informed about these calls, e.g. for the purpose of gathering some statistics. To do it, fill the optional member VmaAllocatorCreateInfo::pDeviceMemoryCallbacks.
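
A sketch of such statistics gathering (MyVkAllocateCallback and MyVkFreeCallback are placeholder functions matching the callback types declared in the header):

VmaDeviceMemoryCallbacks deviceMemoryCallbacks = {};
deviceMemoryCallbacks.pfnAllocate = &MyVkAllocateCallback; // called after each vkAllocateMemory()
deviceMemoryCallbacks.pfnFree     = &MyVkFreeCallback;     // called before each vkFreeMemory()

allocatorInfo.pDeviceMemoryCallbacks = &deviceMemoryCallbacks;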

Device heap memory limit

If you want to test how your program behaves with a limited amount of Vulkan device memory available, without switching your graphics card to one that really has less VRAM, you can use a feature of this library intended for this purpose. To do it, fill the optional member VmaAllocatorCreateInfo::pHeapSizeLimit.
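
A sketch that pretends heap 0 has only 1 GB. The array has one element per memory heap; entries you don't want to limit are set here to VK_WHOLE_SIZE (check the member's documentation for the exact semantics):

VkDeviceSize heapSizeLimits[VK_MAX_MEMORY_HEAPS];
for(uint32_t i = 0; i < VK_MAX_MEMORY_HEAPS; ++i)
    heapSizeLimits[i] = VK_WHOLE_SIZE; // no limit for this heap
heapSizeLimits[0] = 1ull * 1024 * 1024 * 1024; // limit heap 0 to 1 GB

allocatorInfo.pHeapSizeLimit = heapSizeLimits;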

Thread safety