Vulkan Memory Allocator
To "map memory" in Vulkan means to obtain a CPU pointer to VkDeviceMemory so that it can be read or written from CPU code. Only memory allocated from a memory type that has the VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT flag can be mapped. The functions vkMapMemory() and vkUnmapMemory() are designed for this purpose. You can use them directly with memory allocated by this library, but it is not recommended because of the following issue: mapping the same VkDeviceMemory block multiple times is illegal; only one mapping at a time is allowed. This includes mapping disjoint regions. Mapping is not reference-counted internally by Vulkan, and it is not thread-safe. Because of this, Vulkan Memory Allocator provides the following facilities:
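Note: To be able to map an allocation, you need to specify one of the flags VMA_ALLOCATION_CREATE_HOST_ACCESS_SEQUENTIAL_WRITE_BIT or VMA_ALLOCATION_CREATE_HOST_ACCESS_RANDOM_BIT in VmaAllocationCreateInfo::flags. These flags are required for an allocation to be mappable when using VMA_MEMORY_USAGE_AUTO or other VMA_MEMORY_USAGE_AUTO* enum values. For other usage values they are ignored and every such allocation made in a HOST_VISIBLE memory type is mappable, but these flags can still be used for consistency.
The easiest way to copy data from a host pointer to an allocation is to use the convenience function vmaCopyMemoryToAllocation(). It temporarily maps the Vulkan memory (if it is not already mapped), performs a memcpy, and calls vkFlushMappedMemoryRanges if required, that is, if the memory type is not HOST_COHERENT.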
It is also the safest way, because using memcpy avoids the risk of accidentally introducing memory reads (e.g. by doing pMappedVectors[i] += v), which may be very slow on memory types that are not HOST_CACHED.
Copying in the other direction, from an allocation to a host pointer, can be performed the same way using the function vmaCopyAllocationToMemory().
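To illustrate why a single memcpy is the safer pattern, the sketch below prepares the data in ordinary host memory and then writes it to the destination in one pass. It is only an illustration under stated assumptions: `mapped` is a plain pointer standing in for a mapped allocation, and `Vec4` and `UploadSums` are hypothetical names that do not come from the library.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

struct Vec4 { float x, y, z, w; };  // hypothetical element type

// Adds `delta` to every element of `base` in ordinary host memory, then writes
// the result to `mapped` with one memcpy. `mapped` is written but never read.
void UploadSums(void* mapped, const std::vector<Vec4>& base, const Vec4& delta)
{
    std::vector<Vec4> staging(base);  // working copy in regular system memory
    for (Vec4& v : staging) {
        v.x += delta.x; v.y += delta.y; v.z += delta.z; v.w += delta.w;
    }
    // The pattern to avoid would be `pMappedVectors[i] += v` directly on the
    // mapped pointer, because `+=` reads the destination before writing it.
    if (!staging.empty())
        std::memcpy(mapped, staging.data(), staging.size() * sizeof(Vec4));
}
```

With the library, the final memcpy into a mapped pointer would typically be replaced by a call to vmaCopyMemoryToAllocation(), which also handles mapping and flushing as described above.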
The library provides the following functions for mapping a specific allocation: vmaMapMemory() and vmaUnmapMemory(). They are safer and more convenient to use than the standard Vulkan functions. You can map an allocation multiple times simultaneously, because mapping is reference-counted internally. You can also map different allocations simultaneously regardless of whether they use the same VkDeviceMemory block. This works because the library always maps the entire memory block, not just the region of the allocation. For further details, see the description of the vmaMapMemory() function.
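The reference-counting idea can be modelled in a few lines. The following is a simplified sketch, not the library's actual implementation: `Block` is a hypothetical class, a std::vector stands in for the VkDeviceMemory block, and locking is omitted. The first `Map` call maps the whole block once, later calls only increase a counter and return the block pointer plus the allocation's offset, and the block is unmapped when the last user calls `Unmap`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of one memory block shared by several allocations.
class Block {
public:
    explicit Block(std::size_t size) : storage_(size) {}

    // Maps the entire block on first use and returns a pointer to `offset`.
    void* Map(std::size_t offset) {
        if (mapCount_++ == 0)
            base_ = storage_.data();  // stands in for the single vkMapMemory call
        return base_ + offset;
    }

    // Drops one reference and unmaps the block when the last user is gone.
    void Unmap() {
        if (mapCount_ > 0 && --mapCount_ == 0)
            base_ = nullptr;          // stands in for vkUnmapMemory
    }

    bool IsMapped() const { return base_ != nullptr; }
    unsigned MapCount() const { return mapCount_; }

private:
    std::vector<std::uint8_t> storage_;  // stand-in for the device memory block
    std::uint8_t* base_ = nullptr;       // pointer to the start of the mapped block
    unsigned mapCount_ = 0;              // number of outstanding Map calls
};
```

In this model, two allocations at offsets 0 and 64 of the same block can both be mapped at once, and their pointers differ by exactly 64 bytes. With the library itself, the corresponding calls are vmaMapMemory() and vmaUnmapMemory().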
When mapping, you may see a warning from Vulkan validation layer similar to this one:
Mapping an image with layout VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL can result in undefined behavior if this memory is used by the device. Only GENERAL or PREINITIALIZED should be used.
This happens because the library maps the entire VkDeviceMemory block, where different types of images and buffers may end up together, especially on GPUs with unified memory such as Intel. You can safely ignore the warning if you are sure that you access only the memory of the object you intended to map.
Keeping your memory persistently mapped is generally OK in Vulkan. You don't need to unmap it before using its data on the GPU. The library provides a special feature designed for that: allocations made with the VMA_ALLOCATION_CREATE_MAPPED_BIT flag set in VmaAllocationCreateInfo::flags stay mapped all the time, so you can access the CPU pointer at any time without calling any "map" or "unmap" function.
Note: VMA_ALLOCATION_CREATE_MAPPED_BIT by itself does not guarantee that the allocation will end up in a mappable memory type. For this, you also need to specify VMA_ALLOCATION_CREATE_HOST_ACCESS_SEQUENTIAL_WRITE_BIT or VMA_ALLOCATION_CREATE_HOST_ACCESS_RANDOM_BIT. VMA_ALLOCATION_CREATE_MAPPED_BIT only guarantees that if the memory is HOST_VISIBLE, the allocation will be mapped on creation. For an example of how to make use of this fact, see section Advanced data uploading.
Memory in Vulkan doesn't need to be unmapped before using it on the GPU, but unless a memory type has the VK_MEMORY_PROPERTY_HOST_COHERENT_BIT flag set, you need to manually invalidate the cache before reading from a mapped pointer and flush the cache after writing to it. Map/unmap operations don't do that automatically. Vulkan provides the functions vkFlushMappedMemoryRanges() and vkInvalidateMappedMemoryRanges() for this purpose, but this library provides more convenient functions that refer to a given allocation object: vmaFlushAllocation() and vmaInvalidateAllocation(), or to multiple objects at once: vmaFlushAllocations() and vmaInvalidateAllocations().
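The rule about when a flush or invalidate is needed can be written as a check on the property flags of the memory type. In the sketch below, the two constants repeat the numeric values of the corresponding VK_MEMORY_PROPERTY_* bits so that the snippet builds without the Vulkan headers. `IsMappable` and `NeedsFlushInvalidate` are hypothetical helper names, not library functions.

```cpp
#include <cstdint>

// Numeric values as defined for VkMemoryPropertyFlagBits in the Vulkan headers.
constexpr std::uint32_t kHostVisible  = 0x00000002; // VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
constexpr std::uint32_t kHostCoherent = 0x00000004; // VK_MEMORY_PROPERTY_HOST_COHERENT_BIT

// True if memory of this type can be mapped at all.
constexpr bool IsMappable(std::uint32_t propertyFlags)
{
    return (propertyFlags & kHostVisible) != 0;
}

// True if CPU writes must be followed by a flush and CPU reads preceded by an
// invalidate, i.e. the memory is mappable but not HOST_COHERENT.
constexpr bool NeedsFlushInvalidate(std::uint32_t propertyFlags)
{
    return IsMappable(propertyFlags) && (propertyFlags & kHostCoherent) == 0;
}
```

When `NeedsFlushInvalidate` returns true for the memory type of an allocation, a write through the mapped pointer would be followed by vmaFlushAllocation(), and a read would be preceded by vmaInvalidateAllocation().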
Regions of memory specified for flush/invalidate must be aligned to VkPhysicalDeviceLimits::nonCoherentAtomSize. This is automatically ensured by the library. In any memory type that is HOST_VISIBLE but not HOST_COHERENT, all allocations within blocks are aligned to this value, so their offsets are always multiples of nonCoherentAtomSize and two different allocations never share the same "line" of this size.
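The library performs this alignment for you, but the arithmetic is simple enough to show. The sketch below is only an illustration of the idea, not the library's code. `Range` and `AlignToAtom` are hypothetical names. The function rounds the start of a range down and its end up to a multiple of the atom size, and clamps the end to the size of the memory block.

```cpp
#include <cstdint>

struct Range { std::uint64_t offset; std::uint64_t size; };  // hypothetical type

// Expands [offset, offset + size) so that both ends fall on multiples of
// atomSize, clamping the end to blockSize. atomSize must be non-zero.
Range AlignToAtom(Range r, std::uint64_t atomSize, std::uint64_t blockSize)
{
    const std::uint64_t begin = r.offset / atomSize * atomSize;                    // round down
    std::uint64_t end = (r.offset + r.size + atomSize - 1) / atomSize * atomSize;  // round up
    if (end > blockSize)
        end = blockSize;  // the range may instead end exactly at the end of the block
    return { begin, end - begin };
}
```

For example, with an atom size of 64 and a block of 1024 bytes, the range at offset 100 with size 50 expands to offset 64 with size 128.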
Also, Windows drivers from all 3 PC GPU vendors (AMD, Intel, NVIDIA) currently provide the HOST_COHERENT flag on all memory types that are HOST_VISIBLE, so on PC you may not need to flush or invalidate at all.