- =====================
- DRM Memory Management
- =====================
- Modern Linux systems require large amounts of graphics memory to store
- frame buffers, textures, vertices and other graphics-related data. Given
- the very dynamic nature of much of that data, managing graphics memory
- efficiently is crucial for the graphics stack and plays a central
- role in the DRM infrastructure.
- The DRM core includes two memory managers, namely the Translation Table
- Manager (TTM) and the Graphics Execution Manager (GEM). TTM was the first
- DRM memory manager to be developed and tried to be a one-size-fits-all
- solution. It provides a single userspace API to accommodate the needs of
- all hardware, supporting both Unified Memory Architecture (UMA) devices
- and devices with dedicated video RAM (i.e. most discrete video cards).
- This resulted in a large, complex piece of code that turned out to be
- hard to use for driver development.
- GEM started as an Intel-sponsored project in reaction to TTM's
- complexity. Its design philosophy is completely different: instead of
- providing a solution to every graphics memory-related problem, GEM
- identified common code between drivers and created a support library to
- share it. GEM has simpler initialization and execution requirements than
- TTM, but has no video RAM management capabilities and is thus limited to
- UMA devices.
- The Translation Table Manager (TTM)
- ===================================
- TTM design background and information belongs here.
- TTM initialization
- ------------------
- **Warning**
- This section is outdated.
- Drivers wishing to support TTM must fill out a drm_bo_driver
- structure. The structure contains several fields with function pointers
- for initializing the TTM, allocating and freeing memory, waiting for
- command completion and fence synchronization, and memory migration. See
- the radeon_ttm.c file for an example of usage.
- The ttm_global_reference structure is made up of several fields:
- ::
-     struct ttm_global_reference {
-             enum ttm_global_types global_type;
-             size_t size;
-             void *object;
-             int (*init) (struct ttm_global_reference *);
-             void (*release) (struct ttm_global_reference *);
-     };
- There should be one global reference structure for your memory manager
- as a whole, and there will be others for each object created by the
- memory manager at runtime. Your global TTM should have a type of
- TTM_GLOBAL_TTM_MEM. The size field for the global object should be
- sizeof(struct ttm_mem_global), and the init and release hooks should
- point at your driver-specific init and release routines, which probably
- eventually call ttm_mem_global_init and ttm_mem_global_release,
- respectively.
- Once your global TTM accounting structure is set up and initialized by
- calling ttm_global_item_ref() on it, you need to create a buffer
- object TTM to provide a pool for buffer object allocation by clients and
- the kernel itself. The type of this object should be
- TTM_GLOBAL_TTM_BO, and its size should be sizeof(struct
- ttm_bo_global). Again, driver-specific init and release functions may
- be provided, likely eventually calling ttm_bo_global_init() and
- ttm_bo_global_release(), respectively. Also, like the previous
- object, ttm_global_item_ref() is used to create an initial reference
- count for the TTM, which will call your initialization function.
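The first-reference and last-release behaviour described above can be illustrated with a small userspace model. This is only a sketch, not the TTM implementation: the ``model_`` names, the ``refcount`` field stored in the structure itself, and the demo hooks are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace model only: loosely mirrors struct ttm_global_reference. */
struct model_global_ref {
    size_t size;                                /* size of the shared object */
    void *object;                               /* allocated on first reference */
    int (*init)(struct model_global_ref *);     /* driver hook, run once */
    void (*release)(struct model_global_ref *); /* driver hook, run at the end */
    int refcount;                               /* hypothetical bookkeeping field */
};

/* Take a reference; allocate and initialize the object on the first one. */
int model_global_item_ref(struct model_global_ref *ref)
{
    if (ref->refcount == 0) {
        ref->object = calloc(1, ref->size);
        if (!ref->object)
            return -1;
        if (ref->init && ref->init(ref) != 0) {
            free(ref->object);
            ref->object = NULL;
            return -1;
        }
    }
    ref->refcount++;
    return 0;
}

/* Drop a reference; release and free the object with the last one. */
void model_global_item_unref(struct model_global_ref *ref)
{
    assert(ref->refcount > 0);
    if (--ref->refcount == 0) {
        if (ref->release)
            ref->release(ref);
        free(ref->object);
        ref->object = NULL;
    }
}

/* Example driver hooks that only count how often they run. */
int init_calls, release_calls;

int demo_init(struct model_global_ref *ref)
{
    (void)ref;
    init_calls++;
    return 0;
}

void demo_release(struct model_global_ref *ref)
{
    (void)ref;
    release_calls++;
}
```

In this model the init hook runs only when the first reference is taken, and the release hook runs only when the last one is dropped.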
- The Graphics Execution Manager (GEM)
- ====================================
- The GEM design approach has resulted in a memory manager that doesn't
- provide full coverage of all (or even all common) use cases in its
- userspace or kernel API. GEM exposes a set of standard memory-related
- operations to userspace and a set of helper functions to drivers, and
- lets drivers implement hardware-specific operations with their own
- private API.
- The GEM userspace API is described in the `GEM - the Graphics Execution
- Manager <http://lwn.net/Articles/283798/>`__ article on LWN. While
- slightly outdated, the document provides a good overview of the GEM API
- principles. Buffer allocation and read and write operations, described
- as part of the common GEM API, are currently implemented using
- driver-specific ioctls.
- GEM is data-agnostic. It manages abstract buffer objects without knowing
- what individual buffers contain. APIs that require knowledge of buffer
- contents or purpose, such as buffer allocation or synchronization
- primitives, are thus outside of the scope of GEM and must be implemented
- using driver-specific ioctls.
- On a fundamental level, GEM involves several operations:
- - Memory allocation and freeing
- - Command execution
- - Aperture management at command execution time
- Buffer object allocation is relatively straightforward and largely
- provided by Linux's shmem layer, which provides memory to back each
- object.
- Device-specific operations, such as command execution, pinning, buffer
- read & write, mapping, and domain ownership transfers are left to
- driver-specific ioctls.
- GEM Initialization
- ------------------
- Drivers that use GEM must set the DRIVER_GEM bit in the
- :c:type:`struct drm_driver <drm_driver>` driver_features
- field. The DRM core will then automatically initialize the GEM core
- before calling the load operation. Behind the scenes, this will create a
- DRM Memory Manager object which provides an address space pool for
- object allocation.
- In a KMS configuration, drivers need to allocate and initialize a
- command ring buffer following core GEM initialization if required by the
- hardware. UMA devices usually have what is called a "stolen" memory
- region, which provides space for the initial framebuffer and large,
- contiguous memory regions required by the device. This space is
- typically not managed by GEM, and must be initialized separately into
- its own DRM MM object.
- GEM Objects Creation
- --------------------
- GEM splits creation of GEM objects and allocation of the memory that
- backs them into two distinct operations.
- GEM objects are represented by an instance of :c:type:`struct
- drm_gem_object <drm_gem_object>`. Drivers usually need to
- extend GEM objects with private information and thus create a
- driver-specific GEM object structure type that embeds an instance of
- :c:type:`struct drm_gem_object <drm_gem_object>`.
- To create a GEM object, a driver allocates memory for an instance of its
- specific GEM object type and initializes the embedded
- :c:type:`struct drm_gem_object <drm_gem_object>` with a call
- to :c:func:`drm_gem_object_init()`. The function takes a pointer
- to the DRM device, a pointer to the GEM object and the buffer object
- size in bytes.
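The embedding pattern described above can be sketched in plain userspace C. The snippet below is a model under stated assumptions: ``struct model_gem_object`` stands in for the real base object, ``model_gem_object_init()`` stands in for :c:func:`drm_gem_object_init()` without the DRM device argument, and ``struct my_bo`` with its ``tiling_mode`` field is a hypothetical driver type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the base GEM object; only the size is modeled here. */
struct model_gem_object {
    size_t size;
};

/* Hypothetical driver-specific object embedding the base object. */
struct my_bo {
    struct model_gem_object base;
    unsigned int tiling_mode;   /* example of driver-private state */
};

/* Recover the driver object from a pointer to the embedded base. */
#define to_my_bo(obj) \
    ((struct my_bo *)((char *)(obj) - offsetof(struct my_bo, base)))

/* Stand-in for the init call: records the buffer size in bytes. */
int model_gem_object_init(struct model_gem_object *obj, size_t size)
{
    if (size == 0)
        return -1;
    obj->size = size;
    return 0;
}

/* Allocate the driver object, then initialize the embedded base. */
struct my_bo *my_bo_create(size_t size, unsigned int tiling_mode)
{
    struct my_bo *bo = calloc(1, sizeof(*bo));

    if (!bo)
        return NULL;
    if (model_gem_object_init(&bo->base, size) != 0) {
        free(bo);
        return NULL;
    }
    bo->tiling_mode = tiling_mode;
    return bo;
}
```

Code that only sees the base object pointer can get back to the driver-private fields through ``to_my_bo()``, which is the same idea as the kernel's ``container_of()`` macro.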
- GEM uses shmem to allocate anonymous pageable memory.
- :c:func:`drm_gem_object_init()` will create an shmfs file of the
- requested size and store it into the :c:type:`struct
- drm_gem_object <drm_gem_object>` filp field. The memory is
- used as either main storage for the object when the graphics hardware
- uses system memory directly or as a backing store otherwise.
- Drivers are responsible for allocating the actual physical pages by
- calling :c:func:`shmem_read_mapping_page_gfp()` for each page.
- Note that they can decide to allocate pages when initializing the GEM
- object, or to delay allocation until the memory is needed (for instance
- when a page fault occurs as a result of a userspace memory access or
- when the driver needs to start a DMA transfer involving the memory).
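The choice between allocating pages up front and delaying allocation can be modeled in userspace. The sketch below is an assumption-laden illustration: ``calloc()`` replaces :c:func:`shmem_read_mapping_page_gfp()`, and all ``model_lazy_`` names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096u

/* Userspace model of an object whose backing pages are populated lazily. */
struct model_lazy_object {
    size_t num_pages;
    void **pages;       /* NULL entry: page not allocated yet */
    size_t allocated;   /* number of pages populated so far */
};

/* Create the object without allocating any backing page. */
int model_lazy_init(struct model_lazy_object *obj, size_t size)
{
    if (size == 0)
        return -1;
    obj->num_pages = (size + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
    obj->pages = calloc(obj->num_pages, sizeof(*obj->pages));
    obj->allocated = 0;
    return obj->pages ? 0 : -1;
}

/* Return the page at idx, allocating it on first use (e.g. at fault time). */
void *model_lazy_get_page(struct model_lazy_object *obj, size_t idx)
{
    if (idx >= obj->num_pages)
        return NULL;
    if (!obj->pages[idx]) {
        obj->pages[idx] = calloc(1, MODEL_PAGE_SIZE);
        if (obj->pages[idx])
            obj->allocated++;
    }
    return obj->pages[idx];
}

/* Free every populated page and the page array itself. */
void model_lazy_fini(struct model_lazy_object *obj)
{
    for (size_t i = 0; i < obj->num_pages; i++)
        free(obj->pages[i]);
    free(obj->pages);
    obj->pages = NULL;
}
```

A driver that prefers eager allocation would simply call the per-page helper for every index right after initialization.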
- Anonymous pageable memory allocation is not always desired, for instance
- when the hardware requires physically contiguous system memory as is
- often the case in embedded devices. Drivers can create GEM objects with
- no shmfs backing (called private GEM objects) by initializing them with
- a call to :c:func:`drm_gem_private_object_init()` instead of
- :c:func:`drm_gem_object_init()`. Storage for private GEM objects
- must be managed by drivers.
- GEM Objects Lifetime
- --------------------
- All GEM objects are reference-counted by the GEM core. References can be
- acquired and released by calling
- :c:func:`drm_gem_object_reference()` and
- :c:func:`drm_gem_object_unreference()` respectively. The caller
- must hold the :c:type:`struct drm_device <drm_device>`
- struct_mutex lock when calling
- :c:func:`drm_gem_object_unreference()`. As a convenience, GEM
- provides the :c:func:`drm_gem_object_unreference_unlocked()`
- function, which can be called without holding the lock.
- When the last reference to a GEM object is released, the GEM core calls
- the :c:type:`struct drm_driver <drm_driver>` gem_free_object
- operation. That operation is mandatory for GEM-enabled drivers and must
- free the GEM object and all associated resources.
- ::
-     void (*gem_free_object)(struct drm_gem_object *obj);
- Drivers are responsible for freeing all GEM object resources. This
- includes the resources created by the GEM core, which need to be released
- with :c:func:`drm_gem_object_release()`.
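The lifetime rule (the free operation runs exactly once, when the last reference goes away) can be shown with a minimal userspace model. It is a sketch only: a plain integer replaces the kernel's atomic reference count, no locking is modeled, and the ``model_`` and ``demo_`` names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace model of a reference-counted object with a free callback. */
struct model_gem_object {
    int refcount;
    /* Plays the role of the driver's gem_free_object operation. */
    void (*free_object)(struct model_gem_object *obj);
};

/* A new object starts with one reference owned by its creator. */
void model_gem_object_init(struct model_gem_object *obj,
                           void (*free_fn)(struct model_gem_object *))
{
    obj->refcount = 1;
    obj->free_object = free_fn;
}

void model_gem_object_reference(struct model_gem_object *obj)
{
    obj->refcount++;
}

/* Dropping the last reference hands the object to the free callback. */
void model_gem_object_unreference(struct model_gem_object *obj)
{
    assert(obj->refcount > 0);
    if (--obj->refcount == 0)
        obj->free_object(obj);
}

/* Example callback: counts invocations and frees the allocation. */
int freed_objects;

void demo_free_object(struct model_gem_object *obj)
{
    freed_objects++;
    free(obj);
}
```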
- GEM Objects Naming
- ------------------
- Communication between userspace and the kernel refers to GEM objects
- using local handles, global names or, more recently, file descriptors.
- All of those are 32-bit integer values; the usual Linux kernel limits
- apply to the file descriptors.
- GEM handles are local to a DRM file. Applications get a handle to a GEM
- object through a driver-specific ioctl, and can use that handle to refer
- to the GEM object in other standard or driver-specific ioctls. Closing a
- DRM file handle frees all its GEM handles and drops the references they
- hold on the associated GEM objects.
- To create a handle for a GEM object, drivers call
- :c:func:`drm_gem_handle_create()`. The function takes a pointer
- to the DRM file and the GEM object and returns a locally unique handle.
- When the handle is no longer needed, drivers delete it with a call to
- :c:func:`drm_gem_handle_delete()`. Finally, the GEM object
- associated with a handle can be retrieved by a call to
- :c:func:`drm_gem_object_lookup()`.
- Handles don't take ownership of GEM objects; they only take a reference
- to the object that will be dropped when the handle is destroyed. To
- avoid leaking GEM objects, drivers must make sure they drop the
- reference(s) they own (such as the initial reference taken at object
- creation time) as appropriate, without any special consideration for the
- handle. For example, in the particular case of combined GEM object and
- handle creation in the implementation of the dumb_create operation,
- drivers must drop the initial reference to the GEM object before
- returning the handle.
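The reference handling around handles can be traced in a small userspace model. The sketch below is illustrative only: the fixed-size per-file table, the ``freed`` flag and all ``model_`` names are assumptions, and the lookup helper returns a new reference to the caller as a design choice of this model.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_HANDLES 8

struct model_object {
    int refcount;
    int freed;      /* set when the last reference goes away */
};

/* Per-file handle table; handle 0 is reserved as invalid in this model. */
struct model_file {
    struct model_object *handles[MODEL_MAX_HANDLES + 1];
};

void model_object_put(struct model_object *obj)
{
    assert(obj->refcount > 0);
    if (--obj->refcount == 0)
        obj->freed = 1;
}

/* Create a handle: the handle takes its own reference on the object. */
unsigned int model_handle_create(struct model_file *file,
                                 struct model_object *obj)
{
    for (unsigned int h = 1; h <= MODEL_MAX_HANDLES; h++) {
        if (!file->handles[h]) {
            file->handles[h] = obj;
            obj->refcount++;
            return h;
        }
    }
    return 0;   /* table full */
}

/* Look up a handle; the caller receives a new reference. */
struct model_object *model_object_lookup(struct model_file *file,
                                         unsigned int h)
{
    if (h == 0 || h > MODEL_MAX_HANDLES || !file->handles[h])
        return NULL;
    file->handles[h]->refcount++;
    return file->handles[h];
}

/* Delete a handle and drop the reference it held. */
int model_handle_delete(struct model_file *file, unsigned int h)
{
    if (h == 0 || h > MODEL_MAX_HANDLES || !file->handles[h])
        return -1;
    model_object_put(file->handles[h]);
    file->handles[h] = NULL;
    return 0;
}

/* Combined creation, as in a dumb_create-style path: drop the initial
 * reference before returning, so the handle holds the only one. */
unsigned int model_create_with_handle(struct model_file *file,
                                      struct model_object *obj)
{
    unsigned int h;

    obj->refcount = 1;      /* initial reference from object creation */
    obj->freed = 0;
    h = model_handle_create(file, obj);
    model_object_put(obj);  /* dropped whether or not a handle was made */
    return h;
}
```

Because the creator drops its initial reference, deleting the handle later is enough to free the object and nothing leaks.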
- GEM names are similar in purpose to handles but are not local to DRM
- files. They can be passed between processes to reference a GEM object
- globally. Names can't be used directly to refer to objects in the DRM
- API; applications must convert handles to names and names to handles
- using the DRM_IOCTL_GEM_FLINK and DRM_IOCTL_GEM_OPEN ioctls
- respectively. The conversion is handled by the DRM core without any
- driver-specific support.
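The handle-to-name and name-to-object conversions can be modeled with a device-wide table. This is a userspace sketch, not the DRM core code: ``model_flink()`` and ``model_open()`` only mimic the roles of the DRM_IOCTL_GEM_FLINK and DRM_IOCTL_GEM_OPEN ioctls, and the small sequential name space is an assumption chosen to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_NAMES 8

struct model_object {
    unsigned int name;  /* 0: no global name assigned yet */
};

/* Device-wide name table shared by all files; name 0 is invalid here. */
struct model_device {
    struct model_object *names[MODEL_MAX_NAMES + 1];
};

/* FLINK-like step: publish the object under a global name. */
unsigned int model_flink(struct model_device *dev, struct model_object *obj)
{
    if (obj->name)
        return obj->name;   /* reuse the name assigned earlier */
    for (unsigned int n = 1; n <= MODEL_MAX_NAMES; n++) {
        if (!dev->names[n]) {
            dev->names[n] = obj;
            obj->name = n;
            return n;
        }
    }
    return 0;   /* name space exhausted */
}

/* OPEN-like step: resolve a global name received from another process. */
struct model_object *model_open(struct model_device *dev, unsigned int name)
{
    if (name == 0 || name > MODEL_MAX_NAMES)
        return NULL;
    return dev->names[name];
}
```

Any caller that tries small integers in ``model_open()`` eventually finds a published object, which is the guessability concern with global names.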
- GEM also supports buffer sharing with dma-buf file descriptors through
- PRIME. GEM-based drivers must use the provided helper functions to
- implement the exporting and importing correctly. See the PRIME Buffer
- Sharing section. Since sharing file descriptors is inherently more secure
- than the easily guessable and global GEM names, it is the preferred buffer
- sharing mechanism. Sharing buffers through GEM names is only supported for
- legacy userspace. Furthermore, PRIME also allows cross-device buffer
- sharing since it is based on dma-bufs.
- GEM Objects Mapping
- -------------------
- Because mapping operations are fairly heavyweight, GEM favours
- read/write-like access to buffers, implemented through driver-specific
- ioctls, over mapping buffers to userspace. However, when random access
- to the buffer is needed (to perform software rendering for instance),
- direct access to the object can be more efficient.
- The mmap system call can't be used directly to map GEM objects, as they
- don't have their own file handle. Two alternative methods currently
- co-exist to map GEM objects to userspace. The first method uses a
- driver-specific ioctl to perform the mapping operation, calling
- :c:func:`do_mmap()` under the hood. This is often considered
- dubious, seems to be discouraged for new GEM-enabled drivers, and will
- thus not be described here.
- The second method uses the mmap system call on the DRM file handle.
- ::
-     void *mmap(void *addr, size_t length, int prot, int flags, int fd,
-                off_t offset);
- DRM identifies the GEM object to be mapped by a fake offset
- passed through the mmap offset argument. Prior to being mapped, a GEM
- object must thus be associated with a fake offset. To do so, drivers
- must call :c:func:`drm_gem_create_mmap_offset()` on the object.
- Once allocated, the fake offset value must be passed to the application
- in a driver-specific way and can then be used as the mmap offset
- argument.
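How a fake offset identifies an object can be illustrated with a userspace model of an offset allocator. The sketch below makes several assumptions: a linear array replaces the real VMA offset manager, the starting page ``MODEL_OFFSET_START`` is arbitrary, and all ``model_`` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12
#define MODEL_MAX_NODES 16
/* Arbitrary first page of the fake offset space in this model. */
#define MODEL_OFFSET_START 0x100000ull

struct model_object {
    uint64_t size;          /* object size in bytes */
    uint64_t mmap_offset;   /* fake offset in bytes, 0 if none assigned */
};

struct model_offset_manager {
    struct model_object *nodes[MODEL_MAX_NODES];
    size_t count;
    uint64_t next_page;     /* next free page in the fake offset space */
};

void model_offset_manager_init(struct model_offset_manager *mgr)
{
    mgr->count = 0;
    mgr->next_page = MODEL_OFFSET_START;
}

/* Assign a page-aligned fake offset to the object (idempotent). */
int model_create_mmap_offset(struct model_offset_manager *mgr,
                             struct model_object *obj)
{
    uint64_t pages = (obj->size + (1ull << MODEL_PAGE_SHIFT) - 1)
                     >> MODEL_PAGE_SHIFT;

    if (obj->mmap_offset)
        return 0;
    if (pages == 0 || mgr->count == MODEL_MAX_NODES)
        return -1;
    obj->mmap_offset = mgr->next_page << MODEL_PAGE_SHIFT;
    mgr->next_page += pages;
    mgr->nodes[mgr->count++] = obj;
    return 0;
}

/* Resolve the offset passed to mmap() back to the object it names. */
struct model_object *model_lookup_offset(struct model_offset_manager *mgr,
                                         uint64_t offset)
{
    for (size_t i = 0; i < mgr->count; i++) {
        if (mgr->nodes[i]->mmap_offset == offset)
            return mgr->nodes[i];
    }
    return NULL;
}
```

The offsets never correspond to real file contents. They are only keys that let the mmap handler find the right object.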
- The GEM core provides a helper method :c:func:`drm_gem_mmap()` to
- handle object mapping. The method can be set directly as the mmap file
- operation handler. It will look up the GEM object based on the offset
- value and set the VMA operations to the :c:type:`struct drm_driver
- <drm_driver>` gem_vm_ops field. Note that
- :c:func:`drm_gem_mmap()` doesn't map memory to userspace, but
- relies on the driver-provided fault handler to map pages individually.
- To use :c:func:`drm_gem_mmap()`, drivers must fill the
- :c:type:`struct drm_driver <drm_driver>` gem_vm_ops field
- with a pointer to VM operations.
- ::
-     struct vm_operations_struct *gem_vm_ops;
-     struct vm_operations_struct {
-             void (*open)(struct vm_area_struct *area);
-             void (*close)(struct vm_area_struct *area);
-             int (*fault)(struct vm_area_struct *vma, struct vm_fault *vmf);
-     };
- The open and close operations must update the GEM object reference
- count. Drivers can use the :c:func:`drm_gem_vm_open()` and
- :c:func:`drm_gem_vm_close()` helper functions directly as open
- and close handlers.
- The fault operation handler is responsible for mapping individual pages
- to userspace when a page fault occurs. Depending on the memory
- allocation scheme, drivers can allocate pages at fault time, or can
- decide to allocate memory for the GEM object at the time the object is
- created.
- Drivers that want to map the GEM object upfront instead of handling page
- faults can implement their own mmap file operation handler.
- Memory Coherency
- ----------------
- When mapped to the device or used in a command buffer, backing pages for
- an object are flushed to memory and marked write combined so as to be
- coherent with the GPU. Likewise, if the CPU accesses an object after the
- GPU has finished rendering to the object, then the object must be made
- coherent with the CPU's view of memory, usually involving GPU cache
- flushing of various kinds. This core CPU<->GPU coherency management is
- provided by a device-specific ioctl, which evaluates an object's current
- domain and performs any necessary flushing or synchronization to put the
- object into the desired coherency domain (note that the object may be
- busy, i.e. an active render target; in that case, setting the domain
- blocks the client and waits for rendering to complete before performing
- any necessary flushing operations).
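The wait-then-flush ordering of a domain change can be captured in a tiny state model. This is a userspace sketch under simplifying assumptions: only two domains exist, waiting is reduced to clearing a ``busy`` flag, and the ``model_`` names do not correspond to any driver's ioctl.

```c
#include <assert.h>

enum model_domain { MODEL_DOMAIN_CPU, MODEL_DOMAIN_GPU };

struct model_object {
    enum model_domain domain;   /* domain the object is coherent with */
    int busy;                   /* nonzero while the GPU still renders to it */
    int waits;                  /* times a caller had to wait for rendering */
    int flushes;                /* flushes performed so far */
};

/* Move the object into the requested domain, waiting and flushing if needed. */
void model_set_domain(struct model_object *obj, enum model_domain target)
{
    assert(obj);
    if (obj->domain == target)
        return;             /* already coherent: nothing to do */
    if (obj->busy) {
        obj->waits++;       /* stand-in for blocking until rendering is done */
        obj->busy = 0;
    }
    obj->flushes++;         /* stand-in for the required cache flushing */
    obj->domain = target;
}
```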
- Command Execution
- -----------------
- Perhaps the most important GEM function for GPU devices is providing a
- command execution interface to clients. Client programs construct
- command buffers containing references to previously allocated memory
- objects, and then submit them to GEM. At that point, GEM takes care to
- bind all the objects into the GTT, execute the buffer, and provide
- necessary synchronization between clients accessing the same buffers.
- This often involves evicting some objects from the GTT and re-binding
- others (a fairly expensive operation), and providing relocation support
- which hides fixed GTT offsets from clients. Clients must take care not
- to submit command buffers that reference more objects than can fit in
- the GTT; otherwise, GEM will reject them and no rendering will occur.
- Similarly, if several objects in the buffer require fence registers to
- be allocated for correct rendering (e.g. 2D blits on pre-965 chips),
- care must be taken not to require more fence registers than are
- available to the client. Such resource management should be abstracted
- from the client in libdrm.
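The two resource limits mentioned above (aperture space and fence registers) lend themselves to a simple pre-submission check. The sketch below is a userspace model of such a check, not code from GEM or libdrm: the structure, the return codes and the function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct model_exec_object {
    uint64_t size;      /* bytes of aperture space the object needs */
    int needs_fence;    /* nonzero if a fence register is required */
};

enum {
    MODEL_EXEC_OK = 0,
    MODEL_EXEC_NO_SPACE = -1,   /* objects do not fit in the aperture */
    MODEL_EXEC_NO_FENCE = -2    /* more fence registers needed than exist */
};

/* Check a command buffer's object list against the available resources. */
int model_validate_exec(const struct model_exec_object *objs, size_t count,
                        uint64_t aperture_size, unsigned int fence_regs)
{
    uint64_t used = 0;
    unsigned int fences = 0;

    assert(objs || count == 0);
    for (size_t i = 0; i < count; i++) {
        /* Compare against the remaining space so 'used' cannot overflow. */
        if (objs[i].size > aperture_size - used)
            return MODEL_EXEC_NO_SPACE;
        used += objs[i].size;
        if (objs[i].needs_fence && ++fences > fence_regs)
            return MODEL_EXEC_NO_FENCE;
    }
    return MODEL_EXEC_OK;
}
```

A userspace library could run a check of this kind before submission and split the work into smaller batches when it fails.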
- GEM Function Reference
- ----------------------
- .. kernel-doc:: drivers/gpu/drm/drm_gem.c
-    :export:
- .. kernel-doc:: include/drm/drm_gem.h
-    :internal:
- GEM CMA Helper Functions Reference
- ----------------------------------
- .. kernel-doc:: drivers/gpu/drm/drm_gem_cma_helper.c
-    :doc: cma helpers
- .. kernel-doc:: drivers/gpu/drm/drm_gem_cma_helper.c
-    :export:
- .. kernel-doc:: include/drm/drm_gem_cma_helper.h
-    :internal:
- VMA Offset Manager
- ==================
- .. kernel-doc:: drivers/gpu/drm/drm_vma_manager.c
-    :doc: vma offset manager
- .. kernel-doc:: drivers/gpu/drm/drm_vma_manager.c
-    :export:
- .. kernel-doc:: include/drm/drm_vma_manager.h
-    :internal:
- PRIME Buffer Sharing
- ====================
- PRIME is the cross-device buffer sharing framework in DRM, originally
- created for the OPTIMUS range of multi-GPU platforms. To userspace, PRIME
- buffers are dma-buf based file descriptors.
- Overview and Driver Interface
- -----------------------------
- Similar to GEM global names, PRIME file descriptors are also used to
- share buffer objects across processes. They offer additional security:
- as file descriptors must be explicitly sent over UNIX domain sockets to
- be shared between applications, they can't be guessed like the globally
- unique GEM names.
- Drivers that support the PRIME API must set the DRIVER_PRIME bit in the
- :c:type:`struct drm_driver <drm_driver>`
- driver_features field, and implement the prime_handle_to_fd and
- prime_fd_to_handle operations.
- ::
-     int (*prime_handle_to_fd)(struct drm_device *dev,
-                               struct drm_file *file_priv, uint32_t handle,
-                               uint32_t flags, int *prime_fd);
-     int (*prime_fd_to_handle)(struct drm_device *dev,
-                               struct drm_file *file_priv, int prime_fd,
-                               uint32_t *handle);
- Those two operations convert a handle to a PRIME file descriptor and vice
- versa. Drivers must use the kernel dma-buf buffer sharing framework to
- manage the PRIME file descriptors. Similar to the mode setting API, PRIME
- is agnostic to the underlying buffer object manager, as long as handles
- are 32-bit unsigned integers.
- While non-GEM drivers must implement the operations themselves, GEM
- drivers must use the :c:func:`drm_gem_prime_handle_to_fd()` and
- :c:func:`drm_gem_prime_fd_to_handle()` helper functions. Those
- helpers rely on the driver gem_prime_export and gem_prime_import
- operations to create a dma-buf instance from a GEM object (dma-buf
- exporter role) and to create a GEM object from a dma-buf instance
- (dma-buf importer role).
- ::
-     struct dma_buf *(*gem_prime_export)(struct drm_device *dev,
-                                         struct drm_gem_object *obj, int flags);
-     struct drm_gem_object *(*gem_prime_import)(struct drm_device *dev,
-                                                struct dma_buf *dma_buf);
- These two operations are mandatory for GEM drivers that support PRIME.
- PRIME Helper Functions
- ----------------------
- .. kernel-doc:: drivers/gpu/drm/drm_prime.c
-    :doc: PRIME Helpers
- PRIME Function References
- -------------------------
- .. kernel-doc:: drivers/gpu/drm/drm_prime.c
-    :export:
- DRM MM Range Allocator
- ======================
- Overview
- --------
- .. kernel-doc:: drivers/gpu/drm/drm_mm.c
-    :doc: Overview
- LRU Scan/Eviction Support
- -------------------------
- .. kernel-doc:: drivers/gpu/drm/drm_mm.c
-    :doc: lru scan roaster
- DRM MM Range Allocator Function References
- ------------------------------------------
- .. kernel-doc:: drivers/gpu/drm/drm_mm.c
-    :export:
- .. kernel-doc:: include/drm/drm_mm.h
-    :internal: