Commit 1816f92

goelakas authored and danvet committed
drm/i915: Support creation of unbound wc user mappings for objects
This patch adds support for creating write-combining virtual mappings of a GEM object. It is intended to provide the same functionality as the 'mmap_gtt' interface without the constraints and contention of a limited aperture space, but it requires clients to handle the linear-to-tiled conversion on their own. The aim is to improve CPU write performance: with such a mapping, writes and reads are almost 50% faster than with mmap_gtt. Like the GTT mmapping, and unlike the regular CPU mmapping, it avoids the cache flush after an update from the CPU side when the object is passed on to the GPU.

This type of mapping is especially useful for sub-region updates, i.e. when only a portion of the object is to be updated. Using a CPU mmap in such cases would normally incur a clflush of the whole object, and using a GTT mmapping would likely require eviction of an active object or fence and thus stall. The write-combining CPU mmap avoids both.

To ensure cache coherency before this mapping is used, the GTT domain has been reused here. This provides the required cache flush if the object is in the CPU domain, or synchronization against concurrent rendering. Although access through an uncached mmap should automatically invalidate the cache lines, this may not be true for non-temporal write instructions, and not all pages of the object may be updated through this mapping at any given point in time. Having a call to get_pages in the set_to_gtt_domain function, as added in the earlier patch 'drm/i915: Broaden application of set-domain(GTT)', guarantees the clflush, so no cachelines will hold data for the object before it is accessed through this map.

The drm_i915_gem_mmap structure (for the DRM_I915_GEM_MMAP_IOCTL) has been extended with a new flags field (defaulting to 0 for existing users). So that userspace can detect the extended ioctl, a new parameter, I915_PARAM_MMAP_VERSION, has been added for versioning the ioctl interface.

v2: Fix error handling, invalid flag detection, renaming (ickle)
v3: Rebase to latest drm-intel-nightly codebase

The new mmapping is exercised by igt/gem_mmap_wc, igt/gem_concurrent_blit and igt/gem_gtt_speed.

Change-Id: Ie883942f9e689525f72fe9a8d3780c3a9faa769a
Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
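As a rough illustration of how userspace might consume the extended interface, the sketch below fills in a mapping request and sets the write-combining flag only when the reported mmap version is at least 1. The struct `gem_mmap_request` is a local mirror of `struct drm_i915_gem_mmap` written out for illustration, and `prepare_mmap_request` is a hypothetical helper; neither is part of this patch. Real code would include the uapi header and issue the GETPARAM and GEM_MMAP ioctls itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Values copied from the uapi header changes in this commit. */
#define I915_PARAM_MMAP_VERSION 30
#define I915_MMAP_WC            0x1

/*
 * Local mirror of struct drm_i915_gem_mmap after this patch. This is an
 * assumption made for illustration; real code should use the uapi header.
 */
struct gem_mmap_request {
	uint32_t handle;   /* GEM object handle */
	uint32_t pad;
	uint64_t offset;   /* offset into the object */
	uint64_t size;     /* length of the mapping */
	uint64_t addr_ptr; /* filled in by the kernel on success */
	uint64_t flags;    /* new in this patch; 0 keeps the old behaviour */
};

/*
 * Hypothetical helper: prepare a request, asking for a write-combining
 * mapping only when the queried mmap version is at least 1. A failed
 * I915_PARAM_MMAP_VERSION query (older kernel) is passed in as 0.
 */
static void prepare_mmap_request(struct gem_mmap_request *req,
				 uint32_t handle, uint64_t size,
				 int mmap_version)
{
	assert(req != NULL);
	memset(req, 0, sizeof(*req));
	req->handle = handle;
	req->size = size;
	if (mmap_version >= 1)
		req->flags = I915_MMAP_WC;
}
```

Per the diff in this commit, a request with `I915_MMAP_WC` set can still fail with `-ENODEV` on a CPU without PAT, so a caller would likely want a fallback to `flags = 0` or to mmap_gtt.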
1 parent 43566de commit 1816f92

3 files changed: 31 additions, 0 deletions

drivers/gpu/drm/i915/i915_dma.c

Lines changed: 3 additions & 0 deletions
@@ -143,6 +143,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
 	case I915_PARAM_HAS_COHERENT_PHYS_GTT:
 		value = 1;
 		break;
+	case I915_PARAM_MMAP_VERSION:
+		value = 1;
+		break;
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
 		return -EINVAL;

drivers/gpu/drm/i915/i915_gem.c

Lines changed: 19 additions & 0 deletions
@@ -1534,6 +1534,12 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
 	struct drm_gem_object *obj;
 	unsigned long addr;

+	if (args->flags & ~(I915_MMAP_WC))
+		return -EINVAL;
+
+	if (args->flags & I915_MMAP_WC && !cpu_has_pat)
+		return -ENODEV;
+
 	obj = drm_gem_object_lookup(dev, file, args->handle);
 	if (obj == NULL)
 		return -ENOENT;
@@ -1549,6 +1555,19 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
 	addr = vm_mmap(obj->filp, 0, args->size,
 		       PROT_READ | PROT_WRITE, MAP_SHARED,
 		       args->offset);
+	if (args->flags & I915_MMAP_WC) {
+		struct mm_struct *mm = current->mm;
+		struct vm_area_struct *vma;
+
+		down_write(&mm->mmap_sem);
+		vma = find_vma(mm, addr);
+		if (vma)
+			vma->vm_page_prot =
+				pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
+		else
+			addr = -ENOMEM;
+		up_write(&mm->mmap_sem);
+	}
 	drm_gem_object_unreference_unlocked(obj);
 	if (IS_ERR((void *)addr))
 		return addr;
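The flag checks at the top of i915_gem_mmap_ioctl run in a fixed order: unknown bits are rejected before the PAT requirement is tested. The following is a minimal userspace model of that ordering, useful for reasoning about which error a caller sees. `check_mmap_flags` and its `has_pat` parameter are hypothetical names standing in for the in-kernel checks and `cpu_has_pat`; this is a sketch, not the driver code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Value copied from the uapi header change in this commit. */
#define I915_MMAP_WC 0x1

/*
 * Hypothetical model of the ioctl's flag validation:
 *   - any bit other than I915_MMAP_WC yields -EINVAL
 *   - I915_MMAP_WC on a CPU without PAT yields -ENODEV
 *   - otherwise the request is accepted (0)
 */
static int check_mmap_flags(uint64_t flags, bool has_pat)
{
	if (flags & ~(uint64_t)I915_MMAP_WC)
		return -EINVAL;

	if ((flags & I915_MMAP_WC) && !has_pat)
		return -ENODEV;

	return 0;
}
```

Because the unknown-bit test comes first, a request such as `flags = 0x3` reports `-EINVAL` even on a machine without PAT, which lets userspace tell an unsupported flag apart from missing hardware support.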

include/uapi/drm/i915_drm.h

Lines changed: 9 additions & 0 deletions
@@ -341,6 +341,7 @@ typedef struct drm_i915_irq_wait {
 #define I915_PARAM_HAS_WT		 27
 #define I915_PARAM_CMD_PARSER_VERSION	 28
 #define I915_PARAM_HAS_COHERENT_PHYS_GTT 29
+#define I915_PARAM_MMAP_VERSION		 30

 typedef struct drm_i915_getparam {
 	int param;
@@ -488,6 +489,14 @@ struct drm_i915_gem_mmap {
 	 * This is a fixed-size type for 32/64 compatibility.
 	 */
 	__u64 addr_ptr;
+
+	/**
+	 * Flags for extended behaviour.
+	 *
+	 * Added in version 2.
+	 */
+	__u64 flags;
+#define I915_MMAP_WC 0x1
 };

 struct drm_i915_gem_mmap_gtt {