Commit 5b5561b

kleinerm authored and alexdeucher committed
drm/radeon: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v2)
commit 4dfd648 "drm: Use vblank timestamps to guesstimate how many vblanks were missed" introduced in Linux 4.4-rc1 makes the drm core more fragile to drivers which don't update hw vblank counters and vblank timestamps in sync with firing of the vblank irq and essentially at leading edge of vblank.

This exposed a problem with radeon-kms/amdgpu-kms which do not satisfy above requirements: The vblank irq fires a few scanlines before start of vblank, but programmed pageflips complete at start of vblank and vblank timestamps update at start of vblank, whereas the hw vblank counter increments only later, at start of vsync.

This leads to problems like off by one errors for vblank counter updates, vblank counters apparently going backwards or vblank timestamps apparently having time going backwards. The net result is stuttering of graphics in games, or little hangs, as well as total failure of timing sensitive applications.

See bug #93147 for an example of the regression on Linux 4.4-rc:

https://bugs.freedesktop.org/show_bug.cgi?id=93147

This patch tries to align all above events better from the viewpoint of the drm core / of external callers to fix the problem:

1. The apparent start of vblank is shifted a few scanlines earlier, so the vblank irq now always happens after start of this extended vblank interval and thereby drm_update_vblank_count() always samples the updated vblank count and timestamp of the new vblank interval.

   To achieve this, the reporting of scanout positions by radeon_get_crtc_scanoutpos() now operates as if the vblank starts radeon_crtc->lb_vblank_lead_lines before the real start of the hw vblank interval. This means that the vblank timestamps which are based on these scanout positions will now update at this earlier start of vblank.

2. The driver->get_vblank_counter() function will bump the returned vblank count as read from the hw by +1 if the query happens after the shifted earlier start of the vblank, but before the real hw increment at start of vsync, so the counter appears to increment at start of vblank in sync with the timestamp update.

3. Calls from vblank irq-context and regular non-irq calls are now treated identically, always simulating the shifted vblank start, to avoid inconsistent results for queries happening from vblank irq vs. happening from drm_vblank_enable() or vblank_disable_fn().

4. The radeon_flip_work_func will delay mmio programming a pageflip until the start of the real vblank iff it happens to execute inside the shifted earlier start of the vblank, so pageflips now also appear to execute at start of the shifted vblank, in sync with vblank counter and timestamp updates. This is to avoid some races between updates of vblank count and timestamps that are used for swap scheduling and pageflip execution which could cause pageflips to execute before the scheduled target vblank.

The lb_vblank_lead_lines "fudge" value is calculated as the size of the display controller's line buffer in scanlines for the given video mode: Vblank irqs are triggered by the line buffer logic when the line buffer refill for a video frame ends, i.e. when the line buffer source read position enters the hw vblank. This means that a vblank irq could fire at most as many scanlines before the current reported scanout position of the crtc timing generator as the number of scanlines the line buffer can maximally hold for a given video mode.

This patch has been successfully tested on an RV730 card with DCE-3 display engine and on an evergreen card with DCE-4 display engine, in single-display and dual-display configuration, with different video modes.

A similar patch is needed for amdgpu-kms to fix the same problem.

Limitations:

- Line buffer sizes in pixels are hard-coded on < DCE-4 to a value I just guessed to be high enough to work ok, lacking info on the true sizes atm.

Fixes: fdo#93147
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Michel Dänzer <michel.daenzer@amd.com>
Cc: Harry Wentland <Harry.Wentland@amd.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
(v1) Tested-by: Dave Witbrodt <dawitbro@sbcglobal.net>

(v2) Refine radeon_flip_work_func() for better efficiency:

In radeon_flip_work_func, replace the busy waiting udelay(5) with event lock held by a more performance and energy efficient usleep_range() until at least predicted true start of hw vblank, with some slack for scheduler happiness. Release the event lock during waits to not delay other outputs in doing their stuff, as the waiting can last up to 200 usecs in some cases.

Retested on DCE-3 and DCE-4 to verify it still works nicely.

(v2) Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1 parent cb5d416 commit 5b5561b
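
The "fudge" value described in the commit message is a ceiling division of the line buffer size in pixels by the horizontal resolution of the active mode. The following is a minimal user-space sketch of that arithmetic, not the driver code itself: lb_lead_lines() is a hypothetical helper name, and DIV_ROUND_UP is redefined locally so the example builds outside the kernel tree.

```c
#include <assert.h>
#include <stdint.h>

/* Same rounding behaviour as the kernel's DIV_ROUND_UP(), redefined
 * here so the sketch compiles without kernel headers. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper: how many scanlines fit into a line buffer of
 * lb_size_pixels for a mode that is hdisplay pixels wide. This is the
 * maximum number of lines by which the vblank irq may lead the
 * reported scanout position. */
uint32_t lb_lead_lines(uint32_t lb_size_pixels, uint32_t hdisplay)
{
	assert(hdisplay > 0);
	return DIV_ROUND_UP(lb_size_pixels, hdisplay);
}
```

With the guessed 8192 pixel buffer used for the pre-DCE-4 parts, a 1920 pixel wide mode gives lb_lead_lines(8192, 1920) == 5 and a 1024 pixel wide mode gives 8, so narrower modes shift the apparent vblank start further ahead.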

9 files changed: +164 -29 lines changed

drivers/gpu/drm/radeon/cik.c

Lines changed: 3 additions & 0 deletions
@@ -9630,6 +9630,9 @@ static void dce8_program_watermarks(struct radeon_device *rdev,
 		    (rdev->disp_priority == 2)) {
 			DRM_DEBUG_KMS("force priority to high\n");
 		}
+
+		/* Save number of lines the linebuffer leads before the scanout */
+		radeon_crtc->lb_vblank_lead_lines = DIV_ROUND_UP(lb_size, mode->crtc_hdisplay);
 	}
 
 	/* select wm A */

drivers/gpu/drm/radeon/evergreen.c

Lines changed: 3 additions & 0 deletions
@@ -2372,6 +2372,9 @@ static void evergreen_program_watermarks(struct radeon_device *rdev,
 		c.full = dfixed_div(c, a);
 		priority_b_mark = dfixed_trunc(c);
 		priority_b_cnt |= priority_b_mark & PRIORITY_MARK_MASK;
+
+		/* Save number of lines the linebuffer leads before the scanout */
+		radeon_crtc->lb_vblank_lead_lines = DIV_ROUND_UP(lb_size, mode->crtc_hdisplay);
 	}
 
 	/* select wm A */

drivers/gpu/drm/radeon/r100.c

Lines changed: 10 additions & 0 deletions
@@ -3217,6 +3217,9 @@ void r100_bandwidth_update(struct radeon_device *rdev)
 	uint32_t pixel_bytes1 = 0;
 	uint32_t pixel_bytes2 = 0;
 
+	/* Guess line buffer size to be 8192 pixels */
+	u32 lb_size = 8192;
+
 	if (!rdev->mode_info.mode_config_initialized)
 		return;
 
@@ -3631,6 +3634,13 @@ void r100_bandwidth_update(struct radeon_device *rdev)
 		DRM_DEBUG_KMS("GRPH2_BUFFER_CNTL from to %x\n",
 			  (unsigned int)RREG32(RADEON_GRPH2_BUFFER_CNTL));
 	}
+
+	/* Save number of lines the linebuffer leads before the scanout */
+	if (mode1)
+		rdev->mode_info.crtcs[0]->lb_vblank_lead_lines = DIV_ROUND_UP(lb_size, mode1->crtc_hdisplay);
+
+	if (mode2)
+		rdev->mode_info.crtcs[1]->lb_vblank_lead_lines = DIV_ROUND_UP(lb_size, mode2->crtc_hdisplay);
 }
 
 int r100_ring_test(struct radeon_device *rdev, struct radeon_ring *ring)

drivers/gpu/drm/radeon/radeon_display.c

Lines changed: 79 additions & 27 deletions
@@ -322,7 +322,9 @@ void radeon_crtc_handle_vblank(struct radeon_device *rdev, int crtc_id)
 	 * to complete in this vblank?
 	 */
 	if (update_pending &&
-	    (DRM_SCANOUTPOS_VALID & radeon_get_crtc_scanoutpos(rdev->ddev, crtc_id, 0,
+	    (DRM_SCANOUTPOS_VALID & radeon_get_crtc_scanoutpos(rdev->ddev,
+							       crtc_id,
+							       USE_REAL_VBLANKSTART,
 							       &vpos, &hpos, NULL, NULL,
 							       &rdev->mode_info.crtcs[crtc_id]->base.hwmode)) &&
 	    ((vpos >= (99 * rdev->mode_info.crtcs[crtc_id]->base.hwmode.crtc_vdisplay)/100) ||
@@ -401,6 +403,8 @@ static void radeon_flip_work_func(struct work_struct *__work)
 	struct drm_crtc *crtc = &radeon_crtc->base;
 	unsigned long flags;
 	int r;
+	int vpos, hpos, stat, min_udelay;
+	struct drm_vblank_crtc *vblank = &crtc->dev->vblank[work->crtc_id];
 
 	down_read(&rdev->exclusive_lock);
 	if (work->fence) {
@@ -437,6 +441,41 @@ static void radeon_flip_work_func(struct work_struct *__work)
 	/* set the proper interrupt */
 	radeon_irq_kms_pflip_irq_get(rdev, radeon_crtc->crtc_id);
 
+	/* If this happens to execute within the "virtually extended" vblank
+	 * interval before the start of the real vblank interval then it needs
+	 * to delay programming the mmio flip until the real vblank is entered.
+	 * This prevents completing a flip too early due to the way we fudge
+	 * our vblank counter and vblank timestamps in order to work around the
+	 * problem that the hw fires vblank interrupts before actual start of
+	 * vblank (when line buffer refilling is done for a frame). It
+	 * complements the fudging logic in radeon_get_crtc_scanoutpos() for
+	 * timestamping and radeon_get_vblank_counter_kms() for vblank counts.
+	 *
+	 * In practice this won't execute very often unless on very fast
+	 * machines because the time window for this to happen is very small.
+	 */
+	for (;;) {
+		/* GET_DISTANCE_TO_VBLANKSTART returns distance to real vblank
+		 * start in hpos, and to the "fudged earlier" vblank start in
+		 * vpos.
+		 */
+		stat = radeon_get_crtc_scanoutpos(rdev->ddev, work->crtc_id,
+						  GET_DISTANCE_TO_VBLANKSTART,
+						  &vpos, &hpos, NULL, NULL,
+						  &crtc->hwmode);
+
+		if ((stat & (DRM_SCANOUTPOS_VALID | DRM_SCANOUTPOS_ACCURATE)) !=
+		    (DRM_SCANOUTPOS_VALID | DRM_SCANOUTPOS_ACCURATE) ||
+		    !(vpos >= 0 && hpos <= 0))
+			break;
+
+		/* Sleep at least until estimated real start of hw vblank */
+		spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
+		min_udelay = (-hpos + 1) * max(vblank->linedur_ns / 1000, 5);
+		usleep_range(min_udelay, 2 * min_udelay);
+		spin_lock_irqsave(&crtc->dev->event_lock, flags);
+	};
+
 	/* do the flip (mmio) */
 	radeon_page_flip(rdev, radeon_crtc->crtc_id, work->base);
 
@@ -1768,6 +1807,15 @@ bool radeon_crtc_scaling_mode_fixup(struct drm_crtc *crtc,
  * \param dev Device to query.
  * \param crtc Crtc to query.
  * \param flags Flags from caller (DRM_CALLED_FROM_VBLIRQ or 0).
+ *              For driver internal use only also supports these flags:
+ *
+ *              USE_REAL_VBLANKSTART to use the real start of vblank instead
+ *              of a fudged earlier start of vblank.
+ *
+ *              GET_DISTANCE_TO_VBLANKSTART to return distance to the
+ *              fudged earlier start of vblank in *vpos and the distance
+ *              to true start of vblank in *hpos.
+ *
  * \param *vpos Location where vertical scanout position should be stored.
  * \param *hpos Location where horizontal scanout position should go.
  * \param *stime Target location for timestamp taken immediately before
@@ -1911,10 +1959,40 @@ int radeon_get_crtc_scanoutpos(struct drm_device *dev, unsigned int pipe,
 		vbl_end = 0;
 	}
 
+	/* Called from driver internal vblank counter query code? */
+	if (flags & GET_DISTANCE_TO_VBLANKSTART) {
+		/* Caller wants distance from real vbl_start in *hpos */
+		*hpos = *vpos - vbl_start;
+	}
+
+	/* Fudge vblank to start a few scanlines earlier to handle the
+	 * problem that vblank irqs fire a few scanlines before start
+	 * of vblank. Some driver internal callers need the true vblank
+	 * start to be used and signal this via the USE_REAL_VBLANKSTART flag.
+	 *
+	 * The cause of the "early" vblank irq is that the irq is triggered
+	 * by the line buffer logic when the line buffer read position enters
+	 * the vblank, whereas our crtc scanout position naturally lags the
+	 * line buffer read position.
+	 */
+	if (!(flags & USE_REAL_VBLANKSTART))
+		vbl_start -= rdev->mode_info.crtcs[pipe]->lb_vblank_lead_lines;
+
 	/* Test scanout position against vblank region. */
 	if ((*vpos < vbl_start) && (*vpos >= vbl_end))
 		in_vbl = false;
 
+	/* In vblank? */
+	if (in_vbl)
+		ret |= DRM_SCANOUTPOS_IN_VBLANK;
+
+	/* Called from driver internal vblank counter query code? */
+	if (flags & GET_DISTANCE_TO_VBLANKSTART) {
+		/* Caller wants distance from fudged earlier vbl_start */
+		*vpos -= vbl_start;
+		return ret;
+	}
+
 	/* Check if inside vblank area and apply corrective offsets:
 	 * vpos will then be >=0 in video scanout area, but negative
 	 * within vblank area, counting down the number of lines until
@@ -1930,31 +2008,5 @@ int radeon_get_crtc_scanoutpos(struct drm_device *dev, unsigned int pipe,
 	/* Correct for shifted end of vbl at vbl_end. */
 	*vpos = *vpos - vbl_end;
 
-	/* In vblank? */
-	if (in_vbl)
-		ret |= DRM_SCANOUTPOS_IN_VBLANK;
-
-	/* Is vpos outside nominal vblank area, but less than
-	 * 1/100 of a frame height away from start of vblank?
-	 * If so, assume this isn't a massively delayed vblank
-	 * interrupt, but a vblank interrupt that fired a few
-	 * microseconds before true start of vblank. Compensate
-	 * by adding a full frame duration to the final timestamp.
-	 * Happens, e.g., on ATI R500, R600.
-	 *
-	 * We only do this if DRM_CALLED_FROM_VBLIRQ.
-	 */
-	if ((flags & DRM_CALLED_FROM_VBLIRQ) && !in_vbl) {
-		vbl_start = mode->crtc_vdisplay;
-		vtotal = mode->crtc_vtotal;
-
-		if (vbl_start - *vpos < vtotal / 100) {
-			*vpos -= vtotal;
-
-			/* Signal this correction as "applied". */
-			ret |= 0x8;
-		}
-	}
-
 	return ret;
 }
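
The wait loop in radeon_flip_work_func() relies on two distances: one to the shifted vblank start (returned in vpos) and one to the real vblank start (returned in hpos). The sketch below models that decision and the sleep estimate in plain user-space C. It is an illustration under simplifying assumptions, not the driver code: struct vbl_distance and the three functions are hypothetical names, scanlines are treated as plain integers, and wrap-around at the end of the frame is ignored.

```c
#include <assert.h>

/* Hypothetical container for the two distances that the driver reports
 * when GET_DISTANCE_TO_VBLANKSTART is set. */
struct vbl_distance {
	int to_fudged_start; /* corresponds to *vpos in the driver */
	int to_real_start;   /* corresponds to *hpos in the driver */
};

/* Distances of the current scanline to the real vblank start and to
 * the start shifted earlier by lead_lines. */
struct vbl_distance distance_to_vblank(int scanline, int vbl_start,
				       int lead_lines)
{
	struct vbl_distance d;

	assert(lead_lines >= 0);
	d.to_real_start = scanline - vbl_start;
	d.to_fudged_start = scanline - (vbl_start - lead_lines);
	return d;
}

/* A flip has to be delayed only inside the shifted part of the vblank:
 * at or after the shifted start, but not yet past the real start. */
int flip_must_wait(struct vbl_distance d)
{
	return d.to_fudged_start >= 0 && d.to_real_start <= 0;
}

/* Minimum sleep in microseconds, mirroring
 * (-hpos + 1) * max(linedur_ns / 1000, 5) from the patch. */
int flip_min_sleep_us(struct vbl_distance d, int linedur_ns)
{
	int line_us = linedur_ns / 1000;

	if (line_us < 5)
		line_us = 5;
	return (-d.to_real_start + 1) * line_us;
}
```

For a mode with real vblank start at line 1080 and 5 lead lines, a query at scanline 1077 yields distances 2 and -3, so the flip waits; with a line duration of 14815 ns the minimum sleep comes out as 4 * 14 = 56 microseconds. The patch passes twice that value as the upper bound to usleep_range().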

drivers/gpu/drm/radeon/radeon_kms.c

Lines changed: 49 additions & 1 deletion
@@ -755,14 +755,62 @@ void radeon_driver_preclose_kms(struct drm_device *dev,
  */
 u32 radeon_get_vblank_counter_kms(struct drm_device *dev, int crtc)
 {
+	int vpos, hpos, stat;
+	u32 count;
 	struct radeon_device *rdev = dev->dev_private;
 
 	if (crtc < 0 || crtc >= rdev->num_crtc) {
 		DRM_ERROR("Invalid crtc %d\n", crtc);
 		return -EINVAL;
 	}
 
-	return radeon_get_vblank_counter(rdev, crtc);
+	/* The hw increments its frame counter at start of vsync, not at start
+	 * of vblank, as is required by DRM core vblank counter handling.
+	 * Cook the hw count here to make it appear to the caller as if it
+	 * incremented at start of vblank. We measure distance to start of
+	 * vblank in vpos. vpos therefore will be >= 0 between start of vblank
+	 * and start of vsync, so vpos >= 0 means to bump the hw frame counter
+	 * result by 1 to give the proper appearance to caller.
+	 */
+	if (rdev->mode_info.crtcs[crtc]) {
+		/* Repeat readout if needed to provide stable result if
+		 * we cross start of vsync during the queries.
+		 */
+		do {
+			count = radeon_get_vblank_counter(rdev, crtc);
+			/* Ask radeon_get_crtc_scanoutpos to return vpos as
+			 * distance to start of vblank, instead of regular
+			 * vertical scanout pos.
+			 */
+			stat = radeon_get_crtc_scanoutpos(
+				dev, crtc, GET_DISTANCE_TO_VBLANKSTART,
+				&vpos, &hpos, NULL, NULL,
+				&rdev->mode_info.crtcs[crtc]->base.hwmode);
+		} while (count != radeon_get_vblank_counter(rdev, crtc));
+
+		if (((stat & (DRM_SCANOUTPOS_VALID | DRM_SCANOUTPOS_ACCURATE)) !=
+		    (DRM_SCANOUTPOS_VALID | DRM_SCANOUTPOS_ACCURATE))) {
+			DRM_DEBUG_VBL("Query failed! stat %d\n", stat);
+		}
+		else {
+			DRM_DEBUG_VBL("crtc %d: dist from vblank start %d\n",
+				      crtc, vpos);
+
+			/* Bump counter if we are at >= leading edge of vblank,
+			 * but before vsync where vpos would turn negative and
+			 * the hw counter really increments.
+			 */
+			if (vpos >= 0)
+				count++;
+		}
+	}
+	else {
+		/* Fallback to use value as is. */
+		count = radeon_get_vblank_counter(rdev, crtc);
+		DRM_DEBUG_VBL("NULL mode info! Returned count may be wrong.\n");
+	}
+
+	return count;
 }
 
 /**
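
The counter "cooking" in radeon_get_vblank_counter_kms() can be pictured with a small model of one frame. This is a sketch under stated assumptions rather than the driver logic: struct fake_crtc and cooked_vblank_count() are hypothetical, and where the driver infers the window from the sign of vpos, the model compares explicit scanline numbers against an assumed vsync_start line.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware state: a frame counter that
 * only increments at start of vsync, plus the current scanline. */
struct fake_crtc {
	uint32_t hw_count; /* raw hardware frame counter */
	int scanline;      /* current scanout line */
	int vbl_start;     /* first line of the real vblank */
	int vsync_start;   /* line at which hw_count increments */
	int lead_lines;    /* lb_vblank_lead_lines equivalent */
};

/* Return the count as the drm core should see it: already incremented
 * once the scanout has passed the shifted vblank start, even though
 * the hardware counter only follows at start of vsync. */
uint32_t cooked_vblank_count(const struct fake_crtc *c)
{
	int fudged_start = c->vbl_start - c->lead_lines;
	uint32_t count = c->hw_count;

	assert(c->lead_lines >= 0 && c->vbl_start <= c->vsync_start);

	/* In the window between shifted vblank start and vsync the
	 * hardware has not incremented yet, so add the missing 1. */
	if (c->scanline >= fudged_start && c->scanline < c->vsync_start)
		count++;

	return count;
}
```

With hw_count 100, vbl_start 1080, vsync_start 1090 and 5 lead lines, the reported count is 100 at scanline 1000, 101 from scanline 1075 onwards, and stays 101 once the hardware counter itself steps to 101 at line 1090, so the caller sees a single increment at the shifted vblank start. The patch additionally re-reads the hardware counter in a loop so that the count and the scanout position belong to the same frame.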

drivers/gpu/drm/radeon/radeon_mode.h

Lines changed: 4 additions & 0 deletions
@@ -367,6 +367,7 @@ struct radeon_crtc {
 	u32 line_time;
 	u32 wm_low;
 	u32 wm_high;
+	u32 lb_vblank_lead_lines;
 	struct drm_display_mode hw_mode;
 	enum radeon_output_csc output_csc;
 };
@@ -687,6 +688,9 @@ struct atom_voltage_table
 	struct atom_voltage_table_entry entries[MAX_VOLTAGE_ENTRIES];
 };
 
+/* Driver internal use only flags of radeon_get_crtc_scanoutpos() */
+#define USE_REAL_VBLANKSTART (1 << 30)
+#define GET_DISTANCE_TO_VBLANKSTART (1 << 31)
 
 extern void
 radeon_add_atom_connector(struct drm_device *dev,

drivers/gpu/drm/radeon/radeon_pm.c

Lines changed: 3 additions & 1 deletion
@@ -1756,7 +1756,9 @@ static bool radeon_pm_in_vbl(struct radeon_device *rdev)
 	 */
 	for (crtc = 0; (crtc < rdev->num_crtc) && in_vbl; crtc++) {
 		if (rdev->pm.active_crtcs & (1 << crtc)) {
-			vbl_status = radeon_get_crtc_scanoutpos(rdev->ddev, crtc, 0,
+			vbl_status = radeon_get_crtc_scanoutpos(rdev->ddev,
+								crtc,
+								USE_REAL_VBLANKSTART,
 								&vpos, &hpos, NULL, NULL,
 								&rdev->mode_info.crtcs[crtc]->base.hwmode);
 			if ((vbl_status & DRM_SCANOUTPOS_VALID) &&

drivers/gpu/drm/radeon/rs690.c

Lines changed: 10 additions & 0 deletions
@@ -207,6 +207,9 @@ void rs690_line_buffer_adjust(struct radeon_device *rdev,
 {
 	u32 tmp;
 
+	/* Guess line buffer size to be 8192 pixels */
+	u32 lb_size = 8192;
+
 	/*
 	 * Line Buffer Setup
 	 * There is a single line buffer shared by both display controllers.
@@ -243,6 +246,13 @@ void rs690_line_buffer_adjust(struct radeon_device *rdev,
 		tmp |= V_006520_DC_LB_MEMORY_SPLIT_D1_1Q_D2_3Q;
 	}
 	WREG32(R_006520_DC_LB_MEMORY_SPLIT, tmp);
+
+	/* Save number of lines the linebuffer leads before the scanout */
+	if (mode1)
+		rdev->mode_info.crtcs[0]->lb_vblank_lead_lines = DIV_ROUND_UP(lb_size, mode1->crtc_hdisplay);
+
+	if (mode2)
+		rdev->mode_info.crtcs[1]->lb_vblank_lead_lines = DIV_ROUND_UP(lb_size, mode2->crtc_hdisplay);
 }
 
 struct rs690_watermark {

drivers/gpu/drm/radeon/si.c

Lines changed: 3 additions & 0 deletions
@@ -2376,6 +2376,9 @@ static void dce6_program_watermarks(struct radeon_device *rdev,
 		c.full = dfixed_div(c, a);
 		priority_b_mark = dfixed_trunc(c);
 		priority_b_cnt |= priority_b_mark & PRIORITY_MARK_MASK;
+
+		/* Save number of lines the linebuffer leads before the scanout */
+		radeon_crtc->lb_vblank_lead_lines = DIV_ROUND_UP(lb_size, mode->crtc_hdisplay);
 	}
 
 	/* select wm A */
