View Issue Details

ID: 0000331
Project: madVR
Category: bug
View Status: public
Last Update: 2018-01-14 15:51
Reporter: kael
Assigned To: madshi
Priority: normal
Severity: major
Reproducibility: always
Status: closed
Resolution: unable to reproduce
Platform: Windows
OS: Windows 10 Pro x64
Summary: 0000331: D3D11 presentation enters a degraded state over time
Description: On Windows 10 if I enable D3D11 presentation, many minor issues/bugs regarding frame presentation, tearing, dropped frames, etc go away - but over time madVR consistently goes into a degraded state where it starts dropping frames.

Typically after leaving a movie or stream playing for something on the order of 30-60 minutes, the average and max present times in the madVR performance stats will climb from ~1ms all the way up to around ~14ms or more. At this point the render and present queues empty out, hovering around 0-1, even though the decode and upload queues are full. Once madVR enters this state it never recovers.

If I cause madVR to reinitialize (move the player to another monitor; enter/exit fullscreen mode) it instantly recovers and present times drop down to ~1ms.
Steps To Reproduce: I've been reproducing this by running mpc-hc windowed on one of my monitors, playing a Twitch live stream using livestreamer. It happens with mkv files containing films/anime as well.
Additional Information: In regular D3D9 presentation mode this never happens. In Present Frames In Advance (D3D9+DWM, I think?) mode this never happens.

I tried a few different video decoders (software, dxva, cuvid) to see if those had any impact, but they don't.

I've tried adjusting the queue sizes and other rendering settings to see if this goes away, but nothing seems to fix it other than disabling D3D11 entirely ( :( )
Tags: No tags attached.
madVR Version: 0.88.21
Media Player (with version info): MPC-HC 1.7.9 846eff0
Splitter (with version info): LAV Splitter 0.65.0.9-git
Decoder (with version info): LAV Video Decoder 0.65.0
Decoding: CUDA
Deinterlacing: none (progressive)
DXVA2 Scaling Active: no
Aero / Desktop Composition: On
Problem occurs with mode: windowed mode
GPU Manufacturer: NVidia
GPU Model: GeForce GTX 970
GPU Driver Version: 353.62

Activities

kael

2015-08-01 13:03

reporter   ~0001126

Once madVR enters a degraded state, the average/max present times climb by around 0.03ms/second. The max present time seems to top out at around 16.67ms (matching the 60hz refresh rate of my panel) and the average time tops out at around 14ms. Once the average present time hits around 13ms the render queue starts to empty out and get close to 0, at which point the present queue can start emptying out and I get dropped frames. Otherwise, the dropped/delayed frame counters and presentation glitch counters stay fixed and don't climb at all. Once it tops out it stays in that state forever (as far as I can tell).
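
As a rough sanity check of the figures above, here is a minimal sketch of the arithmetic. It assumes the climb is linear at the quoted ~0.03ms/second, and that the average starts from the healthy ~1ms mentioned in the description; both are simplifications, and the constant names are made up for illustration.

```python
# Back-of-the-envelope check of the numbers quoted in this note.
# Assumption: the present time climbs linearly once degradation starts.
REFRESH_HZ = 60.0        # panel refresh rate from the note
CLIMB_MS_PER_S = 0.03    # observed climb rate of the present time
HEALTHY_MS = 1.0         # healthy average present time (~1ms)
AVG_PLATEAU_MS = 14.0    # where the average present time tops out


def vsync_interval_ms(refresh_hz: float) -> float:
    """Length of one refresh period in milliseconds."""
    return 1000.0 / refresh_hz


def seconds_to_climb(start_ms: float, end_ms: float, rate_ms_per_s: float) -> float:
    """Time needed to go from start_ms to end_ms at a constant rate."""
    return (end_ms - start_ms) / rate_ms_per_s


interval = vsync_interval_ms(REFRESH_HZ)
ramp = seconds_to_climb(HEALTHY_MS, AVG_PLATEAU_MS, CLIMB_MS_PER_S)

print(f"vsync interval: {interval:.2f} ms")                 # 16.67 ms
print(f"ramp to plateau: {ramp:.0f} s (~{ramp / 60:.1f} min)")  # 433 s (~7.2 min)
```

So under these assumptions the ramp itself takes roughly seven minutes once it begins, and the 16.67ms ceiling on the max present time is exactly one 60hz refresh period.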

CPU usage is low during this whole time period, and GPU usage doesn't seem to spike up either.

In my testing this happens regardless of whether 'present a frame for every VSync' is turned on (tried with/without). This also happens with separate devices enabled and without them enabled.

I'm attaching a screenshot of a GPUView trace from mpc-hc/madVR in the degraded state. As far as I can tell, the vast majority of each frame is spent on a fence (waiting for vsync, looks like? since the fence releases shortly after the vsync - marked by the blue vertical bar in the trace). GPU load looks low and the burst of operations in madvr/d3d after the vsync fences lines up with how normally the present times are <1ms.

kael

2015-08-01 13:03

reporter  

kael

2015-08-01 13:08

reporter  

non-degraded gpuview.png (127,503 bytes)   

kael

2015-08-01 13:09

reporter   ~0001127

Attached a comparison trace of madVR/mpc-hc in a non-degraded state. The fences barely last any time at all. Most of the trace timeline is empty (whereas in the degraded state, ~90% of it or more is occupied by the fences).

dwm looks different too, which is interesting. maybe it's getting backed up with queued frames somehow? (I've tried adjusting the backbuffer count in madvr settings but it always seems to use a present queue size of 6 in d3d11 mode.)

madshi

2015-08-01 13:19

administrator   ~0001128

It's hard to say what's happening there. Is this a new problem, or did you always have this problem? If it's a new problem, what introduced it? Updating to Windows 10? Or updating to a new madVR build? Or updating to a new MPC-HC build? Or something else?

D3D11 mode always uses "present several frames in advance", even if you have that feature disabled in the madVR settings. So enable it again, then you can change the size of the present queue.

kael

2015-08-01 13:20

reporter   ~0001129

Until I installed win10 and updated to latest madvr I didn't have the D3D11 option. I don't have any trivial way to downgrade to win8 and see if it reproduces there. I could downgrade to a version of madvr without D3D11 but I'm not sure if that would help.

It happened with previous revisions of the win10 nvidia drivers.

madshi

2015-08-01 13:23

administrator   ~0001130

Is there anything different when using D3D9 compared to how it was when using win8 and the older madVR build you were using before?

Does it help if you change the "present several frames in advance" option to reduce or increase the size of the present queue in D3D11 mode?

Do you have 10bit output mode activated? If so, try setting your display to 8bit, does that change anything?

kael

2015-08-01 13:26

reporter   ~0001131

D3D9 is identical. I have always had nasty problems with D3D9, though some of them can be mitigated with different configurations. (I'm pretty sure they're just due to my rather unusual display configuration - two 4k panels + a 1080p hdtv, with video usually playing on the hdtv, sometimes at a different refresh rate.) With the exception of this issue, D3D11 is flawless.

Vs win8 and the older madvr build i haven't noticed any changes to d3d9 behavior, both in backbuffer and present-several-frames mode. For d3d9, exclusive mode also eliminates most of the issues but has its own downsides.

My display has no 10bit support and I don't have it enabled in my drivers either. The status overlay explicitly says 'd3d11 windowed (8 bit)'.

I set the present queue size to 3 and the decoder/upload/render queues to 5. I'll run for a while and see if it reproduces. I'll also try it in windowed fullscreen mode to see if it can reproduce there (I think I remember it happening, but I'm not sure.)

madshi

2015-08-01 13:52

administrator   ~0001132

Does any of this help?

http://forum.doom9.org/showpost.php?p=1724688&postcount=30582

kael

2015-08-01 13:57

reporter   ~0001133

Some of that looks relevant, so I'll do some more testing. Sorry I didn't see it before filing the bug.

I already had maximum pre-rendered frames set to application default, at least. It's interesting that your guidance is to set the frame count to 8, since I've never had it higher than 6. That explanation certainly matches up with what I've seen so far, though. Ideally a queue size of 4 will work out fine, since that seems to be the lowest value with no dropped frames/glitches.

Thanks.

kael

2015-08-02 02:51

reporter   ~0001134

Hit it again with shorter queues (render/decode/upload size of 6, frame count of 4). 3 wasn't enough to avoid frame drops on my machine.

I had tried changing the flush settings to 'flush and wait (loop)' on both final render step and present. In this degraded state, the final render step only shows a time of around 2ms and the present step takes around 14ms. So the time here appears to be getting burned on the present. I'm going to keep fiddling with settings more - let me know if you have any suggestions.

kael

2015-08-02 02:52

reporter  

madshi

2015-08-02 09:18

administrator   ~0001135

Have you tried different GPU driver versions? This might be a GPU driver problem.

kael

2015-08-02 13:31

reporter   ~0001136

Since I'm on Windows 10 there are no other video drivers to try. :-(

Maybe I could set up a dual-boot install of W7 or W8 to try as a comparison.

madshi

2015-08-02 14:34

administrator   ~0001137

From what I've seen, there are the official drivers, and some older, slightly "unofficial" ones for the Windows 10 test builds. At least users in the doom9 forum have been using different NVidia driver versions for Windows 10.

kael

2015-08-09 00:54

reporter   ~0001138

After some more testing, I wanted to mention that the problems I have with 'present multiple frames in advance' d3d9 mode (and to a lesser extent, d3d9 backbuffer mode) seem to be that DWM is pulling old frames out of the present queue. I'm not sure if this would be a DWM bug, an nvidia bug, or a madvr bug, or some mix of the three.

In this scenario madvr's queues are full and show up as full - it's not dropping frames - but for some reason an old frame appears for a single vsync periodically. Sometimes it gets into this state and I see wrong frames repeatedly for a while. Making the queue shorter makes this problem worse. Do madvr or DWM use a ring buffer of frames or something? That would explain this.
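
To illustrate the ring buffer guess, here is a toy model. It is purely hypothetical and says nothing about how madVR, DWM, or the driver actually manage frames: frame f is written into slot f % N, each vsync reads the slot it expects, and a write that lands late leaves the previous occupant of that slot on screen for one vsync.

```python
def simulate(num_slots: int, num_frames: int, late_frames: set) -> list:
    """Return the frame number shown at each vsync.

    Hypothetical model: frame f belongs in slot f % num_slots. Frames in
    late_frames are written only after the vsync that should show them,
    so that vsync displays whatever was left in the slot.
    """
    slots = [None] * num_slots
    shown = []
    for f in range(num_frames):
        slot = f % num_slots
        if f not in late_frames:
            slots[slot] = f          # write arrives in time
        shown.append(slots[slot])    # scanout reads the slot at vsync
        if f in late_frames:
            slots[slot] = f          # late write lands after scanout
    return shown


print(simulate(3, 8, {5}))  # [0, 1, 2, 3, 4, 2, 6, 7]
```

In this model a single late write makes frame 2 reappear where frame 5 should be, for exactly one vsync, while the producer's queue never looks empty. That matches the symptom described, but it is only one possible explanation.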

Is it possible that something subtle changed recently about how madVR interacts with DWM? Does this just sound like a set of driver bugs? Sadly this issue doesn't show up in GPU traces like the d3d11 present time degradation does. I wouldn't be amazed if these were driver bugs, but madVR is the only thing I've ever had DWM problems with, and I did have these issues on Win8.

kael

2015-08-11 10:59

reporter   ~0001139

OK, I think this is definitely an nvidia or dwm bug. I managed to reproduce the frame ordering thing in firefox, and I'm already sure the d3d11 frame backlog thing is too.

It would be cool to get a diagnostic/workaround feature of some sort, since I suspect this probably affects other people with the nvidia/win10 pairing. The simplest workaround I can imagine would be to re-initialize the device when average present times get close enough to the vsync interval. Resizing the player window in mpc-hc is enough to do this and other than a tiny frame drop from the queues being emptied, it seems to be instant, so it's not a bad workaround (other than being a terrible hack).
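
A minimal sketch of what such a detector could look like, assuming the renderer can feed it one present time per frame. The class name, the 0.8 threshold, and the window size are all hypothetical choices for illustration, not anything madVR exposes.

```python
from collections import deque


class PresentTimeWatchdog:
    """Flags when the rolling average present time nears the vsync interval.

    Hypothetical helper: threshold is the fraction of the vsync interval
    at which a device re-initialization would be requested.
    """

    def __init__(self, refresh_hz: float = 60.0, threshold: float = 0.8,
                 window: int = 120):
        self.limit_ms = threshold * 1000.0 / refresh_hz
        self.samples = deque(maxlen=window)

    def add_sample(self, present_ms: float) -> bool:
        """Record one present time; return True if a reset looks warranted."""
        self.samples.append(present_ms)
        if len(self.samples) < self.samples.maxlen:
            return False  # not enough history yet
        average = sum(self.samples) / len(self.samples)
        return average >= self.limit_ms

    def reset(self) -> None:
        """Forget history, e.g. right after the device was re-initialized."""
        self.samples.clear()
```

With the defaults the limit is about 13.3ms at 60hz, close to the ~13ms average at which the render queue was seen to start emptying. The caller would trigger whatever re-initialization path it has when `add_sample` returns True and then call `reset()`.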

The D3d9/dwm wrong frame order thing is harder to imagine a workaround for, but it at least doesn't affect D3D11. Seems to happen even with a 3-frames-deep DWM queue. It only ever seems to present a single frame in the wrong order, so doubling up every frame in the DWM queue and halving the present interval might work? I bet that would have nasty side effects though. Blah, compositors.

Thanks for the help troubleshooting so far. I'm gonna yell at nvidia.

madshi

2015-08-11 11:21

administrator   ~0001140

I don't believe in trying to work around weird GPU driver issues. Doing so would cost me A LOT of time, and if NVidia finally decides to look into the issue, having a workaround in madVR means it would be much harder for NVidia to see what's wrong.

The best solution is to report this to NVidia and have them fix it.

madshi

2015-09-25 20:26

administrator   ~0001174

Any news? I suppose I can close this one cause it's seemingly not my fault?

kael

2015-09-25 20:31

reporter   ~0001176

NV doesn't respond to bug reports and I've had this problem for ages, so I switched to EVR-CP :/ might as well close it. I may revisit in a few months but I expect they will never fix it.

madshi

2015-09-25 20:32

administrator   ~0001177

You reported this problem only for D3D11 presentation, and you said things were fine in D3D9. So why did you switch to EVR-CP then instead of switching to madVR with D3D9? ;-O

kael

2015-09-25 20:33

reporter   ~0001178

D3D9 drops frames often enough for me to notice. :-(

madshi

2015-09-25 20:35

administrator   ~0001179

Even in FSE mode? Are those frame drops reported in the OSD?

kael

2015-10-09 16:32

reporter   ~0001186

FSE drops less, but yes. And they show up in the OSD. Usually a pair of 'dropped frame' and 'repeated frame'.

madshi

2015-10-09 17:21

administrator   ~0001188

If you provide a debug log which captures those D3D9 FSE drops/repeats, maybe I can see something. (Or maybe not.) If you do, please zip the debug log. Thanks.

kael

2015-10-24 23:57

reporter   ~0001213

I just wanted to check back in and let you know that I switched out the HDTV I was using for a Dell monitor. I have noticed that the display rate is more steady and the clock deviation doesn't drift much or increase. Coincidentally (?) i haven't had frame drop/repeat issues on this machine anymore using D3D9 in 'present many frames in advance' mode, and it seemed pretty steady in old backbuffer mode (despite the fact that windows won't let me create more than 3 backbuffers). So I think this may have all been a bad interaction between my drivers and a TV with weird timing characteristics. Does that make sense?

madshi

2015-10-25 00:07

administrator   ~0001214

The TV shouldn't have much effect on the HTPC at all. The only way it can influence things is by providing an EDID block with weird timings. Which madVR build have you now tested with? Could it be that the issue you had was also fixed by updating to a newer madVR build? I don't really know...

kael

2015-10-25 00:11

reporter   ~0001215

I'm still using v0.88.21. I noticed this because I remembered seeing the 'display' hz rate drift a bit during playback and the 'clock deviation' would drift a *lot*, where it's quite stable now and very low. I feel like that was probably a bad panel.

kael

2015-10-25 00:14

reporter   ~0001216

I just looked at the change log for newer versions:
* fixed: potential cause for "old frame" flickering when using smooth motion
* fixed: potential cause for "old frame" flickering in new windowed/FSE modes

Even if those weren't for me, thank you very much for taking a shot at fixing them :-)

madshi

2015-10-25 00:15

administrator   ~0001217

It's not like the GPU and display would use a handshake during playback to agree on a common timing. The GPU outputs whatever it likes and the display has to "suck it up" (or fail and show a black screen or "signal out of range" complaint). Because of that, as I said, the influence of the display is limited to providing a specific EDID block which tells the GPU/HTPC which timings the display supports. It's possible that your GPU/HTPC didn't like the EDID block of your old display, I guess, but I don't really know.

Clock deviation drifting could also have to do with which audio output you're using (e.g. SPDIF or HDMI), which audio renderer etc.

Issue History

Date Modified Username Field Change
2015-07-31 23:54 kael New Issue
2015-08-01 13:03 kael Note Added: 0001126
2015-08-01 13:03 kael File Added: Screenshot 2015-08-01 04.00.31.png
2015-08-01 13:08 kael File Added: non-degraded gpuview.png
2015-08-01 13:09 kael Note Added: 0001127
2015-08-01 13:19 madshi Note Added: 0001128
2015-08-01 13:20 kael Note Added: 0001129
2015-08-01 13:23 madshi Note Added: 0001130
2015-08-01 13:26 kael Note Added: 0001131
2015-08-01 13:52 madshi Note Added: 0001132
2015-08-01 13:57 kael Note Added: 0001133
2015-08-02 02:51 kael Note Added: 0001134
2015-08-02 02:52 kael File Added: Screenshot 2015-08-01 17.48.59.png
2015-08-02 09:18 madshi Note Added: 0001135
2015-08-02 13:31 kael Note Added: 0001136
2015-08-02 14:34 madshi Note Added: 0001137
2015-08-09 00:54 kael Note Added: 0001138
2015-08-11 10:59 kael Note Added: 0001139
2015-08-11 11:21 madshi Note Added: 0001140
2015-09-25 20:26 madshi Note Added: 0001174
2015-09-25 20:26 madshi Assigned To => madshi
2015-09-25 20:26 madshi Status new => feedback
2015-09-25 20:31 kael Note Added: 0001176
2015-09-25 20:31 kael Status feedback => assigned
2015-09-25 20:32 madshi Note Added: 0001177
2015-09-25 20:33 kael Note Added: 0001178
2015-09-25 20:35 madshi Note Added: 0001179
2015-10-09 16:29 madshi Status assigned => feedback
2015-10-09 16:32 kael Note Added: 0001186
2015-10-09 16:32 kael Status feedback => assigned
2015-10-09 17:21 madshi Note Added: 0001188
2015-10-09 17:21 madshi Status assigned => feedback
2015-10-24 23:57 kael Note Added: 0001213
2015-10-24 23:57 kael Status feedback => assigned
2015-10-25 00:07 madshi Note Added: 0001214
2015-10-25 00:11 kael Note Added: 0001215
2015-10-25 00:14 kael Note Added: 0001216
2015-10-25 00:15 madshi Note Added: 0001217
2018-01-14 15:51 madshi Status assigned => closed
2018-01-14 15:51 madshi Resolution open => unable to reproduce