Friday 16 August 2013

Dota 2 Wine optimization for Intel GPUs

Dota 2 for Linux implements it's 3D engine by using a Direct3D to OpenGL translation layer called ToGL. I assume that this layer can be used in different ways, but for Dota 2 it seems to be used in a less than ideal way as documented previously here. In short, Dota for Linux compiles 11000 shaders on startup, compared to just 220 the Wine version does. This causes much higher memory usage (1.2 GB vs 2.6 GB) and start-up time (35 seconds vs 1:15 min).


With Wine we actually do get the source of their Direct3D to OpenGL layer called wined3d, since Wine is open source. It's funny, the stack used to run Windows version of Dota 2 is actually more open.

Since Dota 2 for Windows when run on Wine actually outperforms native Linux version in some important aspects, and it's framerate is just slightly less, I decided to take a look on improving its performance.

I've used a tool called apitrace to record a trace of a Dota 2 session with wine so I can analyze the OpenGL calls and look at driver performance warnings (INTEL_DEBUG=perf) with qapitrace.

I optimized two things:

1. Reduce the number of vs and ps constants checked

There were many calls to check values of VS (vertex shader) and PS (pixel shader, also called fragment shaders in OpenGL) constants each frame like this:

532550 glGetUniformLocationARB(programObj = 152, name = "vs_c[4095]") = -1

This function is called from shader_glsl_init_vs_uniform_locations() in glsl_shader.c in
wined3d.

It uses GL_MAX_VERTEX_UNIFORM_COMPONENTS_ARB, defined to be 4096 in #define MAX_UNIFORMS in Mesa source.

Dota 2 doesn't need so many uniforms, most checks return -1, and wined3d checks all of values for both VS
and PS uniforms.

I reduced this number to 256, just enough for Dota 2. This saved thousands of calls per frame.

2. Use fast clear depth more often

Intel driver complains about not being able to use fast depth clears because of scissor being enabled. Turns out that device_clear_render_targets() in wined3d device.c doesn't really need to do glScissor for Dota 2, it's probably an optimization that maps better to Direct3D driver.


A small patch including both optimizations is here:
https://gist.github.com/vrodic/6437312

This patch is a hack, and glScissor part probably breaks other apps, so this is just for Dota 2. It maybe could be made in a better way so it could be merged in Wine, but I'm not wined3d expert.

So how faster is it? A solo mid hero on a setup described in the previous blog post used to get 41 FPS. Now it gets 46-49 FPS. Native version is similar to optimized Wine, but in some situations it gets worse than Wine optimized.

Ideas for improvement:

Dota 2 for Linux needs  ~7500 calls per frame. Wine version, even after my optimizations needs 37000 (EDIT: just as I was writing this post, there were some improvements, now its about 22000).

There is probably a way to optimize this even more, but it's outside of the scope of an afternoon project,  like this was. I'd like to keep on digging though.

4 comments:

  1. How do we know that this game has a great potential?

    First, it is a product from Valve (Half-Life, Counter-Strike, Team Fortress, Left 4dead,...) and IceFrog, which have brought us many good games over the last decade. Another is that Dota will be run by an improved Source engine. If you are familiar with the source engine (for example, it was used in Half-Life series) than you know this is one of the best "engines" a game can have. Dota 2 is going to run through the well-known Steam.



    Dota 2

    ReplyDelete
  2. This is so cool. I am such a huge fan of their work. I really am impressed with how much you have worked to make this website so enjoyable.
    quantum entanglement

    ReplyDelete
  3. This comment has been removed by the author.

    ReplyDelete