Falling from bliss

As long as everything goes smoothly, using a higher-level language with OpenGL is magical: it feels kind of like flying. But sometimes you hit a bug the debugger won’t catch (e.g. a NULL pointer dereference) and it feels like crashing hard to the ground and skidding over dirt and rock.

Everything in RiftSkel was working fine on both Linux and Windows, but when I did a git pull and ran it on my VR system - crash. This was odd enough that I had to try it a few times and do a few sanity checks. Everything had worked fine on that very system when booted into Linux just the previous day. Everything had also worked fine on a different machine (Surface Pro 3) running Windows 8 just minutes before.

Hardware or OS issue?

The table of functionality by GPU and OS looks like this:

GPU/OS    Win7     Win8     Linux
Intel              fine
NVIDIA             CRASH    fine
AMD       CRASH             fine


At this point I was perplexed as to why the program ran fine on both Windows and Linux elsewhere, yet crashed on this particular machine only under Windows. Confusing me even further, the pure LuaJIT main, which called into the very same scene code, worked fine on the VR machine. The crash occurred only in the LuaJIT embedded within the C++ app.

The gamescene script was still working OK. But as soon as I switched over to cubescene2 (the one with textures), the program crashed. This was a tremendous clue - I bet those with lots of OpenGL experience already know what the problem was. I myself had to hack out the GL calls one by one until I found the ones that were crashing - texture functions such as glTexParameteri and glTexImage2D, and glGetIntegerv.

OpenGL Function Loaders

It turns out there is some legacy oddness related to loading OpenGL function pointers on Windows. I had heard of this before but had never been bitten by it, as the excellent GLFW library handled it all for me. In the LuaJIT main, I did as I always did and used GLFW’s loader. But in the embedded version, my thinking was that I didn’t need the LuaJIT GLFW bindings since the window was already created by the C++ app, so I could just use wglGetProcAddress/glXGetProcAddress. Coincidentally, this created no problem with gamescene, as gamescene used no functions from OpenGL 1.1 or earlier. But as soon as a scene called one of those older functions directly exported by OpenGL32.DLL - NULL dereference and crash.
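
To make the symptom concrete, here is a minimal probe sketch (hypothetical code, not from RiftSkel) that asks wglGetProcAddress for a modern entry point and for two of the old ones, assuming a GL context is already current and opengl32.lib is linked:

    #include <windows.h>
    #include <cstdio>

    // Call with an OpenGL context current on this thread.
    void probeGLEntryPoints()
    {
        const char* names[] = {
            "glCreateShader",   // GL 2.0, served by the driver's ICD
            "glTexImage2D",     // GL 1.1, exported directly by OpenGL32.DLL
            "glGetIntegerv",    // GL 1.0, also exported by OpenGL32.DLL
        };
        for (const char* name : names)
        {
            PROC p = wglGetProcAddress(name);
            // On most drivers the 1.0/1.1 entries come back NULL here -
            // exactly the pointers the textured scene then tried to call.
            printf("%-16s -> %p\n", name, (void*)p);
        }
    }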

The way to handle this is to check for invalid values returned from wglGetProcAddress and, in that case, look up the proc addresses with the Win32 GetProcAddress instead. GLFW does it here in _glfwPlatformGetProcAddress. Since I wasn’t doing that fallback myself, all the OpenGL 1.x functions were coming back NULL.
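
A sketch of that two-stage lookup, written in the spirit of GLFW’s approach rather than copied from it, might look like the following. The sentinel check follows the commonly documented note that some implementations return 1, 2, 3 or -1 instead of NULL on failure:

    #include <windows.h>

    typedef void (*GLProc)(void);

    // Assumes a GL context is current on the calling thread.
    GLProc getGLProcAddress(const char* name)
    {
        // Post-1.1 entry points come from the driver's ICD.
        PROC p = wglGetProcAddress(name);

        // Treat NULL and the commonly documented sentinel values as failure.
        INT_PTR v = (INT_PTR)p;
        if (v == 0 || v == 1 || v == 2 || v == 3 || v == -1)
        {
            // OpenGL 1.0/1.1 functions are exported by OpenGL32.DLL itself,
            // so fall back to plain GetProcAddress on that module.
            static HMODULE opengl32 = LoadLibraryA("opengl32.dll");
            p = (PROC)GetProcAddress(opengl32, name);
        }
        return (GLProc)p;
    }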

I thought the simplest way around this was to pass the function pointer for glfwGetProcAddress from C++ (where GLFW is linked) straight into LuaJIT. This required casting it to a double for lua_pushnumber, then casting it back to a function pointer using the FFI. I’m pretty sure this will break with a 64-bit function pointer whose value doesn’t fit in a double’s 53 bits of integer precision.
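
For illustration, here is a sketch of that hand-off with made-up names (exposeGLLoader, glfwGetProcAddress_addr, getProc) and the Lua chunk inlined; the actual RiftSkel code will differ. It assumes GLFW and LuaJIT are both available to the C++ host:

    #include <GLFW/glfw3.h>
    #include <lua.hpp>
    #include <cstdint>

    void exposeGLLoader(lua_State* L)
    {
        // Squeeze the function pointer into a double and hand it to Lua.
        // This only round-trips safely while the address fits in a double's
        // 53 bits of integer precision - hence the 64-bit worry above.
        uintptr_t addr = reinterpret_cast<uintptr_t>(&glfwGetProcAddress);
        lua_pushnumber(L, static_cast<lua_Number>(addr));
        lua_setglobal(L, "glfwGetProcAddress_addr");

        // Lua side: recover the pointer with the FFI and use it as the loader.
        luaL_dostring(L, R"lua(
            local ffi = require("ffi")
            local getProc = ffi.cast("void* (*)(const char*)",
                ffi.new("uintptr_t", glfwGetProcAddress_addr))
            -- e.g. local p = getProc("glTexImage2D")
        )lua");
    }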

But why does it work on Intel/Win8?

Does the Intel graphics driver handle the two-phase function lookup silently? If so, why don’t the NVIDIA and AMD drivers do this? If not - where do the correct proc addresses come from on Surface Pro 3/Win8? I would love to find out more of the details about this.

Everything works again

In the meantime, work can proceed. Back on the jetpack, woohoo!!