Mikael Manukyan

GSOC 2016 #4: Sneaky bug

July 26, 2016 · 4 mins to read

In this blog post I want to tell you about a bug in Splash that went undiscovered for almost a year.

First meeting

For the past two weeks I have been working on the HTMLElement class, which makes working with HTML DOM elements easier. I thought I had almost finished it, but when I started to write tests, one strange thing happened. In one test I selected several DOM elements using splash:select and asserted their HTML node types. I had 5 different elements: p, input, div, span and button.

function main(splash)
    assert(splash:go(splash.args.url))
    assert(splash:wait(0.5))

    local p = splash:select('p')
    local input = splash:select('input')
    local div = splash:select('div')
    local span = splash:select('span')
    local button = splash:select('button')


    return {
        p=p:node_property('nodeName'):lower(),
        input=input:node_property('nodeName'):lower(),
        div=div:node_property('nodeName'):lower(),
        span=span:node_property('nodeName'):lower(),
        button=button:node_property('nodeName'):lower(),
    }
end

The weird part was that the actual types returned by Splash were p, button, button, button, button.

As you can see, the test failed. Notice also that only the first element had the correct type, while all the others had the type of the last element. To check that, I tried swapping some of the splash:select calls. The result was the same: the first value was correct and the others had the type of the element returned by the last splash:select.

Investigation

After some thought I assumed that the issue was in some method that ends up shared (static) across all instances of HTMLElement or _ExposedElement. I examined both classes but didn’t find any strange initialization that overrides the class methods. To check my assumption I logged every splash:select and element:node_property call to see the instance on which these methods were called. It turned out that only the first and the last instances of _ExposedElement were used. So the issue had to be in the code that calls these methods.

Where are those functions called from? From Lua. For a moment I thought that our Lua runner (lupa) was broken (because there is an unfixed bug in it), but I quickly discarded that idea. I reasoned that if the bug was in our Lua wrapper code, it should show itself somewhere else too. At that point the only other place where it could show up was the return value of splash:call_later, because it creates an instance of _ExposedTimer, which is the only class that can be instantiated as many times as you want (in contrast, the Splash, Response, Request and Extras classes are created once during the Lua script execution). I initialized several timers and wrote a simple test to check whether my assumption about the bug was right. And it was confirmed - the bug is in our Lua wrappers, because I got the same issue with the instances of _ExposedTimer.

I started examining methods of wraputils.lua and noticed several strange things:

  1. Metamethods are initialized on the prototype table after each setup_property_access call.

  2. In those metamethods we use self for getters/setters, but the other properties are retrieved from and assigned to cls.

So what was happening? Why did the first splash:select element work correctly while the others, except the last one, did not? The answer is fairly simple. During the first call of splash:select the metamethods for Element weren’t set yet and hence weren’t called, so everything worked as it should. However, after the first call we set those metamethods, so on every subsequent call they are invoked when we assign methods to the new instance of Element, and the __newindex metamethod stores each method on Element itself rather than on the instance. Each call therefore overwrites the methods stored by the previous one, and only those of the last element survive. So executing span:node_property('nodeName') actually calls Element:node_property because of our __index metamethod.
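To make the mechanism concrete, here is a minimal plain-Lua sketch that reproduces the same symptom. It is not the real wraputils.lua: the names buggy_wrap and node_name are hypothetical, and the metamethods are reduced to the bare minimum needed to show methods leaking onto the shared prototype.

```lua
-- Hypothetical, simplified stand-in for the wrapper code; not the real wraputils.lua.
local Element = {}  -- shared prototype table, used as the metatable of every instance

local function buggy_wrap(name)
  local self = {}
  setmetatable(self, Element)

  -- First call: Element has no __newindex yet, so this is a plain assignment on self.
  -- Later calls: the __newindex below intercepts it and writes to Element instead.
  self.node_name = function() return name end

  -- Metamethods are (re)installed on the prototype after each wrap,
  -- and they read from / write to the prototype rather than the instance.
  Element.__index = function(_, key) return rawget(Element, key) end
  Element.__newindex = function(_, key, value) rawset(Element, key, value) end
  return self
end

local p = buggy_wrap('p')
local input = buggy_wrap('input')
local button = buggy_wrap('button')

-- Only the first instance owns its method; the rest share the last one stored on Element.
print(p:node_name(), input:node_name(), button:node_name())  -- p, button, button
```

The first instance keeps its own method because it was assigned before any metamethod existed. Every later instance stays empty, so lookups fall through __index to whatever the last call stored on Element.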

Solution

Once I understood why that happened, the solution came to mind very quickly: assign getters/setters to self and call rawget and rawset on self. That is what I did in my PR.
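The idea can be sketched in plain Lua as follows. This is an illustration under assumptions, not the code from the PR: fixed_wrap, node_name and tag are hypothetical names, and installing the metamethods once up front is a simplification chosen for this sketch.

```lua
-- Hypothetical, simplified stand-in for the wrapper code; not the real wraputils.lua.
local Element = {}

-- Metamethods only ever read from and write to the instance (obj).
Element.__index = function(obj, key)
  local getters = rawget(obj, '__getters')
  if getters and getters[key] then
    return getters[key](obj)
  end
  return nil
end

Element.__newindex = function(obj, key, value)
  local setters = rawget(obj, '__setters')
  if setters and setters[key] then
    setters[key](obj, value)
  else
    rawset(obj, key, value)  -- store on the instance, never on Element
  end
end

local function fixed_wrap(name)
  local self = {}
  -- Getters and setters live on the instance, not on the shared prototype.
  rawset(self, '__getters', { tag = function() return name end })
  rawset(self, '__setters', {})
  setmetatable(self, Element)
  -- Goes through __newindex and lands on self.
  self.node_name = function() return name end
  return self
end

local p = fixed_wrap('p')
local input = fixed_wrap('input')
local button = fixed_wrap('button')

print(p:node_name(), input:node_name(), button:node_name())  -- p, input, button
```

Because every write ends in rawset on the instance, wrapping a new element can no longer overwrite the methods of the elements created before it.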

Conclusion

It was a very interesting bug. While working on it I learned many things about how OOP and metamethods work in Lua. I hope that I’ll meet more challenging tasks like this in my future work with Splash.

  • gsoc
