GSOC 2016 #4: Sneaky bug
July 26, 2016 · 4 mins to read
In this blog post I want to tell you about a bug in Splash that went undiscovered for almost a year.
First meeting
For the past two weeks I have been working on the HTMLElement class, which makes working with HTML DOM elements easier. I thought I had almost finished it, but when I started writing tests something strange happened. In one test I selected several DOM elements using splash:select and asserted their HTML node types. I had 5 different elements: p, input, div, span and button.
function main(splash)
  assert(splash:go(splash.args.url))
  assert(splash:wait(0.5))

  local p = splash:select('p')
  local input = splash:select('input')
  local div = splash:select('div')
  local span = splash:select('span')
  local button = splash:select('button')

  return {
    p=p:node_property('nodeName'):lower(),
    input=input:node_property('nodeName'):lower(),
    div=div:node_property('nodeName'):lower(),
    span=span:node_property('nodeName'):lower(),
    button=button:node_property('nodeName'):lower(),
  }
end
The weirdness started when the actual types returned by Splash were p, button, button, button, button.
As you can see, the test failed. Note also that only the first element had the correct type, while all the others had the type of the last element. To check this I tried swapping some of the splash:select calls. The result was the same: the first value was correct and the others had the type of the element from the last splash:select call.
Investigation
After some thought I assumed that the issue was in some method that ends up shared (static) across all instances of HTMLElement or _ExposedElement. I examined both classes but didn’t find any strange initialization that overrides the class methods. To check my assumption I logged every splash:select and element:node_property call to see which instance these methods were called on. It turned out that only the first and the last instances of _ExposedElement were used. So the issue had to be in the code that calls these methods.
Where are those methods called from? From Lua. For a moment I thought that our Lua runner (lupa) was broken (because there is an unfixed bug in it), but I quickly discarded that idea. I reasoned that if the bug was in our Lua wrapper code, it should show up somewhere else as well. At that point the only other place where it could go wrong was the return value of splash:call_later, because it creates an instance of _ExposedTimer, the only class that can be instantiated as many times as you want (by contrast, the Splash, Response, Request and Extras classes are instantiated once per Lua script execution). I created several timers and wrote a simple test to check whether my assumption about the bug was right. It was confirmed: the bug is in our Lua wrappers, because I got the same issue with the instances of _ExposedTimer.
I started examining methods of wraputils.lua and noticed several strange things:
- Metamethods are set on the prototype table after each setup_property_access call.
- In those metamethods we use self for getters/setters, but the other properties are retrieved from / assigned to the cls.
So what was happening? Why did the element from the first splash:select work correctly while the others, except the last one, did not? The answer is fairly simple. During the first splash:select call the metamethods for Element weren’t set yet and hence weren’t called, so everything worked as it should. However, that first call sets the metamethods, so on every subsequent call they are invoked when we assign methods to the new Element instance, and the __newindex metamethod stores each method on Element itself rather than on the instance. So span:node_property('nodeName') actually calls Element:node_property because of our __index metamethod, and that method is the one bound to the most recently created element.
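The mechanism can be reproduced outside Splash with a few lines of plain Lua. The sketch below is a simplified stand-in, not the actual wraputils.lua code: the Element table, wrap_buggy and node_name are hypothetical names chosen for illustration.

```lua
-- Simplified stand-in for the wrapper code; all names here are hypothetical.
local Element = {}
Element.__index = Element

local function wrap_buggy(node_name)
  local self = setmetatable({}, Element)

  -- On the first call Element has no __newindex yet, so this lands on self.
  -- On later calls the __newindex below redirects it to Element.
  self.node_name = function() return node_name end

  -- The bug: metamethods are (re)installed on the shared table on every call,
  -- and they read from / write to Element instead of the instance.
  Element.__index = function(_, key) return rawget(Element, key) end
  Element.__newindex = function(_, key, value) rawset(Element, key, value) end

  return self
end

local p = wrap_buggy("p")
local input = wrap_buggy("input")
local span = wrap_buggy("span")
local button = wrap_buggy("button")

print(p:node_name())       --> p
print(input:node_name())   --> button
print(span:node_name())    --> button
print(button:node_name())  --> button
```

Only the first instance owns its method; every later assignment overwrites the single slot on Element, so all later instances resolve to whatever was wrapped last.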
Solution
Once I understood why this happened, the solution came to mind very quickly: assign the getters/setters to self and call rawget and rawset on self. That is what I did in my PR.
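As a rough illustration of that idea in plain Lua, here is a minimal sketch. It is not the code from the PR: wrap_fixed, the _getters/_setters fields and the tag getter are hypothetical names, and the real wrappers deal with Python objects rather than strings.

```lua
-- Hypothetical sketch of the fix; not the code from the actual PR.
local Element = {}

-- Metamethods are installed once and always operate on the instance (self).
Element.__index = function(self, key)
  local getters = rawget(self, "_getters")
  if getters and getters[key] then
    return getters[key](self)      -- computed property
  end
  return rawget(Element, key)      -- fall back to shared class members
end

Element.__newindex = function(self, key, value)
  local setters = rawget(self, "_setters")
  if setters and setters[key] then
    return setters[key](self, value)
  end
  rawset(self, key, value)         -- store on the instance, not on Element
end

local function wrap_fixed(node_name)
  local self = setmetatable({}, Element)
  -- Getters/setters live on the instance; rawset bypasses __newindex.
  rawset(self, "_getters", { tag = function() return node_name:upper() end })
  rawset(self, "_setters", {})
  self.node_name = function() return node_name end
  return self
end

local p = wrap_fixed("p")
local span = wrap_fixed("span")
local button = wrap_fixed("button")

print(p:node_name())       --> p
print(span:node_name())    --> span
print(button:node_name())  --> button
print(span.tag)            --> SPAN
```

Because every write goes through rawset on self, wrapping a new element can no longer overwrite the methods of the ones created before it.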
Conclusion
It was a very interesting bug. While working on it I learned many things about how OOP and metamethods work in Lua. I hope I’ll meet more challenging tasks like this in my future work with Splash.