What is the size of a QList::Data, RenderObject?

We tend to write classes without really caring about what the compiler will do to create the binary file. When you look into performance, and especially memory usage, and you create certain objects thousands of times, it becomes interesting to see how much memory is wasted on padding or for no good reason.

The Linux kernel hackers wrote a tool called pahole that will analyze the DWARF2 symbols and then spit out friendly messages like the one below.


struct Data {
    class QBasicAtomicInt ref;      /*     0     4 */
    int alloc;                      /*     4     4 */
    int begin;                      /*     8     4 */
    int end;                        /*    12     4 */
    uint sharable:1;                /* 16:31     4 */

    /* XXX 31 bits hole, try to pack */

    void * array[1];                /*    20     4 */

    /* size: 24, cachelines: 1, members: 6 */
    /* bit holes: 1, sum bit holes: 31 bits */
    /* last cacheline: 24 bytes */
};

In this case QList::Data could have used at least three bytes less memory, and changing the definition of sharable and array would have removed a hole in the struct. Maybe that is something for Qt5 to keep in mind.
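
To make the padding effect concrete, here is a minimal sketch. The structs Padded and Reordered and the helper functions are made up for illustration; they are not QList::Data. The exact numbers depend on compiler and ABI, so nothing below promises specific sizes.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical structs (not Qt code): same members, different order.
struct Padded {
    char flag;   // usually followed by padding so that 'value' is aligned
    int value;
    char tag;    // usually followed by tail padding
};

struct Reordered {
    int value;
    char flag;   // the two chars can share the space after 'value'
    char tag;
};

// Print one member in a style loosely resembling pahole output.
static void showMember(const char *name, std::size_t offset, std::size_t size)
{
    std::printf("    %-6s /* %2zu %2zu */\n", name, offset, size);
}

// Dump offsets and sizes of both layouts for comparison.
void printLayouts()
{
    std::printf("struct Padded, size %zu\n", sizeof(Padded));
    showMember("flag", offsetof(Padded, flag), sizeof(Padded::flag));
    showMember("value", offsetof(Padded, value), sizeof(Padded::value));
    showMember("tag", offsetof(Padded, tag), sizeof(Padded::tag));

    std::printf("struct Reordered, size %zu\n", sizeof(Reordered));
    showMember("value", offsetof(Reordered, value), sizeof(Reordered::value));
    showMember("flag", offsetof(Reordered, flag), sizeof(Reordered::flag));
    showMember("tag", offsetof(Reordered, tag), sizeof(Reordered::tag));
}
```

Calling printLayouts() on a typical desktop ABI should show the reordered struct being smaller, which is the kind of saving pahole points at.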

The research question: can QtWebKit memory usage be reduced by shrinking some of the Qt structs without losing functionality?

Performance musings

Like many others I enjoy being in Las Palmas at the Gran Canaria Desktop Summit. It is great to see new and old friends, put faces to IRC nicknames… While sitting in talks I started to feel the performance itch. What code is the moc actually generating, how fast is it… Luckily Qt Software released their internal tests and you will find some benchmarks in the tests/benchmark directory.

Looking at QMetaObject and generated code

When using QObject::connect, the following currently happens. For the sender, QMetaObject::indexOfSignal gets called, and for the receiver either QMetaObject::indexOfSlot or QMetaObject::indexOfSignal is invoked. The various QMetaObject::indexOf* functions (for properties, methods, signals, slots) all work in the same way. You start with the current QMetaObject and go through the list of all its methods; if you don’t find anything, you go to the parent QMetaObject and do the same. In effect you are doing a linear search for a signature across the inheritance hierarchy. Once you have found the index of the method within a given QMetaObject, you add the number of methods of all parent QMetaObjects, and the result is the id used by QObject::qt_metacall. The first thing the generated code in ::qt_metacall does is call the parent’s qt_metacall, which subtracts the parent’s method count from the id.
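
The lookup described above can be modelled roughly like this. MetaObject here is a hypothetical stand-in with only a parent pointer and a list of signatures; it is not the real QMetaObject layout, and the real code works on moc-generated tables rather than std::string.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for QMetaObject: only what the lookup needs.
struct MetaObject {
    const MetaObject *parent;           // nullptr for the root class
    std::vector<std::string> methods;   // signatures declared by this class only

    // Number of methods declared by all ancestors of this class.
    int methodOffset() const
    {
        int offset = 0;
        for (const MetaObject *m = parent; m; m = m->parent)
            offset += static_cast<int>(m->methods.size());
        return offset;
    }

    // Linear search through this class, then its ancestors.
    // Returns the absolute id (local index plus offset), or -1 if not found.
    int indexOfMethod(const std::string &signature) const
    {
        for (const MetaObject *m = this; m; m = m->parent) {
            for (std::size_t i = 0; i < m->methods.size(); ++i) {
                if (m->methods[i] == signature)
                    return m->methodOffset() + static_cast<int>(i);
            }
        }
        return -1;
    }
};
```

With a base class declaring two methods and a derived class declaring two more, the first method of the derived class gets the absolute id 2, and a miss costs a walk over every signature in the chain.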

What can be done to improve it

Hashing and the like come to mind, or having a trie for the whole inheritance chain. With things like gperf you could create a perfect hash for the inheritance chain. The problem with having metadata over the whole inheritance chain is that your base class may live in a different library, and if they add a new method there, your hash might not be unique anymore… and obviously you will require more memory when you store the whole inheritance tree…

The easy solution is to sort methods/signals/slots so you can do a binary search in the various indexOf* methods in QMetaObject. So far I have only implemented this, but I have some other ideas from “Self” and JavaScript on how to cheat a bit to make recurring actions like QMetaObject::invokeMethod a lot faster (there is no need to search again for the “slot”).
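
A rough sketch of the binary search idea for a single class follows. SortedMethodTable is a hypothetical helper that keeps a separate sorted table and maps back to the declaration-order id; the actual branch may sort the generated data directly, so treat this as one possible shape and not as what moc emits.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-class table: (signature, local id in declaration order),
// sorted by signature so that lookups can use binary search.
class SortedMethodTable {
public:
    explicit SortedMethodTable(const std::vector<std::string> &signatures)
    {
        for (std::size_t i = 0; i < signatures.size(); ++i)
            m_entries.emplace_back(signatures[i], static_cast<int>(i));
        std::sort(m_entries.begin(), m_entries.end());
    }

    // O(log n) lookup; returns the local id, or -1 if the signature is unknown.
    int indexOf(const std::string &signature) const
    {
        auto it = std::lower_bound(
            m_entries.begin(), m_entries.end(), signature,
            [](const std::pair<std::string, int> &entry, const std::string &key) {
                return entry.first < key;
            });
        if (it != m_entries.end() && it->first == signature)
            return it->second;
        return -1;
    }

private:
    std::vector<std::pair<std::string, int>> m_entries;
};
```

Keeping the original id next to each signature means the ids handed to qt_metacall would not have to change; only the search order does.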

Another thing: once you have searched for and found the index of the slot/signal/method, you might just save the QMetaObject and the id instead of adding the offset. When emitting the signal you then avoid going through the hierarchy, because you already know which ::qt_metacall you want to call…

Non academic benchmark

The code can be seen in a branch on gitorious and for some tests in the QObject::connect/QObject::disconnect benchmark the new code is 30% faster. This happens when you have to go down in the inheritance tree to find the signal… For some other cases there is no difference. What is missing is code to deal with old generated code…

Taking over memprof

Where did all my memory go? Who is allocating it, and how much is being allocated? From where were these QImages allocated? valgrind provides an accurate leak checker, but for a running application you might want to know about allocations and browse through them without taking the performance hit of valgrind (e.g. with massif).

There is an easy way to answer these questions: use memprof. memprof used to be a GNOME application. It was unmaintained and its website was gone from the net, but this tool is just way too good to let it disappear. After trying to reach the maintainer twice I decided to adopt the orphaned thing.

Check the application out, it is great, it helps me to get an overview of memory allocations for WebKit/GTK+…

Long overdue…

A long overdue blog post… I’m currently in Taipei… canceled my original flight back to Germany, instead I will go to Hong Kong and then probably back to Taipei. So if you are in this area and would like to talk about WebKit, Linux, KDE, OpenBSC, OpenEmbedded… drop me a mail.

Thank you for KDE4.2

Thank you for the Club Mate at the KDE 4.2 release event in Berlin hosted by KDAB. Thanks to kubuntu I can enjoy KDE 4.2 on my intrepid installation. Which in turn allows me to use KMail 1.11.0 which is featuring the new wicked cool views and it looks like there were some nice oxygen style updates as well. Well done!

This time of the year

Last year around this time I had already resigned from GMIT and was figuring out what to do with myself. Thiago finished his new networking classes, Simon integrated most of them into WebKit at around FOSS.IN, we fixed the regressions of the layout test suite and I was tracking down bugs with SSL, a cookie issue on Yahoo Mail, a crash on Gmail… This year things look awfully similar. I know a little better what to do with myself, and I have fixed a funny issue of QNetworkAccessCache (a new class in Qt4.5) breaking the JavaScript SunSpider benchmark.

If you write code that uses QObject::startTimer and never calls QObject::killTimer, you leak a timer. But if you manage to call QObject::killTimer twice, chances are you have killed a timer belonging to someone else. To be safe you would have to reset the timerId you saved to an invalid timerId after killing it. The resulting code will look ugly though, but Qt to the rescue! There is a shiny new class in Qt4.0 called QBasicTimer. It should not add any overhead and will avoid the above issue.
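
The double-kill problem and the reset-the-id fix can be shown without Qt. In this sketch TimerRegistry and GuardedTimer are made-up names: the registry is a toy stand-in for the startTimer/killTimer bookkeeping and assumes that freed ids get handed out again, and the guard only imitates the idea behind QBasicTimer, not its actual API.

```cpp
#include <set>

// Hypothetical stand-in for the timer bookkeeping behind
// QObject::startTimer/killTimer; freed ids are reused (an assumption).
class TimerRegistry {
public:
    int startTimer()
    {
        int id = 1;
        while (m_active.count(id))
            ++id;                       // pick the smallest free id
        m_active.insert(id);
        return id;
    }
    void killTimer(int id) { m_active.erase(id); }
    bool isActive(int id) const { return m_active.count(id) != 0; }

private:
    std::set<int> m_active;
};

// Guard in the spirit of QBasicTimer: it forgets the id in stop(), so a
// second stop() cannot kill a timer that now belongs to someone else.
class GuardedTimer {
public:
    explicit GuardedTimer(TimerRegistry &registry) : m_registry(registry) {}
    GuardedTimer(const GuardedTimer &) = delete;
    GuardedTimer &operator=(const GuardedTimer &) = delete;
    ~GuardedTimer() { stop(); }

    void start()
    {
        stop();                         // never hold two timers at once
        m_id = m_registry.startTimer();
    }
    void stop()
    {
        if (m_id != 0) {
            m_registry.killTimer(m_id);
            m_id = 0;                   // reset to "no timer"
        }
    }
    int timerId() const { return m_id; }

private:
    TimerRegistry &m_registry;
    int m_id = 0;
};
```

With a raw int you would have to repeat the "kill, then set to 0" dance at every call site; wrapping it once keeps the calling code from getting ugly.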

FOSS.IN and the need for more events

I have sadly missed this year’s FOSS.IN. The goal of this conference is to turn India into a nation of FOSS contributors. There are plenty of people, awesome food!, there is a huge software industry, and companies like Tata Consulting are even on Level 5 of the CMMI model. This means there is a huge potential! But when I get my daily mail on webkit-dev I recognize that it is still a long way from simply consuming, to attempting to think, to contributing. And many more events like FOSS.IN need to occur until there is a noticeable difference.

On the other hand with people like Girish, Prashant, pradeepto, Shreyas I have high hopes that the goal will be reached.

Making your own dumplings

If I were in Taipei I could ask Tick, Jeremy, Erin, Olv, Julian, John to have lunch with me and eat dumplings at one of the restaurants pretty close to the office, and lie that it is my first time eating with chopsticks and that my skills are not bad for a first time… But I’m currently not in Taipei and I have no place to get dumplings. So today was the day to make vegetarian dumplings myself. The process is really simple, you don’t even need a fancy bamboo steamer, just don’t put too much sugar into it… working with tofu and vegetables is also fun. See you soon in Taipei…

The result:
CIMG1493.JPG
CIMG1486.JPG
The filling: Carrots, Pepper, Tofu, Ginger, some Chilli, Garlic, a bit of onions..
CIMG1479.JPG

Flying back to Taipei

Yesterday I spent most of my day watching videos that featured Alan Kay. These included videos about Squeak, Seaside, Croquet, and demos of old (1977!) 3D, fully anti-aliased, zooming interfaces from MIT. This was concluded with reading about Self and PEP. When watching these kinds of things I wonder why the computing world is as it is today and why all these things have not made it into mainstream computing yet.

I will soon enter a train to Frankfurt, will arrive early at the airport, board an airplane and will be back in Taipei for a couple of days. Anyone interested in talking about WebKit, Smalltalk or other things please go ahead and give me a call.

Pushing things forward

There is one thing about TinySVG1.2 that I really like: the possibility to embed audio and video. For video you can do transformations, filters and the usual SVG stuff. TinySVG1.2 is popping up in more specs, and recently I began to read DIMS again and, thanks to the support of GMIT, I had a go at it.

This shows parts of the TinySVG spec, with the video replaced by Code Rush. It is dog slow and needs some refactoring of SVGSMILElement and parts of the SVG RenderObjects to be mergeable. The code should pop up in my holger/dims branch soon.

Hacking on the WebKit codebase is so much fun that I seriously wonder if I want more of that…

svg12_video