Speeding up my SIP/MGCP Smalltalk Parsers

Speeding up my SIP/MGCP Smalltalk Parsers

When creating the MGCP and SIP implementation I didn’t want to do string splitting/scanning myself but follow the grammar of the two RFCs. I decided to use the PetitParser framework. PetitParser is a parsing
combinator. Which mostly mean you create small parsers and combine them with things like a sequence (parse this, than that and expect the input to be fully consumed).

For a long time this approach was just okay but I recently started to use the code on a under powered ARM system and the performance started to hurt. The MGCP/SIP Parser is created at runtime and then the result will be used.

Measuring

Besides things being too slow it is important to know how slow it is and to see if a change has made a difference or not. GNU Smalltalk has a Time class that provides a monotonic nanosecond clock. The easiest way to benchmark is to use an approach like:
 [ | start | start := Time nanosecondClock. benchmarkCode. start – Time nanosecondClock] value.

This will give me a number. Now with a language like Smalltalk there will be extra runtime code so there will be a noticeable variance.

Improving the PetitParser GNU Smalltalk port

PetitParser needs to use backtracking to go back in the stream when one element of a choice could not be parsed. This means PetitParser will heavily exercise >>#position, >>#position: and >>#atEnd. In Pharo the implementation has a position variable but in GNU Smalltalk the implementation has a pointer that points to the next character. When a method is called in Smalltalk an activation record (stack frame, context) needs to be created. GNU Smalltalk has the optimization to detect methods that simply return a value or instance variable. In the case of >>#position and >>#position: an arithmetic operation will be executed and can not be optimized. I have decided to change the PetitParser code to use >>#pointer and >>#pointer: to avoid the extra activation cost.

Local PetitParser hacks

In my older port of PetitParser.209 the PPRepeatingParser class will store the parsed elements in an OrderedCollection and at the end convert it to an Array. My client code will simply iterate over the
result and I don’t need to pay the price of the extra memory collection and copying.
With Smalltalk and open classes I can simply load my own/improved/reduced version of the >>#parseOn:
and benefit for the speed-gain in my implementation.

Speeding up parser construction

In PetitParser one defines the parser in selectors and on creating everything from the >>#start will be turned into the parser. In case some basic parsers are re-used one can create instance variables and the PetitParser baseclass will then cache the result in that variable. In the past I haven’t used this mode too much but I had a lot of parts that are used more than once. This was a significant speed-up in parser construction. At least in PetitParser.209 (I think later versions had some improvements) not every caching will make things more quick. It is good to benchmark both number of created parsers and number construction speed.

Simplifying the Grammar

In both MGCP and SIP I have a SIPGrammar/MGCPGrammar class and then a subclass called SIPParser/MGCPParser that creates a structure my client code can work with. When creating the SIPGrammar/MGCPGrammar I followed the RFC BNF but e.g. for matching the possible parameters I had to create a choice parser with many many choices but all of them are in the form of “Key: Value”. So instead of checking the valid keys with the Grammar, I should check the structure in the grammar and do the key validation inside the parser class.

Using PetitParser instead of following the Grammar

In the case of SIP there a lot of rules that specify where whitespace can be. This means I created PPSequenceParsers where some elements consume the space that nobody cares about it. The PPParser class already knows the >>#trim selector to deal with such things. It will automatically take away leading and trailing whitespace.
In SIP (e.g. with the challenge BNF) there is one mandatory parameter followed by optional elements separated by comas. PetitParser has a built-in way to express such things. The >>#separatedBy: selector will take a parser (e.g. one that can parse a comma) as parameter and then parse one or more occurrences.

Using PPObjectPredicateParser

In PetitParser one can write #digit asParser / #word asParser and this will either parse a single digit or a single character. In both cases a PPObjectPredicateParser with a PPCharsetPredicate (with an internal look-up table) will be created. I had many of such occurrences and could simplify the code.

Creating a custom parser

In SIP some parameters can be of the nature of key=value or key=”VALUE”. The later is a quoted-string that permits certain characters and certain escape sequences. The rule to parse this were nested character parsers of choices of choices. The resulting parser was very slow. A simple string to parse a nonce with a quoted-string could take 40ms. I decided to write my own SIPQuotedStringParser.

Summary

The MGCPParser construction time was in the range of 20 seconds, after the change we are in the ballpark of a second or such and the situation was similar for the SIPParser. A testcase to parse a 401 SIP message went from 200ms to 70ms. This is on a slow ARM with the plain interpreter and there should still be some room for improvements.

Outlook

I think PetitParser could have some further optimizations. Instead of the PPCharSetPredicate and the PPObjectPredicateParser there could be a PPCharSetPredicateParser that avoids the call to >>#value:. and one creating two PPCharSetPredicateParsers and joining them with a PPChoiceParser one could simply join the two look-up tables. This would optimize #digit asParser / #blank asParser and save one LookupTable. The next thing would be to create sparse PPCharSetPredicate when one knows that only a subset will be filled.
In terms of GNU Smalltalk we need to port to GNU lightning 2.0 and we would gain JIT support for ARM and can take it from there as well. Another option would be to start having ByteCode to ByteCode optimizations like inlining often called methods (with and without OSR).
Last but not least we could use MrGwen’s GNU lightning bindings and JIT a parser from the PetitParser representation.
Comments are closed.