-
Notifications
You must be signed in to change notification settings - Fork 34
Improve performance by optimizing critical code sections #1420
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve performance by optimizing critical code sections #1420
Conversation
This method is used in a few places, including in other methods in MethodsV2. As such, it may be called repeatedly with identical arguments in different places. It is expensive in time due to the number of requests that this method generates, so it is particularly interesting to memoize. Doing so yields a very significant performance boost.
Simplify the constructor, which comes with a considerable speedup (about 14% less time spent constructing new objects). Function calls are expensive in Perl, so we start by eliminating the _new() method and replacing them with direct calls to Class::Accessor::new(). Also, there is no need to do something like my $class = ref $proto || $proto because Class::Accessor::new() already does that for us. We trade off a little bit of safety for speed as well: the new() function is now assuming that it’s always called with at most 1 parameter. We also remove a useless type check because the error case was unreachable. Given how often new objects are constructed, speeding this function up even slightly has a significant effect on Zonemaster’s run time.
Throughout a test, the string() method is called over one hundred thousand times. Creating the string is relatively expensive, and memoizing the result is cheap memory-wise. This change is simple and nets us a 5% gain in execution time on a simple run of zonemaster-cli.
We can implement a simple optimization for the string() method if the object is built using from_string(). This saves a little bit of time as well.
A few modifications are made to str_cmp, which is one of the most-used functions in Zonemaster::Engine. Firstly, not unpacking @_ gives us a first speed boost. Secondly, test whether we are comparing with another Zonemaster::Engine::DNSName object (which is almost always the case), so we can use a shortcut if that is the case. Thirdly, use a more efficient regular expression to remove a trailing dot from a name except if it is the only character in the string. The trick, which works with Perl 5.10.0 and up, uses the relatively obscure \K token in the regular expression, which works more or less like a variable-length lookbehind.
I’m less sure about whether these particular optimizations do any good, because the speedup I’ve measured is too small to be statistically significant. The code in question is not used often. But because we know that the “name” attribute of a Zonemaster::Engine::Zone is always a Zonemaster::Engine::DNSName, we can take a little shortcut.
The constructor of Zonemaster::Engine::Nameserver objects is slightly redesigned to speed it up and to precompate objects’ string representations, thus speeding up string() in the process. First of all, we don’t need to write my $class = ref $proto || $proto because Class::Accessor::new() already does that for us. Secondly, there were multiple repeated calls of the same accessor on the same Net::IP::XS object that I’ve tried to factor out. This induces a change that might break software naively processing IPv6 addresses in NS_CREATED messages. More on that later. Last, but not least, we have the opportunity to precompute the object’s string representation cheaply at construction time. The string() method, which is called very very often throughout a Zonemaster run, thus merely returns that precomputed string. The extra memory footprint is insignificant compared to the time gains, however. About the aforementioned breaking change: this function emits NS_CREATED messages whenever a new object is actually constructed instead of being returned from the object cache. These messages have an “ip” parameter which was populated with the output of Net::IP::XS::ip() applied to the address object. As a side-effect of trying to avoid calling the same subroutine multiple times, NS_CREATED messages are now populated with the output of Net::IP::XS::short(). As a result, NS_CREATED messages will use the short form of IPv6 addresses (e.g. 2001:db8:0:8::1) instead of the long form (e.g. 2001:0db8:0000:0008:0000:0000:0000:0001). Which is arguably a good thing, because IPv6 addresses in NS_CREATED messages are now in their canonical form, which wasn’t previously the case. The old practice was bad and should have been avoided from the start.
Slightly speeding up Zonemaster::Engine::Nameserver::query() thanks to a few relatively simple tricks. Firstly, store the name server’s address and the effective profile in intermediate variables. They are accessed often and intermediate variables both make the code slightly easier to read and avoid repeatedly calling methods. They are just accessors, so they are referentially transparent as long as the object is not mutated. Secondly, turn repeated calls of Digest::MD5::add() into one single call, thus avoiding the overhead of repeated calls to this function. Calling Digest::MD5::add("1", "2", "3", …, "n") is equivalent to calling Digest::MD5::add("1"), then again with "2", then "3", and so on.
Use grep more efficiently in functions within Zonemaster::Engine::Packet. The functions modified in this commit are used very frequently in the code, so a detail like this actually gives a 3% speedup on a test run.
1e06bfe
to
c43cdb4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great findings ! This PR combined with zonemaster/zonemaster-ldns#217 and zonemaster/zonemaster-cli#419, I measured a 70% decrease in Zonemaster::Engine::test_zone()
and 58% in total execution time; impressive !
Good work! I just have one nit. I believe Memoize::flush_cache() should be called from Zonemaster::Engine::reset(). |
Good point. We might also need to do that in other cases, such as when we touch the cache. Or when we add fake delegation. Any other possible situations where invalidating the memoized results from MethodsV2 might be on the safe side? |
It looks to me like we're not doing our due diligence with regard to cleaning up fake delegations. We add stuff to %Zonemaster::Engine::Recursor::_fake_addresses_cache but I don't think we every remove anything from it. I believe the caller is responsible for calling Zonemaster::Engine::reset between test runs, and that this would cover the fake delegation situation too. Maybe we should create a replacement for Zonemaster::Engine::test_zone that takes fake delegation and fake ds info as arguments, automatically clears the caches, adds fake delegation and ds (if provided) and starts the test. And then deprecate Zonemaster::Engine::test_zone directing people to use the new method instead. (Btw, we ought to update |
I’ve added a commit that invalidates the MethodsV2 cache when calling Some basic testing:
If we call
|
@tgreenx and @mattias-p, can I have another review of this PR? In particular, are there any things left to be addressed before you’d approve? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks great! I have a couple of questions though.
This commit speeds up two very often-called subroutines in Zonemaster::Engine::Util. ns() is optimized by removing a layer of indirection. It previously called Zonemaster::Engine::ns() which then forwards its arguments, unmodified, to Zonemaster::Engine::Nameserver::new(). Removing this layer of indirection avoids an expensive function call. name() is optimized by not unpacking @_, instead forwarding it directly to Zonemaster::Engine::DNSName::new(), also netting a significant speedup in this function’s execution time.
A nameserver constructed with a name of "0" could be mistakenly cached as having no name, because in Perl, the string "0" is falsey. It’s better to test for string emptyness instead.
A previous commit introduced memoization of one particular function in MethodsV2. Make sure we can invalidate the cache, and make sure it happens when calling Zonemaster::Engine::reset().
66d3468
to
045e18d
Compare
@marc-vanderwal, please merge. |
Purpose
This PR aims to improve performance of Zonemaster::Engine by speeding up some critical sections in the code base.
The net effect of these speedups mostly depends on network conditions: domains whose name servers are fast, work well and always respond to all queries will benefit the most from these changes, while domains having unresponsive name servers might not notice much of a difference. Runs of unit tests take 16% less time; runs of
zonemaster-cli
which use previous data (--restore <FILE>
) also benefit greatly.In practice, a test run on the first 1 000 domains of the Tranco list generated on January 21st, 2025 gave a rather disappointing 1,1 % decrease in run time. I suspect that this is due to a small number of domains that take considerable time to test because some of their name servers do not respond to some of the DNS messages Zonemaster sends.
While the CPU is clearly not the bottleneck for Zonemaster, it’s still nice to consume fewer CPU cycles when testing domains. Zonemaster does a lot of scatter-gather processing where the same query is sent to all name servers before processing all their responses. Parallelizing that will benefit Zonemaster the most.
Context
Discussions on improving performance of Zonemaster during last face-to-face meeting.
Changes
In a nutshell, the speedups are the result of the following types of changes:
@_
in the most critical functions.Reviewers are strongly encouraged to look at each commit one by one, as each commit message thoroughly documents what was done.
I’m setting the V-Major tag because one of the commits changes the string representation of IPv6 addresses in
NS_CREATED
messages. Previously, IPv6 addresses were represented in full (e.g. 2001:0db8:0000:0008:0000:0000:0000:0001); now they are represented using their short and canonical representation (e.g. 2001:db8:0:8::1). Technically, this change could break software processing IPv6 addresses naively inNS_CREATED
messages.How to test this PR
Unit tests must still pass.
If curious about performance improvements, the benchmark I used is: