Rspamd was started to handle mail flows that have grown more than tenfold over the last decade. From the very beginning, the project has targeted highly loaded mail systems, with development focused on performance and scan speed. Rspamd is written in plain C and uses a number of techniques to run fast, especially on modern hardware. At the same time, it can run even on an embedded device with a very constrained environment.
So far, Rspamd can filter hundreds of messages per second on a single 4-core Sandy Bridge machine:
Rspamd can be treated as a faster replacement for the SpamAssassin mail filter, with the ability to scan ten times more messages using the same rules (by means of the SpamAssassin plugin). The next graph shows how switching from SA to Rspamd helped to reduce CPU load on scanner machines:
Global optimizations speed up overall message processing by improving the performance of all filters and arranging checks in an optimal order.
Rspamd uses various methods to speed up each individual message processing stage. This is achieved by applying local optimization techniques:
AST optimizations are used to exclude unnecessary checks from rules. See the following slides for more details about this method.
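To make the idea of excluding unnecessary checks concrete, here is a minimal sketch of short-circuit evaluation over a rule expression tree. It is an illustration only and not Rspamd's actual implementation: the `node` structure, the `eval` function, and the counting atoms are all hypothetical names invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node kinds for a rule expression such as "A & (B | C)". */
enum node_kind { NODE_ATOM, NODE_AND, NODE_OR, NODE_NOT };

struct node {
    enum node_kind kind;
    bool (*check)(void);       /* NODE_ATOM only: the (possibly expensive) check */
    const struct node *left;   /* first operand; the only operand of NODE_NOT */
    const struct node *right;  /* second operand of NODE_AND / NODE_OR */
};

/* Counts how many atom checks actually ran, to make the saving visible. */
static int checks_run = 0;

static bool atom_true(void)  { checks_run++; return true; }
static bool atom_false(void) { checks_run++; return false; }

/* Evaluate the tree, skipping any subtree that cannot change the result. */
static bool eval(const struct node *n)
{
    assert(n != NULL);
    switch (n->kind) {
    case NODE_ATOM:
        return n->check();
    case NODE_NOT:
        return !eval(n->left);
    case NODE_AND:
        /* A false left operand decides the result: the right side is skipped. */
        return eval(n->left) && eval(n->right);
    case NODE_OR:
        /* A true left operand decides the result: the right side is skipped. */
        return eval(n->left) || eval(n->right);
    }
    return false;
}

/* Build "F & (T | T)", evaluate it, and report how many checks ran. */
static int count_checks_for_false_and(void)
{
    const struct node t1 = { NODE_ATOM, atom_true, NULL, NULL };
    const struct node t2 = { NODE_ATOM, atom_true, NULL, NULL };
    const struct node f  = { NODE_ATOM, atom_false, NULL, NULL };
    const struct node or_node  = { NODE_OR, NULL, &t1, &t2 };
    const struct node and_node = { NODE_AND, NULL, &f, &or_node };

    checks_run = 0;
    bool result = eval(&and_node);
    assert(!result);
    return checks_run;
}
```

In this sketch the expression contains three atoms, but only the first one is executed because its result already decides the whole rule. A real optimizer could additionally reorder operands so that cheap or frequently decisive checks run first; that step is not shown here.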
Unlike SA, Rspamd uses specific state machines to parse email components: MIME structure, HTML parts, URLs, images, Received headers, and so on. This approach makes it possible to skip unnecessary details and extract information from emails faster than with a large set of regular expressions.
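As an illustration of the state machine approach, the following sketch extracts the host part of a URL in a single pass over the input, without any regular expressions. It is deliberately simplified (no userinfo, no IPv6 literals, no validation of host characters) and is not taken from Rspamd's parsers; the `url_host` function and the state names are assumptions made for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* States of a hypothetical parser for "scheme://host[:port][/path]". */
enum url_state { ST_SCHEME, ST_SLASH1, ST_SLASH2, ST_HOST, ST_DONE, ST_ERROR };

/*
 * Find the host in `s` with one left-to-right pass.
 * On success stores a pointer to the first host byte in *host and returns
 * the host length; returns 0 if the input does not look like a URL.
 */
static size_t url_host(const char *s, const char **host)
{
    enum url_state st = ST_SCHEME;
    const char *start = NULL;
    const char *end = NULL;
    const char *p;

    assert(s != NULL && host != NULL);

    for (p = s; *p != '\0' && st != ST_DONE && st != ST_ERROR; p++) {
        switch (st) {
        case ST_SCHEME:
            if (*p == ':') {
                st = ST_SLASH1;
            } else if (!isalpha((unsigned char)*p)) {
                st = ST_ERROR;
            }
            break;
        case ST_SLASH1:
            st = (*p == '/') ? ST_SLASH2 : ST_ERROR;
            break;
        case ST_SLASH2:
            if (*p == '/') {
                st = ST_HOST;
                start = p + 1;   /* host begins right after "//" */
            } else {
                st = ST_ERROR;
            }
            break;
        case ST_HOST:
            /* Any of these bytes terminates the host; the rest is ignored. */
            if (*p == '/' || *p == ':' || *p == '?' || *p == '#') {
                end = p;
                st = ST_DONE;
            }
            break;
        default:
            break;
        }
    }

    if (st == ST_HOST) {
        end = p;                 /* input ended while still inside the host */
    } else if (st != ST_DONE) {
        return 0;
    }
    if (end == start) {
        return 0;                /* empty host */
    }
    *host = start;
    return (size_t)(end - start);
}
```

Each input byte is examined once, and the parser stops as soon as the host is complete, so the path and query are never inspected. That early exit is the kind of "skipping unnecessary details" the paragraph above describes.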
The Hyperscan engine is used in Rspamd to quickly process large sets of regular expressions. Unlike traditional regexp engines, Hyperscan can process multiple expressions at the same time. More details about Hyperscan are covered in the following slides.
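The benefit of matching many patterns in one pass can be shown with a much simpler technique than Hyperscan itself. The sketch below is not Hyperscan and does not use its API: it is a bit-parallel Shift-And matcher for a few literal strings, with hypothetical names (`multi_matcher`, `matcher_init`, `matcher_scan`), and it assumes the combined pattern length fits in 64 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PATTERNS 8

/* All patterns are laid out back to back in one 64-bit state word. */
struct multi_matcher {
    uint64_t char_mask[256];      /* bit i set if pattern byte i equals this byte */
    uint64_t first;               /* bits at the first byte of every pattern */
    uint64_t last[MAX_PATTERNS];  /* bit at the last byte of pattern k */
    size_t count;
};

/* Returns 0 on success, -1 if the patterns do not fit into the state word. */
static int matcher_init(struct multi_matcher *m,
                        const char *const *patterns, size_t count)
{
    size_t pos = 0;

    assert(m != NULL && patterns != NULL);
    memset(m, 0, sizeof(*m));
    if (count > MAX_PATTERNS) {
        return -1;
    }
    for (size_t k = 0; k < count; k++) {
        size_t len = strlen(patterns[k]);
        if (len == 0 || pos + len > 64) {
            return -1;
        }
        m->first |= UINT64_C(1) << pos;
        for (size_t i = 0; i < len; i++) {
            unsigned char c = (unsigned char)patterns[k][i];
            m->char_mask[c] |= UINT64_C(1) << (pos + i);
        }
        m->last[k] = UINT64_C(1) << (pos + len - 1);
        pos += len;
    }
    m->count = count;
    return 0;
}

/* Scan the text once; bit k of the result is set if pattern k occurs. */
static unsigned matcher_scan(const struct multi_matcher *m, const char *text)
{
    uint64_t state = 0;
    unsigned found = 0;

    assert(m != NULL && text != NULL);
    for (const unsigned char *p = (const unsigned char *)text; *p != '\0'; p++) {
        /* Advance every partial match of every pattern with one shift and AND. */
        state = ((state << 1) | m->first) & m->char_mask[*p];
        for (size_t k = 0; k < m->count; k++) {
            if (state & m->last[k]) {
                found |= 1u << k;
            }
        }
    }
    return found;
}
```

With a one-pattern-at-a-time engine the text would be scanned once per pattern; here the per-byte work stays nearly constant as patterns are added. Hyperscan applies the same general principle to full regular expressions using far more sophisticated automata.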
Assembly snippets make it possible to optimize specific algorithms for targeted architectures. Rspamd uses assembly for some frequently used cryptography and hashing algorithms, selecting the proper version of the code at runtime based on tests of the CPU's instruction set support.
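The runtime selection described above can be sketched with a function pointer that is chosen once from a CPU feature test. This is a hedged illustration rather than Rspamd's own dispatch code: the function names are hypothetical, the "optimized" variant is only a placeholder that reuses the portable code, and the feature test relies on the GCC/Clang builtin `__builtin_cpu_supports`, which is available on x86 targets only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*hash_fn)(const unsigned char *data, size_t len);

/* Portable reference implementation: 32-bit FNV-1a. */
static uint32_t hash_generic(const unsigned char *data, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/*
 * Placeholder for a hand-tuned variant. In a real project this would be
 * assembly or intrinsics; here it calls the portable code so that both
 * variants are guaranteed to return identical results.
 */
static uint32_t hash_avx2_placeholder(const unsigned char *data, size_t len)
{
    return hash_generic(data, len);
}

/* Pick an implementation based on what the running CPU supports. */
static hash_fn select_hash(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) {
        return hash_avx2_placeholder;
    }
#endif
    return hash_generic;   /* safe fallback on every other platform */
}

/* Public entry point: resolves the implementation on first use.
 * Note: this lazy initialization is not thread-safe as written. */
static uint32_t hash_dispatch(const unsigned char *data, size_t len)
{
    static hash_fn impl = NULL;

    if (impl == NULL) {
        impl = select_hash();
    }
    assert(impl != NULL);
    return impl(data, len);
}
```

Callers only ever use `hash_dispatch`, so the same binary runs on old and new CPUs, and the feature test is paid once rather than on every call.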