Rspamd vs Spamassassin performance comparison

2016-03-03 00:00:00 +0000

Just before the 1.2 release, I measured the performance of rspamd compared to SA. In this experiment, I took the rspamd master branch with its default rules and then added all rules from SA using the spamassassin plugin. Hence, the two scanners ran with almost exactly the same set of rules.

This set is quite large and includes about 3k custom regexp rules. Rspamd ran without hyperscan and pcre2, so it performed literally the same job as SA does. Here are the results for about 100k scanned messages:

Total False Positives: 517
Total False Negatives: 348
Total messages: 101349

Total SA time: 423942 seconds, total rspamd time: 33149 seconds
Average SA time: 4182ms/msg, average rspamd time: 327ms/msg

So the difference in verdicts is less than 1%, and in many cases rspamd does a better job than SA thanks to, for example, multiple hits of URIBL rules, phishing detection and some other differences. And it is still about 13 times faster than SA. Moreover, it uses less memory and can process more messages in parallel. In other experiments, rspamd was able to process about 450 messages per second on a single 4-core Sandy Bridge scanner box.
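
The summary figures can be re-derived from the raw totals listed above. The following Python sketch only repeats that arithmetic; it is not part of the benchmark tooling, and the variable names are illustrative assumptions. It also assumes that the "difference in checks" is the sum of false positives and false negatives divided by the total number of messages.

```python
# Raw totals copied from the results listed above.
false_positives = 517
false_negatives = 348
total_messages = 101349
sa_seconds = 423942
rspamd_seconds = 33149

# Assumption: the "difference" is every message where the verdicts disagree.
disagreement_rate = (false_positives + false_negatives) / total_messages

# How many times less total time rspamd needed for the same corpus.
speedup = sa_seconds / rspamd_seconds

# Average time per message in milliseconds, truncated to whole milliseconds.
sa_ms_per_msg = int(sa_seconds * 1000 / total_messages)
rspamd_ms_per_msg = int(rspamd_seconds * 1000 / total_messages)

print(f"disagreement: {disagreement_rate:.2%}")    # 0.85%
print(f"speedup: {speedup:.1f}x")                  # 12.8x
print(f"SA: {sa_ms_per_msg} ms/msg")               # 4182 ms/msg
print(f"rspamd: {rspamd_ms_per_msg} ms/msg")       # 327 ms/msg
```

The speedup works out to roughly 12.8, which rounds to the 13 times quoted above.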

I plan to release rspamd 1.2 very soon with a lot of cool features, including dynamic rule updates. I would appreciate any help in testing the experimental packages. In fact, they are already used in production and are even more stable than the 1.1 branch.

Rspamd switches to Apache 2 license

2016-02-04 00:00:00 +0000

In the modern world, software patents are a significant threat to Open Source software. Therefore, I have decided to switch from the original BSD license to the Apache 2 license. While the Apache license has the same permissive clauses as the BSD license, it also deals with software patents explicitly: it contains both a patent grant and a patent retaliation clause.

The other licensing terms have not changed: you can still use the code in your projects, and you are not obliged to open your modifications to the code as the GPL requires. Contributed code is still licensed under BSD license.

Rspamd 1.1.2 has been released

2016-01-29 00:00:00 +0000

The next feature release 1.1.2 of rspamd is out. This release contains some important improvements:

  • Add support for forged confirmation headers (by @AdUser)
  • Improve multimap plugin: add filtering support
  • Add rspamadm statconvert utility to convert statistical tokens and learn cache from sqlite3 to redis
  • Add logging for slow rules and regexps
  • Add mime_types plugin to check sanity of mime types in messages

Bugfixes and minor improvements in this version:

  • Fix stat_cache closing
  • Add checkpoints to sqlite3 learn cache
  • Do not recompile lua generated headers all the time
  • Increase number of messages learned
  • Fix issues with dual stack and hfilter
  • Disable MID checks for hfilter by default
  • Fix cache definitions in multiple classifier and no type
  • Don’t crash if learn cache failed to initialize
  • Fix googlegroups support in maillist plugin
  • Rework flags Lua API:
    • Allow to check for a specific flag
    • Add learn_spam, learn_ham and broken_headers flags
    • Unify internal functions
  • Allow any, mime and smtp for get_from/get_recipients
  • Add rule to detect spammers’ attempts to cheat mime parsing
  • Rework parsing of IP addresses in configuration (better IPv6 support)
  • Add util.parse_mail_address function to Lua API
  • Add lua sqlite3 module
  • Implement synchronous redis call
  • Ratelimit: avoid possible indexing of nil value (Fixes #498) (by @fatalbanana)
  • Implement redis advanced lua api with pipelining
  • Fix memory leak on redis stat (#500)
  • Fix user/language learn count in sqlite statistics (#496) (by @fatalbanana)
  • Fix build with custom pcre
  • Fix fuzzy relearning (#498)
  • Improve planning of asynchronous tasks
  • Add base32 decode/encode routines to lua util
  • Allow converting of learn cache from sqlite to redis
  • Add methods to check if a message has from/rcpts
  • Disable reload command in rc scripts
  • Improve runtime CPU dispatcher for libcryptobox
  • Add preliminary support of digital signatures via ed25519
  • Add detection for RDRAND support
  • Print configuration of crypto on start
  • A in SPF presumes AAAA lookup as well

This version is fully backward compatible with 1.1.0.

Rspamd 1.1 has been released

2016-01-18 00:00:00 +0000

The next major release of rspamd, 1.1.0, is now out. In this version, I did another round of architectural rework. This time, I primarily refactored the fuzzy storage, regular expression processing and statistics.

A number of features that I was constantly asked about have been added to rspamd, including:

  • Autolearning for BAYES: statistics can learn on good or bad messages automatically
  • Redis backend for statistics to enable distributed and fast redis storage for rspamd cluster
  • Scalable fuzzy storage: it is now possible to scale hashes storage across multiple processes to process thousands of requests per second

There is also a major performance improvement: hyperscan engine support for optimizing the execution of regular expressions.

With this version, I have added a lot of documentation, including tutorials and an improved quick start guide.

Here is the full changelog for this version available on github:

The new version is almost 100% backward compatible with the 1.0 branch, but please check the migration document if you are using per-user statistics and rspamd-1.0.

Here are some graphs of rspamd scanning performance:

So you can see that rspamd can scan as many as 200 messages per second while consuming less than 50% CPU of a typical scanner machine (Xeon E5405 single CPU).

Rmilter has also been upgraded to version 1.7.0, which brings full IPv6 support, redis cache support and a major cleanup of unused and broken stuff. Rmilter changelog is available here:

Rspamd 1.0.11 has been released

2015-12-21 00:00:00 +0000

The next bugfix-only release 1.0.11 of rspamd is out. This release contains the following changes:

  • Fix spf redirects
  • Fix domains when parsing mx/ptr/a records in includes/redirects
  • Fix unfolded base64 encoding
  • Fix GError use-after-free
  • Do not rewrite the original url when using redirector
  • Fix parsing of fragment in urls
  • Fix processing of HTML tags
  • Improve empty image rule
  • Avoid long double type
  • Fix tokens weights in OSB algorithm
  • Improve debugging for bayes

This version is backward compatible with 1.0.0.

The 1.0 branch is now considered stable, and all development has moved to the master branch, which is going to be the next major release, 1.1.