Module rspamd_text

This module provides access to opaque text structures that are widely used to avoid copying data between Lua and C, for reasons such as performance and security.

You can convert an rspamd_text into a Lua string, but doing so copies the data.

Brief content:

Functions:

Function | Description
rspamd_text.fromstring(str) | Creates rspamd_text from Lua string (copied to the text).
rspamd_text.null() | Creates rspamd_text with NULL pointer for testing purposes.
rspamd_text.randombytes(nbytes) | Creates rspamd_text with random bytes inside (raw bytes).
rspamd_text.fromtable(tbl[, delim]) | Same as table.concat but generates rspamd_text instead of the Lua string.

Methods:

Method | Description
rspamd_text:byte(pos[, pos2]) | Returns a byte at the position pos or bytes from pos to pos2 if specified.
rspamd_text:len() | Returns the length of the text in bytes.
rspamd_text:str() | Converts text to string by copying its content.
rspamd_text:ptr() | Converts text to lightuserdata.
rspamd_text:save_in_file(fname[, mode]) | Saves text in file.
rspamd_text:span(start[, len]) | Returns a span of the text starting at position start (1-indexed) with length len (or to the end of the text).
rspamd_text:sub(start[, len]) | Returns a substring of the text, similar to string.sub from Lua.
rspamd_text:lines([stringify]) | Returns an iterator over all lines as rspamd_text objects or as strings if stringify is true.
rspamd_text:split(regexp, [stringify]) | Returns an iterator over all encounters of the specific regexp as rspamd_text objects or as strings if stringify is true.
rspamd_text:at(pos) | Returns a byte at the position pos.
rspamd_text:memchr(chr, [reverse]) | Returns the first or the last position of the character chr in the text, or -1 if the character is not found.
rspamd_text:bytes() | Converts text to an array of bytes.
rspamd_text:lower([is_utf, [inplace]]) | Returns a new text with lowercased characters; if is_utf is true, Rspamd applies UTF-8 lowercasing.
rspamd_text:exclude_chars(set_to_exclude, [always_copy]) | Returns a text (if owned, the original text is modified; if not, it is copied and owned) with all characters from set_to_exclude removed.
rspamd_text:oneline([always_copy]) | Returns a text (if owned, the original text is modified; if not, it is copied and owned) with space and newline sequences replaced by single spaces.
rspamd_text:base32([b32type]) | Returns a text encoded in base32 (new rspamd_text is allocated).
rspamd_text:base64([line_length, [nline, [fold]]]) | Returns a text encoded in base64 (new rspamd_text is allocated).
rspamd_text:hex() | Returns a text encoded in hex (new rspamd_text is allocated).
rspamd_text:find(pattern [, init]) | Looks for the first match of pattern in the text.

Functions

The module rspamd_text defines the following functions.

Function rspamd_text.fromstring(str)

Creates rspamd_text from Lua string (copied to the text)

Parameters:

  • str {string}: string to use

Returns:

  • {rspamd_text}: resulting text
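
A minimal sketch of the round trip between a Lua string and rspamd_text. It assumes the code runs inside Rspamd's Lua environment (for example, in a plugin), where require "rspamd_text" resolves; the sample string is arbitrary.

```lua
-- Assumption: running inside Rspamd's embedded Lua; stock Lua has no rspamd_text.
local rspamd_text = require "rspamd_text"

-- The string contents are copied into the new text object.
local t = rspamd_text.fromstring("hello world")

-- str() copies the data back into a fresh Lua string.
local s = t:str()
assert(s == "hello world")
assert(t:len() == #s)
```

Both directions copy the data, so the conversion is best kept at the edges of the code that handles large bodies.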

Function rspamd_text.null()

Creates rspamd_text with NULL pointer for testing purposes

Parameters:

No parameters

Returns:

  • {rspamd_text}: resulting text

Function rspamd_text.randombytes(nbytes)

Creates rspamd_text with random bytes inside (raw bytes)

Parameters:

  • nbytes {number}: number of random bytes generated

Returns:

  • {rspamd_text}: random bytes text
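
As a rough illustration, the sketch below generates 16 random bytes and hex-encodes them for display. It assumes Rspamd's Lua environment; the variable name nonce and the size 16 are arbitrary choices for the example.

```lua
-- Assumption: running inside Rspamd's embedded Lua.
local rspamd_text = require "rspamd_text"

-- 16 raw random bytes; the count is arbitrary for this example.
local nonce = rspamd_text.randombytes(16)
assert(nonce:len() == 16)

-- Raw bytes are rarely printable, so encode them before printing.
local printable = nonce:hex():str()
print(printable)
```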

Function rspamd_text.fromtable(tbl[, delim])

Same as table.concat but generates rspamd_text instead of the Lua string

Parameters:

  • tbl {table}: table to use
  • delim {string}: optional delimiter

Returns:

  • {rspamd_text}: resulting text
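
The sketch below joins a table of strings and compares the result with table.concat. It assumes Rspamd's Lua environment; the table contents and the delimiter are made up for the example.

```lua
-- Assumption: running inside Rspamd's embedded Lua.
local rspamd_text = require "rspamd_text"

local parts = { "From", "To", "Subject" }

-- Builds the joined text directly, without an intermediate Lua string.
local joined = rspamd_text.fromtable(parts, ", ")

-- For a flat table of strings the content should match table.concat.
assert(joined:str() == table.concat(parts, ", "))
```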

Methods

The module rspamd_text defines the following methods.

Method rspamd_text:byte(pos[, pos2])

Returns a byte at the position pos or bytes from pos to pos2 if specified

Parameters:

  • pos {integer}: index
  • pos2 {integer}: index

Returns:

  • {integer}: byte at the position pos or varargs of bytes
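
A short sketch of both call forms, assuming Rspamd's Lua environment and an ASCII sample string:

```lua
-- Assumption: running inside Rspamd's embedded Lua.
local rspamd_text = require "rspamd_text"

local t = rspamd_text.fromstring("abc")

-- Single position: one byte value.
local first = t:byte(1)
assert(first == string.byte("a"))

-- Range: one return value per byte, like string.byte.
local b1, b2, b3 = t:byte(1, 3)
assert(b1 == 97 and b2 == 98 and b3 == 99)
```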

Method rspamd_text:len()

Returns the length of the text

Parameters:

No parameters

Returns:

  • {number}: length of string in bytes

Method rspamd_text:str()

Converts text to string by copying its content

Parameters:

No parameters

Returns:

  • {string}: copy of text as Lua string

Method rspamd_text:ptr()

Converts text to lightuserdata

Parameters:

No parameters

Returns:

  • {lightuserdata}: pointer value of rspamd_text

Method rspamd_text:save_in_file(fname[, mode])

Saves text in file

Parameters:

  • fname {string}: name of the file to write
  • mode {number}: optional file mode

Returns:

  • {boolean}: true if save has been completed
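
The following sketch writes a text to a temporary file and reads it back with the standard io library. It assumes Rspamd's Lua environment. The path comes from os.tmpname and is only an example destination; treating a second return value as an error message is an assumption, not something stated above.

```lua
-- Assumption: running inside Rspamd's embedded Lua.
local rspamd_text = require "rspamd_text"

-- Example destination only; remove any placeholder file that os.tmpname
-- created, in case save_in_file refuses to overwrite an existing file.
local path = os.tmpname()
os.remove(path)

local t = rspamd_text.fromstring("saved by rspamd_text\n")

-- A second return value carrying an error message is an assumption.
local ok, err = t:save_in_file(path)
if not ok then
  print("save failed: " .. tostring(err))
end

-- Read the file back with plain Lua I/O to check the contents.
local content
local fh = io.open(path, "rb")
if fh then
  content = fh:read("*a")
  fh:close()
  os.remove(path)
end
```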

Method rspamd_text:span(start[, len])

Returns a span of the text starting at position start (1-indexed) with length len (or extending to the end of the text)

Parameters:

  • start {integer}: start index
  • len {integer}: length of span

Returns:

  • {rspamd_text}: new rspamd_text with span (must be careful when using with owned texts…)
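
A small sketch of taking spans with and without an explicit length, assuming Rspamd's Lua environment. The comment about keeping the parent alive reflects the caution in the return description and the assumption that a span refers to the parent's data rather than copying it.

```lua
-- Assumption: running inside Rspamd's embedded Lua.
local rspamd_text = require "rspamd_text"

local t = rspamd_text.fromstring("hello world")

local head = t:span(1, 5) -- 5 bytes starting at index 1
local tail = t:span(7)    -- from index 7 to the end of the text

assert(head:str() == "hello")
assert(tail:str() == "world")

-- Assumption: spans reference the parent's data, so keep `t` reachable
-- (and unmodified) for as long as head and tail are in use.
```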

Method rspamd_text:sub(start[, len])

Returns a substring for lua_text similar to string.sub from Lua

Parameters:

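  • start {integer}: start index
  • len {integer}: optional second argument; by analogy with string.sub it may act as an end index rather than a length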

Returns:

  • {rspamd_text}: new rspamd_text with span (must be careful when using with owned texts…)
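
The sketch below compares sub with string.sub on the same data, assuming Rspamd's Lua environment. The calls are chosen so that the result is the same whether the second argument is read as a length or as an end index.

```lua
-- Assumption: running inside Rspamd's embedded Lua.
local rspamd_text = require "rspamd_text"

local s = "hello world"
local t = rspamd_text.fromstring(s)

-- From index 7 to the end.
assert(t:sub(7):str() == s:sub(7))

-- Starting at 1, a length of 5 and an end index of 5 select the same bytes.
assert(t:sub(1, 5):str() == s:sub(1, 5))
```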

Method rspamd_text:lines([stringify])

Returns an iterator over all lines as rspamd_text objects or as strings if stringify is true

Parameters:

  • stringify {boolean}: stringify lines

Returns:

  • {iterator}: iterator triplet
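
A minimal sketch of line iteration with stringify enabled, assuming Rspamd's Lua environment. The input uses plain \n separators and no empty lines to keep the example unambiguous.

```lua
-- Assumption: running inside Rspamd's embedded Lua.
local rspamd_text = require "rspamd_text"

local t = rspamd_text.fromstring("one\ntwo\nthree")

-- With stringify = true each line arrives as a Lua string.
local seen = {}
for line in t:lines(true) do
  seen[#seen + 1] = line
end

print(table.concat(seen, " | "))
```

Without stringify, each line is an rspamd_text, which avoids copying when the lines are only passed on to other Rspamd functions.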

Method rspamd_text:split(regexp, [stringify])

Returns an iterator over all encounters of the specific regexp as rspamd_text objects or as strings if stringify is true

Parameters:

  • regexp {rspamd_regexp}: regexp (pcre syntax) used for splitting
  • stringify {boolean}: stringify lines

Returns:

  • {iterator}: iterator triplet
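
The sketch below iterates over a comma-separated text and collects whatever the iterator yields. It assumes Rspamd's Lua environment and that the regexp object is built with rspamd_regexp.create from the separate rspamd_regexp module, which is not described in this document; the pattern and input are made up.

```lua
-- Assumption: running inside Rspamd's embedded Lua, with the rspamd_regexp
-- module available alongside rspamd_text.
local rspamd_regexp = require "rspamd_regexp"
local rspamd_text = require "rspamd_text"

-- Comma followed by optional whitespace (PCRE syntax).
local re = rspamd_regexp.create("/,\\s*/")
local t = rspamd_text.fromstring("alpha, beta,gamma")

-- With stringify = true each yielded element is a Lua string.
local fields = {}
for field in t:split(re, true) do
  fields[#fields + 1] = field
end

print(table.concat(fields, "|"))
```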

Method rspamd_text:at(pos)

Returns a byte at the position pos

Parameters:

  • pos {integer}: index

Returns:

  • {integer}: byte at the position pos or nil if pos out of bound

Method rspamd_text:memchr(chr, [reverse])

Returns the first or the last position of the character chr in the text, or -1 if the character is not found. Indexes start from 1.

Parameters:

  • chr {string/number}: character or a character code to find
  • reverse {boolean}: last character if true

Returns:

  • {integer}: position of the character or -1
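
A short sketch of forward and reverse searches, assuming Rspamd's Lua environment. Note that -1 is a truthy value in Lua, so the result has to be compared explicitly rather than used directly in a condition.

```lua
-- Assumption: running inside Rspamd's embedded Lua.
local rspamd_text = require "rspamd_text"

local t = rspamd_text.fromstring("hello")

local first_l = t:memchr("l")               -- first "l"
local last_l = t:memchr("l", true)          -- last "l"
local o_pos = t:memchr(string.byte("o"))    -- lookup by character code
local missing = t:memchr("z")               -- not present

assert(first_l == 3)
assert(last_l == 4)
assert(o_pos == 5)

-- Compare against -1; `if missing then` would always be true.
assert(missing == -1)
```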

Method rspamd_text:bytes()

Converts text to an array of bytes

Parameters:

No parameters

Returns:

  • {table|integer}: bytes in the array (as unsigned char)
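
For illustration, the sketch below converts a three-byte text into a table and inspects it, assuming Rspamd's Lua environment and a conventional 1-based Lua array.

```lua
-- Assumption: running inside Rspamd's embedded Lua.
local rspamd_text = require "rspamd_text"

local t = rspamd_text.fromstring("abc")

-- One integer (0..255) per byte of the text.
local arr = t:bytes()

assert(#arr == t:len())
assert(arr[1] == 97 and arr[2] == 98 and arr[3] == 99)
```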

Method rspamd_text:lower([is_utf, [inplace]])

Returns a new text with lowercased characters; if is_utf is true, Rspamd applies UTF-8 lowercasing

Parameters:

  • is_utf {boolean}: apply utf8 lowercase
  • inplace {boolean}: lowercase the original text

Returns:

  • {rspamd_text}: new rspamd_text (or the original text if inplace) with lowercased letters
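
A minimal sketch of the default (non-UTF-8, not in place) call, assuming Rspamd's Lua environment and an ASCII input:

```lua
-- Assumption: running inside Rspamd's embedded Lua.
local rspamd_text = require "rspamd_text"

local t = rspamd_text.fromstring("Hello WORLD")

-- No arguments: byte-wise lowercase into a new text.
local lowered = t:lower()
assert(lowered:str() == "hello world")

-- The source text is left as it was because inplace was not requested.
assert(t:str() == "Hello WORLD")
```

Passing true as the first argument selects UTF-8 lowercasing, and true as the second lowercases the original text instead of allocating a new one.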

Method rspamd_text:exclude_chars(set_to_exclude, [always_copy])

Returns a text (if owned, the original text is modified; if not, it is copied and owned) where all characters from set_to_exclude are removed. Supported patterns:

  • %s - all space characters
  • %n - all newline characters
  • %c - all control characters (it includes 8bit characters and spaces)
  • %8 - all 8 bit characters
  • %% - just a percent character

Parameters:

  • set_to_exclude {string}: characters to exclude
  • always_copy {boolean}: always copy the source text

Returns:

  • {rspamd_text}: modified or copied text
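
The sketch below removes a literal character and all newlines in one call, assuming Rspamd's Lua environment. It passes always_copy = true so the source text is left untouched; the input string is made up.

```lua
-- Assumption: running inside Rspamd's embedded Lua.
local rspamd_text = require "rspamd_text"

local t = rspamd_text.fromstring("a_b\nc_d")

-- "_" is a literal character, "%n" stands for all newline characters.
-- always_copy = true asks for a new text instead of modifying `t`.
local cleaned = t:exclude_chars("_%n", true)

assert(cleaned:str() == "abcd")
assert(t:str() == "a_b\nc_d")
```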

Method rspamd_text:oneline([always_copy])

Returns a text (if owned, then the original text is modified, if not, then it is copied and owned) where the following transformations are made:

  • All spaces sequences are replaced with a single space
  • All newlines sequences are replaced with a single space
  • Trailing and leading spaces are removed
  • Control characters are excluded
  • UTF8 sequences are normalised

Parameters:

  • always_copy {boolean}: always copy the source text

Returns:

  • {rspamd_text}: modified or copied text
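
A rough sketch of flattening a folded, padded value onto one line, assuming Rspamd's Lua environment. The input is invented, and always_copy = true is used so the source text is not modified.

```lua
-- Assumption: running inside Rspamd's embedded Lua.
local rspamd_text = require "rspamd_text"

local t = rspamd_text.fromstring("  Subject:   hello\r\n world  ")

-- Ask for a copy so that `t` keeps its original contents.
local flat = t:oneline(true)
local flat_str = flat:str()

-- Shown in brackets to make leading and trailing spaces visible.
print("[" .. flat_str .. "]")
```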

Method rspamd_text:base32([b32type])

Returns a text encoded in base32 (new rspamd_text is allocated)

Parameters:

  • b32type {string}: base32 type (default, bleach, rfc)

Returns:

  • {rspamd_text}: new text encoded in base32

Method rspamd_text:base64([line_length, [nline, [fold]]])

Returns a text encoded in base64 (new rspamd_text is allocated)

Parameters:

  • line_length {number}: maximum line length; the returned text is split with newlines at this length
  • nline {string}: newline type: cr, lf, crlf
  • fold {boolean}: use folding when splitting into lines (false by default)

Returns:

  • {rspamd_text}: new text encoded in base64
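
The sketch below shows the plain call and a wrapped variant, assuming Rspamd's Lua environment. The line length 76 and the crlf newline type are example values, and the random payload only serves to produce enough output to wrap.

```lua
-- Assumption: running inside Rspamd's embedded Lua.
local rspamd_text = require "rspamd_text"

-- No arguments: a single unbroken base64 line.
local b64 = rspamd_text.fromstring("hello"):base64()
assert(b64:str() == "aGVsbG8=")

-- Wrapped output: lines of up to 76 characters separated by CRLF.
local payload = rspamd_text.randombytes(100)
local wrapped = payload:base64(76, "crlf")
print(wrapped:str())
```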

Method rspamd_text:hex()

Returns a text encoded in hex (new rspamd_text is allocated)

Parameters:

No parameters

Returns:

  • {rspamd_text}: new text encoded in hex
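
A minimal sketch, assuming Rspamd's Lua environment. Each input byte becomes two hex digits, so the result is twice as long as the source.

```lua
-- Assumption: running inside Rspamd's embedded Lua.
local rspamd_text = require "rspamd_text"

local t = rspamd_text.fromstring("abc")

-- "a" = 0x61, "b" = 0x62, "c" = 0x63
local encoded = t:hex()
assert(encoded:str() == "616263")
assert(encoded:len() == 2 * t:len())
```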

Method rspamd_text:find(pattern [, init])

Looks for the first match of pattern in the text. If a match is found, find returns the indices where the occurrence starts and ends; otherwise, it returns nil. The optional numeric argument init specifies where to start the search; its default value is 1, and it can be negative. This method currently supports plain substring search only, not Lua patterns.

Parameters:

  • pattern {string}: pattern to find
  • init {number}: specifies where to start the search (1 default)

Returns:

  • {number,number/nil}: start and end indices of the first match, or nil if there is no match
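
A short sketch of plain substring search, assuming Rspamd's Lua environment. The last call illustrates that pattern characters such as "." are matched literally.

```lua
-- Assumption: running inside Rspamd's embedded Lua.
local rspamd_text = require "rspamd_text"

local t = rspamd_text.fromstring("hello world")

-- A hit returns the start and end indices of the match.
local s, e = t:find("world")
assert(s == 7 and e == 11)

-- A miss returns nil.
assert(t:find("planet") == nil)

-- Plain search: "." is a literal dot here, not "any character".
assert(t:find("w.rld") == nil)
```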
