Rule Syntax

The syntax for creating rules is based on logical expressions that evaluate to either True (matching) or False (non-matching). Rules support a small set of data types, which can be defined as literals or resolved using the Python object to which the rule is applied. See the Data Types table for a comprehensive list of the supported types.

Not all supported operations work with all data types, as noted in the table below. Rules follow a standard order of operations.

Grammar

The expression grammar supports a number of operations including basic arithmetic for numerical data and regular expressions for strings. Operations are type aware and will raise an exception when an incompatible type is used.

Supported Operations

The following table outlines all operators that can be used in Rule Engine expressions.

Operation  Description               Compatible Data Types
---------  ------------------------  ------------------------------------------------------------

Arithmetic Operators
+          Addition                  BYTES, DATETIME, FLOAT, STRING, TIMEDELTA [1]
-          Subtraction               FLOAT, DATETIME, TIMEDELTA
*          Multiplication            FLOAT
**         Exponent                  FLOAT
/          True division             FLOAT
//         Floor division            FLOAT
%          Modulo                    FLOAT

Bitwise-Arithmetic Operators
&          Bitwise-and [2]           FLOAT, SET
|          Bitwise-or [2]            FLOAT, SET
^          Bitwise-xor [2]           FLOAT, SET
>>         Bitwise right shift [2]   FLOAT
<<         Bitwise left shift [2]    FLOAT

Comparison Operators
==         Equal to                  ANY
!=         Not equal to              ANY

Arithmetic-Comparison Operators
>          Greater than              ARRAY, BOOLEAN, DATETIME, TIMEDELTA, FLOAT, NULL, STRING [3]
>=         Greater than or equal to  ARRAY, BOOLEAN, DATETIME, TIMEDELTA, FLOAT, NULL, STRING [3]
<          Less than                 ARRAY, BOOLEAN, DATETIME, TIMEDELTA, FLOAT, NULL, STRING [3]
<=         Less than or equal to     ARRAY, BOOLEAN, DATETIME, TIMEDELTA, FLOAT, NULL, STRING [3]

Fuzzy-Comparison Operators
=~         Regex match [4]           NULL, STRING
=~~        Regex search [4]          NULL, STRING
!~         Regex match fails [4]     NULL, STRING
!~~        Regex search fails [4]    NULL, STRING

Logical Operators
and        Logical and               ANY
not        Logical not               ANY
or         Logical or                ANY
?, :       Ternary operator          ANY

Membership Operators
in         Membership check          ARRAY, BYTES, MAPPING, SET, STRING

Accessor Operators
.          Attribute access          ARRAY, BYTES, DATETIME, MAPPING, SET, STRING, TIMEDELTA
&.         Safe attribute access     ARRAY, BYTES, DATETIME, MAPPING, NULL, SET, STRING, TIMEDELTA
[          Item lookup               ARRAY, BYTES, MAPPING, STRING
&[         Safe item lookup          ARRAY, BYTES, MAPPING, NULL, STRING

[1] Addition operations involving DATETIME and TIMEDELTA must have a TIMEDELTA value on the right. TIMEDELTA values can be added to other TIMEDELTA values or to DATETIME values, but DATETIME values cannot be added to other DATETIME values. The remaining types (BYTES, STRING, and FLOAT) must be added to values of the same type.

[2] Bitwise operations support floating point values, but if the value is not a natural number, an EvaluationError will be raised.

[3] The arithmetic comparison operators support multiple data types; however, the data type of the left value must be the same as the data type of the right. For example, a STRING can be compared to another STRING but not to a FLOAT. The comparison uses the same lexicographical, sequence-based ordering technique as Python.

[4] When using regular expression operations, the expression on the left is the string to compare and the expression on the right is the regular expression to use for either the match or search operation.
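The difference between the match and search operators can be illustrated with Python's own re module. This is only an analogy, assuming that "match" anchors at the start of the string while "search" scans the whole string, as in Python; the sample string is made up and the code is not the library's implementation.

```python
import re

subject = "Luke Skywalker"  # the left-hand operand in a rule

# "match" semantics: the pattern must match starting at the first character
match_hit = re.match(r"Luke", subject) is not None        # like: subject =~ "Luke"
match_miss = re.match(r"Skywalker", subject) is not None  # pattern is not at the start

# "search" semantics: the pattern may match anywhere in the string
search_hit = re.search(r"Skywalker", subject) is not None  # like: subject =~~ "Skywalker"

print(match_hit, match_miss, search_hit)
```

The negated operators (!~ and !~~) would correspond to inverting these results.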

Accessor Operators

Some data types support accessor operators to obtain sub-values and attributes. One example is STRING, which supports both attribute and item lookup operations. For example, “length” is a valid attribute and can be accessed by appending .length to either a string literal or symbol. Alternatively, a specific character in a string can be accessed by index. For example, the first character in a string can be referenced by appending [0] to either the string literal or symbol. Attempts to look up either an invalid attribute or item will raise a LookupError.

Both attribute and item lookups have “safe” variants which use the & operator prefix (not to be confused with the bitwise-and operator, which uses the same symbol). The safe version will evaluate to NULL instead of raising an exception when the container value on which the operation is applied is NULL. Additionally, the safe version of item lookup operations will evaluate to NULL instead of raising a LookupError exception when the item is not held within the container. This is analogous to Python’s dict.get() method.

The item lookup operation can also evaluate to an array when a stop boundary is provided. For example, the first four elements of a string can be referenced by appending [0:4] to the end of the value. Alternatively, only the ending index may be specified using [:4]. Finally, just as in Python, negative values can be used to reference the last elements.
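As a rough Python analogy for the lookups described above (not the library's implementation), the sketch below uses ordinary indexing and slicing, plus a hypothetical safe_item helper that returns None where the safe operator would evaluate to NULL.

```python
def safe_item(container, key):
    """Hypothetical helper mirroring the described safe item lookup."""
    if container is None:
        # a NULL container yields NULL instead of an exception
        return None
    try:
        return container[key]
    except (IndexError, KeyError):
        # a missing item yields NULL, much like dict.get()
        return None

name = "Star Wars"
first_char = name[0]    # item lookup by index
first_four = name[0:4]  # lookup with a stop boundary
same_four = name[:4]    # only the ending index specified
last_char = name[-1]    # negative index references the last element

print(first_char, first_four, same_four, last_char)
print(safe_item({"a": 1}, "b"), safe_item(None, 0), safe_item(name, 100))
```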

Array Comprehension

An operation can be applied to each member of an iterable value to generate a new ARRAY composed of the resulting values. This could, for example, be used to determine how many values within an array match an arbitrary condition. The syntax is very similar to Python’s list comprehension and is composed of three mandatory components with an optional condition expression. The three required components, in order from left to right, are the result expression, the variable assignment, and the iterable (followed by the optional condition). Each component uses a reserved keyword as a delimiter and the entire expression is wrapped within brackets just like an array literal.

For example, to square an array of numbers: [ v ** 2 for v in [1, 2, 3] ]. In this case, the resulting expression is the square operation (v ** 2) which uses the variable v defined in the assignment. Finally, the operation is applied to the array literal [1, 2, 3], which could have been any iterable value.

An optional condition, introduced with the if keyword, may be applied to the value before the result expression is evaluated. Building on the previous example, if only the squares of the odd numbers were needed, the expression could be updated to: [ v ** 2 for v in [1, 2, 3] if v % 2 ]. This example uses the modulo operator to filter out even values.

One limitation of the array comprehension syntax when compared to Python’s list comprehension is that the variable assignment may not contain more than one value. There is currently no support for unpacking multiple values as Python does (e.g. [ v for k,v in my_dict.items() if test(k) ]).
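For comparison, the Python list comprehensions that the rule syntax resembles are shown below. This is plain Python, not rule text; the pairs dictionary is a made-up value used only to show the unpacking form that the rule syntax does not support.

```python
# the same expressions as the rule examples, evaluated by Python
squares = [v ** 2 for v in [1, 2, 3]]
odd_squares = [v ** 2 for v in [1, 2, 3] if v % 2]

# counting how many values match an arbitrary condition
match_count = len([v for v in [1, 2, 3] if v > 1])

# unpacking multiple values works in Python only, not in rule expressions
pairs = {"a": 1, "b": 2}
values_for_a = [v for k, v in pairs.items() if k == "a"]

print(squares, odd_squares, match_count, values_for_a)
```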

Ternary Operators

The ternary operator can be used in place of a traditional “if-then-else” statement. As in other languages, the question mark and colon are used as the expression delimiters. A ternary expression consists of a condition, followed by the expression used when the condition is true, and ends with the expression used when the condition is false.

For example: condition ? true_case : false_case
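For readers more familiar with Python, the closest equivalent is the conditional expression, which places the condition in the middle rather than first. The sketch below is an analogy only; the size symbol and the labels are hypothetical.

```python
def size_label(size):
    # rule form (hypothetical symbol): size > 10 ? "large" : "small"
    return "large" if size > 10 else "small"

print(size_label(25))
print(size_label(3))
```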

Function Calls

Function calls can be performed on function symbols by placing parentheses after them. The parentheses contain zero or more argument expressions to pass to the function. Functions support optional positional arguments. For example, a function can take two arguments, one or both of which can specify a default value and then be omitted when called. Functions do not support keyword arguments.

The builtin split function, for example, can be called with up to three arguments. The first is required while the other two are optional. The split symbol requires the $ prefix to access the builtin value.

# with only the required argument, an unlimited number of splits is performed on whitespace
$split("Star Wars")         # => ('Star', 'Wars')

# the optional second argument specifies an alternative string to split on
$split("Star Wars", "r")    # => ('Sta', ' Wa', 's')

# the optional third argument specifies the maximum number of times to split the string
$split("Star Wars", "r", 1) # => ('Sta', ' Wars')

# raises FunctionCallError because the second argument must be a string; the third argument
# cannot be specified without the second
$split("Star Wars", 1)      # => FunctionCallError: data type mismatch (argument #2)
$split("Star Wars", ' ', 1) # => ('Star', 'Wars')

Reserved Keywords

The following keywords are reserved and cannot be used as the names of symbols.

Keyword  Description
-------  ---------------------------------------------------------------
null     The NullExpression literal value

Array Comprehension
for      Array comprehension result and assignment delimiter
if       Array comprehension iterable and (optional) condition delimiter

Booleans (BooleanExpression Literals)
true     The “True” boolean value
false    The “False” boolean value

Floats (FloatExpression Literals)
inf      Floating point value for infinity
nan      Floating point value for not-a-number

Logical Operators
and      Logical “and” operator
not      Logical “not” operator
or       Logical “or” operator

Membership Operators
in       Checks member is in the container

Reserved For Future Use
elif     Reserved for future use
else     Reserved for future use
while    Reserved for future use

Literal Values

DATETIME, STRING, and TIMEDELTA literal values are specified in a very similar manner by defining the value as a string of characters enclosed in either single or double quotes. The difference comes in an optional leading character before the opening quote. Either no leading character or a single s will specify a standard STRING value, while a single d will specify a DATETIME value, and a single t will specify a TIMEDELTA value.

Literal DATETIME Values

DATETIME literals must be specified in ISO-8601 format. The underlying parsing logic is provided by dateutil.parser.isoparse(). DATETIME values with no time specified (e.g. d"2019-09-23") will evaluate to a DATETIME of the specified day at exactly midnight.

Example rules showing equivalent literal expressions:

  • d"2019-09-23" == d"2019-09-23 00:00:00" (dates default to midnight unless a time is specified)

  • d"2019-09-23" == d"2019-09-23 00:00:00-04:00" (only equivalent when the local timezone is EDT)
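The midnight default can be checked with Python's standard library. The sketch below uses datetime.fromisoformat() as a stand-in for the dateutil.parser.isoparse() function named above, so edge cases may differ from what the library accepts.

```python
from datetime import datetime, timedelta

# a date with no time component parses to midnight of that day
date_only = datetime.fromisoformat("2019-09-23")
with_time = datetime.fromisoformat("2019-09-23 00:00:00")

# an explicit UTC offset produces a timezone-aware value
with_offset = datetime.fromisoformat("2019-09-23 00:00:00-04:00")

print(date_only == with_time)
print(with_offset.utcoffset() == timedelta(hours=-4))
```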

Literal FLOAT Values

FLOAT literals may be expressed in binary, octal, decimal, or hexadecimal format. The binary, octal, and hexadecimal formats use the 0b, 0o, and 0x prefixes respectively. Decimal values require no prefix, as decimal is the default base in which values are represented. Only base-10 (decimal) values may include a fractional component.

Example rules showing equivalent literal expressions:

  • 0b10 == 2

  • 0o10 == 8

  • 10.0 == 10

  • 0x10 == 16

FLOAT literals may also be expressed in scientific notation using the letter e.

Example rules showing equivalent literal expressions:

  • 1E0 == 1

  • 1e0 == 1

  • 1.0e0 == 1
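Python happens to use the same prefixes and exponent notation for its own numeric literals, so the equivalences above can be sanity-checked there. This is an analogy only; the rule grammar is parsed by the library, not by Python.

```python
# base-prefixed strings parsed with base 0, which honours the 0b/0o/0x prefix
binary = int("0b10", 0)
octal = int("0o10", 0)
hexadecimal = int("0x10", 0)

# scientific notation accepts either case of the letter e
upper_e = float("1E0")
lower_e = float("1.0e0")

print(binary, octal, hexadecimal, upper_e, lower_e)
```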

Literal TIMEDELTA Values

TIMEDELTA literals must be specified in a subset of the ISO-8601 format for durations. Everything except years and months is supported in TIMEDELTA values, to match the underlying representation provided by the Python standard library.

Example rules showing equivalent literal expressions:

  • t"P1D" == t"PT24H" (24 hours in a day)

  • t"P1D" == t"PT1440M" (1,440 minutes in a day)
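To make the duration subset concrete, here is a minimal sketch of a parser that maps such strings onto datetime.timedelta. The parse_duration helper is hypothetical and deliberately simplified (whole-number weeks, days, hours, minutes, and seconds only); it is not the parser the library uses.

```python
import re
from datetime import timedelta

# years and months are intentionally absent, matching the subset described above
_DURATION = re.compile(
    r"^P(?:(?P<weeks>\d+)W)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)

def parse_duration(text):
    """Hypothetical helper: parse a simplified ISO-8601 duration."""
    match = _DURATION.match(text)
    if match is None:
        raise ValueError("unsupported duration: " + text)
    parts = {name: int(value) for name, value in match.groupdict().items() if value is not None}
    if not parts:
        raise ValueError("duration has no components: " + text)
    return timedelta(**parts)

print(parse_duration("P1D") == parse_duration("PT24H"))
print(parse_duration("P1D") == parse_duration("PT1440M"))
```

Because timedelta has no notion of years or months, a string such as "P1Y" is rejected by this sketch with a ValueError.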

Comments

A single # symbol can be used to create a comment in the rule text. Everything after the first # occurrence will be ignored.

Example rule containing a comment: size == 1 # this is a comment

Builtin Symbols

The following symbols are provided by default using the from_defaults() method. These symbols can be accessed through the $ prefix, e.g. $pi. The default values can be overridden by defining a custom subclass of Context and setting the builtins attribute.

Functions

Note

The following functions use a pseudo syntax to define their signature for use within rules. The signature is:

functionName(argumentType argumentName, ...) -> returnType

abs(FLOAT value) -> FLOAT

returns:

FLOAT

value:

(FLOAT) The numeric to get the absolute value of.

Returns the absolute value of value.

Added in version 4.1.0.

all(ARRAY[??] values) -> BOOLEAN

returns:

BOOLEAN

values:

(ARRAY of anything) An array of values to check.

Returns true if every member of the array argument is truthy. If values is empty, the function returns true.

any(ARRAY[??] values) -> BOOLEAN

returns:

BOOLEAN

values:

(ARRAY of anything) An array of values to check.

Returns true if any member of the array argument is truthy. If values is empty, the function returns false.

filter(FUNCTION function, ARRAY[??] values) -> ARRAY[??]

returns:

ARRAY of anything

function:

(FUNCTION) The function to call on each of the values.

values:

(ARRAY of anything) The array of values to apply function to.

Returns an array containing a subset of members from values where function returns true.

map(FUNCTION function, ARRAY[??] values) -> ARRAY[??]

returns:

ARRAY of anything

function:

(FUNCTION) The function to call on each of the values.

values:

(ARRAY of anything) The array of values to apply function to.

Returns an array containing the result of calling function on each member of values.

max(ARRAY[FLOAT] values) -> FLOAT

returns:

FLOAT

values:

(ARRAY of FLOAT) An array of values to check.

Returns the largest value from the array of values. If values is empty, a FunctionCallError is raised.

min(ARRAY[FLOAT] values) -> FLOAT

returns:

FLOAT

values:

(ARRAY of FLOAT) An array of values to check.

Returns the smallest value from the array of values. If values is empty, a FunctionCallError is raised.

parse_datetime(STRING value) -> DATETIME

returns:

DATETIME

value:

(STRING) The string value to parse into a timestamp.

Parses the string value into a DATETIME value. The string must be in ISO-8601 format and if it fails to parse, a DatetimeSyntaxError is raised.

parse_float(STRING value) -> FLOAT

returns:

FLOAT

value:

(STRING) The string value to parse into a numeric.

Parses the string value into a FLOAT value. The string must be properly formatted and if it fails to parse, a FloatSyntaxError is raised.

parse_timedelta(STRING value) -> TIMEDELTA

returns:

TIMEDELTA

value:

(STRING) The string value to parse into a time period.

Parses the string value into a TIMEDELTA value. The string must be properly formatted and if it fails to parse, a TimedeltaSyntaxError is raised.

random([FLOAT boundary]) -> FLOAT

returns:

FLOAT

boundary:

(Optional FLOAT) The upper boundary to generate a random number for.

Generate a random number. If boundary is not specified, the random number returned will be between 0 and 1. If boundary is specified, it must be a natural number and the random number returned will be between 0 and boundary, including boundary.
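The two modes described above can be sketched with Python's random module. The rule_random helper below is hypothetical, and it assumes that a supplied boundary produces a whole number in the inclusive range, in the manner of random.randint(); the library's actual behaviour may differ in detail.

```python
import random

def rule_random(boundary=None):
    """Hypothetical stand-in for the described random() behaviour."""
    if boundary is None:
        # no boundary: a value between 0 and 1
        return random.random()
    if boundary < 0 or boundary != int(boundary):
        raise ValueError("boundary must be a natural number")
    # boundary given: a value between 0 and boundary, including boundary
    return float(random.randint(0, int(boundary)))

print(0 <= rule_random() < 1)
print(0 <= rule_random(5) <= 5)
```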

range(FLOAT start, [FLOAT stop, FLOAT step]) -> ARRAY[FLOAT]

returns:

ARRAY of FLOAT

start:

(FLOAT) The value of the start parameter.

stop:

(Optional FLOAT) The value of the stop parameter. If not supplied, the start value will be used as stop instead.

step:

(Optional FLOAT) The value of the step parameter (or 1 if the parameter was not supplied).

Generate a sequence of FLOAT values between start (inclusive) and stop (exclusive) by step.
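The argument handling reads much like Python's builtin range(), so the sketch below uses it as an analogy. It assumes that a single argument counts up from 0, as in Python; note that Python's range() works on integers whereas the rule function is described in terms of FLOAT values.

```python
# a single argument is treated as the stop value
first_five = list(range(5))

# start is inclusive, stop is exclusive
two_to_five = list(range(2, 5))

# an explicit step
by_threes = list(range(0, 10, 3))

print(first_five, two_to_five, by_threes)
```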

split(STRING string, [STRING sep, FLOAT maxsplit]) -> ARRAY[STRING]

returns:

ARRAY of STRING

string:

(STRING) The string value to split into substrings.

sep:

(Optional STRING) The value to split string on.

maxsplit:

(Optional FLOAT) The maximum number of times to split string.

Split a string value into substrings. If sep is not specified, the string will be split on all whitespace. If sep is specified, string will be split on that value. This alters how consecutive spaces are handled. When sep is not specified, consecutive whitespace is treated as a single unit and reduced, whereas if sep is a single space, consecutive spaces will result in empty strings being returned.

For example:

$split("A    B")      # => ('A', 'B')
$split("A    B", ' ') # => ('A', '', '', '', 'B')

If maxsplit is specified, it must be a natural number and will be used as the maximum number of times to split string. This will guarantee that the resulting array length is less than or equal to maxsplit + 1.
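Python's str.split() handles separators and split limits in the same way, so it can serve as a quick reference for the behaviour described here. The snippet is plain Python rather than rule text, and it returns lists where the rule function returns an ARRAY.

```python
text = "A    B"  # four consecutive spaces

collapsed = text.split()      # no separator: runs of whitespace are reduced
preserved = text.split(" ")   # explicit separator: empty strings are kept

# limiting the number of splits bounds the result length to maxsplit + 1
maxsplit = 1
limited = "Star Wars".split("r", maxsplit)

print(collapsed, preserved, limited)
```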

sum(ARRAY[FLOAT] values) -> FLOAT

returns:

FLOAT

values:

(ARRAY of FLOAT) An array of values to add.

Returns the sum of an array of values. If values is empty, the function returns 0.
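The empty-array behaviour documented for all, any, sum, max, and min lines up with Python's builtins of the same names, which the sketch below demonstrates. It is an analogy only: Python raises ValueError for an empty max() or min(), where the rule functions are documented to raise FunctionCallError.

```python
empty = []

all_empty = all(empty)  # vacuously true
any_empty = any(empty)  # no truthy member exists
sum_empty = sum(empty)  # the additive identity

# max() and min() have no sensible result for an empty sequence
try:
    max(empty)
    max_raised = False
except ValueError:
    max_raised = True

print(all_empty, any_empty, sum_empty, max_raised)
```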