Data Tests¶
About¶
SAYN provides an extension to the functionality of the columns
field in task definitions that enables the user to make use of predefined (standard) or custom tests for their table data fields.
Standard tests are implemented for the autosql
and copy
task types, while custom tests are not task bound and can be applied to any table in the warehouse. Custom tests work like SAYN sql
tasks with a specific SQL query structure that only execute during the SAYN test suite.
Running sayn test
through the CLI will execute the standard and custom tests for a given project.
All CLI usage for tasks applies to tests as well, with the major difference being that tests don't make use of a DAG to determine the order of execution (so attempting to execute tests on parents of children of a task is not supported).
Examples:
sayn test -t test_name
: runtask_name
only.sayn test -t test1 test2
: runstask1
andtask2
only.sayn test -x test_name
: run all tests excepttask_name
.
Test Types¶
Please see below the available SAYN test types:
unique
: is applied on any number of columns of a given table and is responsible for validating uniqueness. This will also define constraints during the table creation where applicable.not_null
: is applied on any number of columns of a given table and is responsible for validating nullity (or rather the lack of it). This will also define constraints during the table creation where applicable.allowed_values
: is applied on any number of columns of a given table and is responsible for validating accepted values. This will also define constraints during the table creation where applicable.custom
: is a specifically formated SQL query whose output is used to validate a successful or failed test.
Defining Tests¶
Tests are defined in a list format using the tests
subfield for each entry in columns
. For unique
and not_null
you need to include these keywords in the list, while for allowed_values
we define another list that is populated by the allowed string values for that data field.
For example, we can define tests to verify uniqueness and nullity for the id
field and allowed values for the alias
field for the following task in the core
group:
tasks.yaml
task:
type: autosql
file_name: "task.sql"
materialisation: table
destination:
table: "{{ task.name }}"
columns:
- name: id
tests:
- unique
- not_null
- name: alias
tests:
- allowed_values:
- 'first'
- 'second'
- 'third'
We can also define the tests inside task.sql
by call config
from a Jinja tag:
tasks.sql
{{ config(columns=[ {'name': 'id', 'tests':['unique', 'not_null']},
{'name':'alias', 'tests':['allowed_values':['first','second','third']}]) }}
SELECT ...
Custom tests are defined in their own task group called tests
(defining tasks in an arbitrary test
group will cause SAYN to fail). custom
tests are provided with an SQL file that needs to exist in a tests
folder in the sql
project folder.
For example, we can define a custom tests that executes the test query presented bellow:
tests.yaml
tests:
test_1:
file_name: test.sql
tasks:
...
...
SQL test query
SELECT l.arena_id
FROM dim_arenas as l
WHERE l.arena_id IS NULL
GROUP BY l.arena_id
HAVING COUNT(*) > 0
You can read more about test types by heading to the corresponding pages.