Parameters¶
parameters are a really powerful SAYN feature. They enable you to make your code dynamic and easily switch between profiles. For example, parameters are key to separating development and production environments.
SAYN uses Jinja templating for both the SQL queries and YAML properties. parameters can also be accessed in python tasks.
Project Parameters¶
Project parameters are defined in project.yaml:
project.yaml
parameters:
user_prefix: ''
schema_logs: analytics_logs
schema_staging: analytics_staging
schema_models: analytics_models
The value set in project.yaml is the default value those parameters will have. This should match
with the value used on production.
Note
Parameters are interpreted as yaml values, so for example schema_logs above would end up
as a string. In the above example user_prefix would also be a string (empty string by default)
because we included the double quote, but if we didn't include those quotations, the value
would be python's None when we use it in both python and sql tasks.
To override those default values, we just need to set them in the profile. For example, for a dev environment we can do the following:
settings.yaml
# ...
default_profile: dev
profiles:
dev:
credentials:
# ...
parameters:
user_prefix: songoku_
schema_logs: analytics_adhoc
schema_staging: analytics_adhoc
schema_models: analytics_adhoc
prod:
credentials:
# ...
In the above, we're overriding the values of the project parameters for the dev profile,
but not for the prod profile.
Task Parameters¶
Tasks can also define parameters. This is useful if there's a way for several tasks to share the same code:
tasks/base.yaml
task1:
type: sql
file_name: task_template.sql
parameters:
src_table: 'table1'
task2:
type: sql
file_name: task_template.sql
parameters:
src_table: 'table2'
sql/task_template.yaml
SELECT dt
, COUNT(1) AS c
FROM {{ src_table }}
GROUP BY 1
In the above example both task1 and task2 are sql tasks pointing at the same file
sql/task_template.sql, the difference between the 2 is the value of the src_table parameter
which is used to change the source table in the SQL.
Using Parameters¶
Using Parameters In tasks¶
Task attributes are interpreted as Jinja
parameters. Therefore, you can make the tasks' definition dynamic. This example uses an autosql
task:
tasks/base.yaml
task_autosql_param:
type: autosql
file_name: task_autosql_param.sql
materialisation: table
destination:
tmp_schema: '{{ schema_staging }}'
schema: '{{ schema_models }}'
table: '{{ user_prefix }}task_autosql_param'
In this example we're using schema_staging, schema_models and user_prefix project parameters
so that the values would change depending on the profile. Note the use of quotation in the yaml file
when we template task properties.
When running sayn run -t task_autosql_param, this would be interpreted based on the dev profile,
which we set as default above and evaluate as:
Example
task_autosql_param:
type: autosql
file_name: task_autosql_param.sql
materialisation: table
destination:
tmp_schema: analytics_adhoc
schema: analytics_adhoc
table: songoku_task_autosql_param
If we used the prod profile instead (sayn run -t task_autosql_param -p prod) the task will evaluate as:
Example
task_autosql_param:
type: autosql
file_name: task_autosql_param.sql
materialisation: table
destination:
tmp_schema: analytics_staging
schema: analytics_models
table: task_autosql_param
This task example is even more powerful when used in presets in combination with the jinja variable task:
tasks/base.yaml
presets:
preset_auto_param:
type: autosql
file_name: '{{ task.name }}.sql'
materialisation: table
destination:
tmp_schema: '{{ schema_staging }}'
schema: '{{ schema_models }}'
table: '{{ user_prefix }}{{ task.name }}'
tasks:
task_autosql_param:
preset: preset_auto_param
Here we extract all values from task_autosql_param into a preset preset_auto_param that can be reused
in multiple tasks. The name of the task is then used to reference the correct sql file and the correct
table name using {{ task.name }}
In SQL Queries¶
For SQL related tasks (autosql, sql), use parameters within the SQL code with the same jinja syntax
{{ parameter_name }}:
sql/task_autosql_param.sql
SELECT mt.*
FROM {{schema_models}}.{{user_prefix}}my_table AS mt
This SQL query would then be compiled with the relevant paramaters based on the profile of the execution.
If using the dev profile, this would therefore be compiled as:
compiled/base/task_autosql_param.sql
SELECT mt.*
FROM analytics_adhoc.songoku_my_table AS mt
In python Tasks¶
Parameters are accessible to python tasks as well as properties of the task class with
self.project_parameters, self.task_parameters and self.parameters, which are all python dictionaries.
self.parameters is the most convenient one as it combines both project and task parameters in a single
dictionary.
python/task_python.py
from sayn import PythonTask
class TaskPython(PythonTask):
def run(self):
param1 = self.parameters['param1']
# Some code using param1
return self.success()