Parameters¶

parameters are a really powerful SAYN feature. They enable you to make your code dynamic and easily switch between profiles. For example, parameters are key to separating development and production environments.

SAYN uses Jinja templating for both the SQL queries and YAML properties. parameters can also be accessed in python tasks.

Project Parameters¶

Project parameters are defined in project.yaml:

project.yaml

parameters:
  user_prefix: ''
  schema_logs: analytics_logs
  schema_staging: analytics_staging
  schema_models: analytics_models

The value set in project.yaml is the default value those parameters will have. This should match with the value used on production.

Note

Parameters are interpreted as yaml values, so for example schema_logs above would end up as a string. In the above example user_prefix would also be a string (empty string by default) because we included the double quote, but if we didn't include those quotations, the value would be python's None when we use it in both python and sql tasks.

To override those default values, we just need to set them in the profile. For example, for a dev environment we can do the following:

settings.yaml

# ...

default_profile: dev

profiles:
  dev:
    credentials:
      # ...
    parameters:
      user_prefix: songoku_
      schema_logs: analytics_adhoc
      schema_staging: analytics_adhoc
      schema_models: analytics_adhoc

  prod:
    credentials:
      # ...

In the above, we're overriding the values of the project parameters for the dev profile, but not for the prod profile.

Task Parameters¶

Tasks can also define parameters. This is useful if there's a way for several tasks to share the same code:

tasks/base.yaml

task1:
  type: sql
  file_name: task_template.sql
  parameters:
    src_table: 'table1'

task2:
  type: sql
  file_name: task_template.sql
  parameters:
    src_table: 'table2'

sql/task_template.yaml

SELECT dt
     , COUNT(1) AS c
  FROM {{ src_table }}
 GROUP BY 1

In the above example both task1 and task2 are sql tasks pointing at the same file sql/task_template.sql, the difference between the 2 is the value of the src_table parameter which is used to change the source table in the SQL.

Using Parameters¶

Using Parameters In `tasks`¶

Task attributes are interpreted as Jinja parameters. Therefore, you can make the tasks' definition dynamic. This example uses an autosql task:

tasks/base.yaml

task_autosql_param:
  type: autosql
  file_name: task_autosql_param.sql
  materialisation: table
  destination:
    tmp_schema: '{{ schema_staging }}'
    schema: '{{ schema_models }}'
    table: '{{ user_prefix }}task_autosql_param'

In this example we're using schema_staging, schema_models and user_prefix project parameters so that the values would change depending on the profile. Note the use of quotation in the yaml file when we template task properties.

When running sayn run -t task_autosql_param, this would be interpreted based on the dev profile, which we set as default above and evaluate as:

Example

task_autosql_param:
  type: autosql
  file_name: task_autosql_param.sql
  materialisation: table
  destination:
    tmp_schema: analytics_adhoc
    schema: analytics_adhoc
    table: songoku_task_autosql_param

If we used the prod profile instead (sayn run -t task_autosql_param -p prod) the task will evaluate as:

Example

task_autosql_param:
  type: autosql
  file_name: task_autosql_param.sql
  materialisation: table
  destination:
    tmp_schema: analytics_staging
    schema: analytics_models
    table: task_autosql_param

This task example is even more powerful when used in presets in combination with the jinja variable task:

tasks/base.yaml

presets:
  preset_auto_param:
    type: autosql
    file_name: '{{ task.name }}.sql'
    materialisation: table
    destination:
      tmp_schema: '{{ schema_staging }}'
      schema: '{{ schema_models }}'
      table: '{{ user_prefix }}{{ task.name }}'

tasks:
  task_autosql_param:
    preset: preset_auto_param

Here we extract all values from task_autosql_param into a preset preset_auto_param that can be reused in multiple tasks. The name of the task is then used to reference the correct sql file and the correct table name using {{ task.name }}

In SQL Queries¶

For SQL related tasks (autosql, sql), use parameters within the SQL code with the same jinja syntax {{ parameter_name }}:

sql/task_autosql_param.sql

SELECT mt.*
  FROM {{schema_models}}.{{user_prefix}}my_table AS mt

This SQL query would then be compiled with the relevant paramaters based on the profile of the execution. If using the dev profile, this would therefore be compiled as:

compiled/base/task_autosql_param.sql

SELECT mt.*
  FROM analytics_adhoc.songoku_my_table AS mt

In `python` Tasks¶

Parameters are accessible to python tasks as well as properties of the task class with self.project_parameters, self.task_parameters and self.parameters, which are all python dictionaries. self.parameters is the most convenient one as it combines both project and task parameters in a single dictionary.

python/task_python.py

from sayn import PythonTask

class TaskPython(PythonTask):
    def run(self):
        param1 = self.parameters['param1']

        # Some code using param1

        return self.success()

Parameters¶

Project Parameters¶

Task Parameters¶

Using Parameters¶

Using Parameters In tasks¶

In SQL Queries¶

In python Tasks¶

Using Parameters In `tasks`¶

In `python` Tasks¶