# Airflow
Execute Polars Cloud queries remotely from Airflow workflows while keeping credentials secure. This section explains how to configure Airflow to submit and monitor Polars Cloud workloads using Airflow's built-in security mechanisms, keeping service account credentials isolated from DAG code while maintaining full workflow control.
Airflow offers several ways to manage these secrets:

1. **Secret manager** (docs): the Airflow-recommended way to handle secrets. It involves setting up a Secret Backend (many providers are maintained by the community) in the `airflow.cfg` and letting Airflow workers pull the given secrets via the `airflow.models.Variable` API as `Variable.get("<SECRET NAME>")`. Note that Airflow will pull the secret into its own metastore; if this is not desirable, interacting with the cloud provider's Secret Manager (or any other vault accessible via API) can simply be performed as a task of your DAG; see the relevant official docs (here is AWS' as an example).
2. **Environment variables** (docs): load your environment variables into your containers after prefixing them with `AIRFLOW_VAR_`, for instance `AIRFLOW_VAR_POLARS_CLOUD_CLIENT_ID` and `AIRFLOW_VAR_POLARS_CLOUD_CLIENT_SECRET`. They should then be available through the `airflow.models.Variable` API as `Variable.get("POLARS_CLOUD_CLIENT_ID")`.
3. **Airflow Variables** (docs): in the Airflow UI > Admin > Variables tab one can add/edit `key: value` pairs provided to Airflow, which makes them accessible through the `airflow.models.Variable` API. Note that these objects can also be defined using the Airflow CLI (if accessible): `airflow variables set POLARS_CLOUD_CLIENT_ID "<SECRET>"`.
Some code snippets for solutions #1 and #2 described above:
Below are a few lines of pseudo-code using Airflow's TaskFlow API: