Running DAGs locally
Apache Airflow manages our workflows (DAGs) in the catalog. For more information, see our quickstart guide. This document describes how to run and test these DAGs locally during development.
Not all DAGs can be run locally right away, because some of them require API
keys from the provider. Others, such as the ones for SMK or Finnish Museums,
can be run locally without any additional keys. To run a DAG that requires a
key, get the API key from the provider and add it to the catalog/.env file.
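Before starting a DAG that needs a key, it can help to confirm that the key is actually present in catalog/.env. The following is a minimal sketch, not part of the catalog tooling: it assumes the file uses plain KEY=VALUE lines, and the variable name API_KEY_EXAMPLE_PROVIDER is hypothetical. Substitute the name that the provider's DAG actually expects.

```python
from pathlib import Path


def read_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and comments."""
    values: dict[str, str] = {}
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(env: dict[str, str], required: list[str]) -> list[str]:
    """Return the required keys that are absent or set to an empty value."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    env_path = Path("catalog/.env")
    # Hypothetical variable name, used for illustration only.
    required = ["API_KEY_EXAMPLE_PROVIDER"]
    if env_path.exists():
        print("Missing keys:", missing_keys(read_env_file(env_path), required))
    else:
        print(f"{env_path} not found; complete the quickstart setup first.")
```

Run the script from the repository root so that the relative path catalog/.env resolves. An empty list means every key you listed has a value.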
Getting started
Follow the instructions in the Quickstart to set up the catalog service and make sure it is up and running. If you have successfully completed the general setup, you can start the service by running
ov just catalog/up.
Navigate to http://localhost:9090

You should be met with an authentication page. Use airflow as both the username and password to log in.

Search for or scroll down to any DAG that does not require an API key and click on it. We are using finnish_museums_workflow for this example.

Click the toggle button labelled “DAG” at the top left. An alert box pops up asking you to confirm the action; click “OK” to continue. A run will be kicked off.
DAGs that run on a schedule (like this one) will usually kick off immediately, though you may need to refresh the page to see the run. To kick off a new DAG run, click the “Trigger DAG” button, represented by a “play” icon, at the top right of the page.

To get a summary of the DAG, click on the “DAG Docs” accordion and you should see an overview of the DAG displayed.
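The manual trigger described in the steps above can also be scripted. The sketch below rests on assumptions that this document does not confirm: that the local instance runs Airflow 2.x with the stable REST API under /api/v1, and that basic authentication is enabled for the API. The function names are illustrative. If the DAG is still paused, Airflow will typically queue the run without starting it until the DAG is toggled on.

```python
import base64
import json
import urllib.request

BASE_URL = "http://localhost:9090"  # local Airflow address from the steps above
USERNAME = "airflow"                # local development credentials from the steps above
PASSWORD = "airflow"


def build_trigger_request(dag_id, base_url=BASE_URL, username=USERNAME, password=PASSWORD):
    """Build (but do not send) a POST request asking Airflow to start a DAG run.

    Assumes the Airflow 2.x stable REST API with basic authentication enabled.
    """
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url}/api/v1/dags/{dag_id}/dagRuns",
        data=json.dumps({"conf": {}}).encode(),  # empty run configuration
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def trigger_dag(dag_id):
    """Send the trigger request and return Airflow's decoded JSON response."""
    request = build_trigger_request(dag_id)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

With the catalog service running, calling trigger_dag("finnish_museums_workflow") should return a JSON description of the new run. If the API rejects the credentials, basic authentication is probably not enabled for the API in your local configuration, and the UI steps above remain the reliable route.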
Note
For more info about how Airflow works in general, check out their documentation on the UI.