GCP Cloud Composer — Save Development Time

A common approach among developers new to Google Cloud Composer (managed Airflow) is trial and error: change the DAG, upload it, and repeat until the DAG import errors go away.

A typical example is a DAG import error caused by an incorrect library import.

Why is this not a good approach?
1. It can easily turn into an endless cycle of trial and error before the DAG starts working
2. Development time increases significantly
3. The process is counter-productive

In a Composer environment with lower specifications and many developers, the scheduler also takes longer to parse the DAGs, which adds further to the turnaround time.

Leveraging Standalone Airflow

Installing Airflow from the pip distribution, for example, gives you a local Airflow directory and a lightweight server to work with.

The airflow.cfg file holds the Airflow configuration properties. Its dags_folder property sets the local Unix path from which DAG files are loaded.
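As a minimal sketch, the configured dags_folder can be read with Python's standard configparser module. It assumes the default Airflow home of ~/airflow (or the AIRFLOW_HOME environment variable when it is set). The helper name read_dags_folder is hypothetical, and interpolation is disabled so that percent signs elsewhere in the file are left alone.

```python
import configparser
import os
from pathlib import Path


def read_dags_folder(cfg_path):
    """Return the [core] dags_folder value from an airflow.cfg file."""
    parser = configparser.ConfigParser(interpolation=None)
    with open(cfg_path, encoding="utf-8") as handle:
        parser.read_file(handle)
    return parser.get("core", "dags_folder")


if __name__ == "__main__":
    # Assumption: AIRFLOW_HOME defaults to ~/airflow when the variable is unset.
    airflow_home = Path(os.environ.get("AIRFLOW_HOME", Path.home() / "airflow"))
    cfg_file = airflow_home / "airflow.cfg"
    if cfg_file.exists():
        print(read_dags_folder(cfg_file))
    else:
        print(f"No airflow.cfg found at {cfg_file}")
```

This only reads the file, so a value overridden through an environment variable would not be reflected.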

To start the standalone server, run the command below, which starts a local Airflow server on port 8080.

airflow standalone

The default user is admin, and the password is written to the standalone_admin_password.txt file in the local Airflow directory.

[Screenshot: Airflow standalone UI]

Simply create the required DAGs and place them in the dags_folder configured above.

[Screenshot: Example of a simple DAG using a templated Bash operator]
[Screenshot: Bash template file]
[Screenshot: Execution on the local Airflow]
[Screenshot: Rendered template of the task]

This is one faster way to get things moving and to speed up the trial-and-error cycle.

Since standalone Airflow is lightweight and used by a single user, DAG parsing is relatively fast.

Can more be done, similar to compiling Java code and getting compile-time errors, so that problems are fixed before the DAG is uploaded to the Composer bucket?

Run DAG as Python file

Run the DAG file directly with the Python interpreter

python simple_dag.py

Any error in syntax or declaration is caught immediately.
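The same idea can be applied to a whole folder before upload. The sketch below is one possible approach and is not part of Airflow: it executes each Python file with the standard runpy module, much like running python simple_dag.py, and collects any exception that is raised. The function name check_dag_files and the default folder name are hypothetical. Because the top-level code of every file is executed, this only suits DAG files that have no side effects at import time.

```python
import runpy
import sys
from pathlib import Path


def check_dag_files(folder):
    """Execute every .py file in folder and return {file name: error text}."""
    errors = {}
    for path in sorted(Path(folder).glob("*.py")):
        try:
            # Similar in spirit to `python <file>`: top-level code runs,
            # so syntax errors and failed imports surface immediately.
            runpy.run_path(str(path), run_name="__dag_check__")
        except Exception as exc:  # SyntaxError and ImportError land here
            errors[path.name] = f"{type(exc).__name__}: {exc}"
    return errors


if __name__ == "__main__":
    # Hypothetical default folder; pass a real path as the first argument.
    target = sys.argv[1] if len(sys.argv) > 1 else "dags"
    problems = check_dag_files(target)
    for name, message in problems.items():
        print(f"{name}: {message}")
    if problems:
        raise SystemExit(1)
```

A non-zero exit code on failure makes the script usable as a gate before copying files to the Composer bucket.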

For Composer environments, the DAG file can be added to a folder other than /dags/, such as /data/, and the command below can be run to catch syntax errors

gcloud composer environments run <environment> --location <location> dags list -- --subdir /home/airflow/gcs/data/
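In the gcloud command, everything after the standalone -- separator is forwarded to the Airflow subcommand instead of being interpreted by gcloud. The sketch below builds such a command as an argument list, which avoids quoting mistakes when it is run from a script. The function name composer_run_command, the environment name, and the location are hypothetical values for illustration.

```python
import shlex


def composer_run_command(environment, location, subcommand, airflow_args):
    """Build the argument list for `gcloud composer environments run`.

    subcommand is the Airflow CLI command, e.g. "dags list" (Airflow 2 syntax).
    airflow_args are forwarded to Airflow after the standalone "--" separator.
    """
    command = ["gcloud", "composer", "environments", "run", environment,
               "--location", location]
    command += subcommand.split()
    if airflow_args:
        command += ["--"] + list(airflow_args)
    return command


if __name__ == "__main__":
    # Hypothetical environment name and location, for illustration only.
    cmd = composer_run_command(
        "my-composer-env", "us-central1", "dags list",
        ["--subdir", "/home/airflow/gcs/data/"],
    )
    # Print a copy-pasteable command; subprocess.run(cmd, check=True) would execute it.
    print(shlex.join(cmd))
```

The same helper works for other Airflow subcommands, such as tasks test, by changing the subcommand and its arguments.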

Listing Import Errors for DAGs

For a quick test without uploading the DAGs to the DAGs folder, the command below can be run to catch syntax and import issues

airflow dags list-import-errors --subdir <folder other than DAGs folder>
[Screenshot: list-import-errors output]
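When the check runs in a script, the command's --output json option gives a machine-readable result. The sketch below condenses that output into one line per failing file. It rests on two assumptions that should be verified against the installed Airflow version: that the JSON is a list of objects with filepath and error keys, and that nothing else is printed to standard output. The function names are hypothetical.

```python
import json
import subprocess


def summarize_import_errors(json_text):
    """Turn the JSON output of `airflow dags list-import-errors` into short lines."""
    entries = json.loads(json_text)
    lines = []
    for entry in entries:
        # Assumed keys: "filepath" and "error"; .get keeps the sketch tolerant.
        filepath = entry.get("filepath", "<unknown file>")
        error_lines = str(entry.get("error", "")).strip().splitlines()
        # The last traceback line usually carries the exception message.
        last_line = error_lines[-1] if error_lines else ""
        lines.append(f"{filepath}: {last_line}")
    return lines


def list_import_errors(subdir):
    """Run the Airflow CLI against subdir and return the summarized errors."""
    result = subprocess.run(
        ["airflow", "dags", "list-import-errors", "--subdir", subdir, "--output", "json"],
        capture_output=True, text=True,
    )
    return summarize_import_errors(result.stdout)
```

Calling list_import_errors with a folder path returns an empty list when every DAG file in it imports cleanly.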

The above command can also be executed in the Composer environment with similar syntax.

Reference: https://cloud.google.com/sdk/gcloud/reference/composer/environments/run

Test a DAG Run

Running the command below validates the DAGs in the provided folder and executes the named DAG.

airflow dags test simple_templated_bash_dag_1 2022-10-09 --subdir /home/murlik/gcp-code-repo/composer
[Screenshot: Execution logs of the DAG]

Test specific task instance

The airflow tasks test command also provides a quick way to test a task instance without triggering it in the Composer environment

airflow tasks test simple_templated_bash_dag bash_task 2022-10-09

The above command also accepts the --subdir parameter

[Screenshot: Quick test of an Airflow task instance]

This command can also be executed in Composer as below

gcloud composer environments run <environment name> --location <location> tasks test -- <parameters>

Using the above techniques, valuable time can be saved and the development process can be sped up.

Please note that some commands differ between Airflow 1 and Airflow 2, so check the documentation for the command syntax of your version (links in the references below).

Do try it out.

References:
https://cloud.google.com/sdk/gcloud/reference/composer/environments/run
https://airflow.apache.org/docs/apache-airflow/stable/cli-and-env-variables-ref.html
https://airflow.apache.org/docs/apache-airflow/stable/start.html

Linked-in Handle — https://www.linkedin.com/in/murli-krishnan-a1319842/


GCP Cloud Composer — Save Development Time was originally published in Google Cloud - Community on Medium, where people are continuing the conversation by highlighting and responding to this story.
