
Cloud Composer – Take5


[MUSIC PLAYING]

MICHEL SZARINDAR: Mark, what are you doing?

MARK MIRCHANDANI: I’m trying to conduct all these workflows, but it’s so much work to manage.

MICHEL SZARINDAR: What? You do that to yourself? You should use Google Cloud Composer.

MARK MIRCHANDANI: What’s Google Cloud Composer?

MICHEL SZARINDAR: Ah, let me show you.

[MUSIC PLAYING]

MICHEL SZARINDAR: So let’s say
that you have a data workflow to collect, move, transform, and check data. And all these tasks need to be finished at the same time every day. So you need to make sure that you can schedule, monitor, restart, and even chain different tasks together. Google Cloud Composer is a fully managed version of Apache Airflow and will let you do all this.

MARK MIRCHANDANI: So for anyone
already familiar with Airflow, it’s like the same thing, but you don’t have to worry about managing the infrastructure.

MICHEL SZARINDAR: Exactly, yeah.

MARK MIRCHANDANI: Oh, that sounds really useful. Can we look more at your example?

MICHEL SZARINDAR: Yeah. In this architecture,
for example, the workflow will move data from a table in BigQuery in the US to BigQuery in the EU. To do that, we are exporting and transferring data between two different Google Cloud Storage buckets.

Let’s jump to the interface. Here is a list of the environments for Google Cloud Platform. And I’m going to show you how easy it is to create a new one. So here, you are adding the name of a new environment. You can just choose your location, us-central1 here, and no preference on zone. There are other configuration options, but I’m not going to show you those. So let’s click Create.

So while the environment is being created, we are going to see an example. OK, so here you have all the details of the environment, and you also have, for example, for this one, the worker nodes. You can just check here. And we are going to go inside the Airflow UI, which is created automatically. And this is a console where you store all of your DAGs.

MARK MIRCHANDANI: So wait. What is a DAG?

MICHEL SZARINDAR: A DAG
is a Directed Acyclic Graph, of course. [EXHALES] A DAG is like a workflow. Think of it as an organized list of tasks that you want to run. Let me show you that. Let’s look inside the code. So inside the code view, you can see all the code for this DAG. So here at the bottom, you’re going to see all the tasks. So for example, I start. And after I take the data from BigQuery, I export that to Cloud Storage. From Cloud Storage, I’m going to move the data from the US to the EU. And when it’s inside the EU, I’m going to start importing the data back into BigQuery.

So let’s go back to the DAG. So I’m going to start
the DAG just now. It’s normally done automatically. But I’m going to manually start the DAG. And there are different ways to see the DAG running. So for example, I’m going to show you the tree view. And there is also a nice one, which is the graph view, where I can see the status of all the tasks. Here, for example, the task is soon going to be queued. Yes, here. And the task is soon going to be started.

So one of the beauties, as well, of Cloud Composer is that you can just see the execution of the DAGs day after day. So here, for example, you have all the landing times, where you can start looking at the execution over time. So now, you can monitor all your workflows, and relax, and be sure your job will be completed on time.

MARK MIRCHANDANI: Wow. That’s awesome. Now I’m feeling
much more composed.

MICHEL SZARINDAR: Get off the stage.

MARK MIRCHANDANI: Thanks for watching. Don’t forget to like, comment, and subscribe for more great Google Cloud Platform content.

[MUSIC PLAYING]
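
The DAG described in the code view above (start, export from BigQuery to Cloud Storage, copy from the US bucket to the EU bucket, import into BigQuery in the EU) can be pictured as a small dependency graph. The following is a minimal sketch of that idea using only Python's standard-library `graphlib` module (Python 3.9 or later). It is not the Airflow API and not the code shown in the video, and the task names are hypothetical labels chosen for illustration.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical task names mirroring the steps described in the video.
# Each key maps a task to the set of tasks that must finish before it runs.
DEPENDENCIES = {
    "start": set(),
    "export_bq_us_to_gcs_us": {"start"},
    "copy_gcs_us_to_gcs_eu": {"export_bq_us_to_gcs_us"},
    "import_gcs_eu_to_bq_eu": {"copy_gcs_us_to_gcs_eu"},
}


def execution_order(dependencies):
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


def is_acyclic(dependencies):
    """Return True if the graph has no cycles, i.e. it is a valid DAG."""
    try:
        execution_order(dependencies)
        return True
    except CycleError:
        return False


if __name__ == "__main__":
    # Print the tasks in the order a scheduler could run them.
    for task in execution_order(DEPENDENCIES):
        print(task)
```

Because each task here depends only on the one before it, there is exactly one valid order. The "acyclic" part matters because a task that depended, directly or indirectly, on itself could never be scheduled; `is_acyclic` returns `False` for such a graph.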
