Course Overview
Data pipelines typically fall into one of the Extract and Load (EL), Extract, Load and Transform (ELT), or Extract, Transform and Load (ETL) paradigms. This course describes which paradigm to use for batch data, and when. It also covers several Google Cloud technologies for data transformation, including BigQuery, Spark on Dataproc, pipeline graphs in Cloud Data Fusion, and serverless data processing with Dataflow. Learners get hands-on experience building data pipeline components on Google Cloud using Qwiklabs.
Who should attend
This course is intended for developers who are responsible for designing pipelines and architectures for data processing.
Prerequisites
- Experience with data modeling and ETL (extract, transform, load) activities.
- Experience with developing applications by using a common programming language such as Python or Java.
Course Objectives
- Review the different methods of data loading: EL, ELT, and ETL, and when to use each.
- Run Hadoop on Dataproc, use Cloud Storage, and optimize Dataproc jobs.
- Build your data processing pipelines by using Dataflow.
- Manage data pipelines with Data Fusion and Cloud Composer.
Outline: Building Batch Data Pipelines on Google Cloud (BBDP)
Module 1 - Introduction to Building Batch Data Pipelines
Topics:
- EL, ELT, ETL
- Quality considerations
- How to conduct operations in BigQuery
- Shortcomings of ELT
- ETL to solve data quality issues
Objectives:
- Review the different methods of loading data into your data lakes and warehouses: EL, ELT, and ETL (a short BigQuery ELT sketch follows below).
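To make the distinction concrete, the following is a minimal ELT sketch using the google-cloud-bigquery Python client: raw data is loaded into BigQuery as-is (the EL step), then transformed with SQL inside BigQuery (the T step). The project, dataset, table, and bucket names are placeholders, not part of the course materials.

```python
# Minimal ELT sketch (placeholder names throughout): load raw CSV data
# into BigQuery without transformation, then transform it with SQL so
# the "T" runs inside BigQuery itself.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # placeholder project ID

# Extract + Load: ingest a CSV file from Cloud Storage as-is.
load_job = client.load_table_from_uri(
    "gs://my-bucket/raw/orders.csv",       # placeholder source file
    "my-project.staging.orders_raw",       # placeholder staging table
    job_config=bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        autodetect=True,       # infer the schema from the file
        skip_leading_rows=1,   # skip the CSV header row
    ),
)
load_job.result()  # block until the load job finishes

# Transform: clean and reshape the raw rows entirely inside BigQuery.
transform_job = client.query("""
    CREATE OR REPLACE TABLE `my-project.analytics.orders` AS
    SELECT
      CAST(order_id AS INT64) AS order_id,
      LOWER(TRIM(customer_email)) AS customer_email,
      ROUND(amount, 2) AS amount
    FROM `my-project.staging.orders_raw`
    WHERE amount IS NOT NULL
""")
transform_job.result()  # block until the transformation finishes
```

An ETL approach would instead perform the cleanup in an external engine (for example Dataflow) before the data lands in BigQuery; the module discusses when the shortcomings of in-warehouse transformation justify that extra step.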
Module 2 - Executing Spark on Dataproc
Topics:
- The Hadoop ecosystem
- Run Hadoop on Dataproc
- Cloud Storage instead of HDFS
- Optimizing Dataproc
Objectives:
- Review the Hadoop ecosystem.
- Discuss how to lift and shift your existing Hadoop workloads to the cloud using Dataproc.
- Explain when to use Cloud Storage instead of HDFS (see the PySpark sketch after this list).
- Explain how to optimize your Dataproc jobs.
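To illustrate "Cloud Storage instead of HDFS", here is a minimal PySpark word-count sketch that reads and writes gs:// paths directly; on Dataproc the Cloud Storage connector is preinstalled, so a lifted-and-shifted job largely just swaps hdfs:// URIs for gs:// ones. The bucket names and paths below are placeholders.

```python
# Minimal PySpark sketch: a word count whose input and output live in
# Cloud Storage (gs://) rather than cluster-local HDFS, so the Dataproc
# cluster holds no state and can be deleted after the job finishes.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, explode, split

spark = SparkSession.builder.appName("wordcount-on-gcs").getOrCreate()

# Read input from Cloud Storage instead of HDFS (placeholder bucket).
lines = spark.read.text("gs://my-bucket/input/*.txt")

# Classic word count expressed with the DataFrame API.
counts = (
    lines.select(explode(split(col("value"), r"\s+")).alias("word"))
         .where(col("word") != "")
         .groupBy("word")
         .count()
)

# Write the results back to Cloud Storage (placeholder bucket).
counts.write.mode("overwrite").csv("gs://my-bucket/output/wordcount")

spark.stop()
```

Saved as wordcount.py, a script like this could be submitted with `gcloud dataproc jobs submit pyspark wordcount.py --cluster=CLUSTER --region=REGION`. Because both input and output live in Cloud Storage, storage and compute are decoupled, which is the core argument for preferring it over HDFS.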
Module 3 - Serverless Data Processing with Dataflow
Topics:
- Introduction to Dataflow
- Why customers value Dataflow
- Dataflow pipelines
- Aggregate with GroupByKey and Combine
- Side inputs and windows
- Dataflow templates
Objectives:
- Identify the features that customers value in Dataflow.
- Discuss core concepts in Dataflow.
- Review the use of Dataflow templates and SQL.
- Write a simple Dataflow pipeline and run it both locally and on the cloud (a minimal example follows this list).
- Identify map and reduce operations, execute the pipeline, and use command-line parameters.
- Read data from BigQuery into Dataflow, and use the output of a pipeline as a side input to another pipeline.
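The following is a minimal Apache Beam sketch of the map and combine operations this module covers. It runs locally on the DirectRunner by default; passing DataflowRunner (plus project, region, and staging options) through the pipeline options would run the same code on Dataflow. The sample data is inlined so the example is self-contained.

```python
# Minimal Apache Beam sketch: Create -> CombinePerKey -> Map.
# CombinePerKey is the efficient fusion of GroupByKey followed by a
# per-key reduction, which is the aggregation pattern this module covers.
import apache_beam as beam

with beam.Pipeline() as pipeline:  # DirectRunner unless options say otherwise
    (
        pipeline
        | "CreateSales" >> beam.Create([
            ("apples", 3), ("pears", 5), ("apples", 7), ("pears", 2),
        ])
        | "SumPerKey" >> beam.CombinePerKey(sum)                # reduce step
        | "Format" >> beam.Map(lambda kv: f"{kv[0]}: {kv[1]}")  # map step
        | "Print" >> beam.Map(print)
    )
```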
Module 4 - Manage Data Pipelines with Cloud Data Fusion and Cloud Composer
Topics:
- Building batch data pipelines visually with Cloud Data Fusion
- Components
- UI overview
- Building a pipeline
- Exploring data using Wrangler
- Orchestrating work between Google Cloud services with Cloud Composer
- Apache Airflow environment
- DAGs and operators
- Workflow scheduling
- Monitoring and logging
Objectives:
- Discuss how to manage your data pipelines with Data Fusion and Cloud Composer.
- Summarize how Cloud Data Fusion allows data analysts and ETL developers to wrangle data and build pipelines in a visual way.
- Describe how Cloud Composer can help to orchestrate work across multiple Google Cloud services (a minimal Airflow DAG sketch follows below).
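As a concrete illustration of the DAG, operator, and scheduling concepts above, here is a minimal Airflow 2.x DAG sketch; in Cloud Composer, a file like this would be placed in the environment's dags/ folder in Cloud Storage. The DAG id and bash commands are placeholders.

```python
# Minimal Airflow DAG sketch: two placeholder tasks on a daily schedule,
# wired into an extract -> load dependency.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="example_batch_pipeline",   # placeholder DAG id
    schedule_interval="@daily",        # run once per day
    start_date=datetime(2024, 1, 1),
    catchup=False,                     # skip backfilling past runs
) as dag:
    extract = BashOperator(
        task_id="extract",
        bash_command="echo 'extract step'",  # placeholder command
    )
    load = BashOperator(
        task_id="load",
        bash_command="echo 'load step'",     # placeholder command
    )

    extract >> load  # run "load" only after "extract" succeeds
```

A real pipeline would replace the BashOperators with Google Cloud operators (for example, ones that trigger BigQuery jobs or Dataflow pipelines), which is the cross-service orchestration role Cloud Composer plays.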