For business inquiries: (+1) 438 601-1155

For special requests: (+1) 438 601-1155

About the Data Engineering on Google Cloud Platform (DEGCP) course

This course combines lectures, demonstrations, and hands-on labs to guide you through designing data processing systems, building end-to-end data pipelines, analyzing data, and implementing machine learning. It covers structured, unstructured, and streaming data, so you gain practical experience with data engineering on Google Cloud.

Learning objectives of the Data Engineering on Google Cloud Platform (DEGCP) course
  • Design and build data processing systems on Google Cloud
  • Implement autoscaling data pipelines on Dataflow to process both batch and streaming data
  • Derive business insights from very large datasets using BigQuery
  • Use Spark and ML APIs on Dataproc to process unstructured data
  • Enable real-time insights from streaming data for immediate decision-making
  • Learn about ML APIs and BigQuery ML, and discover how to build models with AutoML without writing code

Who should take the Data Engineering on Google Cloud Platform (DEGCP) course?

Target audience for the Data Engineering on Google Cloud Platform (DEGCP) course

This course is designed for data engineers, analysts, and other professionals seeking hands-on expertise in data processing on Google Cloud Platform (GCP).

Prerequisites for the Data Engineering on Google Cloud Platform (DEGCP) course

To get the most out of this course, participants should have completed "Google Cloud Big Data and Machine Learning Fundamentals" or have equivalent experience.

Course outline for Data Engineering on Google Cloud Platform (DEGCP)


Module 01 - Introduction to Data Engineering
Analyze data engineering challenges
Introduction to BigQuery
Data lakes and data warehouses
Transactional databases versus data warehouses
Partner effectively with other data teams
Manage data access and governance
Build production-ready pipelines
Review Google Cloud customer case study
 
Module 02 - Building a Data Lake
Introduction to data lakes
Data storage and ETL options on Google Cloud
Building a data lake using Cloud Storage
Securing Cloud Storage
Storing all sorts of data types
Cloud SQL as a relational data lake

Module 03 - Building a Data Warehouse
The modern data warehouse
Introduction to BigQuery
Getting started with BigQuery
Loading data
Exploring schemas
Schema design
Nested and repeated fields
Optimizing with partitioning and clustering
 
Module 04 - Introduction to Building Batch Data Pipelines
EL, ELT, ETL
Quality considerations
How to carry out operations in BigQuery
Shortcomings
ETL to solve data quality issues
 
Module 05 - Executing Spark on Dataproc
The Hadoop ecosystem
Run Hadoop on Dataproc
Cloud Storage instead of HDFS
Optimize Dataproc
 
Module 06 - Serverless Data Processing with Dataflow
Introduction to Dataflow
Why customers value Dataflow
Dataflow pipelines
Aggregating with GroupByKey and Combine
Side inputs and windows
Dataflow templates
Dataflow SQL
 
Module 07 - Manage Data Pipelines with Cloud Data Fusion and Cloud Composer
Building batch data pipelines visually with Cloud Data Fusion
Components
UI overview
Building a pipeline
Exploring data using Wrangler
Orchestrating work between Google Cloud services with Cloud Composer
Apache Airflow environment
DAGs and operators
Workflow scheduling
Monitoring and logging 
 
Module 08 - Introduction to Processing Streaming Data
Processing Streaming Data 
 
Module 09 - Serverless Messaging with Pub/Sub
Introduction to Pub/Sub
Pub/Sub push versus pull
Publishing with Pub/Sub code
 
Module 10 - Dataflow Streaming Features
Streaming data challenges
Dataflow windowing
 
Module 11 - High-Throughput BigQuery and Bigtable Streaming Features
Streaming into BigQuery and visualizing results
High-throughput streaming with Cloud Bigtable
Optimizing Cloud Bigtable performance
 
Module 12 - Advanced BigQuery Functionality and Performance
Analytic window functions
Use WITH clauses
GIS functions
Performance considerations
 
Module 13 - Introduction to Analytics and AI
What is AI?
From ad-hoc data analysis to data-driven decisions
Options for ML models on Google Cloud
 
Module 14 - Prebuilt ML Model APIs for Unstructured Data
Unstructured data is hard
ML APIs for enriching data 
 
Module 15 - Big Data Analytics with Notebooks
What’s a notebook?
BigQuery magic and ties to Pandas 
 
Module 16 - Production ML Pipelines
Ways to do ML on Google Cloud
Vertex AI Pipelines
AI Hub 

Module 17 - Custom Model Building with SQL in BigQuery ML
BigQuery ML for quick model building
Supported models 
 
Module 18 - Custom Model Building with AutoML
Why AutoML?
AutoML Vision
AutoML NLP
AutoML Tables

Similar courses
QlikView
Qlik Sense
SAP HANA
SAP BI (BO)
Microsoft Suite (SSIS-SSAS-SSRS)
Data Science
Talend
Microsoft BI (MCSE)
Power BI
SAP BI/BW 7.5 HANA
Informatica PowerCenter 10.4
Big Data & Machine Learning
Microsoft Power Platform Fundamentals
Big Data
Google Cloud Big Data and Machine Learning Fundamentals
Big Data on Amazon Web Services (AWS)
Data Engineering on Google Cloud Platform (DEGCP)
JasperReports
Elasticsearch, Logstash, and Kibana (ELK)

You can register or request a quote with a single click.