Azure Data Engineer Associate Certification Guide

by Giacinto Palmieri, Surendra Mettapalli, Newton Alex

Cloud Computing

Book Details

Book Title: Azure Data Engineer Associate Certification Guide
Author: Giacinto Palmieri, Surendra Mettapalli, Newton Alex
Publisher: Packt Publishing Pvt. Ltd
Publication Date: 2024
ISBN: 9781805124689
Number of Pages: 493
Language: English
Format: PDF
File Size: 9.8 MB
Subject: Cloud Computing

Table of Contents

  • Cover
  • Azure Data Engineer Associate Certification Guide – Second Edition
  • Contributors
  • About the Authors
  • About the Reviewer
  • Preface
  • Part 1: Azure Basics
  • Chapter 1: Introducing Azure Basics
  • Making the Most Out of this Book – Your Certification and Beyond
  • Technical Requirements
  • Introducing the Azure Portal
  • Exploring Azure Accounts, Subscriptions, and Resource Groups
  • Introducing Azure Services
  • Exploring Azure VMs
  • Exploring Azure Storage
  • Exploring Azure Networking (VNet)
  • Summary
  • Exam Readiness Drill – Chapter Review Questions
  • Working On Timing
  • Part 2: Data Storage
  • Chapter 2: Implementing a Partition Strategy
  • Technical Requirements
  • Benefits of Partitioning
  • Designing a Partition Strategy for Files
  • Designing Partition Strategy for Analytical Workloads
  • Implementing Partition Strategy for Streaming Workloads
  • Partition Strategy for Efficiency and Performance
  • Designing Partition Strategy for Azure Synapse Analytics
  • Recognizing Partitioning Needs in ADLS Gen2
  • Summary
  • Exam Readiness Drill – Chapter Review Questions
  • Working On Timing
  • Chapter 3: Designing and Implementing the Data Exploration Layer
  • Technical Requirements
  • Introduction to Data Exploration
  • SQL Serverless and Spark Clusters
  • Azure Synapse Analytics Database Templates
  • Microsoft Purview
  • Summary
  • Exam Readiness Drill – Chapter Review Questions
  • Working On Timing
  • Part 3: Data Processing
  • Chapter 4: Ingesting and Transforming Data
  • Technical Requirements
  • Designing and Implementing Incremental Loads
  • Transforming Data Using Apache Spark
  • Transforming Data Using T-SQL
  • The Transforming Options Available in ADF
  • Transformations Using Synapse Pipelines
  • Transforming Data Using Stream Analytics
  • Splitting Data
  • Shredding JSON to Manage Data Elements
  • Encoding and Decoding Data
  • Configuring Error Handling for the Transformation
  • Normalizing and Denormalizing Values
  • Performing Data Exploratory Analysis
  • Summary
  • Exam Readiness Drill – Chapter Review Questions
  • Working On Timing
  • Chapter 5: Developing a Batch Processing Solution
  • Technical Requirements
  • Batch-Processing Technologies
  • Storage
  • Data Ingestion
  • Transformation
  • Using PolyBase to Load Data to a SQL Pool
  • Implementing Azure Synapse Link and Querying Replicated Data
  • Creating Data Pipelines
  • Scaling Resources
  • Configuring Batch Size
  • Creating Tests for Data Pipelines
  • Integrating Jupyter/Python Notebooks into a Data Pipeline
  • Upserting Data
  • Reverting Data to a Previous State
  • Configuring Exception Handling
  • Configuring Batch Retention
  • Reading from and Writing to a Delta Lake
  • Summary
  • Exam Readiness Drill – Chapter Review Questions
  • Working On Timing
  • Chapter 6: Developing a Stream Processing Solution
  • Technical Requirements
  • Implementing a Streaming Use Case with Azure
  • Processing Data Using Spark Structured Streaming
  • Creating Windowed Aggregates
  • Handling Schema Drifts
  • Processing Time Series Data
  • Processing Data across Partitions
  • Configuring Checkpoints and Watermarking
  • Scaling Resources
  • Developing Testing Processes for Data Pipelines
  • Optimizing Pipelines for Analytical or Transactional Purposes
  • Handling Interruptions
  • Configure Exception Handling
  • Upserting Data
  • Replaying Archived Stream Data
  • Summary
  • Exam Readiness Drill – Chapter Review Questions
  • Working On Timing
  • Chapter 7: Managing Batches and Pipelines
  • Technical Requirements
  • Trigger Batches
  • Handling Failed Batch Loads
  • Validating Batch Loads
  • Managing Data Pipelines in ADF or Synapse
  • Scheduling Data Pipelines in ADF or Synapse
  • Implementing Version Control for Pipeline Artifacts
  • Managing Spark Jobs in a Pipeline
  • Summary
  • Exam Readiness Drill – Chapter Review Questions
  • Working On Timing
  • Part 4: Secure, Monitor, and Optimize Data Storage and Processing
  • Chapter 8: Implementing Data Security
  • Technical Requirements
  • Implementing Data Masking
  • Encrypting Data at Rest and in Motion
  • Implementing Row-Level and Column-Level Security
  • Implementing Azure Role-Based Access Control
  • Implementing POSIX-Like ACLs for ADLS Gen2
  • Resolving Conflicting Rules: RBAC and ACLs
  • Implementing a Data Retention Policy
  • Implementing Secure Endpoints: Public and Private
  • Implementing Resource Tokens in Azure Databricks
  • Loading DataFrames with Sensitive Information
  • Managing Sensitive Information
  • Summary
  • Exam Readiness Drill – Chapter Review Questions
  • Working On Timing
  • Chapter 9: Monitoring Data Storage and Data Processing
  • Technical Requirements
  • Implementing Logging by Azure Monitor
  • Configuring Monitoring Services
  • Monitoring Stream Processing
  • Measuring the Performance of Data Movement
  • Monitoring and Updating Statistics
  • Monitoring Data Pipeline Performance
  • Measuring Query Performance
  • Scheduling and Monitoring Pipeline Tests
  • Interpreting Azure Monitor Metrics and Logs
  • Implementing a Pipeline Alert Strategy
  • Summary
  • Exam Readiness Drill – Chapter Review Questions
  • Working On Timing
  • Chapter 10: Optimizing and Troubleshooting Data Storage and Data Processing
  • Technical Requirements
  • Managing Small Files
  • Handling Skew in Data
  • Handling Data Spill
  • Optimizing Resource Management
  • Tuning Queries Using Indexers
  • Tuning Queries Using Caching
  • Troubleshooting a Failed Spark Job
  • Summary
  • Exam Readiness Drill – Chapter Review Questions
  • Working On Timing
  • Chapter 11: Accessing the Online Practice Resources
  • How to Access These Materials
  • Troubleshooting Tips
  • Back to the Book
  • Why subscribe?
  • Other Books You May Enjoy