Big Data and ETL Testing Fundamentals

Course Summary

This one-day course familiarizes business professionals working in the Big Data and ETL space with the basics of testing and validation. It focuses on the knowledge required to test and validate Big Data and ETL processes successfully.

Intended Audience

  • Manual Testers
  • Automation Engineers
  • Quality Assurance Analysts
  • Developers
  • Project Managers
  • Anyone involved in providing software quality for Big Data

At the end of the course, you will be able to:

  • Describe the purpose of Big Data and the ETL process
  • Determine an appropriate testing strategy
  • Understand a source-target mapping document
  • Describe an approach to test each business rule
  • Recognize the different testing methods
  • Determine appropriate sample sizes and data permutations
  • Explain the different data error types
  • Have knowledge of the different testing tools
  • Understand the importance of automated testing


  • Data Concepts
  • Big Data Concepts
  • What is ETL?
  • What is Business Intelligence (BI)?
  • What is Data Science and Analytics?
  • Transactional vs. Analytical Databases vs. Big Data Stores
  • Big Data Concepts
  • What is the Hadoop Ecosystem?
  • How does Hadoop Process Big Data?
  • Resource Types Involved
  • Main Structures
  • Introduction
    • Test points and legs
    • Single-leg strategy
    • Multi-leg strategy
    • Single-leg vs. multi-leg

Principles of ETL Testing

Data Mapping Document
Testing methods
  • Visual Compare
  • Record Counts
  • Minus Queries
  • Automation
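
As an illustration of two of the methods listed above, record counts and minus queries, here is a minimal sketch using Python's built-in sqlite3 module. The table names, columns, and rows are invented for this example and are not taken from the course material. SQLite spells the set-difference operator EXCEPT; some other databases, such as Oracle, call it MINUS.

```python
import sqlite3

# Hypothetical source and target tables, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_customers (id INTEGER, name TEXT);
    CREATE TABLE target_customers (id INTEGER, name TEXT);
    INSERT INTO source_customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
    INSERT INTO target_customers VALUES (1, 'Ada'), (2, 'Grace');
""")


def record_counts(conn):
    """Return (source_count, target_count) for the two tables."""
    src = conn.execute("SELECT COUNT(*) FROM source_customers").fetchone()[0]
    tgt = conn.execute("SELECT COUNT(*) FROM target_customers").fetchone()[0]
    return src, tgt


def minus_query(conn):
    """Return rows present in the source but missing from the target."""
    return conn.execute(
        "SELECT id, name FROM source_customers "
        "EXCEPT "
        "SELECT id, name FROM target_customers"
    ).fetchall()


print(record_counts(conn))  # (3, 2): the counts differ, so a row was lost
print(minus_query(conn))    # [(3, 'Linus')]: the row that did not load
```

A count mismatch only signals that something is wrong; the minus query identifies which rows differ. Running the minus query in both directions (source minus target, then target minus source) would also catch extra rows in the target.
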
Working with flat files
  • Excel Files
  • Comma delimited files
  • Fixed width files
  • XML Files
How much data to test?
Testing incremental loads
Multiple Sources

Transformation Types

  • Selective column and row type
  • Translation
  • Lookups
  • Transpose
  • Field Splitting
  • Field Merging
  • Calculated and Derived
  • Table Splitting

Testing Process

  • Assess: Test Strategy
    • Data Permutations
    • Test Data Sampling
    • Test Points
    • Leveraging Test Tools
  • Plan: Test Planning
    • Test List
    • Resource Estimation
    • Prioritizing
    • Scheduling
    • Defect workflow
    • Test Plan
  • Design: Test Case authoring
    • ETL Manual Test Creation
      • Visual Compare
      • Record Counts
      • Minus Queries
      • Home Grown
    • ETL Automated Test Creation
      • QuerySurge
    • BI Report Test Creation
  • Execute: Test Case Execution
    • Manual Tests
    • Automated Tests
  • Evaluate and Improve
    • Lessons learned
    • New Test cases

Defect Types

  • Missing Data
  • Truncation
  • Type Mismatch
  • Null Translation
  • Misplaced Data
  • Extra records
  • Logic Issues
  • Duplicate Records
  • Precision
  • Sequence
  • Rejected Rows
  • Undocumented Requirements
  • Simple Issues
