2. Lab Goal

Let’s tune the physical model in Amazon Redshift using best practices, with the objective of meeting a performance SLA of queries that run in seconds.
The goal is to take advantage of the massively parallel processing (MPP) of Amazon Redshift, so that queries execute on every compute node, while minimizing I/O.
Let’s review the Amazon Redshift concepts that will help us achieve these goals.

If this were a production cluster, you’d get daily suggestions from the Amazon Redshift Advisor on these topics for free.

Concepts

Concept: Compression/Encoding
Goal:
• Allow more data to be stored within an Amazon Redshift cluster
• Improve query performance by decreasing I/O
Benefits: Allows two to four times more data to be stored within the cluster

Concept: Zone Maps
• Automatically built in-memory block metadata
• Contains per-block min and max values
• All blocks automatically have zone maps
Goal:
• Eliminates unnecessary I/O
• Effectively prunes blocks that cannot contain data for a given query
Benefits: Automatically improves filter performance

Concept: Sort Keys
Goal:
• Make queries run faster by increasing the effectiveness of zone maps and reducing I/O
Benefits: Improves filter performance

Concept: Distribution Keys
Goal:
• Distribute data evenly for parallel processing across compute nodes
• Minimize data movement during query processing
Benefits: Improves join performance
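
To make the zone map, sort key, and distribution key ideas more concrete, here is a small Python sketch. It is a toy simulation, not how Amazon Redshift is implemented: the block size, the day-number column, the four-node cluster, and the function names (build_zone_map, blocks_to_scan, rows_per_node) are all assumptions made for illustration, and a simple modulo stands in for the hash that a real cluster would apply to the distribution key.

```python
import random

BLOCK_SIZE = 1000  # rows per block; a hypothetical value chosen for illustration


def build_zone_map(values, block_size=BLOCK_SIZE):
    """Return one (min, max) pair per block of the column."""
    return [
        (min(values[i:i + block_size]), max(values[i:i + block_size]))
        for i in range(0, len(values), block_size)
    ]


def blocks_to_scan(zone_map, low, high):
    """Count blocks whose [min, max] range overlaps the filter range [low, high]."""
    return sum(1 for bmin, bmax in zone_map if bmax >= low and bmin <= high)


def rows_per_node(keys, node_count):
    """Count rows per node; modulo is a simplified stand-in for hashing the key."""
    counts = [0] * node_count
    for key in keys:
        counts[key % node_count] += 1
    return counts


random.seed(42)

# A column of day numbers, stored once in sorted order and once shuffled.
sorted_days = list(range(100_000))
unsorted_days = sorted_days[:]
random.shuffle(unsorted_days)

# Filter: day BETWEEN 5000 AND 5999
sorted_scan = blocks_to_scan(build_zone_map(sorted_days), 5_000, 5_999)
unsorted_scan = blocks_to_scan(build_zone_map(unsorted_days), 5_000, 5_999)
print("blocks scanned, sorted column:  ", sorted_scan)
print("blocks scanned, unsorted column:", unsorted_scan)

# Distribution: a high-cardinality key versus a key dominated by one value.
even = rows_per_node(range(100_000), 4)
skewed = rows_per_node([7] * 90_000 + list(range(10_000)), 4)
print("rows per node, unique key:", even)
print("rows per node, skewed key:", skewed)
```

With the sorted column, only the block whose min/max range covers the filter has to be read, while the shuffled column forces a scan of nearly every block. In the same way, a key with many distinct values spreads rows evenly across nodes, whereas a key dominated by a single value leaves one node holding most of the data.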