Before we start analyzing the product_reviews data, let’s check the tables using SVV_TABLE_INFO:
select "table", diststyle, skew_rows, tbl_rows, stats_off, sortkey1, sortkey_num
from svv_table_info
where "table" in
('product_reviews', 'customer', 'customer_address', 'date_dim');
You will see the following output.
• date_dim and product_reviews tables were created using the default distribution style of AUTO. With AUTO, Amazon Redshift manages the distribution style for you: it creates small tables as AUTO(ALL) and converts them to AUTO(EVEN) when they become large.
• skew_rows - Ratio of the number of rows in the slice with the most rows to the number of rows in the slice with the fewest rows. customer and customer_address tables both have skew_rows of 1.00, which is ideal: the data is distributed evenly across all slices of the compute nodes. Perfect!
• stats_off - A value of 0.00 indicates that the table statistics are current, meaning the tables have already been analyzed. Perfect!
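If one of these tables did show skew or stale statistics, a follow-up check might look like the sketch below. The thresholds (skew_rows above 4.0, stats_off above 10) are illustrative assumptions rather than values from this lab, and product_reviews is used in the ANALYZE statement only as an example.

```sql
-- Sketch: list the tables that may need attention.
-- The thresholds 4.0 and 10 are assumed for illustration only.
select "table", diststyle, skew_rows, stats_off
from svv_table_info
where "table" in
('product_reviews', 'customer', 'customer_address', 'date_dim')
and (skew_rows > 4.0 or stats_off > 10)
order by stats_off desc;

-- Refresh the statistics of a table whose stats_off value is high
-- (product_reviews is just an example table name here).
analyze product_reviews;
```

An empty result from the first query means none of the four tables crosses either threshold, so no ANALYZE is needed.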