Step | Do |
Let’s create the date_dim table using the default distribution style auto. This is a small static dimension table
|
Run the following table creation SQL in query editor
CREATE TABLE public.date_dim
(
d_date_sk INTEGER NOT NULL,
d_date_id CHAR(16) NOT NULL,
d_date DATE,
d_month_seq INTEGER,
d_week_seq INTEGER,
d_quarter_seq INTEGER,
d_year INTEGER,
d_dow INTEGER,
d_moy INTEGER,
d_dom INTEGER,
d_qoy INTEGER,
d_fy_year INTEGER,
d_fy_quarter_seq INTEGER,
d_fy_week_seq INTEGER,
d_day_name CHAR(9),
d_quarter_name CHAR(6),
d_holiday CHAR(1),
d_weekend CHAR(1),
d_following_holiday CHAR(1),
d_first_dom INTEGER,
d_last_dom INTEGER,
d_same_day_ly INTEGER,
d_same_day_lq INTEGER,
d_current_day CHAR(1),
d_current_week CHAR(1),
d_current_month CHAR(1),
d_current_quarter CHAR(1),
d_current_year CHAR(1)
)
;
|
Create the product_reviews table with distribution style auto. Sort key of columns marketplace and review_date. These columns are good choice for sortkey as end users are going to use these in the query frequently to filter on
|
Run the following table creation SQL in query editor
CREATE TABLE product_reviews
(
marketplace varchar(2),
customer_id bigint,
review_id varchar(24),
product_id varchar(24),
product_parent bigint,
product_title varchar(512),
star_rating int,
helpful_votes int,
total_votes int,
vine char(1),
verified_purchASe char(1),
review_headline varchar(256),
review_body varchar(max),
review_date date,
year int,
product_category varchar(32))
SORTKEY (
marketplace,
review_date);
|
Review the compression/encoding of the columns. Amazon Redshift automatically creates encoding of column using the performant AZ64 encoding for numeric, date, datetimestamp columns and lzo for char and varchar columnsn
|
Run the following table creation SQL in query editor
SELECT "column", type, encoding FROM pg_table_def
WHERE tablename = 'product_reviews';
|
You will see the following output.
• The default encoding of AZ64 is applied to numeric and date columns. Default encoding of lzo is applied for character columns
• The first sort_key column marketplace is not encoded, best practice is to not encode the first sort key column