Amazon Redshift is a fully transactional, ACID compliant Cloud data warehouse.
Redshift has three Cluster types
- Redshift : Leader node and Compute Node with Data stored locally on compute nodes. Computes nodes can be of Dense Compute (DC2) and Dense Storage (DS2 types)
- Redshift Spectrum : Redshift Spectrum nodes with data stored in S3. You still need a Redshift cluster .
- Redshift Managed Storage : Storage and compute completely decoupled. Storage will be managed by S3 and data transfers back & forth between Compute and Storage layers through Redshift AQUA.
Redshift Key Components
Leader Node : Stores Metadata. Interacts with Worker Nodes for executing the SQL Processing in Parallel.
Compute Nodes : Data stored in Local disks and executes queries in Parallel.
- Create sort key on frequently filtered columns with lowest cardinality.
- Do not create more than four sort key columns.
- Implement compression on columns (use AZ64 with best performance where possible).
- Keep data types as wide as possible.
- If columns do not cause skew, co-locate large tables using DISTSTYLE KEY.
- Limit Commits to reduce commit overhead.
- For Data Ingestion utilizing COPY statement, split the single file in to equal number of available slices for parallelism and improved performance. To find slices – SELECT count (slice) from stv_slices.
- Redshift spectrum –
- Save cluster resources for querying and reporting rather than on ELT.
- Aggregate incoming data.
- Select subset of columns and/or rows.
- Consider using DROP TABLE or TRUNCATE instead of DELETE Staging.
- For copying millions of rows, use ALTER TABLE APPEND instead of INSERT INTO SELECT.
- For staging tables, use temporary table or table with “BACKUP NO” option.
- WLM –
- Use Auto WLM , if your workload is unpredictable.
- Use Manual WLM, if you have well architected workload patterns.
- Keep number of WLM queues to Minimum (around 3).
- Cluster setup – Keep at least 20% of Free space.
Redshift Pricing depends on the type of cluster and usage details. For more information about Redshift Pricing , check out our post at http://www.cloudinfonow.com/aws-redshift-pricing/