What is AWS Redshift Spectrum?
Redshift Spectrum is a feature of Redshift, the AWS data warehousing service.
How is Redshift Spectrum different from other features of AWS Redshift?
With AWS Redshift Spectrum, users can query and retrieve data directly from files in Amazon S3 without loading the data into Redshift tables. This is especially useful when query performance is not the top priority and you want to avoid the complexity of moving data.
What are the key features of AWS Redshift Spectrum?
- Redshift Spectrum is cost effective because of its pay-per-use pricing. You pay for the amount of data scanned, i.e. $5 per TB of data scanned.
- Redshift Spectrum is easy to set up and use. It requires minimal administration compared to other features of Redshift.
- Redshift Spectrum is serverless and highly scalable. AWS scales the Spectrum capacity with the user load, so it can handle a large number of concurrent queries.
- Redshift Spectrum can act as a lakehouse layer, letting you query data in S3 alongside data stored in Redshift tables.
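Because billing is driven by the volume of data scanned, a rough cost estimate is simple arithmetic. The sketch below is only an illustration: the function name `estimate_scan_cost` is made up for this example, it assumes the $5 per TB rate quoted above, it assumes 1 TB means 1024^4 bytes, and it ignores any per-query minimums or rounding that AWS may apply.

```python
# Rough Spectrum cost estimate. Assumptions (not from AWS documentation):
# 1 TB is taken as 1024**4 bytes, and per-query minimums/rounding are ignored.
BYTES_PER_TB = 1024 ** 4


def estimate_scan_cost(bytes_scanned: int, price_per_tb: float = 5.0) -> float:
    """Return the approximate charge in USD for scanning `bytes_scanned` bytes."""
    if bytes_scanned < 0:
        raise ValueError("bytes_scanned must be non-negative")
    return bytes_scanned / BYTES_PER_TB * price_per_tb


if __name__ == "__main__":
    full_scan = 2 * BYTES_PER_TB   # hypothetical query that reads 2 TB
    pruned_scan = full_scan // 10  # same query if only a tenth of the data is read
    print(f"full scan:   ${estimate_scan_cost(full_scan):.2f}")
    print(f"pruned scan: ${estimate_scan_cost(pruned_scan):.2f}")
```

The comparison shows why anything that reduces the bytes scanned, such as columnar files or partitioning, lowers the bill directly.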
How to set up, configure & use AWS Redshift Spectrum?
AWS Redshift Spectrum requires an external data catalog service. AWS recommends AWS Glue; you can also use a Hive metastore running on EMR. Spectrum also needs a base Redshift cluster, since all queries go through the cluster. To set up, configure & use AWS Redshift Spectrum, follow the steps below:
- Build a Redshift cluster with a minimum configuration.
- Create the required IAM roles & policies with the necessary permissions.
- Integrate with the external data catalog.
- Create external tables with locations pointing to S3 bucket paths.
- To query the data over JDBC or ODBC, download the Redshift drivers and install them in your JDBC/ODBC client tools.
- User management can be handled locally on the Redshift cluster, or you can integrate with Active Directory (AD) through SSO.
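The catalog integration and external table steps come down to two SQL statements run against the cluster. The following is a minimal sketch that only builds those statements as strings; it assumes AWS Glue as the catalog and Parquet files in S3. The helper names and every identifier in the usage section (schema, database, role ARN, bucket, columns) are hypothetical placeholders, and the helpers expect trusted identifiers since they do no escaping.

```python
from typing import List, Tuple

Column = Tuple[str, str]  # (column name, SQL type)


def external_schema_sql(schema: str, catalog_db: str, iam_role_arn: str) -> str:
    """Build the statement that links a Redshift schema to a data catalog database."""
    return (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema}\n"
        f"FROM DATA CATALOG\n"
        f"DATABASE '{catalog_db}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"CREATE EXTERNAL DATABASE IF NOT EXISTS;"
    )


def external_table_sql(schema: str, table: str, columns: List[Column],
                       partition_columns: List[Column], s3_location: str) -> str:
    """Build a CREATE EXTERNAL TABLE statement for Parquet files under an S3 path."""
    cols = ",\n".join(f"  {name} {sql_type}" for name, sql_type in columns)
    parts = ", ".join(f"{name} {sql_type}" for name, sql_type in partition_columns)
    partition_clause = f"PARTITIONED BY ({parts})\n" if parts else ""
    return (
        f"CREATE EXTERNAL TABLE {schema}.{table} (\n{cols}\n)\n"
        f"{partition_clause}"
        f"STORED AS PARQUET\n"
        f"LOCATION '{s3_location}';"
    )


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    print(external_schema_sql(
        "spectrum_schema", "spectrum_db",
        "arn:aws:iam::123456789012:role/ExampleSpectrumRole"))
    print()
    print(external_table_sql(
        "spectrum_schema", "sales",
        columns=[("salesid", "INTEGER"), ("amount", "DECIMAL(8,2)")],
        partition_columns=[("saledate", "DATE")],
        s3_location="s3://example-bucket/sales/"))
```

The generated statements can be run from any SQL client connected to the cluster; the IAM role must be attached to the cluster and allowed to read both the catalog and the S3 path.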
What are some AWS Redshift Spectrum best practices?
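- Ensure that the data in the S3 buckets referenced by Spectrum tables is stored in a columnar format, preferably Parquet. Columnar formats let Spectrum read only the columns a query needs, which reduces the amount of data scanned.
- Implement partitioning wherever the data allows it. Partitioning lets queries that filter on the partition column skip irrelevant data instead of scanning the entire data set, which improves performance.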
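To make the partitioning advice concrete, here is a small sketch of how a date-partitioned layout might be registered. It assumes the common Hive-style `column=value` folder convention under the table's S3 root, which is a convention and not a requirement, and it only builds the `ALTER TABLE ... ADD PARTITION` statement as a string. The helper names, table, column, and bucket are hypothetical.

```python
from datetime import date


def partition_location(table_root: str, column: str, value: date) -> str:
    """Return the S3 prefix for one partition, using a Hive-style key=value folder."""
    return f"{table_root.rstrip('/')}/{column}={value.isoformat()}/"


def add_partition_sql(schema: str, table: str, column: str,
                      value: date, table_root: str) -> str:
    """Build the statement that registers one partition of an external table."""
    location = partition_location(table_root, column, value)
    return (
        f"ALTER TABLE {schema}.{table}\n"
        f"ADD IF NOT EXISTS PARTITION ({column}='{value.isoformat()}')\n"
        f"LOCATION '{location}';"
    )


if __name__ == "__main__":
    # Placeholder names for illustration only.
    print(add_partition_sql("spectrum_schema", "sales", "saledate",
                            date(2024, 1, 15), "s3://example-bucket/sales/"))
```

A query that filters on the partition column (here `saledate`) then only needs to read the files under the matching prefixes.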