04 Dec 19 · victor · #general ·   Bookmark   ×

How to Choose the right cloud data warehouse for your company - The Ultimate Checklist

Information is wealth. Today, less than 0.5% of data is actually being used, and businesses lose over $600 billion a year because of bad data. That means that, by and large, businesses are not properly storing, using and analyzing the data that they have coming in each day.

This is a huge problem, especially in today’s data-driven culture. Your data is only powerful if you can use it - and it is no good to you if you can’t properly organize and analyze it.

For many, this bad data problem comes from choosing the wrong type of data storage and running ineffective analytics as a result. Here, we break down the basics of data warehouses and discuss the important criteria that companies must keep in mind when considering a data warehouse for their business

Choosing a data warehouse is a crucial decision that you may have to make, and a bad decision can haunt you for years. When it comes to popular cloud data warehousing solutions, you have,

Cloud Data Warehouse Evaluation Checklist:

In the big picture, it’s important to choose a data warehouse solution that will fit in with your business model, your budget and your existing systems. Here's a comprehensive checklist for you to evaluate. Make sure it ticks all the crucial boxes.

1. Rich SQL Support: Its obvious, The cloud data warehouse should offer good SQL support. Run the below tests to make sure the product works fine and covers all your use cases.

Create a Table
Create a View
Create a Table as Select from Another table (CTAS)
Rename a Table
Create a View
Create a View with complex joins
Create views with multiple layers
Alter table add, modify, drop columns
Delete a Table, Delete a Table using complex joins
Access provisioning - Test Groups, Roles and Grants
Join Multiple tables on non Key fields
Test Updates on Tables
Test Inserts into Table
Test Merge or Upsert statements
Test Analytical functions - ROW_NUM, MIN, MAX, AVG, PERCENTILE
Test Recursive functions
Test ETL Loads and Unloads
Thoroughly check all the datatypes are present and supported by your reporting tools
Transaction control - Does it rollback the batch if any transaction fails
Check support for Constraints - Identity, Not Nulls, Primary Keys, Foreign Keys
Check connectivity and driver support with your consuming applications
Test Stored procedures, macros, conditional constructs etc

2. Speed : Perform speed benchmarks between your current solution vs the cloud data warehouse. Its essential to have a good QoS (Quality of Service) in your data warehouse. After all we want to provide an exceptional experience for your analytical users.

Performance benchmarks between your-current-solution vs New-cloud-datawarehouse
Concurrency and spike load testing to ensure the solution scales with load
Involve your reporting users to test their dashboards and gather their feedback

3. Scalability & Concurrency : We need to ensure the cloud data warehouse scales in proportion to the load and also supports concurrent read / write operations.

Spike testing - Test if the cloud Data Warehouse scales up to handle any surge in the no of queries
Test concurrent write / reads to make sure there is no lock waits
Test the latency between Read and Write Operations
Test Scale down functionality

4. Cost and Return-on-Investment: The POC is never complete without performing extensive tests on the operating costs. After all, its our primary goal to evaluate the ROI, we need to make sure the solution is sustainable for our organisation

Estimate Storage costs - consider active storage and long term storage discounts
Estimate Compute costs
Estimate Other Fixed costs
Estimate the staff training and migration costs
Estimate the opportunity costs, to make sure there is no impact to the ongoing operations
Run cost benchmarks with your on premise solution
Run an ROI on the cloud proposal

5.Innovation: Modern cloud platforms come with a bunch of innovative features. These features will save you costs as well as make your life easier. Here are few of the cool things you need to look for, before you settle on your platform

Time travel - Cloud DW platforms let you go back in time and look at the data at that point in time. You can also use this to recover any accidentally deleted data
Zero copy Cloning - You can duplicate your database, tables without paying for storage twice
Automated Backups
Automated Restores
Caching of query results - Would help you on your cloud bill
Compression - Would help you on your cloud storage bill
System schemas - To get metadata about your objects
Cost monitoring and Alerts

6.Security: Typical cloud providers stay hyper up-to-date with security patches and protocols to keep their host of customers safe and happy. Make sure the boxes are checked while choosing your cloud data warehouse.

Is the data at rest encrypted
Is the data in transit encrypted
Are there tools for handling sensitive data elements (PII)
Authentication controls - SSO / 2 Factor Auth
How are the encryption keys managed

Cloud data warehouse provides you low entry of cost and happily scales as your company grows. Hope this checklist helps you choose the right data warehouse for your company and also help you run effective analytics and BI that you will need for long-term stability and success


posted on 04 Dec 19

Enjoy great content like this and a lot more !

Signup for a free account to write a post / comment / upvote posts. Its simple and takes less than 5 seconds