06 June 2019 · Bozhack-miller

BigQuery is super expensive when selecting just a few rows

For example,

select * from logs.nobids_05 limit 1

gives me "This query will process 274 GB when run".

Google's response:

BigQuery is an analytical database. Its architecture and pricing are optimized for analysis at scale, not for single-row handling.

Every operation in BigQuery involves a full table scan, but only of the columns mentioned in the query. The goal is predictable costs: before running a query, you can see how much data it will process, and therefore what it will cost. That can seem like a steep price for querying just one row, but the good news is that the cost stays constant even as queries get far more complex and CPU-intensive.
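To make the point about column-based scanning concrete, here is a toy model in Python. It is only a sketch of the billing idea described above, not BigQuery's actual accounting: the column names and per-column sizes are invented (only the 274 GB total comes from the question), the price is a placeholder parameter rather than a quoted rate, and features such as partitioning or clustering are left out.

```python
# Toy model: the data scanned depends on which columns a query references,
# not on how many rows it returns.

# Hypothetical per-column sizes in GB for a table like logs.nobids_05.
# The names and the split are made up; they just add up to 274 GB.
COLUMN_SIZES_GB = {
    "ts": 4,
    "request_id": 20,
    "payload": 250,
}


def scanned_gb(selected_columns, limit=None):
    """Return the GB scanned by a query that reads the given columns.

    `limit` is accepted but deliberately ignored: in this model LIMIT
    trims the result set, not the scan.
    """
    if selected_columns == "*":
        selected_columns = list(COLUMN_SIZES_GB)
    return sum(COLUMN_SIZES_GB[name] for name in selected_columns)


def estimated_cost(gb, price_per_tb):
    """Turn scanned GB into a cost for an assumed price per TB (1 TB = 1024 GB here)."""
    return gb / 1024 * price_per_tb


if __name__ == "__main__":
    # select * ... limit 1 still reads every column in full.
    print("select *  limit 1:", scanned_gb("*", limit=1), "GB")
    # Naming a single (hypothetical) column reads only that column.
    print("select ts limit 1:", scanned_gb(["ts"], limit=1), "GB")
```

In this model the practical takeaway is to name only the columns you need instead of using `select *`. Before trusting any number from `estimated_cost`, pass in the current on-demand rate from the official pricing page.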

Once in a while you might need to run a single-row query, and the cost might seem excessive, but the assumption here is that you are using this tool to analyze data at scale, and the overall cost of having data stored in it should be more than competitive with other tools available. Since you've been working with other tools, I'd love to see a total cost comparison of analytical sessions in real-world scenarios.
