
Machine Learning with BigQuery ML

This section explains the pricing models for BigQuery and BigQuery ML. Because the pricing of GCP services changes frequently, we suggest that you consult https://cloud.google.com/bigquery/pricing for the latest figures.
Let's look at the pricing model for BigQuery first.
BigQuery pricing scales with usage. There are three main cost drivers: storage, compute, and streaming ingestion.
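BigQuery storage costs are calculated based on the uncompressed size of your datasets. BigQuery offers two layers of storage: active storage, and long-term storage for tables that have not been modified for 90 consecutive days.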
Tip
Thanks to BigQuery long-term storage, it is no longer necessary to transfer archived data to Google Cloud Storage to save money. You can keep your data online and accessible at a very low cost.
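BigQuery compute costs are calculated based on the volume of data processed by the queries you run. The compute cost varies with the pricing model that the customer has chosen: on-demand, where you pay for the data each query processes, or flat rate, where you pay a fixed price for a reserved number of BigQuery slots.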
Tip
Keep in mind that you're not charged to store BigQuery public datasets: you pay only to access and query them. For this reason, using BigQuery public datasets can be a cost-effective way to perform tests on BigQuery, paying only for compute and not for storage.
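For queries billed by data volume, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name, the binary definition of a terabyte, and the price passed in are assumptions made here, not values taken from the text, so always read the current per-terabyte price from the pricing page.

```python
# Assumption: 1 TB is treated as 2**40 bytes for billing purposes.
BYTES_PER_TB = 2 ** 40


def on_demand_query_cost_usd(bytes_processed, price_per_tb_usd):
    """Estimate the cost of a query from the bytes it processes.

    price_per_tb_usd is a caller-supplied, hypothetical rate; look up
    the real one on the BigQuery pricing page.
    """
    if bytes_processed < 0 or price_per_tb_usd < 0:
        raise ValueError("bytes_processed and price_per_tb_usd must be non-negative")
    return bytes_processed / BYTES_PER_TB * price_per_tb_usd


if __name__ == "__main__":
    # Half a terabyte scanned at a placeholder rate of 5.0 USD per TB.
    print(on_demand_query_cost_usd(2 ** 39, 5.0))
```

With the placeholder rate of 5.0 USD per TB, a query that processes half a terabyte is estimated at 2.5 USD. The same function applies to public datasets, where compute is the only charge.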
Loading data into BigQuery is usually free, apart from the ingestion processes that happen through the BigQuery streaming API. As of October 2020, you will be charged $0.010 for every 200 MB ingested with this interface. Each row is treated as a minimum of 1 KB.
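The streaming charge described above can be turned into a quick estimate. The following is a minimal sketch that uses the October 2020 figures quoted in the text; it assumes that 1 KB is 1,024 bytes, that 1 MB is 1,024 KB, and that the charge is applied pro rata rather than rounded up to whole 200 MB blocks. The function name is hypothetical.

```python
PRICE_PER_BLOCK_USD = 0.010  # from the text, as of October 2020
BLOCK_SIZE_MB = 200          # from the text
MIN_ROW_BYTES = 1024         # each row is billed as at least 1 KB (assumed 1,024 bytes)
BYTES_PER_MB = 1024 * 1024   # assumption: binary megabyte


def streaming_cost_usd(row_sizes_bytes):
    """Estimate the streaming ingestion cost for an iterable of row sizes in bytes."""
    # Rows smaller than 1 KB are billed as if they were 1 KB.
    billed_bytes = sum(max(size, MIN_ROW_BYTES) for size in row_sizes_bytes)
    billed_mb = billed_bytes / BYTES_PER_MB
    # Assumption: pro rata billing, no rounding up to full 200 MB blocks.
    return billed_mb / BLOCK_SIZE_MB * PRICE_PER_BLOCK_USD


if __name__ == "__main__":
    # One million rows of 100 bytes each are billed as one million rows of 1 KB.
    print(streaming_cost_usd([100] * 1_000_000))
```

Under these assumptions, one million 100-byte rows cost about $0.049, the same as one million 1 KB rows, which shows why very small rows are relatively expensive to stream.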
Tip
If your use case doesn't require you to ingest data in real time, we suggest you use the bulk loading mechanism to ingest data into BigQuery, which is always free of charge.
The pricing model of BigQuery ML is similar to that for BigQuery compute costs. As we saw in the previous section, customers can choose between on-demand and flat-rate pricing.
If the customer has already chosen to activate flat rate mode with a fixed number of BigQuery slots available, BigQuery slots are also leveraged to train, evaluate, and run the BigQuery ML models.
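If the customer is using the on-demand pricing model, it is necessary to split the BigQuery ML algorithms into two different categories: internal models, which are trained within BigQuery, and external models, which are trained on AI Platform resources.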
At the time of writing, the pricing of internal models is based on the volumes of data processed during the main stages of the ML life cycle (training, evaluation, and prediction):
Figure 1.15 – BigQuery ML pricing for internal ML models
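Because internal models are charged by data volume at each stage, the total cost is a sum of per-stage charges. The sketch below illustrates that sum; the function name, the stage keys, and every rate in the example are hypothetical placeholders and are not the prices shown in the figure.

```python
# Assumption: 1 TB is treated as 2**40 bytes.
BYTES_PER_TB = 2 ** 40


def internal_model_cost_usd(bytes_by_stage, price_per_tb_by_stage):
    """Return the estimated cost per ML stage plus a 'total' entry.

    Both arguments are dicts keyed by stage name. The rates are
    caller-supplied placeholders, not official prices.
    """
    costs = {}
    for stage, n_bytes in bytes_by_stage.items():
        costs[stage] = n_bytes / BYTES_PER_TB * price_per_tb_by_stage[stage]
    costs["total"] = sum(costs.values())
    return costs


if __name__ == "__main__":
    usage = {"training": 2 ** 40, "evaluation": 2 ** 39, "prediction": 2 ** 38}
    # Hypothetical rates in USD per TB, chosen only to make the arithmetic visible.
    rates = {"training": 100.0, "evaluation": 4.0, "prediction": 4.0}
    print(internal_model_cost_usd(usage, rates))
```

Keeping the rates in a separate dictionary makes it easy to update them when the published prices change.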
The pricing of external models is based on the cost of the external AI Platform resources used for the training of the model plus an additional BigQuery ML fee applied on top:
Figure 1.16 – BigQuery ML pricing for external ML models
Prices are always under review and subject to change on GCP. For this reason, we suggest consulting https://cloud.google.com/bigquery-ml/pricing.
BigQuery offers a wide variety of operations free of charge, as well as free tiers to experiment with this technology at no cost.
The following operations are always free in BigQuery:
To encourage experimentation with BigQuery, every user gets a monthly free allowance for certain operations, up to the thresholds shown in the following table:
Figure 1.17 – BigQuery ML free tiers
Now that we've seen the BigQuery free tiers that we can use, let's take a look at the pricing calculator.
To estimate the cost of using BigQuery with on-demand pricing, you can use the Google Cloud pricing calculator: https://cloud.google.com/products/calculator. The following screenshot shows the monthly cost of storing, ingesting through streaming, and processing the following data volumes:
Figure 1.18 – BigQuery pricing calculator
You can use the pricing calculator to estimate the consumption of all the other GCP services to get a better understanding of your GCP costs.