
Machine Learning with BigQuery ML

Since launching Google Search in 1998, Google has developed one of the largest and most powerful IT infrastructures in the world. Today, billions of users rely on this infrastructure to access services such as Gmail, YouTube, Google Photos, and Maps. Ten years later, in 2008, Google decided to open its network and IT infrastructure to business customers, turning an infrastructure that was initially developed for consumer applications into a public service and launching Google Cloud Platform (GCP).
The 90+ services that Google currently provides to large enterprises and small- and medium-sized businesses cover the following categories:
All the services just mentioned can be accessed through four different interfaces:
Figure 1.1 – Screenshot of Google Cloud Console
Now that we've learned how to interact with GCP, let's discover how GCP is different from other cloud providers.
GCP is not the only public cloud provider on the market. Other companies have entered this business, for example, Amazon Web Services (AWS), Microsoft Azure, IBM, and Oracle. For this reason, before we get too deep into this book, it is worth understanding how GCP differs from the other offerings in the cloud market.
Each cloud provider has its own mission, strategy, history, and strengths. Let's take a look at why Google Cloud can be considered different from all the other cloud providers.
Google provides an end-to-end security model for its data centers across the globe, using customized hardware developed and used by Google, and application encryption is enabled by default. The security best practices adopted by Google for GCP are the same as those developed to run applications with more than 1 billion users, such as Gmail and Google Maps.
At the time of writing, Google's infrastructure is available in 24 different regions, 74 availability zones, and 144 network edge locations, enabling customers to connect to Google's network and ensuring the best experience in terms of bandwidth, network latency, and security. This network allows GCP users to move data across different regions without leaving Google's proprietary network, minimizing the risk of sending information across the public internet. As of today, it is estimated that about 40% of internet traffic goes through Google's proprietary network.
In the following figure, we can see how GCP regions are distributed across the globe:
Figure 1.2 – A map of Google's global availability
The latest version of the map can be seen at the following URL: https://cloud.google.com/about/locations.
Google provides many fully managed and serverless services, allowing its customers to focus on high-value activities rather than maintenance operations. A great example is BigQuery, the serverless data warehouse that will be introduced in the next section of this chapter.
100% of the energy used for Google's data centers comes from renewable energy sources. Furthermore, Google has committed to being the first major company to operate carbon-free across all its operations, including its data centers and campuses, by 2030.
Google is a pioneer of the AI industry and uses AI and ML not only to improve its consumer products, such as Google Photos, but also to improve the performance and efficiency of its data centers. Customers can draw on all of Google's AI and ML expertise by adopting GCP services such as AutoML and BigQuery ML. Using these services will be the main focus of this book.
Now that we have discussed some of the key elements of GCP as a service, let's look at AI and ML more specifically.