Developers
August 5, 2020

Using Google AutoML Tables To Build ML Models Automatically

AutoML Tables lets you build, analyze, and deploy machine learning models automatically. It can be used for many purposes, including fraud detection and credit risk analysis.

Today we will talk about AutoML Tables. What is it? Well, it is a Google Cloud service that lets you build, analyze, and deploy machine learning models on structured data automatically.

You can use it for a wide range of machine learning tasks, such as asset valuation, fraud detection, credit risk analysis, and customer retention prediction.

Several features help make AutoML Tables more useful and user-friendly. We enumerate them in the following list.

  • An improved Python client library
  • The ability to obtain explanations for your online predictions
  • The ability to export your model and serve it in a container
  • The ability to view model search progress and final model hyperparameters in Cloud Logging

Cloud AI Platform Pipelines provides a way to deploy robust, repeatable machine learning pipelines. It supports monitoring, auditing, version tracking, and reproducibility, and delivers an enterprise-ready, easy-to-install, secure execution environment for your machine learning workflows.

The example pipeline performs several steps. It creates a dataset, imports data into it from a BigQuery view, and then trains a model on that data. Finally, it fetches metrics and other information about the trained model.
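These stages can be sketched as plain functions wired together in order. Everything below is an illustrative placeholder, not the actual AutoML Tables or Kubeflow Pipelines component APIs; the function names, URIs, and return values are ours.

```python
# Sketch of the pipeline's four stages, wired together in order.
# All names and values are illustrative placeholders, not the real
# AutoML Tables / Kubeflow Pipelines component APIs.

def create_dataset(display_name):
    # Would call the AutoML Tables API to create an empty dataset.
    return {"name": display_name, "rows": 0}

def import_from_bigquery(dataset, bq_view_uri):
    # Would import rows into the dataset from a BigQuery view.
    return dict(dataset, source=bq_view_uri, rows=100_000)

def train_model(dataset, budget_node_hours):
    # Would launch an AutoML Tables training job on the dataset.
    return {"dataset": dataset["name"], "budget_node_hours": budget_node_hours}

def fetch_model_info(model):
    # Would retrieve evaluation metrics and metadata for the trained model.
    return {"model": model, "metrics": {"placeholder": True}}

def run_pipeline():
    ds = create_dataset("bike_weather")
    ds = import_from_bigquery(ds, "bq://my-project.views.bike_weather")
    model = train_model(ds, budget_node_hours=1)
    return fetch_model_info(model)
```

In the real pipeline, each of these stages is a Kubeflow Pipelines step rather than a local function call, but the data flow between them is the same.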

Once the pipeline has the metrics and information about the model, it uses them to decide whether to deploy the model for online prediction. If it decides to deploy the model, it can then make prediction requests and obtain prediction results as well as prediction explanations.

You can manage the workflow from the Cloud Console Tables UI, or via a notebook or script. By specifying the process as a workflow, you gain some advantages: the workflow becomes repeatable, and the pipeline makes it easy to monitor results. If your dataset is updated regularly, you can schedule the workflow to run daily, building a model each day that trains on the updated data.

Datasets and Machine Learning

The Cloud Public Datasets Program provides public datasets that are useful for experimenting with machine learning. The example joins two of these datasets, both stored in BigQuery: London Bike Rentals and NOAA Weather Data. From the joined data, it builds a regression model that predicts the duration of a bike rental based on information about the start and end rental stations.
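A join of this kind can be expressed in BigQuery SQL. The dataset IDs below are the real public datasets (`bigquery-public-data.london_bicycles` and `bigquery-public-data.noaa_gsod`), but the chosen columns and the date-based join key are simplified for illustration and are not taken from the example pipeline itself.

```python
# Illustrative BigQuery SQL joining London bike rentals with NOAA
# weather data by calendar date. Dataset IDs are the real public
# datasets; the selected columns and the join key are simplified.
TRAINING_DATA_QUERY = """
SELECT
  hire.duration,
  hire.start_station_id,
  hire.end_station_id,
  EXTRACT(DAYOFWEEK FROM hire.start_date) AS day_of_week,
  weather.temp,
  weather.prcp
FROM `bigquery-public-data.london_bicycles.cycle_hire` AS hire
JOIN `bigquery-public-data.noaa_gsod.gsod2017` AS weather
  ON FORMAT_DATE('%Y%m%d', DATE(hire.start_date))
     = CONCAT(weather.year, weather.mo, weather.da)
"""
```

Saving a query like this as a BigQuery view gives the pipeline a single, stable source to import training data from.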

Cloud AI Platform Pipelines, currently in Beta, provides a way to deploy robust, repeatable machine learning pipelines, and delivers an enterprise-ready, easy-to-install environment for your machine learning workflows. AI Platform Pipelines is based on Kubeflow Pipelines (KFP), installed on a Google Kubernetes Engine (GKE) cluster.

Once the dataset is defined, the pipeline trains the model. This happens in the “automl-create-model-for-tables” pipeline step. Via pipeline parameters, you can specify the training budget and the optimization objective, and determine which columns are included in or excluded from the model.
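One detail worth calling out: AutoML Tables expresses the training budget in milli node hours, so one node hour of training is a budget of 1,000. A small helper can assemble these parameters; the key names below mirror the Tables client's training arguments, but treat this as an illustrative sketch rather than the exact pipeline-step interface.

```python
def training_params(display_name, budget_node_hours,
                    optimization_objective="MINIMIZE_MAE",
                    exclude_columns=()):
    """Assemble training parameters for an AutoML Tables model.

    AutoML Tables expresses the training budget in milli node hours,
    so a budget given in node hours is multiplied by 1000.
    """
    return {
        "model_display_name": display_name,
        "train_budget_milli_node_hours": budget_node_hours * 1000,
        "optimization_objective": optimization_objective,
        "exclude_column_spec_names": list(exclude_columns),
    }

# One node hour of training, excluding an ID column from the model.
params = training_params("bike_trips", 1, exclude_columns=["bike_id"])
```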

Cloud Logging and the details of your model

Depending on the characteristics of your dataset, it can be worthwhile to specify a non-default optimization objective. The AutoML Tables documentation describes the available optimization objectives and when each is the right choice.
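For reference, these are the objective identifiers AutoML Tables documents, grouped by problem type, with the default listed first in each group (the binary-classification group also supports precision/recall-constrained objectives not shown here). The small selection helper around them is our own illustration, not part of the API.

```python
# Optimization objectives documented for AutoML Tables, by problem
# type. The default objective is listed first in each group.
OBJECTIVES = {
    "regression": ["MINIMIZE_RMSE", "MINIMIZE_MAE", "MINIMIZE_RMSLE"],
    "binary_classification": ["MAXIMIZE_AU_ROC", "MAXIMIZE_AU_PRC",
                              "MINIMIZE_LOG_LOSS"],
    "multiclass_classification": ["MINIMIZE_LOG_LOSS"],
}

def pick_objective(problem_type, preferred=None):
    """Return `preferred` if valid for the problem type, else the default."""
    valid = OBJECTIVES[problem_type]
    return preferred if preferred in valid else valid[0]
```

For the bike-rental duration model, a regression task, `MINIMIZE_MAE` is a reasonable non-default choice when you care more about typical error than about penalizing outliers.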

Via Cloud Logging, you can check out the details of an AutoML Tables model. Using Logging, you can see the final model hyperparameters as well as the hyperparameters and objective values used during model training and tuning.

The easiest way to access these logs is to go to the AutoML Tables page in the Cloud Console. There, select the Models tab, click on the model you are interested in, and then click the model's link to see its logs.

Once the model is trained, the pipeline moves to the next step: model evaluation. At this stage, we access the evaluation metrics and use them to decide whether the model should be deployed.
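A minimal sketch of that deployment gate, assuming a regression model whose mean absolute error is compared against a hand-picked threshold. The metric key and the threshold are our own choices for illustration, not part of the pipeline API.

```python
def should_deploy(metrics, mae_threshold):
    """Decide whether to deploy a model based on evaluation metrics.

    `metrics` is assumed to carry the model's mean absolute error under
    the key "mean_absolute_error"; deploy only when the error is at or
    below the threshold we consider acceptable.
    """
    mae = metrics.get("mean_absolute_error")
    if mae is None:
        return False  # no metric available: play it safe, skip deployment
    return mae <= mae_threshold

# Example: a model whose MAE (in seconds of rental duration) is good
# enough to deploy, versus a result with no usable metrics.
deploy_ok = should_deploy({"mean_absolute_error": 450.0}, mae_threshold=600.0)
deploy_missing = should_deploy({}, mae_threshold=600.0)
```

Putting the decision in one small, testable function keeps the pipeline's deploy step simple and makes the acceptance criterion explicit and auditable.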

All of the pipeline steps can also be accomplished via the AutoML Tables UI in the Cloud Console, including functionality not implemented in this example, such as the ability to export the model's test set and prediction results to BigQuery for further analysis. AutoML Tables also lets you export your full custom model, packaged so that you can serve it via a Docker container.

In conclusion, Google AutoML Tables lets you build, analyze, and deploy machine learning models automatically. Several features make the service useful and user-friendly: an improved Python client library, the ability to obtain explanations for your online predictions, the ability to export your model and serve it in a container, and the ability to view model search progress and final model hyperparameters in Cloud Logging, among others. Cloud AI Platform Pipelines provides a way to deploy robust, repeatable machine learning pipelines. If you are interested in the service, don't forget to check out the Google Cloud Blog.

 

Tags: GCP, Machine Learning, Google AutoML
Lucas Bonder
Technical Writer
Lucas is an Entrepreneur, Web Developer, and Article Writer about Technology.
