Mlengine Boilerplate

A repository to get you started quickly with new machine learning projects on Google Cloud Platform. More info (slides):

Overview

Mlengine Boilerplate is a starting point for machine learning projects on Google Cloud Platform (GCP). It provides a structured project layout that covers the workflow from data preprocessing to model deployment, so both beginners and experienced developers can spend less time on setup and more time on the model itself.

The boilerplate covers the main stages of a machine learning project: preprocessing pipelines built on Apache Beam and model training with TensorFlow. Both stages can run locally, and the project integrates with Google's ML Engine for training, deploying, and managing models in the cloud.
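As an illustration of how one Beam pipeline can target either a local run or Cloud Dataflow, the sketch below builds the command-line options that select the runner. The function name `beam_pipeline_args` and the `tmp`/`staging` bucket layout are assumptions made for this example, not code taken from the repository; the option names themselves (`--runner`, `--project`, `--region`, `--job_name`, `--temp_location`, `--staging_location`) are standard Apache Beam pipeline options.

```python
from typing import List, Optional


def beam_pipeline_args(
    run_on_cloud: bool,
    project: Optional[str] = None,
    bucket: Optional[str] = None,
    region: str = "us-central1",
    job_name: str = "preprocess",
) -> List[str]:
    """Build Apache Beam options for a local or a Cloud Dataflow run.

    Hypothetical helper: the bucket sub-paths below are an assumed layout.
    """
    if not run_on_cloud:
        # DirectRunner executes the pipeline on the local machine.
        return ["--runner=DirectRunner"]
    if not project or not bucket:
        raise ValueError("A cloud run needs both a project ID and a bucket")
    # DataflowRunner submits the same pipeline to Cloud Dataflow.
    return [
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",
        f"--job_name={job_name}",
        f"--temp_location=gs://{bucket}/tmp",
        f"--staging_location=gs://{bucket}/staging",
    ]


if __name__ == "__main__":
    print(beam_pipeline_args(run_on_cloud=False))
    print(beam_pipeline_args(True, project="my-project", bucket="my-bucket"))
```

The returned list can be handed to `apache_beam.options.pipeline_options.PipelineOptions` when constructing the pipeline, so the pipeline code itself stays the same for both targets.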

Features

  • Preprocessing Pipeline: Uses Apache Beam for data preprocessing; the same pipeline can run locally or on Cloud Dataflow.

  • Model Training: Trains TensorFlow models either locally or on ML Engine.

  • Deployment Ready: Saved models can be deployed to ML Engine, easing the move from development to production.

  • Starter Code: Includes template code for using your saved models on ML Engine, which reduces initial development time.

  • Cloud SDK Integration: Requires the Google Cloud SDK, which provides the gcloud command-line tool used to interact with Google Cloud infrastructure.

  • Customizable Code Structure: Preprocessing and model training are customized by editing a few key Python files, so the code can be adapted to project-specific requirements.

  • Future Functionalities: The repository is under active development, with planned features such as hyperparameter tuning and support for TensorFlow Transform.
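To make the training and deployment features above more concrete, here is a minimal sketch of the gcloud commands such a workflow typically issues, assembled as argument lists in Python. The helper names, the `trainer.task` module, the `trainer/` package path, and the `gs://<bucket>/<job_name>` layout are assumptions for illustration, not the repository's actual scripts. The sketch uses the `gcloud ml-engine` command group to match the naming in this document; newer Cloud SDK releases expose the same commands under `gcloud ai-platform`.

```python
import shlex
from typing import List


def training_job_command(
    job_name: str,
    bucket: str,
    region: str,
    runtime_version: str,
    module_name: str = "trainer.task",  # assumed entry-point module
    package_path: str = "trainer/",     # assumed package directory
) -> List[str]:
    """Arguments for submitting a training job to ML Engine."""
    return [
        "gcloud", "ml-engine", "jobs", "submit", "training", job_name,
        "--module-name", module_name,
        "--package-path", package_path,
        "--job-dir", f"gs://{bucket}/{job_name}",
        "--region", region,
        "--runtime-version", runtime_version,
    ]


def deploy_commands(
    model_name: str,
    version: str,
    saved_model_dir: str,
    region: str,
    runtime_version: str,
) -> List[List[str]]:
    """Arguments for creating a model and a version from a saved model."""
    return [
        ["gcloud", "ml-engine", "models", "create", model_name,
         "--regions", region],
        ["gcloud", "ml-engine", "versions", "create", version,
         "--model", model_name,
         "--origin", saved_model_dir,
         "--runtime-version", runtime_version],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them; values are placeholders.
    print(shlex.join(training_job_command(
        "my_job", "my-bucket", "us-central1", "1.15")))
    for cmd in deploy_commands(
            "my_model", "v1", "gs://my-bucket/my_job/export",
            "us-central1", "1.15"):
        print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the Cloud SDK is installed and authenticated. Printing first is a cheap way to review the command before it creates billable resources.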