Professional Documents
Culture Documents
Complete Machine
Learning Lifecycle
Corey Zumar
March 27th, 2019
Outline
• Overview of ML development challenges
• MLflow components
• Demo
2
Machine Learning
Development is Complex
3
μ
ML Lifecycle λθ Tunin
g
Scal
Data Prep e
μ
Model λθ Tunin
Delta Raw Data Exchang Training g
e Scal
e
Scal
e
Deploy
Governanc
e Scal
e
4
Custom ML Platforms
Facebook FBLearner, Uber Michelangelo, Google TFX
+ Standardize the data prep / training / deploy loop:
if you work with the platform, you get these!
– Limited to a few algorithms or frameworks
– Tied to one company’s infrastructure
6
MLflow Components
7
Key Concepts in Tracking
Parameters: key-value inputs to your code
Metrics: numeric values (can update over time)
Artifacts: arbitrary files, including data and models
Source: training code that ran
Version: version of the training code
Tags and Notes: any additional information
8
MLflow Tracking
Tracking APIs
(REST, Python, Java, R)
UI
9
MLflow Tracking
import mlflow
with mlflow.start_run():
# train model
Record and query
experiments: code, mlflow.log_metric("mse", model.mse())
mlflow.log_artifact("plot", model.plot(test_df))
configs, results, mlflow.tensorflow.log_model(model)
…etc
10
MLflow backend stores
1. Entity Store
• FileStore (local filesystem)
• SQLStore (via SQLAlchemy)
• REST Store
2. Artifact Repository
• S3 backed store
• Azure Blob storage
• Google Cloud storage
• DBFS artifact repo
11
Demo
Goal: Classify hand-drawn digits
12
MLflow Projects Motivation
Diverse set of training
tools Result:
ML code is difficult
Diverse set of
to productionize.
environments
13
MLflow Projects
Local Execution
Project Spec
Code Config
Remote Execution
Dependencie Data
s
14
MLflow Projects
Packaging format for reproducible ML runs
• Any code folder or GitHub repository
• Optional MLproject file with project configuration
Defines dependencies for reproducibility
• Conda (+ R, Docker, …) dependencies can be specified in MLproject
• Reproducible in (almost) any environment
Execution API for running projects
• CLI / Python / R / Java
• Supports local and remote execution
15
Example MLflow Project
my_project/ conda_env: conda.yaml
├── MLproject
│ entry_points:
│ main:
│ parameters:
│ training_data: path
│ lambda: {type: float, default: 0.1}
├── conda.yaml command: python main.py {training_data}
├── main.py {lambda}
└── model.py
... $ mlflow run git://<my_project>
16
Demo
Goal: Classify hand-drawn digits
17
MLflow Models Motivation
Inference
Code
Serving Tools
ML
Frameworks
18
MLflow Models
Inference Code
Model Format
Flavor 1 Flavor 2
Batch & Stream
Scoring
Standard for ML
ML models Serving Tools
Frameworks
19
MLflow Models
Packaging format for ML Models
• Any directory with MLmodel file
Defines dependencies for reproducibility
• Conda environment can be specified in MLmodel configuration
Model creation utilities
• Save models from any framework in MLflow format
Deployment APIs
• CLI / Python / R / Java
20
Example MLflow Model
run_id: 769915006efd4c4bbd662461
my_model/ time_created: 2018-06-28T12:34
├── MLmodel flavors:
│ tensorflow:
│ saved_model_dir: estimator
Usable with Tensorflow
│ signature_def_key: predict tools / APIs
│ python_function:
loader_module: Usable with any Python
│
mlflow.tensorflow tool
└ estimator/
├─ saved_model.pb
└─ variables/
mlflow.tensorflow.log_model(...)
...
21
Model Flavors Example
PyTorch mlflow.pytorch.log_model()
Train a model
predict = mlflow.pyfunc.load_pyfunc(…)
Flavor 1:
Pyfunc predict(input_dataframe)
Model Flavor 2:
Format model = mlflow.pytorch.load_model(…)
PyTorch
with torch.no_grad():
model(input_tensor)
22
Model Flavors Example
predict = mlflow.pyfunc.load_pyfunc(…)
predict(input_dataframe)
23
Demo
Goal: Classify hand-drawn digits
24
Get started with MLflow
tinyurl.com/mlflow-slack
2525
0.9.0 Release
27
Thank you!
28