Commit 5bc4290

Cloud ML transfer learning (#32)
Add Cloud ML transfer learning example (not quite complete yet), plus top-level INSTALL changes and workshop Dockerfile changes.
1 parent b3d9100 commit 5bc4290

43 files changed: 3148 additions & 168 deletions

INSTALL.md

Lines changed: 76 additions & 11 deletions
Original file line number | Diff line number | Diff line change
@@ -1,6 +1,22 @@
11

2+
23
# Installation instructions for the TensorFlow workshop
34

5+
- [Docker-based installation](#docker-based-installation)
6+
- [Download the container image](#download-the-container-image)
7+
- [Create a directory to hold data files needed by the workshop](#create-a-directory-to-hold-data-files-needed-by-the-workshop)
8+
- [Run the container](#run-the-container)
9+
- [Restarting the container later](#restarting-the-container-later)
10+
- [Virtual environment-based installation](#virtual-environment-based-installation)
11+
- [Install Conda + Python 2.7 to use as your local virtual environment](#install-conda--python-27-to-use-as-your-local-virtual-environment)
12+
- [Install TensorFlow into a virtual environment](#install-tensorflow-into-a-virtual-environment)
13+
- [Install some Python packages](#install-some-python-packages)
14+
- [Install the Google Cloud SDK](#install-the-google-cloud-sdk)
15+
- [Cloud ML setup](#cloud-ml-setup)
16+
- [Cloud ML SDK installation (for 'transfer learning' preprocessing)](#cloud-ml-sdk-installation-for-transfer-learning-preprocessing)
17+
- [Set up some data files used in the examples](#set-up-some-data-files-used-in-the-examples)
18+
- [Optional: Clone/Download the TensorFlow repo from GitHub](#optional-clonedownload-the-tensorflow-repo-from-github)
19+
420
You can set up for the workshop in two different, mutually-exclusive ways:
521

622
- [Running in a docker container](#docker-based-installation).
@@ -17,7 +33,7 @@ To use it, you'll need to have [Docker installed](https://docs.docker.com/engine
1733
Once Docker is installed and running, download the workshop image:
1834

1935
```sh
20-
$ docker pull gcr.io/google-samples/tf-workshop:v4
36+
$ docker pull gcr.io/google-samples/tf-workshop:v5
2137
```
2238

2339
[Here's the Dockerfile](https://github.com/amygdala/tensorflow-workshop/tree/master/workshop_image) used to build this image.
@@ -32,7 +48,7 @@ Once you've downloaded the container image, you can run it like this:
3248

3349
```sh
3450
$ docker run -v `pwd`/workshop-data:/root/tensorflow-workshop-master/workshop-data -it \
35-
-p 6006:6006 -p 8888:8888 gcr.io/google-samples/tf-workshop:v4
51+
-p 6006:6006 -p 8888:8888 gcr.io/google-samples/tf-workshop:v5
3652
```
3753

3854
Edit the path to the directory you're mounting as appropriate. The first component of the `-v` arg is the local directory, and the second component is where you want to mount it in your running container.
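
As a minimal sketch of how the two components of the `-v` argument fit together, the snippet below builds the mount string from separate variables and prints the resulting command so it can be checked before running it. The variable names (`HOST_DIR`, `CONTAINER_DIR`, `IMAGE`, `MOUNT_ARG`) are illustrative assumptions; the paths, ports, and image tag are taken from the command above.

```sh
# Illustrative variable names; the values come from the `docker run` command above.
HOST_DIR="$(pwd)/workshop-data"                                 # local directory (first component)
CONTAINER_DIR="/root/tensorflow-workshop-master/workshop-data"  # mount point in the container (second component)
IMAGE="gcr.io/google-samples/tf-workshop:v5"

# The -v argument is "<local directory>:<container directory>".
MOUNT_ARG="${HOST_DIR}:${CONTAINER_DIR}"

# Print the command instead of running it, so the paths can be inspected first.
echo docker run -v "$MOUNT_ARG" -it -p 6006:6006 -p 8888:8888 "$IMAGE"
```

Quoting the variables and using `$(pwd)` keeps the mount working when the current directory path contains spaces.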
@@ -63,7 +79,8 @@ $ docker exec -it <container_id> bash
6379

6480
We highly recommend that you use a virtual environment for your TensorFlow installation rather than a direct install onto your machine. The instructions below walk you through a `conda` install, but a `virtualenv` environment will work as well.
6581

66-
The instructions specify using Python 2.7, but Python 3.x will work for everything but the "Cloud ML" sections of the workshop.
82+
Note: The 'preprocessing' stage in the [Cloud ML transfer learning](workshop_sections/transfer_learning/cloudml)
83+
example requires installation of the Cloud ML SDK, which requires Python 2.7. Otherwise, Python 3 should likely work.
6784

6885
### Install Conda + Python 2.7 to use as your local virtual environment
6986

@@ -73,11 +90,11 @@ Follow the instructions [here](https://www.continuum.io/downloads). The [minico
7390

7491
### Install TensorFlow into a virtual environment
7592

76-
Follow the instructions [on the TensorFlow site](https://www.tensorflow.org/versions/r0.11/get_started/os_setup.html#anaconda-installation) to create a Conda environment with Python 2.7, *activate* it, and then use [conda-forge](https://www.tensorflow.org/versions/r0.11/get_started/os_setup.html#using-conda) to install TensorFlow within it.
93+
Follow the instructions [on the TensorFlow site](https://www.tensorflow.org/get_started/os_setup#anaconda_installation) to create a Conda environment with Python 2.7, *activate* it, and then install TensorFlow within it.
7794

78-
**Note**: as of this writing, `conda-forge` installs TensorFlow 0.11. That is fine for this workshop. If you'd prefer to install using pip, follow the ["using pip" section](https://www.tensorflow.org/versions/r0.11/get_started/os_setup.html#using-pip) instead.
95+
**Note**: Install TensorFlow version 0.12.
7996

80-
If you'd prefer to use virtualenv, see [these instructions](https://www.tensorflow.org/versions/r0.11/get_started/os_setup.html#virtualenv-installation) instead.
97+
If you'd prefer to use virtualenv, see [these instructions](https://www.tensorflow.org/get_started/os_setup#virtualenv_installation) instead.
8198

8299
Remember to activate your environment in all the terminal windows you use during this workshop.
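
One way to check whether an environment is active in a given terminal is sketched below. It assumes that activating a conda environment sets `CONDA_DEFAULT_ENV` and that activating a virtualenv sets `VIRTUAL_ENV`; the helper name `check_env` is hypothetical and not part of the workshop materials.

```sh
# Hypothetical helper: report which virtual environment (if any) is active in this shell.
check_env() {
  if [ -n "${CONDA_DEFAULT_ENV:-}" ]; then
    echo "conda environment active: ${CONDA_DEFAULT_ENV}"
  elif [ -n "${VIRTUAL_ENV:-}" ]; then
    echo "virtualenv active: ${VIRTUAL_ENV}"
  else
    echo "no virtual environment active"
    return 1
  fi
}

# Print a reminder if nothing is active in this terminal window.
check_env || echo "Activate your environment before continuing."
```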
83100

@@ -112,17 +129,65 @@ gcloud components install beta
112129

113130
To get the `gcloud beta ml` commands.
114131

132+
### Cloud ML setup
115133

116-
## [Optional: Get Started With Google Cloud Machine Learning](#cloud-ml-setup)
117-
118-
Follow the following instructions in order:
119-
120-
NOTE: You DO NOT need to follow the "Setting up your Environment" section
134+
Follow the instructions below to create a project, initialize it for Cloud ML, and set up a storage bucket to use for the workshop examples.
121135

122136
* [Setting Up Your GCP Project](https://cloud.google.com/ml/docs/how-tos/getting-set-up#setting_up_your_google_cloud_project)
123137
* [Initializing Cloud ML for your project](https://cloud.google.com/ml/docs/how-tos/getting-set-up#initializing_your_product_name_short_project)
124138
* [Setting up your Cloud Storage Bucket](https://cloud.google.com/ml/docs/how-tos/getting-set-up#setting_up_your_cloud_storage_bucket)
125139

140+
### Cloud ML SDK installation (for 'transfer learning' preprocessing)
141+
142+
The Cloud ML SDK is needed to run the 'preprocessing' stage in the [Cloud ML transfer
143+
learning](workshop_sections/transfer_learning/cloudml) example. It requires Python 2.7 to install. It's possible to
144+
skip this part of setup for most of the exercises.
145+
146+
To install the SDK, follow the setup instructions
147+
[on this page](https://cloud.google.com/ml/docs/how-tos/getting-set-up).
148+
(If you've followed the instructions above, you will have already done some of these steps. **Install TensorFlow version 0.12** as described in [this section](#install-tensorflow-into-a-virtual-environment), not 0.11.)
149+
150+
**Note**: if you have issues with the pip install of `python-snappy`, and are running in a conda virtual environment, try `conda install python-snappy` instead.
151+
152+
You don't need to download the Cloud ML samples or docs for this workshop, though you may find it useful to grab them
153+
anyway.
154+
155+
## Set up some data files used in the examples
156+
157+
### Transfer learning example
158+
159+
Because we have limited workshop time, we've saved a set of
160+
[TFRecords](https://www.tensorflow.org/api_docs/python/python_io/)
161+
generated as part of the [Cloud ML transfer learning](workshop_sections/transfer_learning/cloudml)
162+
example. To save time, copy them now to your own bucket as follows.
163+
164+
Copy a zip of the generated records to some directory on your local machine:
165+
166+
```shell
167+
gsutil cp gs://oscon-tf-workshop-materials/transfer_learning/cloudml/hugs_preproc_tfrecords.zip .
168+
```
169+
170+
and then expand the zip:
171+
172+
```shell
173+
unzip hugs_preproc_tfrecords.zip
174+
```
175+
176+
Set the `BUCKET` variable to point to your GCS bucket (replacing `your-bucket-name` with the actual name):
177+
178+
```shell
179+
BUCKET=gs://your-bucket-name
180+
```
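
A common slip at this step is leaving the placeholder in place or dropping the `gs://` prefix. The sketch below shows one way to catch both before copying anything; `check_bucket` is a hypothetical helper and `gs://my-workshop-bucket` is a made-up example name, neither of which is part of the workshop materials.

```shell
# Hypothetical helper: sanity-check a bucket URL before using it.
check_bucket() {
  case "$1" in
    gs://your-bucket-name) echo "placeholder not replaced"; return 1 ;;
    gs://?*) echo "ok" ;;
    *) echo "missing gs:// prefix"; return 1 ;;
  esac
}

# Example with a made-up bucket name; prints "ok".
check_bucket gs://my-workshop-bucket
```

In practice you would call it as `check_bucket "$BUCKET"` once the variable is set. It only checks the form of the URL, not whether the bucket exists.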
181+
182+
Then set the `GCS_PATH` variable as follows, and copy the unzipped records to a `preproc` directory under that path:
183+
184+
```shell
185+
GCS_PATH=$BUCKET/hugs_preproc_tfrecords
186+
gsutil cp -r hugs_preproc_tfrecords/ $GCS_PATH/preproc
187+
```
188+
189+
Once you've done this, you can delete the local zip and `hugs_preproc_tfrecords` directory.
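
The cleanup could be sketched as follows. The function name `cleanup_local_records` is an illustrative assumption; it removes only the zip file and the directory named in the steps above, and should be run from the directory where you downloaded and unzipped them.

```shell
# Hypothetical helper: remove the local zip and the unzipped records directory.
cleanup_local_records() {
  rm -f hugs_preproc_tfrecords.zip
  rm -rf hugs_preproc_tfrecords
}
```

To delete the local copies only after confirming that the upload is visible in your bucket, one option is `gsutil ls "$GCS_PATH/preproc" && cleanup_local_records`, which assumes `gsutil` is installed and `GCS_PATH` is set as above.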
190+
126191
## Optional: Clone/Download the TensorFlow repo from GitHub
127192

128193
We'll be looking at some examples based on code in the tensorflow repo. While it's not necessary, you might want to clone or download it [here](https://github.com/tensorflow/tensorflow), or grab the latest release [here](https://github.com/tensorflow/tensorflow/releases).

workshop_image/Dockerfile

Lines changed: 8 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -15,10 +15,16 @@ FROM gcr.io/tensorflow/tensorflow:latest-devel
1515

1616
RUN pip install --upgrade pip
1717
RUN apt-get update
18+
RUN apt-get install -y unzip python-dev python-pip zlib1g-dev libjpeg-dev libblas-dev
19+
RUN apt-get install -y liblapack-dev libatlas-base-dev libsnappy-dev libyaml-dev gfortran
1820
RUN apt-get install -y python-scipy
19-
RUN pip install sklearn nltk pillow
20-
RUN python -c "import nltk; nltk.download('punkt')"
2121

22+
RUN pip install sklearn nltk pillow setuptools
23+
RUN pip install flask google-api-python-client
24+
RUN pip install pandas python-snappy scipy scikit-learn requests uritemplate
25+
RUN pip install --upgrade --force-reinstall https://storage.googleapis.com/cloud-ml/sdk/cloudml.latest.tar.gz
26+
27+
# RUN python -c "import nltk; nltk.download('punkt')"
2228

2329
RUN curl https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-132.0.0-linux-x86_64.tar.gz | tar xvz
2430
RUN ./google-cloud-sdk/install.sh -q
Lines changed: 11 additions & 155 deletions
Original file line numberDiff line numberDiff line change
@@ -1,159 +1,15 @@
11

2-
# Transfer learning
2+
This directory contains two examples of transfer learning using the "Inception V3" image classification model.
33

4-
- [Introduction](#introduction)
5-
- [1. Take a look at the the Inception v3 model](#1-take-a-look-at-the-the-inception-v3-model)
6-
- [Data sets](#data-sets)
7-
- [The "hugs/no-hugs" data set](#the-hugsno-hugs-data-set)
8-
- [(Or, you can use the Flowers data set if you want)](#or-you-can-use-the-flowers-data-set-if-you-want)
9-
- [Pre-generated 'bottleneck' values for both example datasets](#pre-generated-bottleneck-values-for-both-example-datasets)
10-
- [2. Run a training session and use the model for prediction](#2-run-a-training-session-and-use-the-model-for-prediction)
11-
- [Train the model](#train-the-model)
12-
- [Do prediction using your learned model in an ipython notebook](#do-prediction-using-your-learned-model-in-an-ipython-notebook)
13-
- [3. A Custom Esimator for the transfer learning model](#3-a-custom-esimator-for-the-transfer-learning-model)
14-
- [Named Scopes and TensorBoard Summary information](#named-scopes-and-tensorboard-summary-information)
15-
- [4. Exercise: Building the Custom Estimator's model graph](#4-exercise-building-the-custom-estimators-model-graph)
4+
The [cloudml](cloudml) example shows how to use [Cloud Dataflow](https://cloud.google.com/dataflow/) ([Apache
5+
Beam](https://beam.apache.org/)) to do image preprocessing, then train and serve your model on Cloud ML. It supports
6+
distributed training on Cloud ML.
7+
It is based on the example [here](https://github.com/GoogleCloudPlatform/cloudml-samples/tree/master/flowers), with
8+
some additional modifications to make it easy to use other image sets, and a prediction web server that demos how to
9+
use the Cloud ML API for prediction once your trained model is serving.
1610

11+
The [TF_Estimator](TF_Estimator) example takes a similar approach, but is not packaged to run on Cloud ML. It also
12+
shows an example of using a custom [`Estimator`](https://www.tensorflow.org/api_docs/python/contrib.learn/estimators).
1713

18-
## Introduction
19-
20-
This lab shows how we can use an existing model to do *transfer learning* -- effectively bootstrapping an existing model to reduce the effort needed to learn something new.
21-
22-
Specifically, we will take an 'Inception' v3 architecture model trained on ImageNet images, and using its penultimate "bottleneck" layer, train a new top layer that can recognize other classes of images.
23-
We'll see that our new top layer does not need to be very complex, and that we don't need to do much training of this new model, to get good results for our new image classifications.
24-
25-
The core of the approach is the same as that used in [this TensorFlow example](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/image_retraining), but here we will use a custom [Estimator](https://www.tensorflow.org/versions/r0.11/api_docs/python/contrib.learn.html#estimators) (and train on a different set of photos).
26-
27-
## 1. Take a look at the the Inception v3 model
28-
29-
We can use the `view_inception_model.ipynb` Jupyter notebook to take a look at the structure of the Inception model before we start working with it.
30-
31-
First, download the inception model from:
32-
http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz , extract it,
33-
and copy the model file `classify_image_graph_def.pb` into `/tmp/imagenet` (you may need to first create the directory). This is where our python scripts will look for it, so we're saving a later download by putting it in the same place.
34-
35-
Then, start a jupyter server in this directory. For convenience, run it in a new terminal window. (Don't forget to activate your virtual environment first as necessary).
36-
37-
```sh
38-
$ jupyter notebook
39-
```
40-
41-
Load and run the `view_inception_model.ipynb` notebook. Poke around the model graph a bit.
42-
43-
<a href="https://storage.googleapis.com/oscon-tf-workshop-materials/images/incpv3.png" target="_blank"><img src="https://storage.googleapis.com/oscon-tf-workshop-materials/images/incpv3.png" width="500"/></a>
44-
45-
See if you can find the 'DecodeJpeg/contents:0' and 'pool_3/_reshape:0' nodes-- these will be our input and 'bottleneck' nodes, respectively, for the transfer learning.
46-
47-
<a href="https://storage.googleapis.com/oscon-tf-workshop-materials/images/incpv3_pool_3_reshape.png" target="_blank"><img src="https://storage.googleapis.com/oscon-tf-workshop-materials/images/incpv3_pool_3_reshape.png" width="500"/></a>
48-
49-
Note: If you should want to write the model graph to a text file to browse it that way, you can use
50-
the `tf.train.write_graph()` method. See [`mnist_hidden.py`](../mnist_series/the_hard_way/mnist_hidden.py) for
51-
a (commented-out) example of how to call it.
52-
53-
## Data sets
54-
55-
We've provided training images for you, but if you want to play around further, you can use any image datasets you like. The training script simply assumes you have a top-level directory containing class-named subdirectories, each containing images for that class. It then infers the classes to be learned from the directory structure.
56-
57-
### The "hugs/no-hugs" data set
58-
59-
For this exercise, we'll use a training set of images that have been sorted into two categories -- whether or not one would want to hug the object in the photo.
60-
(Thanks to Julia Ferraioli for this dataset).
61-
62-
This dataset does not have a large number of images, but as we will see, prediction on new images still works surprisingly well. This shows the power of 'bootstrapping' the pre-trained Inception model.
63-
64-
65-
```sh
66-
$ curl -O https://storage.googleapis.com/oscon-tf-workshop-materials/transfer_learning/hugs_photos.zip
67-
$ unzip hugs_photos.zip
68-
```
69-
70-
### (Or, you can use the Flowers data set if you want)
71-
72-
If you want to do flower classification instead, as with the original tutorial, you can find the data here:
73-
74-
```sh
75-
$ curl -O http://download.tensorflow.org/example_images/flower_photos.tgz
76-
$ tar xzf flower_photos.tgz
77-
```
78-
79-
80-
### Pre-generated 'bottleneck' values for both example datasets
81-
82-
When you run the transfer learning training, you'll first need to generate "bottleneck values" for the images, using the Inception v3 model. (We'll take a look at how that works).
83-
If this process is too time-consuming for the workshop context, you can download the pre-calculated bottleneck files for both the data sets above:
84-
85-
- https://storage.googleapis.com/oscon-tf-workshop-materials/transfer_learning/bottlenecks_hugs.zip
86-
- https://storage.googleapis.com/oscon-tf-workshop-materials/transfer_learning/bottlenecks_flowers.zip
87-
88-
## 2. Run a training session and use the model for prediction
89-
90-
Let's start by training our new model and using the results to make predictions.
91-
92-
### Train the model
93-
94-
```sh
95-
$ python transfer_learning.py --image_dir=hugs_photos --bottleneck_dir=bottlenecks_hugs
96-
```
97-
98-
**Note the name of the model directory** that is created by the script.
99-
100-
### Do prediction using your learned model in an ipython notebook
101-
102-
Start up a jupyter server in this directory as necessary. Select the `transfer_learning_prediction.ipynb` notebook in the listing that comes up.
103-
104-
Find this line:
105-
```
106-
MODEL_DIR = '/tmp/tfmodels/img_classify/your-model-dir'
107-
```
108-
109-
and edit it to point to the model directory used for your training run.
110-
111-
Then, run the notebook.
112-
You should see some predictions made for the images in the `prediction_images` directory!
113-
114-
If you like, you can try adding additional images to that directory, and rerunning the last part of the notebook to find out whether they're more huggable than not.
115-
116-
## 3. A Custom Esimator for the transfer learning model
117-
118-
Before we jump into the coding part of the lab, we'll take a look at `transfer_learning_skeleton.py`.
119-
It has the scaffolding in place for building a custom Estimator to do the transfer learning.
120-
We'll look at how the `fit()`, `evaluate()`, and `predict()` methods are being used.
121-
122-
We'll also take a look at how the Inception model is being loaded and accessed.
123-
124-
## Named Scopes and TensorBoard Summary information
125-
126-
Note that this code includes some examples of use of `tf.name_scope()` when defining nodes, particularly
127-
in the `add_final_training_ops()` function. You'll be able to spot these scope names when you look at the model graph in TensorBoard.
128-
We saw use of `tf.name_scope` earlier in ['mnist_hidden.py'](../mnist_series/the_hard_way/mnist_hidden.py) as well.
129-
130-
The code in `add_final_training_ops()` also includes some examples of defining summary information for TensorBoard (we saw a simple example of doing this in ['mnist_hidden.py'](../mnist_series/the_hard_way/mnist_hidden.py) also).
131-
132-
However, here, as we're wrapping things in an Estimator, we don't need to add an explicit `tf.merge_summary` op-- it will do that for us.
133-
134-
135-
## 4. Exercise: Building the Custom Estimator's model graph
136-
137-
Start with [`transfer_learning_skeleton.py`](transfer_learning_skeleton.py), and complete the `_make_model`
138-
function definition. This function builds the model graph for the custom estimator.
139-
140-
As noted above, the Inception model graph is doing the heavy lifting here. We will just train a new
141-
top layer to identify our new classes: that is, we will just add a new softmax and fully-connected
142-
layer. The input to this layer is the generated "bottleneck" values. The `add_final_training_ops`
143-
function defines this layer, then defines the loss function and the training op.
144-
145-
Then, the `add_evaluation_step` function adds an op to evaluate the accuracy of the results. Add
146-
'loss' and 'accuracy' metrics to the prediction_dict, as per the `METRICS` dict below
147-
`make_model_fn` in the code, which we will then pass to the Estimator's `evaluate()` method.
148-
149-
Then, add support for generating prediction value(s).
150-
See if you can figure out how to derive the index of the highest-value entry in the result
151-
vector, and store that value at the `"index"` key in the `prediction_dict`. As a hint, take a look
152-
at the ops used in `add_evaluation_step()`.
153-
154-
As shown in the skeleton of `_make_model`, be sure to return the prediction dict, the loss, and the
155-
training op. This info sets up the Estimator to handle calls to its `fit()`, `evaluate()`, and
156-
`predict()` methods.
157-
158-
159-
If you get stuck, you can take a peek at `transfer_learning.py`, but try not to do that too soon.
14+
The list of image sources for the images used in the "hugs/no-hugs" training is here:
15+
https://storage.googleapis.com/oscon-tf-workshop-materials/transfer_learning/hugs_photos_sources.csv
