Tensorboard and summaries

Tensorboard is a very convenient tool to visualize your graph, training evolution, and results. You can add summary operations to your graph that, during session execution, will output tensor values to an event file.

Tensorboard runs in parallel and regularly (every 30s by default) reads the event file and displays its contents accordingly.

In [1]:
import os
import glob
from datetime import datetime
import tensorflow as tf

First, let's create a small input pipeline with tf.data.Dataset (you can ignore this and use placeholders for your inputs instead; it's just a fancy way to have everything "in-graph").

In [11]:
## Note Dataset available in tf.data in Tensorflow 1.4
## Otherwise, can also be found in tf.contrib.data in Tensorflow 1.3
def get_inputs_queue(filenames, 
                     preprocess_inputs=None,
                     batch_size=32,
                     num_threads=1):
    """Builds a tf.data.Dataset input pipeline and returns a batch of inputs read from a list of files.
    
    Args:
      filenames: List of image files to read the input from.
      preprocess_inputs: Image preprocessing function.
      batch_size: Batch size.
      num_threads: Number of readers for the batch queue.
      
    Returns: 
      A batch of (image, filename) pairs from the Dataset iterator.
    """
    
    ## Load image from file and preprocess
    def parsing_function(filename):
        image = tf.read_file(filename)
        image = tf.image.decode_image(image)
        if preprocess_inputs is not None:
            image = preprocess_inputs(image)
        return image, filename
    
    # Create dataset object
    dataset = tf.data.Dataset.from_tensor_slices(filenames)
    dataset = dataset.map(parsing_function, num_parallel_calls=num_threads)
    dataset = dataset.shuffle(10)
    dataset = dataset.repeat()
    dataset = dataset.batch(batch_size)
    dataset = dataset.prefetch(1)
    iterator = dataset.make_one_shot_iterator()
    
    return iterator.get_next()

def preprocess_inputs(image, size=128):
    """Preprocess input images by cropping and resizing them to a square size.
    
    Args:
      image: A 3-D image Tensor.
      size: The square size to resize image to.
      
    Returns:
      The preprocessed image.
    """ 
    # Central crop to minimal side
    height = tf.shape(image)[0]
    width = tf.shape(image)[1]
    min_side = tf.minimum(height, width)
    offset_height = (height - min_side) // 2
    offset_width = (width - min_side) // 2
    
    # Crop
    image = tf.image.crop_to_bounding_box(
        image, offset_height, offset_width, min_side, min_side)
    
    # Resize
    if size is not None and size > 0:
        image = tf.image.resize_images(image, (size, size))
        image = tf.reshape(image, (size, size, 3))
        
    return image
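The crop-offset arithmetic above is easy to sanity-check in plain Python. The sketch below mirrors it with ordinary integers (`central_crop_box` is a hypothetical helper written for illustration, not part of the pipeline):

```python
def central_crop_box(height, width):
    """Plain-Python mirror of the crop offsets computed with tf ops above."""
    min_side = min(height, width)            # crop to the smaller side
    offset_height = (height - min_side) // 2  # center the crop vertically
    offset_width = (width - min_side) // 2    # center the crop horizontally
    return offset_height, offset_width, min_side

# A 480x640 (height x width) image keeps its central 480x480 square:
print(central_crop_box(480, 640))  # prints (0, 80, 480)
```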

Now let's create a small graph that just loads these images and computes their average color value, per batch and per pixel. We will use Tensorboard to display:

  • [image] The input images in the current batch
  • [text] The names of the input files the images are loaded from
  • [scalar] The average pixel value for the red, green and blue channels
In [13]:
# Batch size
BATCH_SIZE = 5
# Bases path of images to load
BASE_PATH = "/home/aroyer/Data/Pascal VOC 2006/voc2006_train/VOC2006/PNGImages/*.png"

with tf.Graph().as_default():  
    
    ## Inputs: Queue loading batch of input images (BATCH_SIZE, width, height, 3)
    # Could be replaced by a placeholder you feed manually
    images, filenames = get_inputs_queue(glob.glob(BASE_PATH),
                                         preprocess_inputs=preprocess_inputs,
                                         batch_size=BATCH_SIZE)
    
    ## Tensorboard image summary
    # Given a Tensor of N images, will display the `max_outputs` first ones
    # collection keyword is used to arrange the summary in Tensorboard
    tf.summary.image('image', images, max_outputs=2, family='inputs')
    
    ## Tensorboard text summary
    # Let's display the names of the files we loaded for this batch
    # (string in markdown format)
    summary_str = tf.reduce_join(filenames, 0, separator='\n\t')
    tf.summary.text('file_names', summary_str)
    
    ## Tensorboard scalar summary 
    # Here let's just display the average pixel values
    avgs = tf.reduce_mean(images, axis=(0, 1, 2))
    tf.summary.scalar('red', avgs[0], family='average')
    tf.summary.scalar('green', avgs[1], family='average')
    tf.summary.scalar('blue', avgs[2], family='average')
    
    ## For convenience: merge summaries in one operation in the graph
    # Note: by default, the session graph is also displayed; no need to add an operation for it
    # Note: you can also specify collections to group different summaries together
    summary_op = tf.summary.merge_all()
    
    with tf.Session() as sess:
        ## Create summary writer for given log directory
        # use sub directories for clean organization
        LOG_DIR = os.path.join('.', datetime.now().strftime("%m-%d_%H-%M"))
        summary_writer = tf.summary.FileWriter(LOG_DIR, sess.graph)
        
        ## Compute the summaries and write them to the event file
        num_steps = 50
        for step in range(num_steps):
            print('\r step %d/%d' % (step + 1, num_steps), end='')
            summary_op_ = sess.run(summary_op)
            if step % 10 == 0:
                summary_writer.add_summary(summary_op_, global_step=step)
                
        ## As always, close files
        summary_writer.close()
 step 50/50
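For reference, the tf.reduce_mean(images, axis=(0, 1, 2)) call above averages over the batch, height and width axes, leaving one value per channel. A NumPy sketch of the same reduction on dummy data (not the actual image batch):

```python
import numpy as np

# Dummy batch: 5 images of 4x4 pixels, 3 channels, each channel constant
batch = np.zeros((5, 4, 4, 3), dtype=np.float32)
batch[..., 0] = 0.2  # red
batch[..., 1] = 0.5  # green
batch[..., 2] = 0.8  # blue

# Same reduction as tf.reduce_mean(images, axis=(0, 1, 2)):
avgs = batch.mean(axis=(0, 1, 2))
print(avgs)  # [0.2 0.5 0.8]
```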

Once this is run, it should create an event file in the LOG_DIR directory. This can then be visualised by:

  • running Tensorboard: tensorboard --logdir="path_to_log_directory" --port=8888
  • navigating to localhost:8888 in your browser