This thesis presents a deep neural network model that augments an existing semantic
image segmentation model with optical flow data to improve segmentation performance
on video sequences. Three network topologies combining optical flow data
layers with RGB data layers are compared. The best-performing model, FlowSeg-A,
achieves an average per-class accuracy of 72.696% on the SegNet test set. This
is an improvement of 4.8 percentage points over SegNet, the RGB-only segmentation
model on which FlowSeg-A is based. The main accuracy improvements come
from the classes SignSymbol (15.4% improvement), Bicyclist (10.2%), and Pole
(9.0%). These gains require only 1,152 (0.005%) more parameters than SegNet,
and FlowSeg-A is trained with the same training set and training schedule as
SegNet.
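
The headline metric above, average per-class accuracy, can be illustrated with a minimal sketch. It assumes the common definition: the fraction of correctly labelled pixels is computed separately for each ground-truth class and the per-class fractions are then averaged with equal weight. The function name, the `ignore_label` parameter, and the toy labels are hypothetical and are not taken from the thesis or from the SegNet code.

```python
from collections import defaultdict


def average_per_class_accuracy(true_labels, predicted_labels, ignore_label=None):
    """Return (mean accuracy over classes, dict of per-class accuracies).

    Assumed definition: for each class that occurs in the ground truth,
    accuracy = correctly predicted pixels / ground-truth pixels of that class.
    Every class counts equally in the mean, regardless of its pixel count.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, prediction in zip(true_labels, predicted_labels):
        if truth == ignore_label:
            continue  # skip unlabelled pixels, if such a label is used
        total[truth] += 1
        if truth == prediction:
            correct[truth] += 1
    if not total:
        raise ValueError("no labelled pixels to evaluate")
    per_class = {cls: correct[cls] / total[cls] for cls in total}
    mean_accuracy = sum(per_class.values()) / len(per_class)
    return mean_accuracy, per_class


# Toy example: flattened pixel labels for a rare class and a frequent class.
truth = ["Pole", "Pole", "Road", "Road", "Road", "Road"]
predicted = ["Pole", "Road", "Road", "Road", "Road", "Road"]
mean_accuracy, per_class = average_per_class_accuracy(truth, predicted)
```

In the toy example, five of six pixels are correct, yet the mean over classes is lower because half of the Pole pixels are missed. Under this definition, gains on small classes such as SignSymbol or Pole move the average as much as gains on large classes do.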