Shot classification in broadcast soccer video.
Event understanding, the automatic generation of human-relatable event descriptions from video sequences, is an open problem in computer vision research with many applications in the sports domain, such as indexing and retrieval systems for sports video. Background modelling and shot classification of broadcast video are important steps in event understanding in video sequences. Shot classification seeks to label shots, i.e. continuous frame sequences captured by a single camera action, with classes such as long shot, close-up and audience shot, while background modelling seeks to classify pixels in an image as foreground or background. Many features used for shot classification are built upon the background model; background modelling is therefore an essential part of shot classification. This dissertation reports on an investigation into techniques and procedures for background modelling and classification of shots in broadcast soccer videos. Broadcast video refers to video which would typically be viewed by a person at home on their television set, and it imposes constraints that many approaches to event detection do not consider. In this work we analyse the performance of two background modelling techniques appropriate for broadcast video: the colour distance model and the Gaussian mixture model. The performance of the background models depends on correctly set parameters. Some techniques offer better updating schemes and thus adapt better to the changing conditions of a game, while others are shown to be more robust to changes in broadcast technique and are therefore of greater value in shot classification. Our results show that the colour distance model slightly outperformed the Gaussian mixture model, with both techniques performing similarly to those reported in the literature. Many features useful for shot classification are proposed in the literature.
This dissertation identifies these features and presents a detailed analysis and comparison of various features appropriate for shot classification in broadcast soccer video. Once a feature set is established, a classifier is required to determine a shot class based on the extracted features. We determine how best to use the feature set, and which decision tree parameters give the best performance, and then use a combined feature set to train a neural network to classify shots. The combined feature set in conjunction with the neural network classifier proved effective in classifying shots and in some situations outperformed techniques reported in the literature.
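
As an illustration only, the following minimal sketch shows one common reading of a colour distance background model: a pixel is labelled background when its Euclidean distance in colour space to a reference colour falls within a threshold. The reference estimate (a mean of sample pixels), the threshold value, the function names and the toy pixel values are all assumptions made for this example and are not taken from the dissertation. The resulting background ratio is shown as an example of a feature built upon the background model.

```python
import math

def mean_colour(pixels):
    """Average (r, g, b) of sample pixels; assumed here to be the background reference."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))

def is_background(pixel, reference, threshold):
    """Label a pixel as background if its colour distance to the reference is within threshold."""
    return math.dist(pixel, reference) <= threshold

def background_ratio(pixels, reference, threshold):
    """Fraction of pixels labelled background; a hypothetical shot-classification feature."""
    return sum(is_background(p, reference, threshold) for p in pixels) / len(pixels)

# Toy data (hypothetical): three samples assumed to lie on the playing field.
field_samples = [(30, 140, 40), (34, 150, 44), (32, 145, 42)]
reference = mean_colour(field_samples)

# A four-pixel "frame": two field-like pixels, one red and one white pixel.
frame = [(31, 142, 41), (33, 148, 43), (200, 30, 30), (250, 250, 250)]
ratio = background_ratio(frame, reference, threshold=30.0)
print(reference, ratio)
```

In a sketch like this, a high background ratio would suggest a wide view of the field and a low ratio a close-up or audience view, but any such decision rule and its thresholds would need to be tuned on real data.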