Shot segmentation

Shot segmentation is the automatic identification, by computational methods, of the boundaries of the shots in a video. It consists in automatically locating the edit points originally defined by the director, by measuring discontinuities between successive images of the video. These edit points are obviously known to the director of the video, but are generally not disclosed or made available. To spare a human operator the long and tedious task of locating the shots by viewing the video, computer science researchers have developed automatic methods.

It is the oldest and most studied problem in video indexing, and is considered an essential building block for video analysis and video search. For now there are only a few direct applications of shot segmentation for the general public or in digital video software. It is nevertheless a major step in video analysis, enabling the definition and use of information retrieval techniques for video.

Definition

Shot segmentation consists in determining the different shots of a video. This only makes sense if the video actually contains shots, that is, if it was edited by a director. Certain types of video (video surveillance, home videos…) therefore do not lend themselves to this type of technique. The videos generally considered are films or television programmes.

Shot segmentation is sometimes (incorrectly) called “scene segmentation” by some researchers. Scene segmentation is, however, a different task, which consists in identifying scenes, a scene being defined as a group of shots sharing a certain semantic coherence.

Shot segmentation can also be referred to as the “reverse Hollywood problem”, to stress that it is the inverse operation of editing: the video is deconstructed in order to identify the building blocks filmed by the director, the shots.

Types of transitions between shots

There are many ways to make a transition between two shots. The simplest is the abrupt transition: one shot follows another with no transitional image. To make this passage smoother, directors have devised a wide variety of gradual transitions, such as fades to black, dissolves, wipes, and many others, made ever easier by the use of computers and even by consumer video editing software.

For shot segmentation, researchers generally distinguish only two types: abrupt transitions (also called cuts) and gradual transitions, which include all other types of transitions.

History

The first work on shot segmentation dates back to the early 1990s. It is the oldest of the video indexing tasks, and the most explored.

Another problem, related to the performance of the algorithms, appeared with the earliest research. While detection results for abrupt transitions quickly became fairly good, this was not the case for gradual transitions. The late 1990s and early 2000s therefore saw many articles focusing on the difficulty of detecting gradual transitions.

In 2002, Alan Hanjalic, of Delft University of Technology, published an article with the provocative title “Shot Boundary Detection: Unraveled and Resolved?”.

Methods generally rely on distances (or similarity measures) between observations. Applying such a distance to each pair of successive images, over the whole video stream, produces a one-dimensional signal in which one then looks for peaks (or troughs, if a similarity measure is used), which correspond to moments of strong dissimilarity.
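The idea of a one-dimensional dissimilarity signal can be illustrated with a minimal sketch. The function names `distance_signal` and `find_peaks`, and the simple "local maximum above a threshold" rule, are assumptions made for illustration and are not taken from any particular published method.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def distance_signal(
    frames: Sequence[Frame],
    distance: Callable[[Frame, Frame], float],
) -> List[float]:
    """Apply a distance to every pair of successive frames.

    Entry k of the result is the distance between frame k and frame k + 1.
    """
    return [distance(frames[k], frames[k + 1]) for k in range(len(frames) - 1)]


def find_peaks(signal: Sequence[float], threshold: float) -> List[int]:
    """Return indices of local maxima that exceed the threshold.

    This is a hypothetical, deliberately simple peak rule.
    """
    peaks = []
    for k, value in enumerate(signal):
        left = signal[k - 1] if k > 0 else float("-inf")
        right = signal[k + 1] if k + 1 < len(signal) else float("-inf")
        if value > threshold and value >= left and value > right:
            peaks.append(k)
    return peaks


# Toy example: each "frame" is reduced to a single brightness value.
toy_frames = [1, 1, 1, 9, 9, 9, 2, 2]
toy_signal = distance_signal(toy_frames, lambda a, b: abs(a - b))
toy_cuts = find_peaks(toy_signal, threshold=3)
```

A peak at index k is read here as a candidate transition between frame k and frame k + 1. With a similarity measure, the same sketch would look for troughs instead.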

Observations and distances

The simplest observation is simply the set of pixels of the image. For two images I_1 and I_2 of dimension N×M, the obvious distance is then the mean of the absolute pixel-by-pixel differences (L1 distance):

d(I_1, I_2) = \frac{1}{NM} \sum_{i=1}^{N} \sum_{j=1}^{M} |I_1(x_i, y_j) - I_2(x_i, y_j)|

More refined approaches measure only the significant changes, by filtering out the pixels whose differences are too small and merely add noise.
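As an illustration, the L1 pixel distance and a filtered variant might be sketched as follows with NumPy. The function names and the `min_diff` parameter are hypothetical. The filtered variant shown here (the fraction of pixels whose difference exceeds a fixed value) is only one possible way to ignore small differences.

```python
import numpy as np


def l1_distance(img1, img2) -> float:
    """Mean absolute pixel-by-pixel difference between two images."""
    # Convert to float first so that unsigned 8-bit images do not wrap around.
    a = np.asarray(img1, dtype=np.float64)
    b = np.asarray(img2, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("images must have the same dimensions")
    return float(np.mean(np.abs(a - b)))


def filtered_change_ratio(img1, img2, min_diff: float = 20.0) -> float:
    """Fraction of pixels whose absolute difference exceeds min_diff.

    min_diff is an assumed, hand-picked value; small differences are
    treated as noise and ignored.
    """
    a = np.asarray(img1, dtype=np.float64)
    b = np.asarray(img2, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("images must have the same dimensions")
    return float(np.mean(np.abs(a - b) > min_diff))


# Toy example: a uniform image, a copy with one changed pixel,
# and a copy with a small uniform offset standing in for noise.
base = np.zeros((4, 4))
one_pixel = base.copy()
one_pixel[0, 0] = 160.0
noisy = base + 5.0
```

In this toy example the noisy copy has a non-zero L1 distance to the base image, but a filtered change ratio of zero, since no pixel differs by more than `min_diff`.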

Unfortunately, pixel-domain techniques are very sensitive to object and camera motion. Block matching techniques have been proposed to reduce this sensitivity to motion, but pixel-domain methods have largely been supplanted by histogram-based methods.

The histogram, of luminance or of color, is a very widely used observation. It is easy to compute and relatively robust to noise and to object motion, because a histogram ignores spatial changes in the image. Many computation schemes (on the whole image, on blocks…) and distances (L1, cosine similarity, χ² test…) have been proposed. A comparison of the performance of various observations on varied video content showed that histograms produce stable, good-quality results.
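A minimal sketch of histogram-based comparison is given below, assuming 8-bit grayscale images and NumPy. The function names, the number of bins, and the particular symmetric form of the χ² distance are choices made for this illustration, as several variants exist.

```python
import numpy as np


def gray_histogram(img, bins: int = 64) -> np.ndarray:
    """Normalized luminance histogram of a non-empty 8-bit grayscale image."""
    hist, _ = np.histogram(np.asarray(img), bins=bins, range=(0, 256))
    return hist / hist.sum()


def hist_l1(h1: np.ndarray, h2: np.ndarray) -> float:
    """L1 distance between two histograms."""
    return float(np.abs(h1 - h2).sum())


def hist_chi2(h1: np.ndarray, h2: np.ndarray, eps: float = 1e-12) -> float:
    """One common symmetric chi-square distance (eps avoids division by zero)."""
    return float(np.sum((h1 - h2) ** 2 / (h1 + h2 + eps)))


def hist_cosine(h1: np.ndarray, h2: np.ndarray) -> float:
    """Cosine similarity: 1 for identical directions, 0 for disjoint histograms."""
    return float(np.dot(h1, h2) / (np.linalg.norm(h1) * np.linalg.norm(h2)))


# Toy example: flipping an image moves every pixel but keeps the histogram.
ramp = np.arange(256).reshape(16, 16)
flipped = ramp[::-1, ::-1]
dark = np.zeros((16, 16))
bright = np.full((16, 16), 255)
```

The flipped image illustrates why histograms tolerate motion: the pixels are rearranged in space, yet the histogram distance to the original is zero. The dark and bright images, by contrast, have disjoint histograms.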

Histogram-based methods nevertheless suffer from significant shortcomings: they are not robust to sudden changes in illumination (camera flashes, sunlight…) or to fast motion.

To address these problems, another observation is frequently used: the edges of the image. Edges are detected in each image by an edge detection method and, possibly after registration, compared. This technique is robust to motion as well as to changes in illumination. On the other hand, its complexity is high.
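The following sketch illustrates the principle of comparing edges, under strong simplifying assumptions: edges are obtained by thresholding the gradient magnitude rather than by a full edge detector, no registration step is performed, and the names `edge_map` and `edge_change_fraction` as well as the threshold value are hypothetical.

```python
import numpy as np


def edge_map(img, thresh: float = 30.0) -> np.ndarray:
    """Boolean map of pixels whose gradient magnitude exceeds thresh."""
    gy, gx = np.gradient(np.asarray(img, dtype=np.float64))
    return np.hypot(gx, gy) > thresh


def edge_change_fraction(img1, img2, thresh: float = 30.0) -> float:
    """Fraction of edge pixels present in only one of the two images."""
    e1 = edge_map(img1, thresh)
    e2 = edge_map(img2, thresh)
    union = np.logical_or(e1, e2).sum()
    if union == 0:
        return 0.0  # no edges in either image
    return float(np.logical_xor(e1, e2).sum() / union)


# Toy example: a vertical step edge, the same image under brighter
# lighting, and an image with a horizontal step edge instead.
vertical = np.zeros((8, 8))
vertical[:, 4:] = 200.0
brighter = vertical + 40.0
horizontal = np.zeros((8, 8))
horizontal[4:, :] = 200.0
```

A uniform brightness offset leaves the gradient unchanged, so the edge maps of `vertical` and `brighter` coincide, which illustrates the robustness to illumination changes. Detecting and comparing edges on every frame is also noticeably more expensive than computing a histogram.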

Other observations have been proposed: characterization of camera motion, detection in the compressed domain from DCT coefficients, or a combination of observations, for example intensity and motion.

A more satisfactory method is to determine the threshold value from an estimate of the distribution of discontinuities. The distribution is assumed to be Gaussian, \mathcal{N}(\mu, \sigma^2), and the threshold is defined as s = \mu + r\sigma, where r is used to adjust the number of false alarms.
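As a sketch, the threshold can be computed from the mean and standard deviation of the discontinuity signal. Estimating both over the whole signal, peaks included, is a simplification assumed here; the function names are hypothetical.

```python
import statistics
from typing import List, Sequence


def gaussian_threshold(signal: Sequence[float], r: float = 3.0) -> float:
    """Threshold mu + r * sigma estimated from the discontinuity values."""
    mu = statistics.mean(signal)
    sigma = statistics.pstdev(signal)
    return mu + r * sigma


def detect_cuts(signal: Sequence[float], r: float = 3.0) -> List[int]:
    """Indices whose discontinuity value exceeds the estimated threshold."""
    s = gaussian_threshold(signal, r)
    return [k for k, d in enumerate(signal) if d > s]


# Toy example: a flat signal with a single strong discontinuity.
toy_discontinuities = [1.0] * 19 + [21.0]
toy_threshold = gaussian_threshold(toy_discontinuities, r=3.0)
toy_detections = detect_cuts(toy_discontinuities, r=3.0)
```

Raising r makes the detector more conservative (fewer false alarms, more missed transitions), and lowering it has the opposite effect.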

An approach with a better theoretical foundation is to use decision theory.

A very different method was developed by Truong et al., who propose to make a global decision rather than a local one, by trying to find the optimal segmentation over the whole video considered. The authors adopt an approach based on maximum a posteriori estimation, in order to find the segmentation that maximizes P(S|O), the probability that the segmentation S is optimal given the observations O. To avoid an exhaustive exploration of all possible segmentations, a dynamic programming technique is used.
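The role of dynamic programming in a global decision can be illustrated with a generic sketch. This is not the model of Truong et al.: it assumes, for illustration only, that each frame is summarized by one number, that values within a shot scatter around a constant (so the cost of a shot is its sum of squared deviations), and that each shot incurs a fixed `penalty` playing the role of a prior on the number of shots.

```python
from typing import List, Sequence


def segment_globally(values: Sequence[float], penalty: float) -> List[int]:
    """Return the boundaries of the minimum-cost segmentation.

    A boundary b means that a new shot starts at frame b.
    Runs in O(n^2) time instead of enumerating all 2^(n-1) segmentations.
    """
    n = len(values)
    # Prefix sums give the cost of any segment in constant time.
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for k, v in enumerate(values):
        s1[k + 1] = s1[k] + v
        s2[k + 1] = s2[k] + v * v

    def cost(i: int, j: int) -> float:
        """Sum of squared deviations from the mean over frames i..j-1."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    # best[j] is the minimum cost of segmenting the first j frames.
    best = [0.0] + [float("inf")] * n
    prev = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            candidate = best[i] + cost(i, j) + penalty
            if candidate < best[j]:
                best[j] = candidate
                prev[j] = i

    # Walk back through the stored choices to recover the boundaries.
    boundaries = []
    j = n
    while j > 0:
        i = prev[j]
        if i > 0:
            boundaries.append(i)
        j = i
    return sorted(boundaries)


# Toy example: three "shots" with constant brightness 10, 50 and 20.
toy_values = [10] * 5 + [50] * 4 + [20] * 6
toy_boundaries = segment_globally(toy_values, penalty=100.0)
```

The decision is global in the sense that every boundary is chosen jointly with all the others, rather than by testing each frame pair in isolation.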

Improvements

The methods described above are not always effective at detecting gradual transitions. Heng et al.

Dissolves are particularly difficult to detect, and some work focuses on this task alone.

Another major issue is that of sudden changes in illumination: flashes, spotlights, the sun appearing or disappearing… Specific methods have been developed to reduce the false alarms caused by these events, with the help of edge detection.

The complexity of the methods has also been evaluated, and varies widely between algorithms, from 20 times faster than real time to more than 20 times slower.

Applications include the analysis of sports videos.

Some video editing software, for example Windows Movie Maker and VirtualDub, uses shot segmentation to generate a preliminary cut for the user, which makes non-linear editing simple. For film enthusiasts interested in film analysis, these techniques can be useful for automatically determining the number of shots in a film and their location.

Shot segmentation is also used in image restoration techniques, to correct defects inherent in shot changes, such as calibration echoes and image distortions.
