TU Delft
2014/2015 Electrical Engineering, Mathematics and Computer Science Bachelor Computer Science and Engineering
Multimedia Analysis
Responsible Instructor
Name E-mail
Prof.dr. M.A. Larson    M.A.Larson@tudelft.nl
Dr. H.S. Hung    H.Hung@tudelft.nl
Contact Hours / Week x/x/x/x
0/0/4/0 hc; 0/0/4/0 lab
Education Period
Start Education
Exam Period
Course Language
Course Contents
The course provides basic knowledge of, and hands-on experience in, content analysis for the full range of multimedia types, including audio, speech, text, images and video. The emphasis is on techniques needed to develop systems that provide users with a variety of functionalities, e.g., search, organization, discovery and sharing. The course builds on concepts from signal processing (spectrogram analysis and basic audio classification) and image processing (2-D filtering, visual features, segmentation and basic image classification). It introduces new concepts such as text classification, automatic speech recognition and multimodal video indexing. It bridges fundamental techniques and applications that allow users to interact with multimedia collections at a level that goes beyond the signal and encompasses aspects of human interpretation of the meaning of multimedia content.
Study Goals
Multimedia signals:
Remembering: describe different sources of multimedia signals.
characterize the impact of editing on multimedia.
Understanding: differentiate between objective and subjective descriptions of multimedia objects.

Multimedia systems:
Remembering: name the elements of the workflow of a multimedia system (in particular, a search engine).
Remembering: describe the user goals that a multimedia system has been designed to meet and the domain in which it operates.
Understanding/Analysing: establish a simple set of requirements with respect to which a multimedia system can be evaluated.
Understanding: explain the use of basic machine learning techniques in the implementation of multimedia systems.
Applying: carry out simple computations for evaluating multimedia systems.
Analysing: explain challenges of a specific example of multimedia content analysis.
Evaluating: evaluate failure modes of multimedia descriptors or analysis methods.
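As an illustration of the "simple computations for evaluating multimedia systems" goal, the sketch below computes precision, recall and average precision for a ranked result list. The choice of these three measures, the function names and the toy video identifiers are assumptions made for this example; the course itself may use other measures.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) of a retrieved list against a relevant set."""
    relevant = set(relevant)
    hits = sum(1 for item in retrieved if item in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def average_precision(ranked, relevant):
    """Average the precision values measured at the rank of each relevant hit."""
    relevant = set(relevant)
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


# Hypothetical search result (ranked) and ground truth (relevant).
ranked = ["v3", "v7", "v1", "v9"]
relevant = {"v3", "v1", "v5"}

p, r = precision_recall(ranked, relevant)
ap = average_precision(ranked, relevant)
print(f"precision={p:.2f} recall={r:.2f} average precision={ap:.2f}")
```

Average precision rewards systems that place relevant items near the top of the list, which precision and recall on the unordered result set do not capture.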

Audio Analysis:
Remembering: describe and carry out the steps necessary to extract audio features used for audio analysis.
Understanding: explain the advantages of basing audio features on human perception.
Applying: build and test a simple audio classifier.
Analysing: compare and contrast feature extraction methodologies for speech vs. music processing systems.

Speech Analysis:
Remembering: describe the basic elements of human language and its production.
Remembering: name the modules of a speech recognition system and describe their function and interaction.
Applying: build a simple model (e.g., a vector space model) for indexing and retrieving either text documents or speech recognition transcripts.
Evaluating and Analysing: discuss characteristics of speech recognition transcripts and their impact on multimedia systems.
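The vector space model mentioned above can be sketched in a few lines. The example below is a minimal version that assumes whitespace tokenization, TF-IDF weighting and cosine similarity; all function names and the three toy documents are hypothetical and stand in for text documents or speech recognition transcripts.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_index(docs):
    """Return (idf, vectors): IDF weights and one TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vectors = [{t: tf * idf[t] for t, tf in Counter(toks).items()}
               for toks in tokenized]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, idf, vectors):
    """Return indices of matching documents, best match first."""
    counts = Counter(tokenize(query))
    q = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in ranked if score > 0.0]


docs = [
    "speech recognition transcripts contain errors",
    "image features describe visual content",
    "audio features for speech and music",
]
idf, vectors = build_index(docs)
results = search("speech recognition", idf, vectors)
print(results)
```

With real speech recognition transcripts, recognition errors change the term counts, which is one reason transcript characteristics matter for retrieval quality.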

Image analysis:
Remembering: name basic (global and local) visual features and describe their typical uses.
Remembering: describe the steps necessary to extract basic (global and local) visual features and describe their typical uses.
Remembering: describe, set up, and test a simple classifier for visual material.
Applying: implement some basic image descriptors from their corresponding descriptions.
Analysing: compare and contrast different visual features with respect to a specific use case.
Evaluating: hypothesise about how certain visual features may describe objects and depictions of objects in multimedia content.

Video analysis:
Remembering: name basic (global and local) visual features and describe their typical uses.
Understanding: describe the differences between edited and unedited video.
Applying: implement some basic video descriptors from their corresponding mathematical descriptions.
Analysing: compare and contrast different video features or video analysis techniques.
Evaluating: hypothesise about expected feature responses in edited video and experimentally validate this.

Multi-modal analysis:
Remembering: name and explain the difference between high- and low-level feature fusion.
Evaluating: identify the weak points in existing multimedia systems and offer constructive suggestions for improvements, which are then implemented.
Creating: suggest a multimodal approach to an existing single modality solution.
Creating and Evaluating: set up and evaluate a simple system for analyzing multimedia material.
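The difference between low- and high-level fusion can be illustrated with a small sketch. It assumes that low-level fusion means combining the feature vectors of each modality before classification, and that high-level fusion means combining per-modality classifier scores afterwards; this reading, the function names and the toy numbers are assumptions for illustration only.

```python
def low_level_fusion(audio_features, visual_features):
    """Concatenate per-modality feature vectors into one joint vector."""
    return list(audio_features) + list(visual_features)


def high_level_fusion(scores, weights):
    """Combine per-modality classifier scores with a weighted average."""
    total_weight = sum(weights)
    return sum(w * s for w, s in zip(weights, scores)) / total_weight


# Hypothetical features and scores for one video clip.
fused_vector = low_level_fusion([0.2, 0.7], [0.1, 0.9, 0.4])
fused_score = high_level_fusion([0.8, 0.4], [1.0, 1.0])
print(fused_vector, round(fused_score, 2))
```

In the low-level case a single classifier sees all modalities at once; in the high-level case each modality keeps its own classifier and only the decisions are merged.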
Education Method
Lectures and labs
Computer Use
Bring your own laptop. (We expect 8 GB of internal memory plus 32 GB of free disk space. Please contact the instructors if this is a problem.)
Literature and Study Materials
Provided with the lectures.
Assessment
Lab (pass/fail) plus written exam. A make-up lab can be requested only in extreme situations related to the personal health and well-being of the student or immediate family members. Make-ups are not possible once the quarter is completed. A "pass" on the course labs cannot be carried over to future quarters.
Permitted Materials during Tests
The course grade is the final exam grade. (The lab is pass/fail.)