Generative, Multimodal AI

Credits

6 ECTS - 36h

Instructors

Karteek Alahari, Xavier Alameda-Pineda, Ahlame Douzal, Eric Gaussier, Georges Quénot and Didier Schwab

Description

The course is split into two parts. The first part covers a wide range of machine learning algorithms. The second part focuses on deep learning, with more applied presentations on three data modalities (auditory, natural language, and visual data) and their combinations. The following is a non-exhaustive list of topics:

Computing dot products in high dimension & PageRank
Matrix completion/factorization (Stochastic Gradient Descent, SVD)
Monte Carlo and MCMC methods: Metropolis-Hastings and Gibbs sampling
Unsupervised classification: Partitioning, Hierarchical, Kernel and Spectral clustering
Alignment and matching algorithms (local/global, pairwise/multiple), dynamic programming, Hungarian algorithm,…
Introduction to Deep Learning concepts, including CNNs, RNNs, and metric learning
Attention models: Self-attention, Transformers
Auditory data: Representation, sound source localisation and separation
Natural language data: Representation, Seq2Seq, Word2Vec, Machine Translation, Pre-training strategies, Benchmarks and evaluation
Visual data: image and video representation, recap of traditional features, state-of-the-art neural architectures for feature extraction
Object detection and recognition, action recognition
Multimodal learning: audio-visual data representation, multimedia retrieval
Generative Adversarial Networks: image-to-image translation, conditional generation

Assessment

Final exam

Data Science