M2DS Alternants - Thematic Research Seminar Course - Spring 2022

Instructors:

  • Binh Nguyen (binguyen@telecom-paris.fr)
  • Anna Korba (anna.korba@ensae.fr)

Practical Information

  • Room: PC102 (between Amphi Arago, Carnot and Monge on the Polytechnique Grand Campus). See also: https://www.polytechnique.edu/mapwize/

  • Time: 9h30-12h15, each Thursday from 07/04 to 30/06/2022.

  • Grading: based on a presentation given at the third session of each topic; the final grade is the average of the four presentations.

  • The first two sessions of each topic consist of a lecture plus a practical Python (Jupyter) notebook for coding.

  • Interaction during class is encouraged: it is better for you to think along, so be prepared for some derivations and questions on the theory, and for tinkering in the practical sessions.

  • Advice: start working on the assigned paper as early as possible; questions will likely come up, and you can discuss them with us during the second session of each topic.

Topics

  • Topic 1 (07, 14 and 21/04): Sparsity learning with the Lasso (see the starter sketch after the paper list)

    Reference: Bühlmann, P., & van de Geer, S. (2011). Statistics for high-dimensional data: Methods, theory and applications. Springer.

    Papers:

    1. (Adaptive Lasso) Zou, H. (2006), ‘The adaptive lasso and its oracle properties’, Journal of the American Statistical Association 101(476), 1418–1429.
    2. Meinshausen, N. (2007), ‘Relaxed lasso’, Computational Statistics & Data Analysis 52, 374–393.
    3. (group lasso) Yuan, M. & Lin, Y. (2006), ‘Model selection and estimation in regression with grouped variables’, Journal of the Royal Statistical Society: Series B 68(1), 49–67.
    4. (lasso for missing data) Loh, P.-L. & Wainwright, M. J. (2012), ‘High-dimensional regression with noisy and missing data: Provable guarantees with nonconvexity’, The Annals of Statistics 40(3), 1637–1664.
    5. (Debiased Lasso) Van de Geer, S., Bühlmann, P., Ritov, Y. & Dezeure, R. (2014), ‘On asymptotically optimal confidence regions and tests for high-dimensional models’, The Annals of Statistics 42(3), 1166–1202.
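
    To get a feel for the practical side before the first notebook, here is a minimal starter sketch of fitting the Lasso on synthetic high-dimensional data. NumPy, scikit-learn, and all the toy numbers are our own illustrative choices, not prescribed course material.

    ```python
    import numpy as np
    from sklearn.linear_model import Lasso

    # Synthetic high-dimensional regression: p > n, only k truly active features.
    rng = np.random.default_rng(0)
    n, p, k = 100, 200, 5
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[:k] = 2.0
    y = X @ beta + 0.5 * rng.standard_normal(n)

    # The L1 penalty shrinks most coefficients exactly to zero (variable selection).
    lasso = Lasso(alpha=0.1).fit(X, y)
    print("selected features:", np.flatnonzero(lasso.coef_))
    ```

    The adaptive lasso (paper 1) can then be emulated by rescaling the columns of X with data-driven weights and running a second fit.
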
  • Topic 2 (28/04, 05 and 12/05): Generative Adversarial Networks (see the starter sketch after the paper list)

    References:

    1. Goodfellow, Ian, et al. “Generative adversarial nets.” Advances in neural information processing systems 27 (2014).
    2. Arjovsky and Bottou. “Towards Principled Methods for Training Generative Adversarial Networks.” ICLR 2017.
    3. Salimans, Tim, et al. “Improved techniques for training GANs.” NeurIPS 2016.
    4. Arjovsky et al. “Wasserstein generative adversarial networks.” ICML 2017.

    Papers:

    1. Reed, Scott, et al. “Generative adversarial text to image synthesis.” International conference on machine learning. PMLR, 2016.
    2. (Conditional GAN) Wang, Ting-Chun, et al. “High-resolution image synthesis and semantic manipulation with conditional GANs.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.
    3. (Style GAN) Karras, Tero, et al. “A style-based generator architecture for generative adversarial networks.” Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019.
    4. (DC-GAN) Radford, Alec, Luke Metz, and Soumith Chintala. “Unsupervised representation learning with deep convolutional generative adversarial networks.” arXiv preprint arXiv:1511.06434 (2015).
    5. (InfoGAN) Chen, Xi, et al. “InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets.” Advances in neural information processing systems 29 (2016).
    6. (LS-GAN) Mao, Xudong, et al. “Least squares generative adversarial networks.” Proceedings of the IEEE international conference on computer vision. 2017.
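
    As a warm-up for the practical sessions, here is a minimal starter sketch of the original GAN objective (reference 1) with the non-saturating generator loss, trained on a toy 1-D Gaussian. PyTorch, the network sizes, and all hyperparameters are our own illustrative choices, not course-mandated.

    ```python
    import torch
    import torch.nn as nn

    # Toy 1-D GAN: G maps 8-d noise to samples, D scores samples as real vs. fake.
    G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
    D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
    bce = nn.BCEWithLogitsLoss()
    ones, zeros = torch.ones(64, 1), torch.zeros(64, 1)

    for step in range(1000):
        real = 2.0 + 0.5 * torch.randn(64, 1)   # "data" distribution: N(2, 0.25)
        fake = G(torch.randn(64, 8))

        # Discriminator update: push D(real) -> 1 and D(fake) -> 0.
        opt_d.zero_grad()
        loss_d = bce(D(real), ones) + bce(D(fake.detach()), zeros)
        loss_d.backward()
        opt_d.step()

        # Generator update (non-saturating loss): push D(fake) -> 1.
        opt_g.zero_grad()
        loss_g = bce(D(fake), ones)
        loss_g.backward()
        opt_g.step()
    ```

    Reference 2 analyses why this loss can be unstable, and reference 4 replaces it with a Wasserstein objective.
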
  • Topic 3 (19/05, 02/06 and 09/06): A brief introduction to Optimal Transport for Machine Learning (see the starter sketch after the paper list)

    References:

    1. Gabriel Peyré and Marco Cuturi (2019), “Computational Optimal Transport: With Applications to Data Science”, Foundations and Trends® in Machine Learning: Vol. 11: No. 5-6, pp 355-607. http://dx.doi.org/10.1561/2200000073
    2. Mémoli, F. (2011). Gromov–Wasserstein Distances and the Metric Approach to Object Matching. Foundations of Computational Mathematics, 11(4), 417–487. https://doi.org/10.1007/s10208-011-9093-5

    Papers:

    1. Kusner, M., Sun, Y., Kolkin, N., & Weinberger, K. (2015, June). From word embeddings to document distances. In International conference on machine learning (pp. 957-966). PMLR.
    2. Demetci, P., Santorella, R., Sandstede, B., Noble, W. S., & Singh, R. (2020). Gromov-Wasserstein optimal transport to align single-cell multi-omics data. bioRxiv.
    3. Courty, N., Flamary, R., Habrard, A., & Rakotomamonjy, A. (2017). Joint distribution optimal transportation for domain adaptation. Advances in Neural Information Processing Systems, 30.
    4. Chapel, L., Alaya, M. Z., & Gasso, G. (2020). Partial optimal transport with applications on positive-unlabeled learning. Advances in Neural Information Processing Systems, 33, 2903-2913.
    5. Genevay, A., Peyré, G., & Cuturi, M. (2018, March). Learning generative models with sinkhorn divergences. In International Conference on Artificial Intelligence and Statistics (pp. 1608-1617). PMLR.
    6. Alvarez-Melis, D., & Jaakkola, T. (2018). Gromov-Wasserstein Alignment of Word Embedding Spaces. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (pp. 1881-1890).
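
    For a first hands-on look, here is a minimal starter sketch comparing the exact OT plan with its entropic (Sinkhorn) approximation on toy point clouds. The POT library (pip install pot) is a common choice but an assumption on our part, not prescribed by the course.

    ```python
    import numpy as np
    import ot  # POT: Python Optimal Transport

    rng = np.random.default_rng(0)
    xs = rng.standard_normal((50, 2))         # source point cloud
    xt = 3.0 + rng.standard_normal((60, 2))   # shifted target point cloud
    a = np.full(50, 1 / 50)                   # uniform source weights
    b = np.full(60, 1 / 60)                   # uniform target weights
    M = ot.dist(xs, xt)                       # pairwise squared Euclidean costs

    plan = ot.emd(a, b, M)                    # exact OT plan (linear program)
    plan_eps = ot.sinkhorn(a, b, M, reg=0.1)  # entropic regularization (Sinkhorn)
    print("exact cost:   ", np.sum(plan * M))
    print("sinkhorn cost:", np.sum(plan_eps * M))
    ```

    Papers 2 and 6 rely instead on the Gromov-Wasserstein variant (see the ot.gromov module in POT), which compares pairwise distances within each space rather than points across spaces.
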
  • Topic 4 (16, 20 and 30/06): Score-based diffusion (generative) models (see the starter sketch after the reference list)

    References:

    1. Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution. In Advances in Neural Information Processing Systems, pp. 11895–11907, 2019
    2. Song, Y., Sohl-Dickstein, J., Kingma, D. P., Kumar, A., Ermon, S., and Poole, B. (2021). Score-based generative modeling through stochastic differential equations. In International Conference on Learning Representations.
    3. Yang Song’s blog post, https://yang-song.github.io/blog/2021/score/, which gives a short explanation of the two papers above.
    4. The Hugging Face annotated (i.e., code-along) version: https://huggingface.co/blog/annotated-diffusion
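
    As a starting point for experimentation, here is a minimal starter sketch of denoising score matching at a single fixed noise level, the training objective at the core of reference 1 (which uses a whole schedule of noise levels). PyTorch and the toy 2-D data are our own illustrative assumptions.

    ```python
    import torch
    import torch.nn as nn

    # Score network: maps a noisy 2-d point to an estimate of the score
    # grad_x log p(x) of the noise-perturbed data distribution.
    score_net = nn.Sequential(nn.Linear(2, 64), nn.SiLU(), nn.Linear(64, 2))
    opt = torch.optim.Adam(score_net.parameters(), lr=1e-3)
    sigma = 0.5  # one fixed noise level, for simplicity

    for step in range(1000):
        x = torch.randn(128, 2) @ torch.tensor([[1.0, 0.8], [0.0, 0.6]])  # toy data
        eps = torch.randn_like(x)
        x_noisy = x + sigma * eps
        # The score of N(x, sigma^2 I) at x_noisy is -(x_noisy - x) / sigma^2
        # = -eps / sigma, so we regress the network output onto that target.
        loss = ((score_net(x_noisy) + eps / sigma) ** 2).sum(dim=1).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    ```

    Sampling would then run (annealed) Langevin dynamics with the learned score, as explained in the blog post above.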