|
| 1 | +{ |
| 2 | + "cells": [ |
| 3 | + { |
| 4 | + "cell_type": "markdown", |
| 5 | + "id": "40d83007", |
| 6 | + "metadata": { |
| 7 | + "editable": true |
| 8 | + }, |
| 9 | + "source": [ |
| 10 | + "<!-- HTML file automatically generated from DocOnce source (https://github.com/doconce/doconce/)\n", |
| 11 | + "doconce format html Generative.do.txt -->\n", |
| 12 | + "<!-- dom:TITLE: Project 1 -->" |
| 13 | + ] |
| 14 | + }, |
| 15 | + { |
| 16 | + "cell_type": "markdown", |
| 17 | + "id": "64f1a45e", |
| 18 | + "metadata": { |
| 19 | + "editable": true |
| 20 | + }, |
| 21 | + "source": [ |
| 22 | + "# Project 1\n", |
| 23 | + "**[FYS5429/9429](https://www.uio.no/studier/emner/matnat/fys/FYS5429/index-eng.html), Advanced machine learning and data analysis for the physical sciences, University of Oslo, Norway**\n", |
| 24 | + "\n", |
| 25 | + "Date: **Spring semester 2026, deadline March 20**" |
| 26 | + ] |
| 27 | + }, |
| 28 | + { |
| 29 | + "cell_type": "markdown", |
| 30 | + "id": "76e96c03", |
| 31 | + "metadata": { |
| 32 | + "editable": true |
| 33 | + }, |
| 34 | + "source": [ |
| 35 | + "# Discriminative and Generative Deep Learning Models: A Mathematical and Computational Study" |
| 36 | + ] |
| 37 | + }, |
| 38 | + { |
| 39 | + "cell_type": "markdown", |
| 40 | + "id": "ea4f0d35", |
| 41 | + "metadata": { |
| 42 | + "editable": true |
| 43 | + }, |
| 44 | + "source": [ |
| 45 | + "## Project Overview\n", |
| 46 | + "\n", |
| 47 | + "Deep learning methods can broadly be divided into **discriminative\n", |
| 48 | + "models**, which learn decision boundaries for labeled data, and\n", |
| 49 | + "**generative models**, which learn probability distributions over\n", |
| 50 | + "data. Convolutional neural networks (CNNs) dominate modern\n", |
| 51 | + "classification tasks, while generative models such as variational\n", |
| 52 | + "autoencoders (VAEs), Boltzmann machines, and diffusion models provide\n", |
| 53 | + "probabilistic descriptions of data and enable synthesis, uncertainty\n", |
| 54 | + "quantification, and representation learning.\n", |
| 55 | + "\n", |
| 56 | + "The goal of this project is to develop a unified mathematical and\n", |
| 57 | + "computational understanding of these model classes. Students will\n", |
| 58 | + "analyze classification and generative learning as optimization\n", |
| 59 | + "problems over high-dimensional function spaces, emphasizing\n", |
| 60 | + "probabilistic modeling, variational principles, and numerical\n", |
| 61 | + "optimization." |
| 62 | + ] |
| 63 | + }, |
| 64 | + { |
| 65 | + "cell_type": "markdown", |
| 66 | + "id": "3ce8a645", |
| 67 | + "metadata": { |
| 68 | + "editable": true |
| 69 | + }, |
| 70 | + "source": [ |
| 71 | + "## Classification with Convolutional Neural Networks\n", |
| 72 | + "\n", |
| 73 | + "A convolutional neural network defines a parametric mapping" |
| 74 | + ] |
| 75 | + }, |
| 76 | + { |
| 77 | + "cell_type": "markdown", |
| 78 | + "id": "23e3e4db", |
| 79 | + "metadata": { |
| 80 | + "editable": true |
| 81 | + }, |
| 82 | + "source": [ |
| 83 | + "<!-- Equation labels as ordinary links -->\n", |
| 84 | + "<div id=\"_auto1\"></div>\n", |
| 85 | + "\n", |
| 86 | + "$$\n", |
| 87 | + "\\begin{equation}\n", |
| 88 | + "f_\\theta : \\mathbb{R}^{H \\times W \\times C} \\rightarrow \\{1,\\dots,K\\},\n", |
| 89 | + "\\label{_auto1} \\tag{1}\n", |
| 90 | + "\\end{equation}\n", |
| 91 | + "$$" |
| 92 | + ] |
| 93 | + }, |
| 94 | + { |
| 95 | + "cell_type": "markdown", |
| 96 | + "id": "bc029c6e", |
| 97 | + "metadata": { |
| 98 | + "editable": true |
| 99 | + }, |
| 100 | + "source": [ |
| 101 | + "where inputs are structured data (e.g., images of height $H$, width $W$, and $C$ channels) and outputs are one of $K$ class labels.\n", |
| 102 | + "\n", |
| 103 | + "Mathematically, CNNs combine:\n", |
| 104 | + "* Convolutional linear operators with local receptive fields,\n", |
| 105 | + "\n", |
| 106 | + "* Nonlinear activation functions,\n", |
| 107 | + "\n", |
| 108 | + "* Pooling and subsampling operations.\n", |
| 109 | + "\n", |
| 110 | + "The analysis should focus on:\n", |
| 111 | + "* Convolutions as structured sparse linear maps,\n", |
| 112 | + "\n", |
| 113 | + "* Translation equivariance and symmetry reduction,\n", |
| 114 | + "\n", |
| 115 | + "* Parameter sharing and its effect on sample complexity.\n", |
| 116 | + "\n", |
| 117 | + "The classification problem is formulated as empirical risk minimization with cross-entropy loss," |
| 118 | + ] |
| 119 | + }, |
| 120 | + { |
| 121 | + "cell_type": "markdown", |
| 122 | + "id": "58d50ff6", |
| 123 | + "metadata": { |
| 124 | + "editable": true |
| 125 | + }, |
| 126 | + "source": [ |
| 127 | + "<!-- Equation labels as ordinary links -->\n", |
| 128 | + "<div id=\"_auto2\"></div>\n", |
| 129 | + "\n", |
| 130 | + "$$\n", |
| 131 | + "\\begin{equation}\n", |
| 132 | + "\\mathcal{L}_{\\text{clf}}(\\theta) = -\\frac{1}{N}\\sum_{i=1}^N \\log p_\\theta(y_i \\mid x_i),\n", |
| 133 | + "\\label{_auto2} \\tag{2}\n", |
| 134 | + "\\end{equation}\n", |
| 135 | + "$$" |
| 136 | + ] |
| 137 | + }, |
| 138 | + { |
| 139 | + "cell_type": "markdown", |
| 140 | + "id": "fef66833", |
| 141 | + "metadata": { |
| 142 | + "editable": true |
| 143 | + }, |
| 144 | + "source": [ |
| 145 | + "and optimized using stochastic gradient descent." |
| 146 | + ] |
| 147 | + }, |
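| | + { |
| | + "cell_type": "markdown", |
| | + "id": "c1a0f3e2", |
| | + "metadata": { |
| | + "editable": true |
| | + }, |
| | + "source": [ |
| | + "The statements above can be made concrete with a short worked sketch. The symbols $k$ (a filter of length $m$), $T_s$ (a cyclic shift by $s$), $a_\\theta(x)$ (the network logits), $\\eta$ (the learning rate), and $B_t$ (a mini-batch) are illustrative assumptions and are not fixed by the project text.\n", |
| | + "\n", |
| | + "In one dimension with periodic boundary conditions (indices taken modulo the signal length), a convolutional layer, written here as a cross-correlation as is common in deep learning libraries, acts as a banded matrix $C_k$ with only $m$ free parameters, and it commutes with shifts:\n", |
| | + "\n", |
| | + "$$\n", |
| | + "(C_k x)_i = \\sum_{j=0}^{m-1} k_j\\, x_{i+j}, \\qquad (T_s x)_i = x_{i+s}, \\qquad C_k T_s = T_s C_k.\n", |
| | + "$$\n", |
| | + "\n", |
| | + "One common way to obtain $p_\\theta(y \\mid x)$ in Eq. (2) is a softmax over logits $a_\\theta(x) \\in \\mathbb{R}^K$, with the classifier of Eq. (1) given by the most probable class:\n", |
| | + "\n", |
| | + "$$\n", |
| | + "p_\\theta(y = k \\mid x) = \\frac{e^{a_{\\theta,k}(x)}}{\\sum_{j=1}^{K} e^{a_{\\theta,j}(x)}}, \\qquad f_\\theta(x) = \\arg\\max_{k}\\, p_\\theta(y = k \\mid x).\n", |
| | + "$$\n", |
| | + "\n", |
| | + "A plain mini-batch stochastic gradient descent step then reads\n", |
| | + "\n", |
| | + "$$\n", |
| | + "\\theta_{t+1} = \\theta_t - \\eta\\, \\nabla_\\theta \\left[ -\\frac{1}{|B_t|} \\sum_{i \\in B_t} \\log p_\\theta(y_i \\mid x_i) \\right].\n", |
| | + "$$" |
| | + ] |
| | + }, |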
| 148 | + { |
| 149 | + "cell_type": "markdown", |
| 150 | + "id": "e1f3bf5a", |
| 151 | + "metadata": { |
| 152 | + "editable": true |
| 153 | + }, |
| 154 | + "source": [ |
| 155 | + "## Probabilistic Generative Modeling\n", |
| 156 | + "\n", |
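| 157 | + "Generative models aim to learn an approximation $p_\\theta(x)$ to an unknown data distribution. This project considers three complementary paradigms: variational autoencoders, Boltzmann machines, and (optionally) diffusion models. The first two are described below." |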
| 158 | + ] |
| 159 | + }, |
| 160 | + { |
| 161 | + "cell_type": "markdown", |
| 162 | + "id": "35e9c4ed", |
| 163 | + "metadata": { |
| 164 | + "editable": true |
| 165 | + }, |
| 166 | + "source": [ |
| 167 | + "### Variational Autoencoders\n", |
| 168 | + "\n", |
| 169 | + "VAEs introduce latent variables $z$ and define" |
| 170 | + ] |
| 171 | + }, |
| 172 | + { |
| 173 | + "cell_type": "markdown", |
| 174 | + "id": "a9571bdf", |
| 175 | + "metadata": { |
| 176 | + "editable": true |
| 177 | + }, |
| 178 | + "source": [ |
| 179 | + "<!-- Equation labels as ordinary links -->\n", |
| 180 | + "<div id=\"_auto3\"></div>\n", |
| 181 | + "\n", |
| 182 | + "$$\n", |
| 183 | + "\\begin{equation}\n", |
| 184 | + "p_\\theta(x,z) = p_\\theta(x \\mid z)p(z),\n", |
| 185 | + "\\label{_auto3} \\tag{3}\n", |
| 186 | + "\\end{equation}\n", |
| 187 | + "$$" |
| 188 | + ] |
| 189 | + }, |
| 190 | + { |
| 191 | + "cell_type": "markdown", |
| 192 | + "id": "7597482a", |
| 193 | + "metadata": { |
| 194 | + "editable": true |
| 195 | + }, |
| 196 | + "source": [ |
| 197 | + "with training based on variational inference.\n", |
| 198 | + "\n", |
| 199 | + "Here you will derive the evidence lower bound (ELBO)," |
| 200 | + ] |
| 201 | + }, |
| 202 | + { |
| 203 | + "cell_type": "markdown", |
| 204 | + "id": "85c6b25f", |
| 205 | + "metadata": { |
| 206 | + "editable": true |
| 207 | + }, |
| 208 | + "source": [ |
| 209 | + "<!-- Equation labels as ordinary links -->\n", |
| 210 | + "<div id=\"_auto4\"></div>\n", |
| 211 | + "\n", |
| 212 | + "$$\n", |
| 213 | + "\\begin{equation}\n", |
| 214 | + "\\mathcal{L}_{\\text{VAE}} = \\mathbb{E}_{q_\\phi(z\\mid x)}[\\log p_\\theta(x\\mid z)] - \\mathrm{KL}(q_\\phi(z\\mid x)\\|p(z)),\n", |
| 215 | + "\\label{_auto4} \\tag{4}\n", |
| 216 | + "\\end{equation}\n", |
| 217 | + "$$" |
| 218 | + ] |
| 219 | + }, |
| 220 | + { |
| 221 | + "cell_type": "markdown", |
| 222 | + "id": "0b52b9f6", |
| 223 | + "metadata": { |
| 224 | + "editable": true |
| 225 | + }, |
| 226 | + "source": [ |
| 227 | + "and analyze:\n", |
| 228 | + "* Encoder-decoder architectures,\n", |
| 229 | + "\n", |
| 230 | + "* Reparameterization trick and differentiability,\n", |
| 231 | + "\n", |
| 232 | + "* Trade-offs between reconstruction accuracy and latent regularization." |
| 233 | + ] |
| 234 | + }, |
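| | + { |
| | + "cell_type": "markdown", |
| | + "id": "d2b1e4f3", |
| | + "metadata": { |
| | + "editable": true |
| | + }, |
| | + "source": [ |
| | + "The reparameterization trick and the regularization term can be written out explicitly under a common modeling choice. This is a sketch that assumes a diagonal Gaussian encoder $q_\\phi(z \\mid x) = \\mathcal{N}(\\mu_\\phi(x), \\mathrm{diag}(\\sigma_\\phi^2(x)))$ with latent dimension $d$ and a standard normal prior $p(z) = \\mathcal{N}(0, I)$; neither choice is prescribed by the project text.\n", |
| | + "\n", |
| | + "$$\n", |
| | + "z = \\mu_\\phi(x) + \\sigma_\\phi(x) \\odot \\epsilon, \\qquad \\epsilon \\sim \\mathcal{N}(0, I),\n", |
| | + "$$\n", |
| | + "\n", |
| | + "so that the randomness is moved to $\\epsilon$ and gradients with respect to $\\phi$ can pass through $z$. Under the same assumptions, the KL term in Eq. (4) has the closed form\n", |
| | + "\n", |
| | + "$$\n", |
| | + "\\mathrm{KL}(q_\\phi(z\\mid x)\\|p(z)) = \\frac{1}{2}\\sum_{j=1}^{d}\\left(\\mu_{\\phi,j}^2(x) + \\sigma_{\\phi,j}^2(x) - \\log \\sigma_{\\phi,j}^2(x) - 1\\right).\n", |
| | + "$$\n", |
| | + "\n", |
| | + "A useful check on the derivation of the ELBO is the identity\n", |
| | + "\n", |
| | + "$$\n", |
| | + "\\log p_\\theta(x) = \\mathcal{L}_{\\text{VAE}} + \\mathrm{KL}(q_\\phi(z\\mid x)\\|p_\\theta(z\\mid x)) \\geq \\mathcal{L}_{\\text{VAE}},\n", |
| | + "$$\n", |
| | + "\n", |
| | + "which shows that Eq. (4) is a quantity to be maximized, in contrast to the loss in Eq. (2)." |
| | + ] |
| | + }, |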
| 235 | + { |
| 236 | + "cell_type": "markdown", |
| 237 | + "id": "786bf678", |
| 238 | + "metadata": { |
| 239 | + "editable": true |
| 240 | + }, |
| 241 | + "source": [ |
| 242 | + "### Boltzmann Machines\n", |
| 243 | + "\n", |
| 244 | + "Boltzmann machines define energy-based models" |
| 245 | + ] |
| 246 | + }, |
| 247 | + { |
| 248 | + "cell_type": "markdown", |
| 249 | + "id": "8f5199d2", |
| 250 | + "metadata": { |
| 251 | + "editable": true |
| 252 | + }, |
| 253 | + "source": [ |
| 254 | + "<!-- Equation labels as ordinary links -->\n", |
| 255 | + "<div id=\"_auto5\"></div>\n", |
| 256 | + "\n", |
| 257 | + "$$\n", |
| 258 | + "\\begin{equation}\n", |
| 259 | + "p_\\theta(x) = \\frac{1}{Z_\\theta} e^{-E_\\theta(x)},\n", |
| 260 | + "\\label{_auto5} \\tag{5}\n", |
| 261 | + "\\end{equation}\n", |
| 262 | + "$$" |
| 263 | + ] |
| 264 | + }, |
| 265 | + { |
| 266 | + "cell_type": "markdown", |
| 267 | + "id": "6327b154", |
| 268 | + "metadata": { |
| 269 | + "editable": true |
| 270 | + }, |
| 271 | + "source": [ |
| 272 | + "where $Z_\\theta$ is the partition function.\n", |
| 273 | + "\n", |
| 274 | + "Here you will examine:\n", |
| 275 | + "* Energy landscapes and statistical mechanics analogies,\n", |
| 276 | + "\n", |
| 277 | + "* Maximum likelihood learning and gradient structure,\n", |
| 278 | + "\n", |
| 279 | + "* Approximate inference methods such as contrastive divergence.\n", |
| 280 | + "\n", |
| 281 | + "Connections between Boltzmann machines and variational principles are emphasized." |
| 282 | + ] |
| 283 | + }, |
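| | + { |
| | + "cell_type": "markdown", |
| | + "id": "e3c2f5a4", |
| | + "metadata": { |
| | + "editable": true |
| | + }, |
| | + "source": [ |
| | + "The gradient structure mentioned above follows directly from Eq. (5). As a sketch, assume a discrete state space so that $Z_\\theta = \\sum_{x'} e^{-E_\\theta(x')}$ (for continuous $x$ the sum becomes an integral), and write $\\tilde{x}^{(k)}$ for a sample obtained after $k$ Gibbs sampling steps started at the data point $x$; this notation is introduced here for illustration only. Then\n", |
| | + "\n", |
| | + "$$\n", |
| | + "\\nabla_\\theta \\log p_\\theta(x) = -\\nabla_\\theta E_\\theta(x) + \\mathbb{E}_{x' \\sim p_\\theta}\\left[\\nabla_\\theta E_\\theta(x')\\right].\n", |
| | + "$$\n", |
| | + "\n", |
| | + "The first term lowers the energy of observed data, while the second raises the energy of configurations favored by the model. The expectation over $p_\\theta$ is what makes exact maximum likelihood learning expensive. Contrastive divergence with $k$ steps (CD-$k$) replaces it with a single-sample estimate,\n", |
| | + "\n", |
| | + "$$\n", |
| | + "\\nabla_\\theta \\log p_\\theta(x) \\approx -\\nabla_\\theta E_\\theta(x) + \\nabla_\\theta E_\\theta(\\tilde{x}^{(k)}),\n", |
| | + "$$\n", |
| | + "\n", |
| | + "which is a biased estimate for finite $k$." |
| | + ] |
| | + }, |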
| 284 | + { |
| 285 | + "cell_type": "markdown", |
| 286 | + "id": "d8610dfc", |
| 287 | + "metadata": { |
| 288 | + "editable": true |
| 289 | + }, |
| 290 | + "source": [ |
| 291 | + "### Implementation and Experiments\n", |
| 292 | + "\n", |
| 293 | + "The practical component consists of implementing:\n", |
| 294 | + "* A CNN for image classification,\n", |
| 295 | + "\n", |
| 296 | + "* At least one generative model (a VAE or a Boltzmann machine; you may optionally also include a diffusion model).\n", |
| 297 | + "\n", |
| 298 | + "Experiments will use a standard labeled dataset of your choice and should report:\n", |
| 299 | + "* Classification accuracy and confusion structure,\n", |
| 300 | + "\n", |
| 301 | + "* Quality of generated samples,\n", |
| 302 | + "\n", |
| 303 | + "* Latent-space geometry and interpolation.\n", |
| 304 | + "\n", |
| 305 | + "Computational results are interpreted through the lens of the mathematical models." |
| 306 | + ] |
| 307 | + }, |
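| | + { |
| | + "cell_type": "markdown", |
| | + "id": "f4d3a6b5", |
| | + "metadata": { |
| | + "editable": true |
| | + }, |
| | + "source": [ |
| | + "The evaluation criteria listed above can be stated as formulas. In this sketch, $\\hat{y}_i = f_\\theta(x_i)$ denotes the predicted label on a held-out test set of size $N_{\\text{test}}$, $M$ is the confusion matrix, and $z_0, z_1$ are the latent codes of two test inputs; these symbols are illustrative and not part of the project text.\n", |
| | + "\n", |
| | + "$$\n", |
| | + "\\mathrm{Acc} = \\frac{1}{N_{\\text{test}}}\\sum_{i=1}^{N_{\\text{test}}} \\mathbf{1}[\\hat{y}_i = y_i], \\qquad M_{k\\ell} = \\sum_{i=1}^{N_{\\text{test}}} \\mathbf{1}[y_i = k]\\,\\mathbf{1}[\\hat{y}_i = \\ell],\n", |
| | + "$$\n", |
| | + "\n", |
| | + "so that the accuracy equals $\\mathrm{tr}(M)/N_{\\text{test}}$ and the off-diagonal entries show which classes are confused with each other. A simple latent-space interpolation decodes points along the straight line\n", |
| | + "\n", |
| | + "$$\n", |
| | + "z(t) = (1-t)\\,z_0 + t\\,z_1, \\qquad t \\in [0,1].\n", |
| | + "$$" |
| | + ] |
| | + }, |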
| 308 | + { |
| 309 | + "cell_type": "markdown", |
| 310 | + "id": "1038beed", |
| 311 | + "metadata": { |
| 312 | + "editable": true |
| 313 | + }, |
| 314 | + "source": [ |
| 315 | + "## Expected Outcomes\n", |
| 316 | + "\n", |
| 317 | + "By completing this project, you will:\n", |
| 318 | + "* Understand deep learning through probabilistic and variational principles,\n", |
| 319 | + "\n", |
| 320 | + "* Connect classification and generation within a unified framework,\n", |
| 321 | + "\n", |
| 322 | + "* Analyze neural networks as numerical optimization problems,\n", |
| 323 | + "\n", |
| 324 | + "* Gain insight applicable to physics-inspired machine learning, statistical inference, and scientific data analysis." |
| 325 | + ] |
| 326 | + }, |
| 327 | + { |
| 328 | + "cell_type": "markdown", |
| 329 | + "id": "057f0aea", |
| 330 | + "metadata": { |
| 331 | + "editable": true |
| 332 | + }, |
| 333 | + "source": [ |
| 334 | + "## Introduction to numerical projects\n", |
| 335 | + "\n", |
| 336 | + "Here follows a brief recipe and recommendation on how to write a report for each\n", |
| 337 | + "project.\n", |
| 338 | + "\n", |
| 339 | + " * Give a short description of the nature of the problem and the numerical methods you have used.\n", |
| 340 | + "\n", |
| 341 | + " * Describe the algorithm you have used and/or developed. Here you may find it convenient to use pseudocode. In many cases you can describe the algorithm in the program itself.\n", |
| 342 | + "\n", |
| 343 | + " * Include the source code of your program. Comment your program properly.\n", |
| 344 | + "\n", |
| 345 | + " * If possible, try to find analytic solutions, or known limits in order to test your program when developing the code.\n", |
| 346 | + "\n", |
| 347 | + " * Include your results either in figure form or in a table. Remember to label your results. All tables and figures should have relevant captions and labels on the axes.\n", |
| 348 | + "\n", |
| 349 | + " * Try to evaluate the reliability and numerical stability/precision of your results. If possible, include a qualitative and/or quantitative discussion of the numerical stability, possible loss of precision, etc.\n", |
| 350 | + "\n", |
| 351 | + " * Try to give an interpretation of your results in your answers to the problems.\n", |
| 352 | + "\n", |
| 353 | + " * Critique: if possible, include your comments and reflections about the exercise: whether you felt you learnt something, ideas for improvements, and any other thoughts you had while solving it. We wish to keep this course interactive, and your comments can help us improve it.\n", |
| 354 | + "\n", |
| 355 | + " * Try to establish a practice where you log your work at the computer lab. You may find such a logbook very handy at later stages in your work, especially when you don't properly remember what a previous test version of your program did. Here you could also record the time spent on solving the exercise, the algorithms you tested, or other topics you feel are worth mentioning." |
| 356 | + ] |
| 357 | + }, |
| 358 | + { |
| 359 | + "cell_type": "markdown", |
| 360 | + "id": "a83a3d3f", |
| 361 | + "metadata": { |
| 362 | + "editable": true |
| 363 | + }, |
| 364 | + "source": [ |
| 365 | + "## Format for electronic delivery of report and programs\n", |
| 366 | + "\n", |
| 367 | + "The preferred format for the report is a PDF file. DOC, PostScript, or Jupyter (IPython) notebook files are also accepted. As programming language, we prefer that you choose between C/C++, Fortran 2008, or Python. Follow these steps when submitting the report:\n", |
| 368 | + "\n", |
| 369 | + " * To hand in your project, send us an email with a link to your GitHub/GitLab repository.\n", |
| 370 | + "\n", |
| 371 | + " * In your GitHub/GitLab or similar repository, please include a folder which contains selected results. These can be in the form of output from your code for a selected set of runs and input parameters.\n", |
| 372 | + "\n", |
| 373 | + "Finally, \n", |
| 374 | + "we encourage you to collaborate. Optimal working groups consist of \n", |
| 375 | + "2-3 students. You can then hand in a common report." |
| 376 | + ] |
| 377 | + } |
| 378 | + ], |
| 379 | + "metadata": {}, |
| 380 | + "nbformat": 4, |
| 381 | + "nbformat_minor": 5 |
| 382 | +} |