Machine Learning from Scratch

Year offered: 2021

ML from Scratch is a student-led tutorial and seminar series initiated by Johannes Bill and others from the Jan Drugowitsch Lab at Harvard Medical School. The objective is to teach neuroscience students cutting-edge machine learning models by having them implement the models themselves.

I started participating in 2022, and I prepared tutorials and led a few seminars in the series!


Class Materials:

 

From Transformer to LLM: Architecture, Training and Usage

Transformer Tutorial Series

Attention

In this session, we walked through the architecture, training, and applications of transformers (slides). The lecture slides covered:

  • The basic principles of NLP
  • Basics of the attention mechanism and the transformer
  • Training language models (the language modelling objective)
  • Usage of pretrained models (finetuning vs. prompting)
  • Applications of transformers beyond language (vision, audio, music, image generation, games and control)
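As a rough illustration of the attention mechanism covered in the slides, here is a minimal single-head causal self-attention sketch in NumPy. It is not taken from the course notebooks; the function name `causal_self_attention`, the weight matrices, and the toy dimensions are assumptions chosen for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head causal self-attention over a sequence x of shape (T, d_model)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v          # each (T, d_head)
    scores = q @ k.T / np.sqrt(q.shape[-1])      # (T, T): query-key similarity
    # Causal mask: position t may only attend to positions <= t.
    future = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    weights = softmax(scores, axis=-1)           # each row sums to 1
    return weights @ v, weights

# Toy sizes (hypothetical): 5 tokens, model width 8, head width 4.
rng = np.random.default_rng(0)
T, d_model, d_head = 5, 8, 4
x = rng.normal(size=(T, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))
out, weights = causal_self_attention(x, w_q, w_k, w_v)
```

The causal mask is what makes this usable for the language modelling objective: the output at position t depends only on tokens up to t, so every position can be trained to predict the next token in parallel.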

Jupyter Notebook Tutorial Series

We prepared this series of Jupyter notebooks to give you hands-on experience with transformers, from their architecture to training and usage.

  • Fundamentals of Transformer and Language modelling
  • Beyond Language: 
    In the following notebooks, we demonstrate the flexibility of the transformer model:
    •  Learning to do arithmetic by sequence modelling.
      In this notebook, you will train a GPT2 model on an arithmetic dataset and let it learn to do arithmetic (partially) by next-token prediction.
    •  Image generation by sequence modelling.
      In this notebook, you will train a GPT2-like transformer for generative modelling of MNIST images by predicting the sequence of patches in an image.
    •  Audio signal classification (~20 min)
      In this notebook, you will train a transformer on the Spoken MNIST dataset to classify audio sequences.
    •  Image classification (~30 min)
      In this notebook, you will train a transformer on images, formatted as sequences of patches, to predict the identity of each image.
    •  Music generation by sequence modelling. (Difficult; training takes hours)
      In this notebook, you will train a transformer to predict the next note in a music dataset consisting of piano rolls. The trained model can then be used to generate classical piano music.
  • Using Large Language Models
    Finally, we get a glimpse of LLMs by using the OpenAI APIs to accomplish some useful tasks.
  • Official GitHub repo
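To make "an image as a sequence of patches" concrete, here is a small sketch of how a 28×28 image could be turned into a patch sequence and into next-patch prediction pairs. This is an illustration under stated assumptions, not the notebooks' code: the helper names and the patch size of 4 are hypothetical.

```python
import numpy as np

def image_to_patch_sequence(img, patch):
    """Split an (H, W) image into a row-major sequence of flattened patches."""
    H, W = img.shape
    assert H % patch == 0 and W % patch == 0, "image size must be divisible by patch size"
    # (H/p, p, W/p, p) -> (H/p, W/p, p, p) -> (num_patches, p*p)
    x = img.reshape(H // patch, patch, W // patch, patch).transpose(0, 2, 1, 3)
    return x.reshape(-1, patch * patch)

def patch_sequence_to_image(seq, H, W, patch):
    """Inverse of image_to_patch_sequence."""
    x = seq.reshape(H // patch, W // patch, patch, patch).transpose(0, 2, 1, 3)
    return x.reshape(H, W)

# Stand-in for an MNIST digit (assumed 28x28); patch size 4 is a hypothetical choice.
img = np.arange(28 * 28, dtype=np.float32).reshape(28, 28)
seq = image_to_patch_sequence(img, patch=4)      # 49 patches of 16 pixels each

# Generation: predict patch t+1 from patches up to t (next-token prediction).
inputs, targets = seq[:-1], seq[1:]
restored = patch_sequence_to_image(seq, 28, 28, 4)
```

For classification, the same `seq` would be fed to the transformer as a whole and a label read out from a pooled or dedicated token; only the training target changes, not the sequence format.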
ChatPDF

Related material 

 

 

Understanding Stable Diffusion from "Scratch"

 

diffusion_proc1.gif

In this session, we walked through all the building blocks of Stable Diffusion (slides / PPTX attached), including

  • Principles of diffusion models
  • Modelling the score function of images with a UNet
  • Understanding the prompt through contextualized word embeddings
  • Letting text influence the image through cross-attention
  • Improving efficiency by adding an autoencoder
  • Large-scale training
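The cross-attention step, where text influences the image, can be sketched as follows. This is a minimal NumPy illustration and not the Stable Diffusion implementation: the function name, the single-head layout, and all dimensions are assumptions. The key point is that queries come from image features while keys and values come from the text embedding.

```python
import numpy as np

def cross_attention(image_tokens, text_tokens, w_q, w_k, w_v):
    """Image tokens (N, d_img) attend to text tokens (M, d_txt); returns (N, d_head)."""
    q = image_tokens @ w_q                       # queries from image features
    k = text_tokens @ w_k                        # keys from the text embedding
    v = text_tokens @ w_v                        # values from the text embedding
    scores = q @ k.T / np.sqrt(q.shape[-1])      # (N, M)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)        # softmax over text tokens
    return weights @ v, weights

# Hypothetical sizes: 16 spatial positions, 7 prompt tokens.
rng = np.random.default_rng(0)
N, M, d_img, d_txt, d_head = 16, 7, 32, 24, 8
image_tokens = rng.normal(size=(N, d_img))       # flattened UNet feature map
text_tokens = rng.normal(size=(M, d_txt))        # contextualized word embeddings
w_q = rng.normal(size=(d_img, d_head))
w_k = rng.normal(size=(d_txt, d_head))
w_v = rng.normal(size=(d_txt, d_head))
out, weights = cross_attention(image_tokens, text_tokens, w_q, w_k, w_v)
```

Each row of `weights` shows how strongly one spatial position reads from each prompt token, which is why these maps are often inspected to see which word drives which image region.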
Stable Diffusion model overview

We prepared Colab notebooks for you to

  • Play with Stable Diffusion and inspect the internal architecture of the models. (Open in Colab)
  • Build your own Stable Diffusion UNet model from scratch in a notebook (with fewer than 300 lines of code!). (Open in Colab)
  • Build a diffusion model (with UNet + cross-attention) and train it to generate MNIST images based on a "text prompt". (Open in Colab)
  • GitHub repo. Official GitHub page

In the end, we trained a tiny-tiny diffusion model to generate MNIST digits from numbers

Conditional Diffusion digit 4

... and a tiny diffusion model to generate faces from facial attributes on the CelebA dataset

UNet Sample of Faces

 

Related material

 

  

Mathematical Foundation of Diffusion Generative Models

 

Diffusion schematics

In this tutorial, we covered the mathematical foundation of diffusion generative models. We aim to give you a solid understanding of 

  • The score function as the gradient of the log data density
  • How the score function enables the reversal of the forward diffusion process
  • Learning the score function by denoising score matching (and its equivalence to explicit score matching)
  • Approximating the score function with a neural network
  • Sampling from diffusion models

We work through concrete examples on low-dimensional (2D) data and then apply the same ideas to high-dimensional data (point clouds or images).
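For reference, the main objects in this outline can be written compactly as below. The notation is an assumption (a common convention with a drift-free, variance-exploding forward process, noise scale g(t), and a network s_θ), not necessarily the one used in the notebooks.

```latex
% Score function: gradient of the log density of the noised data at time t
s(x, t) = \nabla_x \log p_t(x)

% Forward diffusion (no drift) and its time reversal,
% where the second equation is integrated backward in time from t = T to t = 0
dx = g(t)\, dw
dx = -g(t)^2 \, \nabla_x \log p_t(x)\, dt + g(t)\, d\bar{w}

% Denoising score matching at noise level \sigma, with \tilde{x} = x + \sigma \epsilon
\mathcal{L}_{\mathrm{DSM}}(\theta)
  = \mathbb{E}_{x \sim p_{\mathrm{data}},\ \epsilon \sim \mathcal{N}(0, I)}
    \left\| s_\theta(x + \sigma \epsilon, \sigma) + \frac{\epsilon}{\sigma} \right\|^2
```

The regression target -ε/σ is the score of the Gaussian noising kernel, since ∇ log q(x̃ | x) = -(x̃ - x)/σ². Minimizing this objective agrees with explicit score matching on the noised density up to a constant that does not depend on θ, which is the equivalence mentioned in the outline.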

 

Jupyter / Colab Notebook tutorial series

  • Theory tutorial: Mathematical Foundation (Open in Colab notebook)
  • Day 1 coding tutorial: Diffusion, Reverse Diffusion and the Score Function (Open in Colab notebook)
    In this tutorial, you will gain more intuition about score functions by examining the analytical score function of a general class of distributions: Gaussian mixture models.
    You will empirically validate that the exact score function enables the reversal of the diffusion process and recovers the original data distribution.
     

    gmm_score_decompose
  • Day 2 coding tutorial: Denoising Score Matching, and Training a Neural Network to Approximate the Score (Open in Colab notebook)
    In this notebook, you will build a toy neural network model to learn the score function in two ways: by supervised learning on the exact score, and by denoising score matching from data samples. You will empirically validate that both methods can approximate the score of the data and can be used to recover the original data distribution in reverse diffusion.
     

    recovery_distribution
  • Solutions to the coding exercises: Colab notebook
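To give a flavour of the Day 1 material, here is a small sketch of the analytical score of a Gaussian mixture model, checked against finite differences of the log density. It is an illustration under simplifying assumptions (isotropic components sharing one `sigma`; all function names and parameter values are hypothetical) and is not the notebooks' code.

```python
import numpy as np

def gmm_logpdf(x, weights, means, sigma):
    """Log density of an isotropic GMM at points x of shape (N, D)."""
    D = x.shape[1]
    diff = x[:, None, :] - means[None, :, :]                     # (N, K, D)
    log_comp = (-0.5 * (diff ** 2).sum(-1) / sigma ** 2
                - 0.5 * D * np.log(2 * np.pi * sigma ** 2))      # (N, K)
    a = np.log(weights)[None, :] + log_comp
    m = a.max(axis=1, keepdims=True)                             # log-sum-exp trick
    return (m + np.log(np.exp(a - m).sum(axis=1, keepdims=True)))[:, 0]

def gmm_score(x, weights, means, sigma):
    """Analytical score: responsibility-weighted average of component scores."""
    diff = x[:, None, :] - means[None, :, :]                     # (N, K, D)
    a = np.log(weights)[None, :] - 0.5 * (diff ** 2).sum(-1) / sigma ** 2
    a = a - a.max(axis=1, keepdims=True)
    resp = np.exp(a)
    resp /= resp.sum(axis=1, keepdims=True)                      # p(k | x), shape (N, K)
    comp_score = -diff / sigma ** 2                              # score of each Gaussian
    return (resp[:, :, None] * comp_score).sum(axis=1)           # (N, D)

def numerical_score(x, weights, means, sigma, eps=1e-5):
    """Central finite differences of the log density, one coordinate at a time."""
    grad = np.zeros_like(x)
    for d in range(x.shape[1]):
        step = np.zeros(x.shape[1])
        step[d] = eps
        grad[:, d] = (gmm_logpdf(x + step, weights, means, sigma)
                      - gmm_logpdf(x - step, weights, means, sigma)) / (2 * eps)
    return grad

# Hypothetical 2D mixture with two components.
weights = np.array([0.3, 0.7])
means = np.array([[-2.0, 0.0], [2.0, 1.0]])
sigma = 0.8
x = np.random.default_rng(0).normal(size=(6, 2)) * 2.0
max_err = np.abs(gmm_score(x, weights, means, sigma)
                 - numerical_score(x, weights, means, sigma)).max()
```

Under a drift-free forward diffusion that adds Gaussian noise of variance σ_t², the noised distribution is again a mixture with the same means and variance `sigma**2 + sigma_t**2`, so the same `gmm_score` function gives the exact time-dependent score needed for reverse diffusion.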