more fun with DistilBERT! the technique: (1) run a forward pass through the model up to the transformer hidden state, (2) add random noise to that hidden state, (3) predict tokens from the modified hidden state. each line uses a higher noise intensity than the one before.
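
here's a rough toy sketch of those three steps in plain Python. it is not DistilBERT: the vocabulary, the random unit-length embeddings, and the helper names (`encode`, `add_noise`, `predict_tokens`, `noisy_decodes`) are all made up for illustration, the "encoder" is just an embedding lookup, and Gaussian noise is an assumption since the post only says "random noise". with a real model you'd swap in its actual hidden state and output head.

```python
import math
import random

# Hypothetical toy setup: a 5-word vocabulary with random unit-length embeddings.
# These stand in for a real model's weights and are NOT taken from DistilBERT.
VOCAB = ["the", "cat", "sat", "on", "mat"]
DIM = 16
_init = random.Random(0)


def _unit(vec):
    """Scale a vector to length 1."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


EMBEDDINGS = [_unit([_init.gauss(0.0, 1.0) for _ in range(DIM)]) for _ in VOCAB]


def encode(tokens):
    """Step 1: 'forward pass' to a hidden state (here just an embedding lookup)."""
    return [list(EMBEDDINGS[VOCAB.index(tok)]) for tok in tokens]


def add_noise(hidden, scale, rng):
    """Step 2: add zero-mean Gaussian noise with standard deviation `scale`."""
    return [[x + rng.gauss(0.0, scale) for x in vec] for vec in hidden]


def predict_tokens(hidden):
    """Step 3: for each position, pick the token with the highest dot-product score."""
    predicted = []
    for vec in hidden:
        scores = [sum(h * e for h, e in zip(vec, emb)) for emb in EMBEDDINGS]
        predicted.append(VOCAB[scores.index(max(scores))])
    return predicted


def noisy_decodes(tokens, scales, seed=0):
    """Decode the same hidden state once per noise scale, one output line each."""
    rng = random.Random(seed)
    hidden = encode(tokens)
    return [predict_tokens(add_noise(hidden, scale, rng)) for scale in scales]


if __name__ == "__main__":
    sentence = ["the", "cat", "sat", "on", "the", "mat"]
    # Each printed line uses a larger noise scale than the one before.
    for scale, line in zip([0.0, 0.2, 0.5, 1.0, 3.0],
                           noisy_decodes(sentence, [0.0, 0.2, 0.5, 1.0, 3.0])):
        print(f"{scale:>4}: {' '.join(line)}")
```

with a noise scale of 0 the input comes back unchanged, and as the scale grows more and more positions flip to other tokens.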

