okay I THINK I finally found a way of doing this that comes close to meeting all of my criteria for this project (i.e., each step shows visible and meaningful change; the change is gradual, but the result "converges" after relatively few steps): calculate the probability of each token in the source text vs. a token sampled from the masked-token distribution at that position, then find "peaks" of improbable tokens and replace them w/ sampled tokens; stop when any output repeats
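A minimal sketch of that loop, with a toy conditional distribution standing in for a real masked language model (the names `toy_dist` and `NEXT`, the exact peak rule, and the safety cap are all my assumptions for illustration, not the actual implementation):

```python
import random

random.seed(0)

# Toy stand-in for a masked language model: a tiny vocabulary plus a
# strongly favored "next word" for each word. A real version would mask
# position i and read the distribution off e.g. BERT's fill-mask head.
VOCAB = ["the", "cat", "sat", "on", "mat", "dog", "ran"]
NEXT = {"the": "cat", "cat": "sat", "sat": "on", "on": "the",
        "mat": "the", "dog": "ran", "ran": "on"}

def toy_dist(tokens, i):
    # Distribution over VOCAB at position i, conditioned on the previous token.
    prev = tokens[i - 1] if i > 0 else "on"  # pretend a sentence precedes us
    weights = {w: 1.0 for w in VOCAB}
    weights[NEXT[prev]] = 50.0
    total = sum(weights.values())
    return {w: v / total for w, v in weights.items()}

def step(tokens):
    # Score each position by how improbable its current token is there.
    dists = [toy_dist(tokens, i) for i in range(len(tokens))]
    scores = [1.0 - d[t] for t, d in zip(tokens, dists)]
    # "Peaks": positions at least as improbable as both neighbors.
    peaks = [i for i in range(len(tokens))
             if (i == 0 or scores[i] >= scores[i - 1])
             and (i == len(tokens) - 1 or scores[i] >= scores[i + 1])]
    out = list(tokens)
    for i in peaks:
        words = list(dists[i])
        probs = [dists[i][w] for w in words]
        out[i] = random.choices(words, weights=probs)[0]
    return out

def converge(tokens, max_steps=1000):
    # Iterate until an output repeats (the stopping rule above);
    # max_steps is only a safety cap, not part of the described algorithm.
    seen = {tuple(tokens)}
    for _ in range(max_steps):
        tokens = step(tokens)
        if tuple(tokens) in seen:
            break
        seen.add(tuple(tokens))
    return tokens

start = random.choices(VOCAB, k=8)   # uniform unigram randomness
result = converge(start)
print(" ".join(result))
```

Because the toy distribution is sharply peaked, the loop settles quickly; with a real masked LM each intermediate state is one of the "visible and meaningful" steps described above.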
@aparrish it would be cool if you could feed it a context and it would help steer the algorithm. Like "this sentence should be about lasers". Very cool project!!
@mooog the project I'm riffing on does this explicitly: https://github.com/jeffbinder/visions-and-revisions
for the thing I'm working on, I'm really just interested in the transformation from uniform unigram randomness back to a somewhat coherent sentence, bringing out the texture of the language model along the way