another way to find similar ngram contexts: each context has an embedding derived from the sum of positional encoding (they're not just for transformers!) multiplied by "word vectors" (actually just truncated SVD of the transpose of the context matrix). then load 'em up in a nearest neighbor index
(this is cool because I can use it even on ngrams that *don't* occur in the source text, though all of the words themselves need to be in the vocabulary)
I like having this extra setting to fiddle with! but based on my limited testing, the temperature doesn't really matter once the length of the ngram hits a certain limit, since most ngrams only have one or two possible continuations. like... with word 3-grams, it's pretty difficult to distinguish 0.35 from 2.5
Hometown is adapted from Mastodon, a decentralized social network with no ads, no corporate surveillance, and ethical design.