another change I made was training it on bitmaps of random words, weighted by the frequency of the words in a reference corpus (in this case, spaCy's unigram probabilities). the idea was that this would help it learn higher-frequency letter combinations and generate words that mostly replicate the "look" of English in use (rather than words in a word list). the drawback is that it looks like half the latent space is trying to spell out "the"
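the sampling step looks roughly like this, as a sketch (the `en_core_web_lg` model, the -19.0 cutoff, the font, and the bitmap size are all placeholder assumptions; whether `Lexeme.prob` is actually populated depends on your spaCy version — older large models ship it, v3 wants spacy-lookups-data):

```python
import math
import random

import spacy
from PIL import Image, ImageDraw, ImageFont

nlp = spacy.load("en_core_web_lg")

# keep alphabetic lower-case lexemes that actually have a score
# (unscored lexemes all share the same smoothed default around -20)
scored = [lex for lex in nlp.vocab
          if lex.is_alpha and lex.is_lower and lex.prob > -19.0]
words = [lex.text for lex in scored]
weights = [math.exp(lex.prob) for lex in scored]  # prob is a log-probability

def sample_word():
    # frequency-weighted draw, so "the", "of", "and" come up constantly
    return random.choices(words, weights=weights, k=1)[0]

def render_bitmap(word, size=(128, 32)):
    # rasterize one training example: white text on black, 8-bit grayscale
    img = Image.new("L", size, 0)
    ImageDraw.Draw(img).text((2, 2), word, fill=255,
                             font=ImageFont.truetype("DejaVuSans.ttf", 24))
    return img
```

(the weighting is also exactly why "the" eats the latent space; tempering the weights, e.g. raising them to a power below 1, would flatten the distribution while keeping the frequency ordering)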
definitely bit off more than I could chew when it comes to making something that I feel is conceptually sound with this. the instant temptation is to go full "alien artifact" (and include GAN-generated body horror imagery or whatever), or at least make page layouts that resemble those of typical novels. but then the project feels like it's "about" layout, or "about" books as artifacts, which aren't topics that I personally care to spend time making arguments about at the moment
had an inkling to train a separate model for words with initial capitals, so I could introduce some structure (like sentences and paragraphs). the drawback here being that it won't have the same latent space as the lower-case model, so interpolations won't work across the two. (also training a separate model for words with final punctuation)
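for anyone wondering what "interpolations" buys you here, a minimal sketch (the 128-dim latent is just a placeholder) of the kind of latent-space walk that only makes sense within a single model:

```python
import numpy as np

def slerp(z0, z1, t):
    # spherical interpolation between two latent vectors: a common
    # choice for GAN latents, since it keeps intermediate points at a
    # typical norm instead of cutting through the origin
    omega = np.arccos(np.clip(
        np.dot(z0 / np.linalg.norm(z0), z1 / np.linalg.norm(z1)), -1.0, 1.0))
    if np.isclose(omega, 0.0):
        return (1.0 - t) * z0 + t * z1  # nearly parallel: fall back to lerp
    return (np.sin((1.0 - t) * omega) * z0 + np.sin(t * omega) * z1) / np.sin(omega)

z0, z1 = np.random.randn(128), np.random.randn(128)
path = [slerp(z0, z1, t) for t in np.linspace(0.0, 1.0, 16)]
# fed through one model's generator, `path` morphs smoothly between two
# words; the same vectors mean nothing to a second, separately trained
# model, which is why cross-model interpolation doesn't work
```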
I wish this project was more "making weird text things" and less "reinventing typesetting from scratch," but here we are
(right now I'm just blitting the GAN output straight to Pillow buffers in Python, because otherwise I would have to write out all of the images to a directory and include them in a LaTeX template or something? and I think I'd rather be lost in the hell of reinventing typesetting than the hell of getting LaTeX to do what I want with tens of thousands of images)
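for concreteness, the blitting is roughly this shape (a sketch: the (n, h, w) batch layout, the [-1, 1] value range, and the letter-size page at 150 dpi are all assumptions about the pipeline, and the line wrapping is deliberately naive):

```python
import numpy as np
from PIL import Image

def blit_page(batch, page_size=(1275, 1650), margin=75, gap=12):
    # `batch`: generator output assumed shaped (n, h, w) in [-1, 1];
    # both are common GAN conventions, not givens
    page = Image.new("L", page_size, 255)  # white 8-bit page buffer
    x, y = margin, margin
    for arr in batch:
        # map [-1, 1] -> [0, 255] and invert so words print dark on light
        pixels = (255 - (arr + 1.0) * 127.5).clip(0, 255).astype(np.uint8)
        glyph = Image.fromarray(pixels)
        if x + glyph.width > page_size[0] - margin:
            x, y = margin, y + glyph.height + gap  # wrap to next line
        page.paste(glyph, (x, y))
        x += glyph.width + gap
    return page

# e.g. with fake generator output:
page = blit_page(np.random.uniform(-1.0, 1.0, size=(200, 32, 128)))
page.save("page.png")
```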