the year is 2025. openai lobbies congress to force websites that publish user-generated content to guarantee content is free of synthetic data (and mark it as such w/metadata in the html). google lobbies for compulsory no-cost licensing of all content published to the web, unless the site owner follows [proprietary standard that costs millions to implement]. facebook pays below poverty wages to thousands of contractors in locked, device-free rooms to type sentences, any sentences as LM fodder
the delicious irony: creators of industrial language models are now worried about no longer being able to use the web as their "commons" (i.e. other people's labor that they appropriate and commercialize) because their own outputs are "polluting" it (via https://mailchi.mp/jack-clark/import-ai-266-deepmind-looks-at-toxic-language-models-how-translation-systems-can-pollute-the-internet-why-ai-can-make-local-councils-better)
weird idea, work in progress: (1) get DistilBERT hidden states (768 dimensions) for 768 sentences (of Frankenstein, in this instance) → stack vertically to form a 768x768 square → subtract the column-wise mean, normalize → lil bit of gaussian blur and threshold → "skeletonize" with skimage → "asemic" "writing"?
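the image-processing half of that pipeline might look something like this sketch, using scikit-image; here a random matrix stands in for the stacked DistilBERT hidden states (getting those would need transformers + torch, omitted to keep this short):

```python
import numpy as np
from skimage.filters import gaussian, threshold_otsu
from skimage.morphology import skeletonize

# stand-in for 768 sentence vectors (768 dims each) stacked vertically;
# in the real version each row is a DistilBERT last-hidden-state vector
rng = np.random.default_rng(0)
hidden = rng.normal(size=(768, 768))

centered = hidden - hidden.mean(axis=0)  # subtract the column-wise mean
normed = (centered - centered.min()) / (centered.max() - centered.min())
blurred = gaussian(normed, sigma=1.5)    # lil bit of gaussian blur
binary = blurred > threshold_otsu(blurred)  # threshold to black & white
marks = skeletonize(binary)              # thin each blob to 1px strokes
```

`marks` is a boolean 768×768 image whose skeletonized squiggles are the "asemic" "writing"; sigma and the thresholding method (Otsu here) are knobs to play with.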
hi all, anyone have recommendations for articles about running tildeverse servers/pubnixes? I'm looking for manifestos, theoretical approaches, historical reports and personal accounts as much as I'm looking for practical advice and technical information
(I am aware of Paul Ford's tilde.club medium post)
@aparrish I work on copyright as my job. I highly recommend you read my blog post on GitHub Copilot I wrote some weeks ago (link below!)
anyone out there working on variations of copyleft-ish licenses that include provisions for machine learning models trained on the content in question? (i'd personally like to have, e.g., MIT and CC-BY licenses that include a clause like "any model trained on this must also include the attribution")
whoops, typo in the URL. here's the correct one: https://posts.decontextualize.com/queer-in-ai-2021/
(this is my favorite of the bunch fwiw)
I've got a workflow going now where I can create a presentation and a nicely formatted transcript (i.e., my speaker notes) from one file, and post it to the web straight from my notes app (Zettlr), so hopefully it's easier for me to do all this in the future
"Rewordable versus the alphabet fetish" outlines how conventional spelling games (like Scrabble) are based on cryptography (via Poe's The Gold-Bug) and mystical alphabetical metaphysics—and how we attempted to circumvent those influences in Rewordable, a board game I co-designed a few years ago. includes a very adorable illustration I found of the medieval neoplatonist philosopher John Scotus Erigena https://posts.decontextualize.com/rewordable-versus/
(originally a talk at NYU Game Center's Practice conference)
"Language models can only write poetry" is an attempt to categorize the outputs of language models (from Tzara's newspaper cutups to Markov chains to GPT-3) using speech act theory. Touches on the immersive fallacy, Jhave's ReRites, Janelle Shane's AI Weirdness, Kristeva, etc. https://posts.decontextualize.com/language-model
(excerpted from my Vector Institute / BMOLab "distinguished lecture")
"Desire (under)lines: Notes toward a queer phenomenology of spell check" asserts that spell check is a "straightening device" (following Sara Ahmed's use of that term) that attempts to curb spelling's material, expressive, and sensual qualities. Includes a fun proto-Indo-European etymology aside https://posts.decontextualize.com/queer-in-ai-20
(originally prepared for the EACL 2021 Queer in AI Social)
hi fediverse, I just posted transcripts of a few talks and lectures I've given over the past few years, mostly concerning the connections between machine learning, language and poetry: https://posts.decontextualize.com
(notes and summaries in individual posts below)
I'll be part of these free, outdoor, computer-generated performances in NYC, Thu 7pm, along with @aparrish & others https://hbstudio.org/calendar/performing-algorithms/
alright yeah shout out to @jonbro for this idea. here's the Deep Space Nine introduction made entirely using a Sketchfab browser preview of a 3d model of deep space nine (with original musical accompaniment)
one thing I hate about twitter is having zero control over your posts’ reach. it strips agency, flattens context, and contributes to uncontrollable harassment in the service of “connection”. imo mastodon’s insistence on the same value of maximization, removing any ability to foster small local community, is equally harmful. if decentralization is about control then why strip that at the individual and community level? it’s no better than twitter. glad to be on the hometown fork
Hometown is a fork of Mastodon that lets you make instance-only posts, adjust character limits, and read long-form rich-text blog posts.
It's still part of the Fediverse, and Hometown users can interact with Mastodon users totally fine.
Tech people can get self-hosting instructions here:
Non-tech people can use a managed hosting service to start their own Hometown instance:
a few weeks ago I made a tutorial for how to use the Hugging Face Transformers python library to generate text, focusing on the distilgpt2 model in particular. the tutorial explains subword tokenization and shows a number of simple (but effective) ways you can control the model's output: https://github.com/aparrish/rwet/blob/master/transformers-playground.ipynb
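for a taste of what the tutorial covers, here's a minimal generation sketch with distilgpt2 via the transformers `pipeline`; the prompt and parameter values are just illustrative:

```python
from transformers import pipeline, set_seed

set_seed(1234)  # make the sampling reproducible
generator = pipeline("text-generation", model="distilgpt2")

out = generator(
    "It was a dark and stormy night",
    max_new_tokens=40,   # cap how much the model writes
    do_sample=True,      # sample instead of greedy decoding
    temperature=0.8,     # lower = tamer, higher = weirder
    num_return_sequences=1,
)
print(out[0]["generated_text"])
```

temperature, top_k/top_p, and the prompt itself are the simplest levers for controlling the output; the notebook goes into subword tokenization and more.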
computer-generated "recipes" that I made as an example in the workshop I'm teaching. the instructions are composed of random transitive verbs plus random direct objects from Bob Brown's _The Complete Book of Cheese_ https://www.gutenberg.org/ebooks/14293
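the trick is just random pairing; a tiny sketch, assuming you've already pulled verb and object lists out of the Gutenberg text (the words below are illustrative stand-ins):

```python
import random

# stand-in word lists; the real ones were extracted from
# Bob Brown's _The Complete Book of Cheese_
verbs = ["grate", "melt", "sprinkle", "cube", "simmer", "fold in"]
objects = ["the Stilton", "a ripe Camembert", "two pounds of cheddar",
           "the rennet", "a wheel of Gouda"]

def recipe(n_steps=4):
    # each instruction: random transitive verb + random direct object
    steps = [f"{random.choice(verbs).capitalize()} {random.choice(objects)}."
             for _ in range(n_steps)]
    return "\n".join(steps)

print(recipe())
```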
Poet, programmer, game designer, computational creativity researcher. Assistant Arts Professor at NYU ITP. she/her
Hometown is adapted from Mastodon, a decentralized social network with no ads, no corporate surveillance, and ethical design.