Pinned post

the year is 2025. openai lobbies congress to force websites that publish user-generated content to guarantee content is free of synthetic data (and mark it as such w/metadata in the html). google lobbies for compulsory no-cost licensing of all content published to the web, unless the site owner follows [proprietary standard that costs millions to implement]. facebook pays below-poverty wages to thousands of contractors in locked, device-free rooms to type sentences, any sentences, as LM fodder

Show thread

the delicious irony: creators of industrial language models are now worried about no longer being able to use the web as their "commons" (i.e. other people's labor that they appropriate and commercialize) because their own outputs are "polluting" it (via mailchi.mp/jack-clark/import-a)

weird idea, work in progress: (1) get DistilBERT hidden states (768 dimensions) for 768 sentences (of Frankenstein, in this instance) → stack vertically to form a 768x768 square → subtract the column-wise mean, normalize → lil bit of gaussian blur and threshold → "skeletonize" with skimage → "asemic" "writing"?
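(a rough python sketch of that pipeline, for anyone who wants to play along — the mean-pooling over tokens and the specific blur/threshold values here are plausible guesses, not necessarily the exact settings:)

import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel
from skimage.filters import gaussian
from skimage.morphology import skeletonize

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModel.from_pretrained("distilbert-base-uncased")

def sentence_vector(sentence):
    # one 768-dim vector per sentence (mean of the last hidden states;
    # the pooling method is an assumption — any per-sentence 768-dim
    # summary would do)
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        states = model(**inputs).last_hidden_state  # shape (1, n_tokens, 768)
    return states.mean(dim=1).squeeze().numpy()

# hypothetical input file: one sentence of Frankenstein per line
sentences = open("frankenstein_sentences.txt").read().splitlines()[:768]
grid = np.stack([sentence_vector(s) for s in sentences])  # 768x768 square

grid -= grid.mean(axis=0)            # subtract the column-wise mean
grid /= np.abs(grid).max()           # normalize
blurred = gaussian(grid, sigma=1.5)  # lil bit of gaussian blur (sigma is a guess)
binary = blurred > 0.0               # threshold
marks = skeletonize(binary)          # skimage skeletonize -> "asemic" marks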

allison boosted

@aparrish This demonstrates that I don't know how to "DM" people on Mastodon. @aparrish knows what Taper is, having contributed, but others, please take a look! https://taper.badquar.to

Show thread

hi all, anyone have recommendations for articles about running tildeverse servers/pubnixes? I'm looking for manifestos, theoretical approaches, historical reports and personal accounts as much as I'm looking for practical advice and technical information

(I am aware of Paul Ford's tilde.club medium post)

allison boosted

@aparrish I work on copyright as my job. I highly recommend reading the blog post on GitHub Copilot that I wrote a few weeks ago (link below!)

https://ariadnavigo.xyz/posts/github-copilot-and-copyright/

anyone out there working on variations of copyleft-ish licenses that include provisions for machine learning models trained on the content in question? (i'd personally like to have, e.g., MIT and CC-BY licenses that include a clause like "any model trained on this must also include the attribution")

whoops, typo in the URL. here's the correct one: posts.decontextualize.com/quee

(this is my favorite of the bunch fwiw)

Show thread

it was @brainwane's "if you give a speech you care about, post a transcript" post that finally motivated me to clean these up and put them online harihareswara.net/sumana/2021/

I've got a workflow going now where I can create a presentation and a nicely formatted transcript (i.e., my speaker notes) from one file, and post it to the web straight from my notes app (Zettlr), so hopefully it's easier for me to do all this in the future

Show thread

"Rewordable versus the alphabet fetish" outlines how conventional spelling games (like Scrabble) are based on cryptography (via Poe's The Gold-Bug) and mystical alphabetical metaphysics—and how we attempted to circumvent those influences in Rewordable, a board game I co-designed a few years ago. includes a very adorable illustration I found of neoplatonist medieval philospher John Scotus Erigena posts.decontextualize.com/rewo

(originally a talk at NYU Game Center's Practice conference)

Show thread

"Language models can only write poetry" is an attempt to categorize the outputs of language models (from Tzara's newspaper cutups to Markov chains to GPT-3) using speech act theory. Touches on the immersive fallacy, Jhave's ReRites, Janelle Shane's AI Weirdness, Kristeva, etc. posts.decontextualize.com/lang

(excerpted from my Vector Institute / BMOLab "distinguished lecture")

Show thread

"Desire (under)lines: Notes toward a queer phenomenology of spell check" asserts that spell check is a "straightening device" (following Sara Ahmed's use of that term) that attempts to curb spelling's material, expressive, and sensual qualities. Includes a fun proto-Indo-European etymology aside posts.decontextualize.com/quee

(originally prepared for the EACL 2021 Queer in AI Social)

Show thread

hi fediverse, I just posted transcripts of a few talks and lectures I've given over the past few years, mostly concerning the connections between machine learning, language and poetry: posts.decontextualize.com

(notes and summaries in individual posts below)

allison boosted

I'll be part of these free, outdoor, computer-generated performances in NYC, Thu 7pm, along with @aparrish & others https://hbstudio.org/calendar/performing-algorithms/

allison boosted

AND I get to hire a software curation specialist at NYU as a part of this work! Taking applications now: https://apply.interfolio.com/91696

Show thread

allison boosted

alright yeah shout out to @jonbro for this idea. here's the Deep Space Nine introduction made entirely using a Sketchfab browser preview of a 3d model of deep space nine (with original musical accompaniment)

allison boosted

masto meta 

one thing I hate about twitter is having zero control of your posts’ reach. it strips agency, flattens context, contributes to uncontrollable harassment in the service of “connection”. imo mastodon’s insistence on following the same value of maximization, removing any ability to foster small local community, is equally harmful. if decentralization is about control then why strip that at the individual and community level? it’s no better than twitter. glad to be on the hometown fork

allison boosted

HomeTown is a fork of Mastodon which lets you make instance-only posts, adjust character limits, and read long-form rich-text blog posts.

It's still part of the Fediverse, and HomeTown users can interact with Mastodon users totally fine.

Tech people can get self-hosting instructions here:

https://github.com/hometown-fork/hometown

Non-tech people can use a managed hosting service to start their own HomeTown instance:

https://federation.spacebear.ee

#HomeTown #Mastodon #MastoTips #FediTips #Fediverse

a few weeks ago I made a tutorial for how to use the Hugging Face Transformers python library to generate text, focusing on the distilgpt2 model in particular. the tutorial explains subword tokenization and shows a number of simple (but effective) ways you can control the model's output: github.com/aparrish/rwet/blob/
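the short version looks something like this (a condensed sketch in the spirit of the tutorial, not an excerpt from it — the prompt and sampling values here are arbitrary):

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModelForCausalLM.from_pretrained("distilgpt2")

prompt = "The meaning of poetry is"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# subword tokenization: one word can map to several tokens
print(tokenizer.convert_ids_to_tokens(input_ids[0].tolist()))

output = model.generate(
    input_ids,
    do_sample=True,     # sample instead of always taking the likeliest token
    temperature=0.8,    # lower = more predictable output
    top_p=0.9,          # nucleus sampling: drop the unlikely tail
    max_length=50,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))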

computer-generated "recipes" that I made as an example in the workshop I'm teaching. the instructions are composed of random transitive verbs plus random direct objects from Bob Brown's _The Complete Book of Cheese_ gutenberg.org/ebooks/14293
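(roughly how that works — a sketch assuming spaCy for part-of-speech tagging, which may not be what the workshop code actually uses; the filename is a hypothetical stand-in for the Gutenberg plain-text download:)

import random
import spacy  # assumes: python -m spacy download en_core_web_sm

nlp = spacy.load("en_core_web_sm")
text = open("pg14293.txt").read()  # hypothetical Gutenberg plain-text file
doc = nlp(text[:100000])  # a slice keeps the sketch fast

# treat verbs that govern a direct object as (roughly) transitive
verbs = [tok.lemma_ for tok in doc
         if tok.pos_ == "VERB" and any(c.dep_ == "dobj" for c in tok.children)]
# noun chunks that serve as direct objects
objects = [chunk.text for chunk in doc.noun_chunks if chunk.root.dep_ == "dobj"]

for _ in range(6):
    print(random.choice(verbs).capitalize(), random.choice(objects) + ".")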
