
hi fediverse, I just posted transcripts of a few talks and lectures I've given over the past few years, mostly concerning the connections between machine learning, language and poetry: posts.decontextualize.com

(notes and summaries in individual posts below)

allison boosted

I'll be part of these free, outdoor, computer-generated performances in NYC, Thu 7pm, along with @aparrish & others https://hbstudio.org/calendar/performing-algorithms/

allison boosted

AND I get to hire a software curation specialist at NYU as a part of this work! Taking applications now: https://apply.interfolio.com/91696

allison boosted

alright yeah shout out to @jonbro for this idea. here's the Deep Space Nine introduction made entirely using a Sketchfab browser preview of a 3d model of deep space nine (with original musical accompaniment)

allison boosted

masto meta 

one thing I hate about twitter is having zero control of your posts’ reach. it strips agency, flattens context, contributes to uncontrollable harassment in the service of “connection”. imo mastodon’s insistence on following the same value of maximization, removing any ability to foster small local community, is equally harmful. if decentralization is about control then why strip that at the individual and community level? it’s no better than twitter. glad to be on the hometown fork

allison boosted

HomeTown is a fork of Mastodon which lets you make instance-only posts, adjust character limits and read long form rich text blog posts.

It's still part of the Fediverse, and HomeTown users can interact with Mastodon users totally fine.

Tech people can get self-hosting instructions here:

https://github.com/hometown-fork/hometown

Non-tech people can use a managed hosting service to start their own HomeTown instance:

https://federation.spacebear.ee

#HomeTown #Mastodon #MastoTips #FediTips #Fediverse

a few weeks ago I made a tutorial for how to use the Hugging Face Transformers python library to generate text, focusing on the distilgpt2 model in particular. the tutorial explains subword tokenization and shows a number of simple (but effective) ways you can control the model's output: github.com/aparrish/rwet/blob/
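a minimal sketch of the kind of workflow the tutorial covers (not the tutorial's exact code; the prompt and sampling settings here are just illustrative):

# generate text with distilgpt2 via the Hugging Face Transformers library
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModelForCausalLM.from_pretrained("distilgpt2")

prompt = "It was a dark and stormy"
# subword tokenization: the prompt is split into vocabulary pieces, not words
print(tokenizer.tokenize(prompt))

# a few simple knobs for controlling the output: temperature, top_k, length
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.9,
    top_k=50,
    max_new_tokens=40,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))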

computer-generated "recipes" that I made as an example in the workshop I'm teaching. the instructions are composed of random transitive verbs plus random direct objects from Bob Brown's _The Complete Book of Cheese_ gutenberg.org/ebooks/14293

allison boosted

oh my gods. they literally have no shame about this.

GitHub Support just straight up confirmed in an email that yes, they used all public GitHub code for Codex/Copilot, regardless of license.

doesn't do so well at the inverse task, i.e., generating with the probabilities of any token containing a vowel letter OTHER than 'E' zeroed out


getting a language model to write lipograms by simply zeroing out the probability of any token in the vocabulary that has a particular letter in it (in this case, 'E')
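here's roughly what that looks like in code (a sketch only; the model choice, prompt, and sampling loop are my assumptions, not necessarily what was actually used):

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModelForCausalLM.from_pretrained("distilgpt2")

# every vocabulary item whose surface form contains an 'e' (or 'E')
banned = [i for i in range(len(tokenizer))
          if "e" in tokenizer.decode([i]).lower()]

ids = tokenizer("I am so glad that", return_tensors="pt").input_ids
with torch.no_grad():
    for _ in range(30):
        logits = model(ids).logits[0, -1]
        logits[banned] = float("-inf")   # zero probability after softmax
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, 1)
        ids = torch.cat([ids, next_id.unsqueeze(0)], dim=-1)

print(tokenizer.decode(ids[0]))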

allison boosted

#Github #Copilot gives an idea of why #Microsoft paid so much for Github. They were after data: tons of food for their AI, millions of contributors who now 'work' for MS for free.
You publish your code under GPLv3, even AGPLv3? So what? The AI learns from your code and uses it to generate code that is possibly proprietary. Does #GPL forbid this practice? (I don't think so)

That's the M$ way to break copyright law.

It's time for alternatives like @codeberg .

the university of milan has released over four hundred meows for non-commercial and research purposes zenodo.org/record/4008297 (via data-is-plural.com/)

allison boosted

Lately I've been reading a lot of children's picture books, over and over? I thought "Goodnight Moon" was pretty spooky, but I had trouble finding anyone writing about that online. @redoak jokingly suggested that I become the conspiracy theorist blogger I want to see in the world, so... I did it. Here's a totally serious take on why "Goodnight Moon" is an esoteric text, from me, a serious scholar of esotericism (aka podcast listener): https://pseudony.ms/blags/goodnight-nobody.html

logit biasing, markov chain style. here I'm doing it with phonetics—basically I check the possible outcomes for each context, and then artificially boost the probability of predictions that have certain phonetic characteristics. (in this case, more /k/ and /b/ sounds)
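in outline the approach looks something like this (a sketch only; I'm substituting a crude spelling check for the real phonetic lookup, and the corpus and boost factor are just placeholders):

import random
from collections import defaultdict

text = "the cold broth boiled and the kettle kept banging and the cook cut the cheese".split()
order = 2

# count the possible continuations for each n-gram context
chain = defaultdict(lambda: defaultdict(int))
for i in range(len(text) - order):
    chain[tuple(text[i:i + order])][text[i + order]] += 1

def bias(word, factor=5.0):
    # stand-in for a real phonetic test: boost words spelled with k, c or b
    return factor if word[0] in "kcb" else 1.0

context = tuple(text[:order])
output = list(context)
for _ in range(12):
    options = chain.get(context)
    if not options:
        break
    words = list(options)
    # artificially boost the weight of candidates with the desired trait
    weights = [count * bias(w) for w, count in options.items()]
    output.append(random.choices(words, weights=weights)[0])
    context = tuple(output[-order:])

print(" ".join(output))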


(tomorrow I'm going to see if stealing alternatives from similar ngrams helps... but I am beginning to more viscerally understand why the solution to language modeling that really caught on is just... More Training Data)


I like having this extra setting to fiddle with! but based on my limited testing, the temperature doesn't really matter once the length of the ngram hits a certain limit, since most ngrams only have one or two possible continuations. like... with word 3-grams, it's pretty difficult to distinguish 0.35 from 2.5


generating with a markov chain using softmax sampling w/temperature (a la neural networks). this is an order 3 character model, and you can really see the difference between low temperature (instantly starts repeating itself) and high temperature (draws from wacky corners of the distribution) (if you've generated text with a markov chain before, it's probably using what amounts to a temperature of 1.0)
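for reference, softmax sampling with temperature on an order-3 character markov chain can be sketched like this (the corpus, output length, and temperature value here are just placeholders):

import math
import random
from collections import defaultdict

text = "the temperature of the temperamental temperature model"
order = 3

counts = defaultdict(lambda: defaultdict(int))
for i in range(len(text) - order):
    counts[text[i:i + order]][text[i + order]] += 1

def sample(context, temperature=1.0):
    options = counts[context]
    chars = list(options)
    # softmax over log-counts: at temperature 1.0 this is ordinary
    # proportional-to-frequency markov sampling
    logits = [math.log(c) / temperature for c in options.values()]
    m = max(logits)
    weights = [math.exp(l - m) for l in logits]
    return random.choices(chars, weights=weights)[0]

out = text[:order]
for _ in range(60):
    context = out[-order:]
    if context not in counts:
        break
    out += sample(context, temperature=0.5)

print(out)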
