anyone out there working on variations of copyleft-ish licenses that include provisions for machine learning models trained on the content in question? (i'd personally like to have, e.g., MIT and CC-BY licenses that include a clause like "any model trained on this must also include the attribution")
Are there any ideas out there on what lever one could use to restrict what the machines may train on? The going legal theory seems to be that copyright isn't it and the model is not a derivative, so any license is ineffective.
@clacke (I think it's important to phrase this differently—the "machines" don't "train on" anything. people train models on particular data for particular purposes.) the "going legal theory" that models aren't derivative is obvious nonsense in my mind... the idea behind a license that explicitly requires attribution on trained models would be to change the practice of exploiting labor in this way through public shaming & at least a vague threat of legal action