The Single Piece of AI Legislation I'd Start With
One Simple Rule
I know that creating laws is very hard — especially at a global scale.1 And figuring out what to do about the rise of AI is no exception. But I have an idea about where I'd start, one I'd love to spitball.
My opinion regarding generative AI is currently something like this:
- It’s not as useful and amazing as the salesmen claim it is. And overestimating it has its dangers.
- At the same time, it does have plenty of very useful use cases — with more to come.
- However, it being useful isn't the same as it being a net good, nor does it mean there aren't very problematic consequences that need to be dealt with. (I wrote more about this here.)
My suggestion…
… for the first rule is quite simple (in concept):
It must be easy to find out if a piece of content is part of a model’s training data or not.
I know I’m far from the first to think of this. And during the process of writing this, I was happy to learn that, among others, Dua Lipa and Paul McCartney agree with me. 🔥
I know that AI companies will say something like: “But that’s not possible with the way we’re training these models!”
To that, I say: "OK, then change it." If that worsens the models for a while, so be it. Checking whether people are pirating books on company laptops should be easy. And it should be just as easy for the New York Times to find out whether their work is being used by OpenAI.
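To make the rule concrete: the simplest version of "easy to find out" is a queryable index of content fingerprints. This is just a minimal sketch of that idea — the normalization scheme, the index, and every name here are hypothetical illustrations, not anything an AI company actually publishes today.

```python
# Hypothetical sketch: a public, queryable index of fingerprints
# for everything in a model's training set.
import hashlib

def fingerprint(text: str) -> str:
    """Normalize whitespace and case before hashing, so trivial
    formatting differences don't hide a match."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Imagine the company publishes this index alongside the model.
training_index = {fingerprint(doc) for doc in [
    "The quick brown fox jumps over the lazy dog.",
    "It was the best of times, it was the worst of times.",
]}

def was_trained_on(content: str) -> bool:
    """Answer the one question the rule demands: is this in there?"""
    return fingerprint(content) in training_index

print(was_trained_on("It was the BEST of  times, it was the worst of times."))  # True
print(was_trained_on("Call me Ishmael."))  # False
```

A real system would need fuzzier matching (excerpts, translations, reformatted copies), but even this naive version shows the ask is an engineering problem, not an impossibility.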
Whether or not training is fair use is a different discussion,2 but that can come afterward.
If the AI companies don’t want this transparency, I’d call that straight up suspicious cowardice.