Assumptions assumptions
Date: 2021-07-13 10:58 am (UTC)
I think the likely problematic part is the straightforward reappropriation of all copyright.
I don't think that using data for training is much of a copyright-relevant matter (and indeed, the freedom to do so...). I also don't think that, in general, trained models should necessarily have a copyrightable relation to their inputs.
The truth is that if you have a reasonably uniquely named class and prompt with
class MyElaborateCreation:
it will happily reproduce the entire code. Certainly it reproduces chunks large enough that copyright would seem to apply squarely. As such, it has all the smell of a circumvention device.
Julia Reda (like Luis Villa before her) appears to assume (based on early examples like the rsqrt trick) that reproduction will be limited to rare, small samples. I seriously doubt that, once you can put "here is the original" next to "here is the infringing other source" and the infringement is clear, "but it was generated by an AI" will work as an excuse. In fact, I believe that the most legitimate legal opinions will leave out all the AI references and simply judge copyright infringement by comparing the allegedly infringing work with the original work.
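To make the "here is the original, here is the other source" comparison a bit more concrete, here is a rough Python sketch. It only finds the longest run of identical lines shared by two texts, using difflib from the standard library. The function name longest_shared_run, the sample class body and the "completion" text are all made up for illustration, and the resulting number is in no way a legal threshold.

```python
from difflib import SequenceMatcher


def longest_shared_run(original: str, candidate: str):
    """Return (shared_lines, fraction_of_original) for the longest block of
    consecutive non-blank lines that appears in both texts.

    Lines are compared after stripping whitespace, so reindented copies
    still count as identical.
    """
    a = [ln.strip() for ln in original.splitlines() if ln.strip()]
    b = [ln.strip() for ln in candidate.splitlines() if ln.strip()]
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    shared = a[match.a:match.a + match.size]
    fraction = match.size / len(a) if a else 0.0
    return shared, fraction


# Hypothetical original source (invented for this example).
original = """class MyElaborateCreation:
    def __init__(self, seed):
        self.seed = seed

    def combine(self, other):
        return (self.seed * 31 + other.seed) % 1009
"""

# Hypothetical completion produced for the prompt "class MyElaborateCreation:".
candidate = """# generated code
class MyElaborateCreation:
    def __init__(self, seed):
        self.seed = seed

    def combine(self, other):
        return (self.seed * 31 + other.seed) % 1009

print("done")
"""

shared, fraction = longest_shared_run(original, candidate)
print(f"{len(shared)} consecutive lines shared, {fraction:.0%} of the original")
```

A long shared run covering most of the original is the situation I mean above, as opposed to the rare small snippet.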
Now, it would be interesting if someone found Copilot reproducing code they own and sent a DMCA notice over it. I am sure this will happen eventually, and then we will see.