How about treating the whole problem the way we treat a much more ancient intelligent device called the Programmer? We all get exposed to all kinds of code under all kinds of copyright licenses during our careers. In that respect, our experience is pretty similar to a machine learning model's, and it seems like the output should be treated similarly. If a programmer switches jobs and "copies" a significant chunk of code from their previous employer (even from memory), I suspect it would raise copyright concerns. OTOH, if they only apply general experience acquired from previous jobs, I don't see how that would be an issue. So programmers already have to think about this and make sure they don't blindly copy code they've seen before, and it seems like any ML-based code generator should do the same somehow (or risk getting its users into trouble).
What about programmers?