AI copyright far from settled as Japan and Israel stake out early positions


What data can be processed to train AI models? Japan and Israel are staking out initial positions, but like everything else on this topic, these are still in the early stages.

Large language and image models are trained on a huge amount of data from the Internet. Much of this data is copyrighted and has not been explicitly released for AI model training.

As a result, the legal viability of such models has been under debate, especially in design and art, ever since image generators such as Stable Diffusion became widely available.

Japanese law supports generative AI

In a hearing with Japanese politician Takashi Kii in late April, Japan’s Minister of Education, Culture, Sports, Science and Technology, Keiko Nagaoka, confirmed that existing Japanese law allows the use of data collected on the Internet for both non-commercial and commercial purposes. She was responding to Kii’s question about potential copyright issues with generative AI.


While this is not an explicit endorsement of the legitimacy of large AI models trained on copyrighted data, it is a snapshot of existing Japanese law. At the hearing, Kii said he believes new copyright rules adapted to the AI era are needed, so the matter is far from resolved.

Kii also said that Japan does not yet have rules for dealing with generative AI in an educational context.

Israel’s Ministry of Justice weighs in on copyright and AI training data

A more specific position paper published by the Israeli Ministry of Justice in 2022 (via Project Disco) states that the fair use doctrine “typically” applies to AI training data from the web, and that some projects may fall under a doctrine that allows “incidental use of copyrighted material” if the copyrighted works are deleted at the end of the training process.

Excluded from this approach are systems trained specifically on the works of an individual creator in order to compete with that creator. For example, imagine an AI system trained exclusively on Harry Potter novels to generate more of them.

In addition, the statement refers only to the training and not to the output of the systems, which could infringe copyrights regardless of the training process, the Ministry of Justice notes.

