Machine assisted translation
The war on the borders of the archaic world may be hindering appreciation of the revolution that is changing the face of our civilization right now. The emergence of a new generation of artificial intelligence systems has triggered processes of truly tectonic scale. We will have to keep following them continuously, but let’s start by discussing the fate of copyright in the AI world.
Human history has seen only about twenty or thirty innovations that penetrated every aspect of life, changed how people think and behave, and shaped a new way of life. The ongoing and, for many, unexpected breakthrough in artificial intelligence is one of them.
Although the dream of artificial intelligence is the same age as nuclear and rocket technology, progress in this field was for a long time very modest. AI developed a reputation similar to that of nuclear fusion: whenever you ask about results, they are expected in 30 years. At the slightest success, developers tried to distance themselves from the topic of AI, saying: «we are engaged in real work, image recognition or multifactor optimization, not fantasies about thinking machines.» Even today, experts talk about neural networks merely as new methods of information processing. However, the capabilities that everyone has been testing since last year show that we are dealing with a major breakthrough. Whatever technical solutions engineers apply, the key point is that people increasingly perceive the output of these systems as being much like their own thinking. It is not the technological aspect that matters, but the anthropic one.
The juridical steampunk
The steam engine triggered a mass transition of industry to machine production. Much of what had until then required enormous labor began to be done mechanically. Artisans left without work even rose up in spontaneous revolts against the machines, but they were given a reasonable answer: machines liberate people from hard physical labor and monotonous routine, which are beneath human dignity. A shining dream was born of a future where «robots work and humans are happy,» with people engaged exclusively in creative work in which the machine could not replace them.
Around the same era, a new legal institution, copyright, was rapidly spreading around the world. It was based on the notion that creative labor is a divine spark, while reproduction is only mechanical, machine duplication. Creative work was thereby recognized as qualitatively superior to physical work, and authors were therefore held to deserve special remuneration. They were given the right to control the use of their works, above all their copying. That is where the name of the institution, copyright, comes from.
By its legal essence, copyright (like its distant relative, the patent for invention) is a privilege. It gives the author special rights at the expense of restricting the civil rights of other people. The rhetorical justification for this redistribution of freedoms was the public benefit of encouraging authors’ creative activity. Evil tongues, however, pointed out that copyright matters much more to publishers, who, by buying the author’s privilege, obtained protection from rivals, that is, a small monopoly, and there is nothing more valuable in business.
The small monopolies merged into large ones, built up legal departments and successfully lobbied parliaments to extend the term and scope of copyright protection. Authors with their long-suffering manuscripts found themselves on the weak side of deals with publishers, for whom all these masterpieces (with a few exceptions) were interchangeable. This is how dumping in the primary copyright market emerged. Only in the second half of the last century, and only in developed countries under pressure from literary agents, did the average royalty rate rise to about 10% of the retail price, but with the transition to electronic formats it began to decline again.
The 10% for God’s spark of creativity is surprisingly reminiscent of the church tithe, a tax that was long considered a kind of payment for the church’s patronage of secular activity. And 10% is also the typical efficiency of a steam engine. Copyright has approximately the same efficiency, if measured by the share of the money collected from the market that reaches the authors. The rest is spent on running the legal machinery. However, while the church tithe and the steam engine were abandoned almost everywhere, copyright has survived into the age of artificial intelligence, becoming real juridical steampunk. But now hard times have come for it.
Let’s turn to the news. A photobank is suing the developers of Midjourney, a neural network that generates pictures from text prompts, over what it considers to be the network’s training on copyrighted images. A programmer is suing Copilot, a publicly available, Microsoft-backed AI programming assistant, for possible copyright infringement. The Russian Speakers Union is demanding legislative protection of voices from use in synthetic voice-over. A writer used AI to compose and illustrate a children’s book and sell it on Amazon, but his rivals are outraged and demand that it be taken off the market because he did not write it himself and did not work hard on it at all.
Work that had previously been considered purely creative suddenly turned out to be, if not completely, then largely within the power of a machine. Yes, of course, ChatGPT will not come up with a new scientific idea, Copilot will not invent a new algorithm, and Midjourney will not create a new artistic style. For now, anyway. But they already take over an enormous share of the routine part of intellectual work, which was considered creative in the age when copyright was on the rise. It turns out that most of this God’s spark is easily struck by a machine.
It would seem that, as in the case of the mechanization of physical labor, we can only rejoice that the computer is now ready to rid us of the mass of intellectual routines, freeing up resources for more profound thinking than what AI is capable of so far. But the new Luddites resent the encroachment of machines on the sacred territory of creativity, and demand that their potential for innovation be artificially limited so that the creative efforts of intellectual artisans remain in demand by the market. As the main weapon to defend their conservatism they appeal to copyright, which has always been presented as the mechanism to support creators and innovators.
There is nothing surprising here. Copyright has always restricted the freedom of creativity and information exchange. In many countries it was initially introduced in the same legislative package as censorship. In more recent times, the path of copyright has been marked by such scandalous episodes as:
- attempt to ban VCRs (Sony v. Universal, 1983 – 1984);
- prosecutions for circumventing the intentional incompatibility of DVDs from different regions of the world (DeCSS, 1996 – 2003);
- killing of the first P2P music service Napster (2001), which greatly delayed the development of online distribution of multimedia;
- mass racketeering under the threat of copyright lawsuits for downloading pornographic movies from torrents (Prenda Law, 2012 – 2013);
- destruction of Google’s large-scale book digitization project (Open Book Alliance, 2011 – 2016);
- suppression of fan fiction, such as Paramount’s lawsuit against an amateur film set in the Star Trek universe (Axanar, 2015 – 2017).
Copyright and censorship
The last item on the list illustrates the effect of copyright that is most destructive for culture: the problem of derivative works. They have almost completely dropped out of the sphere of free creativity and passed into the category of regulated production, because their creation now starts in the legal department rather than with the author who has conceived a translation or staging, fanfic or crossover, arrangement or remake.
In a world full of brands and quotes, it becomes unsafe for an independent author to reflect reality at all. Even background music playing in a café while a video blog is being recorded can get the video removed from YouTube. And this is another class of conflicts: when there is no violation, but it is easier for a public platform to prohibit a suspicious publication than to risk legal costs. This is already very close to the logic of censorship and self-censorship familiar to residents of ideological dictatorships. Not only what is forbidden is blocked, but also anything that might seem to someone similar to what is forbidden.
To characterize the scale of the problem at least approximately, let us make a quantitative assessment. In the USSR, several thousand authors and tens of thousands of works were under censorship restrictions. Many of them were available in special repositories by special permission, and several thousand circulated in samizdat. There are no exact figures, but different researchers agree on the order of magnitude.
In the United States there are many so-called orphan books, that is, books that can neither be reprinted nor digitized because of the loss of data on the copyright holders. Most of them are available only in the largest libraries and private collections, which is to some extent comparable to the availability of special libraries and samizdat in the USSR. According to the U.S. Copyright Office (2015), the number of orphan books runs into the millions, and British researchers estimate that by 2009 there were 13 to 15 million. That’s about a third of all books ever published in the U.S.
It turns out that, quantitatively, copyright in the U.S. is associated with two orders of magnitude more restrictions on access to books than censorship was in the Soviet Union, a country with a comparable population. And this concerns only one aspect, orphan works. The examples given illustrate the archaic nature of the institution of copyright: while declaring support for creativity and innovation, in practice it suppresses them in the interests of those who managed to get the upper hand in this market of many monopolies.
The steam locomotive in a dead end
And yet copyright has not so far faced, at least not openly, tectonic innovations of the kind that change the trajectory of civilization. Artificial intelligence appears to be a worthy opponent, one with the potential, if not to stop the copyright locomotive, then at least to slow it down.
The basis of copyright is not the protection of the creative ideas themselves, but only the form of their embodiment. In the steam age the machine could only reproduce this form and the copyright allowed the creator to participate in the profits from its reproduction. [This sentence is a direct result of machine translation from Russian. — T‑i]
Let us take the previous paragraph and ask ChatGPT to paraphrase what was said, preserving the meaning, but changing the words and turns of phrase if possible. Here’s the result, with not a single letter manually changed:
Copyright law is not designed to protect creative ideas as such, but rather the tangible expression of those ideas. During the Industrial Revolution, machines could only replicate the form of the work and copyright law provided a means for creators to benefit from this replication.
A little more verbose, but quite acceptable in meaning and style, and no anti-plagiarism service will pick up on it. Yet this is a purely machine-made product, and the same can be produced from any other text.
Here is a more complicated example. Kipling’s famous poem ‘If – ’ is perfectly suited to the current turbulent times:
If you can keep your head when all about you
Are losing theirs and blaming it on you,
If you can trust yourself when all men doubt you,
But make allowance for their doubting too.
Let’s ask ChatGPT to rewrite it with different words and a different rhyme:
If you can stay composed when those around
Are losing grip and laying blame on you,
If you can keep belief in what you’ve found,
But understand when doubt clouds others’ view..
Can we ask for another version? «Sure,» the neural network replies:
When those around you are caught in fear,
And point their fingers to lay blame severe,
Trust in yourself, though others may sneer,
And grant them space, as doubt may yet appear.
Checkmate, aspiring poets. Are these poems copyrighted? To whom do they belong? Of course, if you admit to everything, as here, many will say: these are derivative works, automatic verse translations from English to English. But what if you don’t admit it? Connoisseurs, of course, will notice that the idea is repeated, but copyright does not protect ideas. And who is the author of the new versions?
A Threat to the Future
There is no generally accepted answer to this question, although it has been debated for a long time (2017). Even earlier (2011) there were disputes about copyright in works created by elephants and monkeys. In some countries, such as Germany, the law does not allow anyone other than a human to be considered the author. This is a legacy of the ancient notion of «God’s spark». In the US and Australia there is no such restriction, but case law, which has the force of precedent, denies copyright to works of machine creativity. In the UK and New Zealand, on the other hand, the law recognizes authorship by «the person by whom the arrangements necessary for the creation of the work are undertaken.»
The latter option seems, at first glance, to make the most sense. Authors have long used sophisticated technology, such as cameras, and even though shots are easy to take, you still have to choose the good ones. Even now, getting a neural network to create an interesting drawing or text may require a certain originality in the wording of the task. However, this approach conceals a danger that could undermine the whole artificial intelligence industry if things go wrong.
Firstly, the neural network itself is created and trained by someone. This means that, according to the letter of British law, its developers automatically become co-authors of all its works. This can lead to a monstrous concentration of copyrights and censorship tools in the hands of neural network developers. They already impose restrictions that prevent users from ordering, for example, phishing emails, password-cracking programs, or simply offensive or politically incorrect texts.
Secondly, training neural networks requires huge amounts of information. They use all kinds of texts and images, including copyrighted ones. Authors and copyright holders, sensing the threat of competition from neural networks, are sounding the alarm and demanding a ban on training artificial intelligence on their works. Such a demand may seem reasonable until we realize its destructive potential.
AI systems have already entered our lives far more deeply than is commonly thought. Suffice it to say that search engines such as Google, Bing, or Yandex are specialized AI systems trained on the pages available on the Internet, which are, of course, protected by copyright. If we now require search engines to obtain the prior written consent of copyright holders before indexing texts, the Internet will collapse. Machine translators have effectively removed the language barrier in business communication around the world. They, too, were trained on copyrighted parallel texts. To ban such use of copyrighted material would be to destroy the Tower of Babel once again.
Obviously, nobody is ready for such catastrophes. And if so, imposing copyright restrictions on training materials for neural networks in numerous other cases would be unjustified discrimination. All the more so because neural networks do not copy other authors in their works, which would be a violation of copyright. If they do borrow something, it is very difficult to pin down objectively, especially since no law prohibits one author from mimicking the style and manner of another, as long as there is no direct copying.
However, if AI systems continue to improve, absorbing all of humanity’s accumulated culture, it cannot be denied that many authors’ works, especially those of a utilitarian nature, will be greatly devalued. After all, anyone will be able to generate a close analogue using artificial intelligence quite inexpensively. Moreover, such a generated analogue, for example a textbook on a certain topic, can be additionally adapted to the needs of the customer, something that a product aimed at mass print runs will never provide.
End of the track?
So are we looking at the disappearance of the profession of the author as someone who lives on royalties? Partly yes, but, as usually happens in such cases, not entirely. As copyright wanes, the author’s work will come to resemble ordinary labor to some extent, where a person earns more through personal participation than through royalties on machine-made copies. This process has been going on for a long time. Artists have long made more money from selling originals than from reproduction rights, musicians’ income is shifting from selling copies to concerts, literary royalties are declining (The Authors Guild, 2018), and many writers increasingly see books as a tool for promoting a personal brand rather than a source of income.
Of course, the special experience of personal contact with creative people will always remain, but their copyrighted works are likely to occupy roughly the same place in the information market as handmade and luxury goods occupy in the housewares market. Works made with feeling and issued in small print runs with the personal participation of the author will always be in demand, but the era of mass replication of culture will give way to machine personalization. This is roughly what has already happened with the news, which we now receive through our personal social media feeds.
To summarize: copyright and artificial intelligence as cultural phenomena are in deep contradiction with each other. The success of one is poorly compatible with the success of the other. At the same time, copyright belongs to the world of the archaic, warm and cozy perhaps, but receding into the past, while the thinking crystal with artificial intelligence is exactly the case of which Joseph Brodsky wrote that «the irruption of the future into the present is felt as being a source of discomfort, if not of downright discouragement» (The UNESCO Courier, 1990, #6, p. 31).
Text: ALEXANDER SERGEEV
Alexander Sergeev 17.03.2023