You and I can never be satisfied with sitting down before a great human problem and saying nothing can be done. We must do something. That is the reason we are on Earth.
—W.E.B. Du Bois
The question of whether the human voice is a property of one’s own person has a long history within the worlds of legal discourse and literary studies. An abridged version of it looks something like this: in what sense, and in what situations, do our voices belong to us? What properties can be said to constitute the singular content of one’s own voice in the first place? And how might such questions be altered by the presence, and processing power, of large language models (LLMs)? In the contemporary arts and entertainment landscape, this larger philosophical problem appears in at least two distinct but related ways. For the purposes of this article, I will focus on its relevance to the music industry and the world of literary publishing. I aim to offer a historically informed framework through which we might imagine an ethical pathway forward for the use of AI in the arts, looking to both legal and literary studies as resources for imagining a more progressive, human-centered vision of its role in the practice of human artmaking.
Within the publishing world, the specter of automation raises its head at the level of both the visual and the auditory: that is, via the AI-generated book manuscript (a campus novel, for instance, produced in toto from a prompt offered by a human being) and the AI-generated audiobook (which can even be recited in the voice of a human author based on data gathered from various sources, online and otherwise). In the dominant framing of this issue, AI is imagined as a cheaper, more efficient option for corporations interested in the production of literary text as a saleable commodity. Here, there is no mandate to pay for studio time, or depend on the labor of audio engineers and voice actors. No need to account for an author showing up late to a recording session, or else going through multiple takes to perfect a reading. Only the faintest echo of a human element remains.
The responses from various literary communities, as well as their hired representatives, to these phenomena have, understandably, been largely critical in tone, and protective in practice (“no-AI” clauses in book contracts, and the like). For recording artists, the core issues with AI take on a similar, bifurcated form. Think here of the lyrics to popular songs performed by AI-generated vocalists—on platforms like TikTok and Instagram, you can now hear characters from SpongeBob SquarePants and South Park perform a range of Billboard Hot 100 hits—as well as lyrics written by largely unknown artists, but sung by AI-generated versions of celebrity vocalists (two of the most prominent recent cases being rather prolific AI replicants of world-renowned artists like Drake and The Weeknd). Using a range of case studies—for example, the legal battle between blues legend Bessie Smith’s estate and Columbia Records over the right to market songs after her death that she was never compensated for during her life—as well as theoretical insights from black studies, legal studies, and literary studies, “Artificial Eloquence” makes the case that AI need not be imagined solely, or primarily, as an adversary of working artists. Rather, in the spirit of the expressive traditions that AI is so often called upon to mimic in the popular music space in particular (hip-hop, R&B, and even gospel), I will assert herein that there are practical strategies through which we can reimagine the role of AI as apprentice and collaborator.
An Abridged History of the Right to One’s Own Voice in the US Case Law Context
Every sound we make is a bit of autobiography.
—Anne Carson
The sound of the voice becomes an object of legal concern in the United States most prominently after the advent of radio in the 1920s. Though there are certainly several late-nineteenth and early-twentieth century European cases (e.g., in France and Germany) wherein phonographic recordings of opera singers are the subject of dispute in the courts—which eventually led to open debate in Germany about a Recht an der eigenen Stimme: a “right to one’s own voice”—many of these were cases in which those who pirated various musical records then released individual singers’ performances into contexts to which they had not consented under the binding terms of their contracts. This influenced the content of future recording contracts in such a way that they protected against these sorts of exploitative, extractive scenarios. In contrast to this larger legal tradition of the “right to one’s own voice,” framed explicitly in those terms, the most prominent cases in the twentieth-century United States tend to emphasize a related but altogether distinct concern: the “dignity” of personhood as reflected in the human voice.1 The ensemble of cases I will highlight in this section draws in meaningful ways from both of these traditions, and foregrounds at various points the experiences and insights of African American artists in particular as a means through which to illuminate the set of philosophical issues that rise to the fore when a people group once considered, quite literally, incapable of producing something like a unique literary voice, becomes the subject of longstanding debates over the ethical concerns surrounding the reproduction of not only the sound of the human voice, but the metaphysical content it ostensibly carries.
Take, for example, the enslaved poet Phillis Wheatley who, in October 1772, was asked to sit before a panel of eighteen lawmakers and scholars in Massachusetts, each of whom was tasked with determining whether it was truly possible that she had composed the poetry published under her name. They simply could not fathom, at least at first, that she had produced such a luminous literary voice. Early reviews from the United Kingdom—where Wheatley’s debut collection, Poems on Various Subjects, Religious and Moral was published in 1773—echoed this problem: if she could compose poetry that spoke resoundingly to a complex, irreducibly human interior world, then how could she remain enslaved?2 Wheatley’s voice, calling forth from the printed page, seemed to demand another world.
The arrival of Wheatley’s collection of poetry preceded by more than a decade the now-infamous claim in Thomas Jefferson’s 1786 text, Notes on the State of Virginia, that African Americans were incapable of both “love” and “poetry,” as well as his remark, about Wheatley’s work specifically, that “the compositions published under her name [were] below the dignity of criticism.”3 That they represented a kind of advanced mimicry. Within the bounds of Jefferson’s framing, the very presence of a Black writer, a Black poet, represented a contradiction in terms. Poetry could only be produced by persons; by those with logos, understanding, imagination. From his vantage, Wheatley represented a certain kind of threat, or philosophical problem: that of a living, writing machine.4 Or even, more terrifyingly perhaps, another human consciousness he could not fully fathom, persisting in and through unthinkable conditions.
King v. Mister Maestro Inc.5
In the 1963 case, King v. Mister Maestro, Inc., the Rev. Dr. Martin Luther King, Jr., sued 20th Century Fox Record Corporation, Movietonews, and Mister Maestro for selling recordings of his “I Have a Dream” speech as spoken word LPs without his consent, or any compensation. It bears mentioning here that the speech was first recorded earlier that year at the March on Washington, and that this case was brought to court almost a decade before federal copyright protection for sound recordings became a reality in 1972, following the passage of the Sound Recording Act of 1971.6 In this era, only one sort of copyright was applicable to LP records—the one covering the textual content of the speech. Simply the words, and nothing more.7
After 1972, there would be two simultaneous copyrights—one for the text, which would have vested with King, and the sound recording copyright, which would have vested with the party who recorded the speech (in this case, we can presume, 20th Century Fox). After the passage of the Sound Recording Act, 20th Century Fox would have had to gain Dr. King’s permission to sell the recordings because he (or his estate) owned the first copyright. Another version of this argument would play out more than thirty years later, in 1999, when King’s estate sues Columbia Records (Estate of Martin Luther King v. CBS, Inc.). This later legal dispute will come about because Columbia “refused to pay royalties to the Estate” for its use of the copyright in the words of the “I Have a Dream” speech, which they used in a documentary series: 20th Century with Mike Wallace.8
In the 1963 case, the court acknowledges that Dr. King had “developed a unique literary and oratorical style” and goes on to say that “it seems unfair and unjust for defendants to use the voice and the words of Dr. King without his consent and for their own financial profit.”9 King’s voice and his words are, in a sense, inextricable: they work together under the banner of style. And it is precisely this style, this dance between written text and audible sound, that makes the recording legible as the protectable, personal property of King. Additionally, the court cites one of the leading copyright authorities to the effect that “a sine qua non of publication—that is, dedication or relinquishment to the public domain—should be the acquisition by members of the public of a possessory interest in tangible copies of the work in question.”10 Put another way, the advance typescripts of “I Have a Dream” that King gave to the press prior to the public performance of the speech in August 1963 did not meet the bar, the court decided, of “general publication.” They were tangible copies, to be sure, but they had not been given public distribution in the jurisprudential sense. As such, King did not dedicate his speech to the public domain, and 20th Century Fox was not entitled to sell copies of a spoken-word LP, derived from the sound recordings of that public recitation, without his consent.
It is important to note here that although this outcome does not protect Dr. King’s recorded rendition of “I Have a Dream” as a form of intellectual property, it nevertheless has that essential effect by virtue of the stated relationship between the sound recording procured and produced by 20th Century Fox, and King’s ownership of the copyright grounded in the words to which that specific recording corresponds. In other words, the sound of one’s own voice does not always need to be protected by law for it to be off limits in a de facto sense, or else subject to licensing fees, for third parties.
Gee v. CBS, Inc.11
Bessie Smith, popularly known as “The Empress of the Blues,” was born in 1894. She was, in no uncertain terms, the most prominent blues composer and vocalist of her time. In 1923, Smith signed with Columbia Records. As a new artist on their label, she would record and release the hit song, “Downhearted Blues,” which sold nearly 800,000 copies that year. Over the course of her career, she would go on to influence many of the blues and jazz musicians of her day—in addition to her striking stage presence, Smith’s output as a composer and vocalist was truly prolific—and well into the future. She died tragically, in a car crash in 1937.
In the case of Gee v. CBS, Inc., which took place in 1988, the plaintiffs were Bessie Smith’s heirs: her adopted son, Jack Gee, Jr., and the executor of her late husband’s estate, William D. Harris. For our purposes, I want to highlight a series of claims made by the plaintiffs. The first surrounding the reissuance in 1952 of one song in particular, “At the Christmas Ball”: a record that Columbia Records profited from, though Bessie Smith was never compensated for it in her lifetime. And the second, a claim made by the estate focusing on the reissuance of eight other songs for which, they claimed, Smith was not fully paid in the 1930s. Under the legal rubric of “misappropriation of artistic property,” these nine songs were, additionally, the subject of a motion for summary judgment by the Smith estate. Smith, it is important to note, never received a single royalty payment during her career, despite being one of the bestselling artists of that era.
In essence, the core argument of the heirs’ case was that Columbia Records—not only throughout Bessie Smith’s life, but even after her death in 1937—continuously exploited her. They infringed upon her property rights by re-recording her original songs in both the 1950s and 1970s without the permission of her estate. These were rights protected by state law, which vested in Smith decades prior, back when she first started recording the work in 1923. Importantly, the language of the court clarifies that these rights protected the actual style of singing featured on these records, and not simply those that belonged to Smith as the composer of these particular songs. Without ever providing adequate compensation, Columbia Records used her likeness for posthumous album covers, her singing style for re-recordings of old songs, and her voice to expand their wealth, all while attempting to leave those who knew her with nothing. And yet, despite those decades of exploitation and attempted theft—and even the fact that they would eventually lose this case—those who loved her found a way to keep her work, and her legacy, undeniably alive.
Midler v. Ford Motor Co.12
Midler v. Ford Motor Co. was the first case in US history to extend what is commonly known as the right of publicity to the sound of the human voice. For a formal definition of the term, we can look to Restatement (Third) of Unfair Competition, published by the American Law Institute, where it is defined as follows: “appropriat[ing] the commercial value of a person’s identity by using without consent the person’s name, likeness, or other indicia of identity for purposes of trade.”13 In the Midler case, the Ford Motor Company hired a recording artist not only to perform a cover of Bette Midler’s 1972 hit song, “Do You Want to Dance,” for a commercial, but also to impersonate her, and work as a kind of “sound-alike” for the listening audience at home (who were, clearly, supposed to believe that Midler was in fact the one singing). As in a related case—Sinatra v. Goodyear Tire & Rubber Co.—Midler did not compose the song in question, and thus did not hold a claim over the rights to the text of the song lyrics themselves.14 Still, in 1988, Midler wins the case.
We see in the language of Midler v. Ford Motor Co. that the primary matter at hand is the dignity of the artist’s voice, and its relationship to their personhood. The right of publicity is also distinct, we should note, from a sound recording copyright. Here, a claim pertaining to a recording artist’s voice can inhere in an attempt to mimic a singer’s performance of a particular song, or, otherwise, a singer or voice actor’s style of delivery more generally.15 Actionable right of personality claims working in this vein usually involve an imitation or some other nonconsensual use of a copyrighted sound recording, which is distinct from a wrongful use of the recording as such, for example, sampling a piece of recorded material without the proper clearances, or else in some other way that results in a direct objection from the artist. The most frequently quoted lines from the Midler decision are instructive here: “A voice is not copyrightable. The sounds are not ‘fixed.’ What is put forward as protectible here is more personal than any work of authorship.”16
There is a meaningful difference being articulated in this language from the court between the distinguishable, atomized, recorded performances from Midler—which, of course, are copyrightable based on existing precedent—her voice in some general sense, and the more personal essence cast as the object of central concern here. The voice the listener knows, recognizes, and associates with a human being, living or else once alive, on the other end of the recording. A person on a stage, in front of a microphone, out in the world. The song’s ultimate source. Its vessel and keeper.17 The property in question here, the reason the imitation put forward by Ford represents not only a legal but larger ethical violation, it appears, is that it constitutes a misuse of her vocal personality, the texture and particularity of the grain of her voice.18 What is being protected in this case is not any tangible property, then, but something more ephemeral. As in King v. Mister Maestro, what is at stake is the way that words on a page, or even inscribed onto the air, intersect with the human voice, which is always, in a sense, fugitive. Always with us and just beyond our grasp, giving texture and form to a style that belongs to us, is irreducibly ours, even as we give it away.
Just as the emergence of tree intelligence forever changed the planet, so the emergence of consciousness (which long predated humans) forever changed the nature of evolution. Cultural transmission is orders of magnitude faster than genetic transmission, and digital transmission has accelerated the speed of culture a hundredfold or more. We may soon seem, to our artificial intelligence offspring, as motionless and insentient as trees seem to us. And here we live, trying to make a home between our predecessors and our descendants.
—Richard Powers19
This article began as a telephone conversation after teaching one afternoon, walking down Massachusetts Avenue with my headphones on, dreaming of different ways for history and poetry to live together on the printed page. I was on the line with my literary agent, Nate, talking about a new book we had been working on together. He mentioned that he was finalizing our contract with the publisher, and that they had just added a “no-AI clause” to it earlier that week. No doubt hearing the confusion embedded in the half-beat after he uttered this phrase, he then clarified a bit. Essentially, the agency had argued for additional language in the contract to ensure that no AI software could be used to record the audiobook for this latest project in my place.
After this initial conversation with Nate, I decided to find out how other writers were managing these sorts of questions around artificial intelligence and authorship. During that search, I came across an article in The Atlantic by Alex Reisner: “These 183,000 Books Are Fueling the Biggest Fight in Publishing and Tech.”20 Embedded in the article is a search tool you can use to find out which specific books have been used as training data for Meta’s large language models. I searched for the names of a handful of writers I know, mostly friends and mentors, and then, finally, my own. And there it was. My first book of poems, The Sobbing School,21 used to train an LLM without my knowledge or permission. In that moment, I was not quite sure how to feel, exactly. But I soon realized that I needed a more robust historical frame to help me better understand, and ultimately contribute to, the conversations now taking place in my community of writers. Some way to help us navigate this new environment, where we were discovering that our work had been used in this strange and unexpected fashion.
In this vein, I offer three recommendations for policy changes that might help us to honor both the spirit of U.S. case law concerning the right to one’s own voice and the dignity of, and material debt owed to, the artists whose work has helped train the large language models that are currently in use:
Remuneration for all copyrighted work used, without consent, as training data for large language models presently utilized by the public. We know, at present, the names of hundreds of thousands of authors and artists whose work has been pirated to train AI systems. A fee for this usage should be negotiated with each involved party.
Wide-scale funding of partnership programs dedicated to engineering collaborative inroads for artists and authors interested in using not only their existing work, but their expertise, to further the development of AI systems in ethical directions. This could take the form, for instance, of research fellowships tailored to the needs of individual companies, or else community-centered initiatives that bring in local artists to share their work and insights.
Making publicly available proper citations of every piece of artwork—written, visual, auditory, and otherwise—used as training data for large language models. This will ultimately serve to create a rich, living document of collaborators, both dead and living, for those interested not only in the output provided by a given interaction with an AI system, but a history of arguments, and a sense of proper attributional practices and protocols.
How might we create models of not only compensation, but collaboration, that honor the spirit of the arguments offered not only by a constellation of American artists who have gone before us (King, Smith, et al.), but by contemporary ones, who have offered righteous critiques of the usage of their work by large tech corporations? What models might we have already, such as sampling, which, although imperfect, emphasizes three principles that I think are useful in this case: crate-digging (archival exploration), clearance (going through proper legal channels to gain usage permission), and collaboration across time and space (reaching out to another, consciousness to consciousness, with the aim of making something new)? When we sample, after all, when we riff, and cite, and call upon the names of our teachers, we assemble an ensemble of the people who shaped the art forms we hold dear, and the beautiful sounds they made. We call in their voices to lift us higher.