About A Black Story May Contain Sensitive Content
In a Q&A after a reading I gave at George Mason University in the spring of 2023, I paraphrased a conversation I had with my brother earlier that year. We were on the stoop of his home in Bed-Stuy, talking about technology, about AI and ChatGPT in particular. He uses it for coding; I have used it for writing. I don't remember, not exactly, the details of the conversation, but I said something about training GPT on all the emails and texts our mother had written. "After all," I said (and thought), "what else is it good for?" Why else have we built such a strange and challenging tool if not to return to us that which will always be taken? What better use is there?
The New York Times recently ran an article called "Using A.I. To Talk to the Dead: Some people are using artificial intelligence chatbots to create avatars of departed loved ones. It's a source of comfort for some, but it makes others a little squeamish." People have always longed for some sort of "direct line" to the dead. Along with purpose-made devices (like a séance trumpet), people have claimed to hear the voices of dead loved ones through the static of everyday radio waves, giving rise to the "ghosts in the machine." Even Thomas Edison tried to invent a "spirit phone," a means of using technology to commune with the dead. In this way technology has always been "haunted" and seen as a gateway to other worlds outside of our perception. In the spirit plane lurk our loved ones, anxiously waiting for the right technology to close the circuit, to connect. Even knowing that the psychic is likely a hoax or the Ouija board unreliable, it is still tantalizing to invest any purported otherworldly connection with a crumb of what if?
My own work fine-tuning large language models is influenced by this kind of haunting, the what if, and inquires into how these models model voices that no longer exist, voices of writers we don't often get to hear, such as Gwendolyn Brooks. (No offense, Shakespeare, but you've been dead a while and we hear you all the time.) It is perhaps wrong to say that these models model voices: some machine learning models do generate audible voices, but large language models use wizardry called deep learning to generate new text by analyzing textual data for its patterns. The text in this manuscript has been generated using the large Generative Pre-trained Transformer text-generating neural network known as GPT-3.
A large language model, or LLM, is a machine learning model that algorithmically processes, understands, and predicts language in a variety of language tasks, such as question-and-answer chatbots, machine translation, document summarization, and more. ChatGPT is a question-and-answer program that uses LLMs as its base architecture, and it can perform language tasks quite well, such as writing a book report or term paper for Rhetoric 101. These models have been pre-trained, meaning they're already trained on the task of text generation. Fine-tuning is the process of further training a pre-trained language model, like GPT, on domain-specific data so that it performs better on specific language tasks. The predictions given are more adapted to the new data set, which is usually orders of magnitude smaller than the original training set. For example, an LLM fine-tuned on all of Shakespeare should theoretically perform better on a task related to Shakespearean-style writing than the standard model. I am not a machine learning researcher, so I cannot speak to exactly how fine-tuning works or why even a small corpus of text is successful in shifting the model's tone and approach.
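For readers who want a concrete picture of that shift, here is a deliberately tiny sketch in Python. It is an analogy only, not how GPT is actually fine-tuned: real fine-tuning adjusts a neural network's weights by gradient descent, while this toy merely counts which word tends to follow which. The class name `BigramModel`, both miniature corpora, and the `weight` knob are all hypothetical, invented for illustration. What the sketch does preserve is the shape of the process: one model is trained on general text, then trained further on a much smaller specialized text, and its predictions move toward the new material.

```python
from collections import Counter, defaultdict


class BigramModel:
    """Toy next-word predictor (hypothetical): counts which word follows which."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, text, weight=1):
        """Add weighted bigram counts from `text` to the counts already learned."""
        words = text.lower().split()
        for prev, nxt in zip(words, words[1:]):
            self.counts[prev][nxt] += weight

    def predict(self, word):
        """Return the most frequent follower of `word`, or None if it was never seen."""
        followers = self.counts.get(word.lower())
        if not followers:
            return None
        return followers.most_common(1)[0][0]


# Invented stand-in for a large, general pre-training corpus.
general_corpus = "the cat sat on the mat . the cat ate the fish . the cat slept ."
# Invented stand-in for a small, domain-specific fine-tuning corpus.
domain_corpus = "the tempest roars . the tempest breaks ."

model = BigramModel()
model.train(general_corpus)           # "pre-training" on general text
before = model.predict("the")

# "Fine-tuning": the same model keeps learning, now from the small corpus.
# `weight` is an assumed knob for how strongly the new data may move the model.
model.train(domain_corpus, weight=3)
after = model.predict("the")

print(before, "->", after)
```

In this toy the prediction after "the" changes once the second round of training is applied, even though the second corpus is far smaller than the first. Why a comparably small corpus can shift a real neural model so effectively is a separate question that the sketch does not answer.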