Pat Bennett’s prescription is a bit more complicated than “Take a couple of aspirins and call me in the morning.” But a quartet of baby-aspirin-sized sensors implanted in her brain are aimed at addressing a condition that’s frustrated her and others: the loss of the ability to speak intelligibly. The devices transmit signals from a couple of speech-related regions in Bennett’s brain to state-of-the-art software that decodes her brain activity and converts it to text displayed on a computer screen.
Bennett, now 68, is a former human resources director and onetime equestrian who jogged daily. In 2012, she was diagnosed with amyotrophic lateral sclerosis, a progressive neurodegenerative disease that attacks neurons controlling movement, causing physical weakness and eventual paralysis.
“When you think of ALS, you think of arm and leg impact,” Bennett wrote in an interview conducted by email. “But in a group of ALS patients, it begins with speech difficulties. I am unable to speak.”
Usually, ALS first manifests at the body’s periphery—arms and legs, hands and fingers. For Bennett, the deterioration began not in her spinal cord, as is typical, but in her brain stem. She can still move around, dress herself and use her fingers to type, albeit with increasing difficulty. But she can no longer use the muscles of her lips, tongue, larynx and jaws to enunciate clearly the phonemes—or units of sound, such as “sh”—that are the building blocks of speech.
Although Bennett’s brain can still formulate directions for generating those phonemes, her muscles can’t carry out the commands.
Rather than train the decoding software to recognize whole words, the researchers built a system that decodes words from phonemes, the sub-units of speech that form spoken words in the same way that letters form written words. “Hello,” for example, contains four phonemes: “HH,” “AH,” “L” and “OW.”
With this approach, the computer needed to learn only 39 phonemes to decipher any word in English. This both improved the system’s accuracy and made it three times faster.
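To make the phoneme idea concrete, here is a minimal sketch of how a handful of phonemes can spell out several words. The three-word lexicon and the function names are hypothetical and chosen only for illustration; the researchers' actual pronunciation dictionary is not described here.

```python
# Hypothetical toy pronunciation lexicon (illustrative only).
# Each word is spelled as a sequence of phoneme labels.
LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "low": ["L", "OW"],
    "hull": ["HH", "AH", "L"],
}


def phonemes_for(sentence):
    """Spell a sentence as one flat list of phonemes."""
    out = []
    for word in sentence.lower().split():
        out.extend(LEXICON[word])
    return out


def inventory(lexicon):
    """Return the sorted set of distinct phonemes the lexicon uses."""
    return sorted({p for phones in lexicon.values() for p in phones})


if __name__ == "__main__":
    print(phonemes_for("hello"))
    print(inventory(LEXICON))
```

In this toy example, three different words are built from the same four phonemes, which is why a decoder that recognizes a small, fixed set of sounds can in principle cover a much larger vocabulary.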
On March 29, 2022, a Stanford Medicine neurosurgeon placed two tiny sensors apiece in two separate regions—both implicated in speech production—along the surface of Bennett’s brain. The sensors are components of an intracortical brain-computer interface, or iBCI. Combined with state-of-the-art decoding software, they’re designed to translate the brain activity accompanying attempts at speech into words on a screen.
About a month after the surgery, a team of Stanford scientists began twice-weekly research sessions to train the software that was interpreting her speech. After four months, Bennett’s attempted utterances were being converted into words on a computer screen at 62 words per minute—more than three times as fast as the previous record for BCI-assisted communication.
“These initial results have proven the concept, and eventually technology will catch up to make it easily accessible to people who cannot speak,” Bennett wrote. “For those who are nonverbal, this means they can stay connected to the bigger world, perhaps continue to work, maintain friends and family relationships.”
Approaching the speed of speech
Bennett’s pace begins to approach the roughly 160-word-per-minute rate of natural conversation among English speakers, said Jaimie Henderson, MD, the neurosurgeon who performed the operation.
“We’ve shown you can decode intended speech by recording activity from a very small area on the brain’s surface,” Henderson said.
Henderson, the John and Jean Blume-Robert and Ruth Halperin Professor in the department of neurosurgery, is the co-senior author of a paper describing the results, published Aug. 23 in Nature.
His co-senior author, Krishna Shenoy, Ph.D., professor of electrical engineering and of bioengineering, died before the study was published.
Frank Willett, Ph.D., a Howard Hughes Medical Institute staff scientist affiliated with the Neural Prosthetics Translational Lab, which Henderson and Shenoy co-founded in 2009, shares lead authorship of the study with graduate students Erin Kunz and Chaofei Fan.
In 2021, Henderson, Shenoy and Willett were co-authors of a study published in Nature describing their success in converting a paralyzed person’s imagined handwriting into text on a screen using an iBCI, attaining a speed of 90 characters, or 18 words, per minute—a world record until now for an iBCI-related methodology.
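The speed figures quoted in this article can be checked with a little arithmetic. The sketch below assumes five characters per word, which is what the 90-characters-to-18-words conversion implies, and assumes that the handwriting result is the "previous record" the speech result is compared against. The function names are illustrative.

```python
CHARS_PER_WORD = 5        # assumption implied by 90 characters ~ 18 words
HANDWRITING_CPM = 90      # characters per minute, 2021 handwriting study
SPEECH_WPM = 62           # words per minute, Bennett's speech decoding
CONVERSATION_WPM = 160    # rough rate of natural English conversation


def handwriting_wpm():
    """Convert the handwriting record from characters to words per minute."""
    return HANDWRITING_CPM / CHARS_PER_WORD


def speedup_over_handwriting():
    """How many times faster the speech result is than the handwriting one."""
    return SPEECH_WPM / handwriting_wpm()


def fraction_of_conversation():
    """Share of natural conversational speed reached by the speech result."""
    return SPEECH_WPM / CONVERSATION_WPM


if __name__ == "__main__":
    print(f"{handwriting_wpm():.0f} words per minute")
    print(f"{speedup_over_handwriting():.1f}x the handwriting rate")
    print(f"{fraction_of_conversation():.0%} of conversational speed")
```

Under these assumptions the speech rate works out to a bit more than three times the handwriting rate, and a little under two-fifths of conversational speed.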
In 2021, Bennett learned about Henderson and Shenoy’s work. She got in touch with Henderson and volunteered to participate in the clinical trial.
How it works
The sensors Henderson implanted in Bennett’s cerebral cortex, the brain’s outermost layer, are square arrays of tiny silicon electrodes. Each array contains 64 electrodes, arranged in an 8-by-8 grid and spaced about half the thickness of a credit card apart. The electrodes penetrate the cerebral cortex to a depth roughly equal to that of two stacked quarters.
The implanted arrays are attached to fine gold wires that exit through pedestals screwed to the skull, which are then hooked up by cable to a computer.
An artificial-intelligence algorithm receives and decodes electronic information emanating from Bennett’s brain, eventually teaching itself to tell apart the patterns of brain activity associated with her attempts to formulate each of the 39 phonemes that compose spoken English.
It feeds its best guess concerning the sequence of Bennett’s attempted phonemes into a so-called language model, essentially a sophisticated autocorrect system, which converts the streams of phonemes into the sequence of words they represent.
“This system is trained to know what words should come before other ones, and which phonemes make what words,” Willett explained. “If some phonemes were wrongly interpreted, it can still take a good guess.”
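The "sophisticated autocorrect" idea can be illustrated with a toy example. The sketch below is not the study's language model: it is a simplified stand-in that scores each candidate word by combining a made-up probability of that word following the previous one with a penalty for every phoneme that disagrees with the decoder's guess. The lexicon, the probabilities, the penalty value and all function names are hypothetical.

```python
import math

# Hypothetical toy lexicon: word -> phoneme spelling.
LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "yellow": ["Y", "EH", "L", "OW"],
    "hollow": ["HH", "AA", "L", "OW"],
}

# Made-up probabilities that a word follows the previous word.
BIGRAM = {
    ("say", "hello"): 0.6,
    ("say", "yellow"): 0.1,
    ("say", "hollow"): 0.05,
}


def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # drop a phoneme from a
                           cur[j - 1] + 1,           # add a phoneme from b
                           prev[j - 1] + (x != y)))  # match or substitute
        prev = cur
    return prev[-1]


def best_word(prev_word, decoded, mismatch_penalty=2.0):
    """Pick the word that best balances context and phoneme agreement."""
    def score(word):
        prior = BIGRAM.get((prev_word, word), 1e-3)  # small floor if unseen
        errors = edit_distance(decoded, LEXICON[word])
        return math.log(prior) - mismatch_penalty * errors
    return max(LEXICON, key=score)


if __name__ == "__main__":
    # One phoneme is decoded wrongly ("EH" instead of "AH").
    print(best_word("say", ["HH", "EH", "L", "OW"]))
```

Here the garbled sequence is one phoneme away from all three candidate words, so the context decides the outcome and the toy model settles on "hello." When the decoded phonemes match a word exactly, as with "Y," "EH," "L," "OW," the phoneme evidence outweighs the context and "yellow" wins.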