On Thursday, a team of researchers led by Microsoft announced the discovery of a biological zero-day vulnerability. The term refers to a previously unrecognized security flaw in a system designed to protect against biological threats. The system at risk screens purchases of DNA sequences to identify orders that may encode toxins or dangerous viruses. The researchers caution that this system is increasingly likely to miss an emerging threat: AI-designed toxins.
Biological threats manifest in various forms. They can include pathogens such as viruses and bacteria, as well as protein-based toxins like ricin, which was infamously sent to the White House in 2003. Additionally, chemical toxins produced via enzymatic reactions, such as those associated with red tide, pose significant risks. All these threats originate from the same biological process: DNA is transcribed into RNA, which is subsequently used to synthesize proteins.
For decades, acquiring a DNA sequence has been as simple as placing an online order with any of the numerous companies that synthesize and ship requested sequences. Recognizing the risk, governments and industry have worked together to implement a screening process for DNA orders. Each sequence is scanned for its potential to encode proteins or viruses deemed hazardous, and any positive match is flagged for human evaluation.
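The scan-and-flag step can be pictured with a toy example. The sketch below is only an illustration under stated assumptions: the function names, the watchlist entry, the k-mer size, and the threshold are all hypothetical, and real screening tools are considerably more sophisticated than a shared-substring check.

```python
# Toy screening sketch. The watchlist entry is a made-up placeholder,
# not a real sequence of concern, and the threshold is arbitrary.

def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def screen_order(order, watchlist, k=6, threshold=0.5):
    """Return (name, shared_fraction) for every watchlist entry whose
    k-mers appear in the order at or above the threshold."""
    order_kmers = kmers(order, k)
    hits = []
    for name, reference in watchlist.items():
        ref_kmers = kmers(reference, k)
        shared = len(ref_kmers & order_kmers) / len(ref_kmers)
        if shared >= threshold:
            hits.append((name, round(shared, 2)))
    return hits

WATCHLIST = {"placeholder_sequence_A": "ATGGCTAAACGTTTGACCGGA"}

# An order that embeds the placeholder sequence, and one that does not.
suspicious_order = "CCC" + WATCHLIST["placeholder_sequence_A"] + "TTT"
benign_order = "ATGAAAAAAAAAAAAAAAAAA"

print(screen_order(suspicious_order, WATCHLIST))  # [('placeholder_sequence_A', 1.0)]
print(screen_order(benign_order, WATCHLIST))      # []
```

In this sketch, anything returned by screen_order would go to a human reviewer rather than being rejected automatically, mirroring the flag-then-evaluate workflow described above.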
Over the years, both the list of flagged proteins and the sophistication of the screening algorithms have grown. Initially, screening compared DNA sequences directly, but because multiple DNA sequences can encode the same protein, the algorithms were refined to account for these variations. The new research extends this reasoning one step further: not only can multiple DNA sequences encode the same protein, but different proteins can also perform the same function. To work as a toxin, for example, a protein typically has to adopt a specific three-dimensional structure that brings critical amino acids into close proximity. Elsewhere in the protein, the sequence can often vary considerably without destroying that function.
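To see why comparing DNA alone falls short, consider a minimal sketch using the standard genetic code. The two "orders" below are short hypothetical sequences chosen for illustration (they do not encode any toxin), and the helper names are assumptions of this example rather than part of any screening tool.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

def identity(a, b):
    """Fraction of positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

# Two hypothetical orders built from synonymous codons.
order_a = "ATGCTGAGCCGTAGC"
order_b = "ATGTTATCAAGATCT"

print(translate(order_a), translate(order_b))  # MLSRS MLSRS
print(round(identity(order_a, order_b), 2))    # 0.33
```

The two sequences match at only a third of their DNA positions yet encode the identical peptide, which is why screening moved from comparing DNA to comparing the proteins it encodes. The research described here pushes the same logic up a level, from proteins with identical sequences to proteins with similar function.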
Traditionally, experimenting to determine which amino acids in a sequence could be altered while maintaining function was time-consuming and costly. However, the researchers noted that advances in AI protein design tools have made it possible to predict when distantly related sequences can fold into similar structures and catalyze the same reactions.
The research team formulated a hypothesis: AI could potentially take an existing toxin and engineer a protein with a comparable function that is sufficiently different to evade detection by existing screening programs. The initial test involved using AI tools to create variants of the toxin ricin and subsequently testing these variants against current DNA screening software.
The results indicated a concerning risk of dangerous protein variants slipping through the cracks of existing screening software, prompting the researchers to treat this situation as akin to a zero-day vulnerability in cybersecurity. They contacted relevant organizations, including the International Gene Synthesis Consortium and various U.S. governmental bodies, to discuss the potential vulnerability while keeping details confidential until further analysis could be conducted.
As part of a larger analysis, the researchers started with 72 known toxins and used three open-source AI packages to generate approximately 75,000 potential protein variants. Many of these AI-designed variants would likely be non-functional because they would not fold correctly. The researchers used two software tools to evaluate the designs based on their predicted physical structures and on differences in amino acid positioning.
When the variants were run through the screening software, the researchers found wide variation in how well the four programs flagged them as potential threats. Two programs performed well, one showed mixed results, and another let most variants pass undetected. After this evaluation, three of the software packages were updated, which drastically improved their ability to identify toxic variants.
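The kind of comparison described here, running one set of variants through several screening programs and counting what each one flags, can be sketched in a few lines. Everything in this example is hypothetical: the reference and variant strings are invented, the two "screeners" are simple identity thresholds rather than real tools, and the resulting rates are not the study's results.

```python
# Toy comparison of two hypothetical screeners on invented variants.

def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def make_screener(reference, min_identity):
    """Build a screener that flags sequences at least min_identity
    similar to the reference."""
    return lambda seq: identity(seq, reference) >= min_identity

def flag_rate(screener, variants):
    """Fraction of variants that the screener flags."""
    return sum(1 for v in variants if screener(v)) / len(variants)

REFERENCE = "MKTLLSRAGW"  # invented 10-residue placeholder
VARIANTS = [
    "MKTLLSRAGW",  # identical to the reference
    "MKTLVSRAGW",  # one substitution
    "MRTLVSKAGW",  # three substitutions
    "MRSIVTKGAW",  # heavily redesigned
]

near_exact_screener = make_screener(REFERENCE, 0.95)
tolerant_screener = make_screener(REFERENCE, 0.6)

print(flag_rate(near_exact_screener, VARIANTS))  # 0.25
print(flag_rate(tolerant_screener, VARIANTS))    # 0.75
```

Even in this toy setup, the screener that insists on near-exact matches lets most redesigned variants through, while the more tolerant one catches more of them. Neither catches the heavily redesigned sequence, which is the case a sequence-only check cannot address.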
It is important to note that the evaluation was based on predicted protein structures, so a variant that goes unflagged cannot be assumed to be inactive as a toxin. Functional proteins are expected to be rare among the variants, but the study indicates that some structurally similar proteins went unnoticed by the screening systems. In particular, approximately 1 to 3 percent of the variants that most closely resembled the original toxin were not flagged, which raises concern about the threat they could pose.
While the research suggests that the risk of undetected AI-designed toxins is relatively low, it underscores the importance of ongoing vigilance in biosurveillance. The advancements in AI protein design could lead to the emergence of entirely novel proteins that do not bear resemblance to any known threats, raising the stakes for future biological security.
While this research does not identify an immediate major threat, it is a useful prompt for those developing screening software to plan for emerging challenges. As AI protein design tools improve, so does the potential to create proteins that are both novel and hazardous, and the scientific community will need to stay ahead of that curve.