S&T at BWC RevCon

Anyone who has attended a Biological Weapons Convention (BWC) Review Conference (RevCon), or who knows the history of the BWC even from a distance, has probably acquired a new understanding of the word ‘verification’: a relatively simple word in English, but one with a complex meaning and a complex history in the context of the BWC. For someone with a mostly natural science background and only a general sense of the intricacies that come with a consensus-based convention of more than 180 countries, the complexity of the BWC can feel overwhelming, to say the least. Trying to then grasp all the layers of conversation and negotiation that took place back in the 1990s feels almost impossible. Multiple articles and opinion pieces have discussed the negotiations over a verification regime and the reasons they failed: the US seeking to protect its biotechnological advancements, other countries advocating for development assistance for the Global South, and ultimately three different groups of countries each pursuing slightly different interests. While I find these conversations interesting and insightful, I also recognize that they sit far outside my expertise.

What I could contribute, however, are insights into the scientific approaches that might be available once we are ready to make use of them. It has been suggested previously that verification (and determination of compliance) under the BWC cannot and should not be considered binary. Rather, verification in the context of the BWC should be a continuous process of data tracking, analysis, and evaluation that gauges the intent of the States Parties to comply with the Convention, while continuously adapting to new developments and to the ambiguities inherent in the dual-use nature of the modern biosciences.

One of the pillars supporting diplomatic conversations and policy proposals should be scientific: continuously evolving scientific tools adapted to the new complexities and possibilities of biotechnology, synthetic biology, and related fields.

DNA sequencing technologies keep getting cheaper, and machine learning methods keep getting better. Methodologies taking advantage of these trends are being developed for genetic engineering attribution and could become increasingly relevant to discussions of compliance and verification. Being able to determine (or suggest) the origin of an unknown biological sample with increasing accuracy could also act as a deterrent: some adversaries who might otherwise consider developing bioweapons, in the broadest sense, would be discouraged from doing so. At the same time, it has been suggested that genetic engineering and biosecurity are in an arms race; new advances in, for example, synthetic biology can complicate the attribution task.
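
As a concrete illustration, attribution can be framed as a supervised classification problem: represent a sequence by its k-mer counts and train a model to predict the lab of origin, reporting ranked probabilities rather than a verdict. The toy sequences, labels, and model below are invented for illustration and stand in for the much larger datasets and more sophisticated models used in published attribution work.

```python
# Minimal sketch of genetic engineering attribution as supervised
# learning: predict which lab a construct came from using k-mer counts.
# All data here is hypothetical toy data.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data: (sequence, lab-of-origin) pairs.
sequences = [
    "ATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGG",
    "ATGAGTAAAGGAGAAGAACTTTTCACTGGAGTT",
]
labs = ["lab_A", "lab_B"]

def kmers(seq, k=4):
    """Split a sequence into overlapping k-mers so a text
    vectorizer can count them as 'words'."""
    return " ".join(seq[i:i + k] for i in range(len(seq) - k + 1))

model = make_pipeline(
    CountVectorizer(),                    # counts k-mer occurrences
    LogisticRegression(max_iter=1000),    # simple stand-in classifier
)
model.fit([kmers(s) for s in sequences], labs)

# Attribution output is probabilistic: a ranked list of candidate
# labs with confidence scores, not a single hard answer.
unknown = kmers("ATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGG")
for lab, p in zip(model.classes_, model.predict_proba([unknown])[0]):
    print(f"{lab}: {p:.2f}")
```

The design point worth noting is the probabilistic, ranked output: in a compliance setting, attribution evidence needs to carry explicit uncertainty rather than a single hard answer.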

Developers of these technologies emphasize the importance of keeping up with such advances. For example, one tool that grew out of a project started by IARPA is ENDAR (Engineered Nucleotide Detection and Ranking). Combining several complementary algorithms, the platform helps experts determine the origin of the nucleic acids in a given sample. The version of ENDAR that I am familiar with first searches for what its developers call ‘smoking guns’ of engineering: sequences that are commonly found in engineered materials but are rare in nature, such as green fluorescent protein or antibiotic resistance markers. Engineering that is intended to stay hidden, however, will not be that obvious, so the platform also uses machine learning models to estimate how likely it is that a given sequence would appear in nature.

The most exciting part is perhaps the module in which ENDAR attempts to reconstruct how the engineering happened: it tries to identify and reverse-engineer the cut-and-paste steps that might have occurred. Provided the databases are sufficiently complete and well annotated, it can compare the assemblies against inferred references to highlight the source and the genes that were inserted or, if a genomic region was deleted, suggest which genes were removed. Finally, all of this is presented to an analyst as a rank-ordered list of regions to examine, each with a likelihood score that the region contains engineering.

This workflow also highlights several prerequisites for such platforms to be useful, above all well-curated databases against which possible engineering can be detected. That may prove difficult when dealing with rogue actors who lack legitimate laboratories and track records. The growing accessibility of synthetic DNA and synthetic organisms adds another layer of complexity, creating a need to continually update the reference panels used for sequencing and for annotating sequences. It is also sometimes suggested that the need for bioinformatics experts is declining; given the complexity of the data and the weight of any decision about possible engineering and the adversary behind it, I would argue that the need for well-trained experts is in fact increasing. Expert judgment also helps guard against false attribution, whether accidental or deliberate.
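
ENDAR's internals are not public, so as a purely illustrative sketch of the first, ‘smoking gun’ step, consider a scan that flags regions of a sample matching known engineering signatures and returns them as a rank-ordered list with scores, mirroring the analyst-facing output described above. The signature motifs, the weights, and the Hit structure are all invented for illustration; a real platform would rely on alignment tools and trained models rather than exact string matching.

```python
# Toy sketch of a "smoking gun" scan: flag regions of a sample that
# match signatures common in engineered constructs but rare in nature,
# then rank them for an analyst. Signatures and scores are invented
# placeholders, not ENDAR's actual internals.
from dataclasses import dataclass

# Hypothetical signature set: name -> (motif fragment, prior weight).
SIGNATURES = {
    "GFP fragment": ("ATGGTGAGCAAGGGCGAGGAG", 0.9),
    "KanR fragment": ("ATGAGCCATATTCAACGGGAAACG", 0.8),
}

@dataclass
class Hit:
    region: tuple      # (start, end) coordinates within the sample
    signature: str     # which signature matched
    score: float       # stand-in for a likelihood-of-engineering score

def scan(sample: str) -> list[Hit]:
    """Exact-match scan over the signature set; a real tool would use
    sequence alignment and ML models rather than string search."""
    hits = []
    for name, (motif, weight) in SIGNATURES.items():
        start = sample.find(motif)
        if start != -1:
            hits.append(Hit((start, start + len(motif)), name, weight))
    # Rank-ordered output for the analyst: highest-scoring regions first.
    return sorted(hits, key=lambda h: h.score, reverse=True)

sample = "TTACG" + "ATGGTGAGCAAGGGCGAGGAG" + "CCGTA"
for hit in scan(sample):
    print(f"region {hit.region}: {hit.signature} (score {hit.score})")
```

Even this toy version makes the dependency on reference data visible: the scan can only flag what its signature database already contains, which is exactly why well-curated databases matter so much.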

Establishing a verification mechanism for the BWC will continue to be challenging, both politically and technically. Nevertheless, scientific approaches such as these could provide a structured way to start building up this mechanism going forward.