Abstract
The finance industry's success with securely sharing high volumes of sensitive data can offer a lesson in cybersecurity for the life sciences.
Scientists need to look beyond their traditional professional network to adequately address one of the most fundamental issues facing scientific research: the growing volume of results that cannot be validated or replicated. The resulting loss in confidence diminishes the impact of science across the board. It hinders progress on some of society's most pressing challenges, such as the development of new medical treatments, maintenance of food supply safety, and protection against the rise of new infectious diseases. One way that scientists are attempting to address this issue is by making at least some of their data available to others who can validate the results. In fact, this is now standard practice for most federally supported research.
Enabling scientists to test each other's results and collaborate in other innovative ways through electronically sharing data has the potential to benefit science and society as a whole, but it also brings major risks to security and privacy. To better understand and mitigate these risks, science organizations should consider lessons from an industry that's been securely sharing high volumes of sensitive data for decades—finance.
The renewed focus on data quality and reproducibility carries innumerable benefits for most research programs. One far-reaching good is that the drive for greater data access has motivated efforts to establish global standards for data format and quality. For example, notable efforts to gather all metabolomics data and associated metadata (including environmental exposures) include:
In addition, standardization of laboratory protocols is driving automation such as robot pipettors and automatic storage of digital output from laboratory instruments. These efforts are paying off handsomely: Electronic data are now routinely subject to analysis by machine learning algorithms to find patterns not readily detectable by human experts. Deep learning algorithms can already meet or exceed the accuracy of clinicians in some applications such as radiomics.
Unfortunately, embracing the digitization and sharing of laboratory data carries the same risks faced by other big data disciplines. For example, the development of electronic laboratory quality management systems linking lab instruments and electronic notebooks introduces the need for a local area network and/or cloud server to store the data, and these networks are notoriously vulnerable to hacking and other malicious interference.
The recent cyber attacks on Israeli labs working on possible vaccines for coronavirus and on Merck's production of its pediatric vaccine Gardasil highlight the high cost and impact these attacks can have on an organization's finances and reputation.
Despite these high-profile cases, many scientists are reluctant to fully embrace data security as a top priority. IT experts compare the state of most laboratories to that of current "smart homes" that emphasize ease of operation through Wi-Fi connectivity of appliances and home security systems without providing the network security necessary to prevent malicious interference.
Fortunately, this is not the first time we have faced such a challenge. Beginning in the latter half of the 20th century, the financial world underwent a similar transformation from a paper-based records system to the current, globally linked, ultra-high-speed financial market capable of completing billions of transactions each day. These transactions form the backbone of the global economy and are, therefore, among the most closely watched and secured forms of data sharing.
By analogy, most laboratory scientists remain stuck in the "paper currency" era, relying on paper records (e.g., lab notebooks) locked away from anyone outside their physical laboratory or office space. The advantages of preserving paper records are clear. For one thing, when dealing in paper, physically securing the laboratory is generally sufficient to protect the original data. For another, sharing research results via PDF while keeping the underlying data offline and inaccessible is secure in much the same way as writing a paper check in lieu of a cash transaction. Yet even these records can be at risk. What's to stop someone from copying a check on a networked printer or other device that records all images it scans without securing the images behind rigorous cybersecurity protection? And once that image is online, it's liable to be found and manipulated by malicious actors.
Fully embracing the digital data ecosystem offers many advantages analogous to high-speed financial trading:
In effect, digitizing scientific data facilitates "investment" in other studies, yielding "interest" in the form of increased validation by peers and potential collaboration to expand the impact of the data.
Yet the adoption of a digital data ecosystem introduces its own risks. Like currency, scientific data can be counterfeited or, worse, corrupted without detection. Data can be hijacked and ransomed, instrument protocols can be altered to decrease performance, and a lab's infrastructure can be weaponized against itself (e.g., Triton malware capable of disabling safety instruments and systems in chemical manufacturing plants).
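One common safeguard against undetected corruption or counterfeiting is to record a cryptographic fingerprint of each data file when it is acquired and to check it again before the data are reused. The following is a minimal Python sketch of that idea, not a description of any particular laboratory system. The key, the sample data, and the function names are hypothetical and chosen only for illustration. A plain SHA-256 digest catches accidental corruption, while a keyed HMAC tag also resists deliberate alteration by anyone who does not hold the key.

```python
import hashlib
import hmac


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def sign(data: bytes, key: bytes) -> str:
    """Return an HMAC-SHA256 tag that only holders of `key` can produce."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify(data: bytes, key: bytes, expected_tag: str) -> bool:
    """Check that the data still match the tag recorded at acquisition."""
    return hmac.compare_digest(sign(data, key), expected_tag)


# Hypothetical lab key and instrument output, for illustration only.
LAB_KEY = b"example-key-kept-off-the-shared-network"
original = b"sample_id,absorbance\nA1,0.482\nA2,0.517\n"

# Tag recorded when the instrument writes the file.
tag = sign(original, LAB_KEY)

# Simulate a silent alteration of one reading.
tampered = original.replace(b"0.482", b"0.842")

original_ok = verify(original, LAB_KEY, tag)
tampered_ok = verify(tampered, LAB_KEY, tag)
print(original_ok, tampered_ok)  # True False
```

In practice the key would need to be stored and managed separately from the shared data, which is the kind of decision a data security plan would have to settle.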
Translating these lessons from finance requires skills and perspective that most research scientists lack, and researchers seeking collaboration and data sharing can be hindered by organizational or regulatory compliance requirements as well. Overcoming these obstacles requires researchers and organizational IT experts to jointly develop data security plans at the outset of a new research study protocol. Data security plans can help mitigate network security challenges through in-depth analysis and implementation of security measures for each element of data management, including:
Now is the time for research scientists and their supporting organizations to join the digital age of data management. They can start by taking a lesson from the financial sector on how to establish stable, productive collaboration with their colleagues in cybersecurity.