The Future of Health Data Management: Creating a Trusted Research Environment


Increased access to health research data allows scientists and researchers to make discoveries about diseases and treatments that were previously out of reach. This data, much of it based on genomic markers, is a key element in developing drugs and diagnosing patients.

In a recent study, researchers from Stanford University set a world record by diagnosing a patient with a rare disease in five hours and two minutes. By contrast, a typical rare-disease diagnosis can take up to four years, and children usually have to wait six to eight years before being diagnosed.

Shortening the time to diagnosis is clearly a critical factor in living a longer and healthier life.

The barrier to accelerating the pace of diagnosis is that health data is often owned and accessed by a single group or organization (“silos,” in other words), and patient privacy makes data sharing difficult. To overcome this hurdle, researchers and organizations are looking at a relatively new method of managing health data: establishing Trusted Research Environments (TREs).

TRE is becoming a common acronym in the scientific and research community. In general, a TRE is a centralized computing database that securely stores data and allows users to access it for analysis. TREs can only be accessed by approved researchers, and no data ever leaves the environment. Because the data stays in place, the risk to patient confidentiality is reduced.

This is a very different approach from the traditional means by which researchers access data. Historically, researchers had to download an entire dataset to their own computer to be able to study it. Transferring and disseminating data in this manner increases the risk of security issues, even if individuals have been anonymized. Additionally, this method takes a considerable amount of time that could be better spent analyzing clinical datasets.

Why the change?

The COVID-19 pandemic revealed that the availability and standardization of clinical patient data are key to learning more about the virus and how to target it. Researchers around the world were conducting experiments, collecting clinical datasets, analyzing them and reporting their results.

Meanwhile, organizations have realized the pressing need for a new way to manage health data. Specifically, the UK Health Security Agency began collecting whole genome sequencing data from COVID-infected patients in 2020. The agency recently passed one million genomes in its database, which has led to many findings about the virus and its variants. These results were then shared with other countries for the benefit of the world.

Global impact of limited access

TREs are becoming the architectural backbone of health data in many research organizations. Although this is a step in the right direction, many TREs still cannot talk to their counterparts in other organizations, or even in other departments within their own organization.

For example, some universities have several research departments, each with its own TRE. Unfortunately, it is common for TREs that are separated only by a wall within an organization to be unable to “talk” to each other. Without this ability, it is impossible to take full advantage of a TRE.

As the genomics sector continues to grow, the ability of TREs to communicate will break down health data silos and enable researchers and scientists to collaborate effectively on diagnosing and overcoming life-threatening diseases.

This does not mean moving data. Life science datasets are too large to move efficiently and, to complicate matters, many data security regulations prohibit data from leaving an organization, state or nation. As a result, it is estimated that 80 to 90 percent of large datasets are simply not available for research.

What is needed is a move away from centralizing data in silos and toward a way of sharing data while it stays in place with the organizations that collected it in the first place. No alternative is as promising for research.
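
One way to picture analysis that travels to the data is the minimal Python sketch below. It assumes two hypothetical sites, each holding its own records, and a hypothetical `local_summary` function that runs inside each environment. Only aggregate counts and sums cross the boundary, never patient-level rows. The site names, values and function names are illustrative assumptions, not part of any real TRE platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Summary:
    """Aggregate that is allowed to leave a site (no patient-level rows)."""
    count: int
    total: float


def local_summary(values: list[float]) -> Summary:
    """Runs inside one environment; returns only a count and a sum."""
    return Summary(count=len(values), total=sum(values))


def combine(summaries: list[Summary]) -> float:
    """Runs at the coordinator; computes the pooled mean from aggregates."""
    n = sum(s.count for s in summaries)
    if n == 0:
        raise ValueError("no records at any site")
    return sum(s.total for s in summaries) / n


# Hypothetical measurements held separately by two sites.
site_a = [4.0, 6.0]
site_b = [5.0, 7.0, 8.0]

pooled_mean = combine([local_summary(site_a), local_summary(site_b)])
print(pooled_mean)  # 6.0
```

The coordinator obtains the same mean it would get from pooling all five records, yet neither site has shipped its raw data anywhere.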

What is a trusted research environment?

Organizations must consider several factors when taking on the challenge of developing a trusted research environment. The UK Health Data Research Alliance has applied the Five Safes framework, which comprises safe people, safe projects, safe settings, safe data and safe outputs, to TREs. The following is an overview of these components.

1. Safe people

Users must be approved and have appropriate credentials to access health data. They must not attempt to re-identify individuals, as this would be a breach of patient confidentiality, or give another party access via their credentials. Researchers and scientists must be properly trained in the use of the TRE platform.

2. Safe projects

Because TREs hold sensitive information, it is essential that the data is relevant to the project and used to benefit public health. To achieve this, TREs must have audits in place to ensure compliance.

3. Safe settings

Cloud technology should never let data leave the database or export results to users. Researchers should have the option of contributing their own analysis algorithms, but any tools that are loaded into the system must first pass through an “airlock.” This feature allows the tools to be inspected so that the security of the TRE is not affected. Ensuring safe settings also means that users' activity is tracked to confirm that researchers and their work are approved and appropriate.

4. Safe data

The data in the TRE must be safe and secure, so that patients are anonymized and researchers have no way to re-identify them. Data should also be cleaned and quality-checked, so that it is appropriate and relevant to the approved project. Safe data can open up new research opportunities that will benefit the general public.

5. Safe outputs

As mentioned under Safe settings, TREs must have barriers in place between the database and the researchers accessing the data. Barrier security systems (or “airlocks”) are implemented so that the system can track requests and transactions from both sides to ensure that everything is approved, safe and secure.

When TREs meet these five requirements, organizations create a fully trusted research environment.
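
The five components above can be read as a checklist applied to every request. The short Python sketch below illustrates that idea only. The `AccessRequest` record, its field names and the `failed_safes` helper are hypothetical and do not describe how any particular TRE implements the Five Safes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    """Hypothetical request record; one flag per Five Safes component."""
    user_approved: bool      # safe people
    project_approved: bool   # safe projects
    inside_tre: bool         # safe settings
    data_anonymized: bool    # safe data
    output_reviewed: bool    # safe outputs


def failed_safes(req: AccessRequest) -> list[str]:
    """Return the names of the components the request does not satisfy."""
    checks = {
        "safe people": req.user_approved,
        "safe projects": req.project_approved,
        "safe settings": req.inside_tre,
        "safe data": req.data_anonymized,
        "safe outputs": req.output_reviewed,
    }
    return [name for name, ok in checks.items() if not ok]


# A request whose results have not yet passed the output airlock.
pending = AccessRequest(True, True, True, True, output_reviewed=False)
print(failed_safes(pending))  # ['safe outputs']
```

A request is released only when `failed_safes` returns an empty list. Anything else tells the reviewer which component still needs attention.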


Genomic health data pose unique challenges for storage, management, analysis, and collaboration, due to both the scale of the datasets and the sensitivity of their content. TREs are becoming the architectural framework for bridging the health data gap so that information can be shared securely and at scale.

