Exploring the use of Artificial Intelligence (AI) for extracting and integrating data obtained through New Approach Methodologies (NAMs) for chemical risk assessment

Project status: Completed
Project start: Jan 2022
Project end: Jun 2023
Acronym: AI4NAM
Department: Pesticides Safety

Description and Objective

Result

Final report published as External Scientific Report: doi: 10.2903/sp.efsa.2024.EN-8567

Conclusions: The AI4NAMS project was initiated to investigate the potential of AI tools to support the search, extraction, and integration of NAM-based data in chemical risk assessment. Because acquiring NAM data from scientific publications and other sources such as chemical databases is very laborious due to the complexity and large volume of data, support from AI tools is necessary to achieve EFSA’s strategic goal “to develop and integrate new approach methodologies (NAMs) for regulatory risk assessment”. The regulatory risk assessment process is divided into four steps: hazard identification, hazard characterisation, exposure assessment, and risk characterisation. All steps in turn require the same line of action. First, evidence has to be assembled from all kinds of sources, including structured and unstructured databases. Second, evidence must be weighed according to scientific criteria, including but not limited to the assessment of the reliability of the presented data. Last, all evidence has to be integrated to reach an overall verdict on the conclusiveness, strength, and usefulness of the collected data to support decision making (e.g. for the derivation of health-based reference values). To assist in all these steps, the AI4NAMS project developed a workflow starting with data collection and ending with the integration of data. Structuring the project into three major work packages enabled a targeted evaluation of suitable AI tools with regard to their potential to support this workflow, and the derivation of recommendations on how EFSA can move towards its goal. Starting from an initial review of state-of-the-art AI tools, the most promising tools were applied in six selected case studies focusing on either specific chemicals or endpoints.
The hands-on experience gained during the case study implementation was used to update the tool review and formulate recommendations. A distinction was made between vertical recommendations, which directly address single steps of the workflow; horizontal recommendations, which cover cross-cutting challenges; and project recommendations, which bundle the former into actionable next steps. Potential for AI tool support was identified in all workflow steps. The readiness of the tools, though, varied from ready-to-use commercial software solutions to language models and code libraries that need to be adapted to the individual tasks of the workflow. In principle, many of these tasks can be supported by (semi-)automation of certain aspects, but experience from the case studies shows that subject matter experts need to be involved in all workflow steps. A good example of the added value of commercial software solutions is the initial review, where tools such as DistillerSR can reduce the literature screening effort considerably. However, better integration with the initial search has emerged as a desirable feature. Most of the subsequent workflow steps depend on the full texts of the analysed scientific publications being machine-readable. Hence, this transformation step is of central importance and should be considered in future applications of AI tools.

Particularly high potential was identified for the application of large language models to identify and extract contextualised key information and to assess the reliability of scientific publications. The use of ontologies for data harmonisation and as a look-up service to support the initial data collection is recommended as well. Tabular data extraction, which is particularly relevant for omics data or toxicity study reports, can also be supported by AI tools.
The integration and visualisation of the extracted data in AOP-like networks proved to foster the understanding of the underlying data. The further evaluation and use of large language models and ontologies is also reflected in specific horizontal recommendations emphasising the great potential of the two approaches beyond the AI4NAMS workflow. Additionally, the introduction of agile project frameworks, the adoption of concepts such as user stories and journeys, and the implementation of individual workflow steps and similar tasks as standalone microservices are recommended to exploit the potential of the AI tools in the best possible way. From a NAMs perspective, the conclusions of the project should be used to support the implementation of case studies for the development of new AOPs. Additionally, the building of reference databases for ADME data, the development of quantitative AOPs, and the search and extraction of human exposome data are relevant areas of chemical risk assessment that can benefit from at least some of the solution approaches and tools applied in the AI4NAMS workflow.

To bundle the recommendations and enable targeted implementation, six dedicated follow-up projects are recommended that build on the findings of the AI4NAMS project and contribute to achieving EFSA’s strategic goal. To support the work with text documents such as scientific publications and dossier submissions at EFSA, the implementation of an enterprise search solution is proposed. This will greatly facilitate the searching and use of information from various sources; extended to a digital workspace, it can serve as a basis for additional AI-powered services such as document classification, translation or other tasks, e.g. those of the AI4NAMS workflow. Another service that could build on the enterprise search solution is a chatbot for risk communication.
This project recommendation aims to improve transparency and knowledge transfer when communicating risks associated with the food chain.

Additionally, many of the approaches and tools applied in the individual workflow steps can be applied to support specific tasks at EFSA. One example is the processing of study reports for OECD test guideline compliant toxicity studies. The enrichment and maintenance of toxicological databases, such as the genotoxicity database which is currently updated in a joint initiative of [...], ISS and [...], can benefit from automated keyword highlighting and a (semi-)automated extraction of tabular data. Building on the experience gained with structured data sources, the harmonisation via ontologies and the transfer of data into IUCLID, a migration of CompTox data to IUCLID OHTs is proposed. This can improve the availability of mechanistic data on intermediate effects of endocrine activity, which is required due to the recent introduction of new hazard classes for endocrine disruptors in the CLP regulation.

To drive the implementation of next generation risk assessment (NGRA), the adoption of the AI4NAMS workflow in three priority areas is proposed. For the quantitative exploration of in vitro NAM data, substance-specific input parameters for physiologically based toxicokinetic (PBTK) modelling are required, which could be extracted via the workflow, supported by the development of suitable reporting formats and topical ontologies. The development of new AOPs and the refinement of existing ones is another area that could benefit from the workflow. Here, data harmonisation and integration tools appear particularly useful, and project complexity could be reduced by building on existing data collections or establishing synergies with data collection projects.
Finally, the refinement of EFSA’s cumulative assessment groups based on mechanistic similarities or differences between chemicals with common target organs is expected to present a promising area of application.

Another crucial step towards the efficient use of AI tools and methodologies and the application of “human-centric AI in close coexistence with human expertise” (PwC EU Services & Intellera Consulting, 2022) is the qualification of staff to be able to work efficiently and confidently with such tools. Therefore, the development of a data academy is recommended, with a tailored curriculum and learning paths covering fundamentals in data analysis and management as well as more advanced topics in data analysis and AI.

In conclusion, the AI4NAMS project identified a broad range of suitable AI tools and methodologies that can support regulatory risk assessment in searching and extracting as well as harmonising and integrating NAMs data. At the same time, development needs were uncovered which should be addressed in future activities at EFSA. The proposed projects can contribute both to the further development of AI tools and to the accumulation of further practical experience, while driving a change in perspective from exploration to integration that supports the everyday work at EFSA. This will ensure that EFSA can draw tangible added value from the great potential of AI.

Type of project	Third-party funded project
Research focus	Modern methods in toxicology / Alternative methods to animal testing
Organisational units and partners	Lead specialist group: Toxicology of Active Substances and their Metabolites (63); Contact person: Dr. Carsten Kneuer; External partners: Wageningen Food Safety Research, d-fine GmbH
Funding body and grant number	European Food Safety Authority (EFSA), OC/EFSA/SCER/2021/08
Publications	https://doi.org/10.2903/sp.efsa.2024.EN-8567
