Please use this identifier to cite or link to this item: http://hdl.handle.net/10637/13820
Title: Design, validation and implementation of a software tool for metabolites annotation and identification.
Author: Gil de la Fuente, Alberto.
Subjects: Metabolomics; bioinformatics; databases; web applications; LC-MS
Publication date: 3 Aug 2022
Abstract: A computational tool for metabolite annotation has been designed, implemented and validated. The tool: (1) integrates metabolites from different databases based on their unique structure; (2) provides a relational model for exploiting the data; and (3) provides a knowledge-based expert system that scores annotations in untargeted metabolomics experiments, gathering evidence that supports or refutes them based on ionization possibilities, intensity rules between different adducts, and the elution order of compounds in reversed-phase chromatography. In this way, a higher level of confidence in the annotations can be obtained, complementing tandem mass spectrometry for metabolite identification. Correct metabolite identification leads to a better interpretation of the subsequent biological analysis. This tool is the first to use both analytical and non-analytical information to support metabolite annotation, thereby increasing the level of confidence in the annotated metabolites.
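Integrating metabolites by unique structure, as point (1) describes, amounts to grouping the records of each source under a shared structural key. The sketch below is only an illustration under assumptions: it uses an InChIKey-like string as the key, and the field names (`inchikey`, `name`, `id`), the database names and the records themselves are hypothetical placeholders, not the tool's actual schema or data.

```python
from collections import defaultdict


def merge_by_structure(sources):
    """Group compound records from several databases under one structural key.

    `sources` maps a database name to a list of dicts with the
    hypothetical fields "inchikey", "name" and "id".
    """
    merged = defaultdict(lambda: {"names": set(), "xrefs": {}})
    for db_name, records in sources.items():
        for rec in records:
            entry = merged[rec["inchikey"]]
            entry["names"].add(rec["name"])       # keep every synonym seen
            entry["xrefs"][db_name] = rec["id"]   # remember where it came from
    return dict(merged)


# Toy input: keys and identifiers are placeholders, not real database entries.
sources = {
    "db_a": [
        {"inchikey": "KEY-1", "name": "compound one", "id": "A001"},
        {"inchikey": "KEY-2", "name": "compound two", "id": "A002"},
    ],
    "db_b": [
        {"inchikey": "KEY-1", "name": "Compound 1", "id": "B100"},
    ],
}

unified = merge_by_structure(sources)
```

In this toy input, `KEY-1` ends up with cross-references to both databases, while `KEY-2` is known to only one of them, which is the situation where querying a single source would miss a compound.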

Metabolomics is a subarea of systems biology devoted to the study of the small molecules (usually < 1,000 Da) produced by the metabolic processes taking place in a cell. Since the end of the last century, untargeted metabolomics has been successfully applied in domains such as biomarker discovery, therapeutic target discovery, personalized medicine, and the study of organisms and of the mechanisms of health and disease. By nature, untargeted metabolomics aims to obtain as much information as possible in order to maximize the number of detected and identified metabolites, so metabolite identification is vital to the final success of a study. The number of metabolites extracted and subsequently identified with a certain confidence level can be defined as “metabolite coverage”. Identification is the main bottleneck in metabolomic studies, since exploiting the acquired analytical information requires a great deal of work and knowledge. On the one hand, separation and detection provide valuable information that software tools can exploit automatically. On the other hand, a large number of metabolomic data sources currently offer information about the metabolites they store. Information from both the analyses and the data sources can be used to provide a higher confidence level in metabolite identification. The final goal of this thesis is the design, validation and implementation of a software tool that queries different metabolomic databases simultaneously, so that researchers can retrieve data from all of them in a single step.
This simultaneous query gives access to more data both in depth and in breadth. In depth, because researchers can access the complementary information that distinct databases store about metabolites present in more than one of them. In breadth, because many compounds are present in only a single database, so researchers who query just one source risk skipping metabolites during annotation and identification, potentially increasing the number of unknowns in the experiment. Furthermore, the tool should exploit analytical and non-analytical information to aid metabolite annotation and identification, thereby increasing the metabolite coverage of metabolomic studies and reducing the number of misidentifications that can lead to wrong biological interpretations. The first chapter reviews the available resources and data sources for metabolite identification using electrospray as the ionization technique. The information contained in those resources is often complementary, and the metabolite overlap is low. Therefore, researchers should query different resources to boost the metabolite coverage in their studies. The second chapter introduces the first version of the software tool developed in this thesis: CEU Mass Mediator (CMM). The tool implements a heuristic approach for metabolite annotation using information from MS1 and the retention time (RT) or migration time (MT) obtained in the chromatographic or electrophoretic separation. The third chapter presents the acquisition of analytical knowledge about oxidized glycerophosphocholines and the creation of a semi-automated approach for their detection and identification using the RT and information obtained in MS1 and MS2 analysis. The fourth chapter describes the updates made to CMM. New services have been gradually incorporated, such as spectral quality assessment, ontology and taxonomy information, and support for MS2 searches.
All the services in CMM are available through a REST API to facilitate automated access and communication with other software tools. Metabolites are the end products of biological systems and are responsible for their status. Correct and complete metabolite identification yields more information for the subsequent biological interpretation. Consequently, we highlight the need to combine analytical and non-analytical information to obtain a higher confidence level in metabolite identification, as well as the usefulness of software tools in helping researchers conduct their experiments successfully.
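Annotation from MS1 data, as mentioned in the abstract, generally starts by comparing an observed m/z with candidate masses under different adduct hypotheses. The following is a minimal sketch of that generic step, not the heuristic implemented in CMM: the function names, the 10 ppm default tolerance and the two-compound candidate list are assumptions chosen for illustration.

```python
# Mass shifts (Da) added to the neutral monoisotopic mass for two common
# positive-mode adducts. Only these two are considered in this sketch.
ADDUCT_SHIFTS = {"[M+H]+": 1.007276, "[M+Na]+": 22.989218}

# Toy candidate list (monoisotopic masses in Da); a real search would
# query one or more compound databases instead.
CANDIDATES = {"glucose": 180.06339, "caffeine": 194.08038}


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def annotate(observed_mz, tolerance_ppm=10.0):
    """Return (compound, adduct, ppm error) for every hypothesis in tolerance."""
    hits = []
    for name, mono_mass in CANDIDATES.items():
        for adduct, shift in ADDUCT_SHIFTS.items():
            theoretical = mono_mass + shift
            err = ppm_error(observed_mz, theoretical)
            if abs(err) <= tolerance_ppm:
                hits.append((name, adduct, round(err, 2)))
    return hits


hits = annotate(181.0707)
```

Each hit is only a putative annotation. Additional evidence of the kind the abstract describes, such as adduct relationships or retention time, would be needed to raise or lower confidence in it.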
Description: Thesis-CEINDO, Universidad San Pablo CEU, Escuela Politécnica Superior, Programa en Ciencia y Tecnología de la Salud, defended on 4 December 2019.
This work is a compendium of articles published in journals indexed in the top half of the Journal Citation Reports (JCR).
Language: en
URI: http://hdl.handle.net/10637/13820
Rights: http://creativecommons.org/licenses/by-nc-nd/4.0/deed.es
Other identifiers: 000000731215
Appears in collections: Ciencia y Tecnología de la Salud

Files in this item:
File                                       Description   Size       Format
Design_Alberto_GIl_USPCEU_Tesis_2019.pdf                 21.13 MB   Adobe PDF