Background

Dr. Jorge Abreu-Vicente

Astrophysicist • AI/ML Researcher • Science Communicator

About

I research how to use natural language processing (NLP) and artificial intelligence to build open science tools that revolutionize the way we do and understand science.

My work spans from developing generative means of structuring biomedical data via large language models (LLMs) and knowledge graphs (KGs) to creating semantic maps of scientific knowledge.

I use these technologies to annotate and curate molecular and cell biology knowledge into data structures that are understandable by both humans and machines.

My background in astrophysics, where I studied molecular cloud structure and star formation at galactic scales, provides a unique perspective on handling complex, multi-dimensional datasets and understanding universal patterns in complex systems.

Experience

Academic Research

2022 - Present

Senior Staff: Machine Learning Developer

EMBO (European Molecular Biology Organization)

Heidelberg, Germany

Research and development of language models for biomedical data curation. Generation of knowledge graphs for molecular and cell biology. Transforming open science through AI initiatives.

2013 - 2017

Postdoctoral Researcher and PhD Student

Max Planck Institute for Astronomy

Heidelberg, Germany

World-class research on star formation and molecular cloud structure at Galactic scales.

2013

Research Associate

Instituto de Astrofísica de Andalucía

Granada, Spain

Automated data processing and analysis pipeline for the IRAM 30m telescope.

2010 - 2013

Astronomer on Duty, IRAM 30m Telescope

Instituto de Radioastronomía Milimétrica (IRAM)

Granada, Spain

Quality assurance of observational data. Spectral and image processing and analysis. Writing technical documentation and reports.

Industry

2019 - 2022

Head of Center of Excellence Data Science and Innovation

CAMELOT Group

Mannheim, Germany

Led the company-wide AI transformation. Led the collaboration with AWS. Designed the data strategy and led the implementation of Camelot Data Intelligent Digital Services. Intelligent document processing: automated data extraction from unstructured documents.

2017 - 2019

Data Scientist

Datavard AG

Heidelberg, Germany

Creation and development of an award-winning application. Evaluation and implementation of data science projects. Fast iteration and experimentation, and prototyping of complex applications.

Featured Projects

2025
BioChatter

Open-source platform for biomedical application of large language models. Published in Nature Biotechnology (2025). Democratizing AI in biomedical research through transparent, customizable conversational interfaces with RAG, knowledge graph integration, and local LLM support.

AI/ML • Biomedicine • LLMs • Open Science • Nature Biotechnology
2023
SourceData-NLP

The largest Named Entity Recognition (NER) and Named Entity Linking (NEL) dataset in biomedical sciences. Integrating AI-ready curation directly into the publishing workflow at EMBO Press. Paper accepted for publication in Bioinformatics (Oxford University Press).

NLP • Dataset • Biomedical AI • Knowledge Graphs • HuggingFace
2025 - Present
Scientific Knowledge Mapping

Creating a semantic atlas of all biomedical knowledge using novel self-supervised learning (Barlow Twins, VICReg) to map 35+ million papers beyond citations and impact factors. Building comprehensive knowledge landscapes using graph databases, knowledge graphs, and semantic embeddings.

Knowledge Graphs • Graph Databases • Semantic Embedding • Self-Supervised Learning • Science of Science
2024
Morgenrot: A Journey from Darkness to Dawn

A science-backed personal journey helping others overcome panic attacks and anxiety through lived experience. Currently seeking publisher. Combining autobiographical chapters with evidence-based techniques for recovery.

Mental Health • Book • Anxiety • Recovery • Psychology
2016
Galactic Paleontology: Unraveling the Cosmic Web

Discovery of large-scale filamentary structures forming a galactic skeleton, challenging theoretical models and revealing key insights into star formation at galactic scales. Published in Astronomy & Astrophysics.

Astrophysics • Star Formation • Filaments • Galactic Structure • Molecular Clouds
2015
Deciphering the Evolutionary Journey of Molecular Clouds

First systematic study of density distribution in molecular clouds across the Galactic plane, revealing the roles of turbulence and gravity in star formation. Published in Astronomy & Astrophysics.

Astrophysics • Star Formation • Molecular Clouds • Galactic Evolution
2017
Enhanced Calibration of Herschel and Planck Data

Innovative recalibration of Herschel and Planck telescope data achieving unparalleled precision in mapping molecular cloud temperature and density across the Galactic plane. Published in Astronomy & Astrophysics.

Astrophysics • Herschel • Planck • Calibration • Molecular Clouds

Education

2017

Ph.D. cum laude in Natural Sciences

Ruprecht-Karls-Universität Heidelberg

Heidelberg, Germany

Thesis: Molecular cloud structure at Galactic scales. Written score: 1/1. Member of the prestigious International Max Planck Research School.

2012

M.S. in Physics and Mathematics

IRAM & Universidad de Granada

Granada, Spain

Thesis: Carbono ionizado en el eje mayor de M33 (Ionized carbon along the major axis of M33). Honors in Master Thesis.

2010

B.S. in Physics

Universidad de La Laguna

La Laguna, Spain

Graduated with honors in optics.

Publications

10+ peer-reviewed publications

Refereed Publications

2025

A platform for the biomedical application of large language models

Lobentanzer, S., Feng, S., Bruderer, N., Maier, A., Wang, C., Baumbach, J., Abreu-Vicente, J., et al. Nature Biotechnology 43, 166-169

2023

The SourceData-NLP dataset: integrating curation into scientific publishing for training large language models

Abreu-Vicente, J., Sonntag, H., Eidens, T., Lemberger, T. Bioinformatics (accepted for publication)

2017

Constraining the Dust Opacity Law in Three Small and Isolated Molecular Clouds

Webb, K. A. et al. ApJ 849, 13W

2017

Resolving fragmentation of high line-mass filaments with ALMA: integral-shaped filament in Orion A

Kainulainen, Stutz, Stanke, Abreu-Vicente et al. A&A 600, A141

2016

Fourier-space combination of Planck and Herschel images

J. Abreu-Vicente et al. A&A, 604A, A65

Honors & Awards

2021

Selected for book 'Inspiraciones Nocturnas VII'

Diversidad Literaria

2018

Most Innovative Project 2018

IA4SP (Datavard AG)

2018

3rd Prize in the Innojam

SAP Campus Basel

2017

Cover Page of Astronomy and Astrophysics Journal

A&A Journal

2013-17

PhD Fellowship

International Max Planck Research School

2012

Master Thesis Honors (10/10)

Universidad de Granada

Media, Outreach & Teaching

2024 - Present
Morgenrot: A Journey from Darkness to Dawn

Science-backed book on overcoming panic attacks and anxiety. Currently seeking publisher. Website and blog documenting the journey and recovery process.

2020-21
Astronomy Podcast: La cúpula

Four episodes with Dr. Francisco Parra-Rojas.

2020
Founder of Punto Vernal

Amateur astronomy company and YouTube channel.

2017
2016

Astronomy in Elementary School

Colegio PP Somascos, A Guarda, Spain

2014

Teacher for Astronomical Lab Course

Stellar photometry - MPIA/Ruprecht-Karls-Universität Heidelberg