Jun 15, 2024


  Hybrid / Office in Barcelona [Spain].


Date of Publication:




About Us:     

CNAG (Centro Nacional de Análisis Genómico)

The Centro Nacional de Análisis Genómico (CNAG) is one of the largest Genome Sequencing Centers in Europe.

The CNAG Consortium aims to carry out large-scale projects in DNA/RNA analysis for the improvement of quality of life in collaboration with the Spanish, European and International Research Community. CNAG researchers participate in major International Genome Initiatives such as the Human Cell Atlas (HCA), the International Cancer Genome Consortium (ICGC), the International Human Epigenome Consortium (IHEC), the International Rare Diseases Research Consortium (IRDiRC), the European Reference Genome Atlas (ERGA) and the European Infrastructure for life-science information (ELIXIR), as well as in several EU-funded projects.



The Role

We have an opening for a Data Engineer to play a key role in several projects related to cancer and rare diseases, such as Genomed4All and EJP-RD.

For Genomed4All we are developing a federated learning platform based on Flower and MLflow. In EJP-RD we are further developing the RD-Connect GPAP and contributing to the EJP-RD Virtual Platform of data and resources. Under the supervision of the lead of the Data Platforms and Tools Development team, and in collaboration with cancer specialists, bioinformaticians and software engineers, the successful candidate will implement the data infrastructure and back end for the federated platform and the cancer platform.



The Team

The successful candidate will join the Data Platforms and Tools Development team, coordinated by Dr. Davide Piscia. The team is part of the CNAG Bioinformatics Unit (led by Dr. Sergi Beltran), which has over 30 members and offers continuous growth and support on a professional level.

The team works in a stimulating scientific environment, applying state-of-the-art technologies to breakthrough research projects in Genomics that have an impact on people’s health.



Key Responsibilities:

1. Implement pipelines in Apache Spark

2. Integrate machine learning models into a federated learning platform

3. Integrate pipelines into workflow managers such as Jenkins Pipeline or Nextflow

4. Collaborate with back-end developers and bioinformaticians to integrate data into platforms

5. Benchmark, develop and implement services and queries on SQL (PostgreSQL) and NoSQL databases (ClickHouse, Elasticsearch, MongoDB, etc.)

6. Gather and address technical and design requirements

7. Follow emerging technologies



Requirements:





• Bachelor's or Master's degree in Computer Science or a related field

• At least 2 years of experience in a related software development position, preferably as a Data Engineer

• Hands-on experience with programming languages such as Python, Scala, Rust or similar

• Understanding of pipeline orchestration

• Knowledge of distributed computing (Apache Spark, Apache Flink or similar)

• Good organisational, prioritising, communication and interpersonal skills

• Good spoken and written English


Nice to have:


• Experience with genomics and clinical data

• Experience with federated learning frameworks (Flower, PySyft, etc.)

• Experience with workflow orchestrators (Jenkins Pipeline, Nextflow, Airflow, Prefect, Snakemake, etc.)

• Experience with databases (PostgreSQL, ClickHouse, Elasticsearch, Cassandra, etc.)

• Experience with MLOps (MLflow)

• Experience with data pipeline testing



The Offer


• Contract duration: Open-ended contract

• Estimated annual gross salary: Salary is commensurate with qualifications and consistent with our pay scales.

• Target start date: as soon as possible





• Highly stimulating environment with state-of-the-art infrastructure, a unique Professional Career Plan and development opportunities.

• We offer and promote a diverse, inclusive, collaborative and supportive environment and welcome applicants regardless of age, disability, gender, nationality, race, religion or sexual orientation.

• We are committed to helping our employees reconcile work and family life, and we offer annual leave, full health and dental insurance, a flexible schedule, and the possibility of remote work.


We look forward to receiving your application and discovering how you can contribute to CNAG's success! 



How to Apply:


All applications must include:

• A complete CV including contact details.

• Contact details of two referees.

• Cover Letter.

All applications must be addressed to People Department –

Deadline: Please submit your application by 15/06/2024

Interview: Shortlisted candidates will be invited for an interview at CNAG on 18/06/2024


See the CNAG Career site at our website: