Feb 28, 2023

The Centro Nacional de Análisis Genómico (CNAG-CRG) is one of the largest Genome Sequencing Centres in Europe.  

With the increasing demand for genomic tests on rare diseases, cancer and other diseases, genomic data management, analysis and interpretation are a real bottleneck in healthcare systems. The Bioinformatics Unit at CNAG-CRG designs and develops innovative platforms and solutions to analyse large amounts of genomic and clinical data, with the ultimate goal of improving the implementation of data-driven, high-quality personalised medicine in the healthcare system. Resources developed by the Unit are currently used by hundreds of clinical researchers in Europe and are part of major international genomic initiatives such as the Global Alliance for Genomics and Health (GA4GH), the European Infrastructure for life-science information (ELIXIR) and the 1+ Million Genomes Initiative (1+MG). The Unit includes bioinformaticians, engineers, software developers and biologists highly experienced in genomic data management, analysis and interpretation.

The CNAG-CRG is integrated with the Centre for Genomic Regulation (CRG), an international biomedical research institute of excellence based in Barcelona, Spain, with more than 400 scientists from 44 countries. The CRG is composed of an interdisciplinary, motivated and creative scientific team, which is supported both by a flexible and efficient administration and by high-end and innovative technologies.

In April 2021, the Centre for Genomic Regulation (CRG) received the renewal of the 'HR Excellence in Research' logo from the European Commission. This is a recognition of the Institute's commitment to developing an HR Strategy for Researchers, designed to bring the practices and procedures in line with the principles of the European Charter for Researchers and the Code of Conduct for the Recruitment of Researchers (Charter and Code).

Please check out the CRG's Recruitment Policy.


The role

We have an opening for a Data Engineer to play a key role in Instand-NGS4P. The aim is to develop a standardised Next Generation Sequencing (NGS) workflow, from NGS data analysis to medical decision-making, for common and rare adult and paediatric cancers. The workflow will leverage, among others, the current RD-Connect Genome Phenome Analysis Platform, with the aim of covering data management, clinical and genome data integration, genome analysis pipelines, variant annotation, interpretation and reporting. Under the supervision of the lead of the Data Platforms and Tools Development team and in collaboration with cancer specialists, bioinformaticians and software engineers, the successful candidate will implement the data infrastructure and back-end of the product for the cancer platform.

The successful candidate's responsibilities include:

  • - Implement pipelines in Apache Spark
  • - Integrate pipelines into Jenkins Pipeline or Nextflow workflow management systems
  • - Collaborate with back-end developers and bioinformaticians to integrate data into the cancer platform
  • - Implement and improve queries in SQL (Postgres) and NoSQL databases (Elasticsearch, MongoDB, etc.)
  • - Gather and address technical and design requirements
  • - Follow emerging technologies


About the team

The successful candidate will join the Data Platforms and Tools Development team, coordinated by Dr. Davide Piscia. The team is part of the CNAG-CRG Bioinformatics Unit, led by Dr. Sergi Beltran, which has over 30 members and offers continuous growth and support on a professional level. The team works in a stimulating scientific environment, applying state-of-the-art technologies to breakthrough research projects in Genomics that have an impact on people’s health.


Whom would we like to hire? 

Must Have

  • - A minimum of 1 year of experience in a software-related position, preferably as a Data Engineer
  • - Hands-on experience with programming languages such as Python, Scala, Java or similar
  • - Understanding of pipeline orchestration 
  • - Knowledge of distributed computing (Apache Spark, Apache Flink or similar) 
  • - Experience with source control systems such as Git


Nice to have

  • - Experience with genomics and clinical data
  • - Experience with workflow orchestrators (Jenkins Pipeline, Nextflow, Airflow, Prefect, Snakemake, etc.)
  • - Experience with databases (Postgres, Elasticsearch, Cassandra, etc.)
  • - Experience with data pipeline testing 


Education and training

  • - Bachelor's or Master's degree in Computer Science or a related field



  • - Good spoken and written English 



  • - Good organisational, prioritising, communication and interpersonal skills.


The offer 

  • - Contract duration: Open-ended, linked to the duration of a project
  • - Estimated annual gross salary: Salary is commensurate with qualifications and consistent with our pay scales.
  • - Target start date: as soon as possible


We provide a highly stimulating environment with state-of-the-art infrastructure and unique professional career development opportunities. To check out our training and development portfolio, please visit the training section.

We offer and promote a diverse and inclusive environment and welcome applicants regardless of age, disability, gender, nationality, ethnicity, religion, sexual orientation or gender identity.

The CRG is committed to reconciling the work and family life of its employees and offers an extended vacation period and the possibility of flexible working hours.

Application procedure

All applications must include:

  • - A complete CV including contact details. 
  • - A motivation letter addressed to Dr. Davide Piscia will be highly valued.

All applications must be addressed to Human Resources and be submitted through the recruitment portal at the following link:

Selection Process 

  • - Pre-selection: The pre-selection process will be merit-based, drawing on the qualifications and expertise reflected in the candidates' CVs.
  • - Interview: Preselected candidates will be interviewed by the Hiring Manager for the position and, if required, by a selection panel.
  • - Offer Letter: Once the successful candidate is identified, the Human Resources department will send a job offer specifying the start date, salary and working conditions, among other important details.


The position will be open for at least 15 days from the date of publication. After that, it will remain open until a suitable candidate is hired.

Suggestions: The CRG believes in ongoing improvement and promotes a culture of feedback. This is one of the reasons we have in place, at your disposal as a candidate, a mechanism to gather your suggestions and complaints concerning your candidate experience in our recruitment processes. Your feedback really matters to us in our aim of creating a positive candidate journey. You can make a difference and help us improve by letting us know your suggestions through the following form.