New Tutorial: VGP assembly pipeline

new tutorial assembly pacbio vgp

Posted on: 14 March 2022 | PURL: https://gxy.io/GTN:N00033

We are proud to announce that, as a result of our collaboration with the Vertebrate Genomes Project (VGP), a new tutorial describing the VGP assembly pipeline is now available in the Galaxy Training Network. The Vertebrate Genomes Project aims to generate high-quality, near-error-free, gap-free, chromosome-level, haplotype-phased, annotated reference genome assemblies for every vertebrate species.

Figure 1: VGP Pipeline 2.0. The pipeline starts with the assembly of the HiFi reads into contigs, yielding the primary and alternate assemblies. Duplicated and erroneously assigned contigs are then removed with purge_dups. Finally, Bionano optical maps and Hi-C data are used to generate a scaffolded primary assembly.

The tutorial is organized in four sections: genome profiling, HiFi phased assembly, post-assembly processing, and hybrid scaffolding. During the genome profiling stage, several tools based on the analysis of k-mer frequencies are used to infer the properties of the genome. After that, a draft assembly is generated from high-accuracy long PacBio HiFi reads. In the third stage, the initial assembly is processed to identify and reassign allelic contigs. Finally, in the last stage, the assembled contigs are joined into scaffolds using two additional technologies: Bionano optical maps and Hi-C data.
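To give a feel for what "analysis of k-mer frequencies" means in the genome profiling stage, here is a minimal Python sketch. It is a toy illustration and not the tooling used in the tutorial: the function names (`canonical`, `count_kmers`, `kmer_spectrum`, `estimate_genome_size`) are hypothetical, the reads are held in memory as plain strings, and the genome size estimate uses the simple "total k-mers divided by peak multiplicity" heuristic, with a `min_multiplicity` cutoff assumed here as a crude way to ignore low-count k-mers that likely come from sequencing errors. Real data would need a dedicated k-mer counter and a proper statistical model.

```python
from collections import Counter

# Translation table used to build reverse complements (hypothetical helper).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(reads, k):
    """Count canonical k-mers across all reads, skipping k-mers with non-ACGT bases."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def kmer_spectrum(counts):
    """Map each multiplicity to the number of distinct k-mers seen that many times."""
    return Counter(counts.values())


def estimate_genome_size(spectrum, min_multiplicity=2):
    """Rough estimate: total k-mer occurrences divided by the peak multiplicity.

    Assumption: k-mers seen fewer than `min_multiplicity` times are treated
    as sequencing errors and ignored.
    """
    solid = {m: n for m, n in spectrum.items() if m >= min_multiplicity}
    if not solid:
        raise ValueError("no k-mers above the error threshold")
    peak = max(solid, key=lambda m: solid[m])  # most common multiplicity
    total = sum(m * n for m, n in solid.items())
    return total // peak


if __name__ == "__main__":
    toy_reads = ["ACGTAC"]  # made-up read, for illustration only
    spectrum = kmer_spectrum(count_kmers(toy_reads, k=3))
    print(dict(spectrum), estimate_genome_size(spectrum))
```

The position of the main peak in the k-mer spectrum approximates the sequencing coverage, which is why dividing the total number of k-mer occurrences by it gives a rough genome size. Heterozygosity and repeats add further peaks that this sketch does not model.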
