Workflows

These workflows accompany the tutorial “Large genome assembly and polishing”.

To use these workflows in Galaxy, either click a link to download the workflow file, or right-click the link and copy its address, then paste that address into Galaxy's workflow import form.

Assembly polishing - upgraded
Anna Syme

Last updated May 8, 2024

License: GPL-3.0-or-later
Tests: ❌ Results: Not yet automated

flowchart TD
  0["ℹ️ Input Dataset\nAssembly to be polished"];
  style 0 stroke:#2c3143,stroke-width:4px;
  1["ℹ️ Input Dataset\nlong reads"];
  style 1 stroke:#2c3143,stroke-width:4px;
  2["ℹ️ Input Parameter\nminimap setting for long reads"];
  style 2 fill:#ded,stroke:#393,stroke-width:4px;
  3["ℹ️ Input Dataset\nIllumina reads R1"];
  style 3 stroke:#2c3143,stroke-width:4px;
  4["🛠️ Subworkflow\nRacon polish with long reads, x4 - upgraded"];
  style 4 fill:#edd,stroke:#900,stroke-width:4px;
  0 -->|output| 4;
  1 -->|output| 4;
  2 -->|output| 4;
  5["Medaka polish"];
  4 -->|Assembly polished by long reads using Racon| 5;
  1 -->|output| 5;
  e3136060-bce7-4af3-87c4-9dcbb0d1f531["Output\nAssembly polished by long reads using Medaka"];
  5 --> e3136060-bce7-4af3-87c4-9dcbb0d1f531;
  style e3136060-bce7-4af3-87c4-9dcbb0d1f531 stroke:#2c3143,stroke-width:4px;
  6["Fasta statistics after Racon long read polish"];
  4 -->|Assembly polished by long reads using Racon| 6;
  7["Fasta statistics after Medaka polish"];
  5 -->|out_consensus| 7;
  8["🛠️ Subworkflow\nRacon polish with Illumina reads R1 only, x2 - upgraded"];
  style 8 fill:#edd,stroke:#900,stroke-width:4px;
  5 -->|out_consensus| 8;
  3 -->|output| 8;
  9["Fasta statistics after Racon short read polish"];
  8 -->|Assembly polished by short reads using Racon| 9;
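
The connections in the flowchart above can also be read programmatically. The following is a minimal sketch, not part of the workflow itself: it copies the edge lines of the flowchart into a string and extracts (source, target, label) triples with a regular expression. The names FLOWCHART, EDGE, parse_edges and inputs_of are introduced here for illustration only.

```python
import re

# Edge lines copied from the "Assembly polishing - upgraded" flowchart.
FLOWCHART = """
  0 -->|output| 4;
  1 -->|output| 4;
  2 -->|output| 4;
  4 -->|Assembly polished by long reads using Racon| 5;
  1 -->|output| 5;
  4 -->|Assembly polished by long reads using Racon| 6;
  5 -->|out_consensus| 7;
  5 -->|out_consensus| 8;
  3 -->|output| 8;
  8 -->|Assembly polished by short reads using Racon| 9;
"""

# Matches "SRC -->|label| DST;" as well as "SRC --> DST;" (no label).
EDGE = re.compile(r"^\s*(\S+)\s*-->\s*(?:\|([^|]*)\|)?\s*(\S+?);\s*$")


def parse_edges(text):
    """Return a list of (source, target, label) tuples, one per edge line."""
    edges = []
    for line in text.splitlines():
        match = EDGE.match(line)
        if match:
            edges.append((match.group(1), match.group(3), match.group(2)))
    return edges


def inputs_of(edges, step):
    """Return the set of step ids whose output feeds the given step."""
    return {src for src, dst, _ in edges if dst == step}


edges = parse_edges(FLOWCHART)
print(inputs_of(edges, "8"))  # steps feeding the short-read Racon subworkflow
```

Read this way, step 8 (Racon polish with Illumina reads) takes the Medaka consensus from step 5 and the R1 reads from step 3.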
	
Assembly with Flye - upgraded
Anna Syme

Last updated May 8, 2024

License: GPL-3.0-or-later
Tests: ❌ Results: Not yet automated

flowchart TD
  0["ℹ️ Input Dataset\nlong reads"];
  style 0 stroke:#2c3143,stroke-width:4px;
  1["Flye: assembly"];
  0 -->|output| 1;
  3960e31d-7a9e-400c-bb21-f6e47b75e649["Output\nFlye assembly on input dataset(s) (consensus)"];
  1 --> 3960e31d-7a9e-400c-bb21-f6e47b75e649;
  style 3960e31d-7a9e-400c-bb21-f6e47b75e649 stroke:#2c3143,stroke-width:4px;
  e524f295-a957-4c91-838c-f8e98e809b6c["Output\nFlye assembly on input dataset(s) (assembly_graph)"];
  1 --> e524f295-a957-4c91-838c-f8e98e809b6c;
  style e524f295-a957-4c91-838c-f8e98e809b6c stroke:#2c3143,stroke-width:4px;
  48b854e2-dd6e-4345-8d05-09abca6659da["Output\nFlye assembly on input dataset(s) (Graphical Fragment Assembly)"];
  1 --> 48b854e2-dd6e-4345-8d05-09abca6659da;
  style 48b854e2-dd6e-4345-8d05-09abca6659da stroke:#2c3143,stroke-width:4px;
  8672c172-71a7-432c-9679-a8e37f36cf53["Output\nFlye assembly on input dataset(s) (assembly_info)"];
  1 --> 8672c172-71a7-432c-9679-a8e37f36cf53;
  style 8672c172-71a7-432c-9679-a8e37f36cf53 stroke:#2c3143,stroke-width:4px;
  2["Fasta statistics"];
  1 -->|consensus| 2;
  3["Quast genome report"];
  1 -->|consensus| 3;
  17cdf8e0-8ad4-4570-afae-1861934fc678["Output\nQuast on input dataset(s):  HTML report"];
  3 --> 17cdf8e0-8ad4-4570-afae-1861934fc678;
  style 17cdf8e0-8ad4-4570-afae-1861934fc678 stroke:#2c3143,stroke-width:4px;
  4["Bandage image: Flye assembly"];
  1 -->|assembly_gfa| 4;
  e66bd129-146f-48dc-95b8-39a2a1ffb68d["Output\nBandage Image on input dataset(s): Assembly Graph Image"];
  4 --> e66bd129-146f-48dc-95b8-39a2a1ffb68d;
  style e66bd129-146f-48dc-95b8-39a2a1ffb68d stroke:#2c3143,stroke-width:4px;
  5["Bar chart: show contig sizes"];
  1 -->|assembly_info| 5;
  6d0a4e23-d631-4e37-8930-3d22fc91369b["Output\nBar chart showing contig sizes"];
  5 --> 6d0a4e23-d631-4e37-8930-3d22fc91369b;
  style 6d0a4e23-d631-4e37-8930-3d22fc91369b stroke:#2c3143,stroke-width:4px;
	
Assess genome quality - upgraded
Anna Syme

Last updated May 8, 2024

License: GPL-3.0-or-later
Tests: ❌ Results: Not yet automated

flowchart TD
  0["ℹ️ Input Dataset\nPolished assembly"];
  style 0 stroke:#2c3143,stroke-width:4px;
  1["ℹ️ Input Dataset\nReference genome"];
  style 1 stroke:#2c3143,stroke-width:4px;
  2["Busco: assess assembly"];
  0 -->|output| 2;
  219e4952-36e0-4b01-a407-e774b5b02dca["Output\nBusco short summary"];
  2 --> 219e4952-36e0-4b01-a407-e774b5b02dca;
  style 219e4952-36e0-4b01-a407-e774b5b02dca stroke:#2c3143,stroke-width:4px;
  3["Quast: assess assembly"];
  1 -->|output| 3;
  0 -->|output| 3;
  77ab0186-4cfd-460f-b71e-39a923414ef4["Output\nQuast on input dataset(s):  HTML report"];
  3 --> 77ab0186-4cfd-460f-b71e-39a923414ef4;
  style 77ab0186-4cfd-460f-b71e-39a923414ef4 stroke:#2c3143,stroke-width:4px;
	
Combined workflows for large genome assembly - upgraded
Anna Syme

Last updated May 8, 2024

License: GPL-3.0-or-later
Tests: ❌ Results: Not yet automated

flowchart TD
  0["ℹ️ Input Dataset\nlong reads"];
  style 0 stroke:#2c3143,stroke-width:4px;
  1["ℹ️ Input Dataset\nR1"];
  style 1 stroke:#2c3143,stroke-width:4px;
  2["ℹ️ Input Dataset\nR2"];
  style 2 stroke:#2c3143,stroke-width:4px;
  3["ℹ️ Input Parameter\nminimap settings for long reads"];
  style 3 fill:#ded,stroke:#393,stroke-width:4px;
  4["ℹ️ Input Dataset\nReference genome for Quast"];
  style 4 stroke:#2c3143,stroke-width:4px;
  5["🛠️ Subworkflow\nkmer counting - meryl - upgraded"];
  style 5 fill:#edd,stroke:#900,stroke-width:4px;
  1 -->|output| 5;
  6["🛠️ Subworkflow\nData QC - upgraded"];
  style 6 fill:#edd,stroke:#900,stroke-width:4px;
  1 -->|output| 6;
  2 -->|output| 6;
  0 -->|output| 6;
  7["🛠️ Subworkflow\nTrim and filter reads - fastp - upgraded "];
  style 7 fill:#edd,stroke:#900,stroke-width:4px;
  1 -->|output| 7;
  2 -->|output| 7;
  0 -->|output| 7;
  8["🛠️ Subworkflow\nAssembly with Flye - upgraded"];
  style 8 fill:#edd,stroke:#900,stroke-width:4px;
  7 -->|fastp filtered long reads| 8;
  9["🛠️ Subworkflow\nAssembly polishing - upgraded"];
  style 9 fill:#edd,stroke:#900,stroke-width:4px;
  8 -->|Flye assembly on input datasets consensus| 9;
  7 -->|fastp filtered R1 reads| 9;
  7 -->|fastp filtered long reads| 9;
  3 -->|output| 9;
  10["🛠️ Subworkflow\nAssess genome quality - upgraded"];
  style 10 fill:#edd,stroke:#900,stroke-width:4px;
  9 -->|Assembly polished by long reads using Medaka| 10;
  4 -->|output| 10;
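
The dependencies between the subworkflows above determine the order in which they can run. As a rough sketch (Galaxy schedules the steps itself, so this is only an illustration), the edges of the flowchart can be written as a dependency table and sorted with Python's standard graphlib module. The step names are shortened by dropping "- upgraded", and DEPENDS_ON and execution_order are names made up for this example.

```python
from graphlib import TopologicalSorter

# Each key is a step of the combined workflow; its value is the set of
# inputs or steps whose outputs it consumes (edges taken from the flowchart).
DEPENDS_ON = {
    "kmer counting - meryl": {"R1"},
    "Data QC": {"long reads", "R1", "R2"},
    "Trim and filter reads - fastp": {"long reads", "R1", "R2"},
    "Assembly with Flye": {"Trim and filter reads - fastp"},
    "Assembly polishing": {
        "Assembly with Flye",
        "Trim and filter reads - fastp",
        "minimap settings for long reads",
    },
    "Assess genome quality": {"Assembly polishing", "Reference genome for Quast"},
}


def execution_order(depends_on):
    """Return one valid run order in which every step follows its inputs."""
    return list(TopologicalSorter(depends_on).static_order())


order = execution_order(DEPENDS_ON)
for name in order:
    print(name)
```

Several orders are valid: kmer counting and Data QC depend only on the raw inputs, so they can run at any point alongside the trim, assemble, polish, assess chain.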
	
Data QC - upgraded
Anna Syme

Last updated May 8, 2024

License: GPL-3.0-or-later
Tests: ❌ Results: Not yet automated

flowchart TD
  0["ℹ️ Input Dataset\nInput file: long reads"];
  style 0 stroke:#2c3143,stroke-width:4px;
  1["ℹ️ Input Dataset\nInput file: Illumina reads R1"];
  style 1 stroke:#2c3143,stroke-width:4px;
  2["ℹ️ Input Dataset\nInput file: Illumina reads R2"];
  style 2 stroke:#2c3143,stroke-width:4px;
  3["Nanoplot: long reads"];
  0 -->|output| 3;
  73d0e4cf-366e-41c1-810a-b269638826b3["Output\nNanoPlot on input dataset(s): HTML report"];
  3 --> 73d0e4cf-366e-41c1-810a-b269638826b3;
  style 73d0e4cf-366e-41c1-810a-b269638826b3 stroke:#2c3143,stroke-width:4px;
  4["FastQC on R1"];
  1 -->|output| 4;
  5["FastQC on R2"];
  2 -->|output| 5;
  6["MultiQC: combine fastQC reports"];
  4 -->|text_file| 6;
  5 -->|text_file| 6;
  8baf8700-876e-4a74-ad99-6e656f3ba618["Output\nMultiQC on input dataset(s): Webpage"];
  6 --> 8baf8700-876e-4a74-ad99-6e656f3ba618;
  style 8baf8700-876e-4a74-ad99-6e656f3ba618 stroke:#2c3143,stroke-width:4px;
	
Racon polish with Illumina reads (R1 only), x2 - upgraded
Anna Syme

Last updated May 8, 2024

License: GPL-3.0-or-later
Tests: ❌ Results: Not yet automated

flowchart TD
  0["ℹ️ Input Dataset\nAssembly to be polished"];
  style 0 stroke:#2c3143,stroke-width:4px;
  1["ℹ️ Input Dataset\nIllumina reads, R1, in fastq.gz format"];
  style 1 stroke:#2c3143,stroke-width:4px;
  2["Minimap2 round 1: map reads to assembly"];
  1 -->|output| 2;
  0 -->|output| 2;
  3["Racon round 1: polish assembly"];
  0 -->|output| 3;
  2 -->|alignment_output| 3;
  1 -->|output| 3;
  4["Minimap2 round 2: map reads to assembly"];
  1 -->|output| 4;
  3 -->|consensus| 4;
  5["Racon round 2: polish assembly"];
  3 -->|consensus| 5;
  4 -->|alignment_output| 5;
  1 -->|output| 5;
  594819c3-668e-4575-b9a6-4459ffacf952["Output\nAssembly polished by short reads using Racon"];
  5 --> 594819c3-668e-4575-b9a6-4459ffacf952;
  style 594819c3-668e-4575-b9a6-4459ffacf952 stroke:#2c3143,stroke-width:4px;
	
Racon polish with long reads, x4 - upgraded
Anna Syme

Last updated May 8, 2024

License: GPL-3.0-or-later
Tests: ❌ Results: Not yet automated

flowchart TD
  0["ℹ️ Input Dataset\nAssembly to be polished"];
  style 0 stroke:#2c3143,stroke-width:4px;
  1["ℹ️ Input Dataset\nlong reads"];
  style 1 stroke:#2c3143,stroke-width:4px;
  2["ℹ️ Input Parameter\nminimap setting for long reads "];
  style 2 fill:#ded,stroke:#393,stroke-width:4px;
  3["Minimap2: map long reads to assembly"];
  2 -->|output| 3;
  1 -->|output| 3;
  0 -->|output| 3;
  4["Racon: polish 1"];
  0 -->|output| 4;
  3 -->|alignment_output| 4;
  1 -->|output| 4;
  5["Minimap2: map long reads to polished assembly 1"];
  2 -->|output| 5;
  1 -->|output| 5;
  4 -->|consensus| 5;
  6["Racon: polish 2"];
  4 -->|consensus| 6;
  5 -->|alignment_output| 6;
  1 -->|output| 6;
  7["Minimap2: map long reads to polished assembly 2"];
  2 -->|output| 7;
  1 -->|output| 7;
  6 -->|consensus| 7;
  8["Racon: polish 3"];
  6 -->|consensus| 8;
  7 -->|alignment_output| 8;
  1 -->|output| 8;
  9["Minimap2: map long reads to polished assembly 3"];
  2 -->|output| 9;
  1 -->|output| 9;
  8 -->|consensus| 9;
  10["Racon: polish 4"];
  8 -->|consensus| 10;
  9 -->|alignment_output| 10;
  1 -->|output| 10;
  bcf0f03c-5951-46a7-aa38-545aed9bc183["Output\nAssembly polished by long reads using Racon"];
  10 --> bcf0f03c-5951-46a7-aa38-545aed9bc183;
  style bcf0f03c-5951-46a7-aa38-545aed9bc183 stroke:#2c3143,stroke-width:4px;
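
The flowchart above repeats the same two steps four times: map the long reads to the current assembly, then polish that assembly with Racon using the new alignment. The sketch below shows only this loop structure. The map_reads and racon arguments are hypothetical callables supplied by the caller, not real tool wrappers, and the stand-in functions only build strings so that the data flow is visible.

```python
def polish_with_racon(assembly, reads, map_reads, racon, rounds=4):
    """Run `rounds` map-then-polish cycles, feeding each consensus forward."""
    current = assembly
    for _ in range(rounds):
        # Reads are always mapped to the most recent version of the assembly.
        alignment = map_reads(reads, current)
        # Racon takes the reads, the alignment and the assembly it refers to.
        current = racon(reads, alignment, current)
    return current


# Stand-in functions (assumptions for illustration, not real tools).
def fake_map(reads, assembly):
    return f"aln({reads}->{assembly})"


def fake_racon(reads, alignment, assembly):
    return assembly + "+polish"


result = polish_with_racon("flye_assembly", "long_reads", fake_map, fake_racon)
print(result)
```

With rounds=2 the same loop has the shape of the Illumina R1 polishing workflow, which runs two Minimap2 and Racon rounds.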
	
Trim and filter reads - fastp - upgraded
Anna Syme

Last updated May 8, 2024

License: GPL-3.0-or-later
Tests: ❌ Results: Not yet automated

flowchart TD
  0["ℹ️ Input Dataset\nIllumina reads R1"];
  style 0 stroke:#2c3143,stroke-width:4px;
  1["ℹ️ Input Dataset\nIllumina reads R2"];
  style 1 stroke:#2c3143,stroke-width:4px;
  2["ℹ️ Input Dataset\nlong reads"];
  style 2 stroke:#2c3143,stroke-width:4px;
  3["fastp on short reads"];
  0 -->|output| 3;
  1 -->|output| 3;
  656e4138-41ab-4561-8989-33de9ac9a2f3["Output\nfastp report on short reads html"];
  3 --> 656e4138-41ab-4561-8989-33de9ac9a2f3;
  style 656e4138-41ab-4561-8989-33de9ac9a2f3 stroke:#2c3143,stroke-width:4px;
  0d53f347-0368-47cb-953a-2e4dac57e013["Output\nfastp filtered R1 reads"];
  3 --> 0d53f347-0368-47cb-953a-2e4dac57e013;
  style 0d53f347-0368-47cb-953a-2e4dac57e013 stroke:#2c3143,stroke-width:4px;
  10fbe1e5-400c-4ffe-8794-9b776b0d7322["Output\nfastp report on short reads json"];
  3 --> 10fbe1e5-400c-4ffe-8794-9b776b0d7322;
  style 10fbe1e5-400c-4ffe-8794-9b776b0d7322 stroke:#2c3143,stroke-width:4px;
  639ed3f7-0e51-4e5d-b6f8-081378962109["Output\nfastp filtered R2 reads"];
  3 --> 639ed3f7-0e51-4e5d-b6f8-081378962109;
  style 639ed3f7-0e51-4e5d-b6f8-081378962109 stroke:#2c3143,stroke-width:4px;
  4["fastp on long reads"];
  2 -->|output| 4;
  5e0d2c3d-41a4-4823-ae9c-b1e4d2826541["Output\nfastp report on long reads html"];
  4 --> 5e0d2c3d-41a4-4823-ae9c-b1e4d2826541;
  style 5e0d2c3d-41a4-4823-ae9c-b1e4d2826541 stroke:#2c3143,stroke-width:4px;
  e6018ad6-86f4-4e78-8cf2-ccc8b97022fe["Output\nfastp filtered long reads"];
  4 --> e6018ad6-86f4-4e78-8cf2-ccc8b97022fe;
  style e6018ad6-86f4-4e78-8cf2-ccc8b97022fe stroke:#2c3143,stroke-width:4px;
  69f8383b-a1be-4a74-95f4-3dba35e01426["Output\nfastp report on long reads json"];
  4 --> 69f8383b-a1be-4a74-95f4-3dba35e01426;
  style 69f8383b-a1be-4a74-95f4-3dba35e01426 stroke:#2c3143,stroke-width:4px;
	
kmer counting - meryl - upgraded
Anna Syme

Last updated May 8, 2024

License: GPL-3.0-or-later
Tests: ❌ Results: Not yet automated

flowchart TD
  0["ℹ️ Input Dataset\nIllumina reads R1"];
  style 0 stroke:#2c3143,stroke-width:4px;
  1["Meryl - count kmers"];
  0 -->|output| 1;
  899ddd93-4c0f-4f81-a973-8120494ed983["Output\nMeryl on input dataset(s): read-db.meryldb"];
  1 --> 899ddd93-4c0f-4f81-a973-8120494ed983;
  style 899ddd93-4c0f-4f81-a973-8120494ed983 stroke:#2c3143,stroke-width:4px;
  2["Meryl - generate histogram"];
  1 -->|read_db| 2;
  3["Genomescope"];
  2 -->|read_db_hist| 3;
  efc727b6-1ef4-4c4c-8cce-35c7d3cc8aac["Output\nGenomeScope on input dataset(s) Transformed log plot"];
  3 --> efc727b6-1ef4-4c4c-8cce-35c7d3cc8aac;
  style efc727b6-1ef4-4c4c-8cce-35c7d3cc8aac stroke:#2c3143,stroke-width:4px;
  701df341-5767-44bc-ade2-6af498ab7467["Output\nGenomeScope on input dataset(s) Transformed linear plot"];
  3 --> 701df341-5767-44bc-ade2-6af498ab7467;
  style 701df341-5767-44bc-ade2-6af498ab7467 stroke:#2c3143,stroke-width:4px;
  85fa4004-b351-47b3-84aa-6788d338037a["Output\nGenomeScope on input dataset(s) Log plot"];
  3 --> 85fa4004-b351-47b3-84aa-6788d338037a;
  style 85fa4004-b351-47b3-84aa-6788d338037a stroke:#2c3143,stroke-width:4px;
  c71ce055-98f0-4354-9397-2f8833b90cc4["Output\nGenomeScope on input dataset(s) Linear plot"];
  3 --> c71ce055-98f0-4354-9397-2f8833b90cc4;
  style c71ce055-98f0-4354-9397-2f8833b90cc4 stroke:#2c3143,stroke-width:4px;
	

Importing into Galaxy

Below are the instructions for importing these workflows directly into your Galaxy server of choice to start using them!
Hands-on: Importing a workflow
  • Click on Workflow on the top menu bar of Galaxy. You will see a list of all your workflows.
  • Click on Import at the top-right of the screen
  • Provide your workflow
    • Option 1: Paste the URL of the workflow into the box labelled “Archived Workflow URL”
    • Option 2: Upload the workflow file in the box labelled “Archived Workflow File”
  • Click the Import workflow button
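
Option 1 can in principle be scripted as well. The sketch below builds, but does not send, an HTTP request for a Galaxy server, using only the Python standard library. It rests on assumptions that should be checked against the API documentation of your server: that workflows are imported with a POST to /api/workflows, that the workflow URL is passed in a field named archive_source, and that the API key is sent in an x-api-key header. The server address, API key and workflow URL are placeholders.

```python
import json
from urllib.request import Request


def build_import_request(galaxy_url, api_key, workflow_url):
    """Build (but do not send) a request asking Galaxy to import a workflow by URL."""
    # Assumed payload field: "archive_source" holds the workflow URL.
    payload = json.dumps({"archive_source": workflow_url}).encode("utf-8")
    return Request(
        galaxy_url.rstrip("/") + "/api/workflows",  # assumed endpoint
        data=payload,
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


# Placeholder values, to be replaced with a real server, key and workflow URL.
req = build_import_request(
    "https://galaxy.example.org/",
    "YOUR_API_KEY",
    "https://example.org/workflow.ga",
)
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen would send the request. Nothing is sent here, so the example can be run without a Galaxy account.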

Below is a short video demonstrating how to import a workflow from GitHub using this procedure:

Video: Importing a workflow from URL