Bioinformatics with Python Cookbook

By: Tiago R Antao, Tiago Antao
Overview of this book

If you have intermediate-level knowledge of Python and are well aware of the main research and vocabulary in your bioinformatics topic of interest, this book will help you develop your knowledge further.

Analyzing data in the variant call format

After running a genotype caller (for example, GATK or samtools), you will have a variant call format (VCF) file reporting genomic variations such as single-nucleotide polymorphisms (SNPs), insertions/deletions (INDELs), copy number variations (CNVs), and so on. In this recipe, we will discuss VCF processing with the PyVCF module.
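
To make the VCF record layout concrete before turning to PyVCF, here is a minimal sketch that uses only the Python standard library, not PyVCF itself. The embedded VCF fragment is made up for illustration, and the helper names `classify` and `count_variant_types` are hypothetical. The classification rule is a simplification: it looks only at allele lengths and does not handle symbolic alleles such as `<DEL>`.

```python
from collections import Counter

# A tiny, made-up VCF fragment used only for illustration (not real data).
SAMPLE_VCF = """\
##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO
22\t100\t.\tA\tG\t50\tPASS\tDP=10
22\t200\t.\tAT\tA\t40\tPASS\tDP=8
22\t300\t.\tC\tCTT\t30\tPASS\tDP=12
22\t400\t.\tG\tA,T\t60\tPASS\tDP=15
"""


def classify(ref, alts):
    """Classify a record by allele lengths only (a simplification)."""
    # SNP: the reference and every alternative allele are single bases.
    if len(ref) == 1 and all(len(alt) == 1 for alt in alts):
        return "snp"
    # INDEL: at least one alternative allele differs in length from REF.
    if any(len(alt) != len(ref) for alt in alts):
        return "indel"
    return "other"


def count_variant_types(vcf_text):
    """Count variant types in VCF text, skipping header lines."""
    counts = Counter()
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # meta-information (##) and column header (#CHROM)
        fields = line.split("\t")
        ref = fields[3]              # REF is the fourth column
        alts = fields[4].split(",")  # ALT may list several alleles
        counts[classify(ref, alts)] += 1
    return counts


variant_counts = count_variant_types(SAMPLE_VCF)
print(dict(variant_counts))
```

A dedicated parser such as PyVCF also handles the header metadata, INFO fields, and per-sample genotype columns that this sketch ignores.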

Getting ready

While next-generation sequencing is all about big data, there is a limit to how much I can ask you to download as a dataset for this book. I believe that 2 to 20 GB of data for a tutorial is asking too much. The 1000 Genomes Project VCF files with realistic annotations are of this order of magnitude, so we will want to work with much less data here. Fortunately, the bioinformatics community has developed tools that allow partial download of data. As part of the samtools/htslib package (http://www.htslib.org/), you can download tabix and bgzip, which will take care of data management (on Debian, Ubuntu, and...
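
As a rough sketch of what a partial download with tabix and bgzip might look like, the snippet below only builds the shell commands as strings and does not run them. The URL, region, and output file name are placeholders chosen for illustration, not values from the recipe, and the helper `partial_download_commands` is hypothetical. It assumes the remote file is bgzip-compressed and has a tabix index next to it.

```python
import shlex


def partial_download_commands(url, region, out_path):
    """Build (but do not run) commands to fetch one region of a remote VCF."""
    # tabix -h prints the VCF header followed by the records in the region.
    fetch = ["tabix", "-h", url, region]
    # bgzip -c writes the recompressed stream to standard output.
    compress = ["bgzip", "-c"]
    # tabix -p vcf builds an index for the new, smaller file.
    index = ["tabix", "-p", "vcf", out_path]
    pipeline = (
        f"{shlex.join(fetch)} | {shlex.join(compress)} "
        f"> {shlex.quote(out_path)}"
    )
    return pipeline, shlex.join(index)


# Placeholder values: substitute the real data URL and region of interest.
download_cmd, index_cmd = partial_download_commands(
    "https://example.org/calls.vcf.gz",
    "22:1-17000000",
    "subset.vcf.gz",
)
print(download_cmd)
print(index_cmd)
```

Building the commands as argument lists and quoting them with `shlex` keeps unusual characters in file names or URLs from being misread by the shell.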
