Bioinformatics with Python Cookbook

By: Tiago R Antao, Tiago Antao
Overview of this book

If you have intermediate-level knowledge of Python and are well aware of the main research and vocabulary in your bioinformatics topic of interest, this book will help you develop your knowledge further.

Traversing genome annotations

Having a genome sequence is interesting, but we usually want to extract features from it, such as genes, exons, and coding sequences. This kind of annotation is distributed in GFF and GTF files; GFF stands for Generic Feature Format. In this recipe, we will see how to parse and analyze GFF files, using the annotation of the Anopheles gambiae genome as an example.
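To make the file format concrete, here is a minimal sketch of how a single GFF3 record could be split into its nine tab-separated columns using only the standard library. The GffRecord type, the parse_gff3_line function, and the sample line are hypothetical illustrations (the sample is not taken from the Anopheles gambiae annotation), and the sketch ignores comment lines, directives, and multi-valued attributes.

```python
from typing import NamedTuple
from urllib.parse import unquote


class GffRecord(NamedTuple):
    """One feature line of a GFF3 file (hypothetical helper type)."""
    seqid: str
    source: str
    type: str
    start: int        # 1-based, inclusive
    end: int          # 1-based, inclusive
    score: str        # "." when no score is given
    strand: str       # "+", "-", "." or "?"
    phase: str        # "0", "1", "2", or "." for non-CDS features
    attributes: dict


def parse_gff3_line(line):
    """Split one GFF3 feature line into a GffRecord."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    seqid, source, ftype, start, end, score, strand, phase, attr_text = fields
    attributes = {}
    # Column 9 holds semicolon-separated key=value pairs, URL-escaped.
    for pair in attr_text.split(";"):
        if pair:
            key, _, value = pair.partition("=")
            attributes[unquote(key)] = unquote(value)
    return GffRecord(seqid, source, ftype, int(start), int(end),
                     score, strand, phase, attributes)


# Made-up example line, for illustration only.
example = "chr1\texample\tgene\t1000\t9000\t.\t+\t.\tID=gene0001;Name=demo"
record = parse_gff3_line(example)
print(record.type, record.start, record.end, record.attributes["ID"])
```

In practice we would not parse the file by hand; a library such as gffutils also tracks the parent/child relationships between features (for example, genes, transcripts, and exons), which a line-by-line parser like this one does not.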

Getting ready

We will use the gffutils library to process the annotation file.

If you are not using the notebook, download the annotation file (gambiae.gff3.gz) from our datasets page at https://github.com/tiagoantao/bioinf-python/blob/master/notebooks/Datasets.ipynb and rename it to gambiae.gff.gz. Preferably, use the 02_Genomes/Annotations.ipynb notebook, which is provided in the code bundle of the book.

How to do it...

Let's take a look at the following steps:

  1. Let's start by creating an annotation database with gffutils based on our GFF file:
    import gffutils
    import...
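The listing is cut off here. As a rough sketch of what building such a database can look like (this is not necessarily the code used in the recipe), the following uses gffutils.create_db to build an SQLite-backed database from the GFF file and gffutils.FeatureDB to reopen it on later runs. The function names, the database filename gambiae.db, and the caching logic are assumptions made for illustration.

```python
import os


def open_annotation_db(gff_path="gambiae.gff.gz", db_path="gambiae.db"):
    """Return a gffutils FeatureDB, building it from the GFF file on first use.

    Both default filenames are assumptions for this sketch.
    """
    # Imported lazily so that this module still loads where gffutils is absent.
    import gffutils

    if os.path.exists(db_path):
        # Reuse the SQLite database built on an earlier run.
        return gffutils.FeatureDB(db_path)
    # Parsing a whole-genome annotation can take a while, hence the cache.
    return gffutils.create_db(gff_path, db_path)


def count_feature_types(db):
    """Map each feature type in the database (gene, exon, ...) to its count."""
    return {
        feature_type: db.count_features_of_type(feature_type)
        for feature_type in db.featuretypes()
    }
```

With the annotation file in the working directory, calling count_feature_types(open_annotation_db()) would give a quick overview of which feature types the annotation contains and how many of each there are.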