Bioinformatics with Python Cookbook

By: Tiago R Antao, Tiago Antao
Overview of this book

If you have intermediate-level knowledge of Python and are well aware of the main research and vocabulary in your bioinformatics topic of interest, this book will help you develop your knowledge further.
Extracting genes from a reference using annotations

In this recipe, we will extract a gene sequence from a reference FASTA file, using an annotation file to look up the gene's coordinates. We will use the Anopheles gambiae genome, along with its annotation file (as in the previous two recipes). We will start with the voltage-gated sodium channel (VGSC) gene, which is involved in resistance to insecticides.
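The core idea, looking up coordinates in an annotation and slicing them out of a reference, can be sketched in plain Python before bringing in any libraries. This is a minimal illustration under stated assumptions, not the recipe's own code: the toy FASTA record, the GFF-style line, and the helper names read_fasta and extract_feature are all hypothetical. It relies only on the GFF convention that coordinates are 1-based and inclusive, and that features on the minus strand must be reverse-complemented.

```python
# Hypothetical toy data: a one-record reference and one GFF-style annotation line.
FASTA_TEXT = """>chr_demo
ACGTACGTAAGGCCTT
"""
GFF_LINE = "chr_demo\tdemo\tgene\t5\t10\t.\t-\t.\tID=gene_demo"

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def read_fasta(text):
    """Parse FASTA text into a {record_id: sequence} dict."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record ID is the first word after '>'.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def extract_feature(records, gff_line):
    """Return the sequence covered by one GFF line, on the annotated strand."""
    fields = gff_line.rstrip("\n").split("\t")
    seqid, start, end, strand = fields[0], int(fields[3]), int(fields[4]), fields[6]
    # GFF is 1-based and inclusive; Python slices are 0-based and end-exclusive.
    sequence = records[seqid][start - 1:end]
    if strand == "-":
        # Minus-strand features are read as the reverse complement.
        sequence = sequence.translate(_COMPLEMENT)[::-1]
    return sequence


reference = read_fasta(FASTA_TEXT)
gene_sequence = extract_feature(reference, GFF_LINE)
print(gene_sequence)
```

On a real genome, the same slicing would normally be done with a FASTA parser and an annotation database rather than by hand; the off-by-one conversion and the strand handling are the parts that carry over.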

Getting ready

If you have followed the previous two recipes, you are ready. If not, download the Anopheles gambiae FASTA file, along with the annotation file (the code below expects it as gambiae.gff.gz). You also need to prepare the gffutils database:

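import gffutils
import sqlite3
try:
    # First run: build the SQLite-backed annotation database from the GFF file.
    db = gffutils.create_db('gambiae.gff.gz', 'ag.db')
except sqlite3.OperationalError:
    # ag.db already exists from an earlier run, so open it instead of rebuilding.
    db = gffutils.FeatureDB('ag.db')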

As usual, you will find all this in the 02_Genomes/Getting_Gene.ipynb notebook.
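The try/except in the snippet above is a "create, else reopen" pattern. The sketch below reproduces the mechanism with the standard library only, on the assumption that the error being caught is SQLite's refusal to create a table that already exists. The create_or_open helper and the single-column features table are hypothetical stand-ins, not part of gffutils.

```python
import os
import sqlite3
import tempfile


def create_or_open(path):
    """Create the table on first use; otherwise reuse the existing database."""
    connection = sqlite3.connect(path)
    try:
        connection.execute("CREATE TABLE features (id TEXT PRIMARY KEY)")
        connection.commit()
        created = True
    except sqlite3.OperationalError:
        # SQLite raises this when the table is already present in the file.
        created = False
    return connection, created


with tempfile.TemporaryDirectory() as workdir:
    db_path = os.path.join(workdir, "demo.db")

    # First call builds the table; second call hits the existing one.
    first, first_created = create_or_open(db_path)
    first.close()
    second, second_created = create_or_open(db_path)
    second.close()

print(first_created, second_created)
```

Catching the exception keeps the setup code idempotent: rerunning the notebook reuses the database rather than rebuilding it from the annotation file.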

How to do it…

Let's take a look at the following steps:

  1. Let's start by retrieving the annotation information...