
Bioinformatics with Python Cookbook
By :

In this recipe, you will learn about a few general techniques to manipulate reference genomes. As an illustrative example, we will study the GC content – the fraction of the genome that is based on guanine-cytosine in Plasmodium falciparum, the most important parasite species that causes malaria. Reference genomes are normally made available as FASTA files.
Organism genomes come in widely different sizes, ranging from viruses such as HIV, which is 9.7 kbp, to bacteria such as E. coli, to protozoans such as Plasmodium falciparum, which has a 22 Mbp spread across 14 chromosomes, mitochondrion, and apicoplast, to the fruit fly with three autosomes, a mitochondrion, and X/Y sex chromosomes, to humans with their three Gbp pairs spread across 22 autosomes, X/Y chromosomes, and mitochondria, all the way up to Paris japonica, a plant with 150 Gbp of the genome. Along the way, you have different ploidy and sex chromosome...