Functional annotation is an important step in genome and metagenome projects. It gives us hints about what the annotated genes are capable of. One widely used functional database is CAZy (Carbohydrate-Active enZYmes), which describes enzymes that degrade, modify, or create glycosidic bonds.
Snakemake is a Python-based workflow management system that aims to create reproducible and scalable data processing pipelines. A workflow is built from rules that define how to create outputs from inputs.
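To make the idea of rules mapping inputs to outputs concrete, here is a toy sketch in plain Python. It is not Snakemake's API or rule syntax; the file names, the `RULES` table, and the `plan` function are all hypothetical and only illustrate how a requested output can be traced back through rules to the inputs that already exist.

```python
# Toy model of a rule-based workflow. This is NOT Snakemake's API;
# all file names and actions below are made up for illustration.
RULES = {
    # output file: (input files, description of the action)
    "trimmed.fastq": (["raw.fastq"], "trim reads"),
    "aligned.bam": (["trimmed.fastq", "ref.fa"], "align reads"),
}


def plan(target, existing, steps=None):
    """Return the ordered actions needed to produce `target`."""
    if steps is None:
        steps = []
    if target in existing:
        # The file is already available, so nothing has to run.
        return steps
    if target not in RULES:
        raise ValueError(f"no rule produces {target!r}")
    inputs, action = RULES[target]
    # Resolve every input first, then schedule this rule's action.
    for dep in inputs:
        plan(dep, existing, steps)
    if action not in steps:
        steps.append(action)
    return steps


print(plan("aligned.bam", existing={"raw.fastq", "ref.fa"}))
# prints ['trim reads', 'align reads']
```

Asking for the final output is enough: the intermediate file is planned automatically because a rule declares it as an input.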
There are two major metagenomic approaches to investigating the microbiome: amplicon sequencing and shotgun sequencing. The main aim of the amplicon-based approach is to estimate the relative abundance of each species in the sample.
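As a small illustration of what relative abundance means, the sketch below divides each taxon's read count by the total number of reads in the sample. The taxon names and counts are invented, and `relative_abundance` is a hypothetical helper rather than a function from any amplicon analysis tool.

```python
def relative_abundance(counts):
    """Convert raw read counts per taxon into fractions of the sample total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {taxon: n / total for taxon, n in counts.items()}


# Invented read counts for one sample.
counts = {"taxon_A": 600, "taxon_B": 300, "taxon_C": 100}

for taxon, fraction in relative_abundance(counts).items():
    print(f"{taxon}: {fraction:.1%}")
```

Because the values are fractions of the sample total, they always sum to 1 and say nothing about absolute cell numbers.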
Intrinsically disordered regions (IDRs) are polypeptide segments with a flexible structure. They often lack hydrophobic residues (the major driving force for protein folding) but are enriched in polar and charged residues (which enhance their ability to interact with the water solvent).
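The composition bias described above can be checked with a rough sketch like the one below, which reports the fraction of hydrophobic versus polar or charged residues in a sequence. The residue groupings are one common simplification and are an assumption here (classifications vary between sources), the example peptide is made up, and this is not a disorder predictor.

```python
# Assumed residue classes; other sources group some residues differently.
HYDROPHOBIC = set("AVLIMFW")
POLAR_OR_CHARGED = set("DEKRHSTNQ")


def composition(seq):
    """Return the fraction of hydrophobic and polar/charged residues in seq."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    n = len(seq)
    return {
        "hydrophobic": sum(aa in HYDROPHOBIC for aa in seq) / n,
        "polar_or_charged": sum(aa in POLAR_OR_CHARGED for aa in seq) / n,
    }


# Made-up peptide, rich in polar and charged residues.
print(composition("SEEKKRSGSPEDQ"))
```

A segment with a low hydrophobic fraction and a high polar or charged fraction matches the pattern described for IDRs, although composition alone does not establish disorder.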