CAZyme Protein Count Grids — Help & Information

What are CAZymes?

Carbohydrate-Active enZymes (CAZymes) are enzymes involved in the synthesis, modification, and breakdown of complex carbohydrates and glycoconjugates. The CAZy database classifies them into families based on amino acid sequence similarity. The major catalytic classes are Glycoside Hydrolases (GH), Glycosyl Transferases (GT), Polysaccharide Lyases (PL), Carbohydrate Esterases (CE), and Auxiliary Activities (AA). CAZy also catalogs Carbohydrate-Binding Modules (CBM), which are non-catalytic modules associated with these enzymes.

Data Sources

dbCAN Counts Grid

The dbCAN counts grid displays the number of CAZyme family annotations per organism derived from the dbCAN automated annotation pipeline. dbCAN uses hidden Markov models (HMMs) built from the CAZy database to scan protein sequences and assign them to CAZyme families. Each cell value represents the total count of annotations for a given CAZyme family in a given organism.
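As a rough illustration of how such cell values could be tallied, the sketch below counts annotations per (organism, family) pair. The record layout, the organism names, and the function name build_count_grid are assumptions made for this example; they are not taken from dbCAN or from this application.

```python
from collections import Counter


def build_count_grid(annotations):
    """Count annotations for each (organism, CAZyme family) pair.

    `annotations` is an iterable of (organism, family) tuples, one per
    annotation. This record shape is hypothetical.
    """
    counts = Counter()
    for organism, family in annotations:
        counts[(organism, family)] += 1
    return counts


# Hypothetical annotation records, one tuple per family assignment.
annotations = [
    ("Organism A", "GH5"),
    ("Organism A", "GH5"),
    ("Organism A", "GT2"),
    ("Organism B", "GH5"),
]

grid = build_count_grid(annotations)
print(grid[("Organism A", "GH5")])  # cell value for Organism A, family GH5
print(grid[("Organism B", "GT2")])  # pairs with no annotations count as 0
```

A Counter returns 0 for pairs that were never seen, which matches the idea of an empty cell in the grid.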

Erdody Protein Grid

The Erdody grid shows CAZyme protein counts derived from the Erdody consolidated database, which uses BLAST-based homology searches against curated CAZyme reference sequences. Each cell can display three metrics (toggled in the interface):

Clicking on an individual count dot opens a modal with the full BLAST alignment details (bit score, E-value, coordinates, identity, gaps) grouped by query protein.
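The grouping shown in the modal can be pictured with a minimal sketch like the one below. The Hit class, its field names, and the choice to sort each group by bit score are assumptions for illustration only; the application's actual data model and ordering may differ.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Hit:
    """One BLAST alignment; all field names here are hypothetical."""
    query_protein: str
    bit_score: float
    e_value: float
    query_start: int
    query_end: int
    identity_pct: float
    gaps: int


def group_hits_by_query(hits):
    """Group alignments by query protein, best bit score first (assumed order)."""
    grouped = defaultdict(list)
    for hit in hits:
        grouped[hit.query_protein].append(hit)
    for query_hits in grouped.values():
        query_hits.sort(key=lambda h: h.bit_score, reverse=True)
    return dict(grouped)


# Made-up alignments for two query proteins.
hits = [
    Hit("protein_1", 210.0, 1e-60, 5, 300, 48.2, 3),
    Hit("protein_2", 95.5, 2e-20, 10, 150, 35.0, 7),
    Hit("protein_1", 350.5, 1e-100, 1, 320, 62.7, 1),
]

grouped_hits = group_hits_by_query(hits)
for query, query_hits in grouped_hits.items():
    print(query, [h.bit_score for h in query_hits])
```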

Genome Set Comparison Grid

The comparison grid lets you select up to 50 organisms from any domain and view both dbCAN and Erdody data side by side. Each cell shows a D/E value (dbCAN count / Erdody protein count), making it easy to compare the two annotation methods for the same organism and CAZyme family. You can save and reload named organism sets for repeated analysis.
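A minimal sketch of the two rules described above, the 50-organism selection limit and the D/E cell value, might look like this. The function names and the exact "D/E" text rendering are assumptions for this example; only the limit of 50 and the pairing of dbCAN count with Erdody protein count come from the description above.

```python
MAX_ORGANISMS = 50  # selection limit stated in the help text


def validate_selection(organisms):
    """Return the selection as a list, rejecting sets larger than the limit."""
    selected = list(organisms)
    if len(selected) > MAX_ORGANISMS:
        raise ValueError(
            f"Selected {len(selected)} organisms; the limit is {MAX_ORGANISMS}."
        )
    return selected


def format_de_cell(dbcan_count, erdody_count):
    """Render a cell as 'D/E' (assumed rendering of the two counts)."""
    return f"{dbcan_count}/{erdody_count}"


cell = format_de_cell(12, 9)
print(cell)

selection = validate_selection(["Organism A", "Organism B"])
print(len(selection))
```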

Substrate and Activity Annotations

Each CAZyme family column includes a substrate sub-header row colored by primary substrate. Clicking on a substrate cell opens a detail modal showing:

These annotations are sourced from the CAZy database and cross-referenced literature.

Grid Navigation

Color Legend

Count values are divided into four color groups based on percentile boundaries computed from all non-zero values in the current view:

Low counts (up to 50th percentile)
Moderate counts (50th–80th percentile)
High counts (80th–95th percentile)
Very high counts (above 95th percentile)
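
The grouping above can be sketched as follows. This is an illustration under stated assumptions, not the application's code: it uses a nearest-rank percentile, treats each boundary as belonging to the lower group, and uses made-up group names. The application may compute percentiles or handle ties differently.

```python
def percentile(sorted_values, pct):
    """Nearest-rank percentile of a non-empty ascending list (assumed method)."""
    # Integer ceiling of pct * n / 100, so no floating-point rounding is involved.
    rank = max(1, (pct * len(sorted_values) + 99) // 100)
    return sorted_values[rank - 1]


def color_boundaries(counts):
    """Return the (50th, 80th, 95th) percentile values of the non-zero counts."""
    nonzero = sorted(v for v in counts if v > 0)
    if not nonzero:
        return None
    return tuple(percentile(nonzero, p) for p in (50, 80, 95))


def color_group(value, boundaries):
    """Map one count to a color group; zero counts get no group."""
    if value <= 0 or boundaries is None:
        return "none"
    p50, p80, p95 = boundaries
    if value <= p50:
        return "low"
    if value <= p80:
        return "moderate"
    if value <= p95:
        return "high"
    return "very high"


# Example view: counts 0 through 20; the zero is ignored for the boundaries.
view_counts = list(range(0, 21))
boundaries = color_boundaries(view_counts)
print(boundaries)
print([color_group(v, boundaries) for v in (0, 10, 11, 17, 20)])
```

Because the boundaries are recomputed from the current view, the same count can fall into a different color group when the set of displayed organisms or families changes.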

v1 @copyright 2026 UCLA