Updating the sequence-based classification of glycosyl hydrolases

The majority of them represent Eubacteria (16 different species).

Red arrows correspond to glycosidase (GH) and glycosyltransferase (GT) genes; the family to which each gene belongs is indicated.

As a rule, about 1% of genes in a given genome encode glycoside hydrolases and their homologues.

On the basis of sequence similarity, they have been grouped into more than ninety GH families over the last 15 years.

Iterative sequence analysis revealed the relationship of the GH97 family with the GH27, GH31, and GH36 families of glycosidases, which belong to the α-galactosidase superfamily, as well as a more distant relationship with some other glycosidase families (GH13 and GH20).

-barrel fold of the catalytic domain and a retaining mechanism of glycosidic bond hydrolysis.

In the case of multidomain proteins, each catalytic domain is considered separately.

A family was initially defined as a group of at least two sequences displaying significant amino acid similarity to one another and no significant similarity to sequences of other families.
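The definition above can be read as a grouping rule over pairwise comparisons. The following is a minimal sketch, assuming that "significant similarity" has already been decided for each pair of sequences by some external criterion; the function name, the sequence identifiers, and the toy input are hypothetical and are not taken from the original classification procedure. Sequences linked directly or indirectly by significant similarity form one group, groups of at least two sequences count as families, and singletons remain unclassified.

```python
from collections import defaultdict


def group_into_families(sequence_ids, significant_pairs):
    """Group sequences into families (hypothetical helper).

    sequence_ids: iterable of sequence identifiers.
    significant_pairs: iterable of (a, b) pairs judged significantly similar
    by an external criterion (assumed to be given).
    """
    # Union-find structure: every sequence starts as its own group.
    parent = {s: s for s in sequence_ids}

    def find(x):
        # Follow parent links to the group representative, compressing paths.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the groups of every significantly similar pair.
    for a, b in significant_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(set)
    for s in sequence_ids:
        groups[find(s)].add(s)

    # A family needs at least two sequences; singletons stay unclassified.
    families = [g for g in groups.values() if len(g) >= 2]
    unclassified = [s for g in groups.values() if len(g) == 1 for s in g]
    return families, unclassified


# Hypothetical toy input: seqA, seqB and seqC are linked; seqD and seqE are not.
ids = ["seqA", "seqB", "seqC", "seqD", "seqE"]
pairs = [("seqA", "seqB"), ("seqB", "seqC")]
families, unclassified = group_into_families(ids, pairs)
```

Because groups are built by transitive linkage, no sequence in one group has a significant pair with a sequence in another group, which matches the second half of the definition. Real classification also involves expert curation that this sketch does not attempt to model.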

A classification of glycoside hydrolases into families based on amino acid sequence similarities was proposed a few years ago.

Because there is a direct relationship between sequence and folding similarities, such a classification: (i) reflects the structural features of these enzymes better than their sole substrate specificity, (ii) helps to reveal the evolutionary relationships between these enzymes, and (iii) provides a convenient tool to derive mechanistic information. A continuously updated list of the glycoside hydrolase families is maintained.

Instead, related clans (and families) whose proteins show statistically significant sequence similarity were proposed to be grouped into superfamilies at a higher hierarchical level.

For example, we have described the furanosidase (β-fructosidase) superfamily, which includes clans GH-F (inverting glycosidases) and GH-J (retaining glycosidases), as well as the GHLP (COG2152) family of enzymatically uncharacterized proteins.
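The hierarchy described here (families grouped into clans, and related clans and families grouped into superfamilies) can be sketched as a small data structure. This is only an illustration: the class and field names are hypothetical, and the example is populated solely with the members named in the text, leaving the family lists of each clan empty.

```python
from dataclasses import dataclass, field


@dataclass
class Clan:
    """A clan of GH families (hypothetical representation)."""
    name: str
    mechanism: str  # "inverting" or "retaining", as stated in the text
    families: list = field(default_factory=list)  # left empty in this sketch


@dataclass
class Superfamily:
    """A higher-level grouping of related clans and clan-less families."""
    name: str
    clans: list = field(default_factory=list)
    unassigned_families: list = field(default_factory=list)

    def mechanisms(self):
        # Set of catalytic mechanisms found among the member clans.
        return {clan.mechanism for clan in self.clans}


# The furanosidase superfamily, using only the members named in the text.
furanosidase = Superfamily(
    name="furanosidase (beta-fructosidase)",
    clans=[Clan("GH-F", "inverting"), Clan("GH-J", "retaining")],
    unassigned_families=["GHLP (COG2152)"],
)
```

Calling `furanosidase.mechanisms()` returns both mechanisms, which makes explicit that a superfamily in this scheme may combine inverting and retaining enzymes.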

These data suggest a common evolutionary origin of glycosidases representing different families and clans.

