GSP4PDB is a bioinformatics web tool that lets users visualize, search and explore protein-ligand structural patterns inside the Protein Data Bank (PDB).
The novel feature of GSP4PDB is that a protein-ligand structural pattern is represented graphically as a graph in which the nodes represent the protein's components (amino acids and ligands) and the edges represent structural relationships (e.g. distance relationships). This abstract representation is called a Graph-based Structural Pattern (GSP).
Once the user has "drawn" the GSP, it is transformed into an SQL query and searched in a PostgreSQL database containing PDB data. The results of the search are shown in textual or graphical form, depending on the version of GSP4PDB.
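To make the GSP-to-SQL step more concrete, here is a minimal sketch of how a drawn graph could be compiled into a self-join query. It is not the query generator used by GSP4PDB: the table names (`amino_acid`, `ligand`), the column names (`id`, `protein_id`, `name`, `x`, `y`, `z`), the function `gsp_to_sql` and the `%s` placeholder style are all assumptions made for illustration.

```python
def gsp_to_sql(nodes, edges):
    """Compile a tiny GSP into a parameterized SQL query (illustrative only).

    nodes: {node_id: (kind, name)}, where kind is "amino" or "ligand".
    edges: [(node_a, node_b, min_dist, max_dist)] distance relationships.
    Returns (sql, params).
    """
    # Hypothetical table names; the real GSP4PDB schema may differ.
    tables = {"amino": "amino_acid", "ligand": "ligand"}
    ids = sorted(nodes)
    for n in ids:
        # Node ids become SQL aliases, so only accept plain identifiers.
        if not n.isidentifier():
            raise ValueError(f"invalid node id: {n!r}")
    first = ids[0]

    select = ", ".join(f"{n}.id AS {n}_id" for n in ids)
    from_ = ", ".join(f"{tables[nodes[n][0]]} AS {n}" for n in ids)

    where, params = [], []
    for n in ids:
        # Each node constrains the component name (e.g. CYS, ZN).
        where.append(f"{n}.name = %s")
        params.append(nodes[n][1])
        if n != first:
            # All components must belong to the same protein.
            where.append(f"{n}.protein_id = {first}.protein_id")
    for a, b, lo, hi in edges:
        # Each edge becomes a Euclidean distance condition.
        d = (f"sqrt(power({a}.x - {b}.x, 2) + power({a}.y - {b}.y, 2)"
             f" + power({a}.z - {b}.z, 2))")
        where.append(f"{d} BETWEEN %s AND %s")
        params += [lo, hi]

    sql = (f"SELECT {first}.protein_id, {select} FROM {from_} WHERE "
           + " AND ".join(where))
    return sql, params


sql, params = gsp_to_sql(
    {"a1": ("amino", "CYS"), "l1": ("ligand", "ZN")},
    [("a1", "l1", 0.0, 3.0)],
)
print(sql)
print(params)
```

In this sketch every node becomes one table alias and every edge becomes one `WHERE` condition, so the size of the query grows with the size of the drawn pattern. The values are passed as parameters rather than pasted into the SQL text.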
Protein-ligand structural pattern
The notion of a structural pattern is used to describe a three-dimensional "structure" or "shape" that occurs in the secondary structure of a protein.
We define a protein-ligand structural pattern as the combination of a ligand and a group of amino acids whose three-dimensional distribution is determined by three types of relationships:
Distance between two amino acids.
Distance between an amino acid and the ligand.
Order of precedence (in the sequence) of one amino acid with respect to another amino acid.
For instance, a zinc finger is a protein-ligand structural pattern where a zinc atom (the ligand) is surrounded by cysteine and histidine residues (the amino acids).
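The three relationship types can be sketched as a small data structure plus a check against candidate coordinates. This is only an illustration of the definition above, not GSP4PDB code: the names `Pattern` and `matches`, the slot identifiers, the distance thresholds and the coordinates are all made up for the example.

```python
from dataclasses import dataclass, field
from math import dist


@dataclass
class Pattern:
    """A protein-ligand structural pattern: one ligand plus a group of amino acids."""
    ligand: str                # ligand name, e.g. "ZN"
    residues: dict[str, str]   # hypothetical slot id -> amino acid name
    # (slot_a, slot_b, min, max): distance between two amino acids
    residue_distances: list[tuple[str, str, float, float]] = field(default_factory=list)
    # (slot, min, max): distance between an amino acid and the ligand
    ligand_distances: list[tuple[str, float, float]] = field(default_factory=list)
    # (slot_a, slot_b): slot_a precedes slot_b in the sequence
    precedence: list[tuple[str, str]] = field(default_factory=list)


def matches(pattern, ligand_xyz, candidates):
    """candidates maps slot id -> (amino acid name, sequence position, (x, y, z))."""
    for slot, name in pattern.residues.items():
        if slot not in candidates or candidates[slot][0] != name:
            return False
    for a, b, lo, hi in pattern.residue_distances:
        if not lo <= dist(candidates[a][2], candidates[b][2]) <= hi:
            return False
    for slot, lo, hi in pattern.ligand_distances:
        if not lo <= dist(candidates[slot][2], ligand_xyz) <= hi:
            return False
    return all(candidates[a][1] < candidates[b][1] for a, b in pattern.precedence)


# Zinc-finger-like pattern; the 3.0 threshold is illustrative, not measured.
slots = ("c1", "c2", "h1", "h2")
zinc = Pattern(
    ligand="ZN",
    residues={"c1": "CYS", "c2": "CYS", "h1": "HIS", "h2": "HIS"},
    ligand_distances=[(s, 0.0, 3.0) for s in slots],
    precedence=[("c1", "c2"), ("c2", "h1"), ("h1", "h2")],
)

# Made-up coordinates and sequence positions for one candidate site.
ligand_xyz = (0.0, 0.0, 0.0)
candidates = {
    "c1": ("CYS", 5, (2.0, 0.0, 0.0)),
    "c2": ("CYS", 8, (0.0, 2.0, 0.0)),
    "h1": ("HIS", 21, (0.0, 0.0, 2.0)),
    "h2": ("HIS", 25, (-2.0, 0.0, 0.0)),
}
print(matches(zinc, ligand_xyz, candidates))
```

Keeping the three relationship types in separate lists mirrors the definition directly, so each constraint can be checked independently of the others.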
A graph-based structural pattern (GSP) is a labeled property graph in which nodes and edges can contain key-value pairs representing their properties (or attributes). In GSP4PDB, four types of nodes are allowed:
Additionally, nodes can be connected by three types of edges: