1) Which three databases are being used in this course and what kind of data do they contain?
SwissProt : Protein sequences
EMBL : Nucleotide sequences
PDB : Protein structures
2) Find out for each of the three database formats (SwissProt, EMBL, and PDB) whether all 5
'essential' data elements are really present.
All three have all 5 essential elements, but: in the case of the PDB, the file name is used as the unique identifier; in
SwissProt and EMBL, the depositor is 'coded' implicitly in the reference that belongs with the sequence. Also,
it is difficult to figure out the real date of deposition for SwissProt and EMBL, as corrections are dated but you
cannot see what was done at each correction.
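The points about SwissProt can be made concrete with a small sketch. It assumes the usual SwissProt flat-file layout, in which each line starts with a two-letter line code (ID, AC, DT, RA, ...) followed by the content from column 6 onward. The sample entry, its accession number, and the helper name collect_line_codes are made up for illustration; this is not a real record and not a complete parser.

```python
# Hypothetical, abbreviated SwissProt-style entry (NOT a real record).
SAMPLE_ENTRY = """\
ID   EXAMPLE_HUMAN           Reviewed;         100 AA.
AC   P00000;
DT   01-JAN-1990, integrated into UniProtKB/Swiss-Prot.
DT   01-JAN-1990, sequence version 1.
DT   01-JAN-2000, entry version 42.
RA   Doe J., Roe R.;
//
"""


def collect_line_codes(entry_text, codes=("ID", "AC", "DT", "RA")):
    """Group the lines of one flat-file entry by their two-letter line code."""
    found = {code: [] for code in codes}
    for line in entry_text.splitlines():
        # Assumption: the line code sits in columns 1-2, content starts at column 6.
        code, content = line[:2], line[5:].strip()
        if code in found:
            found[code].append(content)
    return found


fields = collect_line_codes(SAMPLE_ENTRY)

# The entry name on the ID line serves as an identifier.
print("identifier:", fields["ID"][0].split()[0])
# Several dated DT lines exist, but they do not say what was changed.
print("dates:", [d.split(",")[0] for d in fields["DT"]])
# No explicit depositor field: the reference authors are the closest hint.
print("reference authors:", fields["RA"])
```

In this sketch the entry carries three dated lines and no depositor field, which is the situation described above: the depositor has to be inferred from the reference, and the dates alone do not reveal what each correction changed.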
3) The GPCRDB is called an information system because it contains sequences, structures, mutations,
computationally derived data, etc. SwissProt, on the other hand, is a database that consists only of
entries of the same type.
The GPCRDB can also be called a heterogeneous system, as its data is heterogeneous (articles, sequences, models,
mutations, etc.), while SwissProt is a homogeneous system because each entry has the same format.