The main objective of WP1 is to create a portal, called the SSP (Self-Service protein engineering Portal), that will support all of NewProt's software, databases, interactive operations, dissemination, and management activities. The SSP will combine aspects of a classical portal with those of an interactive workbench that enables simple access to all NewProt resources through a single, homogeneous interface. The SSP will additionally hold a series of tabs that will facilitate direct access to the databases, software, query systems, protocols, experimental results, course material, and management tools.
WP1 can be logically divided into two main tasks: 1) Design and implementation of the SSP; and 2) Validation of the technical aspects of the SSP. These two tasks will have their focal points in year 1 and year 2 of the NewProt project, respectively. Population and usage of the portal will be the tasks of WP2-5 and WP6, respectively.
Task 1. Design and implementation of the SSP framework.
The SSP will be designed based on the open source Information Workbench from fluidOps. The SSP will provide a working environment in which users can easily interact with all resources produced by NewProt. To this end, the SSP will hold tabs for database access (for WP2), software access (for WP3), system facilities (svn access, ftp access, virtual machine, etc.), dissemination (e.g. course material, software documentation), help facilities, and project management facilities. Users will be able to interactively run NewProt software within the SSP workbench. Software and databases will be fully interoperable, i.e., users will not need to store intermediate results and will not need to worry about file formats. This interoperability requires that all data types be syntactically and semantically described (WP2) and that all software that can operate on those data 'know' about the syntactic and semantic annotations (WP3-5). The use of common standards (RDF, Linked Data, and relevant domain ontologies such as EDAM, OBO, and the Gene Ontology (GO)) will ensure the semantic interoperability of data and software. Much of the data and software the partners intend to incorporate in the SSP already use the SOAP protocol for interoperability and adhere to these common standards and ontologies.
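As an illustration of the interoperability requirement described above, the following minimal Python sketch shows how syntactic and semantic annotations could let a workbench check that one tool's output can be passed directly to the next tool without user intervention. All names in it (DataType, Tool, plan_pipeline, and the format and concept labels) are hypothetical and chosen for illustration only; they are not part of the Information Workbench or of any NewProt software, and in practice the concept labels would be terms from ontologies such as EDAM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataType:
    """A data type described both syntactically and semantically."""
    syntax: str     # hypothetical file-format label, e.g. "pdb"
    semantics: str  # hypothetical stand-in for an ontology concept


@dataclass(frozen=True)
class Tool:
    """A piece of software annotated with what it consumes and produces."""
    name: str
    consumes: DataType
    produces: DataType


def can_chain(first: Tool, second: Tool) -> bool:
    """True if the output of `first` matches the input of `second`."""
    return first.produces == second.consumes


def plan_pipeline(tools: list[Tool]) -> list[str]:
    """Return the tool names in order, or raise if two neighbours do not fit."""
    for current, following in zip(tools, tools[1:]):
        if not can_chain(current, following):
            raise ValueError(
                f"{current.name} produces {current.produces}, "
                f"but {following.name} expects {following.consumes}"
            )
    return [tool.name for tool in tools]


# Hypothetical annotations; real ones would reference ontology terms.
SEQUENCE = DataType("fasta", "protein_sequence")
STRUCTURE = DataType("pdb", "protein_structure")
REPORT = DataType("json", "mutation_report")

build_model = Tool("build_model", SEQUENCE, STRUCTURE)
analyse_structure = Tool("analyse_structure", STRUCTURE, REPORT)

pipeline = plan_pipeline([build_model, analyse_structure])
```

Because compatibility is decided from the annotations alone, the user never has to inspect or convert intermediate files; a mismatch is reported before any tool is run.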
The SSP will be provided as a hosted portal (hosted at the CMBI), but it will be designed such that all partners can instantiate private copies (e.g. for in-house use) if desired. Two mechanisms will be used for this purpose. First, all resources (except third-party databases) will be kept in the svn version control system. Second, the whole database and software suite will be made available as a Virtual Machine image that can be instantiated on private infrastructures (e.g. using VMware) or, if needed, on public clouds (e.g. Amazon EC2). Partner FLUID will design the SSP such that it can later include its template-based provisioning tools when it delivers technological support to SSP-using industries that request such additions. The SSP images will be equipped with an update mechanism that will allow the partners to frequently obtain the current versions of the entire suite while retaining the state of the user data.
Partners FLUID and CMBI will work together in the first weeks of the project to make a functional design for the SSP. Partners YAS, BIOP, ENTIS, and SAFAN, and especially SAC member T. Schwede (who has long-standing experience in running portals), will be consulted on this task.
Task 2. Validation of the interoperability of the SSP with other software.
One of the goals of the portal is that third parties (i.e. bioinformatics SMEs) can easily use the central portal to enhance the quality, and thus the value, of their web-based products. The partners CMBI, BIOP, YAS, and ENTIS will each validate a different aspect of the interoperability of the SSP with their in-house software. These partners have software products that can be of interest to academic and industrial researchers in protein engineering and drug design. Together, they are paradigmatic for anything a bioinformatics software specialist in academia or industry might want to do with the NewProt products.
Partner CMBI will add a large number of software and database facilities. This process will mainly take place between months 7 and 24 (after which more time will be spent on improvements steered by the validation experiments). However, CMBI will keep adding new products throughout the whole period, and will thus continuously validate the ease of upgrading the SSP.
Partner BIOP produces molecular-class-specific information systems that hold extensive information about a class of molecules. This information is collected, validated, annotated, and computationally enriched. The curated systems, which are called 3DM systems, are presented to the user as a classical portal system. BIOP does not distribute any data or software; instead, BIOP's customers obtain access to their 3DMs on the BIOP computer systems. BIOP will collaborate with FLUID to achieve a full integration of its molecular-class-specific 3DMs with the fully generic SSP. BIOP will obtain an in-house copy of the SSP as a virtual machine and will, based on advice from partner FLUID, make the SSP and its in-house system fully interoperable. BIOP will write a detailed report about this integration process. This report will be detailed enough to function as a recipe for other, non-partner SMEs to produce similar, bi-directional interactivity with the SSP. BIOP will validate that the SSP can be downloaded and used as a virtual machine and that it can easily be made fully interoperable with BIOP's in-house molecular-class-specific information systems. This will require full two-way communication between the systems.
Partner BIOP thus aims for fully integrated, two-way interoperability with the SSP.
YAS will validate that the results from the CMBI-hosted SSP can easily be transferred to, and used in, its YASARA View software (which implicitly also means that they can be used in its commercial YASARA software). Details of the YASARA View - SSP interoperability are discussed more extensively in WP5.
ENTIS's HotSpot Wizard is a software tool that automatically identifies the functional residues for engineering catalytic properties of enzymes and for estimating their mutability. For this purpose, HotSpot Wizard integrates several bioinformatics databases (RCSB PDB, UniProt, PDBSWS, Catalytic Site Atlas and nr NCBI) and computational tools (CASTp, CAVER, BLAST, CD-HIT, MUSCLE and Rate4Site). Structural analyses are conducted to identify the residues that potentially come into contact with the substrates or products. The mutability of individual amino acid residues is derived from their conservation level (Pavelka A, Chovancova E, Damborsky J. HotSpot Wizard: a Web Server for Identification of Hot Spots in Protein Engineering. Nucl. Acids Res. 2009, 37, W376-W383). Partner ENTIS will validate that the SSP can be used to enhance its in-house bioinformatics products, but without maintaining a full in-house SSP copy.
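To make the idea of deriving mutability from conservation concrete, the sketch below scores each column of a small multiple sequence alignment by its Shannon entropy, so that variable positions receive a higher mutability score than conserved ones. This is a deliberately simplified stand-in and not HotSpot Wizard's actual procedure, which relies on tools such as Rate4Site; the function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy (in bits) of the residues in one alignment column."""
    residues = [c for c in column if c != "-"]  # ignore gap characters
    if not residues:
        return 0.0  # assumption: an all-gap column is scored as zero
    total = len(residues)
    counts = Counter(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def mutability_scores(alignment: list[str]) -> list[float]:
    """Per-position score in [0, 1]; higher means less conserved."""
    max_entropy = math.log2(20)  # 20 standard amino acids
    columns = zip(*alignment)    # transpose rows into columns
    return [column_entropy("".join(col)) / max_entropy for col in columns]


# Toy alignment of four equally long, hypothetical sequences.
alignment = ["MKVA", "MRVA", "MKLA", "MQIA"]
scores = mutability_scores(alignment)
```

In this toy alignment the first and last columns are fully conserved and therefore score zero, while the two middle columns vary and score higher.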
In summary, partner BIOP will validate full two-way, in-house integration of its (SME) portal with the SSP; partner ENTIS will validate that it can obtain information from the hosted SSP to enrich its (SME) products; partner YAS will validate that SSP users can directly use its (SME) products; and partner CMBI will validate that it is easy to add its (academic) products to the SSP.