GSoC/GCI Archive
Google Summer of Code 2010

GenMAPP, Cytoscape, WikiPathways & Reactome

Web Page: http://genmapp.cgl.ucsf.edu/wiki/Google_Summer_of_Code_2010

Mailing List: http://groups.google.com/group/cytoscape-discuss

Our GSoC code repository: http://code.google.com/p/google-summer-of-code-2010-genmapp/

We are a collection of academically based, network-biology-oriented, open source organizations:

  • GenMAPP is a pathway visualization and analysis tool for biological data. GenMAPP illustrates the relationships between various genes and proteins to help researchers understand their data in terms of connected biological pathways. Approximately 24,000 people from ~97 countries have registered to download the GenMAPP program. The GenMAPP group is coordinated by the Conklin Lab at the Gladstone Institutes (University of California, San Francisco). There are 500 publications that reference GenMAPP or use GenMAPP to display data in the context of biological pathways. GenMAPP is 100% open source, and all new development is in Java, MySQL, Derby, XML, and Web technologies such as wikis, in collaboration with BiGCaT Bioinformatics and the Cytoscape Consortium. Our development team is composed of individuals who are both biologists and programmers, providing a unique perspective on building and using open source tools.
  • Cytoscape is a general network visualization tool that integrates network topology and data about the network into a single visualization. Cytoscape was developed in, and is most widely used in, the Systems Biology community. With over 2,500 downloads per month, Cytoscape is rapidly becoming a standard within the community. Cytoscape consists of a core application and a plugin framework that users exploit to extend the functionality of the application in new ways. Our team consists of programmers and biologists from both academia and industry, including UC San Diego, UC San Francisco, U of Toronto, Agilent, Institute for Systems Biology, Unilever, Sloan-Kettering, Institut Pasteur, UT Health Science Center and others.
  • WikiPathways is a wiki for biological pathways; it does for pathway archives what Wikipedia does for the encyclopedia. The wiki approach allows biologists with specific domain knowledge to easily create or update pathways. Pathways can be modified directly from a web browser using an embedded applet in which you can draw genes, proteins and their interactions as in any popular drawing tool. The pathways can be used as images for publication and in data analysis tools such as GenMAPP, PathVisio and Cytoscape. There are currently about 1,270 pathways available, divided over 15 different species, and 1,000 registered users. WikiPathways itself is completely open source and is built on top of MediaWiki, using PathVisio as the pathway editor. WikiPathways is developed and maintained by BiGCaT Bioinformatics (University of Maastricht) and the Conklin Lab at the Gladstone Institutes (University of California, San Francisco).
  • Reactome is a manually curated database of core pathways and reactions in human biology that functions as a data mining resource and electronic textbook. The Reactome data model describes diverse processes in the human system, including the pathways of intermediary metabolism, regulatory pathways, signal transduction, and high-level processes such as the cell cycle. As of Release 31, Reactome contains 4490 human proteins, 3669 reactions and 1081 pathways. Reactome software uses only freely available (and often open source) components and has been created with cross-platform compatibility and wide usability in mind. Data is stored in a MySQL database, the web site is implemented in Perl, and the data entry tool is written in Java. The Reactome team is composed of individuals who are both biologists and programmers at the Ontario Institute for Cancer Research, New York University Langone Medical Center, Cold Spring Harbor Laboratory, and The European Bioinformatics Institute.

Projects

  • Creating a CyAnnotator and improving CyAnimator. The main idea is to develop a plug-in to annotate Cytoscape networks with text and other graphics elements. This could be used in conjunction with CyAnimator to create animations that can be used for presentations and supplementary materials.
  • Exploring relations between pathways on WikiPathways. This plug-in can be used by scientists, researchers and students to quickly find out which biological processes (pathways) are related to each other with respect to gene products, metabolites, or continuity, and to what extent. It helps them find the properties that various processes have in common. They can also discover pathways in their field of interest by starting from one pathway and gradually exploring the pathways linked to it.
  • Expression Data Reader plugin for Cytoscape. The goal of this project is to implement a plugin that will facilitate the discovery of gene expression data from open repositories such as GEO (Gene Expression Omnibus) and ArrayExpress. We will use the Entrez and EBI web services as the backend for this plugin. Users can use this tool to find the targeted experiment data, download it locally, and perform certain data cleaning and preprocessing steps. The final results (fold changes, p-values) will be added as attributes to nodes in Cytoscape.
  • IDEA 19: Perform Microarray Summarization and Alternative Splicing Analyses By Coupling the Programs AltAnalyze and GenMAPP-CS. This project is about developing a Java interface, to be integrated with GenMAPP-CS, that will give users convenient options to quickly download and easily use the AltAnalyze and APT tools for various time-consuming microarray analyses without switching to any other program. A new idea is also put forward: adding a feature for simultaneous sharing and editing of a common document in GenMAPP-CS by users at physically different locations.
  • Improve the PathVisio User Interface. PathVisio is an open source pathway visualization tool that serves as the central editor for WikiPathways. As many new pathways are being generated using PathVisio, a more intuitive and powerful user interface will accelerate the pathway creation and data analysis process. In this Google Summer of Code event, I would like to contribute to the improvement of the PathVisio user interface (the proposed IDEA 4).
  • Improving Cytoscape's Labels Experience. The aim of this project is to improve Cytoscape's ability to deal with labels by providing two new label layout plugins, based on the force-directed and spring-embedded approaches, and by providing the capability to show and hide certain labels as the user zooms in and out.
  • KEGG global map browser and its integration to Processing visualizer. KEGG is the most comprehensive pathway database and its maps are used extensively in biological research. However, Cytoscape's support for KEGG is very primitive. KEGG has released a global metabolic map (Atlas), which is useful for interpreting experimental results. However, Atlas's mapping is slow, and its functionality is less flexible than Cytoscape's. The goal of my project is to implement data I/O for KEGG, construct a global metabolic map, and provide a navigation UI and new visualizations.
  • Proposal for the Implementation of Edge-Weighted Force Directed layout for Cytoscapeweb. Force Directed Layout does not perform very well for a dense graph with a large number of edges (possibly >1000) because of visual clutter. This means that Cytoscapeweb's key network visualization tool cannot function for networks with a large number of edges. Hence, two techniques are offered as solutions to reduce visual complexity: Geometry-Based Edge Clustering and Force-Directed Edge Bundling (FDEB).
  • Reactome-Wikipathways Round-trip Format Converter. Reactome is a curated pathway database, while WikiPathways lives on the "wiki spirit", allowing anyone to edit and annotate pathways. Collaboration between the websites is highly desired but has stalled because of differences in the data formats used internally. WikiPathways uses GPML, a vector graphics format, for its data. Reactome has recently introduced a new proprietary graphical XML format. This project will provide the means to convert between GPML and the new Reactome XML format in both directions.
  • Using NLP Techniques to Create a Semantic Network Summary for Cytoscape. We propose a technique for creating a network summary for Cytoscape that implements a variety of Natural Language Processing (NLP) techniques including stop word removal, stemming and Latent Semantic Indexing (LSI).
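
To make the first two NLP steps named in the last project above more concrete, here is a minimal Python sketch of stop word removal followed by a very naive suffix-stripping stemmer, applied to node labels. It is an illustration only and not the project's actual code: the stop-word list, the suffix list, and the function names `naive_stem` and `summarize` are all assumptions made for this example. A real implementation would use a proper stemmer (such as Porter's), and Latent Semantic Indexing is not shown here.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (assumption for illustration).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "is", "for", "by", "with"}

# Hypothetical suffix list; this is NOT the Porter stemmer, only a crude stand-in.
SUFFIXES = ("ing", "ion", "ed", "es", "s")


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def summarize(labels, top_n=3):
    """Return the top_n most frequent stemmed, non-stop-word terms in the labels."""
    counts = Counter()
    for label in labels:
        # Lowercase the label and keep alphabetic tokens only.
        for token in re.findall(r"[a-z]+", label.lower()):
            if token not in STOP_WORDS:
                counts[naive_stem(token)] += 1
    return counts.most_common(top_n)


if __name__ == "__main__":
    node_labels = [
        "regulation of cell cycle",
        "cell cycle checkpoint",
        "signaling in the cell",
    ]
    print(summarize(node_labels, top_n=2))
```

Counting stemmed terms is only the simplest possible summary; the term counts produced here are the kind of input that a later step such as LSI would work from.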