GSoC/GCI Archive
Google Summer of Code 2014

Ganglia

License: New and Simplified BSD licenses

Web Page: https://github.com/ganglia/monitor-core/wiki/GSoC-2014-project-ideas

Mailing List: https://lists.sourceforge.net/lists/listinfo/ganglia-general

Ganglia is a scalable distributed performance monitoring system for high-performance computing systems such as clusters and Grids.

Due to its lightweight, easy-to-manage agent, Ganglia is now used in many educational institutions, corporations, research organizations and ISPs. Many IT professionals come across it in their jobs when they face a scalability problem with an application. Students can probably find it on their own campus: look for /etc/gmond.conf on UNIX machines for clues about where the metrics are sent. Many institutions expose their Ganglia web reports publicly, and these can easily be found with a Google query.

Performance monitoring offers an excellent opportunity for students to develop core skills in many areas including networking, operating systems and also more advanced areas like data science, machine learning, statistics and prediction.

The lightweight Ganglia agent, gmond, is written in C and runs on every machine being monitored. The agent can be extended with plugins written in C or Python to extract metrics from the machine or applications being monitored. Application monitoring can also be achieved for Java (using JMXetric with Tomcat, JBoss, Spring, etc).
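As a rough illustration of the Python plugin route, the sketch below shows what a small metric module might look like. It assumes the interface used by gmond's Python module support: a `metric_init(params)` function that returns a list of descriptor dictionaries, one callback per metric, and a `metric_cleanup()` function. The metric name `example_load_one` and the group `example` are hypothetical and chosen only for this example.

```python
import os


def get_load_one(name):
    """Callback that gmond would invoke to collect the metric value.

    Returns the 1-minute load average, or 0.0 where it is unavailable.
    """
    try:
        return float(os.getloadavg()[0])
    except (AttributeError, OSError):
        return 0.0


def metric_init(params):
    """Describe the metrics this module provides (one descriptor each)."""
    descriptor = {
        "name": "example_load_one",      # hypothetical metric name
        "call_back": get_load_one,       # function called to read the value
        "time_max": 90,                  # max seconds between reports
        "value_type": "float",
        "units": "load",
        "slope": "both",                 # value can rise and fall
        "format": "%f",
        "description": "One-minute load average (example)",
        "groups": "example",             # hypothetical metric group
    }
    return [descriptor]


def metric_cleanup():
    """Called on shutdown; nothing to release in this sketch."""
    pass


if __name__ == "__main__":
    # Standalone check, outside gmond: list each metric with its current value.
    for d in metric_init({}):
        print(d["name"], d["call_back"](d["name"]))
```

Running the file directly exercises the callback without gmond, which is a convenient way to debug a module before deploying it.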

The Ganglia data collection and storage process, gmetad, is also written in C. It uses the legendary RRDtool for storing the time series data.

The Ganglia web reporting system is built with PHP, JavaScript and jQuery. It does not have any database dependency. It constructs RRDtool queries/command lines to graph the data.
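The real front end does this in PHP; the Python sketch below only illustrates the general idea of assembling an `rrdtool graph` command line for one metric. The RRD path layout, the data source name `sum`, and the function name `build_graph_command` are assumptions made for the example, not details taken from the Ganglia web code.

```python
import shlex


def build_graph_command(rrd_path, out_png, title,
                        start="-1h", ds_name="sum", cf="AVERAGE"):
    """Return an argument list for `rrdtool graph` plotting one data source.

    Assumes rrd_path contains no ':' characters, which would need escaping
    inside a DEF specification.
    """
    return [
        "rrdtool", "graph", out_png,
        "--start", start,            # relative start time, e.g. last hour
        "--end", "now",
        "--title", title,
        # DEF binds the name "value" to a data source in the RRD file.
        "DEF:value={}:{}:{}".format(rrd_path, ds_name, cf),
        # LINE2 draws "value" as a 2-pixel blue line with a legend entry.
        "LINE2:value#0000FF:value",
    ]


if __name__ == "__main__":
    cmd = build_graph_command(
        "/var/lib/ganglia/rrds/cluster/host/load_one.rrd",  # assumed layout
        "load_one.png",
        "One minute load",
    )
    # Print a shell-safe version of the command instead of executing it.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Building the command as a list rather than a single string keeps titles and paths containing spaces intact if the list is later passed to `subprocess.run`.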

Various integration solutions exist, such as ganglia-nagios-bridge which is developed in Python. This allows the metrics discovered by Ganglia to be used for alerting in Nagios.

Ganglia is a BSD-licensed open-source project that grew out of the University of California, Berkeley Millennium Project, which was initially funded in large part by the National Partnership for Advanced Computational Infrastructure (NPACI) and National Science Foundation RI Award EIA-9802069.

For some tips about making an application, please see this blog post from Ganglia developer Daniel Pocock.

Projects

  • Improving integration between Ganglia and Nagios (Python): This project aims to improve the current implementation of ganglia-nagios-bridge by separating the writing of Nagios checkresults into its own class, providing a means of generating dynamic and adaptive thresholds, and providing a more advanced way to configure the alerting thresholds.
  • Internal Ganglia server metrics project proposal: Hello, see the contents part for all of the information!
  • NVIDIA GPU monitoring enhancements: NVIDIA GPU metrics can be collected via the NVIDIA gmond plugin, but they are currently not well visualized in Gweb. This project aims to improve the overall visualization and to support the new metrics exposed by the NVIDIA Management Library.
  • RRDtool data access from data analysis frameworks: Packages like R and Weka are powerful and proven in their ability to analyze data efficiently using a variety of techniques. Dumping the contents of an RRD file to XML (with rrdtool dump) or CSV (http://code.google.com/p/rrd2csv/) is possible. While CSV can easily be read into any of these tools, reading the binary data directly, without first exporting it to an intermediate format, would be far more efficient and time-saving.
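
For context on the intermediate-format route described in the last project, the sketch below converts an RRD XML dump into CSV rows using only the Python standard library. The embedded `SAMPLE_DUMP` is a hand-written fragment that imitates the general shape of a dump, not real Ganglia data, and the function names are hypothetical. Row timestamps are reconstructed from `lastupdate`, `step` and `pdp_per_row`, on the assumption that rows appear oldest first and the last row ends at the most recent consolidation boundary.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hand-written fragment imitating an RRD XML dump (not real data).
SAMPLE_DUMP = """<rrd>
  <step>15</step>
  <lastupdate>1400000007</lastupdate>
  <ds><name> sum </name><type> GAUGE </type></ds>
  <rra>
    <cf>AVERAGE</cf>
    <pdp_per_row>1</pdp_per_row>
    <database>
      <row><v>1.0000000000e+00</v></row>
      <row><v>NaN</v></row>
      <row><v>3.5000000000e+00</v></row>
    </database>
  </rra>
</rrd>"""


def first_rra_rows(xml_text):
    """Return (data source names, [(timestamp, [values]), ...]) for the first RRA."""
    root = ET.fromstring(xml_text)
    step = int(root.findtext("step"))
    last_update = int(root.findtext("lastupdate"))
    names = [ds.findtext("name").strip() for ds in root.findall("ds")]

    rra = root.find("rra")
    interval = step * int(rra.findtext("pdp_per_row"))
    rows = rra.find("database").findall("row")

    # Assumption: the newest row is aligned to the last interval boundary.
    last_row_time = last_update - (last_update % interval)
    result = []
    for i, row in enumerate(rows):
        timestamp = last_row_time - (len(rows) - 1 - i) * interval
        values = [float(v.text) for v in row.findall("v")]  # "NaN" parses as nan
        result.append((timestamp, values))
    return names, result


def to_csv(names, rows):
    """Serialize the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["timestamp"] + names)
    for timestamp, values in rows:
        writer.writerow([timestamp] + values)
    return buf.getvalue()


if __name__ == "__main__":
    names, rows = first_rra_rows(SAMPLE_DUMP)
    print(to_csv(names, rows))
```

Every value makes a round trip through text twice here (binary to XML, XML to CSV), which is the overhead that direct access to the binary file would avoid.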