GSoC/GCI Archive
Google Summer of Code 2013

Drizzle Database

Web Page: http://wiki.drizzle.org/

Mailing List: https://launchpad.net/~drizzle-discuss

Drizzle is an open source database optimized for cloud-scale applications. It was originally forked from the MySQL codebase and refactored into a modern, plugin-based architecture. It is designed to be a lightweight, easy-to-use database for Internet-scale applications and cloud infrastructure.

The codebase is C++, with autotools and bzr in the toolchain.

Projects

  • "AlsoSQL": Extend JSON server to support more than just key-value operations
    Drizzle 7.1 introduced an HTTP JSON server that allows clients to connect to Drizzle over HTTP using a JSON-based protocol. The 0.1 version still uses plain old SQL as the query language, embedded in the JSON structure. A work in progress also adds a pure JSON key-value protocol that supports HTTP PUT, POST, DELETE and GET operations. In this project you will extend the pure JSON protocol to also support querying secondary indexes, ranges, etc.
  • Improving Replication Policy in Drizzle
    This project involves designing a facility to wait for an event on a table and changing the slave plugin to use it.
  • MySQL Replication into Drizzle
    This project proposes building a module based on the libdrizzle-redux API to replicate data from MySQL to Drizzle, using row-based replication. Most web application owners currently use MySQL, either because they are unaware of what Drizzle offers or because their applications are already deployed on MySQL. With this project, web application owners can take advantage of both systems: they keep their existing MySQL features and, at the same time, use Drizzle plugins to implement new and faster features on the replicated Drizzle database.
  • Parallelization and Automation of Drizzle Continuous Integration Infrastructure
    Drizzle Continuous Integration currently uses the Jenkins CI tool, with a Jenkins server (master) and a few slaves. However, the infrastructure does not support running tests and builds in parallel to a scalable extent, so this project aims to introduce parallelism together with automation. This can be envisioned as follows. Drizzle CI does builds using Jenkins and tests for regressions using the available test suites. To bring in parallelism, the CI infrastructure should execute test jobs, which may include a build, a sysbench test, a randgen test, etc., simultaneously. A dedicated system or cloud node must be configured to execute each job in the test job list. Multiple tests + multiple nodes = parallelism. Configuring these nodes should be simple: write a config file in a specified format, then issue a command that configures the nodes accordingly. Nodes + configuration files = automated configuration.
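
The "multiple tests + multiple nodes" idea in the last project can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration only: the config format (one section per node with a `job` key), the node names, and the functions `load_assignments`, `run_job` and `run_all` are hypothetical and are not part of Drizzle's CI or of Jenkins. `run_job` is a stand-in that only returns a message; a real version would dispatch the job to the node.

```python
import configparser
from concurrent.futures import ThreadPoolExecutor

# Hypothetical config format: one section per node, naming the job it runs.
EXAMPLE_CONFIG = """
[node-1]
job = build

[node-2]
job = sysbench

[node-3]
job = randgen
"""


def load_assignments(text):
    """Parse the config text into a list of (node, job) pairs, in file order."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return [(node, parser[node]["job"]) for node in parser.sections()]


def run_job(node, job):
    """Stand-in for running a job on a node; it only reports what it would do."""
    return f"{job} finished on {node}"


def run_all(assignments):
    """Submit every job at once, one worker per node, and collect results in config order."""
    if not assignments:
        return []
    with ThreadPoolExecutor(max_workers=len(assignments)) as pool:
        futures = [pool.submit(run_job, node, job) for node, job in assignments]
        return [future.result() for future in futures]


if __name__ == "__main__":
    for line in run_all(load_assignments(EXAMPLE_CONFIG)):
        print(line)
```

Using one worker per node mirrors the requirement that each job gets a dedicated node, and adding a node is a matter of adding one more section to the config text.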