GSoC/GCI Archive
Google Summer of Code 2015

CERN SFT

License: GNU Library or "Lesser" General Public License (LGPL)

Web Page: http://ph-dep-sft.web.cern.ch/article/175948

Mailing List: sft-gsoc@cern.ch

The SFT (Software for Experiments) group is part of CERN (European Organization for Nuclear Research, http://www.cern.ch) and focuses on providing common software for its experiments. CERN is one of the world’s largest and most exciting centers for fundamental physics research. Experiments at CERN have probed the fundamental nature of matter and the forces which affect it. CERN is also the birthplace of the World Wide Web (http://info.cern.ch), invented by Tim Berners-Lee.

The SFT group's efforts, like most of CERN's current activities, are directed towards the world’s highest-energy elementary particle accelerator, the Large Hadron Collider (LHC, http://public.web.cern.ch/public/en/lhc/lhc-en.html), and its experiments. There are four large experiments at the LHC (ALICE, ATLAS, CMS, LHCb), which seek to expand the frontiers of knowledge and complete our understanding of the constituents of matter and their interactions, of the conditions in the first instants after the Big Bang, and of the differences between matter and antimatter. In 2012, ATLAS and CMS announced the discovery of a new boson, which has since been confirmed to have the properties of a Higgs boson, similar to the one required by the Standard Model of Particle Physics.

NOTE: The vast majority of our GSoC projects do not require any physics knowledge.

Operating the LHC and running each experiment requires a large amount of software. A large part of this software is common and open source. The open source software spans the range from system software to more specialized physics-oriented tools and toolkits.

Students can contribute to several software projects:

  • the SixTrack accelerator simulation;
  • the Geant4/Geant-V detector simulation toolkit;
  • the ROOT software framework for storing and analyzing the data of the LHC experiments;
  • CernVM, a baseline Virtual Software Appliance for the participants of CERN LHC experiments;
  • IgProf, a tool for profiling large-scale applications.

SixTrack is a simulation tool for the trajectory of high energy particles in accelerators. It has been used in the design and optimization of the LHC and is now being used to design the upgrade that will be installed in the next decade, the High-Luminosity LHC (HL-LHC). SixTrack has been adapted to take advantage of large-scale volunteer computing resources provided by the LHC@Home project. It has been engineered to give exactly the same results after millions of operations on several very different computer platforms.

The ROOT (http://root.cern.ch/) software framework is used to handle, store and analyze the data of all LHC experiments. The experiments store both their raw data and intermediate, processed results using ROOT, because it offers an open source format that is very compact. Because the data are defined as a set of objects, particular attributes of selected objects can be accessed separately, without touching the remaining attributes. ROOT includes many tools for data analysis, including histogramming in an arbitrary number of dimensions, curve fitting, function evaluation, minimization, graphics and visualization. It also includes a built-in C++ interpreter, whose command language doubles as a scripting, or macro, language. Cling is a new C++11 standard-compliant interpreter built on top of the Clang (www.clang.llvm.org) and LLVM (www.llvm.org) compiler infrastructure. Cling is being developed at CERN as a standalone project. It is also being integrated into ROOT, giving ROOT users access to a C++11 standard-compliant interpreter. ROOT is an open system that can be dynamically extended by linking external libraries. This makes ROOT a premier platform on which to build data acquisition, simulation and data analysis systems.
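
The separate access to individual attributes described in the ROOT paragraph above can be pictured with a toy column-wise container. This is only a minimal sketch in Python, not ROOT's file format or API: the class `ColumnStore` and the attribute names `pt`, `eta` and `charge` are made up for illustration, and the assumption is simply that each attribute lives in its own array.

```python
from array import array


class ColumnStore:
    """Toy column-wise container (hypothetical; not ROOT's API or file format)."""

    def __init__(self, **typecodes):
        # One contiguous array per attribute, e.g. pt="d" for doubles.
        self.columns = {name: array(code) for name, code in typecodes.items()}

    def fill(self, **values):
        # Append one object: each attribute value goes to its own column.
        if values.keys() != self.columns.keys():
            raise KeyError("every attribute must be given exactly once")
        for name, value in values.items():
            self.columns[name].append(value)

    def read(self, name):
        # Only the requested column is touched; the others are never visited.
        return self.columns[name]


# Three illustrative "events" with made-up values.
events = ColumnStore(pt="d", eta="d", charge="b")
events.fill(pt=41.5, eta=0.3, charge=1)
events.fill(pt=17.2, eta=-1.1, charge=-1)
events.fill(pt=63.0, eta=2.0, charge=1)

# Select on a single attribute without reading eta or charge.
high_pt = [pt for pt in events.read("pt") if pt > 40.0]
```

In this layout a selection on one attribute reads one array, which is the idea behind reading only the attributes an analysis needs.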

The Geant4 toolkit (http://cern.ch/geant4) is a key component of the common physics software. It simulates the interactions of radiation with material in any setup, including the detectors of the LHC or other High Energy Physics (HEP) experiments. It has many diverse uses in other fields, such as assessing the effects of radiation on the electronics of satellites and designing improved medical detectors with specialized applications such as the Geant4 Application for Tomographic Emission, GATE (http://www.opengatecollaboration.org). LHC experiments use Geant4 to compare the signatures of events from new physics (such as the Higgs boson and particles which are candidates for dark matter) to the signatures of events coming from known interactions which could mimic them. Geant4 is created, developed and maintained by the Geant4 collaboration (http://geant4.org) of over 100 physicists and engineers from around the world, including Europe (CERN, IN2P3/France, INFN/Italy), the US (Fermilab, SLAC), Japan (KEK), Canada (TRIUMF) and Russia (Lebedev), as well as many universities. A key area of current research is extending Geant4 to utilize current and emerging computer architectures. One of these efforts is the Geant Vector Prototype project, which aims to demonstrate improved performance on the latest CPU and accelerator hardware.

IgProf (https://igprof.org) is a lightweight performance profiling and analysis tool. It can be run in one of three modes: as a performance profiler, as a memory profiler, or in instrumentation mode. When used as a performance profiler, it provides performance profiles of the application based on statistical sampling. In memory profiling mode, it can be used to obtain information about the total number of dynamic memory allocations, profiles of the "live" memory allocations in the heap at any given time, and information about memory leaks. Memory profiling is particularly important for C/C++ programs, where large amounts of dynamic memory allocation can affect performance and where very complex memory footprints need to be understood. In nearly all cases no code changes are needed to obtain profiles.

The CERN Virtual Machine (CernVM, http://cernvm.cern.ch) is a project to investigate how virtualization technologies can be used to improve and simplify the daily interaction of physicists with experiment software frameworks and the Grid infrastructure. CernVM maintains a Virtual Software Appliance designed to provide a complete and portable environment for developing and running LHC data analysis applications on any end user computer (laptop, desktop) as well as on the Grid and on Clouds.

Projects

  • Add GPU support to the Vc vectorization library: Vc is a free software library that eases explicit SIMD vectorization of C++ code. It has an intuitive API and provides portability between different compilers and compiler versions as well as between different vector instruction sets. This project aims to make Vc code portable to GPUs and the SIMT execution model.
  • Binary code browser and tester: Engineers at CERN write scientific code that has to run fast in order to process the petabyte of data their experiments produce daily. To make this possible, the code has to be analyzed and improved at the assembly level. To help with this, I plan to develop an interactive web app that lets programmers easily browse binaries and understand the control flow of a program. The project also includes an Eclipse plugin that will provide unit testing functionality.
  • Create a Standalone Tracking Library: Following the successful development of the SixTrack particle tracking API last summer, it is time to augment it with the power of parallel computing. But before the API is extended to cover new physics models and harness parallel processing capabilities, the existing API needs to be reviewed and tested. The final aim of this project is to demonstrate the numerical stability of the equations and the correctness of the implementation by extensively testing it against the legacy FORTRAN code.
  • Extension of ROOT I/O customization framework: Schema evolution is a subsystem of the ROOT framework that makes it possible to read objects written with old versions of classes without the original compiled code. This is important when dealing with long-lived serialized data while the application continues to be developed. The aim of the project is to add support for new features in schema evolution to increase flexibility and performance when reading old files.
  • HTTP/2 Support for CernVM File System: The CernVM-FS software system has a rich array of functionality, but additional software tooling could make the technology more advanced and improve its performance. This proposal suggests implementing new ideas and reworking portions of the existing code. The approach is to propose new techniques that make CernVM-FS a better, stronger and faster product.
  • Idiomatic Python from Idiomatic C++ with PyROOT and cling: A refactoring of PyROOT and Cppyy is proposed, using the advanced reflection capabilities of the new cling backend of ROOT 6 to provide more flexible automatic bindings of C++ code to idiomatic Python, substantially surpassing current technologies and improving maintainability. An API for fine-tuning these bindings will be designed and implemented for both CPython and PyPy, providing control over memory management and locking at the same time as idiomatic access to C++ classes and functions.
  • Implementation of Unused Maps in SixTrackLib Tracking Library: CERN’s SixTrack 6D Tracking Code is used to compute and simulate the trajectories of relativistic charged particles in circular accelerators. However, certain maps in SixTrack’s Physics Manual have not yet been implemented within the SixTrack 6D Tracking Code. This project aims to implement these maps in the simulator’s native language, Fortran, for use in the SixTrack simulator, and to check each map within SixTrack’s Physics Manual for physical correctness and numerical stability.
  • Implementing Runge-Kutta solvers to track particle trajectories in Geant4: The goal of this project is to implement General Runge-Kutta-Nystrom ODE solvers in the software package Geant4. These ODE solvers will have the First Same as Last property (FSAL) and dense output (they will be able to provide an estimate of the trajectory for any point in the step interval with the same order of accuracy as the original method).
  • New Methods of Field Integration - Dense output and FSAL: In the ODE stack of Geant: introduce some new Runge-Kutta methods and multistep methods; implement dense output through interpolation and extrapolation methods; implement a mechanism to dynamically choose between different methods of integration (experimental).
  • ROOT - R interface for Multivariate Data Analysis (TMVA): Using the ROOT-R interface, many of the statistical packages available in R, such as those performing multivariate analysis, can be used in ROOT. The idea of the project is to facilitate the use of some of the R packages by creating corresponding interface classes in the ROOT system. These classes should implement the interfaces required by some of the statistical packages of ROOT, and they could be loaded at run time by the ROOT plugin manager system.
  • ROOT reader for Paraview: Paraview is an application that visualizes datasets. Unlike ROOT, Paraview was designed solely for dataset visualization, providing several techniques that are not available in ROOT. The goal of this project is to make these techniques available to the ROOT community through a reader plugin for Paraview.
  • Static Analysis Suite (SAS) Enhancements: I will introduce several features that make the SAS static checker more powerful. These include a test infrastructure with CDash CI support, a user-friendly command-line interface, checkers for thread safety and overflow, and a configuration mechanism for hierarchical black/white lists. In general, I will make SAS an escort for all C/C++ projects in CERN-SFT.
  • TTreeFormula class: Reimplementation of TTreeFormula.
  • Vectorization of Philox CBRNG and VecGeom via Agner Fog's Vector Class Lib: The aim of this study is to SIMD-optimize two independent projects, Philox, a newly proposed CBRNG, and VecGeom, a high-performance HEP geometry system, on Intel’s Haswell architecture with the AVX2 instruction set. First, the internal data structure of Philox will be transformed from AoS to SoA for better auto-vectorization, and intrinsics will be used via a high-level C++ wrapper library. Second, Fog's VCL will be implemented as an alternative backend to Vc to gain more platform independence.
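
The FSAL and dense-output properties named in the two Runge-Kutta projects in the list above can be illustrated with a small Python sketch. It is not Geant4 code and not the solvers those projects propose: it assumes the Bogacki-Shampine third-order method, chosen here only because it is a well-known FSAL scheme, handles a single scalar equation with a fixed step, and uses cubic Hermite interpolation as the dense output. The function names `bs3_fsal` and `dense_output` are hypothetical.

```python
import math


def bs3_fsal(f, t0, y0, h, n_steps):
    """Fixed-step Bogacki-Shampine RK3 for a scalar ODE y' = f(t, y).

    Returns the per-step records needed for dense output and the
    number of right-hand-side evaluations actually performed.
    """
    evals = 0

    def counted(t, y):
        nonlocal evals
        evals += 1
        return f(t, y)

    y = y0
    k1 = counted(t0, y)  # the only first stage that is computed directly
    steps = []
    for i in range(n_steps):
        t = t0 + i * h
        k2 = counted(t + h / 2, y + h / 2 * k1)
        k3 = counted(t + 3 * h / 4, y + 3 * h / 4 * k2)
        y_new = y + h * (2 * k1 + 3 * k2 + 4 * k3) / 9
        k4 = counted(t + h, y_new)  # last stage: derivative at the new point
        steps.append((t, y, k1, y_new, k4))
        # FSAL: the last stage of this step is the first stage of the next.
        y, k1 = y_new, k4
    return steps, evals


def dense_output(steps, h, t):
    """Cubic Hermite estimate of y(t) inside an already computed step."""
    for t_n, y_n, f_n, y_next, f_next in steps:
        if t_n <= t <= t_n + h:
            theta = (t - t_n) / h
            dy = y_next - y_n
            return ((1 - theta) * y_n + theta * y_next
                    + theta * (theta - 1) * ((1 - 2 * theta) * dy
                                             + (theta - 1) * h * f_n
                                             + theta * h * f_next))
    raise ValueError("t lies outside the integrated interval")


# Test problem y' = -y, y(0) = 1, whose exact solution is exp(-t).
steps, evals = bs3_fsal(lambda t, y: -y, t0=0.0, y0=1.0, h=0.1, n_steps=10)
y_end = steps[-1][3]
y_mid = dense_output(steps, 0.1, 0.55)  # a point strictly inside a step
```

Because the last stage is reused, the ten steps above cost one initial evaluation plus three per step (31 in total) instead of four per step. The interpolant reuses values the stepper already has, so estimating the trajectory between step endpoints needs no extra evaluations of the right-hand side.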