After a successful first year of operations for JASMIN, which saw many exciting science collaborations taking advantage of its petascale data processing and storage, the system began to operate near capacity. It was clear there was pent-up demand for a “big data analytic environment” in the environmental science community.
As a consequence, NERC made a case to government to upgrade JASMIN, both to expand the basic resource and to provide an enhanced range of services to a wider community across NERC's science activities. The objective presented in the case was "... to enable the scientific integration of water, geological, land use, ocean and climate information, linked to predictive modelling, visualisation, reflection, assimilation and analysis capability, in order to allow the efficient and effective use of this information to further scientific understanding and decision making by industry, regulators and government, and communication with the general public."
The resulting expansion was approved in Summer 2013 and an ambitious but carefully planned programme is now under way, managed by STFC's Scientific Computing Department. This will take the form of two sub-projects based around hardware and software procurements, phased in over the next two years, with interim completion dates of March 2014 and March 2015. Together these will deliver a major expansion of storage, network, compute and associated software (the “hard” upgrade), along with massively improved cloud capability, enhanced analysis software, and user and system documentation (the “soft” upgrade).
JASMIN will expand its role as the infrastructure in which a plethora of NERC science community services are run. Key services include the academic component of the facility for Climate and Environmental Monitoring from Space (CEMS) and the Centre for Environmental Data Archival (CEDA, including the British Atmospheric Data Centre, BADC). Underlying infrastructure enhancements will improve the LOTUS processing cluster, group workspaces, virtual machine hosting, and cloud environment.
The JASMIN environment will support:
These "managed" and "unmanaged" high-performance services will be carefully isolated within the infrastructure. The complex network design and implementation this requires is a major part of the "hard upgrade" project, which will result in a non-blocking network with a theoretical bandwidth of more than 1 terabyte per second, and at least an order of magnitude improvement in the performance of massively parallel analysis tasks (via reduced latency in the network).
The core JASMIN infrastructure, to be operated as a national service, has been supplemented by additional hardware purchased on behalf of projects awarded funding from the NERC "Big Data" capital grants round. Support for these projects has been, and will continue to be, provided by the JASMIN team so that the community gets the best value for money out of the purchase and operation of their "High Performance Data" (HPD) environment.
By the end of phase 3, including resources for these partner projects, JASMIN will offer a 15 Petabyte storage infrastructure.