JASMIN Phase 4

JASMIN Phase 4 - September 2018 update

JASMIN undergoing major upgrade

Over the course of 2018, following a multi-million pound NERC investment, JASMIN is undergoing a major capacity and capability upgrade. Read on for details of what’s happening and how JASMIN Phase 4 will affect you.

Important Dates

September - October 2018
Upgrade work ongoing: possibility of disruption. Check CEDA's news updates and follow @cedanews on Twitter for the latest information.
JASMIN Phase 4 Upgrade

Storage

After 6 years of faithful service, over 5 Petabytes (PB) of storage from JASMIN Phase 1 has been retired and taken out of operation. By the end of 2018, a further 6 PB from JASMIN Phase 2 will also need to be retired. However, a total of 38.5 PB of new storage is currently being added, consisting of:

  • 3 PB of Parallel File System storage (PFS)
  • 30 PB of Scale-Out File System storage (SOF)
  • 5 PB of Object Storage (OS)
  • 0.5 PB of dedicated high-performance storage for scratch and home directory use.

These are being brought into operation in stages over the coming weeks and months; see the sections below for details of progress. Apart from the extra volume, the biggest change is moving from one type of storage to multiple different types of storage. These different storage types will allow more scalability and more flexibility in how we support different JASMIN users - and the increased flexibility will keep JASMIN at the cutting edge of petascale environmental science.

  • Details & progress: Home Directory Storage
  • Details & progress: Group Workspace Storage
  • Details & progress: Scratch Storage

Compute 

Along with the increased storage, we will be deploying 220 additional servers for the LOTUS batch cluster and the JASMIN community cloud, and 10 more servers for the JASMIN Data Transfer Zone.

Details & progress: Compute

Network & Infrastructure

With all the new storage and compute, an upgrade to the internal network is required, along with additional machine room equipment and infrastructure.

Details & progress: Network & Infrastructure

Software

The software changes will include:

  • Deploying OpenStack as a replacement for our previous cloud management infrastructure.
  • A new version of the JASMIN Cloud Portal to work with OpenStack.
  • A JASMIN-account based identity service.
  • An external data access service for group workspaces (OPeNDAP for Group Workspaces) - enabling GWS managers to expose data more easily (see the sketch after this list).
  • Support for fast access to GWS and CEDA Archive data from within the JASMIN Unmanaged Cloud (also via OPeNDAP).
  • Deploying interfaces to the object stores for climate science data.
  • Support for Cluster-as-a-Service in the JASMIN Unmanaged Cloud.
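
As an illustration of the kind of access the OPeNDAP for Group Workspaces service is intended to enable, the requests below use standard OPeNDAP URL conventions; the hostname, workspace and file names are purely hypothetical and authentication is not shown:

    # Hypothetical host and dataset path - illustrative only, not the real service endpoint
    # Fetch the dataset's attribute metadata (OPeNDAP ".das" response)
    curl "https://gws-opendap.example.ac.uk/myproject/data/tas_2018.nc.das"
    # Fetch an ASCII subset of a single variable using an OPeNDAP constraint expression
    curl "https://gws-opendap.example.ac.uk/myproject/data/tas_2018.nc.ascii?tas[0:9]"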

Watch this space for further details about these new developments and the benefits they will bring to users of JASMIN.

Details & progress: Software

Home Directory Storage

Dedicated flash-based storage has been purchased for user home directories. This will enable users to have a larger home directory quota (exactly how large is yet to be decided), and should significantly improve performance for tasks involving many small files (for example code compilation). It should also increase system uptime and perceived reliability, by decoupling home directory storage from the high-volume storage.

Home directories were migrated over to this new storage on 14 March 2018.

Back to Overview

Integration of Phase 4 SOF Storage IN PROGRESS

A total of 30 PB of Scale-Out-File System storage (SOF) is currently being deployed for operational use supporting the CEDA Archive and Group Workspaces. A number of issues have delayed progress with this, and current expectations are for this storage to be available in early October 2018. 

After addressing several issues with the physical hardware following installation, the SCD systems team has been working with the storage and network vendors to resolve a networking issue. A solution was identified in August, but further testing under heavy sustained network load provoked some packet loss, traced to an issue with the operating system kernel. This is currently under investigation with the OS vendor: it is improved with 4.x kernels but as yet unresolved with 3.x kernels.

Further concerns had been fuelled by apparent reliability issues with the Phase 3.5 SOF storage (used for /group_workspaces/jasmin4 storage: the target of the recent migrations described below, but of the same type as the Phase 4 storage). These were traced to bugs in the SOF client and server software, which are now believed to be resolved in the versions deployed at the end of August.

Final benchmarking and installation of these latest versions is taking place, to be followed by configuration of systems monitoring and management checks before the storage is put into production.

We are aware of the critical need for new space and are doing everything we can to prioritise work on making this new storage available: it is badly needed both by Group Workspaces and by the CEDA Archive, which shares the same storage infrastructure. Watch this space for further details.

Migration from JASMIN Phase 1 storage COMPLETED

Earlier this year, we migrated (behind the scenes) over 2 PB of CEDA Archive data from its previous location to some of the new storage detailed above. A similar task was completed in April/May for many Group Workspaces, moving them:

  • from paths /group_workspaces/jasmin/* to /group_workspaces/jasmin2/* or /group_workspaces/jasmin4
  • from paths /group_workspaces/cems/* to /group_workspaces/cems2/* or /group_workspaces/jasmin4

Note that the cems prefix is now deprecated and will no longer be used for newly-created storage volumes: the infrastructure is now simply known as JASMIN.

  • You are strongly advised to ensure that any scripts, programs or references DO NOT USE ABSOLUTE PATHS to the workspace (see the sketch after this list).
  • Please avoid using inter-volume symlinks (this article explains why).
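
One way to follow this advice is to define the workspace location in a single place and refer to it indirectly everywhere else. In this minimal sketch the variable name MYGWS, the workspace path and the file name are purely illustrative:

    # Illustrative only: set the workspace root once, e.g. in ~/.bashrc or at the top of a script
    export MYGWS=/group_workspaces/jasmin4/myproject

    # Refer to the variable rather than hard-coding the path, so a future move is a one-line change
    mkdir -p "$MYGWS/results"
    cp model_output.nc "$MYGWS/results/"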

For paths /group_workspaces/jasmin4/* the storage system is now SOF storage (in this phase, supplied by Quobyte), which has different features from the Panasas storage used to date and is (in its current configuration) NOT capable of shared-file MPI-IO. If you run code which makes use of this feature, and it is essential to your work, please let the JASMIN team know via the CEDA Helpdesk (support@ceda.ac.uk) so that we can discuss this with you: ideally you should use the MPI-IO-capable scratch area for your processing, then move your outputs to your GWS. Such codes MUST NOT be run against storage without this capability.
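
A minimal sketch of that workflow is shown below; the program name, output file and workspace path are placeholders, and in practice the processing step would normally run inside a LOTUS batch job:

    # Placeholder names throughout - illustrative workflow only
    SCRATCH_DIR=/work/scratch/$USER/run_$$          # MPI-IO-capable scratch area
    GWS_DIR=/group_workspaces/jasmin4/myproject     # final destination (no shared-file MPI-IO)

    mkdir -p "$SCRATCH_DIR"
    mpirun ./my_mpiio_model --output "$SCRATCH_DIR/output.nc"   # shared-file writes go to scratch
    mv "$SCRATCH_DIR/output.nc" "$GWS_DIR/"                     # move the results to the GWS afterwards
    rmdir "$SCRATCH_DIR"                                        # tidy up the scratch directory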

Although data in GWSs on jasmin2 and cems2 will not be moving in the first phase, some of the data being migrated will move onto the same underlying storage, which may cause some short-term limits on expansion within those GWSs - and at some point during 2018 all of these data will be migrated as well, so that the Phase 2 storage can be retired.

All migrations will be done behind the scenes initially to a new (hidden) volume, while the old volume remains live. At a time to be arranged with the Group Workspace Manager, a final “sync” will take place before the new volume will be renamed to replace the old.  During this short period (hours) the GWS will not be available. Once the change has taken place, the old volume will no longer be accessible.

Unfortunately because of the timing of the various procurements and retirements, and the reorganisations necessary to take advantage of the new storage, some GWS may require more than one move during 2018. We will of course try to minimize disruption.

Where has my group workspace gone?

Please note that storage locations with paths starting /group_workspaces/jasmin4 are now automounted. This means that they are not mounted on a particular host until the moment they are first accessed. If the workspace you are expecting to see is not listed at the top level (/group_workspaces/jasmin4/), you should ls the full path of the workspace, and after a very short delay the workspace should appear. This also explains reports of different workspaces being mounted on different JASMIN machines: only those which are being actively used are mounted at any one time. If you are still unable to find your GWS, please contact your GWS manager in the first instance, as they should have up-to-date information about their GWS and can liaise with CEDA support on your behalf.
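
For example (the workspace name myworkspace is illustrative):

    # The workspace name is illustrative
    ls /group_workspaces/jasmin4/                # a not-yet-mounted workspace may be missing from this listing
    ls /group_workspaces/jasmin4/myworkspace     # accessing the full path triggers the automount
    ls /group_workspaces/jasmin4/                # after a short delay the workspace appears in the listing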

As a GWS manager, I used to get regular emails about usage/status of my GWS: why have these stopped?

Unfortunately, the move to SOF storage has temporarily broken this reporting system. Once the SOF storage is fully in production, the scripts used to generate these alerts will be updated to work with the new storage; however, the API used to interrogate the storage is not yet available to us for this purpose.

You can get basic information about GWSs using the appropriate pan_df -H or df -H command, depending on which file system the GWS is stored on:

  • Panasas (PFS): /group_workspaces/jasmin2/NNN and /group_workspaces/cems2/NNN. Use pan_df -H <PATH>. Divide the values other than % by 1.3.
  • Quobyte (SOF): /group_workspaces/jasmin4/NNN and /gws/nopw/j04/NNN. Use df -H <PATH>. No need to divide the values.

where <PATH> is the full path to the GWS volume, including its name, e.g. /group_workspaces/jasmin2/NNN.

Note: both of these commands return values in "vendor" units, i.e. powers of 1000, commonly used for storage, not powers of 1024.
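
For example (the workspace names are illustrative):

    # Workspace names are illustrative
    pan_df -H /group_workspaces/jasmin2/myproject   # Panasas (PFS): divide the reported values (except %) by 1.3
    df -H /gws/nopw/j04/myproject                   # Quobyte (SOF): values can be used as reported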

Back to Overview

Scratch Storage

On 14th March, /work/scratch (used as intermediate storage by LOTUS jobs, not for general interactive use) was split into two areas. You now need to decide which to use:

  • A new /work/scratch area was created on newer storage. However, you should configure your software to use this ONLY if you need shared-file writes with MPI-IO.
  • A second, larger area, /work/scratch-nompiio, has been created (size 250 TB) on new flash-based storage, which should bring significant performance benefits, particularly for operations involving lots of small files. PLEASE USE THIS AREA unless you have a good reason to use the other.

Please remember that both scratch areas are shared resources, so consider other users and remember to clean up after yourself!
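
A minimal sketch of typical use of the non-MPI-IO area follows; the per-user directory name is purely illustrative:

    # Directory name is illustrative; keep intermediate files in a per-user directory and tidy up afterwards
    WORKDIR=/work/scratch-nompiio/$USER/job_$$
    mkdir -p "$WORKDIR"

    # ... run your processing here, writing intermediate files to $WORKDIR ...

    rm -rf "$WORKDIR"    # clean up when the job has finished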

Back to Overview

Compute

Compute nodes added during this phase have so far been used for benchmark testing of the Phase 4 SOF storage. Once the storage has been passed for operation, the compute nodes will be deployed for operational use within LOTUS and the JASMIN community cloud, and an additional 10 servers will be deployed for high-performance data transfer services within the Data Transfer Zone.

Back to Overview

Network & Infrastructure

  • JASMIN4 "super-spine" and associated network enhancements completed
  • Expansion and upgrade of the management network completed
  • Installation of additional racks, power units, cabling and environmental monitoring completed
Back to Overview

Software

  • OPeNDAP4GWS (OPeNDAP for Group Workspaces) now deployed in production, with the capability for tenant projects to autonomously expose their GWS data via a common web interface and the OPeNDAP protocol
  • Testbed in place for high-performance OPeNDAP for the CEDA Archive
  • Storage interfaces proof-of-concept project delivered a prototype HSDS deployment, ready for use with the Object Store when available
  • OpenStack (cloud management system) deployed in production
  • Migration of most cloud tenants to OpenStack completed
  • Dask cluster-as-a-service testbed deployed in Kubernetes as proof-of-concept
  • Containerised Jupyter Notebook service deployed in Kubernetes as proof-of-concept
  • New version of JASMIN cloud portal deployed in production for management of OpenStack-based cloud tenancies
  • JASMIN-account based identity service deployed
Back to Overview