Over the course of 2018, following a multi-million pound NERC investment, JASMIN is undergoing a major capacity and capability upgrade. Read on for details of what’s happening and how JASMIN Phase 4 will affect you.
After six years of faithful service, over 5 Petabytes (PB) of storage from JASMIN Phase 1 has been retired and taken out of operation. By the end of 2018, a further 6 PB from JASMIN Phase 2 will also need to be retired. However, a total of 38.5 PB of new storage is currently being added, consisting of:
These are being brought into operation in stages over the coming weeks and months; see the sections below for details of progress. Apart from the extra volume, the biggest change is moving from one type of storage to multiple different types. These different storage types will allow more scalability and more flexibility in how we support different JASMIN users - and the increased flexibility will keep JASMIN at the cutting edge of petascale environmental science.
Details & progress: Home Directory Storage
Details & progress: Group Workspace Storage
Details & progress: Scratch storage
Along with the increased storage, we will be deploying 220 additional servers for the LOTUS batch cluster and the JASMIN community cloud, and 10 more servers for the JASMIN Data Transfer Zone.
Details & progress: Compute
With all the new storage and compute, an upgrade to the internal network is required, and additional machine room equipment and infrastructure will be necessary.
Details & progress: Network & Infrastructure
The software changes will include:
Watch this space for further details about these new developments and the benefits they will bring to users of JASMIN.
Details & progress: Software
Dedicated flash-based storage has been purchased for use as storage for user home directories. This will enable users to have a larger home directory quota (just how big is yet to be decided), and should significantly increase performance when performing tasks involving the handling of small files (for example code compilation). It should also increase system uptime and perceived reliability, by decoupling home storage from the high-volume storage.
Home directories were migrated over to this new storage on 14 March 2018.
A total of 30 PB of Scale-Out-File System storage (SOF) is currently being deployed for operational use supporting the CEDA Archive and Group Workspaces. A number of issues have delayed progress with this, and current expectations are for this storage to be available in early October 2018.
After addressing several issues with the physical hardware following installation, the SCD systems team has been working with the storage and network vendors to resolve a networking issue. A solution was identified in August, but further testing under heavy sustained network load provoked some packet loss, traced to an issue with the operating system kernel. This is currently under investigation with the OS vendor: it is improved with 4.x kernels but as yet unresolved with 3.x kernels.
Further concerns had been fuelled by apparent reliability issues with Phase 3.5 SOF storage (used for /group_workspaces/jasmin4 storage: the target of the recent migrations described below, but of the same type as the Phase 4 storage). These were traced to bugs in the SOF client and server software, which are now believed to be resolved in the versions deployed at the end of August.
Final benchmarking and installation of these latest versions is taking place, to be followed by configuration of systems monitoring and management checks before the storage is put into production.
We are aware of the critical need for new space and are doing everything we can to prioritise work on making this new storage available: it is badly needed both by Group Workspaces and by the CEDA Archive, which shares the same storage infrastructure. Watch this space for further details.
Earlier this year, we migrated (behind the scenes) over 2 PB of CEDA Archive data from its previous location to some of the new storage detailed above. A similar task was completed in April/May for many Group Workspaces, moving them:
Note that the cems prefix is now deprecated and will no longer be used for newly-created storage volumes: the infrastructure is now simply known as JASMIN.
For /group_workspaces/jasmin4/*, the storage system is now SOF storage (in this phase, supplied by Quobyte), which has different features from the Panasas storage used to date, and is (in its current configuration) NOT capable of shared-file MPI-IO. If you run code which makes use of this feature, and it is essential to your work, please let the JASMIN team know via the CEDA Helpdesk email@example.com so that we can discuss this with you: ideally you should use the MPI-IO-capable scratch area for your processing, then move your outputs to your GWS. Such code MUST NOT be run against storage without this capability.
Although data in GWS on cems2 will not be moved in the first phase, some of the migrating data will land on the same underlying storage, which may impose some short-term limits on expansion within those GWS - and at some time during 2018 all of these data will also be migrated so that the Phase 2 storage can be retired.
All migrations will initially be done behind the scenes to a new (hidden) volume while the old volume remains live. At a time arranged with the Group Workspace Manager, a final “sync” will take place, after which the new volume will be renamed to replace the old. During this short period (a few hours) the GWS will not be available. Once the change has taken place, the old volume will no longer be accessible.
Unfortunately, because of the timing of the various procurements and retirements, and the reorganisations necessary to take advantage of the new storage, some GWS may require more than one move during 2018. We will of course try to minimise disruption.
Please note that storage locations with paths starting /group_workspaces/jasmin4 are now automounted. This means that they are not mounted on a particular host until the moment they are first accessed. If the workspace you are expecting to see is not listed at the top level (/group_workspaces/jasmin4/), you should ls the full path of the workspace, and after a very short delay the workspace should appear. This also explains reports of different workspaces being mounted on different JASMIN machines: only those which are being actively used are mounted at any one time. If you are still unable to find your GWS, please contact your GWS Manager in the first instance, as they should have up-to-date information about their GWS and can liaise with CEDA support on your behalf.
Unfortunately, the move to SOF storage has temporarily broken this reporting system. Once the SOF storage is fully in production, the scripts used to generate these alerts will be updated to work with the new storage; however, the API used to interrogate the storage is not yet available to us for this purpose.
You can get basic information about a GWS using the appropriate pan_df -H or df -H command, depending on which file system the GWS is stored on:

|File system|Command|Note|
|---|---|---|
|Panasas|pan_df -H PATH|Divide values other than % by 1.3|
|SOF|df -H PATH|(No need to divide values)|

PATH is the full path to the GWS volume, including its name. e.g.
Note: both of these commands return values in "vendor" units, i.e. powers of 1000, commonly used for storage, not powers of 1024.
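For example, a raw pan_df -H capacity figure can be converted to approximate usable space by dividing by 1.3, as per the table above. A small awk sketch (the 130 TB figure is invented for illustration, not a real GWS quota):

```shell
# Convert a raw pan_df -H capacity figure to approximate usable space.
# The 1.3 divisor comes from the guidance above; 130 is an example
# value, not a real quota.
raw_tb=130
usable_tb=$(awk -v v="$raw_tb" 'BEGIN { printf "%.1f", v / 1.3 }')
echo "$usable_tb TB usable"   # -> 100.0 TB usable
```

The percentage column needs no conversion, since the overhead cancels out in the ratio.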
On 14th March, /work/scratch (used as intermediate storage by LOTUS jobs, not for general interactive use) was split into two areas. You now need to decide which to use:

The /work/scratch area was created on newer storage. However, you should configure your software to use this ONLY if you think you need shared-file writes with MPI-IO.

The /work/scratch-nompiio area has been created (size 250 TB) on new flash-based storage, which should have significant performance benefits, particularly for operations involving lots of small files. PLEASE USE THIS AREA unless you have a good reason to use the other.
Please remember that both scratch areas are shared resources, so consider other users and remember to clean up after yourself!
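One simple way to keep your scratch usage tidy, sketched below on the assumption that a per-user directory under the scratch area is acceptable (the job-script conventions here are illustrative, not a JASMIN requirement), is to create a per-job directory and remove it on exit:

```shell
# Per-job scratch hygiene sketch. On JASMIN the base would be
# /work/scratch-nompiio; here we fall back to a temp dir so the
# sketch runs anywhere.
SCRATCH_BASE=/work/scratch-nompiio
[ -d "$SCRATCH_BASE" ] || SCRATCH_BASE=$(mktemp -d)

jobdir="$SCRATCH_BASE/$USER/job-$$"   # unique per user and process
mkdir -p "$jobdir"
trap 'rm -rf "$jobdir"' EXIT          # clean up even if the job fails

echo "intermediate data" > "$jobdir/part0.dat"
ls "$jobdir"
```

The trap ensures the job-specific directory disappears when the script exits, whether it succeeds or not, so abandoned intermediate files do not accumulate.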
Compute nodes added during this phase have so far been employed for benchmark testing the Phase 4 SOF storage. Once the storage is passed for operation, the compute nodes will be deployed for operational use within LOTUS and the JASMIN community cloud, and an additional 10 servers deployed for use with high-performance data transfer services within the Data Transfer Zone.