DoC Computing Support Group


Differences between revisions 21 and 22
Revision 21 as of 2012-04-26 18:38:26
Size: 4987
Editor: dcw
Comment:
Revision 22 as of 2012-04-26 18:51:16
Size: 5716
Editor: dcw
Comment:
DoC Private Cloud

Project Goal

Build a DoC Infrastructure-as-a-service private cloud, very like Amazon EC2 ("Elastic Compute Cloud"), presenting a secure and convenient web interface through which users of DoC can specify and create VMs and associated storage, automatically install OSes on them and deploy them.

The main goal is to virtualize most research servers, decoupling the OS image from the hardware for greater flexibility and sharing (amortizing) the costs of each machine. One driver of this is EPSRC deciding to provide only 50% of any hardware bid over £10K in future, with the Dept expected to pay the remaining 50%.
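As a rough illustration of the funding arithmetic, here is a minimal Python sketch. It assumes that the 50% rule applies to the whole of any bid over £10K and that smaller bids are still funded in full; neither reading is confirmed by the text above, and the function name `funding_split` is purely illustrative.

```python
from decimal import Decimal

THRESHOLD = Decimal("10000")   # the £10K threshold mentioned above
EPSRC_SHARE = Decimal("0.5")   # 50% contribution on bids over the threshold


def funding_split(bid):
    """Return (epsrc_part, dept_part) in pounds for a hardware bid.

    Assumptions (not stated in the source): bids at or below the
    threshold are funded in full, and the 50% rule applies to the
    whole of any larger bid.
    """
    bid = Decimal(bid)
    if bid <= THRESHOLD:
        return bid, Decimal("0")
    epsrc_part = bid * EPSRC_SHARE
    return epsrc_part, bid - epsrc_part


if __name__ == "__main__":
    for amount in (8000, 40000):
        epsrc, dept = funding_split(amount)
        print(f"bid £{amount}: EPSRC £{epsrc}, Dept £{dept}")
```

Under these assumptions, a £40K cluster bid would leave the Dept with a £20K bill, which is the cost that sharing hardware through the cloud is meant to spread.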

This project will definitely proceed, having been approved by Executive Committee and by two open meetings of Academic staff.

Peter McBrien (PJM) is leading the project, and has laid out two stages:

  1. a 6-month phase in which CSG (advised by an academic working group) will design and build a prototype cloud, recruiting a "Cloud Manager" person to join CSG, possibly for 6 months in the first instance. The Department will spend a significant amount of money to build the prototype cloud, perhaps in the £100-200K range.
  2. assuming the prototype cloud is successful, it will move into production and the "Cloud Manager" will become permanent. Researchers would then be encouraged to add research-funded hardware to the cloud and would be given some form of preferential treatment on "their hardware". All members of CSG are keen to gain cloud-related skills from the "Cloud Manager". (By the way, the Cloud Manager will do non-cloud-related systems administration too.)

Most crucially: (despite not knowing the exact set of services to provide, let alone how to implement them, or having yet appointed a Cloud Manager person) the Department's £100-200K investment must be fully spent before the end of July. This means that all kit must be ordered, delivered and paid for before the 31st of July. With the Olympics making deliveries difficult, this means that everything must have been ordered by 1st July.

Re: amount to spend, Anne O'Neill (AON) and Peter McBrien (PJM) suggested CSG prepare possible collections of commodity hardware, on the assumption that either £100K, £150K or £200K (inc. VAT) would be spent.

Background

In January 2012, Susan told DCW that DoC were thinking of hiring someone for 6 months into CSG, specifically tasked with building a DoC private cloud. Exec Committee had agreed to this, and were happy to spend a significant pot of money on it, within this financial year.

She explained the core idea was "virtualisation even for research clusters": at present, research groups buy clusters when they have money, and CSG set them up, install "linux du jour" on them [which changes each year], and configure fileservers (if part of the cluster), tape backups (if part of the cluster), special software for processing nodes etc.

Then the servers age and the OS is essentially frozen (it's often difficult to persuade researchers that we should reinstall their fileservers, webservers and compute nodes). They become "fragile". Sometimes it's hard to even retire them on schedule (after 4, 5 or 6 years, or whatever). Also, these clusters are often only accessible by members of that research group, so the resource may not be fully utilised.

Susan's vision: set up a private cloud; researchers add hardware to that cloud's core resources, then create a VM for each cluster node, perhaps tied (1-1 at first) to their own hardware. The creation process should automatically install a CSG-supported operating system (historically, supported Linuxes and Windows versions) on the new VM. Researchers work as before on each VM, but each node is encapsulated inside a VM. Later, these VMs could share resources, for example when the group doesn't need 100% of its resources, or when new, more powerful hardware is purchased.

We would also gain the flexibility to create short-term VMs for specific "run this software" experiments. They would then be automatically destroyed.
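To make the "automatically destroyed" idea concrete, here is a minimal Python sketch of lifetime-based cleanup. Everything in it (the `ShortTermVM` class, the `reap_expired` function, the choice of a fixed lifetime per VM) is a hypothetical illustration, not a description of any cloud software we have chosen; a real reaper would also have to call whatever delete operation the cloud platform provides.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ShortTermVM:
    """Hypothetical record of a VM created for one experiment."""
    name: str
    created: datetime
    lifetime: timedelta

    def expired(self, now):
        # A VM expires once its lifetime has fully elapsed.
        return now >= self.created + self.lifetime


def reap_expired(vms, now):
    """Split vms into (kept, destroyed) lists according to expiry at `now`."""
    kept = [vm for vm in vms if not vm.expired(now)]
    destroyed = [vm for vm in vms if vm.expired(now)]
    return kept, destroyed


if __name__ == "__main__":
    start = datetime(2012, 5, 1, 9, 0)
    vms = [
        ShortTermVM("experiment-a", start, timedelta(hours=2)),
        ShortTermVM("experiment-b", start, timedelta(days=7)),
    ]
    kept, destroyed = reap_expired(vms, start + timedelta(days=1))
    print("kept:", [vm.name for vm in kept])
    print("destroyed:", [vm.name for vm in destroyed])
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the sketch easy to test.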

We could even give every DoC user (students and staff!) their very own VM when they join, with full root/admin access - or at least the ability to create one when they first need it (lazy evaluation:-)).
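The "lazy evaluation" option can be sketched in a few lines of Python: no VM exists for a user until they first ask for one, after which the same VM is handed back each time. The `VMRegistry` class and the `create_vm` callback are hypothetical names for illustration only; in practice the callback would wrap whatever provisioning call the chosen cloud software offers.

```python
class VMRegistry:
    """Hypothetical per-user VM registry that creates VMs on first use."""

    def __init__(self, create_vm):
        # create_vm is an assumed callback: username -> VM handle.
        self._create_vm = create_vm
        self._vms = {}

    def vm_for(self, username):
        # Create the user's VM only the first time it is requested.
        if username not in self._vms:
            self._vms[username] = self._create_vm(username)
        return self._vms[username]


if __name__ == "__main__":
    created = []

    def fake_create(username):
        # Stand-in for real provisioning: just record the request.
        created.append(username)
        return f"vm-{username}"

    registry = VMRegistry(fake_create)
    print(registry.vm_for("dcw"))   # first request triggers creation
    print(registry.vm_for("dcw"))   # second request reuses the same VM
    print("creations:", created)
```

Compared with creating a VM for everyone when they join, this approach only spends resources on users who actually want a VM.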

Various discussions with CSG, PJM and AON followed, clarifying things quite a lot.

Open Staff cloud meetings

In April 2012, the discussion was opened out to all interested staff, and (so far) two open staff cloud meetings have been held. Here are some notes taken by DCW and LDK of the discussions at both meetings.

Open Staff Meeting 1 - April 3rd 2012

Open Staff Meeting 2 - April 25th 2012

Cloud Services and Software Investigations

CSG have been performing initial investigations of hardware to buy (whether all commodity, or including commercial filers like NetApp) and of possible software that might be able to implement some or all of the required IaaS cloud services. Here are our notes:

Software Investigations

Cloud Hardware

Susan expressed a serious preference for open source software running on commodity hardware. That way, commodity hardware could be repurposed, even if the cloud project failed completely.

Against that, Peter Pietzuch strongly recommended that we also investigate commercial scalable storage systems - specifically NetApp "scalable NAS" units. We are investigating these and will report back soon.

Possible hardware to buy will be added here soon; we have already done quite a lot of work on this.

 
 

project/privatecloud (last edited 2013-11-13 19:27:43 by dcw)