DoC Private Cloud
Services
Initially, the following services will be needed:
- Virtual-machine hosting / automated provisioning facility.
- Persistent backing-store for VM images.
- High-performance POSIX file-store access / scratch areas.
Background
Sometime in early 2012, Susan told DCW that DoC were thinking of hiring someone for 6 months into CSG, specifically tasked with building a DoC private cloud. Essentially she said that Exec Committee has found some significant pot of money which needs to be spent this financial year.
She explained the core idea was "virtualisation even for research clusters": at present, research groups buy clusters when they have money, CSG set them up, install "linux du jour" on them, configure fileservers (if part of cluster), tape backups (if part), processing node special software etc.
Then the servers age, the OS is essentially frozen (it's often difficult to persuade researchers that we should reinstall their fileservers, webservers and compute nodes). They become "fragile". Sometimes it's hard to even retire them on schedule (4/5/6 years or whatever). Also these clusters are often only accessible by members of that research group so the resource may not be fully utilised.
Susan's vision: set up a private cloud; researchers add hardware to that cloud's core resources, then create a VM for each cluster node, perhaps tied (1-1 at first) to their own hardware. CSG install that virtual cluster node's OS and researchers work as before, but each node is encapsulated inside a VM. Later, these VMs could share resources, when the group don't need 100% of their resources or when new more powerful hardware is purchased.
Various discussions with PJM and AON followed: the post will be called "Cloud Manager", be part of CSG, and do non-cloud things too. Could be permanent, could be 6 months in the first instance.
Most crucially: despite not knowing the exact spec or the services to provide, let alone how to implement them, we need to purchase all the kit and have it delivered in July 2012, before the Olympics. PJM added "build a private cloud like Amazon EC2 does", AON suggested a budget of £100K, £150K or even £200K - we will provide possible plans for these price levels.
DWM has spent a lot of time evaluating Ceph as a possible S3/Elastic Block Store like storage system for supporting VM storage and possibly very high speed filesystems eg. staging areas for VM data (scaleout NAS with replication). So far: it's not there yet. Alternatives need to be looked at as well..
Working Group: 3rd April 2012 meeting
A working group of academics has been set up; it met for the first time on 3rd April 2012. Things discussed:
- PJM/Susan: background (spend money now, define services later), acknowledged unusual approach.. added (PJM) idea that a group can have a VM per project per year if they need, so they build new apps on the latest supported OS, while maintaining the ability to run their old versions on the older OS, allows people to try old code on new OS releases without "big bang" server upgrade problems. old VMs can eventually wither away.. want to save RAs (and CSG?) sysadmin time.
- PJM: start with concept of: every student gets a VM as they walk in through the door, keep while at College, have root access [need to fix/avoid NFS problem]. users should have the ability to create more VMs programmatically, both short term and long term ones.
- PJM: also, are we all agreed: it's got to be a reliable production system. No one disagreed (but see later discussions).
- JAMM: use cases of interest to her - projects into cloud technologies, pervasive computing exercises could be made more flexible [not sure how], some of her research involves streaming data from sensors, need high-capacity filestores.
- PRP: EPSRC refers to "every research grant puts in for a small cluster" as "vanity clusters". EPSRC is favouring shared resources (Dept, College, federated): it will allocate at most the first £10K of equipment, and any excess must have matching funds from the Dept! It favours (for example) shared services, grids, clouds and HPC.
- PRP added: VMs can really speed up provisioning of research project kit. Instead of purchasing kit, waiting for it to arrive, installing and configuring it, using and maintaining it, then (after the project) deciding what to do with it, a group can create 16 short term VMs bound to suitable hardware very quickly, do quick experiments and release the VMs' resources. If spare hardware capacity is in hand, of course!
- PRP agreed with Julie that research into cloud and distributed systems performance could be improved if we had a cloud which we could monitor and tweak.
- JD: 2 important aspects of cloud here: 1. easily provisioned VMs; 2. amortization of all resources over multiple projects. The latter requires that researchers don't require all of their "own" resources "all" of the time - otherwise none spare!
- PJM/Susan: the matching funds model allows Dept to demand up to 50% of these shared resources [on average over time, perhaps front-loaded so "owners" get the majority of time up front, release nearly all resources later for general use].
- CCADAR: will sometimes need exclusive access to all "your" cluster VMs on all your hardware for experiments - repeatability is especially important. => need ability to pin VMs onto particular classes of node.
- PRP: Yes, and sometimes experiments need to happen directly on the bare metal, but only a small minority!
- JAMM: performance monitoring very important.
- WJK: yes, including power monitoring of the physical VM hosts, a la picards. very useful.
- GCASALE: agreed, added a subtle point about frequency of monitoring being very different between "cheap" power mon and "expensive" power mon.. LDK discussing with him.
- SUSAN: Maja had mentioned that she makes a very large amount of use of Matlab, on Windows clusters, buying extra parallel licenses etc. PJM: why not use College standard license? DCW: believe extra modules and parallel licenses not included in College Matlab license, which is why ICT HPC kit doesn't support Matlab either!
- TORA: Lab are very interested in more continuous autotesting, need a better sandbox: like a short term VM to run student code in! Also very interested in scalable storage (didn't say why?)
- JD/SUSAN discussed: where are other Computing Depts with clouds? at any level (Dept, College, federated?) - answer seems to be: none known in production.
- DWM added that LESC had done lots of "cloud v1" - grid - related work, and mentioned the similarities between grids, private clouds, batch processing and HPC.
- PRP said that we should make more use of ICT's HPC, big resource. Susan said: some use ICT extensively (eg PHJK). PJM added that PHJK has found ICT HPC support very helpful, has invested money in more HPC kit, and believes we should make more use of college HPC. SUSAN added that she/Khanwal have found HPC team a bit sniffy about running Java code on HPC kit.
- DCW said: yes, real programmers in HPC:-), and added that lots of money still going in though - let's use it. DCW added: HPC doesn't even let you access College home dirs cos they're "not fast enough" (source: Simon Burbidge, ICT), and mentioned that ICT also upgrading to VMware ESX 5, which "supports cloud" (but DCW doesn't know what that means).
- PJM asked re: this - does everyone want DoC home dirs and research volumes accessible from VMs? everyone agreed, but several people pointed out that existing fileservers can be saturated by Condor so fileservers will need to scale more to cope.
- DCW asked: what about Amazon S3 - simple distributed (key,value) storage system - important to DoC? some people said "might be useful" but no one had a solid use case.
- WJK added that he'd love to do storage speed experiments using different speed storage eg. flash and raid levels.
- TORA added that a large scalable block storage system would be very useful, but neglected to say why.
- DWM said there seems to be a need for scalable storage at some level as part of the cloud, there are a variety of technologies - open source and commercial - to look at. Amazingly, he didn't even say "Ceph":-)
- PJM said that commercial filers should be looked into, such as NetApp/EMC. PRP added that cloud storage is NetApp's bread and butter and their support and scalability was really good. Susan said DoC should consider these, but has a preference for open source if possible. DCW: CSG need to investigate NetApp with PRP/Cambridge/ICT help.
- SUSAN reported that DR had initially said - CSG do everything his group needs, why need a cloud. However, when she asked him - could your group use more scalable storage, his eyes lit up!
- DWM: so we conclude that scalable storage is very important? general vague agreement.
- DCW summary: so cloud storage needs to hold VM images, it's not clear whether the same cloud storage subsystem should also support scalable filesystems, or whether fileservers are separate (but need to scale more). No estimate of size! S3 probably not important (optional extra).
- GCASALE asked: what type of cloud? private? DCW/PJM: yes. GCASALE: what about cloudbursting? DWM: what's that? GCASALE: the ability to upload VMs to Amazon after development (or when short term extra resources are needed), maybe downloading VMs from Amazon too, i.e. general interoperability with Amazon. PJM: useful if possible.
- "tall chap in green shirt": what about network bandwidth? 10Gb links? may also need bandwidth reservation in switch fabric. DWM: talking with ICT networking about 10Gb, they can also discuss bandwidth reservation.
- "natasha's phd student in her place": their group are very interested in virtualizing algorithms but still using FPGAs and GPUs, and again more scalable storage is needed here.
- WL agreed, saying some VM hosts definitely need to have GPUs and FPGAs (he can provide details and costs). DCW added that Amazon EC2 had VMs with access to GPUs and FPGAs etc in their pricing model.
- WL added that he'd be very interested in "getting under the hood" and tweaking and monitoring how various aspects of the cloud operate. PJM said: may be contrary to production cloud - but perhaps a "sandpit cloud" could fork off the main cloud on occasion, grab some hardware etc. WJK agreed.
- PJM talked about a cost accounting model, enforcing 50% maximum usage, sounded very complicated (DCW: god knows how that's even implemented! perhaps logging use for post-analysis). WJK wondered whether anything that heavy was needed.
- JD asked: would we give access to people outside of DoC? DCW: no, our resources, our users. JD: power of clouds (and interesting research topics) is when you get to federating.
- PJM: might be open to sharing with ICT, maybe specific research projects later?
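The "logging use for post-analysis" idea for the 50% model could look something like the following. This is only a rough sketch, not anything agreed at the meeting: the `UsageRecord` log format, the function names and the `grace_months` parameter are all hypothetical, and the notes do not settle whether the 50% figure caps the owning group or the rest of the Dept, so the code simply reports the running (cumulative) share for whichever group is passed in. Using a cumulative share is what allows front-loading: a group can take most of the pool early on provided its average falls later.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    """One line of a hypothetical usage log for a shared pool of VM hosts."""
    month: int         # months since the hardware joined the cloud (0-based)
    group: str         # group that ran the VMs
    node_hours: float  # node-hours consumed on the pool in that month


def cumulative_share(records, group):
    """Return {month: fraction of all node-hours so far that `group` used}."""
    group_hours = defaultdict(float)
    total_hours = defaultdict(float)
    for rec in records:
        total_hours[rec.month] += rec.node_hours
        if rec.group == group:
            group_hours[rec.month] += rec.node_hours

    shares = {}
    group_so_far = 0.0
    total_so_far = 0.0
    for month in sorted(total_hours):
        group_so_far += group_hours[month]
        total_so_far += total_hours[month]
        shares[month] = group_so_far / total_so_far if total_so_far else 0.0
    return shares


def months_over_limit(shares, limit=0.5, grace_months=0):
    """List months at or after the grace period whose cumulative share exceeds `limit`."""
    return [m for m, s in sorted(shares.items())
            if m >= grace_months and s > limit]


if __name__ == "__main__":
    # Made-up numbers: the owning group front-loads, then releases the pool.
    log = [
        UsageRecord(0, "owner", 90), UsageRecord(0, "dept", 10),
        UsageRecord(1, "owner", 50), UsageRecord(1, "dept", 50),
        UsageRecord(2, "owner", 10), UsageRecord(2, "dept", 90),
        UsageRecord(3, "dept", 100),
    ]
    shares = cumulative_share(log, "owner")
    print(shares)
    print(months_over_limit(shares, limit=0.5, grace_months=2))
```

Nothing here enforces anything at VM-creation time; it only flags months after the fact, which may be as heavy as is needed given WJK's doubts.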
Quick round up of other comments at end, useful services to check?
- CacheDB useful (JD)
- OpenNebula (GCASALE)
- Eucalyptus (JD).
- OpenStack (PJM)
- Hadoop/Mapreduce (green shirt bloke)
- DCW asked about size of storage: helpful answer was "TBs to PBs".
PJM's summary of meeting
I think that three basic conclusions should be drawn from yesterday's discussion regarding the specification of hardware:
- Our concept of buying compute nodes with large numbers of cores and large memory, with disc storage for virtual machines images, and with 10G network will support the main objective of providing a DoC Cloud that provides virtual computers to the DoC students and staff for teaching and research purposes. We need to refine exactly which machines and configuration will be purchased.
- We should look at input from research groups as to what is the cost of GPU, FPGA, and hardware monitoring, and see if this can be incorporated at this stage.
- We do need to look at fast network storage options.
Next meeting
Next Working Group meeting: 25th April 1pm, level 4 common room