Petabyte Storage Solution Question

PJ_Sneakers Member Posts: 884
Hey guys... quick question for you storage architects...

If you were tasked to create and maintain a petabyte SAN with off-site backups, how much staffing would you need? This storage solution must also grow at a rate of 250-300TB per year.

I've already got estimates from storage vendors that come in at approximately $1 million per PB (not including backup solutions). I'm interested in how much staffing you think you would need to maintain just the SAN.

I'm trying to justify Amazon S3/Glacier or Azure.
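
For context, here is the back-of-the-envelope math I'm running. Every unit price below is a placeholder I'm assuming for illustration, not a real vendor or AWS/Azure quote, and it ignores retrieval/egress fees, backup copies, and staff time on both sides:

# Rough multi-year cost sketch: on-prem SAN capex vs. cloud object storage.
# Every price here is an assumed placeholder, not a vendor or AWS/Azure quote,
# and it ignores retrieval/egress fees, backup copies, and staff time.

SAN_COST_PER_PB = 1_000_000     # the ~$1M/PB hardware estimate from this thread
CLOUD_PER_GB_MONTH = 0.004      # assumed archival-tier price; check current pricing
START_PB = 1.0
GROWTH_PB_PER_YEAR = 0.275      # midpoint of 250-300 TB/year
YEARS = 5

san_capex = 0.0
cloud_opex = 0.0
capacity_pb = START_PB

for year in range(1, YEARS + 1):
    # On-prem: buy the initial build in year 1, then buy each year's growth.
    added_pb = (START_PB if year == 1 else 0.0) + GROWTH_PB_PER_YEAR
    san_capex += added_pb * SAN_COST_PER_PB

    # Cloud: pay monthly for everything stored (average capacity over the year).
    avg_pb = capacity_pb + GROWTH_PB_PER_YEAR / 2
    cloud_opex += avg_pb * 1_000_000 * CLOUD_PER_GB_MONTH * 12  # 1 PB = 1,000,000 GB

    capacity_pb += GROWTH_PB_PER_YEAR
    print(f"Year {year}: SAN capex to date ${san_capex:,.0f}, "
          f"cloud storage to date ${cloud_opex:,.0f}")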

Comments

  • philz1982 Member Posts: 978
    I just did a 40/60 split 6 PB design for a sports stadium.

    What is your usage, and what is your drive breakout? Is your SAN built on 5.4k, 7.2k, etc. RPM drives? What's your split between SSD, disk, and tape?

    What does the $1M include? If that is hardware only for a disk-based system, you are getting ripped off. I did my 6 PB 40/60 split on an HA cluster with N+1 for $9M, including professional services.

    I need some clarification on your current IT staff numbers, DC size, staff utilization, and vertical market. The staffing levels shift dramatically. For example, the SAN at a stadium is only monitored during events and is used to provide data for instant playback reviews.
  • PJ_Sneakers Member Posts: 884
    Current staff is 4 admins stretched thin, supporting about 3,000 users in 15 locations. Storage will be used 24/7/365. No hard stats on any hardware yet; it will probably be 100% cheap HDD with an autoloading tape system. Disk performance is not an issue and IOPS don't matter. Going purely for space.
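
    For a sense of scale on the drive side, here is a rough drive-count sketch I put together (the drive size, RAID layout, and spare ratio are assumptions for illustration, not a design):

import math

# Rough drive-count sketch for a capacity-only (cheap HDD) build.
# Drive size, RAID group width, and spare ratio are assumptions, not a design.
USABLE_TB_NEEDED = 1000        # ~1 PB usable to start
GROWTH_TB_PER_YEAR = 300       # upper end of the stated growth
DRIVE_TB = 8                   # assumed nearline HDD size
RAID_GROUP_WIDTH = 10          # assumed RAID6 group: 8 data + 2 parity
DATA_DRIVES_PER_GROUP = RAID_GROUP_WIDTH - 2
HOT_SPARE_RATIO = 0.02         # assume ~2% of drives held as hot spares

def drives_for(usable_tb):
    groups = math.ceil(usable_tb / (DATA_DRIVES_PER_GROUP * DRIVE_TB))
    drives = groups * RAID_GROUP_WIDTH
    return drives + math.ceil(drives * HOT_SPARE_RATIO)

for year in range(4):
    usable = USABLE_TB_NEEDED + year * GROWTH_TB_PER_YEAR
    print(f"Year {year}: ~{usable} TB usable -> ~{drives_for(usable)} drives")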
  • philz1982 Member Posts: 978
    Does the price include installation and configuration? Also, did you account for network modifications and client-side software for backups?
  • philz1982 Member Posts: 978
    Depending on how many calls your techs field, you will be stretched thin with four people. The hardware management isn't the big labor component; the big labor component is managing data collection and integrity.

    Does your industry have HIPAA (PHI), PCI, or SOX compliance requirements?
  • PJ_Sneakers Member Posts: 884
    Yes, I understand. Do you think we would need to add staffing? That's the only question I asked. I'm just looking for another opinion. I'm trying to make a case for cloud storage.
  • PJ_Sneakers Member Posts: 884
    For argument's sake, yes. We will have stringent auditing and data security requirements.
  • philz1982 Member Posts: 978
    Then at a minimum you will need half an FTE, because someone will need to check integrity daily and run backups based on data classification and SLAs.
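
    Part of that daily integrity check can be scripted, but someone still has to own the results. A minimal sketch of what that pass might look like (the manifest path and its layout are made up for illustration):

import csv
import hashlib
from pathlib import Path

# Minimal daily integrity pass: re-hash files and compare against a stored
# manifest. The manifest path and its column layout are made up for illustration.
MANIFEST = Path("/srv/manifests/archive_manifest.csv")   # hypothetical path

def sha256_of(path, chunk=1024 * 1024):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk)
            if not block:
                break
            h.update(block)
    return h.hexdigest()

def check_manifest(manifest=MANIFEST):
    failures = []
    with open(manifest, newline="") as f:
        for row in csv.DictReader(f):        # expects columns: path, sha256
            p = Path(row["path"])
            if not p.exists():
                failures.append((row["path"], "missing"))
            elif sha256_of(p) != row["sha256"]:
                failures.append((row["path"], "checksum mismatch"))
    return failures

if __name__ == "__main__":
    for path, reason in check_manifest():
        print(f"FAIL {path}: {reason}")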
  • PJ_Sneakers Member Posts: 884
    Thanks for the insight!
  • apr911 Member Posts: 380
    I can't comment directly on how many admins, if any, are required, as I don't know enough about the structure of your organization, how things are run, how many devices are to be connected, etc.

    Speaking more generally, I look at these problems in terms of three main cost centers for support:

    1. Hardware support - With the increasing reliability of hardware and hard drives, most companies don't require a dedicated admin to support hardware once it's in place and set up. That being said, if you skimp and install cheap drives, or if you incorrectly provision MLC SSDs for a high-write application that really needed SLC drives, you'll spend more time replacing drives. Generally speaking though, storage arrays and associated switches don't often fail completely without warning, and swapping the occasional failed disk doesn't require an FTE (see the rough numbers at the end of this post).

    1.5 Hardware/Software support - This doesn't fit exclusively under point 1 or point 2. It doesn't warrant its own cost center, but it does warrant mention... Your disk/spindle provisioning, SSD acceleration, and array capabilities will play a part in the costs of the array. Without enough spindles or SSD cache to cover the desired IOPS, or if you buy a device incapable of the IOPS required, you will spend a lot of time, money, and effort trying to keep the device online. Spending top dollar on top-of-the-line, over-provisioned equipment is a waste, but the long-run support costs of bottom-dollar, under-provisioned equipment are so much higher that it can be just as costly as the over-provisioned device, if not more so, and therefore just as wasteful (see the spindle sketch at the end of this post).

    2. Software support - This is time spent in the administrative console of the array, which covers pretty much 99% of what you are going to do with the device. Again, there are two ways to approach this. One is the hands-off way, where once things are configured they stay pretty static, in which case costs are again low and unlikely to be a major impact. The other depends on how often your company retools things in the environment. If you are constantly re-architecting or reconfiguring storage tiers and their capacity/usage, you will spend more time on the software side of things.

    You will also spend time on the software side every time you bring a device connected to the SAN online or offline, as you need to create/delete and present/wipe the LUNs. Again, in a stable environment without much change (growth != change in this scenario) this might not be very costly, but in a rapidly changing/evolving environment you may feel differently.

    Backups fall under this heading too, but again, a properly implemented DR plan can be largely hands-off (though periodic testing should be conducted, which can take a considerable amount of time depending on whether it's a full, partial, or critical test).

    I also place security auditing and control under the "software" heading, though usually as a dotted line to the device itself. The actual cost center for this should fall under the "security" budget, but many companies don't budget security separately from the device budget, and even when they do grant separate security budgets, they want to know exactly how the money was spent. As philz1982 already pointed out, the more stringent the auditing and security requirements, the more time is spent running integrity and security audit checks.

    3. Misc support - This is the category for everything else, but it mostly boils down to "user" costs. This doesn't really apply as much to SANs, since the SAN engineer presents the LUN to the server and it then acts like local storage on the device, but I can give a prime example using an SMB NAS device...

    I worked at a company with a configured and provisioned NAS array (actually several, hence the need for a NAS team and several employees). Through the normal course of operations, the NAS team was probably a bit of overkill, but because the company didn't want to spend resources on Linux staff, every time the company brought a new Linux box online that needed to connect to the NAS, a call had to go out to the NAS team for assistance. The same was true any time the NAS dropped from one of the existing Linux boxes. It didn't matter that the configuration on new servers was identical to that on existing devices, or that X-1 Linux boxes were still connected to the NAS just fine... In both cases a call went out for support from the NAS team. Ultimately this support was billed back to the NAS team even though, more often than not, the fix had nothing to do with the NAS.
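
    To put rough numbers on points 1 and 1.5 above (the failure rate, per-drive IOPS, and RAID write penalty figures are generic assumptions, not measurements or vendor data):

import math

# Point 1: expected drive swaps per year on a big cheap-HDD array.
# The drive count and annual failure rate (AFR) below are assumed figures.
DRIVE_COUNT = 170             # e.g. roughly 1 PB usable of 8 TB drives in RAID6
ANNUAL_FAILURE_RATE = 0.02    # assumed 2% AFR for nearline HDDs
print(f"Expected drive replacements per year: ~{DRIVE_COUNT * ANNUAL_FAILURE_RATE:.0f}")

# Point 1.5: spindles needed to cover an IOPS target, ignoring any SSD cache.
def spindles_needed(host_iops, read_ratio, iops_per_drive, raid_write_penalty):
    # Backend IOPS = reads + writes * penalty (e.g. a penalty of 6 for RAID6).
    reads = host_iops * read_ratio
    writes = host_iops * (1 - read_ratio)
    backend_iops = reads + writes * raid_write_penalty
    return math.ceil(backend_iops / iops_per_drive)

# Assumed workload: 5,000 host IOPS, 70/30 read/write, 75 IOPS per 7.2k drive, RAID6.
print("Spindles for that workload:", spindles_needed(5000, 0.70, 75, 6))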
  • netsysllc Member Posts: 479
    You could build Backblaze storage pods for about $10K per 180TB: https://www.backblaze.com/blog/backblaze-storage-pod-4. Here is their original story about mass storage on the cheap: https://www.backblaze.com/blog/petabytes-on-a-budget-how-to-build-cheap-cloud-storage-2/

    Sorry, no idea what it would cost to maintain such a beast, though.
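
    Rough hardware-only math on why those pods look attractive next to the vendor quote (both figures ignore backup copies, support contracts, power/cooling, and staff time):

# Dollars per usable TB, hardware only, using the figures already in this thread.
pod_cost, pod_tb = 10_000, 180            # Backblaze pod estimate linked above
vendor_cost, vendor_tb = 1_000_000, 1000  # ~$1M per PB vendor estimate

print(f"Backblaze pod: ~${pod_cost / pod_tb:,.0f} per TB")
print(f"Vendor SAN quote: ~${vendor_cost / vendor_tb:,.0f} per TB")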
  • PJ_Sneakers Member Posts: 884
    So, what I'm getting is that four guys who have to maintain 3000 users on about 10 different specialized line-of-business applications at 15 different sites... may not be enough to properly plan, set up, maintain, and manage a SAN of that size.

    In other words, it will likely require additional manpower in order to have five 9's of uptime.