How are Google and Facebook servers / hard drives utilized?

Shoe Box Banned Posts: 118
I've watched YouTube videos of Google and Facebook data centers, and the individual servers used in each company. And I see each individual server has a pair of hard drives of its own, and these server "blades" are pulled out of production for maintenance or replacement now and then.

So I am wondering how this works. If each server blade has a pair of drives, and I assume those two drives are mirrored copies of each other, then a portion of the information, say several hundred or several thousand Facebook user profiles, would become instantly unavailable when that server blade is unplugged. So it seems like that can't be right.

If the Google search data / Facebook user profiles aren't on the hard drives in each server blade, then what is on them? The server OS, of course, but what else? And what is each server blade doing if it isn't directly responsible for a group of Facebook profiles or Google data?

Comments

  • veritas_libertas Member Posts: 5,746 ■■■■■■■■■■
    I'm convinced after looking at all the complexity involved that they are maintained by elves and blessed with pixie dust. I recently saw this: http://www.neowin.net/news/lightning-strike-near-google-data-center-permanently-wipes-0000001-of-user-data and wondered how much data .0000001% is...

    Yes, I realize I didn't contribute much to this conversation.
  • Alif_Sadida_Ekin Member Posts: 341 ■■■■□□□□□□
    I'm sure they use some sort of clustering for their data. Have a look at Hadoop, which uses HDFS (Hadoop Distributed File System). Data gets written to multiple servers, so if one goes down, it's still accessible somewhere else. Same concept for clustering relational databases.
    AWS: Solutions Architect Associate, MCSA, MCTS, CIW Professional, A+, Network+, Security+, Project+

    BS, Information Technology
  • NotHackingYou Member Posts: 1,460 ■■■■■■■■□□
    Often what you see in that configuration are web servers where the ephemeral data stored on them is distributed across a cluster. They access back-end servers in the same configuration, which in turn access database servers. Systems like Hadoop and Couchbase can synchronize cache data across servers. Web apps can be designed to take advantage of this replicated information and replicate session state, making it possible to bounce a user from web server to web server without any service interruption.
    When you go the extra mile, there's no traffic.
  • xenodamus Member Posts: 758
    Google and Facebook were pioneers in the hyperconverged space, from what I understand. A large portion of their infrastructure runs on web-scale appliances that use storage virtualization to spread data across the entire cluster of virtualization hosts.

    Check out Nutanix. I think they have a few vids on youtube. They were the first ones to bring that technology to the enterprise, and are still doing very well with it.
    CISSP | CCNA:R&S/Security | MCSA 2003 | A+ S+ | VCP6-DTM | CCA-V CCP-V
  • TheFORCE Member Posts: 2,297 ■■■■■■■■□□
    Google, Facebook, Microsoft, Amazon, and probably other high-tech companies as well are likely fully on the cloud; they distribute services from server to server. They have mastered scalability and demand to the level of insanity with virtualized storage and virtualized computing. The cloud is transforming everything!
  • mukta Member Posts: 14 ■□□□□□□□□□
    I need to study this. Right now I have no knowledge about it. I am sorry that I can't help at the moment.
  • AjitKhodke Registered Users Posts: 1 ■□□□□□□□□□
    That might be just enough for one person to be really wiped out of Google.
  • Pupil Member Posts: 168
    Don't forget containers (Docker) and software defined networking. Google/Facebook are in a whole different world than even the best enterprise.
  • Priston Member Posts: 999 ■■■■□□□□□□
    Shoe Box wrote:
    If the google search data / facebook user profiles aren't on the hard drives in each server blade, then what is on them?
    If the servers are hosting VMs I would guess the hypervisor is installed on those hard drives. The VMs with all the real data would use remote storage.
    A.A.S. in Networking Technologies
    A+, Network+, CCNA
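
To make the replication idea from the comments concrete (data gets written to multiple servers, so pulling one blade does not take it offline), here is a minimal Python sketch. It is not how Google, Facebook, or HDFS actually place data. The names `place_replicas` and `readable_from` and the `blade-N` server names are hypothetical, and the placement rule is a simple hash ranking (rendezvous-style hashing) swapped in for illustration; real HDFS uses a rack-aware placement policy. The replication factor of 3 matches the HDFS default.

```python
import hashlib

REPLICATION_FACTOR = 3  # HDFS defaults to 3 copies of each block


def place_replicas(block_id, servers, replicas=REPLICATION_FACTOR):
    """Pick `replicas` distinct servers for a block.

    Illustrative rule only: rank servers by a hash of (block, server)
    and take the first few. This is NOT HDFS's rack-aware policy.
    """
    if replicas > len(servers):
        raise ValueError("not enough servers for the requested replica count")
    ranked = sorted(
        servers,
        key=lambda s: hashlib.sha256(f"{block_id}:{s}".encode()).hexdigest(),
    )
    return ranked[:replicas]


def readable_from(block_id, servers, offline):
    """Return the servers that still hold the block when some are unplugged."""
    return [s for s in place_replicas(block_id, servers) if s not in offline]


if __name__ == "__main__":
    blades = [f"blade-{i}" for i in range(10)]  # hypothetical server names
    holders = place_replicas("profile-12345", blades)
    print("copies live on:", holders)
    # Pull one of the holders out of the rack for maintenance.
    print("still readable from:", readable_from("profile-12345", blades, {holders[0]}))
```

With three copies per block, unplugging any single blade still leaves two servers that can serve the data, which is why a blade can be pulled for maintenance without a group of profiles disappearing.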