Google Servers: Up to 10 Million?

Google has never specifically published how many server machines it uses to crawl information from all the websites in the world.

At a workshop, one of the keynote speakers was Google's Jeff Dean, who gave a presentation. In it, he indirectly indicated that Google will in the future be managing on the order of 10 million servers.

Dean says Spanner will be designed for a future scale of "10^6 to 10^7 machines," meaning 1 million to 10 million machines. The goal will be "automatic, dynamic world-wide placement of data & computation to minimize latency or cost."

Some highlights from the presentation:

- A data-center-wide storage hierarchy (capacity, access latency, bandwidth; a transfer-time sketch follows below):
  - Server:
    - DRAM: 16 GB, 100 ns, 20 GB/s
    - Disk: 2 TB, 10 ms, 200 MB/s
  - Rack:
    - DRAM: 1 TB, 300 us, 100 MB/s
    - Disk: 160 TB, 11 ms, 100 MB/s
  - Aggregate switch:
    - DRAM: 30 TB, 500 us, 10 MB/s
    - Disk: 4.8 PB, 12 ms, 10 MB/s

- Failure is inevitable:
  - Disks: 1 to 5% fail per year
  - Servers: 2 to 4% fail per year

- An excellent set of distributed-systems rules of thumb (a worked back-of-envelope example follows this list):
  - L1 cache reference: 0.5 ns
  - Branch mispredict: 5 ns
  - L2 cache reference: 7 ns
  - Mutex lock/unlock: 25 ns
  - Main memory reference: 100 ns
  - Compress 1K bytes with Zippy: 3,000 ns
  - Send 2K bytes over 1 Gbps network: 20,000 ns
  - Read 1 MB sequentially from memory: 250,000 ns
  - Round trip within same datacenter: 500,000 ns
  - Disk seek: 10,000,000 ns
  - Read 1 MB sequentially from disk: 20,000,000 ns
  - Send packet CA -> Netherlands -> CA: 150,000,000 ns
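
As an illustration of how these rules of thumb get used, here is a sketch of my own (not from the slides) comparing a 1 GB sequential read from memory versus disk, and a serial versus parallel fetch of 30 one-megabyte files from disk:

```python
# Back-of-envelope arithmetic with the latency numbers above (values in ns).
MEM_READ_1MB  = 250_000       # read 1 MB sequentially from memory
DISK_READ_1MB = 20_000_000    # read 1 MB sequentially from disk
DISK_SEEK     = 10_000_000    # one disk seek

NS_PER_MS = 1_000_000

# 1 GB sequential read: memory vs disk.
print("1 GB from memory:", 1024 * MEM_READ_1MB / NS_PER_MS, "ms")    # ~256 ms
print("1 GB from disk:  ", 1024 * DISK_READ_1MB / NS_PER_MS, "ms")   # ~20,480 ms

# Fetch 30 files of 1 MB each from disk.
serial   = 30 * (DISK_SEEK + DISK_READ_1MB)   # one disk, requests issued one at a time
parallel = DISK_SEEK + DISK_READ_1MB          # 30 disks seeking and reading in parallel
print("30 files, serial:  ", serial / NS_PER_MS, "ms")    # ~900 ms
print("30 files, parallel:", parallel / NS_PER_MS, "ms")  # ~30 ms
```

This serial-versus-parallel comparison is exactly the kind of reasoning the numbers are meant to support: knowing the constants lets you rule out a design before writing any code.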

- Typical first year for a new cluster (a rough failure-arithmetic sketch follows this list):
  - ~0.5 overheating events (power down most machines in <5 mins, ~1-2 days to recover)
  - ~1 PDU failure (~500-1000 machines suddenly disappear, ~6 hours to come back)
  - ~1 rack move (plenty of warning, ~500-1000 machines powered down, ~6 hours)
  - ~1 network rewiring (rolling ~5% of machines down over a 2-day span)
  - ~20 rack failures (40-80 machines instantly disappear, 1-6 hours to get back)
  - ~5 racks go wonky (40-80 machines see 50% packet loss)
  - ~8 network maintenances (4 might cause ~30-minute random connectivity losses)
  - ~12 router reloads (takes out DNS and external VIPs for a couple of minutes)
  - ~3 router failures (have to immediately pull traffic for an hour)
  - dozens of minor 30-second DNS blips
  - ~1000 individual machine failures
  - thousands of hard drive failures
  - slow disks, bad memory, misconfigured machines, flaky machines, etc.
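
To get a feel for what "~1000 individual machine failures" a year means day to day, here is a small sketch; the 10,000-machine cluster size is my own assumption for illustration, not a figure from the talk:

```python
import math

CLUSTER_MACHINES = 10_000            # assumed cluster size (illustrative only)
MACHINE_FAILURES_PER_YEAR = 1_000    # "~1000 individual machine failures"
DAYS_PER_YEAR = 365

# Expected machine failures on any given day.
per_day = MACHINE_FAILURES_PER_YEAR / DAYS_PER_YEAR
print(f"expected machine failures per day: {per_day:.1f}")             # ~2.7

# Per-machine annual failure probability (treating each failure as a distinct machine).
p_fail_year = MACHINE_FAILURES_PER_YEAR / CLUSTER_MACHINES
print(f"per-machine annual failure probability: {p_fail_year:.0%}")    # ~10%

# Chance that at least one machine fails today, modeling failures as a Poisson process.
p_any_today = 1 - math.exp(-per_day)
print(f"probability of >=1 machine failure today: {p_any_today:.1%}")  # ~93.5%
```

In other words, on almost every single day some machine in the cluster dies, which is the point Dean keeps returning to: at this scale, reliability has to come from the software rather than the hardware.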

- GFS usage at Google:
  - 200+ clusters
  - Many clusters of over 1000 machines
  - 4+ PB filesystems
  - 40 GB/s read/write load

- MapReduce usage at Google: 3.5M jobs/year, averaging 488 machines each and taking ~8 minutes

- BigTable usage at Google: 500 clusters, with the largest at 70 PB and 30+ GB/s I/O

- Working on a next-generation GFS system called Colossus

- Metadata management for Colossus is handled in BigTable

- Working on a next-generation BigTable system called Spanner:

  - Similar to BigTable in that Spanner has tables, families, groups, coprocessors, etc.
  - But has hierarchical directories rather than rows, fine-grained replication (at the directory level), and ACLs
  - Supports both weak and strong data consistency across data centers
  - Strong consistency implemented using Paxos across replicas
  - Supports distributed transactions across directories/machines
  - Much more automated operation:
    - Automatic data movement and replication based on computation, usage patterns, and failures
  - Spanner design goals: 10^6 to 10^7 machines, 10^13 directories, 10^18 bytes of storage, spread across 10^3 to 10^4 locations over long distances
  - Users specify the required latency, replication factor, and location (a hypothetical configuration sketch follows this list)
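
Since the slides only state this as a goal, the following is a hypothetical sketch of what such a user-specified, per-directory policy could look like; the names (ReplicationPolicy, Directory, validate) are my own and are not Spanner's actual API:

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative only: a per-directory policy in the spirit of the slide, where the
# user states required latency, a replication factor, and candidate locations,
# and the system is responsible for placing replicas to satisfy it.

@dataclass
class ReplicationPolicy:
    required_latency_ms: float                           # target read latency for nearby users
    replication_factor: int                              # number of replicas to maintain
    locations: List[str] = field(default_factory=list)   # candidate datacenter locations

@dataclass
class Directory:
    name: str                  # hierarchical directory, e.g. "/photos/user123"
    policy: ReplicationPolicy

def validate(d: Directory) -> None:
    """Sanity checks a placement system would run before choosing replica sites."""
    if d.policy.replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    if len(d.policy.locations) < d.policy.replication_factor:
        raise ValueError(f"{d.name}: fewer candidate locations than replicas requested")

if __name__ == "__main__":
    d = Directory(
        name="/photos/user123",
        policy=ReplicationPolicy(required_latency_ms=50,
                                 replication_factor=3,
                                 locations=["us-east", "us-west", "europe-west"]),
    )
    validate(d)
    print(f"{d.name}: {d.policy.replication_factor} replicas across {d.policy.locations}")
```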

Summary adapted from Hamilton.
