Data Replication
Students in the CS644 course, Advanced Operating System, have looked in particular at data coherence issues in distributed systems. The same issue applies to GDCs if the data resident in a GDC is updated frequently and must be synchronized across GDCs.
CAP theorem
In a distributed system, replication is a conventional strategy for tolerating node failures and system crashes. But replicating data across nodes also introduces consistency and synchronization problems. At the 2000 Symposium on Principles of Distributed Computing (PODC), Eric Brewer of UC Berkeley proposed, as a conjecture, that a distributed system cannot provide all three of consistency, availability, and partition tolerance simultaneously. This is the well-known CAP theorem, which was formally proved by Seth Gilbert and Nancy Lynch of MIT in 2002.
Study the existing DC Applications
The CAP theorem applies directly to GDCs. A GDC has to cope with power outages, which are an extreme case of network partition. If a distributed system is hosted across GDCs and keeps multiple copies of its data updated in different GDCs, it cannot provide both strong consistency and high availability for that data at the same time.
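The trade-off above can be illustrated with a minimal sketch. It assumes a toy key-value store with two replicas, and every name in it (`Replica`, `ReplicatedStore`, `gdc-a`, `gdc-b`, the `partitioned` flag) is hypothetical and not taken from any real system. In "CP" mode the store refuses writes during a partition so the replicas never diverge; in "AP" mode it accepts them and the replicas disagree until they can sync again.

```python
class Replica:
    """One copy of the data, hosted in a single (hypothetical) GDC."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class ReplicatedStore:
    """Toy two-replica store that shows the choice forced by a partition."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [Replica("gdc-a"), Replica("gdc-b")]
        self.partitioned = False  # True while the GDCs cannot reach each other

    def write(self, replica_index, key, value):
        """Return True if the write was accepted, False if it was refused."""
        if not self.partitioned:
            # Healthy network: update every copy, so all replicas agree.
            for replica in self.replicas:
                replica.data[key] = value
            return True
        if self.mode == "CP":
            # Keep consistency by giving up availability.
            return False
        # AP: stay available, but only the local copy sees the update.
        self.replicas[replica_index].data[key] = value
        return True

    def read(self, replica_index, key):
        return self.replicas[replica_index].data.get(key)


def demo(mode):
    """Write once normally, once during a partition, then read both copies."""
    store = ReplicatedStore(mode)
    store.write(0, "x", 1)
    store.partitioned = True
    accepted = store.write(0, "x", 2)
    return accepted, store.read(0, "x"), store.read(1, "x")
```

Calling `demo("CP")` shows the second write being refused while both replicas still hold the old value; calling `demo("AP")` shows the write being accepted while the two replicas return different values.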
Even so, many popular distributed systems deal with the CAP trade-off successfully. While still insisting on high availability, some of them exploit the particular characteristics of their applications to get by with weak consistency without hurting the user experience; others try to maintain causal consistency while tolerating some degree of inconsistency.
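One common way to track causality between replicas is a vector clock, a technique used in Dynamo among other systems. The following is only a rough sketch under simplifying assumptions: clocks are plain dictionaries mapping a node name to a counter, and the function names (`increment`, `happened_before`, `concurrent`, `merge`) and node names are illustrative rather than any system's actual API.

```python
def increment(clock, node):
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def happened_before(a, b):
    """True if version `a` causally precedes version `b`."""
    nodes = set(a) | set(b)
    no_greater = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    strictly_less = any(a.get(n, 0) < b.get(n, 0) for n in nodes)
    return no_greater and strictly_less


def concurrent(a, b):
    """True if each version has seen an update the other has not (a conflict)."""
    nodes = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    return a_ahead and b_ahead


def merge(a, b):
    """Element-wise maximum: the clock of a version that has seen both."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


# Two GDCs update the same item independently after a shared first write.
base = increment({}, "gdc-a")       # first write, handled by gdc-a
left = increment(base, "gdc-a")     # later update at gdc-a
right = increment(base, "gdc-b")    # independent update at gdc-b
```

Here `base` causally precedes both `left` and `right`, while `left` and `right` are concurrent: neither replica can simply overwrite the other, so the application has to reconcile them, after which the reconciled version carries `merge(left, right)`.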
Here is the reading list of those distributed systems, with reviews:
PAPER | NAME | LINKS
The Google File System | Brian Devins, Ronny Bulls | [link], [link]
Scale and Performance in a Distributed File System | Ronny Bulls | [link]
Windows Azure Storage | Vinay Soni, Long Zhang | [link], [link]
Dynamo: Amazon's Highly Available Key-Value Store | Wenjin Hu, Andrew Hicks | [link], [link]
PNUTS: Yahoo!'s Hosted Data Serving Platform | Vinay Soni, Brian Hudson | [link], [link]