Distributed Systems Reading List
Introduction
I often argue that the toughest thing about distributed systems is changing the way you think. Below is a collection of material I've found useful for motivating these changes.
The Google Infrastructure
Current "rocket science" in distributed systems.
- MapReduce
- Chubby Lock Manager
- Google File System
- BigTable
- Data Management for Internet-Scale Single-Sign-On
Experience at MySpace
One of the larger websites out there, with a high write load, which is not the norm (most are read-dominated).
eBay
Interestingly, they dumped most of J2EE and rely heavily on database partitioning. Check out their site upgrade tool as well.
- http://www.addsimplicity.com.nyud.net:8080/downloads/eBaySDForum2006-11-29.pdf
Adam Bosworth
Broad vision, way-of-the-future type material.
- http://acmqueue.com/modules.php?name=Content&pa=showpage&pid=337
Amazon
Partly about the technology, but more interesting is the culture and organization they've created to work with it.
- http://acmqueue.com/modules.php?name=Content&pa=printer_friendly&pid=403&page=1
- http://www.acmqueue.com/modules.php?name=Content&pa=printer_friendly&pid=388&page=1
- http://www.itconversations.com/shows/detail1634.html - Similar to the above link
- http://searchwebservices.techtarget.com/originalContent/0,289142,sid26_gci1195702,00.html
Thought Provokers
Ramblings that make you think about the way you design. Not everything can be solved with big servers, databases and transactions.
- On Designing and Deploying Internet Scale Services - James Hamilton
- Latency Exists, Cope! - Commentary on coping with latency and its architectural impacts
- The Perils of Good Abstractions - Building the perfect API/interface is difficult
- http://www.allthingsdistributed.com/historical/archives/000456.html - Vogels, Epidemics
- Chaotic Perspectives - Large scale systems are everything developers dislike - unpredictable, unordered and parallel
- http://poorbuthappy.com/ease/archives/2007/04/29/3616/the-top-10-presentation-on-scaling-websites-twitter-flickr-bloglines-vox-and-more - A collection of scalable architecture presentations from several large websites
- Memories, Guesses and Apologies - Pat Helland
- SOA and Newton's Universe - Pat Helland
- Why Distributed Computing? - Jim Waldo
- A Note on Distributed Computing - Waldo, Wollrath et al
Theory
- Distributed Computing Economics - Jim Gray
- Fallacies of Distributed Computing - Peter Deutsch
- Impossibility of distributed consensus with one faulty process - also known as FLP [access requires an account and/or payment; a free version can be found here]
- Unreliable Failure Detectors for Reliable Distributed Systems - A method for handling the challenges of FLP
- Lamport Clocks - How do you establish a global view of time when each computer's clock is independent?
- The Byzantine Generals Problem
- Lazy Replication: Exploiting the Semantics of Distributed Services
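The question posed by the Lamport Clocks entry above can be made concrete with a small sketch. This is only an illustration of the logical-clock rule (increment on every local event, and on receipt jump past the sender's timestamp); the class and method names are assumptions made for this example and do not come from any of the listed papers.

```python
class LamportClock:
    """Minimal logical clock; names here are hypothetical, for illustration only."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Any local event advances the clock by one.
        self.time += 1
        return self.time

    def send(self):
        # Sending is itself an event; the returned value is the message timestamp.
        return self.tick()

    def receive(self, msg_time):
        # On receipt, move past both the local clock and the sender's timestamp.
        self.time = max(self.time, msg_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()              # a local event on node a
stamp = a.send()      # a sends a message carrying its timestamp
b.receive(stamp)      # b's clock is now later than the send event
```

The point of the rule is that a receive is always timestamped later than the matching send, so causally related events are ordered consistently even though neither node consults a wall clock.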
Paxos Consensus
Understanding this algorithm is the challenge. I would suggest reading "Paxos Made Simple" before the other papers and again afterward.
- The Part-Time Parliament - Leslie Lamport
- Paxos Made Simple - Leslie Lamport
- Paxos Made Live - An Engineering Perspective - Chandra et al
- Revisiting the Paxos Algorithm - Lynch et al
- How to build a highly available system with consensus - Butler Lampson
- Implementing Fault-Tolerant Services Using the State Machine Approach: a Tutorial - Fred Schneider
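As a companion to the papers above, here is a rough single-process sketch of the two phases of single-decree Paxos (prepare/promise, then accept). It assumes in-memory acceptors, no message loss, no learners and no leader election, and all names (`Acceptor`, `propose`) are invented for this illustration; it is a reading aid, not a substitute for the algorithm as specified in "Paxos Made Simple".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Acceptor:
    promised: int = -1                    # highest proposal number promised so far
    accepted_n: int = -1                  # number of the last accepted proposal
    accepted_value: Optional[str] = None  # value of the last accepted proposal

    def prepare(self, n: int):
        # Phase 1: promise to ignore anything numbered below n.
        if n > self.promised:
            self.promised = n
            return True, self.accepted_n, self.accepted_value
        return False, -1, None

    def accept(self, n: int, value: str) -> bool:
        # Phase 2: accept unless a higher-numbered prepare has been promised.
        if n >= self.promised:
            self.promised = n
            self.accepted_n = n
            self.accepted_value = value
            return True
        return False


def propose(acceptors: List[Acceptor], n: int, value: str) -> Optional[str]:
    """Try to get a value chosen; returns the chosen value or None on failure."""
    majority = len(acceptors) // 2 + 1
    replies = [a.prepare(n) for a in acceptors]
    promises = [(an, av) for ok, an, av in replies if ok]
    if len(promises) < majority:
        return None
    # If any acceptor already accepted a value, adopt the highest-numbered one.
    prev_n, prev_value = max(promises, key=lambda p: p[0])
    if prev_n >= 0:
        value = prev_value
    acks = sum(1 for a in acceptors if a.accept(n, value))
    return value if acks >= majority else None


acceptors = [Acceptor() for _ in range(3)]
first = propose(acceptors, 1, "A")    # nothing accepted yet, so "A" is chosen
second = propose(acceptors, 2, "B")   # must adopt the already-chosen "A"
stale = propose(acceptors, 1, "C")    # lower number than promised, rejected
```

The second proposer ends up re-proposing "A" rather than its own "B", which is the safety property the papers spend most of their effort on: once a value is chosen, later proposals cannot change it.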
Consistency Models
- CAP Conjecture - Consistency, Availability, Partition Tolerance cannot all be satisfied at once
- Consistency and Availability - Vogels
- Eventual Consistency - Vogels
- Avoiding Two-Phase Commit - Approaches to avoiding two-phase commit
- 2PC or not 2PC, Wherefore Art Thou XA? - Two-phase commit isn't a silver bullet
- Life Beyond Distributed Transactions - Helland
- Starbucks doesn't do two-phase commit - Asynchronous mechanisms at work