|
Models of distributed computing (Introduction)
Lecture Notes: 4-up printable in PDF
References:
J. Turek and D. Shasha, “The Many Faces of
Consensus in Distributed
Systems”,
IEEE Computer, June 1992.
R. Schlichting and F. Schneider, "Fail-Stop processors:
an approach to
designing fault-tolerant computing", ACM Transactions on Computer
Systems,
1(3), August 1983.
F. Cristian, "Understanding Fault-Tolerant Distributed
Systems",
Communications
of the ACM, 34(2), February 1991.
|
|
Synchronous message-passing distributed systems
Leader Election and Breadth-First Search algorithms
Lecture Notes: HTML
(requires a browser that can handle frames); 4-up printable in PDF
and Postscript
format.
Demos: A zip
file containing the LTSA and models for the following algorithms: LCR, LCR
+ halt, FloodMax, FloodMaxOpt, SyncBFS,
SyncBFS
+ termination.
Tutorial Questions:
Tutorial 1 : Questions (PDF,
Postscript);
solutions (PDF,
Postscript).
Tutorial 2 : Questions (PDF,
Postscript);
solutions (PDF,
Postscript).
References:
- “Distributed Algorithms”, Nancy Lynch, Morgan Kaufmann,
1996. Chapters
2, 3 and 4.
The Atomic Commitment Problem
Lecture Notes: HTML
with animations which also includes a PowerPoint show (works well
only
with MS Internet
Explorer 5 - blaim Microsoft for that); 4-up handouts in PDF
and Postscript
format. The handouts do not include the animated slides of the
PowerPoint
presentation.
Fluent
Linear Temporal Logic: 4-up notes handout
in PDF
and " Analyzing Synchronous
Distributed Algorithms" PDF.
Additional notes on modelling link failure- PDF.
Demos: A zip
file containing the LTSA and models for the following algorithms: 2PC, 3PC,
2PC -linkfail, 3PC -
linkfail
Errata: In slide 14 of the handouts, in the very last
line of
the slide, the process must block and not abort! See also the
corrected
handouts above.
The material in this section is based on earlier notes by
Vassos Hadzilacos.
Additional notes: Impossibility of Atomic Commitment
in the presence
of link failures -- proof for 2 processes. (PDF,
Postscript)
Tutorial 3 : Questions (PDF,
Postscript);
solutions (PDF,
Postscript).
References:
- J.N. Gray, "Notes on Database Operating Systems: An
Advanced Course",
Lecture
Notes in Computer Science 60:393-481, Springer-Verlag, Berlin,
1978.
- D. Skeen, "Nonblocking Commit Protocols", In Proc. ACM
SIGMOD Conf. on
Management of Data, pp. 133-147, Orlando FL, June, 1982.
- D. Cheung and T. Kameda, "Site-Optimal Termination
Protocols for a
Distibuted
Database under Networking Partitioning", In Proc. 4th ACM SIGACT-SIGOPS
Symp. on Principles of Distributed Computing, pp. 111-121, Minaki,
Ontario,
August, 1985.
The Byzantine Generals Problem
Lecture Notes: 4-up handouts in PDF.
Tutorial 4: Questions (PDF,
Postscript);
solutions (PDF,
Postscript).
References:
L. Lamport, R. Shostak, M. Pease, "The
Byzantine Generals Problem", ACM
TOPLAS 4(3), pp. 382-401, July 1982.
D. Dolev, H. R. Strong, "Authenticated Algorithms for
Byzantine
Agreement",
SIAM Journal of Computer, 12(4), pp. 656-666, 1983.
|
|
Asynchronous message-passing distributed systems
Impossibility of Consensus
Lecture Notes: HTML
with animations which also includes a PowerPoint show; 4-up
handouts in PDF
and Postscript
format.
Additional notes:
Top-level induction of the FLP impossibility proof (PDF,
Postscript):
Inductive construction of a schedule S*, such that S* is infinite,
admissible, applicable to an initial bivalent configuration C", and for
every prefix S' of S*, S'(C") is also bivalent.
The material in this section is based on earlier notes by Vassos
Hadzilacos.
Tutorial 5 :
Question (PDF,
Postscript).
Sample solutions: PDF,
Postscript.
References:
- M. Fischer, N. Lynch, M. Peterson, "Impossibility of
Distributed Consensus with one Faulty Process", Journal of the ACM,
Vol. 32, No. 2, April 1985.
- J. Turek and D. Shasha, “The Many Faces of Consensus in
Distributed Systems”, IEEE Computer, June 1992.
Logical Time
Lecture Notes: HTML
with animations which also includes a PowerPoint show; 4-up
handouts in PDF
and Postscript
format.
Tutorial 6 (21/02/00): Questions (PDF,
Postscript).
References:
- L. Lamport, "Time, Clocks and Ordering of Events in a
Distributed System", Communications of the ACM, Vol. 21, No. 7, July
1978, pp. 558-565.
- G. Ricart, A.K. Agrawala, "An Optimal Algorithm for
Mutual Exclusion in computer Networks", Communications of the ACM, Vol.
24, No. 1, January 1981.
- F. Mattern, "Virtual Time and Global States of
Distributed Systems", in Proceedings of the International Conference on
Parallel and Distributed Algorithms, pp. 215-216, 1989.
- C. Fidge, "Tmestamps in message-passing systems that
preserve the partial ordering", in Proceedings of the 11th Australian
Computer Science Conference, February 1988.
Fault-Tolerant Broadcasts
Lecture Notes: HTML
with animations which also includes a PowerPoint show; 4-up
handouts in PDF
and Postscript
format.
Tutorial 7 (06/03/00): Questions (PDF,
Postscript);
solutions (PDF,
Postscript).
References:
- Vassos Hadzilacos and Sam Toueg, "A modular approach to
fault tolerant broadcasts and related problems", Technical Report #
94-1425, Dept of Computer Science, University of Toronto and Dept of
Computer Science, Cornell University, May 1994. Available as http://www.cs.utoronto.ca/~vassos/research/publications/HT94/paper.ps.gz
- Ken Birman, "The process group approach to reliable
distributed computing", Communications of the ACM, 36(12), pp.37-53,
1993.
- K. Birman, A. Schiper and P. Stephenson, "Lightweight
causal and atomic group multicast", ACM Transactions on Computer
Systems, 9(3), August 1991.
|
|
Partially synchronous message-passing distributed systems
Failure Detectors
Lecture Notes: HTML
with animations which also includes a PowerPoint show; 4-up
handouts in Postscript
format.
Additional notes:
Informal correctness arguments (PDF,
Postscript)
for the algorithm that solves Consensus using the Perfect Failure
Detector (P). The algorithm is presented in the Lecture Notes,
page 14.
Tutorial 8 (13/03/00): Questions (PDF,
Postscript);
solutions (PDF,
Postscript).
References:
- T.D. Chandra and S. Toueg “Unreliable failure detectors
for reliable distributed systems”, Journal of the ACM, 43(2), March
1996.
- T.D. Chandra, V. Hadzilacos and S. Toueg “The weakest
failure detector for solving consensus”, Journal of the ACM, 43(4),
July 1996.
|