Imperial College  » Dept of Computing  » Peter Pietzuch Teaching » AdvDB (09/10)
 
 

PhD Students

I'm always looking for outstanding and motivated PhD students. Starting dates for PhDs are April and October. More information on the application process and funding opportunities is available here.

EBS Book

DEBS Book

 

Advanced Databases (312) 09/10

Lecturer: Dr Peter Pietzuch
Email: prp@doc (you know the rest)
Office: Huxley 442

Format: Three combined lectures/tutorials per week (2nd half; weeks 6 - 10)
Location: LT 311

The second half of the Advanced Databases course focuses on implementation issues in database systems. It covers algorithms and techniques that are used by current relational DBMS for storing data on disk, supporting fast access to data, enabling users to quickly query data, and providing guarantees when querying and updating data.

Please send me email if you have any questions. 

News 

  • 8/12/09: The deadline for the 2nd Advanced Databases coursework was extended until Friday (Dec 11th).
  • 20/11/09: The coursework should be done in groups of two students.

  • 30/11/09: Mistake in coursework:

    File TestingDataBase.java
    Line 26:                 Iterator<RecordID> it = allRIDs.iterator();
    should be moved to Line 133 (just in front of the while loop)

    An updated version of the file has been uploaded to CATE.

Syllabus

1. Physical Storage Organisation

  • Storage properties: memory and disks
  • Main concepts: files, record, fields, pages, blocks
  • File organisation: heap files, sorted files, hashed files
  • Record organisation: representation, fixed and variable length
  • Page organisation: representation, fixed and variable-length records

2. Indexing

  • Types of ordered indices: primary, secondary, dense, sparse, clustered, unclustered
  • Multi-level indices
  • Tree-structured indices:B trees, B+ trees
  • B+ trees: search, insert, delete, overflow
  • Hashing: static, dyanmic
  • Extendible hashing, linear hashing
  • Multi-dimensional indexing: grid files, bitmap indices

3. Query Processing and Optimisation

  • Evaluation plans
  • Cost estimation: selection, projection, set operations, sorting
  • Sorting: sort-merge algorithms
  • Join processing: nested loop join, block nested loop join, sort-merge join, hash join
  • Materialisation and pipelining
  • Query optimisation: cost estimation, size estimation, expression equivalence, cost optimiser

4. Transaction Recovery

  • ACID properties
  • Log-based recovery
  • Write-ahead logging
  • Checkpointing and fuzzy checkpointing
  • Basic recovery algorithm and ARIES

5. Stream Processing

  • Data Stream Management Systems (DSMS): application domains
  • Stream processing model: data model, query model
  • DSMS implementation (not examinable since not covered in lectures)

Handouts

Note: I will make the answers for exercises available one week after the corresponding lecture.

Week 6

Part 0 - Introducion
Part 1 - Storage
Exercise answers (Part 1 - Storage)

Week 7

Part 2 - Indexing
Exercise Answers (Part 2 - Indexing)

Week 8

Part 3 - Query Processing and Optimisation
Exercise Answers (Part 3 - Query Processing and Optimisation)

Week 9

Part 4 - Recovery
Exercise Answers (Part 4 - Recovery)

Week 10

Part 5 - Stream Processing
Exercise Answers (Part 5 - Stream Processing)

Recommended Books

  • R. Ramakrishnan and J. Gehrke, Database Management Systems, third edition, McGraw-Hill Higher Education publishers, 2003
    • Chapters 8+9: Physical Storage
    • Chapters 10+11: Indexing
    • Chapters 12-15: Query Processing
    • Chapters 18: Crash Recovery
  • Hector Garcia-Molina, Jeff Ullman, Jennifer Widom, Database Systems - The Complete Book, 2nd edition, Prentice Hall, 2008.
  • Abraham Silberschatz, Henry Korth, S. Sudarshan, Database System Concepts, fifth edition, McGraw-Hill, 2006.


Last modified on: 03-10-2010 10:56:29 — (C) Peter Pietzuch — Email:prp(at)doc(dot)ic(dot)ac(dot)uk
Powered by Etomite CMS.