CS 286: Implementation of Database Systems

来源:百度文库 编辑:神马文学网 时间:2024/04/29 09:11:41

[Berkeley][EECS Dept.][CS Division][DatabaseResearch]


 

CS 286, Spring 2007

Implementation of Database Systems 

Tu-Th 1:00-2:30, 405 Soda Hall
Minos Garofalakis(minos[at]cs[dot]berkeley[dot]edu)  and  Raghu Ramakrishnan(ramakris[at]yahoo-inc[dot]com)

Office hours: Tuesdays/Thusdays after class, or by appointment


Here's an initial list of project ideas


Schedule, Lecture Notes, and Suggested Readings

Here's an initial, tentativelist of topics and readingsfor the course.

 Part 1:   Query Languages and Recursive Query Processing

Tu 1/16 (Week 1) Query Languages (Raghu)
Lecture Notes
(ppt,   pdf,   6up-pdf ) Universality of Data Retrieval Languages
A.V. Aho and J.D. Ullman
Proc. of POPL'1979  Th 1/18 (Week 1) Intro to Logic and Databases (Raghu)
Lecture Notes
(ppt,   pdf,   6up-pdf ) Ramakrishnan & Gehrke (3rd edition)
Tu 1/23 (Week 2) Recursive Query Evaluation-I (Raghu)
Lecture Notes
(ppt,   pdf,   6up-pdf ) Ramakrishnan & Gehrke (3rd edition)
Th 1/25 (Week 2)
Tu 1/30 (Week 3) Recursive Query Evaluation-II (Raghu)
Lecture Notes
(ppt,   pdf,   6up-pdf ) Ramakrishnan & Gehrke (3rd edition)

 Part 2:   Decision Support and Approximate Query Processing

Th 2/1 (Week 3) Decision Support Systems (Raghu)
Lecture Notes
(ppt,   pdf,   6up-pdf ) Ramakrishnan & Gehrke (3rd edition)
(Chapter 25) Tu 2/6 (Week 4) Approximate Query Processing-I (Minos)
Lecture Notes
(ppt,   pdf,   6up-pdf ) Improved Histograms for Selectivity Estimation of Range Predicates
V. Poosala, Y. Ioannidis, P. Haas, and E. Shekita, Proc. of ACM SIGMOD'1996 
Optimal Histograms with Quality Guaranteess
H.V. Jagadish, N. Koudas, S. Muthukrishnan, V. Poosala, K. Sevcik, and T. Suel, Proc. of VLDB'1998
Random Sampling from Databases -- A Survey
F. Olken and D. Rotem, 1994 Th 2/8 (Week 4) Approximate Query Processing-II (Minos)
Lecture Notes
(ppt,   pdf,   6up-pdf ) Wavelet-based Histograms for Selectivity Estimation
Y. Matias, J.S. Vitter, and M. Wang, Proc. of ACM SIGMOD'1998
Deterministic Wavelet Thresholding for Maximum Error Metrics
M. Garofalakis, and A. Kumar, Proc. of ACM PODS'2004
Selectivity Estimation without the Attribute Value Independence Assumption
V. Poosala, and Y. Ioannidis, Proc. of VLDB'1997
Tu 2/13 (Week 5) Approximate Query Processing-III (Minos)
Lecture Notes
(ppt,   pdf,   6up-pdf ) Join Synopses for Approximate Query Answering
S. Acharya, P.B. Gibbons, V. Poosala, and S. Ramaswamy, Proc. of ACM SIGMOD'1999
Approximate Query Processing using Wavelets
K. Chakrabarti, M. Garofalakis, R. Rastogi, and K. Shim, Proc. of VLDB'2000
Histogram-based Approximation of Set-Valued Query Answers
Y. Ioannidis and V. Poosala, Proc. of VLDB'1999
Th 2/15 (Week 5) Approximate Query Processing-IV (Minos)
Lecture Notes
(ppt,   pdf,   6up-pdf )

AQP references (ppt,   pdf ) Approximate Query Processing using Wavelets
K. Chakrabarti, M. Garofalakis, R. Rastogi, and K. Shim, Proc. of VLDB'2000
Independence is Good: Dependency-based Histogram Synopses for High-Dimensional Data
A. Deshpande, M. Garofalakis, and R. Rastogi, Proc. of ACM SIGMOD'2001

 Part 3:   Sequences and Streams

Tu 2/20 (Week 6) Beyond Sets: Multisets, Sequences, and Streams-I (Raghu)
Lecture Notes
(ppt)
Th 2/22 (Week 6) Beyond Sets: Multisets, Sequences, and Streams-II (Raghu)
Lecture Notes
(ppt) Tu 2/27 (Week 7) Data Stream Processing-I (Minos)
Lecture Notes
(ppt,   pdf,   6up-pdf ) The space complexity of approximating the frequency moments
N. Alon, Y. Matias, and M. Szegedy. ACM STOC, 1996
Tracking Join and Self-join Sizes in Limited Storage
N. Alon, P. Gibbons, Y. Matias, and M. Szegedy. ACM PODS, 1999
Th 3/1 (Week 7) Data Stream Processing-II (Minos)
Lecture Notes
(ppt,   pdf,   6up-pdf ) See above Tu 3/6 (Week 8) Data Stream Processing-III (Minos)
Lecture Notes
(ppt,   pdf,   6up-pdf ) Distinct Sampling for Highly-Accurate Answers to Distinct Values Queries and Event Reports
P. Gibbons. VLDB, 2001
Tracking set-expression cardinalities over continuous update streams
S. Ganguly, M. Garofalakis, R. Rastogi. The VLDB Journal, 2004
Th 3/8 (Week 8) Data Stream Processing-IV (Minos)
Lecture Notes
(ppt,   pdf,   6up-pdf ) An improved data stream summary: The count-min sketch and its applications
G. Cormode and S. Muthukrishnan. Journal of Algorithms, 2005
Maintaining stream statistics over sliding windows
M. Datar, A. Gionis, P. Indyk, and R. Motwani. SIAM Journal on Computing, 2002

 Part 4:   Data Mining

Tu 3/13 (Week 9) Data Mining (Raghu)
Lecture Notes
(ppt,   pdf,   6up-pdf ) Th 3/15 (Week 9) Data Mining, contd. (Raghu)
See lecture notes for 3/13
Tu 3/20 (Week 10) Lecture Cancelled
Th 3/22 (Week 10) Data Mining, contd. (Raghu)
See lecture notes for 3/13
Tu 3/20 - Th 3/29 Spring Break!!
Tu 4/3 (Week 11) Data Mining, contd. (Raghu)
See lecture notes for 3/13

 Part 5:   Probabilistic/Uncertain Data Management

Th 4/5 (Week 11) Probabilistic Data Management-I (Minos)
Lecture Notes
(ppt,   pdf,   6up-pdf )
** ppt requires TexPoint to view correctly ** Efficient Query Evaluation on Probabilistic Databases
N. Dalvi and D. Suciu. The VLDB Journal, 2006
Tu 4/10 (Week 12) Probabilistic Data Management-II (Minos)
Lecture Notes
(ppt,   pdf )
** ppt requires TexPoint to view correctly ** Efficient Query Evaluation on Probabilistic Databases
N. Dalvi and D. Suciu. The VLDB Journal, 2006
Working Models for Uncertain Data
A. Das Sarma, O. Benjelloun, A. Halevy, and J. Widom. IEEE ICDE, 2006
Th 4/12 (Week 12) Midterm Exam (in class)
Tu 4/17 (Week 13) Lecture cancelled (ICDE'2007)
Th 4/19 (Week 13) Data Parallelism - Guest lecturer: Chris Olston (Yahoo! Research)
Lecture Notes
(ppt,   pdf ) Tu 4/25 (Week 14) Lecture cancelled (ICDE'2007)
Th 4/27 (Week 14) Probabilistic Data Management-III (Minos)
Lecture Notes
(ppt,   pdf )
** ppt requires TexPoint to view correctly ** Tu 5/1 (Week 15) Probabilistic Data Management-IV (Minos)
Lecture Notes
(ppt,   pdf )
** ppt requires TexPoint to view correctly ** Representing and Querying Correlated Tuples in Probabilistic Databases
P. Sen and A. Deshpande. IEEE ICDE, 2007

 Part 6:   Distributed Data-Stream Management

Th 5/3 (Week 15) Distributed Data-stream Management-I (Minos)
Lecture Notes
(ppt,   pdf ) Tu 5/8 (Week 16) Distributed Data-stream Management-II (Minos)
Lecture Notes
(ppt,   pdf )




Pointers to other database information

  • The Berkeley DBMS Research page is a good place to start.
  • The best source of additional database-related information is the ACM-SIGMOD home page.
  • Michael Ley maintains an amazing compendium of info on database systems and logic programming.
  • UCB DB-group alum Paul Aoki maintains a list of pointers to other random DBMS information.
  • Entertaining & informative history of the System R project and followons: Paul R. McJones (Ed.): The System R Home Page
Database research at other schools:
  • University of Wisconsin DBMS Research Home Page.
  • Stanford Database Group.
  • Maryland Database Research Home Page.
  • Brown Database Group.
  • Cornell Database Group
  • University of Toronto Database Group
  • U Penn Database Research.
  • University of Washington Information retrieval and Database Systems Group
  • ... more to come