und Theorie diskreter Systeme

Informatik 7

Foundations of Data Science


In the age of "big data" and "advanced analytics", data processing faces new challenges. Queries become more complex and often involve data mining and machine learning tasks, and the scale of the datasets requires new algorithmic approaches.

This course covers the theoretical foundations of modern data processing and analytics. This includes topics from database theory, such as data models, the analysis of query languages, and basic algorithmic and complexity-theoretic questions related to query processing. It also includes topics from algorithmic learning theory, such as basic machine learning algorithms, support vector machines, the PAC model, and the VC dimension. Furthermore, it includes new models of computation on massive datasets, such as the streaming model and the MapReduce paradigm, and algorithms for these models.

We will focus on computational aspects of the theory. Statistics, though undoubtedly one of the foundations of data science, will not play a central role in this course.

Lectures will be in English.


Time and Place

Tuesday, 8:30 - 10:00 am in 2350|111 (AH II)

Thursday, 2:15 - 3:00 pm in 2350|009 (AH I)


Lecturer: Martin Grohe


There will be separate tutorials for Bachelor's and Master's students.

Bachelor: Monday, 4:15-5:45 pm in 2356|056 (5056), held by Marlin Frickenschmidt

Master: Wednesday, 4:15-5:45 pm in 2356|052 (5052), held by Christoph Berkholz


There will be weekly exercise sets. Completing these successfully (at least 50% of the possible points) is required for admission to the exam.

A new exercise sheet will be released in our L2P room before each Thursday lecture. Each sheet is due one week later: hand it in before the Thursday lecture, or put it in the box on the first floor of E1 before 15:15.


The exam modalities will be announced. The planned exam dates are:

19 February 2015, 12:00 noon, 2350|111 (AH II)

26 March 2015, 12:00 noon, 2350|009 (AH I)


S. Abiteboul, R. Hull, V. Vianu. Foundations of Databases. Addison-Wesley 1995.

M. Kearns, U. Vazirani. An Introduction to Computational Learning Theory. MIT Press 1994.

J. Hopcroft, R. Kannan. Foundations of Data Science. Unpublished, draft available online.

S.J. Russell, P. Norvig. Artificial Intelligence: A Modern Approach. 3rd Edition, Pearson 2014.