Module Specification

The information contained in this module specification was correct at the time of publication but may be subject to change, either during the session because of unforeseen circumstances, or following review of the module at the end of the session. Queries about the module should be directed to the member of staff with responsibility for the module.
1. Module Title Introduction to Data Science
2. Module Code COMP229
3. Year Session 2023-24
4. Originating Department Computer Science
5. Faculty Fac of Science & Engineering
6. Semester First Semester
7. CATS Level Level 5 FHEQ
8. CATS Value 15
9. Member of staff with responsibility for the module
Dr O Anosova Public Health, Policy & Systems O.Anosova@liverpool.ac.uk
10. Module Moderator
11. Other Contributing Departments  
12. Other Staff Teaching on this Module
Professor V Kurlin Computer Science Vitaliy.Kurlin@liverpool.ac.uk
Mrs J Birtall School of Electrical Engineering, Electronics and Computer Science Judith.Birtall@liverpool.ac.uk
13. Board of Studies
14. Mode of Delivery
15. Location Main Liverpool City Campus
16. Study Hours Lectures: 30; Tutorials: 10; TOTAL: 40
17. Private Study 110
18. TOTAL HOURS 150
 
19. Timetable (if known)
 
20. Pre-requisites before taking this module (other modules and/or general educational/academic requirements):

COMP116 Analytic Techniques for Computer Science; COMP109 Foundations of Computer Science
21. Modules for which this module is a pre-requisite:

 
22. Co-requisite modules:

 
23. Linked Modules:

 
24. Programme(s) (including Year of Study) to which this module is available on a mandatory basis:

25. Programme(s) (including Year of Study) to which this module is available on a required basis:

26. Programme(s) (including Year of Study) to which this module is available on an optional basis:

27. Aims
 

1. To provide a foundation and overview of modern problems in Data Science.
2. To describe the tools and approaches for the design and analysis of algorithms for data clustering, dimensionality reduction, and graph reconstruction from noisy data.
3. To discuss the effectiveness and complexity of modern Data Science algorithms.
4. To review applications of Data Science to Vision, Networks, Materials Chemistry.

 
28. Learning Outcomes
 

(LO1) describe modern problems and tools in data clustering and dimensionality reduction,

 

(LO2) formulate a real data problem in a rigorous form and suggest potential solutions,

 

(LO3) choose the most suitable approach or algorithmic method for given real-life data,

 

(LO4) visualise high-dimensional data and extract hidden non-linear patterns from the data.

 

(S1) Critical thinking and problem solving - Critical analysis

 
29. Teaching and Learning Strategies
 

Teaching Method 1 - Lecture
Description: Formal Lectures

Teaching Method 2 - Tutorial
Description: Tutorials with 4-5 formative assessments (marked by demonstrators) - using problems similar to exam questions.

Standard on-campus delivery
Teaching Method 1 - Lecture
Description: Mix of on-campus/on-line synchronous/asynchronous sessions
Teaching Method 2 - Tutorial
Description: On-campus synchronous sessions

 
30. Syllabus
   

1. Descriptive Statistics (3 lectures): average, range, median, mode, quartiles, sample standard deviation and variance, box plot.

2. Introduction to probability (3 lectures): probability axioms, combinatorial probabilities, probability paradoxes.

3. Probability distributions (3 lectures): uniform distribution, beta distribution, normal distribution.

4. Hypothesis testing (3 lectures): confidence intervals, P-value, statistical significance.

5. Bayesian statistics (3 lectures): conditional probabilities, Bayes formula, Bayesian vs frequentist approaches.

6. Linear regression (3 lectures): scatterplots and correlation, linear approximation to data, regression formulae.

7. Clustering (3 lectures): types of clustering algorithms, optimisation for k-means clustering, Lloyd’s algorithm.

8. Linear maps (3 lectures): matrices of linear maps, scaling, reflections, rotations, compositions.

9. Invariants of linear maps (3 lectures): determinant and eigenvalues of a matrix, change of basis.

10. Dimensionality reduction (3 lectures): principal component analysis and singular value decomposition.
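
As an illustration of syllabus item 10, the following is a minimal sketch of principal component analysis computed through the singular value decomposition. It is not taken from the module materials: the function name pca_via_svd, its parameters, and the toy data are assumptions made for this example, and NumPy is assumed to be available.

```python
import numpy as np


def pca_via_svd(X, n_components):
    """Project the rows of X onto the top principal components using SVD.

    Hypothetical helper written for illustration only.
    """
    X = np.asarray(X, dtype=float)
    # Centre each column so the components describe variance, not the mean.
    X_centred = X - X.mean(axis=0)
    # Thin SVD: X_centred = U @ diag(S) @ Vt; rows of Vt are principal directions.
    U, S, Vt = np.linalg.svd(X_centred, full_matrices=False)
    components = Vt[:n_components]
    # Sample variance (divisor n - 1) captured by each kept component.
    explained_variance = S[:n_components] ** 2 / (X.shape[0] - 1)
    # Coordinates of each data point in the reduced space.
    scores = X_centred @ components.T
    return scores, components, explained_variance


# Toy data (made up for illustration): points lying close to the line y = 2x.
X_demo = np.array([[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]])
scores, components, explained_variance = pca_via_svd(X_demo, n_components=1)
```

On this toy data the single retained direction is, up to sign, close to the unit vector along (1, 2), so the two-dimensional points are summarised by one coordinate each with little loss of variance.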

 
31. Recommended Texts
  Reading lists are managed at readinglists.liverpool.ac.uk.
 

Assessment

32. EXAM
  Columns: Duration; Timing (Semester); % of final mark; Resit/resubmission opportunity; Penalty for late submission; Notes
  (229) Written examination: Duration 120; % of final mark 70
33. CONTINUOUS
  Columns: Duration; Timing (Semester); % of final mark; Resit/resubmission opportunity; Penalty for late submission; Notes
  (229.1) Class test: Duration 0; % of final mark 30