Data Science
Teaching Staff: Maragoudakis Emmanouil
Course Code: HY-390
Course Type: Elective Course
Course Level: Undergraduate
Course Language: Greek
Semester: 7th
ECTS: 4
Teaching Units: 4
Lecture Hours: 2
Lab/Tutorial Hours: 2L
Total Hours: 4
Describing data with graphs and tables. Basic statistical measures for data description. Data preparation. The importance of data validation and cleaning. Introduction to databases. SQL. Introduction to supervised learning: decision trees, logistic regression. Introduction to regression: multiple linear regression. Predictions. Improving a model. The problems of over-parameterization. Evaluating model performance. Dimensionality reduction. The feature selection process. Principal Component Analysis via SVD matrix factorization. Unsupervised learning, clustering. Applications and evaluation of k-means. Application of hierarchical clustering models. Semi-supervised learning. Introduction to metadata and Big Data. Computational methods for Big Data analysis (Hadoop and MapReduce).
Laboratory:
(i) Introduction to the R Language for Data Science.
(ii) Create, select and compare categorical data using Factors. Store tabular data in Data Frames. Select data from a Data Frame and convert it to a Table.
(iii) Basic graphics / visualization packages in R.
(iv) Functions - Loops - Flow control.
(v) Introduction to SQL. Queries. Queries across multiple tables with the JOIN command. Operators. Subqueries.
(vi) Rattle.
(vii) RHadoop.
(viii) Introduction to the RapidMiner tool. Introduction to the KNIME tool.
e-mail: cs@ionio.gr