Jaime S. Cardoso – Faculdade de Engenharia da Universidade do Porto

Summary
The course is jointly organized by CLAD (Portuguese Association for Classification and Data Analysis) and the University of Aveiro (DEGEIT and DMAT). It offers an introduction to the various types of data mining tasks, with emphasis on predictive analytics (regression and classification tasks). Representative methods and algorithms will be presented for each type of task, and their application in Python to selected real problems will be illustrated. There will also be a brief introduction to Python (for those who already know a programming language), and the main Python packages for data mining will be presented and used.

Target Audience
All potential users of data mining/machine learning techniques (teachers, researchers, students and professionals from other areas) who need to tackle data analysis problems, evaluate the results, and understand how the most appropriate methods work. Basic theory will be introduced, but the emphasis will be on applying the concepts in Python.

Duration and Scheduling
The course comprises 6 hours and 30 minutes of sessions, held between 10:00 and 18:30. The detailed program is attached.

Conditions
The course will run with a minimum of 10 and a maximum of 30 participants. Applications that cannot be accepted will be carried over to future editions of the course. Each participant must bring their own laptop. Instructions for installing Python will be sent in advance.

Fee and Application Deadline
The course is free for CLAD members who have paid the 2016 fee; the fee for non-members is €60. CLAD will issue a certificate of participation. The deadline for registration is November 11, 2016.

Contact
If you are interested in attending this course, please send the attached registration form to mail@clad.pt. The same address should be used for any further questions.

Best regards,
On behalf of the Board of CLAD
José Gonçalves Dias

 

Course program

10:00 – 13:00: The ABC of the learning process: a review of the main concepts of the machine learning/data mining process, focusing on the classification task and assuming independent observations. The ABC of Python and the main packages for data mining: an overview of the main features of the Python language and a discussion of the NumPy and scikit-learn packages.

14:30 – 16:30: Application of the previously discussed concepts in Python: participants will solve one of three real problems of their choice.

17:00 – 18:30: Analysis of context-dependent data, in particular sequential data. Motivation for and discussion of Hidden Markov Models (HMMs). Worked example in Python.