We will introduce best practices and essential foundations to guide you from data loading to generating scientific insights. By the end of the course, you will be able to develop your data science ideas independently and in a structured way.
Module I, “Tidy Python”, is inspired by the concept of tidy data from the R community and closely follows CCMAR’s previous R workshop. You will organise, process, and analyse tabular data of all sorts, this time in Python.
If you are already familiar with the topics in the Module I program, you can attend Module II only. Module II will provide personalised guidance on your dataset, problem, or project as specified in your registration form.
Prerequisites:
Audience: Researchers, PhD students, Master's students
Registration & Costs:
NOTE: Members of CIMAR-LA (CCMAR and CIIMAR) have the same access conditions.
Module I
2-3 July | 24 places
In Module I, from foundational concepts like conditionals and loops to libraries like NumPy, SciPy and Pandas, you’ll learn to harness Python’s power for data manipulation, analysis, and visualisation. Along the way, you’ll develop a structured way of preparing and executing scientific data analysis workflows efficiently and reproducibly.
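To give a flavour of what "tidy" means in practice, here is a minimal sketch using Pandas. The table, station names, and column names are invented for illustration and are not course material. It reshapes a wide table (one column per month) into tidy form, where each row is a single observation, and then summarises it.

```python
import pandas as pd

# Hypothetical wide table: one row per sampling station, one column per month.
wide = pd.DataFrame(
    {
        "station": ["A", "B"],
        "jan": [14.2, 15.1],
        "feb": [14.8, 15.6],
    }
)

# Reshape to tidy (long) form: one row per (station, month) observation.
tidy = wide.melt(id_vars="station", var_name="month", value_name="temperature_c")

# With tidy data, a grouped summary is a single expression.
mean_by_station = tidy.groupby("station")["temperature_c"].mean()

print(tidy)
print(mean_by_station)
```

Once the data is in this shape, filtering, grouping, and plotting all work on the same three columns, regardless of how many months are added later.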
Day 1: Tidy principles applied to raw data
11:45 – 12:30, Help with Python setup (optional, for those who wish to run code locally on their own machines).
14:00 – 17:30, Data exploration and tidy data.
Day 2: Data wrangling and visualisation
10:00 – 12:30, Visualisation, pre-processing.
14:00 – 17:30, Analysis, reproducibility.
Please register HERE.
Module II (Bring your data/project):
4 July | 15 places
Module II will provide personalised guidance on your dataset, problem or project specified in your registration form, which may include data cleaning, exploratory analysis, predictive model development and training.
Day 1: Exploratory analysis, data cleaning, identifying and using existing repositories/packages
10:00 – 12:30, Explore data and modularise code.
14:00 – 17:30, Pipelines, external packages, GitHub.
Please register HERE.
Instructors:
Paulo Martel (CINTESIS, UAlg): Computational biologist focusing on protein dynamics and lecturer at UAlg.
David Palecek (PBS, CCMAR): Python practitioner with an interest in automation and bioinformatics.