(BIP) Applied Data Modelling: A Case Study in Atmospheric Pollution (3 cr)
Code: ICB018AS2AE-3001
Basic information of implementation
- Enrolment
- 04.02.2025 - 28.02.2025
- Enrolment for the implementation has ended.
- Timing
- 03.02.2025 - 25.05.2025
- Implementation has ended.
- ECTS Credits
- 3 cr
- Campus
- Pasila Campus
- Teaching languages
- English
- Seats
- 5 - 15
- Degree programmes
- ITBBA Business Information Technology
- Teachers
- Juhani Heikkinen
- Jukka Remes
- Groups
- ITE5PAICB1: Business Information Technology, 5th semester, ICT and Business, Pasila, group 1
- CONTACT: Contact implementation
- BLENDED: Blended implementation
- Course
- ICB018AS2AE
Evaluation scale
H-5
Schedule
Virtual part 3.3.-12.5.2025
Intensive part at Pasila Campus 19.-25.5.2025
Implementation methods, demonstration and Work&Study
Teaching Plan
1) Asynchronous Part: Students are provided with a set of Jupyter notebooks to work through in their own time; these must be finished before the in-person part. The notebooks cover:
-A Python Warmup
-A Pandas Warmup
-Data Visualization in Python (seaborn)
-Regression Analysis in Python (statsmodels)
-Tree-Based Methods in Python (sklearn)
-Explainability Methods (learning to use SHAP values)
2) In-Person Part: Students are provided with the atmospheric_pollution dataset. During the intensive programme, they will go through the following steps:
-Form a team of 2-3 people from different universities.
-Review the provided data for quality (missing values, outliers, data understanding).
-Identify relationships between potential features and the responses pm10 and no2. Consider the nature of these relationships and let it inform the modelling.
-Model the responses for 2015 - 2019 using the previously identified features. Evaluate and diagnose these models and attempt to improve them; this step is iterative. Investigate which features influence the response and how.
-Predict the responses for 2020 and check whether a lockdown effect (starting March 16th) can be identified for either response.
-Summarize findings, present, and discuss with colleagues to learn from their chosen paths and pitfalls.
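The in-person steps can be illustrated with a minimal sketch. It is not the course's own material: the data below are synthetic stand-ins (the atmospheric_pollution dataset is not reproduced here), the column names `temp` and `weekday` are hypothetical features, and an ordinary least-squares fit with NumPy replaces the statsmodels and sklearn models used in the notebooks.

```python
import numpy as np
import pandas as pd

# --- Synthetic stand-in data (assumption: NOT the course dataset) ---
rng = np.random.default_rng(0)
dates = pd.date_range("2015-01-01", "2020-12-31", freq="D")
lockdown_start = pd.Timestamp("2020-03-16")

day_of_year = dates.dayofyear.to_numpy()
temp = 10 - 12 * np.cos(2 * np.pi * day_of_year / 365.25) + rng.normal(0, 3, len(dates))
weekday = (dates.dayofweek.to_numpy() < 5).astype(float)   # 1.0 = Monday to Friday
lockdown = (dates >= lockdown_start).astype(float)

# Simulated no2 with an artificial drop of 8 units after the lockdown start.
no2 = 40 - 0.8 * temp + 5 * weekday - 8 * lockdown + rng.normal(0, 4, len(dates))

df = pd.DataFrame({"date": dates, "temp": temp, "weekday": weekday, "no2": no2})
df.loc[rng.choice(len(df), 20, replace=False), "no2"] = np.nan  # simulated gaps

# --- Step 1: review data quality (here: count and drop missing responses) ---
n_missing = int(df["no2"].isna().sum())
df = df.dropna(subset=["no2"])

# --- Step 2: model the response on 2015-2019 only ---
def design(frame):
    """Design matrix: intercept, temperature, weekday indicator."""
    return np.column_stack([
        np.ones(len(frame)),
        frame["temp"].to_numpy(),
        frame["weekday"].to_numpy(),
    ])

train = df[df["date"] < pd.Timestamp("2020-01-01")]
test = df[df["date"] >= pd.Timestamp("2020-01-01")]
beta, *_ = np.linalg.lstsq(design(train), train["no2"].to_numpy(), rcond=None)

# --- Step 3: predict 2020 and compare observed vs. predicted ---
resid = test["no2"].to_numpy() - design(test) @ beta
after = (test["date"] >= lockdown_start).to_numpy()
gap_before = resid[~after].mean()   # should be close to zero
gap_after = resid[after].mean()     # clearly negative if a lockdown effect exists

print(f"missing values: {n_missing}")
print(f"mean residual before lockdown: {gap_before:.2f}")
print(f"mean residual after lockdown:  {gap_after:.2f}")
```

The model never sees 2020 data, so a persistent negative gap between observed and predicted values after March 16th is what would indicate a lockdown effect. With real measurements, seasonality and weather would need more careful modelling than this two-feature example.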
Materials
Bibliography
-Hörmann, S., Jammoul, F., Kuenzer, T., & Stadlober, E. (2021, February). Separating the impact of gradual lockdown measures on air pollutants from seasonal variability. Atmospheric Pollution Research, 12, doi: 10.1016/j.apr.2020.10.011.
-James, G., Witten, D., Hastie, T., Tibshirani, R., & Taylor, J. (2023). An Introduction to Statistical Learning with Applications in Python. Springer.
Teaching methods and instruction
• Blended Intensive programme
• In co-operation with FH Joanneum
• Virtual part 3.3.-12.5.2025
• Intensive part at Pasila Campus 19.-25.5.2025
Information regarding the virtual part:
In this phase, students will use hands-on lecture materials to learn the fundamentals of applied data modelling. The materials will be available as Jupyter notebooks, which let students engage interactively with each topic. Each notebook contains executable code, additional information and small tasks where students can immediately apply what they have learned. Each notebook is also available with a solution, so students can compare their own results with those suggested by the lecturer. The topics covered are:
• Python Warmup
• Pandas Warmup
• Data Visualization in Python
• Regression Analysis in Python
• Tree-Based Methods in Python
• Explainability & Diagnostic Methods
To help students stay on track, there will be three online meetings in which the lecturer and the participants discuss setup questions as well as questions arising from the virtual exercises. The meetings are:
• Kick-Off: 05.03.2025 at 18:00 - 20:00
• Tutorial 1: 26.03.2025 at 18:00 - 20:00
• Tutorial 2: 23.04.2025 at 18:00 - 20:00
The virtual part culminates in a small task that checks whether students have the skills required for the in-person part. No solution will be provided for this task. Students are required to hand in the task by the deadline and will receive feedback from the lecturer before the in-person phase. The task deadline is: 12.05.2025
Information regarding the intensive part held at Pasila Campus:
In this phase, students meet at Haaga-Helia for the in-person part of this intensive programme. Here, students will split into groups of 2-3 people and be provided with a new dataset and a list of tasks to complete over the course of the week. Working in teams, students will apply their acquired skills to a real-life dataset and its associated research questions. At the end of the week, students will summarise their findings, confer with their colleagues and present their results to an audience. The course concludes with this presentation.
Internationality
In co-operation with FH Joanneum