Python for Data (CS 199) - Spring 2020 - Pot A

Description

Introduces concepts for obtaining, analyzing, visualizing, and processing data from various sources. Covers image processing and feature identification, streaming data sources, text scraping, and basic elements of data analysis and machine learning. Prerequisite: CS 101, 105, 107, or equivalent Python experience. There is no major or class restriction; however, students must have taken at least one intro to programming course.

Meeting time

Meets 01/21/20-03/13/20 on Wednesdays 1:00pm to 2:20pm, 2101 Everitt Laboratory

Technology

We will be using Python, and since we have only one meeting per week, we will not be able to go over the basics during lecture. We therefore expect students either to be familiar with Python or to have programming skills solid enough to learn another language quickly. You can use this optional tutorial either to brush up on your Python knowledge or to learn it for the first time. We will not go over this Python tutorial during lecture. Make sure to come to the first lecture prepared: we will assume you know how to use Python.

We will provide the lecture activities in the format of IPython notebooks, available via CoCalc. You can use Jupyter locally from your machine, or complete the lecture activities using CoCalc. You will need to bring a laptop or tablet to class.

Weekly lesson plan

Week 1 Image processing
Week 2 Intro to Pandas (data cleaning and visualization)
Week 3 Monte Carlo Methods
Week 4 Clustering techniques
Week 5 Markov chains
Week 6 Machine learning for classification problems
Week 7 Project presentation

Lecture activities

This class follows an active learning format. Instructors will not use traditional lecture delivery, in which they present the material while students listen and/or take notes. Instead, students will be assigned to a group of 5-6 students in the first week of classes and will be seated with their group in every class. At the beginning of each lecture, we will have a 5-10 minute introduction to a new topic. For the remainder of the lecture, students will be expected to complete the IPython notebook provided via CoCalc. Instructors will answer questions as needed.

Students can complete the activity in one of two ways:

  • Use the IPython notebook provided via the group shared folder. Here the entire group collaborates on one file that is synchronized in real time among all members of the group.

  • Use the IPython notebook provided via a public shared link. In this option, students complete the activity individually.

You will be encouraged to work collaboratively with your group, but this is not a requirement.

Grade:

The course grade will be based on attendance, participation, homework and a final project. Here is how you can earn points towards your grade:

Category                   Points   Description
Attendance/Participation   720      Each lecture is worth 120 points, allowing students to miss one lecture. Hence the total score for attendance/participation is capped at 720.
Homework                   120      Each HW is worth 24 points, for a total of 6 assignments. Only the 5 highest scores will count towards the grade. The homework will be delivered using PrairieLearn.
Final Project              160      Each group will have 3-4 students. Project details below.

The grade scale is given below:

Grade   Point Range
A       [900, 1000]
B+      [800, 900)
B       [700, 800)
C       [600, 700)
D       [500, 600)
F       < 500
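
As a rough illustration of how the point categories and the grade scale combine, the arithmetic can be sketched in Python. The function names and the example student are hypothetical, and the sketch assumes that all 7 weekly meetings count as lectures for attendance. It is not an official grade calculator.

```python
def total_points(lectures_attended, homework_scores, project_points):
    """Combine the three categories into a score out of 1000 (hypothetical helper)."""
    # Each lecture is worth 120 points; the category is capped at 720.
    attendance = min(120 * lectures_attended, 720)
    # Only the 5 highest homework scores count (each HW is out of 24).
    homework = sum(sorted(homework_scores, reverse=True)[:5])
    # The final project is out of 160 (notebook 80 + presentation 80).
    return attendance + homework + project_points


def letter_grade(points):
    """Map a point total to a letter grade using the scale above."""
    # Lower bounds are inclusive, matching the [low, high) ranges.
    cutoffs = [(900, "A"), (800, "B+"), (700, "B"), (600, "C"), (500, "D")]
    for cutoff, letter in cutoffs:
        if points >= cutoff:
            return letter
    return "F"


# Hypothetical student: attended all 7 meetings, missed one homework,
# and earned 150 of 160 project points.
score = total_points(7, [24, 20, 24, 0, 22, 24], 150)
print(score, letter_grade(score))  # prints: 984 A
```

In this example the missed homework is dropped as the lowest score, and the seventh meeting adds nothing beyond the 720-point cap.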

You are expected to be on time for class. If you are 15 or more minutes late, you will not get attendance credit for that class. You are also expected to work on the IPython notebook during the entire class. Students who are not working on the given activities (and are instead playing with their phones or laptops) will NOT get the participation points (might as well not attend). Note that the attendance/participation points are not given for achievement or completion: you will receive all your points even if you don’t complete the activity or don’t get it perfectly correct. If you try, you will get the points!

Final project

Students will work in groups of 3-4 students. Each group will have to create an IPython notebook similar to the ones that you will be using during lectures. You will save the notebook (.ipynb) inside the CoCalc shared folder that your group has access to. The topic for the project will be “Clustering”.

Each group will have 5 minutes to present the big-picture idea of its project. This is not a lot of time, so you should only share a brief intro, indicate the method you used to solve the problem, and highlight your main result and conclusions. You will present using the IPython notebook (do not create slides). Every member of the group will have to present (so plan accordingly how you will split the 5 minutes).

The overall project is worth 160 points. The notebook submission is worth 80 points (same grade for all members of the team), and the presentation is worth 80 points (each student will receive their individual score). The grading rubrics are provided below.

Rubric for the final project presentation.
Rubric for the final project IPython notebook.

Preparing for the first class:

Before you come to the first lecture, make sure you take a look at the Python tutorials available here. If you are familiar with Python, you probably don’t need to do anything. These tutorials are not required, but they may be a helpful resource for some of you.

Piazza:

We will be using the Piazza online message board for communication (piazza.com/illinois/fall2019/cs199py1/home). The course staff will post important announcements there, and it is your responsibility to check often for these announcements. If you have a question or a concern, please post it on Piazza. Please do not email the course staff. Posting on Piazza helps other students who may have similar questions and, because your post is visible to the entire course staff, ensures you receive the fastest response possible.

Office hours:

We will not have pre-defined office hours. If you feel you need to talk to one of the instructors, reach out to us here and we will try to schedule an appointment.

Contact information

Mariana Silva
2213 Siebel Center
(217) 300-6633
mfsilva@illinois.edu