Python for Data (CS 199) - Spring 2020 - POT B

Description

Introduces concepts for obtaining, analyzing, visualizing, and processing data from various sources. Covers image processing and feature identification, streaming data sources, text scraping, and basic elements of data analysis and machine learning. Prerequisite: CS 101, 105, 107, or equivalent Python experience. There is no major or class restriction; however, students must have taken at least one introductory programming course.

Meeting time

Meets 03/16/20-05/06/20 on Wednesdays 1:00pm to 2:20pm, 2101 Everitt Laboratory

Technology

We will be using Python. Since we only have one meeting per week, we will not be able to go over the basics during lecture. We therefore expect students either to be familiar with Python already or to have programming skills solid enough to learn another language at a quick pace. You can use this optional tutorial to brush up on your Python knowledge or to learn the language for the first time. We will not go over this Python tutorial during lecture. Make sure to come to the first lecture prepared: we will assume you know how to use Python.

We will provide the lecture activities in the format of IPython notebooks, available via CoCalc. You will need to bring a laptop or tablet to class.

Weekly lesson plan

Week 1 Intro to Pandas (data cleaning and visualization)
Week 2 Image processing
Week 3 Monte Carlo Methods
Week 4 Clustering techniques
Week 5 Markov chains
Week 6 Machine learning for classification problems
Week 7 Project presentation

Lecture activities

This class follows an active learning format. Instructors will not use a traditional lecture delivery in which they present the material while students listen and/or take notes. Instead, students will be assigned to a group of 5-6 students in the first week of classes and will be seated with their group in every class. At the beginning of each lecture, we will have a 5-10 minute introduction to a new topic. For the remainder of the lecture, students will be expected to complete the IPython notebook provided via CoCalc. Instructors will answer questions as needed.

Grade:

The course grade will be based on attendance, participation, homework and a final project. Here is how you can earn points towards your grade:

Category Points Description
Attendance 480 Each lecture is worth 80 points, allowing students to miss one lecture. Hence the total score for attendance is capped at 480.
Participation 240 See details about lecture participation below.
Homework 120 Each HW is worth 24 points, for a total of 6 assignments. Only the 5 highest scores will count towards the grade. The homework will be delivered using PrairieLearn.
Final Project 160 Each group will have 3-4 students. Project details below.
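As a rough illustration of how the attendance cap and the dropped homework score work, here is a minimal Python sketch. The function names and the sample scores are hypothetical and are not part of any course tooling. It assumes 7 lectures at 80 points each, as suggested by the weekly lesson plan.

```python
def attendance_points(lectures_attended, points_per_lecture=80, cap=480):
    """Attendance score: 80 points per lecture attended, capped at 480."""
    return min(lectures_attended * points_per_lecture, cap)


def homework_points(scores, keep=5):
    """Homework score: sum of the `keep` highest assignment scores."""
    return sum(sorted(scores, reverse=True)[:keep])


# Hypothetical student: attended 6 of 7 lectures and submitted
# six homework assignments, each scored out of 24 points.
attendance = attendance_points(6)
homework = homework_points([24, 20, 0, 24, 18, 22])
print(attendance, homework)
```

Under these assumptions, missing one lecture still yields the full attendance score, and the lowest homework score (here the 0) is dropped.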

The grade scale is given below:

Grade Point Range
A [900, 1000]
B+ [800, 900)
B [700, 800)
C [600, 700)
D [500, 600)
F < 500
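The interval notation in the grade scale can be read as a simple lookup. The sketch below is only an illustration of that reading; `letter_grade` is a hypothetical helper, not an official grading script.

```python
def letter_grade(points):
    """Map a point total (0 to 1000) to a letter grade using the scale above.

    Each range includes its lower bound and excludes its upper bound,
    except the A range, which includes 1000.
    """
    if points >= 900:
        return "A"
    if points >= 800:
        return "B+"
    if points >= 700:
        return "B"
    if points >= 600:
        return "C"
    if points >= 500:
        return "D"
    return "F"


# Hypothetical totals on either side of a boundary.
for total in (900, 899, 500, 499):
    print(total, letter_grade(total))
```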

You are expected to be on time for class. If you are 15 or more minutes late, you will not receive attendance credit for that class. You are also expected to work on the IPython notebook during the entire class. Students who are not working on the given activities (and are instead playing with their phones or laptops) will NOT get the participation points, so they might as well not attend. Note that attendance and participation points are not awarded for achievement or completion: you will receive all your points even if you don’t complete the activity or don’t get it perfectly correct. If you try, you will get the points!

Final project

Students will work in groups of 3-4 students. Each group will have to create an IPython notebook similar to the ones that you will be using during lectures. You will save the notebook (.ipynb) inside the CoCalc shared folder that your group has access to. The topic for the project will be “Clustering”.

Each group will have 5 minutes to present the big-picture idea of its project. This is not a lot of time, so you should only share a brief intro, indicate the method you used to solve the problem, and highlight your main result and conclusions. You will present using the IPython notebook (do not create slides). Every member of the group will have to present, so plan in advance how you will split the 5 minutes.

The overall project is worth 160 points. The notebook submission is worth 80 points (same grade for all members of the team), and the presentation is worth 80 points (each student will receive their individual score). The grading rubrics are provided below.

Rubric for the final project presentation.
Rubric for the final project IPython notebook.

Lecture Participation

Groups will have access to a shared IPython notebook on CoCalc. You can think of these notebooks as “Google Docs”, but instead of collaborating on a text document, you will be creating a programming assignment collaboratively.

To encourage and help collaborations during lectures, in each class the groups will have to assign the following roles to students:

• Manager: this student will make sure that the group is “on-track” to complete the activity, and that students are performing their assigned roles. At the end of each lecture, the manager will complete a short survey (3-4 questions) to provide us feedback about the activity.

• Recorder (or “driver”): this student will be the one responsible for doing most of the writing (completion of the IPython notebook). All students are able to contribute to the notebook, but the recorder should be the one coordinating these efforts.

• Reflector: this student will answer a short survey at the end of the class, reflecting on how the group worked together to complete the activity.

Students will take turns in each of these roles. Participating in a role earns a student 80 points. Hence, in order to get the total of 240 participation points, each student will need to participate in each role at least once.
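One way to read this rule is that each distinct role a student has performed contributes 80 points, up to 240. The sketch below illustrates that reading only; the `participation_points` function and the role list format are hypothetical, and it does not model the requirement to stay on task during class.

```python
ROLES = {"manager", "recorder", "reflector"}


def participation_points(roles_taken, points_per_role=80):
    """80 points for each distinct role performed at least once (max 240)."""
    distinct = ROLES & {role.lower() for role in roles_taken}
    return points_per_role * len(distinct)


# Hypothetical student who was manager twice and recorder once,
# but has not yet been the reflector.
print(participation_points(["Manager", "Recorder", "Manager"]))
```

Repeating a role does not add points under this reading, which is why each role has to be covered at least once to reach 240.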

(If you are interested, you can read more about these roles for group collaborations here.)

Preparing for the first class:

Before you come to the first lecture, make sure you take a look at the Python tutorials available here. If you are familiar with Python, you probably don’t need to do anything. These tutorials are not required, but they may be a helpful resource for some of you.

Piazza:

We will be using the Piazza online message board for communication (piazza.com/illinois/fall2019/cs199py1/home). The course staff will post important announcements there, and it is your responsibility to check often for these announcements. If you have a question or a concern, please post it on Piazza. Please do not email the course staff. Posting on Piazza helps other students who may have similar questions and ensures you receive the fastest possible response, because your post is visible to the entire course staff.

Office hours:

We will not have pre-defined office hours. If you feel you need to talk to one of the instructors, reach out to us here and we will try to schedule an appointment.

Contact information

Mariana Silva
2213 Siebel Center
(217) 300-6633
mfsilva@illinois.edu