CSCI 140: Programming for Data Science - Fall 2023
In this course, I will introduce you to the beautiful world of computer science, programming, and data science. I will not assume any programming experience. We will explore and learn the associated concepts using the Python programming language, one of the most beginner-friendly programming languages out there. To install the associated programming libraries and frameworks, we will rely solely on Anaconda, the most widely used data-science platform for Python.
Throughout the course, you will learn the following:
- Variables, objects, and object-oriented programming
- Control flow
- Built-in data structures, including lists and dictionaries
- Data structures from packages such as pandas
- File input/output and file formats
- Obtaining, cleaning, and exploring data
- Processing of text data
- Data Visualization
- Testing and debugging
Using Python, we will analyze, manipulate, and visualize different data types, such as spatial, temporal, numerical, and textual data. Analysis and visualization become more manageable when we first convert the data to standardized formats through manipulation, which we will learn to do. We will rely on the pandas data analysis toolkit to analyze the different data types and will use seaborn to visualize data. Throughout the course, we will practice these skills using data from various fields, such as economics, statistics, and linguistics, so that you become comfortable applying them in different contexts.
Mandatory Lab Sessions
This is a 4-credit course that includes a weekly lab session. All of the labs are mandatory. The lab sessions allow you to ask questions and go over concepts with real-time support from a graduate TA.
Instructor and Meeting Information
|Name||Amit Seal Ami|
|Schedule||TR:1230-1350, Wren 301|
|Tuesday||11AM to 12PM, 3PM to 4PM|
|Thursday||11AM to 12PM|
|Room||109, McGlothlin Hall|
- We will use Blackboard for course organization, and assignment submissions will be accepted through Blackboard. Other assessments may also be administered through Blackboard.
- Programming in this course will be done in Python 3 using the Anaconda Python Stack (the link is in the Setup document in Module 0). We will provide instructions and assistance for setting up Anaconda for Windows 7/10, Mac OS X, or Linux on your personal machine. The environment will also be available on Department Linux systems in McGl 121. You must not use Python 2.7, which is often the default on personal computers.
- We will use Piazza for class discussion and questions. You will receive an e-mail invitation to the Piazza class specific to CSCI 140; please accept it.
- It is your responsibility to bring an operable device to class each session in order to participate in lecture activity. You will need to access Jupyter notebooks during class.
- Optional textbook: Think Python: How to Think Like a Computer Scientist, 2nd edition
- It is available at the library, in both online and offline versions.
- You may also want to consult the following textbook for Module 4 of the course:
- Python for Data Analysis
- It is also available for online access at the library.
|August 29||Add/drop period begins at 1:00 p.m.|
|August 30||First day of classes|
|September 11||Last day to add/drop|
|October 30||Last day to withdraw from a full-term course|
|December 11-15,18-19||Final Exam period|
For more dates, check the official calendar:
Grade / Marks
(Excluding the Course Syllabus Agreement Quiz and
Final Exam Guidelines Quiz)
|In-term Problem Sets||20%|
|Part 1 (Final Project)||25%|
|Part 2 (Final Problem Set)||12%|
To assign final letter grades, we begin on a ten-percent scale, with 100-90 mapping to A and A-, 89-80 mapping to B+, B, and B-, and so on. If appropriate, we will curve those standards in your favor, but we will not raise them. We reserve the right to consider participation in borderline grade decisions.
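As a rough illustration of the ten-percent scale, here is a minimal Python sketch. The function name `letter_band` is hypothetical, and only the A and B bands come from this syllabus. The C, D, and F cutoffs are assumptions that simply continue the pattern, and the handling of fractional percentages is also assumed. The sketch does not model the plus/minus split within a band, any curve, or participation.

```python
def letter_band(percent):
    """Map a final percentage to a letter band, before any curve.

    Only the A and B bands are stated in the syllabus. The C, D, and F
    cutoffs below are assumptions that continue the ten-percent pattern.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if percent >= 90:
        return "A band (A, A-)"
    if percent >= 80:
        return "B band (B+, B, B-)"
    if percent >= 70:
        return "C band (assumed)"
    if percent >= 60:
        return "D band (assumed)"
    return "F (assumed)"


print(letter_band(93))
print(letter_band(85))
```

Because any curve can only work in your favor, the band this sketch returns is a lower bound on the band you would actually receive.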
Problem sets are based on material covered in the lecture and the readings and provide an opportunity to figure out concepts that may need to be clarified. Note that:
- Problem sets are due at 1700 (5:00 PM EST); deadlines will be shared at least a week in advance,
- You may drop the lowest problem set score,
- If you miss a problem set for any reason, that will be your drop.
Your dropped problem set covers the case where you cannot turn an assignment in on time for any reason.
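The drop rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `problem_set_average` is hypothetical, all problem sets are assumed to be equally weighted and scored out of 100, and a missed problem set is assumed to count as 0 before the drop is applied.

```python
def problem_set_average(scores, expected_count):
    """Average problem-set scores after dropping the single lowest one.

    Assumptions (not stated in the syllabus): every problem set carries
    equal weight, and a missed problem set is recorded as 0.
    """
    if expected_count < 2:
        raise ValueError("need at least two problem sets to drop one")
    if len(scores) > expected_count:
        raise ValueError("more scores than problem sets")
    # Pad with zeros for any problem sets that were never submitted.
    padded = list(scores) + [0.0] * (expected_count - len(scores))
    # Sort ascending and discard the first (lowest) score.
    kept = sorted(padded)[1:]
    return sum(kept) / len(kept)


# One of four problem sets was missed, so the missing 0 becomes the drop.
print(problem_set_average([90, 80, 100], expected_count=4))  # 90.0
```

Under these assumptions, a single missed problem set does not change the average, but a second missed one would count as a 0.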
There will be four equally weighted in-term programming projects, one each for Modules 1, 2, 3, and 4. The projects are to demonstrate what you have learned in each module and the previous modules, as programming is a cumulative skill. Each project will showcase code that you write on your own, as well as a debugging task, which helps build your skills in debugging. Debugging is one of the most essential skills in programming and will always be useful to you.
Finally, you will have the chance to reflect on your work for each project and will share your thoughts, struggles, and plans for future work with us. The reflection must be submitted as a PDF document.
For projects, you may collaborate with up to 3 other students. However, you will still have to submit your report individually. In the submitted report, specify the names of your collaborators.
Each of you can take a 48-hour extension on any TWO separate projects of your choice (excluding the final project). To elaborate, for two of the four projects, you may choose to extend the deadline by 48 hours.
Once the extensions have been used, projects may be submitted up to 48 hours after the deadline for up to 50% credit. If the 48-hour extension has already been applied to a project, a project can be submitted for up to 50 points total within 48 hours of the original deadline.
Projects received more than 48 hours after the original deadline or not turned in at all will automatically default to a grade of 0.
The extensions cover all cases for which you would request an extension, including but not limited to personal, medical, academic, and technological issues. Projects are given over a week in advance, and choosing to start the project a day or two before the deadline does not warrant a longer extension.
It is also your responsibility to ensure that you have submitted gradable files and the correct version of your project. If your files are not gradable, you will receive a 0. The last version of the project you submit by the deadline will be graded, even if you submitted it in error.
Emailed submissions will not be accepted.
The final exam consists of a Final Project and a Final Exam Problem Set.
Exam deadlines are specified by the Dean of Students, and the Dean’s office must grant any changes to exam deadlines in advance of the due date.
The tentative deadline for final project submission is:
December 14, 7:00 p.m. - 10:00 p.m.
Please refer to the official final exam schedule website for the latest update:
There are comprehension quizzes for each section of the course, which are designed to assess your understanding of core concepts and ideas from the readings, lecture material, and other additional materials you may receive. These quizzes can be retaken as many times as you like up to the deadline.
Participation and Communication
Participation in this course is critical to your understanding. There are 2 components to your participation: (1) lecture and lab attendance, and in-class participation; (2) Piazza. We expect you to be present and in good hygienic condition at class meetings and labs. Contact me as soon as possible in the event of illness, emergency, or co/extra-curricular conflict.
Piazza participation involves contributing to posted problems and posting and answering questions from your peers. Message the instructors privately on Piazza only if your message contains private or confidential information. Posting a short snippet of code to the public channel to ask generic questions (e.g., to understand a concept) is OK. However, please keep your assignment solution private.
The instructor and TAs will send emails for essential and urgent messages.
Collaboration and Honor Policy
All work in this course is subject to W&M’s Honor Code. Cheating cases involving projects in CS courses are typically Level III violations of the Honor Code. This means that they must be adjudicated by the Honor Council and may result in one or more of the following: permanent dismissal from the College, suspension from the College, a failing grade, or a grade reduction in the course.
Policy on outside sources: This course emulates real-life programming and problem-solving practices, so you can use any static sources. This means books (and ebooks), journal articles, online videos, and static web pages (e.g., written tutorials and blog posts). You are NOT permitted to use sites where you submit part or all of your assignment and other individuals solve the problems/write the code for you - this includes but is not limited to sites like Reddit and StackOverflow. It is OK to read the static posts made by other people on these sites, but receiving direct assistance from individuals online or offline is not permitted. Using sites like Chegg, where individuals post questions verbatim and receive answers, is not permitted. This course’s goal is learning, which only happens when you do the thinking and find the answer yourself. You know the difference between using a resource and asking someone to do the work for you. Similarly, you must not rely on machine-learning/artificial intelligence-based coding assistants like ChatGPT, GitHub Copilot, etc. While these assistants help reach a solution, they do not help with building, developing, and improving your programming skills.
You may collaborate with individuals currently enrolled in CSCI 140 at W&M. However, you should not be receiving assistance from individuals not currently enrolled in the course, such as friends, parents, siblings, random internet strangers, etc. Collaboration with your classmates is strongly encouraged.
Evaluation of Works
Generally, we expect anything you submit to be neatly, carefully, and professionally presented. You must adhere to the file formats that we specify. Photographs of written work embedded in PDF documents will not be accepted. All submissions must be submitted online.
We will not accept anything e-mailed to any instructor or assistant for this course.
Written work should be grammatically correct. You should cite any references you consult, noting well any restrictions on permissible items for the assignment.
Programs should be well-styled. Consistent and clear naming conventions, appropriate use of comments, and efficient design are all critical attributes of quality programs. Awarded scores will reflect these attributes in addition to functionality.
It is the policy of The College of William and Mary to accommodate students with disabilities and qualifying diagnosed conditions in accordance with federal and state laws. Any student who feels they may need accommodation based on the impact of a learning, psychiatric, physical, or chronic health diagnosis should contact me privately to discuss their specific needs.
Students will also need to contact Student Accessibility Services staff at 757-221-2509 or at firstname.lastname@example.org to determine if accommodations are warranted and to obtain an official letter of accommodation. For more information, please see www.wm.edu/sas. You have our word that such matters will be handled professionally and with sensitivity. We associate no stigma with any form of disability.