Data Science 511 Syllabus

DSE511 Fall 2020
Meeting Time: MWF 6:00 PM - 6:50 PM
Location: Online - Canvas
Office Hours: online by appointment

Credit Hours: 3
Textbook: None
Prerequisites: None
Course Number: 53690

Instructors:

Jacob Hinkle - hinklejd@ornl.gov
Todd Young - youngmt1@ornl.gov
TA: None

Note: Please include DSE 511 in the subject line of emails.

Course Description

This course is an introduction to general data science skills not commonly covered in other courses. In particular, we will discuss handling data, best practices for coding, and how to operate in common computing environments like Linux workstations and shared computing clusters. You will learn communication and reproducibility fundamentals to enable your success in large collaborative data science projects.

Assessment and Grading:

Most course-related materials will be posted on Canvas. Your course grade will be based on your performance in the areas below. Late assignments incur a 10% penalty per day. If you require extra time to complete an assignment (e.g. due to a paper deadline or other research need), notify the instructors at least one week in advance.

Homework: 30% 
Projects: 70%

Individual Project:

We will assign a mid-term project involving ingestion of raw data and basic data analysis. Students will produce code and a short report, and will be judged on the documentation, reproducibility, and functionality of their solution.

Group Project:

We will split the class into groups of 3-5 individuals who will approach a more complex project involving multiple repositories and roles. Students will be expected to work with one another, contribute to each other’s code-bases, and communicate effectively in order to accomplish their task. At the end of the semester, each group will prepare a presentation outlining their approach, and we will discuss as a class what worked and what did not.

Grading

The grade system below will be used to determine your final grade. To minimize possible bias, no grades will be rounded and there will be no extra credit.

Grade   Minimum Percentage
A       90
B       80
C       70
D       60
F       0
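
As a rough illustration of how the 30%/70% weights and the letter-grade cutoffs above combine, here is a minimal Python sketch. The function names are hypothetical, and the sketch assumes each category (homework, projects) has already been reduced to a single percentage; the syllabus does not say how individual assignments are averaged within a category, and late penalties are not modeled.

```python
# Hypothetical helper names; not an official grade calculator.

def weighted_percentage(homework_pct, projects_pct):
    """Combine category percentages using the syllabus weights (30% / 70%).

    Integer weights divided by 100 avoid small floating-point errors
    that 0.30 and 0.70 could introduce right at a cutoff.
    """
    return (30 * homework_pct + 70 * projects_pct) / 100


def letter_grade(pct):
    """Map a percentage to a letter using the minimum-percentage cutoffs.

    No rounding is applied, matching the stated policy.
    """
    for letter, minimum in (("A", 90), ("B", 80), ("C", 70), ("D", 60)):
        if pct >= minimum:
            return letter
    return "F"


if __name__ == "__main__":
    total = weighted_percentage(85, 92)  # (30*85 + 70*92) / 100 = 89.9
    print(total, letter_grade(total))    # 89.9 is below the A cutoff, so B
```

Because grades are not rounded, a total of 89.9 in this sketch maps to a B rather than an A.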

Attendance and Participation

Because we will regularly hold group discussions and test material will be presented in class, attendance is expected. No make-up assessments can be scheduled for non-emergency absences. See the university’s policy on valid/excused absences.

Academic Integrity:

“An essential feature of the University of Tennessee, Knoxville is a commitment to maintaining an atmosphere of intellectual integrity and academic honesty. As a student of the university, I pledge that I will neither knowingly give nor receive any inappropriate assistance in academic work, thus affirming my own personal commitment to honor and integrity.” Cheating, plagiarism, or any other type of academic dishonesty will result in a failing grade.

Tentative Class Modules

This schedule is subject to change as we adapt to the experience level of participants.

Introduction

Course introduction/Survey
Software Management with Conda
Environments and dependencies
Intro to text editing

Exploratory Data Analysis & Visualization

Python Data Science Ecosystem
Exploratory Analysis with Notebooks
Introducing Scikit-learn and Pandas
Information Visualization
Scientific Visualization
Interactive Visualization

Machine Learning

Supervised learning
Unsupervised learning
Generalization
Managing randomness

Array Based Computing

Numeric computing in Python
Array-based data analysis 
Common linear algebra operations 
Basic GPU Computing

Data Structures

Variables and namespaces 
Functions, closures, and callables
Python classes 
Modules and organization

Effective Linux

Introduction to the GNU/Linux command line
Text editors (vim, emacs, others)
Shell scripting and Pipes
Advanced text processing

Data Storage and Access

HDF5 and other common formats
Handling Relational Data 
ORM
Basics of graph databases
ETL for graph and relational DBMS

Collaboration

Version control
Collaboration with GitHub/GitLab
Project management
Best practices for Python packaging
Testing and CI
Documentation 
Typical Data Science Workflows

Computing on Shared Resources

Remote computing with shared computing resources 
Containers

HPC

Deep Learning
Multi-node Parallelism
Cloud computing
Supercomputing

Disability Services

Any student who feels s/he may need an accommodation based on the impact of a disability should contact Student Disability Services in Dunford Hall, at 865-974-6087, or by video relay at 865-622-6566, to coordinate reasonable academic accommodations.

University Civility Statement

Civility is genuine respect and regard for others: politeness, consideration, tact, good manners, graciousness, cordiality, affability, amiability and courteousness. Civility enhances academic freedom and integrity, and is a prerequisite to the free exchange of ideas and knowledge in the learning community. Our community consists of students, faculty, staff, alumni, and campus visitors. Community members affect each other’s well-being and have a shared interest in creating and sustaining an environment where all community members and their points of view are valued and respected. Affirming the value of each member of the university community, the campus asks that all its members adhere to the principles of civility and community adopted by the campus: http://civility.utk.edu/

Your Role in Improving Teaching and Learning Through Course Assessment

At UT, it is our collective responsibility to improve the state of teaching and learning. During the semester, you may be requested to assess aspects of this course either during class or at the completion of the class. You are encouraged to respond to these various forms of assessment as a means of continuing to improve the quality of the UT learning experience.

Key Campus Resources for Students:

Center for Career Development: Career counseling and resources
Hilltopics: Campus and academic policies, procedures and standards of conduct
Student Success Center: Academic support resources
University Libraries: Library resources, databases, course reserves, and services

The instructors reserve the right to revise, alter or amend this syllabus as necessary. Students will be notified of any such changes.