Professor Hossein Saiedian: EECS 331: Data Science

Course title

EECS 331: Introduction to Data Science -- Spring 2024
Tuesdays and Thursdays, 11:00 am - 12:15 pm, LEEP2 G415
Teaching website: people.eecs.ku.edu/~saiedian/Teaching


Professor Hossein Saiedian
Office: Eaton Hall 3012
Telephone: 785-864-8812
E-Mail: saiedian AT ku.edu
WWW: people.eecs.ku.edu/~saiedian
Office Hours: Tuesdays and Thursdays, 1:00-2:00 PM (and by appointment)

Course description

This course covers the core concepts in data science via programming. Topics include data lifecycle activities such as data collection, data analysis and integration, data cleaning and wrangling, and data visualization. Data science concepts such as classification, KNN and linear regression analysis, clustering, and statistical inference will be presented. The course includes practical case studies and problem solving in science, engineering, business, medicine, and social sciences. Programming tools include Python, R, SQL, and Unix shell. Prerequisite: Programming (EECS138 or EECS168) and statistics (Math 365 or a comparable stats course).

Recommended textbook

The first textbook listed below is a very popular data science textbook; its Python version (not yet published) will be the primary course textbook resource:

Tiffany Timbers, Trevor Campbell, and Melissa Lee,
Data Science: A First Introduction,
Taylor and Francis, 2022.

Chirag Shah,
A Hands-On Introduction to Data Science,
Cambridge, 2020.

Joel Grus,
Data Science from Scratch,
O'Reilly, 2019.

Students are responsible for lecture notes, reading assignments, and items distributed during the classroom sessions. Important reading materials and lecture slides will be placed on Canvas.

Lecture notes


Project data resources

Evaluation criteria (subject to revision)

Students will be evaluated as follows:

Exams and quizzes: 60%
Assignments (labs, homework): 40%

Grading scale:
A = 90%..100%
B = 80%..89%
C = 70%..79%
D = 60%..69%
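
As an illustration of how the weights and grading scale above combine, here is a minimal Python sketch. The function names are hypothetical, and two points are assumptions not stated in the syllabus: a score below 60% is treated as an F, and no rounding is applied, so a score such as 89.5% falls in the B band. The instructor's actual computation may differ.

```python
# Weights and cutoffs are taken from the syllabus; the helper names are hypothetical.
EXAM_WEIGHT = 0.60        # exams and quizzes
ASSIGNMENT_WEIGHT = 0.40  # labs and homework

# Lower bound of each letter band, highest first.
CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def course_percentage(exam_pct: float, assignment_pct: float) -> float:
    """Weighted course percentage from the two component averages (0-100)."""
    return EXAM_WEIGHT * exam_pct + ASSIGNMENT_WEIGHT * assignment_pct


def letter_grade(percentage: float) -> str:
    """Map a percentage to a letter using the lower bound of each band.

    Assumptions (not stated in the syllabus): anything below 60 is an F,
    and no rounding is applied.
    """
    for lower_bound, letter in CUTOFFS:
        if percentage >= lower_bound:
            return letter
    return "F"


if __name__ == "__main__":
    pct = course_percentage(exam_pct=85.0, assignment_pct=95.0)
    print(f"{pct:.1f}% -> {letter_grade(pct)}")
```

Because exams and quizzes carry 60% of the weight, a strong assignment average only partly offsets a weaker exam average.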

Exams and quizzes will be closed book and closed notes, and will be taken on Canvas. Always bring a device that allows you to connect to Canvas and take the exam or quiz. No device may be used other than the one used to take the exam or quiz.

  • On exams, quizzes, and lab assignments: For questions with multiple parts, like "name three parts of...", avoid the single-paragraph trap! Treat each part separately. This prevents muddled answers and makes your understanding clear. Use clear identifiers, like "(1)", "(2)", "(3)", to easily navigate your response. Remember, for answers exceeding three parts, only the first three will be assessed.

  • On exams, quizzes, and lab assignments: Show your class engagement! Responses pulled from generative AI, Wikipedia, or similar online sources won't win you points. We seek responses rooted in what was discussed and presented in class and captured in your lecture notes. This demonstrates your attentiveness, study effort, and ability to connect concepts from class to your answers. We will be rewarding students who attend class, take notes, study their own notes, and provide responses that connect to the classroom discussions and contents.
  • On exams, quizzes, and lab assignments: Technical explanations and specific details are your allies! Vague and incomplete answers leave us guessing about your grasp of the concepts. To award full credit, we need confident evidence of your understanding. So, strive for accuracy, comprehensiveness, and technical soundness, and avoid vague responses. Remember, an answer that's almost there but lacks precision or strays from the target won't earn full marks.

All written work must be typeset and submitted on Canvas.

Tentative weekly schedule (re-visit for updates)

Course introduction

What is data science: Big data characteristics
Data science relation to other fields
    Computer science
    Business analytics
    Social sciences

All lecture notes (slides) are on Canvas

Data life cycle

Structured vs semi-structured vs unstructured data

Data collection and pre-processing
    Data cleaning
    Data integration
    Data transformation
    Data discretization

All lecture notes (slides) are on Canvas

Data collection and pre-processing (continued with a case study)
    Data cleaning
    Data integration
    Data transformation
    Data discretization

A quick introduction to R and tidyverse (via a hands-on case study)

All lecture notes (slides) are on Canvas

Data science techniques
    Descriptive analysis
    Exploratory analysis
    Predictive analytics
    Inferential analysis
    Causal analysis
    Mechanistic analysis

All lecture notes (slides) are on Canvas

Data science techniques (continued)
    Descriptive analysis
    Exploratory analysis
    Predictive analytics
    Inferential analysis
    Causal analysis
    Mechanistic analysis

Thursday February 16: Exam 1

All lecture notes (slides) are on Canvas

Tools and skills for data science
    Reading data in varying formats and from varying sources
    Data cleaning and wrangling

All lecture notes (slides) are on Canvas

Tools and skills for data science (continued)
   Data science visualization

A preliminary introduction to ML
Why modeling, optimization models
Domain analysis/understanding
Cancer basics (for the case study)

Tools and skills for data science (continued)
     Supervised learning
     Classification (training and predicting)
     Advanced classification (evaluation and tuning)

All lecture notes (slides) are on Canvas

No classes - spring break

Tools and skills for data science (continued)
     Classification: K-nearest neighbors
     Classification: linear regression

Thursday March 23: Exam 2

All lecture notes (slides) are on Canvas

Tools and skills for data science (continued)
     Classification: evaluation and tuning
     Regression: linear regression

All lecture notes (slides) are on Canvas

Tools and skills for data science (continued)
     Regression: linear regression
     Unsupervised learning
     Clustering techniques and analysis

All lecture notes (slides) are on Canvas

Conceptual and logical data modeling
Database and SQL processing for data science

All lecture notes (slides) are on Canvas

Statistical inference
Unix tools for data science
Data science with R

All lecture notes (slides) are on Canvas

Data collection evaluation
    Comparing models
    A/B testing
    Cross validation

All lecture notes (slides) are on Canvas

Model evaluation
    Classification evaluation measures
    Sensitivity and specificity
    Methods for model evaluation
    An application of model evaluation
Emerging trends in data science

All lecture notes (slides) are on Canvas

Comprehensive final: May 10, 10:30 am - 1:00 pm

Common policies

Attendance. Attendance is important and required. Throughout the semester, attendance may be taken at random; every three absences (in classroom or lab) will result in a letter-grade drop (which will show when the final grade is posted). Furthermore, if a student misses a class session, he or she will be entirely responsible for learning the materials missed without the benefit of a private lecture on the instructor's part. The student will also be responsible for finding out what assignments may have been given and when they are due, as well as any updates to the term project, the schedule, or the course syllabus.
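
To make the arithmetic of the attendance rule concrete, here is a minimal Python sketch of one possible reading of it. The function name is hypothetical, and the sketch assumes (these points are not stated in the syllabus) that only unexcused absences are counted, that each full group of three absences lowers the grade by one letter, and that the grade cannot fall below F. The instructor's actual application of the policy may differ.

```python
# One possible reading of the attendance rule; names here are hypothetical.
LETTERS = ["A", "B", "C", "D", "F"]  # ordered best to worst
ABSENCES_PER_DROP = 3                # "every three absences" from the policy


def apply_attendance_penalty(letter: str, absences: int) -> str:
    """Lower a letter grade by one step for each full group of three absences.

    Assumptions: `absences` counts unexcused absences only, and the
    result is capped at F.
    """
    drops = absences // ABSENCES_PER_DROP
    index = min(LETTERS.index(letter) + drops, len(LETTERS) - 1)
    return LETTERS[index]


if __name__ == "__main__":
    for absences in (0, 2, 3, 6):
        print(absences, apply_attendance_penalty("A", absences))
```

Under this reading, one or two absences leave the grade unchanged, while the third, sixth, and ninth absences each cost one letter.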

Late-work, makeup policy. No late work will be accepted. No makeup option (for a lab, quiz, or exam) will be provided.

Exceptions will be made for excusable absences.

Verification (documentation) of an excusable absence will be required. Excusable absence requests must be submitted in advance and approved by the instructor, unless it is an emergency. Verification documents must be attached to the request.

Make-up quizzes and exams for excused absences must be completed before the following session, when the quiz/exam key becomes public. Make-up work for an excused lab absence should be completed within one week.

Technical problems. If you experience technical problems with your EECS account, the EECS servers, or the lab equipment, please submit a support request at: https://tsc.ku.edu/request-support-engineering-tsc.

Inside classroom policy. Students are expected to come to class on time, be attentive and engaged, conduct themselves professionally, and avoid anything that could be distracting or detrimental to other students' learning or to the instructor's presentations. Profanity and swearing are not allowed.

Students are expected to actively participate in all classroom presentations and discussions, ask questions, and regularly make contributions such as offering comments, responding with good answers, and providing feedback.

Canvas announcements. Announcements is a Canvas tool to post important information and updates to all members of a course. It is your responsibility to regularly check your Canvas account for such announcements (students may also receive an email notification when a new announcement is posted).

Email communications. Email communication is fast, flexible, and effective. You have an @ku.edu email account and you are expected to check it regularly. Important information will also be communicated via email.

You are a student registered in a course offered by the School of Engineering at the University of Kansas, a top regional and nationally ranked institution. Your communications, especially written communications (composition, grammar, spelling, punctuation, etc.), must reflect that status. Please follow these email guidelines and etiquette.

Send emails in text-only format. All classroom assignments, labs, or projects should be typeset and submitted on Canvas. Other documents (e.g., documents for an excusable absence) should be emailed in PDF or a well-known image format (e.g., JPG or PNG). See the Guidelines for submitting electronic documents.

Grade and absence clarification or correction. If you believe your grade on an assignment, lab, quiz, or exam is incorrect, you should formally submit a grade appeal via email to the instructor within one week of receiving the graded work. Similarly, if you have an excusable absence and did not provide documentation prior to the absence, submit the relevant documentation within one week of the absence. Failure to address concerns within these timeframes will result in the decision becoming final. This timeline ensures timely resolution and fairness for all parties involved.

Late exam-taking policy. If a student has to take an exam or a quiz at a later time (due to an excused and verified absence), he or she will be asked to make the following statement: "I understand that I have been granted the opportunity to take this exam or quiz on [date of rescheduled exam] due to an excused absence from the original exam on [date of original exam]. In making this arrangement, I affirm that I did not and will not, by any means (in writing, speaking, or through digital communications), obtain any information about the exam content or details from anyone who has taken it at the original time. I understand that violating this pledge may result in disciplinary action, including receiving a failing grade on the exam."

Cell phone policy. Cell phones should be turned off before coming to the classroom. Cell phone use for the purposes of texting, email, or social media should be avoided. Earphones for music are OK during lab work or individualized problem solving, as long as the volume allows you to hear announcements. Cell phone or other cameras may be used to photograph projects and the whiteboard, but avoid shots that include the presenter or other students.

Laptop/electronic device policy. The use of laptops, tablets or similar devices is common for taking notes but turn off audio and avoid any possible uses that could cause distraction for others (e.g., Web surfing or social media visits).

Incomplete grade policy. "Incomplete (I) grades are used to note, temporarily, that students have been unable to complete a portion of the required course work during that semester due to circumstances beyond their control. Incomplete work must be completed and assigned an A-F or S/U grade within the time period prescribed by the course instructor. After one calendar year from the original grade due date, an Incomplete (I) grade will automatically convert to a grade of F or U, or the lapsed grade assigned by the course instructor."

Accommodations for students with disabilities. The University of Kansas is committed to providing equal opportunity for participation in all programs, services and activities. Requests for special accommodations may be made through KU Student Access Services.

KU's diversity policy statement. As a premier international research university, the University of Kansas is committed to an open, diverse and inclusive learning and working environment that nurtures the growth and development of all. KU holds steadfast in the belief that an array of values, interests, experiences, and intellectual and cultural viewpoints enrich learning and our workplace. The promotion of and support for a diverse and inclusive community of mutual respect require the engagement of the entire university.

The University of Kansas prohibits discrimination on the basis of race, color, ethnicity, religion, sex, national origin, age, ancestry, disability, status as a veteran, sexual orientation, marital status, parental status, gender identity, gender expression, and genetic information in the University's programs and activities. Retaliation is also prohibited by University policy. If you have questions about filing a report of discrimination, contact the Office of Civil Rights and Title IX at civilrights@ku.edu.

KU's sexual harassment policy. The University of Kansas prohibits sexual harassment and is committed to preventing, correcting, and disciplining incidents of unlawful harassment, including sexual harassment and sexual assault. Sexual harassment, sexual violence, and a hostile environment because of sex are forms of sex discrimination and should be reported. (“Sexual Harassment” means behavior, including physical contact, advances, and comments in person, through an intermediary, and/or via phone, text message, email, social media, or other electronic medium, that is unwelcome; based on sex or gender stereotypes; and is so severe, pervasive and objectively offensive that it has the purpose or effect of substantially interfering with a person’s academic performance, employment or equal opportunity to participate in or benefit from University programs or activities or by creating an intimidating, hostile or offensive working or educational environment.)

Under Title IX of the Education Amendments of 1972, harassment based on sex, including sexual assault, stalking, domestic and dating violence, and harassment or discrimination based on the individual’s sexual orientation, gender identity, gender expression, and pregnancy or related conditions, is prohibited. If a student would like to file a complaint for Title IX discrimination or has any questions, please contact KU’s Title IX Coordinator (Lauren Jones McKown, Associate Vice Chancellor for Civil Rights and Title IX, Dole Human Development Center, 1000 Sunnyside Ave, Suite 1082, Lawrence, KS 66045, civilrights@ku.edu, 785.864.6414).

Mandatory reporter statement. The University of Kansas has decided that all employees, with few exceptions, are responsible employees or mandatory reporters who must report incidents of discrimination, harassment, and sexual violence that they learn of in their employment at KU to the Office of Civil Rights and Title IX. This includes faculty members. As such, if you share information about discrimination, harassment, or sexual violence with me, I will have to relay that information to the Office of Civil Rights and Title IX. I truly value your trust in me to share that information and I want to be upfront about my requirement as a mandatory reporter. If you are interested in contacting KU’s confidential resources (those who do not have to make disclosures to OCRTIX), there are: the Care Coordinator, Melissa Foree; CAPS therapists; Watkins Health Care Providers; and the Ombuds Office.

Commercial note-taking ventures. Pursuant to the University of Kansas’ Policy on Commercial Note-Taking Ventures, commercial note-taking is not permitted in this course. Lecture notes and course materials may be taken for personal use, for the purpose of mastering the course material, and may not be sold to any person or entity in any form. Any student engaged in or contributing to the commercial exchange of notes or course materials will be subject to discipline, including academic misconduct charges, in accordance with University policy. Please note: note-taking provided by a student volunteer for a student with a disability, as a reasonable accommodation under the ADA, is not the same as commercial note-taking and is not covered under this policy. In fact, we often have students needing help with note taking (including in this very course). If you are able to take well-organized and detailed notes, have legible handwriting, and regularly attend the class, your help will be greatly appreciated and will be recognized with a KU certificate. Please visit with me.

Concealed handguns. Individuals who choose to carry concealed handguns are solely responsible to do so in a safe and secure manner in strict conformity with state and federal laws and KU weapons policy. Safety measures outlined in the KU weapons policy specify that a concealed handgun:

  • Must be under the constant control of the carrier.
  • Must be out of view, concealed either on the body of the carrier or in a backpack, purse, or bag that remains under the carrier's custody and control.
  • Must be in a holster that covers the trigger area and secures any external hammer in an un-cocked position.
  • Must have the safety on, and have no round in the chamber.

Suggested readings. Textbooks are excellent survey and tutorial resources. The most up-to-date material on topics discussed in class can be found in technical journals and recent conference proceedings. Students should develop a habit of regularly browsing IEEE Software, IEEE Computer, Communications of the ACM, IEEE Security & Privacy, IEEE Network, IEEE IT Professional, IEEE Cloud Computing, and similar magazines.

LLM and generative AI tools

You may use ChatGPT and other generative AI tools in some instances. That includes generating ideas, outlining steps in a project, finding sources, getting feedback on your writing, and overcoming obstacles on papers and projects. Using those tools to generate all or most of an assignment, though, will be considered academic misconduct. If you are ever in doubt, ask. In your course work, you will be asked to explain in a reflection statement how you used any generative AI tools.

How to use generative AI ethically. Using ChatGPT and similar tools to avoid the intellectual work of your classes is wrong. If you simply copy what a chatbot creates and turn that in as your own, it will be considered academic misconduct. It is the same as copying and pasting from a webpage or having a friend write a paper for you. All of the assignments in this class are intended to help you develop intellectually and professionally. Learning can be challenging, and working through those challenges – and failures – will help you be more successful in this course, in other courses, and in your profession.

Instructors and students everywhere are trying to negotiate the boundaries of generative AI in learning. Many areas are murky. Here’s how I see things: The work you submit in this class has your name on it and should represent your intellectual efforts. Using generative AI for assistance is much like working with a partner. You exchange ideas and offer feedback to each other, with a goal of improving each other’s work. If you ask your partner to do the assignment for you because you are tired or sick, you deserve no credit because the assignment with your name on it is a lie. The same holds if you have generative AI do the work. So always ask yourself: What intellectual work have I done? Is this really my work?

Create reflection statements. Each GAI-assisted assignment you turn in must include a reflection on how you went about the work. That includes the steps you took to find information, the way you organized your information, and the steps you took in creating the assignment. If you used generative AI, explain which tool you used, how you used it, and to what extent. Also explain what you learned from the process. Were AI tools helpful? If so, what approaches do you want to use next time and what approaches should you avoid? Did the use of generative AI make the assignment feel less like your own work? Or were you able to edit and use other strategies to shape the work into your own?

Academic integrity

The University of Kansas, the School of Engineering, and in particular, the Department of Electrical Engineering & Computer Science (EECS) have zero tolerance for academic dishonesty and academic misconduct.

The institutional definitions and consequences of institutional academic integrity policies will be used. Academic dishonesty includes any form of plagiarism (cheating) as well as "giving or receiving of unauthorized aid on examinations or in the preparation of assignments or reports, knowingly misrepresenting the source of any academic work, falsification of research results, and plagiarizing of another's work." The absolute minimum consequence of an academic integrity violation will be a zero for the item in question (e.g., a lab, an assignment, an exam or quiz), but depending on the severity, the consequence may be a lower grade, or simply an F for the course, and the case may be forwarded to the SoE committee for additional penalties and disciplinary measures.

LMS features. To further facilitate academic integrity, the following features of the Canvas learning management system (LMS) will be utilized:

  • Each exam and quiz will be conducted synchronously in the classroom and will be scheduled during a regular weekly session.
  • Each exam or quiz will have a limited and narrow time (set via a timer) to be completed and each person will get only one chance to do it.
  • Each exam or quiz will have about the same amount of time as a paper exam or quiz, plus some additional LMS overhead time.
  • Exam and quiz questions will be randomly numbered for each person. Furthermore, the multiple-choice, matching, and similar questions will have randomized choice selections. As a result, a choice like "All of the above" or "None of the above" may not be the last choice, and it refers to the other choices.
  • Exam and quiz questions will be displayed one at a time, with no backtracking.
  • The "originality checking" mechanisms of the LMS will be utilized for exams as well as assignments. The LMS is able to check written responses against online databases of previously published works and trace sentences or clauses to other sources.
  • LMS features to prohibit printing and copying/pasting of exams will be turned on.
  • The LMS lockdown feature will be employed.

The ACM's and IEEE's codes of ethics. As IT and computing professionals and/or as engineers, you should be familiar with the ACM's (IT, computing) and IEEE's (engineering) codes of ethics and apply them during your academic and professional careers. These are lifelong commitments to integrity and professional conduct.

We will review these during the first class session, but you are strongly encouraged to review these codes in detail:

From the ACM's preamble: Computing professionals' actions change the world. To act responsibly, they should reflect upon the wider impacts of their work, consistently supporting the public good. The ACM Code of Ethics and Professional Conduct ("the Code") expresses the conscience of the profession.

From the IEEE's preamble: We, the members of the IEEE, in recognition of the importance of our technologies in affecting the quality of life throughout the world, and in accepting a personal obligation to our profession, its members and the communities we serve, do hereby commit ourselves to the highest ethical and professional conduct and agree.

The School of Engineering Statement on EdTech. "With the switch to online teaching as a result of the Coronavirus pandemic, professors and instructors at the KU School of Engineering are aware that some students are actively posting assignments, laboratory, and exam questions and responses to EdTech services (e.g., Chegg) even during exam time frames.

Keep in mind that when a person signs up to participate by either uploading, and/or downloading, and/or using posted material from these sites, the “terms of service” that are agreed to do not protect the person when KU and/or the School of Engineering decide to conduct investigations related to academic misconduct (e.g., plagiarism and/or cheating).

In fact, EdTech services, like Chegg, retain contact information of students who use their services and will release that information, which is traceable, upon request. Using these services constitutes academic misconduct, which is not tolerated in the School of Engineering. It violates Article 3r, Section 6 of its Rules & Regulations, and may lead to grades of F in compromised course(s), transcript citations of academic misconduct, and expulsion from the University of Kansas.

If unsure about assignments, it is important that students use the allowable available resources, such as instructor office hours, graduate teaching assistants, and/or tutoring. The School of Engineering wants students to be successful; cheating is not the way to attain that success."