GitHub Repository: YStrano/DataScience_GA
Path: blob/master/april_18/projects/unit-projects/README.md

Data Science: Unit Projects

The Data Science course has four unit projects. Each builds on skills learned in earlier units to scaffold students' learning across the entire course.

Each project includes objectives, requirements, starter code, a rubric, and suggested resources, all of which tie into the overall competencies for its unit.

Project 1

Students will create a framework to scope out their data science projects, using an iPython notebook and a UCLA admissions dataset.

This framework will provide you with a guide for exploratory data analysis and help you identify features of the dataset, including the outcome and the covariates/predictors. You'll develop a well-articulated problem statement and an analysis plan that is robust and reproducible. Using an iPython notebook, you'll state the risks and assumptions of your data and create a data dictionary.

  • Goal: Create a problem statement, analysis plan, and data dictionary in an iPython notebook.

  • Detailed Spec File
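
As a rough illustration of what a data dictionary might look like inside a notebook, here is a minimal sketch in plain Python. The column names (`admit`, `gre`, `gpa`, `rank`) and their descriptions are assumptions made for the example and are not taken from the spec file; check them against the columns in your own copy of the dataset. The helper `render_data_dictionary` is hypothetical.

```python
# Hypothetical column names and descriptions: replace them with the
# columns actually present in your copy of the dataset.
data_dictionary = {
    "admit": {"type": "binary", "role": "outcome",
              "description": "1 if admitted, 0 otherwise"},
    "gre": {"type": "continuous", "role": "predictor",
            "description": "GRE score"},
    "gpa": {"type": "continuous", "role": "predictor",
            "description": "Grade point average"},
    "rank": {"type": "categorical", "role": "predictor",
             "description": "Rank of the applicant's prior school"},
}


def render_data_dictionary(dictionary):
    """Return the data dictionary as a Markdown table string."""
    lines = [
        "| Variable | Type | Role | Description |",
        "| --- | --- | --- | --- |",
    ]
    for name, info in dictionary.items():
        lines.append(
            f"| {name} | {info['type']} | {info['role']} | {info['description']} |"
        )
    return "\n".join(lines)


# Separate the outcome from the covariates/predictors.
outcomes = [n for n, info in data_dictionary.items() if info["role"] == "outcome"]
predictors = [n for n, info in data_dictionary.items() if info["role"] == "predictor"]

print(render_data_dictionary(data_dictionary))
```

Keeping the dictionary as a data structure rather than free text makes it easy to list the outcome and predictors programmatically when writing the analysis plan.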

Project 2

Building on the framework you created in Project 1, you'll explore your dataset using descriptive statistics and basic visualizations to identify biases, limitations, and variables relevant to your model. This will lay the groundwork for your modeling approach in Project 3.

  • Goal: Explore data with visualizations and statistical analysis in an iPython notebook.

  • Detailed Spec File
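
The descriptive-statistics step could be sketched as follows, using only the Python standard library. The rows below are made-up values for illustration, not real admissions data, and the column names and the `describe` helper are assumptions for the example. Counting missing values alongside the summary statistics is one simple way to surface limitations in the data.

```python
import statistics

# Made-up illustrative rows (NOT real admissions data).
rows = [
    {"admit": 0, "gre": 380, "gpa": 3.6},
    {"admit": 1, "gre": 660, "gpa": 3.7},
    {"admit": 1, "gre": 800, "gpa": 4.0},
    {"admit": 0, "gre": 520, "gpa": None},  # a missing value
]


def describe(rows, column):
    """Summarize one numeric column, skipping missing (None) values."""
    values = [r[column] for r in rows if r[column] is not None]
    return {
        "count": len(values),
        "missing": len(rows) - len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


summary = {col: describe(rows, col) for col in ("gre", "gpa")}
for col, stats in summary.items():
    print(col, stats)
```

In a notebook, a dataframe library would typically provide the same summary in one call; the hand-written version just makes each statistic explicit.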

Project 3

Students will put into practice the framework and exploratory analysis created in Projects 1 and 2 by completing a logistic regression model on the UCLA admissions dataset. You'll have to create dummy variables and calculate the odds ratio (OR) by hand. After plotting the data, your iPython notebook writeup should also include calculations of the predicted probabilities and an interpretation of your findings.

  • Goal: Perform logistic regression on the UCLA admissions dataset, creating dummy variables and calculating probabilities.

  • Detailed Spec File
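
To make the three calculations named above concrete, here is a minimal sketch of dummy coding, an odds ratio computed by hand from a 2x2 table, and the conversion from log-odds to a predicted probability. All counts and values are invented for illustration, and the function names and the `rank` column are assumptions rather than part of the project starter code.

```python
import math


def make_dummies(values, prefix, drop_first=True):
    """Encode a categorical column as 0/1 dummy columns."""
    levels = sorted(set(values))
    if drop_first:
        levels = levels[1:]  # the lowest level becomes the reference category
    return {
        f"{prefix}_{level}": [1 if v == level else 0 for v in values]
        for level in levels
    }


def odds_ratio(a_yes, a_no, b_yes, b_no):
    """Odds ratio from a 2x2 table: odds in group A divided by odds in group B."""
    return (a_yes / a_no) / (b_yes / b_no)


def predicted_probability(log_odds):
    """Convert log-odds (the linear predictor) to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Invented categorical values for illustration.
ranks = [1, 2, 3, 4, 2, 1]
dummies = make_dummies(ranks, "rank")  # columns rank_2, rank_3, rank_4

# Invented counts: group A has 30 admitted and 20 rejected,
# group B has 20 admitted and 40 rejected.
or_value = odds_ratio(30, 20, 20, 40)  # (30/20) / (20/40)

print(dummies)
print(or_value)
print(predicted_probability(0.0))  # log-odds of 0 correspond to even odds
```

For a logistic regression with a single 0/1 predictor, the exponentiated coefficient equals the odds ratio from the corresponding 2x2 table, which is a useful check on the by-hand calculation.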

Project 4

Students will combine prior project deliverables into a final, polished iPython notebook that begins with an executive summary, states goals and success criteria, outlines methods and aims, describes risks and assumptions, explains the modeling approach using visualizations, and concludes with findings and next steps.

This project will familiarize students with the role of audience analysis and model defense in real-world data science presentations.

  • Goal: Present your findings in an iPython notebook with an executive summary, visuals, and recommendations.

  • Detailed Spec File