## Data Science Essentials

**Started May 2, 2018**

Sorry! The enrollment period is currently closed. Please check back soon.

### Full program description

**Attention IU employees:** If university funds will be used for this course registration, do **not** proceed to PayPal and use a departmental P-card, as this is a restricted use of the P-card. Instead, please contact Erin at __ooe@iu.edu__ to initiate an internal billing document and obtain a promotion code. The promotion code will allow enrollment and entry into the course.

### About the Course

The Data Science Essentials package comprises several self-paced (asynchronous) modules created by IU faculty and staff. Individuals who purchase the package will have access to the following modules:

Introduction to R

Basics of Python

Introduction to SQL

Basics of Java

Introduction to C++

Introduction to MongoDB

Linear Algebra

Learn on your own schedule, at a time convenient for you! The package does not follow a sequential order, so you may choose which modules you complete and in what order. All modules are delivered in text format, and some include video content. Solutions (answer keys) for each assignment are located within each module. Because the modules are self-graded, no certificates or badges will be awarded.

### Required Materials

No special software is required. Please ensure you are using a supported browser such as **Chrome**, Firefox, or **Safari**.

### Meet the Instructor

**Length:** 30-120 hours, self-paced learning

**Department:** Luddy School of Informatics, Computing, and Engineering

**Credit:** 0

**Audience:** Public

### Course Objectives

## Introduction to R

The goal of this course is to teach you how to program in R and how to use R for effective data analysis. We will cover in detail the plotting systems in R as well as some of the basic principles of constructing data graphics.

## Basics of Python

In this course, we will learn the basics of Python. Because it helps to get hands-on experience with the concepts you have learned, at the end of the course you will be introduced to classification, one of the most commonly performed tasks in data mining and machine learning, and we will implement a recommendation system using the Scikit-learn package. We will also introduce you to other useful packages, such as Matplotlib, that are widely used in industry.

## Introduction to SQL

This is a course on SQL (Structured Query Language) that covers the basic concepts of databases and SQL as a language for accessing them. We will start by introducing the need for databases, building entity-relationship models, and installing MySQL, and then proceed to programming (querying) the database using SQL. At the end of the course, you will create a sample database, input data, and perform complex SQL operations on the database to retrieve the desired output.
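As a rough illustration of the workflow described above (create a sample database, input data, then query it), here is a minimal sketch. It is an assumption-laden example, not course material: it uses Python's built-in `sqlite3` module instead of MySQL so that it runs without a database server, and the table names, column names, and rows are hypothetical.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database with two small, hypothetical tables."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    cur.execute(
        "CREATE TABLE enrollments ("
        "student_id INTEGER REFERENCES students(id), "
        "module TEXT NOT NULL)"
    )
    # Input data: parameterized inserts keep values separate from the SQL text.
    cur.executemany(
        "INSERT INTO students (id, name) VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace")],
    )
    cur.executemany(
        "INSERT INTO enrollments (student_id, module) VALUES (?, ?)",
        [(1, "SQL"), (1, "R"), (2, "SQL")],
    )
    conn.commit()
    return conn


def modules_per_student(conn):
    """Join the two tables and count enrolled modules for each student."""
    return conn.execute(
        "SELECT s.name, COUNT(e.module) "
        "FROM students AS s "
        "JOIN enrollments AS e ON e.student_id = s.id "
        "GROUP BY s.name "
        "ORDER BY s.name"
    ).fetchall()
```

Calling `modules_per_student(build_sample_db())` returns one `(name, count)` row per student. The same `CREATE TABLE`, `INSERT`, and `SELECT ... JOIN ... GROUP BY` statements are standard SQL, though data types and connection setup differ in MySQL.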

## Basics of Java

The course is an interweaving of two parts. The first part is knowledge about basic programming with Java, and the second part is to use the knowledge to work on a project. The two parts will not be strictly separated. Once we are equipped with enough knowledge to take a small step, we make some progress in our project. In the process, we may encounter some confusion and wonder what could be done next. Then we learn something new that helps us resolve the problem, and make further progress. This cycle repeats.

## Introduction to C++

This course requires no prior knowledge of any programming language, and we will not cover programming languages other than C/C++. However, we will go through the C++ appendix of Eddelbuettel's "Seamless R and C++ Integration with Rcpp," which provides the C/C++ knowledge needed to follow the main text's explanation of how to use Rcpp. After gaining a solid knowledge of standard C/C++, you will have a basic idea of how to write a C/C++ program. Whenever you need to optimize your R or Python code, you will know enough to start learning Rcpp and Cython on your own.

## Introduction to MongoDB

This course is for students who are not familiar with database concepts and want to learn NoSQL databases like MongoDB. Students who already know MongoDB and want to work on a project using Python and MongoDB to develop their cross-technology skills will also benefit from this course. We will work with the Yelp dataset, which contains more than 1.5 million records and is therefore well suited to learning the MongoDB MapReduce concept.

## Linear Algebra

In this course, we will cover the basic Linear Algebra and Calculus used in Machine Learning with Python. The Machine Learning with Python course breaks the topics of machine learning into different pages. Each page uses a different level of Linear Algebra and Calculus, from easy to hard, so we organize our course into parts. By understanding each part (together with the parts before it), you will understand the Linear Algebra and Calculus used in a particular page of the Machine Learning with Python course. In particular, Parts A and B cover the single-variable differential calculus used in Parameter Estimation. Part C covers basic matrix operations, including a detailed explanation of when a matrix is invertible. Part D explains the multivariable differential calculus used in Curve Fitting and Error Function.
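To make the invertibility topic concrete, here is a minimal sketch in plain Python. It relies on a standard fact rather than on anything taken from the course: a square matrix is invertible exactly when its determinant is non-zero. The function names `det2` and `inverse2` are hypothetical, and the example is restricted to 2x2 matrices to keep it short.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    return a * d - b * c


def inverse2(m):
    """Return the inverse of a 2x2 matrix, or None if it is singular.

    The exact comparison with zero is fine for small integer examples;
    numerical code on floating-point data would normally use a tolerance.
    """
    det = det2(m)
    if det == 0:
        return None  # determinant is zero, so no inverse exists
    (a, b), (c, d) = m
    # Closed-form inverse: swap the diagonal, negate the off-diagonal,
    # and divide every entry by the determinant.
    return [[d / det, -b / det], [-c / det, a / det]]
```

For example, `inverse2([[1, 2], [2, 4]])` returns `None` because the second row is twice the first, so the determinant is 1*4 - 2*2 = 0. By contrast, `[[4, 7], [2, 6]]` has determinant 10 and is invertible.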