Applicants to the M.S. in Health Informatics & Data Science are required to submit documentation that demonstrates their knowledge of programming languages. Proficiency in Python, R and SQL is required; knowledge of other programming languages is also welcome. Any programming courses or projects completed during undergraduate or graduate studies, a student GitHub page displaying code you have created, certificates from Massive Open Online Courses (e.g., edX), and other similar documents are acceptable to demonstrate proficiency.
Program Expectations
Health Informatics & Data Science students must enter the program with a certain level of proficiency in programming languages, indicated by the following expectations.
Minimum Expectations for SQL
- Connect to an RDBMS using an ODBC connection from a scripting language (Python, R)
- Execute a query and generate result sets
- Understand the basics of relational storage models
- Understand the concept of NULL
- Be able to execute two-table INNER and LEFT OUTER JOINs on a single column
- Be able to summarize data using GROUP BY with the COUNT, SUM and AVG functions
- Be able to use compound WHERE clauses to filter data
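As a rough self-check against the SQL expectations above, the sketch below runs each kind of query from Python. It makes several assumptions: the standard-library sqlite3 module and an in-memory database stand in for an RDBMS reached over ODBC (with ODBC, a driver package such as pyodbc would supply the connection instead), and the patients and visits tables, their columns, and their rows are hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for an RDBMS reached over ODBC.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical tables: patients and their clinic visits.
cur.execute("CREATE TABLE patients (patient_id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
cur.execute("CREATE TABLE visits (visit_id INTEGER PRIMARY KEY, patient_id INTEGER, cost REAL)")
cur.executemany("INSERT INTO patients VALUES (?, ?, ?)",
                [(1, "Ana", 34), (2, "Ben", 61), (3, "Cy", 47)])
cur.executemany("INSERT INTO visits VALUES (?, ?, ?)",
                [(10, 1, 120.0), (11, 1, 80.0), (12, 2, None)])  # None is stored as NULL

# INNER JOIN on a single column: only patients with at least one visit appear.
inner_rows = cur.execute("""
    SELECT p.name, v.cost
    FROM patients AS p
    INNER JOIN visits AS v ON p.patient_id = v.patient_id
    ORDER BY v.visit_id
""").fetchall()

# LEFT OUTER JOIN plus GROUP BY: every patient appears, even with no visits.
# COUNT(v.visit_id) skips NULLs; SUM and AVG return NULL when no values exist.
summary_rows = cur.execute("""
    SELECT p.name, COUNT(v.visit_id), SUM(v.cost), AVG(v.cost)
    FROM patients AS p
    LEFT OUTER JOIN visits AS v ON p.patient_id = v.patient_id
    GROUP BY p.patient_id, p.name
    ORDER BY p.patient_id
""").fetchall()

# Compound WHERE clause combining AND and OR.
filtered_rows = cur.execute("""
    SELECT name FROM patients
    WHERE age > 40 AND (name LIKE 'B%' OR name LIKE 'C%')
    ORDER BY name
""").fetchall()

# NULL is tested with IS NULL; a comparison with = NULL never matches.
is_null_rows = cur.execute("SELECT visit_id FROM visits WHERE cost IS NULL").fetchall()
equals_null_rows = cur.execute("SELECT visit_id FROM visits WHERE cost = NULL").fetchall()

conn.close()
```

In this sketch the patient with no visits is missing from the INNER JOIN result but present in the LEFT OUTER JOIN summary with a visit count of 0, which is the distinction the join expectation is meant to capture.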
Minimum Expectations for Python
- Know how to import a library
- Able to install Python on platform of choice and execute a simple script
- Basic understanding of iteration constructs such as for, while
- Understand basic variable assignment
- Familiarity with Pandas (simple grouping and summaries)
- Understand basic Python data structures such as strings, lists and dictionaries
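A short script along the following lines touches most of the Python expectations above: an import, variable assignment, the basic data structures, and both loop types. It is only an illustrative sketch. The clinic name and blood pressure readings are made-up values, and the grouping is done with the standard library rather than Pandas so that the script runs on a fresh Python install.

```python
from collections import defaultdict  # importing a name from a library

# Variable assignment with a string, a list, and dictionaries (hypothetical data).
clinic_name = "North Clinic"
readings = [
    {"patient": "A", "systolic": 118},
    {"patient": "B", "systolic": 135},
    {"patient": "A", "systolic": 122},
    {"patient": "B", "systolic": 141},
]

# for loop: collect each patient's readings into a dictionary of lists.
by_patient = defaultdict(list)
for row in readings:
    by_patient[row["patient"]].append(row["systolic"])

# Summarise each group with its mean.
mean_by_patient = {}
for patient, values in by_patient.items():
    mean_by_patient[patient] = sum(values) / len(values)

# while loop: count the readings that come before the first one at or above 140.
index = 0
while index < len(readings) and readings[index]["systolic"] < 140:
    index += 1

print(clinic_name, mean_by_patient, index)
```

With Pandas installed, the same grouped summary is usually written as `pd.DataFrame(readings).groupby("patient")["systolic"].mean()`.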
Minimum Expectations for R
- How to install packages in R
- How to navigate and use RStudio
- How to import txt and csv files into R, and save them as a data matrix or data frame
- How to filter rows and columns based on filtering criteria
- Basic understanding of iteration constructs. How to write loops in R (while-loop and for-loop)
- How to write a function in R
General Concepts
- Basics of shell/command line
- Basics of GitHub
- What is a bar graph
- What is a box plot
- What is a histogram
- Basic statistics
- Basic probability
- Basics of JavaScript: The Modern JavaScript Tutorial
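For the statistics and plotting concepts in this list, it can help to compute the numbers behind each plot by hand. The sketch below is one way to do that with Python's standard library. The data values and the bin width of 5 are arbitrary choices for illustration: the quartiles are what a box plot draws, the binned counts are what a histogram draws, and the final fraction is a simple empirical probability.

```python
import statistics
from collections import Counter

# Hypothetical sample, e.g. lengths of hospital stay in days.
values = [2, 4, 4, 5, 7, 9, 10, 12]

# Basic statistics: centre and spread.
mean_value = statistics.mean(values)
median_value = statistics.median(values)
spread = statistics.stdev(values)  # sample standard deviation

# Box plot ingredients: minimum, quartiles, maximum.
q1, q2, q3 = statistics.quantiles(values, n=4)
five_number_summary = (min(values), q1, q2, q3, max(values))

# Histogram ingredients: counts of values per bin of width 5 (bins start at 0, 5, 10).
bin_width = 5
histogram_counts = Counter((v // bin_width) * bin_width for v in values)

# Basic probability: the empirical chance that a stay lasts 9 days or more.
p_long_stay = sum(1 for v in values if v >= 9) / len(values)

print(mean_value, median_value, five_number_summary, dict(histogram_counts), p_long_stay)
```

A bar graph differs from a histogram in that its bars count categories (for example, visits per clinic) rather than ranges of a numeric variable.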
Educational Resources
Below is a list of resources to get you started with programming in Python, R and SQL. Please contact us if you have any questions as you work on your programming certificates.
- Coursera: Programming for Everybody (Getting Started with Python)
- Coursera: Data Science Specialization – Courses 1-3
  - Course 1: The Data Scientist’s Toolbox
  - Course 2: R Programming
  - Course 3: Getting and Cleaning Data
- Coursera: SQL for Data Science
- R Programming for Data Science free e-book by Dr. Roger Peng (includes links to videos) – Chapters 1-5, 9, 12-14, 16
  - Chapter 1: Preface
  - Chapter 2: History and Overview of R
  - Chapter 3: Getting Started with R
  - Chapter 4: R Nuts and Bolts
  - Chapter 5: Getting Data In and Out of R
  - Chapter 9: Subsetting R Objects
  - Chapter 12: Managing Data Frames with the dplyr package
  - Chapter 13: Control Structures
  - Chapter 14: Functions
  - Chapter 16: Loop Functions
- The Modern JavaScript Tutorial