The only firm requirement for taking Big Data Analytics is knowledge of statistics through linear regression. Knowledge of the R programming language is not required or even expected.
The workload is severe. There are weekly problem sets, which have to be run on your own computer. Most important, the course requires an ambitious final project, which you define for yourself. Graduate students learn best from each other, and almost all work can be done in pairs (teams of two).
I do recommend a short introduction to R via Coursera; see the page on what to do in the first week. The TA will lead sessions specifically about R. But BDA is not a programming course, and unless you make an extra effort, you won’t become proficient in R as a general programming language. You will learn to use R code that other people have written, and glue it together for your own needs.
Other useful background:
- Knowledge of probability, such as decision trees.
- Some empirical area that you are informed about and would like to explore. It does not have to be quantitative, but you must be able to find real data about it.
- Knowledge of regression beyond straight linear regression, including binary outcomes, discrete variables, panel data, etc.
- Experience getting your hands dirty with data. Real data does not arrive in neatly formatted matrices. It has missing values, ambiguous definitions, outright errors, internally conflicting numbers, and so forth.
- Some programming experience, in any language. Programming is a mental discipline, because the computer hates you, and wants you to mess up. If you have never programmed before, this can be frustrating.
Nobody will have all of these pieces (with the possible exception of the few PhD students who occasionally take the course). You will have lots of opportunities to learn about them in the course.
All this will be discussed further in the first meeting.