As of today, March 29, the course is oversubscribed. Come to the first class anyway, because by the third week many people will have dropped the course for various reasons. See the page "Should I take Big Data Analytics in 2018?" for more information.
Class will probably meet in RBC 3203, but there is a chance it will move to the Gardner Room.
Here are the steps you need to complete by Tuesday, April 3. If you can get most of them done before the first class, even better. Most important: get the software installed. Some students will run into problems, and we don’t want to wait to discover them.
- Installing R is straightforward and covered in many places. We will use R version 3.4.4 (Someone to Lean On), which was released on 2018-03-15. Here is a Coursera video on how to install it. The official web site for the latest versions is https://cran.r-project.org.
- Start R and make sure it runs. Set up at least one folder/directory where your R programs will go.
- Install Rattle. Its home page is https://rattle.togaware.com. To install Rattle, start R, then follow the instructions on the Rattle page.
- Hints on Installing Rattle on the Mac
- Start Rattle, make sure it works.
- Download the Rattle textbook from Springerlink.com. It also has instructions for installing Rattle. https://link.springer.com/book/10.1007/978-1-4419-9890-3
- Get the main textbook. You can try using the library’s online copies, especially for the first chapters. Instructions are on this web site (BDA2020.wordpress.com).
- Read Chapter 1 on your own.
- Start on Chapter 2.
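For the software steps in the list above, here is a minimal R sketch of what "make sure it runs" might look like. It is not the official procedure: the Rattle page remains the authority, and some platforms (the Mac in particular) may need extra graphical libraries that this sketch does not handle. The folder name `bda-r` is a made-up example, and installing with a plain `install.packages("rattle")` is an assumption about how the CRAN package is obtained.

```r
# Version the course expects, taken from the installation step above.
expected_version <- "3.4.4"

# getRversion() reports the version of R that is currently running.
running_version <- as.character(getRversion())
if (running_version != expected_version) {
  message("Running R ", running_version, "; the course uses ", expected_version)
}

# Hypothetical folder for course programs; any location you like will do.
course_dir <- file.path(path.expand("~"), "bda-r")
if (!dir.exists(course_dir)) dir.create(course_dir, recursive = TRUE)

# TRUE if the rattle package is already installed, FALSE otherwise.
rattle_installed <- requireNamespace("rattle", quietly = TRUE)

# Install and launch only in an interactive session, so that sourcing this
# file from a script does nothing beyond the checks above.
if (interactive()) {
  if (!rattle_installed) install.packages("rattle")  # assumed CRAN install
  library(rattle)
  rattle()  # opens the Rattle GUI if its graphical dependencies are present
}
```

If the last step fails, note the exact error message and bring it to office hours; that is the kind of problem we want to find early.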
The TA for the course will be Feiyang Chan. She will hold informal office hours before and after the Wednesday class for anyone who is having trouble installing the software: 10:30 to 11 AM, and again from 12:30 onward, in the main classroom, RBC 3203.
What is Big Data Analytics?
This confuses students every year, and for excellent reasons. A variety of terms are thrown around without clear definitions, or clear distinctions among them. The concepts and applications are evolving so fast that there is no consensus. You should think of all of the following as closely related, and all covered by this course:
- Data Analytics
- Data Mining
- Machine Learning
- Business Analytics
- Data Science
- at least 5 others.
It is very helpful to look at a range of case studies where these ideas have been used successfully. Here are a few. Some may be bogus – as we will try to discuss during the course.
Assignment: send me other examples. Either put them in the comments, or email them to me and I will post them.