Preparing for first week of BDA

As of today, March 29, the course is oversubscribed. Come to the first class anyway, because by the third week many people will have dropped the course, for various reasons. See the page “Should I take Big Data Analytics in 2018?” for more information.
Class will probably meet in RBC 3203, but there is a chance it will move to the Gardner Room. 

Here are the steps you need to complete by Tuesday, April 3. If you can get most of them done before the first class, even better. Most important: get the software installed. Some students will run into problems, and we don’t want to wait to discover them.

  • Installing R is straightforward and covered in many places. We will use R version 3.4.4 (Someone to Lean On), which was released on 2018-03-15. Here is a Coursera video on how to install it. The official web site for the latest versions is https://cran.r-project.org.
  • Start R and make sure it runs. Set up at least one folder/directory where your R programs will go.
  • Install Rattle. Its home page is https://rattle.togaware.com. To install Rattle, start up R, then follow the instructions on the Rattle page.
  • Download the Rattle textbook from Springerlink.com. It also has instructions for installing Rattle. https://link.springer.com/book/10.1007/978-1-4419-9890-3
  • Get the main textbook. You can try using the library’s online copies, especially for the first chapters. Instructions are on this web site (BDA2020.wordpress.com).
    • Read Chapter 1 on your own.
    • Start on Chapter 2.
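
The software steps above can be checked from inside R with something like the following. This is a rough sketch, not the official procedure: the folder name "BDA" is a made-up example, and it assumes Rattle can be installed from CRAN with install.packages(). If the instructions on the Rattle page differ, follow those instead.

```r
# Report the running R version; the course assumes 3.4.4.
cat(R.version.string, "\n")
version_ok <- getRversion() >= "3.4.4"
if (!version_ok) {
  message("R is older than 3.4.4; consider updating from https://cran.r-project.org")
}

# Create a folder for course programs. "~/BDA" is a hypothetical name.
course_dir <- file.path(path.expand("~"), "BDA")
dir.create(course_dir, showWarnings = FALSE, recursive = TRUE)

# Install and launch Rattle only when R is being used interactively,
# so the script does nothing surprising when run in batch mode.
if (interactive()) {
  if (!requireNamespace("rattle", quietly = TRUE)) {
    install.packages("rattle")
  }
  library(rattle)
  rattle()  # opens the Rattle GUI; it may ask to install extra packages
}
```

If the Rattle window does not open, note the exact error message and bring it to office hours; installation problems often depend on the operating system.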

The TA for the course will be Feiyang Chan. She will hold informal office hours before and after the Wednesday class for anyone who is having trouble installing the software. Office hours run from 10:30 to 11 AM and again from 12:30 onward, in the main classroom, RBC 3203.

What is Big Data Analytics?

This confuses students every year, and for excellent reasons. A variety of terms are thrown around without clear definitions, or clear distinctions among them. The concepts and applications are evolving so fast that there is no consensus. You should think of all of the following as closely related, and all covered by this course:

  • Data Analytics
  • Data Mining
  • Machine Learning
  • Business Analytics
  • Data Science
  • at least 5 others.

It is very helpful to look at a range of case studies where these ideas have been used successfully. Here are a few.  Some may be bogus – as we will try to discuss during the course.

Assignment: send me other examples. Either put them in the comments, or email them to me and I will post them.

Big Data At Caesars Entertainment – A One Billion Dollar Asset? – Forbes

BDA examples: Pollution and health

Popular Press Articles

Analyzing 170,000,000 NYC Taxi trips

 


Author: Roger Bohn

Professor of Technology Management, UC San Diego. Visiting Stanford Medical School. Email: Rbohn@ucsd.edu. Twitter: Roger.Bohn
