Welcome! We will use one principal textbook, with a variety of supplements.
The TA will be available to help with software installation before and after the class of April 4. Location uncertain. Look for her in the lobby of the auditorium.
1. Main Text DMBA: Data Mining for Business Analytics: Concepts, Techniques, and Applications in R, by Galit Shmueli et al.
This textbook is required. It’s a good survey of the topic. It uses R, which is the only language we will use in the course. This book will be referred to as DMBA.
You can get the book on Amazon for about $106, on Kindle for $90, or as an online version from a company called VitalSource for about $100. I’m not familiar with VitalSource, but they appear to have sensible study aids, and they claim “read anywhere, 100% offline.” (The VitalSource listing is: Data Mining for Business Analytics: Concepts, Techniques, and Applications in R, 1st edition, ISBN 9781118879368.) You can even rent the book from Amazon for $44, so everyone should find it worthwhile to have their own copy for the assignments, studying, etc. I have asked the UCSD bookstore to stock it. I suspect they will charge close to the list price.
The UCSD library has the e-book. It is on their ProQuest platform. The rules on that platform limit use to 3 simultaneous users, so when you are done reading a section, close the page. http://roger.ucsd.edu/record=b9688724
2. Supplemental text for first 2 weeks: DMRR
Our supplemental textbook is: Data Mining with Rattle and R: The Art of Excavating Data for Knowledge Discovery by Graham Williams.
Log in from campus (or VPN) so you don’t have to pay for it.
3. Other reference books (optional)
The UCSD library, via Springerlink.com and other sources, has a variety of good books on data mining, AI, business analytics, and so forth. All are available free, and most can be downloaded as PDFs. We will use chapters from some of these books. By week 4, check out Springerlink, which has thousands of free technical books on every computer language and applied math method you can think of. If you like physical books, hard copy versions of all Springerlink books are available for $25 each. Springer has a collection called “Use R!” of about 100 books, at http://link.springer.com/search/page/2?facet-content-type=%22Book%22&query=%22use+R%22
ISLR: For reference and to fill holes on statistical issues, I will mainly refer to An Introduction to Statistical Learning with Applications in R (ISLR) by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani.
R-Stata: If you are proficient with Stata, get this book: R for Stata Users. Similarly, there are “translation” books for other computer languages.
Some of these books and web courses are also available in Chinese. A few are available in other languages.
We will use the following software: the R statistical language; the Rattle package for R, which provides a graphical user interface (GUI); RStudio; and numerous special-purpose analysis packages that you will load via R as the course unfolds. All of this software is free.
- Install R by downloading and running the appropriate file for your operating system from CRAN. CRAN (the Comprehensive R Archive Network) is the central repository for R and its packages. R is an open-source language, so it is coordinated by volunteers.
- The Udacity course walks through installing R and RStudio.
- Get RStudio here: RStudio IDE.
- To download Rattle, look in Appendix A page 331 of the Graham Williams book, or http://rattle.togaware.com.
- Rattle requires that some other packages are also installed, and if you are just getting started with R this can be frustrating. For Windows, use http://rattle.togaware.com/rattle-install-mswindows.html . For Macintosh, use http://rattle.togaware.com/rattle-install-mac.html . Some Rattle tutorial videos are at http://rattle.togaware.com/rattle-videos.html.
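Once R itself is installed, the remaining steps above can be sketched from the R console roughly as follows. This is a minimal sketch, not the official Rattle procedure: it assumes the rattle package is available from CRAN for your version of R, and the helper name missing_packages is made up for illustration. Rattle needs other packages that install.packages() may not be able to set up on every system, so if this fails, follow the Windows or Macintosh instructions linked above.

```r
# Packages named in this syllabus; add others as the course introduces them.
course_packages <- c("rattle")

# Return the subset of `pkgs` that is not yet installed in this R library.
# (`missing_packages` is a hypothetical helper, not part of R or Rattle.)
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

to_install <- missing_packages(course_packages)

# Contact CRAN only when something is missing and you are at the console.
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)
}

# Start the Rattle GUI only in an interactive session where it is installed.
if (interactive() && requireNamespace("rattle", quietly = TRUE)) {
  library(rattle)
  rattle()
}
```

You can paste this into the RStudio console. Running it a second time should skip the download, because the check only asks CRAN for packages that are still missing.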