Data Mining with Rattle and R

Knowledge leads to wisdom and better understanding. Data mining builds knowledge from information, adding value to the ever-increasing stores of electronic data that abound today. Emerging from the database community in the late 1980s, data mining grew quickly to encompass researchers and technologies from machine learning, high-performance computing, visualisation, and statistics, recognising the growing opportunity to add value to data. Today, this multidisciplinary and transdisciplinary effort continues to deliver new techniques and tools for the analysis of very large collections of data. Working on databases that are now measured in the terabytes and petabytes, data mining delivers discoveries that can improve the way an organisation does business. Data mining enables companies to remain competitive in this modern, data-rich, information-poor, knowledge-hungry, and wisdom-scarce world. Data mining delivers knowledge to drive the getting of wisdom.

A wide range of techniques and algorithms are used in data mining. In performing data mining, many decisions need to be made regarding the choice of methodology, data, tools, and algorithms.

Throughout this book, we will be introduced to the basic concepts and algorithms of data mining. We use the free and open source software Rattle (Williams, 2009), built on top of the R statistical software package (R Development Core Team, 2011). As free software, the source code of Rattle and R is available to everyone, without limitation. Everyone is permitted, and indeed encouraged, to read the source code to learn, understand, verify, and extend it. R is supported by a worldwide network of some of the world’s leading statisticians and implements all of the key algorithms for data mining.