Georsgis Blog

↑ Grab this Headline Animator

Monday 11 March 2019

Data Mining Practical Machine Learning Tools and Techniques


Data Mining Practical Machine Learning Tools and Techniques

Foreword

Technology now allows us to capture and store vast quantities of data. Finding patterns, trends, and anomalies in these datasets, and summarizing them with simple quantitative models, is one of the grand challenges of the information age—turning data into information and turning information into knowledge. There has been stunning progress in data mining and machine learning.

The synthesis of statistics, machine learning, information theory, and computing has created a solid science, with a firm mathematical base, and with very powerful tools.
Witten and Frank present much of this progress in this book and in the companion implementation of the key algorithms. As such, this is a milestone in the synthesis of data mining, data analysis, information theory, and machine learning. If you have not been following this field for the last decade, this is a great way to catch up on this exciting progress. If you have, then Witten and Frank’s presentation and the companion open-source workbench, called Weka, will be a useful addition to your toolkit. 

They present the basic theory of automatically extracting models from data, and then validating those models. The book does an excellent job of explaining the various models (decision trees, association rules, linear models, clustering, Bayes nets, neural nets) and how to apply them in practice. With this basis, they then walk through the steps and pitfalls of various approaches. 

They describe
how to safely scrub datasets, how to build models, and how to evaluate a model’s predictive quality. Most of the book is tutorial, but Part II broadly describes how commercial systems work and gives a tour of the publicly available data mining workbench that the authors provide through a website. This Weka workbench has a graphical user interface that leads you through data mining tasks and has excellent data visualization tools that help understand the models. It is a great companion to the text and a useful and popular tool in its own right.

This book presents this new discipline in a very accessible form: as a text both to train the next generation of practitioners and researchers and to inform lifelong learners like myself. Witten and Frank have a passion for simple and elegant solutions. They approach each topic with this mindset, grounding all concepts in concrete examples, and urging the reader to consider the simple techniques first, and then progress to the more sophisticated ones if the simple ones prove inadequate.
If you are interested in databases, and have not been following the machine learning field, this book is a great way to catch up on this exciting progress. 

If you have data that you want to analyze and understand, this book and the associated Weka toolkit are an excellent way to start.

Download Link

No comments:

Contact us

Name

Email *

Message *

Follow us on Facebook and YouTube