Data mining and statistics for decision making / Stéphane Tufféry.

Author: Tufféry, Stéphane
Format: Book
Language: English
Published/Created: Chichester, West Sussex ; Hoboken, NJ : Wiley, 2011.
Description: xxiv, 689 p. : ill. ; 25 cm.

Available Online

Ebook Central Perpetual, DDA and Subscription Titles

Copies in the Library

Location	Call Number	Status	Location Service	Notes
Lewis Library - Stacks	QA76.9.D343 T84 2011

Details

Series

Wiley series in computational statistics

Summary note

"This practical guide to understanding and implementing data mining techniques discusses traditional methods--cluster analysis, factor analysis, linear regression, PLS regression, and generalized linear models--and recent methods--bagging and boosting, decision trees, neural networks, support vector machines, and genetic algorithm. The book focuses largely on credit scoring, one of the most common applications of predictive techniques, but also includes other descriptive techniques, such as customer segmentation. It also covers data mining with R, provides a comparison of SAS and SPSS, and includes an appendix presenting the necessary statistical background"-- Provided by publisher.
"Data Mining is a practical guide to understanding and implementing data mining techniques, featuring traditional methods such as cluster analysis, factor analysis, linear regression, PLS regression and generalised linear models"-- Provided by publisher.

Bibliographic references

Includes bibliographical references and index.

Contents

Machine generated contents note:
Preface
Foreword
Contents
Overview of data mining
1.1. What is data mining?
1.2. What is data mining used for?
1.3. Data mining and statistics
1.4. Data mining and information technology
1.5. Data mining and protection of personal data
1.6. Implementation of data mining
The development of a data mining study
2.1. Defining the aims
2.2. Listing the existing data
2.3. Collecting the data
2.4. Exploring and preparing the data
2.5. Population segmentation
2.6. Drawing up and validating predictive models
2.7. Synthesizing predictive models of different segments
2.8. Iteration of the preceding steps
2.9. Deploying the models
2.10. Training the model users
2.11. Monitoring the models
2.12. Enriching the models
2.13. Remarks
2.14. Life cycle of a model
2.15. Costs of a pilot project
Data exploration and preparation
3.1. The different types of data
3.2. Examining the distribution of variables
3.3. Detection of rare or missing values
3.4. Detection of aberrant values
3.5. Detection of extreme values
3.6. Tests of normality
3.7. Homoscedasticity and heteroscedasticity
3.8. Detection of the most discriminating variables
3.9. Transformation of variables
3.10. Choosing ranges of values of continuous variables
3.11. Creating new variables
3.12. Detecting interactions
3.13. Automatic variable selection
3.14. Detection of collinearity
3.15. Sampling
Using commercial data
4.1. Data used in commercial applications
4.2. Special data
4.3. Data used by business sector
Statistical and data mining software
5.1. Types of data mining and statistical software
5.2. Essential characteristics of the software
5.3. The main software packages
5.4. Comparison of R, SAS and IBM SPSS
5.5. How to reduce processing time
An outline of data mining methods
6.1. A note on terminology
6.2. Classification of the methods
6.3. Comparison of the methods
6.4. Using these methods in the business world
Factor analysis
7.1. Principal component analysis
7.2. Variants of principal component analysis
7.3. Correspondence analysis
7.4. Multiple correspondence analysis
Neural networks
8.1. General information on neural networks
8.2. Structure of a neural network
8.3. Choosing the training sample
8.4. Some empirical rules for network design
8.5. Data normalization
8.6. Learning algorithms
8.7. The main neural networks
Automatic clustering methods
9.1. Definition of clustering
9.2. Applications of clustering
9.3. Complexity of clustering
9.4. Clustering structures
9.5. Some methodological considerations
9.6. Comparison of factor analysis and clustering
9.7. Intra-class and inter-class inertias
9.8. Measurements of clustering quality
9.9. Partitioning methods
9.10. Hierarchical ascending clustering
9.11. Hybrid clustering methods
9.12. Neural clustering
9.13. Clustering by aggregation of similarities
9.14. Clustering of numeric variables
9.15. Overview of clustering methods
Finding associations
10.1. Principles
10.2. Using taxonomy
10.3. Using supplementary variables
10.4. Applications
10.5. Example of use
Classification and prediction methods
11.1. Introduction
11.2. Inductive and transductive methods
11.3. Overview of classification and prediction methods
11.4. Classification by decision tree
11.5. Prediction by decision tree
11.6. Classification by discriminant analysis
11.7. Prediction by linear regression
11.8. Classification by logistic regression
11.9. Developments in logistic regression
11.10. Bayesian methods
11.11. Classification and prediction by neural networks
11.12. Classification by support vector machines (SVMs)
11.13. Prediction by genetic algorithms
11.14. Improving the performance of a predictive model
11.15. Bootstrapping and aggregation of models
11.16. Using classification and prediction methods
An application of data mining: scoring
12.1. The different types of score
12.2. Using propensity scores and risk scores
12.3. Methodology
12.4. Implementing a strategic score
12.5. Implementing an operational score
12.6. The kinds of scoring solutions used in a business
12.7. An example of credit scoring (data preparation)
12.8. An example of credit scoring (modelling by logistic regression)
12.9. An example of credit scoring (modelling by DISQUAL discriminant analysis)
12.10. A brief history of credit scoring
Factors for success in a data mining project
13.1. The subject
13.2. The people
13.3. The data
13.4. The IT systems
13.5. The business culture
13.6. Data mining: eight common misconceptions
13.7. Return on investment
Text mining
14.1. Definition of text mining
14.2. Text sources used
14.3. Using text mining
14.4. Information retrieval
14.5. Information extraction
14.6. Multi-type data mining
Web mining
15.1. The aims of web mining
15.2. Global analyses
15.3. Individual analyses
15.4. Personal analyses
Appendix: Elements of statistics
16.1. A brief history
16.2. Elements of statistics
16.3. Statistical tables
Further reading
17.1. Statistics and data analysis
17.2. Data mining and statistical learning
17.3. Text mining
17.4. Web mining
17.5. R software
17.6. SAS software
17.7. IBM SPSS software
17.8. Websites
Index.

ISBN

9780470688298 (hardback)
0470688297 (hardback)

LCCN

2010039789

OCLC

669160723

Other versions

Data mining and statistics for decision making [electronic resource] / Stéphane Tufféry; translated by Rod Riesco.
id 99125348616806421

Princeton University Library Catalog
