Data mining and statistics for decision making [electronic resource] / Stéphane Tufféry; translated by Rod Riesco.

Author
Tuffery, Stéphane
Format
Book
Language
English
Published/Created
Chichester, West Sussex ; Hoboken, NJ. : Wiley, 2011.
Description
1 online resource (717 p.)

Details

Series
Wiley series in computational statistics.
Summary note
Data mining is the process of automatically searching large volumes of data for models and patterns using computational techniques from statistics, machine learning and information theory; it is the ideal tool for such an extraction of knowledge. Data mining is usually associated with a business or an organization's need to identify trends and profiles, allowing, for example, retailers to discover patterns on which to base marketing objectives. This book looks at both classical and recent techniques of data mining, such as clustering, discriminant analysis, logistic regression, generalized l
Notes
Description based upon print version of record.
Bibliographic references
Includes bibliographical references and index.
Language note
English
Contents
  • Data Mining and Statistics for Decision Making; Contents; Preface; Foreword; Foreword from the French language edition; List of trademarks; 1 Overview of data mining; 1.1 What is data mining?; 1.2 What is data mining used for?; 1.2.1 Data mining in different sectors; 1.2.2 Data mining in different applications; 1.3 Data mining and statistics; 1.4 Data mining and information technology; 1.5 Data mining and protection of personal data; 1.6 Implementation of data mining; 2 The development of a data mining study; 2.1 Defining the aims; 2.2 Listing the existing data; 2.3 Collecting the data
  • 2.4 Exploring and preparing the data; 2.5 Population segmentation; 2.6 Drawing up and validating predictive models; 2.7 Synthesizing predictive models of different segments; 2.8 Iteration of the preceding steps; 2.9 Deploying the models; 2.10 Training the model users; 2.11 Monitoring the models; 2.12 Enriching the models; 2.13 Remarks; 2.14 Life cycle of a model; 2.15 Costs of a pilot project; 3 Data exploration and preparation; 3.1 The different types of data; 3.2 Examining the distribution of variables; 3.3 Detection of rare or missing values; 3.4 Detection of aberrant values
  • 3.5 Detection of extreme values; 3.6 Tests of normality; 3.7 Homoscedasticity and heteroscedasticity; 3.8 Detection of the most discriminating variables; 3.8.1 Qualitative, discrete or binned independent variables; 3.8.2 Continuous independent variables; 3.8.3 Details of single-factor non-parametric tests; 3.8.4 ODS and automated selection of discriminating variables; 3.9 Transformation of variables; 3.10 Choosing ranges of values of binned variables; 3.11 Creating new variables; 3.12 Detecting interactions; 3.13 Automatic variable selection; 3.14 Detection of collinearity; 3.15 Sampling
  • 3.15.1 Using sampling; 3.15.2 Random sampling methods; 4 Using commercial data; 4.1 Data used in commercial applications; 4.1.1 Data on transactions and RFM Data; 4.1.2 Data on products and contracts; 4.1.3 Lifetimes; 4.1.4 Data on channels; 4.1.5 Relational, attitudinal and psychographic data; 4.1.6 Sociodemographic data; 4.1.7 When data are unavailable; 4.1.8 Technical data; 4.2 Special data; 4.2.1 Geodemographic data; 4.2.2 Profitability; 4.3 Data used by business sector; 4.3.1 Data used in banking; 4.3.2 Data used in insurance; 4.3.3 Data used in telephony; 4.3.4 Data used in mail order
  • 5 Statistical and data mining software; 5.1 Types of data mining and statistical software; 5.2 Essential characteristics of the software; 5.2.1 Points of comparison; 5.2.2 Methods implemented; 5.2.3 Data preparation functions; 5.2.4 Other functions; 5.2.5 Technical characteristics; 5.3 The main software packages; 5.3.1 Overview; 5.3.2 IBM SPSS; 5.3.3 SAS; 5.3.4 R; 5.3.5 Some elements of the R language; 5.4 Comparison of R, SAS and IBM SPSS; 5.5 How to reduce processing time; 6 An outline of data mining methods; 6.1 Classification of the methods; 6.2 Comparison of the methods; 7 Factor analysis
  • 7.1 Principal component analysis
ISBN
  • 1-283-37397-1
  • 9786613373977
  • 0-470-97928-3
  • 0-470-97916-X
  • 0-470-97917-8
OCLC
716215543