The essentials of data science : knowledge discovery using R / Graham J. Williams.

Author
Williams, Graham J.
Format
Book
Language
English
Published/Created
  • Boca Raton, FL ; New York, NY : CRC Press, an imprint of the Taylor & Francis Group, [2017]
  • ©2017
Description
1 online resource (xviii, 322 pages)

Details

Summary note
"The Essentials of Data Science: Knowledge Discovery Using R presents the concepts of data science through a hands-on approach using free and open source software. It systematically drives an accessible journey through data analysis and machine learning to discover and share knowledge from data. Building on over thirty years' experience in teaching and practising data science, the author encourages a programming-by-example approach to ensure students and practitioners attune to the practise of data science while building their data skills. Proven frameworks are provided as reusable templates. Real world case studies then provide insight for the data scientist to swiftly adapt the templates to new tasks and datasets. The book begins by introducing data science. It then reviews R's capabilities for analysing data by writing computer programs. These programs are developed and explained step by step. From analysing and visualising data, the framework moves on to tried and tested machine learning techniques for predictive modelling and knowledge discovery. Literate programming and a consistent style are a focus throughout the book."--Provided by publisher
Bibliographic references
Includes bibliographical references and index.
Source of description
Print version record.
Contents
  • Chapter 1 Data Science
  • 1.1 Exercises
  • Chapter 2 Introducing R
  • 2.1 Tooling For R Programming
  • 2.2 Packages and Libraries
  • 2.3 Functions, Commands and Operators
  • 2.4 Pipes
  • 2.5 Getting Help
  • 2.6 Exercises
  • Chapter 3 Data Wrangling
  • 3.1 Data Ingestion
  • 3.2 Data Review
  • 3.3 Data Cleaning
  • 3.4 Variable Roles
  • 3.5 Feature Selection
  • 3.6 Missing Data
  • 3.7 Feature Creation
  • 3.8 Preparing the Metadata
  • 3.9 Preparing for Model Building
  • 3.10 Save the Dataset
  • 3.11 A Template for Data Preparation
  • 3.12 Exercises
  • Chapter 4 Visualising Data
  • 4.1 Preparing the Dataset
  • 4.2 Scatter Plot
  • 4.3 Bar Chart
  • 4.4 Saving Plots to File
  • 4.5 Adding Spice to the Bar Chart
  • 4.6 Alternative Bar Charts
  • 4.7 Box Plots
  • 4.8 Exercises
  • Chapter 5 Case Study: Australian Ports
  • 5.1 Data Ingestion
  • 5.2 Bar Chart: Value/Weight of Sea Trade
  • 5.3 Scatter Plot: Throughput versus Annual Growth
  • 5.4 Combined Plots: Port Calls
  • 5.5 Further Plots
  • 5.6 Exercises
  • Chapter 6 Case Study: Web Analytics
  • 6.1 Sourcing Data from CKAN
  • 6.2 Browser Data
  • 6.3 Entry Pages
  • 6.4 Exercises
  • Chapter 7 A Pattern for Predictive Modelling
  • 7.1 Loading the Dataset
  • 7.2 Building a Decision Tree Model
  • 7.3 Model Performance
  • 7.4 Evaluating Model Generality
  • 7.5 Model Tuning
  • 7.6 Comparison of Performance Measures
  • 7.7 Save the Model to File
  • 7.8 A Template for Predictive Modelling
  • 7.9 Exercises
  • Chapter 8 Ensemble of Predictive Models
  • 8.1 Loading the Dataset
  • 8.2 Random Forest
  • 8.3 Extreme Gradient Boosting
  • 8.4 Exercises
  • Chapter 9 Writing Functions in R
  • 9.1 Model Evaluation
  • 9.2 Creating a Function
  • 9.3 Function for ROC Curves
  • 9.4 Exercises
  • Chapter 10 Literate Data Science
  • 10.1 Basic LaTeX Template
  • 10.2 A Template for our Narrative
  • 10.3 Including R Commands
  • 10.4 Inline R Code
  • 10.5 Formatting Tables Using Kable
  • 10.6 Formatting Tables Using XTable
  • 10.7 Including Figures
  • 10.8 Add a Caption and Label
  • 10.9 Knitr Options
  • 10.10 Exercises
  • Chapter 11 R with Style
  • 11.1 Why We Should Care
  • 11.2 Naming
  • 11.3 Comments
  • 11.4 Layout
  • 11.5 Functions
  • 11.6 Assignment
  • 11.7 Miscellaneous
  • 11.8 Exercises
  • Bibliography
  • Index.
ISBN
  • 9781498740012 (electronic bk.)
  • 1498740014 (electronic bk.)
  • 9781351647496 (electronic bk.)
  • 1351647490 (electronic bk.)
  • 9781315151458 (electronic bk.)
  • 1315151456 (electronic bk.)
OCLC
999643978
DOI
  • 10.1201/9781315151458