Data Science for Decision Makers : Enhance Your Leadership Skills with Data Science and AI Expertise / Jon Howells.

Author
Howells, Jon
Format
Book
Language
English
Edition
First edition.
Published/Created
  • Birmingham, England : Packt Publishing, [2024]
  • ©2024
Description
1 online resource (270 pages)

Details

Summary note
Bridge the gap between business and data science by learning how to interpret machine learning and AI models, manage data teams, and achieve impactful results.

Key Features:
  • Master the concepts of statistics and ML to interpret models and guide decisions
  • Identify valuable AI use cases and manage data science projects from start to finish
  • Empower top data science teams to solve complex problems and build AI products
  • Purchase of the print or Kindle book includes a free PDF eBook

Book Description:
As data science and artificial intelligence (AI) become prevalent across industries, executives without formal education in statistics and machine learning, as well as data scientists moving into leadership roles, must learn how to make informed decisions about complex models and manage data teams. This book will elevate your leadership skills by guiding you through the core concepts of data science and AI. This comprehensive guide is designed to bridge the gap between business needs and technical solutions, empowering you to make informed decisions and drive measurable value within your organization. Through practical examples and clear explanations, you'll learn how to collect and analyze structured and unstructured data, build a strong foundation in statistics and machine learning, and evaluate models confidently. By recognizing common pitfalls and valuable use cases, you'll plan data science projects effectively, from the ground up to completion. Beyond technical aspects, this book provides tools to recruit top talent, manage high-performing teams, and stay up to date with industry advancements. By the end of this book, you'll be able to characterize the data within your organization and frame business problems as data science problems.

What you will learn:
  • Discover how to interpret common statistical quantities and make data-driven decisions
  • Explore ML concepts as well as techniques in supervised, unsupervised, and reinforcement learning
  • Find out how to evaluate statistical and machine learning models
  • Understand the data science lifecycle, from development to monitoring of models in production
  • Know when to use ML, statistical modeling, or traditional BI methods
  • Manage data teams and data science projects effectively

Who this book is for:
This book is designed for executives who want to understand and apply data science methods to enhance decision-making. It is also for individuals who work with or manage data scientists and machine learning engineers, such as chief data officers (CDOs), data science managers, and technical project managers.
Bibliographic references
Includes bibliographical references and index.
Source of description
  • Description based on publisher supplied metadata and other sources.
  • Description based on print version record.
Contents
  • Cover
  • Title Page
  • Copyright and Credits
  • Contributors
  • Table of Contents
  • Preface
  • Part 1: Understanding Data Science and Its Foundations
  • Introducing Data Science
  • Data science, AI, and ML - what's the difference?
  • The mathematical and statistical underpinnings of data science
  • Statistics and data science
  • What is statistics?
  • Descriptive and inferential statistics
  • Sampling strategies
  • Probability
  • Probability distribution
  • Conditional probability
  • Describing our samples
  • Measures of central tendency
  • Measures of dispersion
  • Degrees of freedom
  • Correlation, causation, and covariance
  • The shape of data
  • Probability distributions
  • Discrete probability distributions
  • Continuous probability distributions
  • Summary
  • Characterizing and Collecting Data
  • What are the key criteria to consider when evaluating datasets?
  • Data quantity
  • Data velocity
  • Data variety
  • Data quality
  • First-, second-, and third-party data
  • First-party data - the treasure trove within
  • Second-party data - building bridges through collaboration
  • Third-party data - broadening horizons with external expertise
  • Structured, unstructured, and semi-structured data
  • Structured data
  • Unstructured data
  • Semi-structured data
  • Methods for collecting data
  • Storing and processing data
  • Cloud, on-premises, and hybrid solutions - navigating the data storage and analysis landscape
  • Cloud computing - scalable services in the cloud
  • On-premises - maintaining control within your walls
  • Hybrid - the best of both worlds?
  • Data processing
  • Exploratory Data Analysis
  • Getting started with Google Colab
  • What is Google Colab?
  • A step-by-step guide to setting up Google Colab
  • Understanding the data you have
  • EDA techniques and tools
  • Descriptive statistics
  • Data visualization
  • Histograms
  • Density curves
  • Boxplots
  • Heatmaps
  • Dimensionality reduction
  • Correlation analysis
  • Outlier detection
  • The Significance of Significance
  • The idea of testing hypotheses
  • What is a hypothesis?
  • How does hypothesis testing work?
  • Formulating null and alternative hypotheses
  • Determining the significance level
  • Understanding errors
  • Getting to grips with p-values
  • Significance tests for a population proportion - making informed decisions about proportions
  • The z-test - comparing a sample proportion to a population proportion
  • Z-test example made easy
  • Significance tests for a population average (mean)
  • Writing hypotheses for a significance test about a mean
  • Conditions for a t-test about a mean
  • When to use z or t statistics in significance tests
  • Example - calculating the t-statistic for a test about a mean
  • Using a table to estimate the p-value from the t-statistic
  • Comparing the p-value from the t-statistic to the significance level
  • One-tailed and two-tailed tests
  • Walking through a case study
  • Understanding Regression
  • How can I benefit from understanding regression?
  • Introduction to trend lines
  • Fitting a trend line to data
  • Estimating the line of best fit
  • Calculating the equations of the lines of best fit
  • Interpreting the slope of a regression line
  • Interpreting the intercept of a regression line
  • Understanding residuals
  • Evaluating the goodness of fit in least-squares regression
  • Part 2: Machine Learning - Concepts, Applications, and Pitfalls
  • Introducing Machine Learning
  • From statistics to machine learning
  • What is machine learning?
  • How does machine learning relate to statistics?
  • Why is machine learning important?
  • Customer personalization and segmentation
  • Fraud detection and security
  • Supply chain and inventory optimization
  • Predictive maintenance
  • Healthcare diagnostics and treatment
  • The different types of machine learning
  • Supervised learning
  • Unsupervised learning
  • Semi-supervised learning
  • Reinforcement learning
  • Transfer learning
  • Popular machine learning algorithms
  • Linear regression
  • Logistic regression
  • Decision trees
  • Random forests
  • Support vector machines
  • k-nearest neighbors
  • Neural networks
  • The machine learning process
  • Training a supervised machine learning model
  • Validation of a supervised machine learning model
  • Testing a supervised machine learning model
  • Evaluating machine learning models
  • Risks and limitations of machine learning
  • Overfitting and underfitting
  • Bias and variance
  • Balanced dataset
  • Models are approximations of reality
  • Machine learning on unstructured data
  • Natural language processing (NLP)
  • Computer vision
  • Deep learning and artificial intelligence
  • Artificial intelligence
  • Deep learning
  • Supervised Machine Learning
  • Defining supervised learning
  • Applications of supervised learning
  • The two types of supervised learning
  • Key factors in supervised learning
  • Steps within supervised learning
  • Data preparation - laying the foundation
  • Algorithm selection - choosing the right tool
  • Model training - learning from data
  • Model evaluation - assessing performance
  • Prediction and deployment - putting the model to work
  • Characteristics of regression and classification algorithms
  • Regression algorithms
  • Classification algorithms
  • Key considerations in supervised learning
  • Evaluation metrics
  • Consumer goods
  • Retail
  • Manufacturing
  • Unsupervised Machine Learning
  • Defining UL
  • Practical examples of UL
  • Steps in UL
  • Step 1 - Data collection
  • Step 2 - Data preprocessing
  • Step 3 - Choosing the right model
  • Step 4 - Training the model
  • Step 5 - Interpretation and evaluation
  • In summary
  • Clustering - unveiling hidden patterns in your data
  • What is clustering?
  • How does clustering work?
  • k-means clustering
  • Practical applications of clustering
  • Evaluation metrics for clustering
  • Association rule learning
  • What is association rule learning?
  • The Apriori algorithm - a practical example
  • Applications of UL
  • Market segmentation
  • Anomaly detection
  • Feature extraction
  • Interpreting and Evaluating Machine Learning Models
  • How do I know whether this model will be accurate?
  • Evaluating on test (holdout) data
  • Understanding evaluation metrics
  • Evaluating regression models
  • R-squared
  • Root mean squared error
  • Mean absolute error
  • When and how to use each metric
  • Practical evaluation strategies
  • Summarizing the evaluation of regression models
  • Evaluating classification models
  • Classification model evaluation metrics
  • Precision, recall, and F1-Score
  • Recall
  • F1-score
  • Methods for explaining machine learning models
  • Making sense of regression models - the power of coefficients
  • Decoding classification models - unveiling feature importance
  • Beyond specific models - universal insights using SHAP values
  • Common Pitfalls in Machine Learning
  • Understanding the complexity
  • Dirty data, damaged models - how data quantity and quality impact ML
  • The importance of adequate training data
  • Dealing with poor data quality
  • Conclusion
  • Overcoming overfitting and underfitting
  • Navigating training-serving skew and model drift
  • Ensuring fairness
  • Mastering overfitting and underfitting for optimal model performance
  • Overfitting - when your model is too specific
  • Underfitting - when your model is too simplistic
  • Spotting the problem
  • Training-serving skew and model drift
  • Training-serving skew
  • Model drift
  • Key takeaways
  • Bias and fairness
  • Understanding bias
  • Understanding fairness
  • Mitigating bias and ensuring fairness
  • Part 3: Leading Successful Data Science Projects and Teams
  • The Structure of a Data Science Project
  • The various types of data science projects
  • Data products
  • Reports and analytics
  • Research and methodology
  • The stages of a data product
  • Identifying use cases
  • Evaluating use cases
  • Planning the data product
  • Developing a data product
  • Data preparation and exploratory analysis
  • Model design and development
  • Evaluation and testing
  • Deploying and monitoring a data product
  • General best practices for data product development
  • Evaluating impact
  • Predictive maintenance in manufacturing
  • Fraud detection in banking
  • Customer churn prediction in telecom
  • Demand forecasting in retail
  • Personalized recommendations in e-commerce
  • Predictive maintenance in energy
  • Workforce optimization in quick service restaurants
  • Chatbot-assisted customer support
  • The Data Science Team
  • Assembling your data science team - key roles and considerations
  • Data scientists
  • Machine learning engineers
  • Data engineers
  • MLOps engineers
  • Analytics engineers
  • Software engineers (full stack, frontend, backend)
  • Product managers
  • Business analysts
  • Data storytellers/visualization experts
  • Considerations when assembling your team
  • Data science teams within larger organizations
  • The hub and spoke model
  • What is the hub and spoke model?
  • Practical applications of the hub and spoke model
  • Building a hub and spoke model
  • The art of recruitment
ISBN
1-83763-834-9
OCLC
1441716909
Statement on language in description
Princeton University Library aims to describe library materials in a manner that is respectful to the individuals and communities who create, use, and are represented in the collections we manage.