Data science for business

by Fawcett, Tom; Provost, Foster

Book Details

Book Title: Data science for business
Author: Fawcett, Tom; Provost, Foster
Publisher: O'Reilly
Publication Date: 2013
ISBN: 9781449361327
Number of Pages: 409
Language: English
Format: PDF
File Size: 12 MB
Subject: Data science

Table of Contents

  • Copyright
  • Table of Contents
  • Preface
  • Chapter 1. Introduction: Data-Analytic Thinking
      • The Ubiquity of Data Opportunities
      • Example: Hurricane Frances
      • Example: Predicting Customer Churn
      • Data Science, Engineering, and Data-Driven Decision Making
      • Data Processing and “Big Data”
      • From Big Data 1.0 to Big Data 2.0
      • Data and Data Science Capability as a Strategic Asset
      • Data-Analytic Thinking
      • This Book
      • Data Mining and Data Science, Revisited
      • Chemistry Is Not About Test Tubes: Data Science Versus the Work of the Data Scientist
      • Summary
  • Chapter 2. Business Problems and Data Science Solutions
      • From Business Problems to Data Mining Tasks
      • Supervised Versus Unsupervised Methods
      • Data Mining and Its Results
      • The Data Mining Process
      • Implications for Managing the Data Science Team
      • Other Analytics Techniques and Technologies
      • Summary
  • Chapter 3. Introduction to Predictive Modeling: From Correlation to Supervised Segmentation
      • Models, Induction, and Prediction
      • Supervised Segmentation
      • Visualizing Segmentations
      • Trees as Sets of Rules
      • Probability Estimation
      • Example: Addressing the Churn Problem with Tree Induction
      • Summary
  • Chapter 4. Fitting a Model to Data
      • Classification via Mathematical Functions
      • Regression via Mathematical Functions
      • Class Probability Estimation and Logistic “Regression”
      • Example: Logistic Regression versus Tree Induction
      • Nonlinear Functions, Support Vector Machines, and Neural Networks
      • Summary
  • Chapter 5. Overfitting and Its Avoidance
      • Generalization
      • Overfitting
      • Overfitting Examined
      • Example: Overfitting Linear Functions
      • Example: Why Is Overfitting Bad?
      • From Holdout Evaluation to Cross-Validation
      • The Churn Dataset Revisited
      • Learning Curves
      • Overfitting Avoidance and Complexity Control
      • Summary
  • Chapter 6. Similarity, Neighbors, and Clusters
      • Similarity and Distance
      • Nearest-Neighbor Reasoning
      • Some Important Technical Details Relating to Similarities and Neighbors
      • Clustering
      • Stepping Back: Solving a Business Problem Versus Data Exploration
      • Summary
  • Chapter 7. Decision Analytic Thinking I: What Is a Good Model?
      • Evaluating Classifiers
      • Generalizing Beyond Classification
      • A Key Analytical Framework: Expected Value
      • Evaluation, Baseline Performance, and Implications for Investments in Data
      • Summary
  • Chapter 8. Visualizing Model Performance
      • Ranking Instead of Classifying
      • Profit Curves
      • ROC Graphs and Curves
      • The Area Under the ROC Curve (AUC)
      • Cumulative Response and Lift Curves
      • Example: Performance Analytics for Churn Modeling
      • Summary
  • Chapter 9. Evidence and Probabilities
      • Example: Targeting Online Consumers With Advertisements
      • Combining Evidence Probabilistically
      • Applying Bayes’ Rule to Data Science
      • A Model of Evidence “Lift”
      • Example: Evidence Lifts from Facebook “Likes”
      • Summary
  • Chapter 10. Representing and Mining Text
      • Why Text Is Important
      • Why Text Is Difficult
      • Representation
      • Example: Jazz Musicians
      • The Relationship of IDF to Entropy
      • Beyond Bag of Words
      • Example: Mining News Stories to Predict Stock Price Movement
      • Summary
  • Chapter 11. Decision Analytic Thinking II: Toward Analytical Engineering
      • Targeting the Best Prospects for a Charity Mailing
      • Our Churn Example Revisited with Even More Sophistication
  • Chapter 12. Other Data Science Tasks and Techniques
      • Co-occurrences and Associations: Finding Items That Go Together
      • Profiling: Finding Typical Behavior
      • Link Prediction and Social Recommendation
      • Data Reduction, Latent Information, and Movie Recommendation
      • Bias, Variance, and Ensemble Methods
      • Data-Driven Causal Explanation and a Viral Marketing Example
      • Summary
  • Chapter 13. Data Science and Business Strategy
      • Thinking Data-Analytically, Redux
      • Achieving Competitive Advantage with Data Science
      • Sustaining Competitive Advantage with Data Science
      • Attracting and Nurturing Data Scientists and Their Teams
      • Examine Data Science Case Studies
      • Be Ready to Accept Creative Ideas from Any Source
      • Be Ready to Evaluate Proposals for Data Science Projects
      • A Firm’s Data Science Maturity
  • Chapter 14. Conclusion
      • The Fundamental Concepts of Data Science
      • What Data Can’t Do: Humans in the Loop, Revisited
      • Privacy, Ethics, and Mining Data About Individuals
      • Is There More to Data Science?
      • Final Example: From Crowd-Sourcing to Cloud-Sourcing
      • Final Words
  • Appendix A. Proposal Review Guide
  • Appendix B. Another Sample Proposal
  • Glossary
  • Bibliography
  • Index
  • About the Authors