
80 Best Data Science Books That Are Worth Reading

Data science is probably one of the most popular fields nowadays. I believe many people are looking for a way into the industry, and I happened to read an article listing some great data science books that may be helpful to you. I have summarized it in this article and added a brief introduction to each book, so you can choose the ones you’d like to read. Some of the books can be found online, and I've included the links. For most of the others, you will probably need to look on Amazon.

Part I: Data Scientist Core Skills

  • Data Science
  • Math
  • Probability and Statistics
  • Machine Learning
  • Data Mining
  • SQL
  • R
  • Python
  • Data Scientist Interview
  • Algorithm
  • Handbook
  • Web Scraping and Data Wrangling
  • Data Visualization and Storytelling
  • A/B Testing

Part II: Data Science Advanced Skills

  • Neural Network and Deep Learning
  • Information Theory
  • Causal Inference
  • Sampling
  • Convex
  • Growth Analytics
  • Text Mining and Natural Language Processing
  • Anomaly Detection
  • Recommender Systems
  • Social Network Analysis
  • Time Series Analysis and Forecasting
  • Reinforcement Learning and Artificial Intelligence

Part III: Leisure Reading

Part I: Data Scientist Core Skills

Data Science

1. The Data Science Handbook: Advice and Insights from 25 Amazing Data Scientists


In this handbook, 25 industry experts share their advice. It is very helpful for beginners.

 

2. Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking


Written by renowned data science experts Foster Provost and Tom Fawcett, Data Science for Business introduces the fundamental principles of data science, and walks you through the "data-analytic thinking" necessary for extracting useful knowledge and business value from the data you collect. This guide also helps you understand the many data-mining techniques in use today.

 

3. Doing Data Science: Straight Talk from the Frontline


In many of these chapter-long lectures, data scientists from companies such as Google, Microsoft, and eBay share new algorithms, methods, and models by presenting case studies and the code they use. If you’re familiar with linear algebra, probability, and statistics, and have programming experience, this book is an ideal introduction to data science.

 

Math

4. Multivariate Calculus


https://ocw.mit.edu/courses/mathematics/18-02sc-multivariable-calculus-fall-2010/index.htm

 

5. Linear Algebra


https://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm

 

Probability and Statistics

6. Introduction to Probability, Statistics, and Random Processes


This book introduces students to probability, statistics, and stochastic processes. It can be used by both students and practitioners in engineering, various sciences, finance, and other related fields. It provides a clear and intuitive approach to these topics while maintaining mathematical accuracy. You can also find courses and videos online.
https://www.probabilitycourse.com

 

7. OpenIntro Statistics


The OpenIntro project was founded in 2009 to improve the quality and availability of education by producing exceptional books and teaching tools that are free to use and easy to modify. Its inaugural effort is OpenIntro Statistics. Corresponding courses and videos can be found at:
https://www.openintro.org

 

8. Statistical Inference


It’s a textbook used for first-year graduate students at many colleges.
It discusses both theoretical statistics and the practical applications of the theoretical developments, and includes a large number of exercises covering both theory and applications.

 

9. Applied Linear Statistical Models

Applied Linear Statistical Models is the long established leading authoritative text and reference on statistical modeling. The Fifth edition provides an increased use of computing and graphical analysis throughout, without sacrificing concepts or rigor. In general, the 5e uses larger data sets in examples and exercises, and where methods can be automated within software without loss of understanding, it is so done.

 

10. An Introduction to Generalized Linear Models


As the title says, this is an introduction to generalized linear models.

 

11. All of Statistics: A Concise Course in Statistical Inference


This book is for people who want to learn probability and statistics quickly. It is suitable for graduate or advanced undergraduate students in computer science, mathematics, statistics, and related disciplines.

 

12. Computer Age Statistical Inference: Algorithms, Evidence, and Data Science


In this book, Efron and Hastie give a comprehensive introduction to statistics in the big data era.

 

13. Statistics in a Nutshell: A Desktop Quick Reference


A quick reference, as the title says.

 

14. Bayes' Rule: A Tutorial Introduction to Bayesian Analysis

 

15. Think Bayes: Bayesian Statistics in Python


A brief introduction to doing Bayesian statistics with Python.
http://www.greenteapress.com/thinkbayes/thinkbayes.pdf

 

16. Bayesian Methods for Hackers


Advanced tutorials on doing Bayesian statistics with Python.
https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers

 

17. Practical Statistics for Data Scientists: 50 Essential Concepts


This practical guide explains how to apply various statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what's important and what's not.
You can find it here: https://github.com/andrewgbruce/statistics-for-data-scientists

 

Machine Learning

18. An Introduction to Statistical Learning: with Applications in R


A good book, no doubt. Everyone in the field should have heard of it.
http://www-bcf.usc.edu/~gareth/ISL/
https://lagunita.stanford.edu/courses/HumanitiesSciences/StatLearning/Winter2016/about

 

19. Applied Predictive Modeling


Applied Predictive Modeling covers the overall predictive modeling process. A must-read before an interview or starting work.

 

20. Python Machine Learning


Python Machine Learning Second Edition now includes the popular TensorFlow deep learning library. The scikit-learn code has also been fully updated to include recent improvements and additions to this versatile machine learning library.

 

21. Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, Worked Examples, and Case Studies


A comprehensive introduction to the most important machine learning approaches used in predictive data analytics, covering both theoretical concepts and practical applications.

 

22. Real-World Machine Learning


This book tells you how to use machine learning to solve real-world problems. I strongly recommend that all data scientists read it before an internship or job.

 

23. Learning From Data


Explains machine learning theory that many books don’t cover, such as the VC dimension.
https://work.caltech.edu/telecourse.html

 

24. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition


This book describes the important ideas in a variety of fields such as medicine, biology, finance, and marketing in a common conceptual framework. This is the great ESL; I think it is best suited to thumbing through and reading in excerpts.

 

25. Pattern Recognition and Machine Learning


The book presents approximate inference algorithms that permit fast approximate answers in situations where exact answers are not feasible. It uses graphical models to describe probability distributions, at a time when no other books applied graphical models to machine learning.

 

Data Mining

26. Principles of Data Mining


A basic introduction to data mining that spends a lot of time on association rules.

 

27. Introduction to Data Mining


Introduction to Data Mining presents fundamental concepts and algorithms for those learning data mining for the first time.

 

28. Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management


Uses practical examples to show how to use data mining to earn revenue from customers.

 

SQL

29. SQL Cookbook: Query Solutions and Techniques for Database Developers


This cookbook points out many pitfalls in SQL queries and gives the query code for each of the popular databases.

 

R

30. R in Action


The book begins by introducing the R language, including the development environment. Focusing on practical solutions, the book also offers a crash course in practical statistics and covers elegant methods for dealing with messy and incomplete data using features of R.

 

31. R for Data Science


32. R Packages


33. Advanced R


All three books were written by Professor Hadley Wickham.
R for Data Science, with Garrett Grolemund, introduces the key tools for doing data science with R.
R Packages teaches good software engineering practices for R, using packages for bundling, documenting, and testing your code.
Advanced R helps you master R as a programming language, teaching you what makes R tick.

 

Python

34. Think Python


This hands-on guide takes you through the language a step at a time, beginning with basic programming concepts before moving on to functions, recursion, data structures, and object-oriented design. Suitable for beginners.

 

35. Fluent Python


Author Luciano Ramalho takes you through Python’s core language features and libraries, and shows you how to make your code shorter, faster, and more readable at the same time.

 

36. Python for Probability, Statistics, and Machine Learning


This book covers the key ideas that link probability, statistics, and machine learning, illustrated using Python modules in these areas.

 

37. Python Data Science Handbook


A very comprehensive handbook on using Python to solve data science problems.
https://github.com/jakevdp/PythonDataScienceHandbook

 

Data Scientist Interview

38. Data Science Interviews Exposed

Data Science Interviews Exposed offers data science career advice and REAL interview questions to help you get a six-figure-salary job!

 

39. Cracking the PM Interview: How to Land a Product Manager Job in Technology


In the U.S.A., many data scientists work closely with product teams, and some are even employed as product managers, so this book on PM interviews is a useful reference for data scientists.

 

Algorithm

40. Grokking Algorithms: An illustrated guide for programmers and other curious people


Grokking Algorithms is a fully illustrated, friendly guide that teaches you how to apply common algorithms to the practical problems you face every day as a programmer.

 

41. Problem Solving with Algorithms and Data Structures Using Python


The study of algorithms and data structures is central to understanding what computer science is all about, and that is what this book is all about.
Electronic edition: http://interactivepython.org/runestone/static/pythonds/index.html

 

42. Algorithms in a Nutshell: A Practical Guide


An algorithm guide for quick review.

 

Handbook

43. The Data Science Handbook


A comprehensive overview of data science covering the analytics, programming, and business skills necessary to master the discipline.

 

Web Scraping and Data Wrangling

44. Web Scraping with Python: Collecting Data from the Modern Web


With this practical guide, you’ll learn how to use Python scripts and web APIs to gather and process data from thousands—or even millions—of web pages at once. Actually, simply using Octoparse can fulfill your web scraping needs.

 

45. Data Wrangling with Python: Tips and Tools to Make Your Life Easier


This book teaches you how to clean messy raw data and wrangle it into the form you want.

 

46. Regular Expressions Cookbook


Regular expressions are annoying, but you have to face them. You can use this book to look up the regular expressions you need.

 

Data Visualization and Storytelling

47. Communicating Data with Tableau: Designing, Developing, and Delivering Data Visualizations


This practical guide shows you how to use Tableau Software to convert raw data into compelling data visualizations that provide insight or allow viewers to explore the data for themselves.

 

48. Interactive Data Visualization for the Web: An Introduction to Designing with D3


This fully updated and expanded second edition takes you through the fundamental concepts and methods of D3, the most powerful JavaScript library for expressing data visually in a web browser.

 

49. Data Visualization with Python and JavaScript: Scrape, Clean, Explore & Transform Your Data


With this hands-on guide, author Kyran Dale teaches you how to build a basic dataviz toolchain with best-of-breed Python and JavaScript libraries (including Scrapy, Matplotlib, Pandas, Flask, and D3) for crafting engaging, browser-based visualizations.

 

50. Storytelling with Data: A Data Visualization Guide for Business Professionals


This book demonstrates how to go beyond conventional tools to reach the root of your data, and how to use your data to create an engaging, informative, compelling story.

 

A/B Testing

51. A / B Testing: The Most Powerful Way to Turn Clicks Into Customers


52. Designing with Data: Improving the User Experience with A/B Testing

 

 

 

Part II: Data Science Advanced Skills

The books in this part are recommended for those who wish to become a Saiyan among data scientists.

Neural Network and Deep Learning

53. Make Your Own Neural Network

A step-by-step gentle journey through the mathematics of neural networks, and making your own using the Python computer language. This guide will take you on a fun and unhurried journey, starting from very simple ideas, and gradually building up an understanding of how neural networks work.

 

54. Deep Learning


An introduction to a broad range of topics in deep learning, covering mathematical and conceptual background, deep learning techniques used in industry, and research perspectives.

 

55. Hands-On Machine Learning with Scikit-Learn and TensorFlow


This practical book shows you how to use simple and efficient tools to implement programs capable of learning from data.

 

Information Theory

56. Data Science and Information Theory
This is an article that introduces the importance of information theory in the data science field.

 

57. Information Theory: A Tutorial Introduction


In this richly illustrated book, accessible examples are used to introduce information theory in terms of everyday games like ‘20 questions’ before more advanced topics are explored.

 

58. Information, Entropy, Life and the Universe: What We Know and What We Do Not Know


If you are interested in exploring the world of Information, Entropy and Probability, or just the world in general, this is a great place to start. Arieh takes the reader through a detailed unfolding of these topics while providing numerous common examples to help with these sometimes difficult-to-grasp topics.

 

Causal Inference

59. Causal Inference in Statistics: A Primer


Judea Pearl presents a book ideal for beginners in statistics, providing a comprehensive introduction to the field of causality.

 

60. Field Experiments: Design, Analysis, and Interpretation


A brief, authoritative introduction to field experimentation in the social sciences.

 

Sampling

61. Sampling


Sampling provides an up-to-date treatment of both classical and modern sampling design and estimation methods, along with sampling methods for rare, clustered, and hard-to-detect populations.

 

Convex

62. Convex Optimization


A comprehensive introduction to the subject, this book shows in detail how such problems can be solved numerically with great efficiency. 

 

Growth Analytics

63. Lean Analytics: Use Data to Build a Better Startup Faster (Lean Series)


Written by Alistair Croll (Coradiant, CloudOps, Startupfest) and Ben Yoskovitz (Year One Labs, GoInstant), the book lays out practical, proven steps to take your startup from initial idea to product/market fit and beyond.

 

64. Web Analytics 2.0: The Art of Online Accountability and Science of Customer Centricity


Web Analytics 2.0 provides specific recommendations for creating an actionable strategy, applying analytical techniques correctly, solving challenges such as measuring social media and multichannel campaigns, achieving optimal success by leveraging experimentation, and employing tactics for truly listening to your customers.

 

Text Mining and Natural Language Processing

65. Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit


This book offers a highly accessible introduction to natural language processing, the field that supports a variety of language technologies, from predictive text and email filtering to automatic summarization and translation.
Read online: http://www.nltk.org/book/

 

66. Text Analytics with Python: A Practical Real-World Approach to Gaining Actionable Insights from your Data


Text Analytics with Python teaches you the techniques related to natural language processing and text analytics, and you will gain the skills to know which technique is best suited to solve a particular problem.

 

67. Introduction to Information Retrieval


Class-tested and coherent, this groundbreaking new textbook teaches web-era information retrieval, including web search and the related areas of text classification and text clustering from basic concepts.
Read online: https://nlp.stanford.edu/IR-book/

 

Anomaly Detection

68. Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques: A Guide to Data Science for Fraud Detection


Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques is an authoritative guidebook for setting up a comprehensive fraud detection analytics solution.

 

69. Outlier Analysis


This book provides comprehensive coverage of the field of outlier analysis from a computer science point of view. It integrates methods from data mining, machine learning, and statistics within the computational framework and therefore appeals to multiple communities.

 

Recommender Systems

70. Recommender Systems: The Textbook


This book comprehensively covers the topic of recommender systems, which provide personalized recommendations of products or services to users based on their previous searches or purchases.

 

Social Network Analysis

71. Network Science


This pioneering textbook, spanning a wide range of topics from physics to computer science, engineering, economics and the social sciences, introduces network science to an interdisciplinary audience.



72. Social and Economic Networks


In Social and Economic Networks, Matthew Jackson offers a comprehensive introduction to social and economic networks, drawing on the latest findings in economics, sociology, computer science, physics, and mathematics.

 

73. Social Network Analysis for Startups: Finding connections on the social web


You'll learn concepts and techniques for recognizing patterns in social media, political groups, companies, cultural trends, and interpersonal networks.

 

Time Series Analysis and Forecasting

74. Practical Time Series Forecasting with R: A Hands-On Guide


The book introduces popular forecasting methods and approaches used in a variety of business applications. The book offers clear explanations, practical examples, and end-of-chapter exercises and cases.

 

75. Forecasting: Principles and Practice


This textbook provides a comprehensive introduction to forecasting methods and presents enough information about each method for readers to use them sensibly.

 

Reinforcement Learning and Artificial Intelligence

76. Reinforcement Learning: An Introduction


Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning. Their discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications.

 

77. Artificial Intelligence: A Modern Approach


Artificial Intelligence: A Modern Approach, 3e offers the most comprehensive, up-to-date introduction to the theory and practice of artificial intelligence. Number one in its field, this textbook is ideal for one or two-semester, undergraduate or graduate-level courses in Artificial Intelligence.

 

Part III: Leisure Reading

78. Soft Skills: The software developer's life manual


Soft Skills: The software developer's life manual is a unique guide, offering techniques and practices for a more satisfying life as a professional software developer.

 

79. The Healthy Programmer: Get Fit, Feel Better, and Keep Coding


This is an excellent book for any professional who sits too much for the job. It contains informative suggestions to improve your health in ways that fit into your busy day. What makes this book different is its practical suggestions which fit into the hectic lifestyle.

 

80. Exposing the Magic of Design


This book offers a way of thinking about complicated, multifaceted problems with a repeatable degree of success. Design synthesis methods can be applied in business to produce new and compelling products and services, or these methods can be applied in government with the goal of changing culture and bettering society.

 

81. Thinking, Fast and Slow


The book has about 3k reviews on Amazon. No specific description was given, but I believe it’s a great and interesting book for everyone.

 

82. Naked Statistics: Stripping the Dread from the Data


Perhaps the most interesting statistics textbook you will ever read.

 

83. Uncertainty: The Soul of Modeling, Probability & Statistics


This book presents a philosophical approach to probability and probabilistic thinking, considering the underpinnings of probabilistic reasoning and modeling, which effectively underlie everything in data science.

 

Source: Octoparse

 

