Principles of Data Science (Reprint Edition)

  • List price: 552
  • Sale price: 480 (87% of list price)
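  • Shipping:
  • Taiwan and outlying islands
  • Overseas
  • Delivery available to: Taiwan, Lanyu, Green Island, Penghu, Kinmen, Matsu
  • Pickup available in: Taiwan, Lanyu, Green Island, Penghu, Kinmen, Matsu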

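About the Book

This book aims to help you bring together three areas: mathematics, programming, and business analysis. With it, you can approach complex problems with confidence, whether you are working with abstract, raw statistics or with actionable ideas. It takes a distinctive approach to bridging mathematics and computer science, and guides you through the journey of becoming a data scientist.

Starting with cleaning and preparing data, then moving on to effective data mining strategies and techniques, you will work through the entire data science pipeline and build a big-picture understanding of how its components fit together. Along the way you will learn the basic mathematics and statistics, as well as some of the pseudocode that data scientists and analysts use today.

You will also get to grips with machine learning, learn about useful statistical models that help you control and navigate very dense datasets, and learn how to create visualizations that communicate what the data means.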
 

Table of Contents

Preface

Chapter 1: How to Sound Like a Data Scientist

What is data science?

Basic terminology

Why data science?

Example - Sigma Technologies

The data science Venn diagram

The math

Example - spawner-recruit models

Computer programming

Why Python?

Python practices

Example of basic Python

Domain knowledge

Some more terminology

Data science case studies

Case study - automating government paper pushing

Fire all humans, right?

Case study - marketing dollars

Case study - what's in a job description?

Summary

Chapter 2: Types of Data

Flavors of data

Why look at these distinctions?

Structured versus unstructured data

Example of data preprocessing

Word/phrase counts

Presence of certain special characters

Relative length of text

Picking out topics

Quantitative versus qualitative data

Example - coffee shop data

Example - world alcohol consumption data

Digging deeper

The road thus far

The four levels of data

The nominal level

Mathematical operations allowed

Measures of center

What data is like at the nominal level

The ordinal level

Examples

Mathematical operations allowed

Measures of center

Quick recap and check

The interval level

Example

Mathematical operations allowed

Measures of center

Measures of variation

The ratio level

Examples

Measures of center

Problems with the ratio level

Data is in the eye of the beholder

Summary

Chapter 3: The Five Steps of Data Science

Introduction to Data Science

Overview of the five steps

Ask an interesting question

Obtain the data

Explore the data

Model the data

Communicate and visualize the results

Explore the data

Basic questions for data exploration

Dataset 1 - Yelp

Dataframes

Series

Exploration tips for qualitative data

Dataset 2 - titanic

Summary

Chapter 4: Basic Mathematics

Mathematics as a discipline

Basic symbols and terminology

Vectors and matrices

Quick exercises

Answers

Arithmetic symbols

Summation

Proportional

Dot product

Graphs

Logarithms/exponents

Set theory

Linear algebra

Matrix multiplication

How to multiply matrices

Summary

Chapter 5: Impossible or Improbable - A Gentle Introduction to Probability

Basic definitions

Probability

Bayesian versus Frequentist

Frequentist approach

The law of large numbers

Compound events

Conditional probability

The rules of probability

The addition rule

Mutual exclusivity

The multiplication rule

Independence

Complementary events

A bit deeper

Summary

Chapter 6: Advanced Probability

Collectively exhaustive events

Bayesian ideas revisited

Bayes theorem

More applications of Bayes theorem

Example - Titanic

Example - medical studies

Random variables

Discrete random variables

Types of discrete random variables

Summary

Chapter 7: Basic Statistics

What are statistics?

How do we obtain and sample data?

Obtaining data

Observational

Experimental

Sampling data

Probability sampling

Random sampling

Unequal probability sampling

How do we measure statistics?

Measures of center

Measures of variation

Definition

Example - employee salaries

Measures of relative standing

The insightful part - correlations in data

The Empirical rule

Summary

Chapter 8: Advanced Statistics

Point estimates

Sampling distributions

Confidence intervals

Hypothesis tests

Conducting a hypothesis test

One sample t-tests

Example of a one sample t-test

Assumptions of the one sample t-test

Type I and type II errors

Hypothesis test for categorical variables

Chi-square goodness of fit test

Chi-square test for association/independence

Summary

Chapter 9: Communicating Data

Why does communication matter?

Identifying effective and ineffective visualizations

Scatter plots

Line graphs

Bar charts

Histograms

Box plots

When graphs and statistics lie

Correlation versus causation

Simpson's paradox

If correlation doesn't imply causation, then what does?

Verbal communication

It's about telling a story

On the more formal side of things

The why/how/what strategy of presenting

Summary

Chapter 10: How to Tell If Your Toaster Is Learning - Machine Learning Essentials

What is machine learning?

Machine learning isn't perfect

How does machine learning work?

Types of machine learning

Supervised learning

It's not only about predictions

Types of supervised learning

Data is in the eyes of the beholder

Unsupervised learning

Reinforcement learning

Overview of the types of machine learning

How does statistical modeling fit into all of this?

Linear regression

Adding more predictors

Regression metrics

Logistic regression

Probability, odds, and log odds

The math of logistic regression

Dummy variables

Summary

Chapter 11: Predictions Don't Grow on Trees - or Do They?

Naive Bayes classification

Decision trees

How does a computer build a regression tree?

How does a computer fit a classification tree?

Unsupervised learning

When to use unsupervised learning

K-means clustering

Illustrative example - data points

Illustrative example - beer!

Choosing an optimal number for K and cluster validation

The Silhouette Coefficient

Feature extraction and principal component analysis

Summary

Chapter 12: Beyond the Essentials

The bias variance tradeoff

Error due to bias

Error due to variance

Two extreme cases of bias/variance tradeoff

Underfitting

Overfitting

How bias/variance play into error functions

K folds cross-validation

Grid searching

Visualizing training error versus cross-validation error

Ensembling techniques

Random forests

Comparing Random forests with decision trees

Neural networks

Basic structure

Summary

Chapter 13: Case Studies

Case study 1 - predicting stock prices based on social media

Text sentiment analysis

Exploratory data analysis

Regression route

Classification route

Going beyond with this example

Case study 2 - why do some people cheat on their spouses?

Case study 3 - using tensorflow

Tensorflow and neural networks

Summary

Index
 

Details

  • ISBN: 9787564173647
  • Format: 369 pages / general audience / 1-1
  • Place of publication: China


