Hands-on exploratory data analysis with Python: perform EDA techniques to understand, summarize, and investigate your data
Authors: | Suresh Kumar Mukhiya, Usman Ahmed |
---|---|
Format: | Book |
Language: | English |
Published: | Birmingham ; Mumbai: Packt, March 2020 |
Subjects: | Explorative Datenanalyse, Python (Programmiersprache) |
Links: | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032612379&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032612379&sequence=000003&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |
Extent: | vii, 336 pages, illustrations, diagrams |
ISBN: | 9781789537253 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV047207505 | ||
003 | DE-604 | ||
005 | 20210526 | ||
007 | t| | ||
008 | 210322s2020 xx a||| |||| 00||| eng d | ||
020 | |a 9781789537253 |9 978-1-78953-725-3 | ||
035 | |a (OCoLC)1250468951 | ||
035 | |a (DE-599)BVBBV047207505 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-355 | ||
084 | |a ST 600 |0 (DE-625)143681: |2 rvk | ||
100 | 1 | |a Mukhiya, Suresh Kumar |e Verfasser |0 (DE-588)1202455956 |4 aut | |
245 | 1 | 0 | |a Hands-on exploratory data analysis with Python |b perform EDA techniques to understand, summarize, and investigate your data |c Suresh Kumar Mukhiya, Usman Ahmed |
264 | 1 | |a Birmingham ; Mumbai |b Packt |c March 2020 | |
300 | |a vii, 336 Seiten |b Illustrationen, Diagramme | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
650 | 0 | 7 | |a Explorative Datenanalyse |0 (DE-588)4128896-8 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Python |g Programmiersprache |0 (DE-588)4434275-5 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Explorative Datenanalyse |0 (DE-588)4128896-8 |D s |
689 | 0 | 1 | |a Python |g Programmiersprache |0 (DE-588)4434275-5 |D s |
689 | 0 | |5 DE-604 | |
700 | 1 | |a Ahmed, Usman |e Verfasser |4 aut | |
856 | 4 | 2 | |m Digitalisierung UB Regensburg - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032612379&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
856 | 4 | 2 | |m Digitalisierung UB Regensburg - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032612379&sequence=000003&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |3 Klappentext |
943 | 1 | |a oai:aleph.bib-bvb.de:BVB01-032612379 |
Record in the search index
_version_ | 1819315321819889664 |
---|---|
adam_text | Table of Contents
Preface
Section 1: The Fundamentals of EDA
Chapter 1: Exploratory Data Analysis Fundamentals: Understanding data science; The significance of EDA; Steps in EDA; Making sense of data; Numerical data; Discrete data; Continuous data; Categorical data; Measurement scales; Nominal; Ordinal; Interval; Ratio; Comparing EDA with classical and Bayesian analysis; Software tools available for EDA; Getting started with EDA; NumPy; Pandas; SciPy; Matplotlib; Summary; Further reading
Chapter 2: Visual Aids for EDA: Technical requirements; Line chart; Steps involved; Bar charts; Scatter plot; Bubble chart; Scatter plot using seaborn; Area plot and stacked plot; Pie chart; Table chart; Polar chart; Histogram; Lollipop chart; Choosing the best chart; Other libraries to explore; Summary; Further reading
Chapter 3: EDA with Personal Email: Technical requirements; Loading the dataset; Data transformation; Data cleansing; Loading the CSV file; Converting the date; Removing NaN values; Applying descriptive statistics; Data refactoring; Dropping columns; Refactoring timezones; Data analysis; Number of emails; Time of day; Average emails per day and hour; Number of emails per day; Most frequently used words; Summary; Further reading
Chapter 4: Data Transformation: Technical requirements; Background; Merging database-style dataframes; Concatenating along with an axis; Using df.merge with an inner join; Using the pd.merge() method with a left join; Using the pd.merge() method with a right join; Using pd.merge() methods with outer join; Merging on index; Reshaping and pivoting; Transformation techniques; Performing data deduplication; Replacing values; Handling missing data; NaN values in pandas objects; Dropping missing values; Dropping by rows; Dropping by columns; Mathematical operations with NaN; Filling missing values; Backward and forward filling; Interpolating missing values; Renaming axis indexes; Discretization and binning; Outlier detection and filtering; Permutation and random sampling; Random sampling without replacement; Random sampling with replacement; Computing indicators/dummy variables; String manipulation; Benefits of data transformation; Challenges; Summary; Further reading
Section 2: Descriptive Statistics
Chapter 5: Descriptive Statistics: Technical requirements; Understanding statistics; Distribution function; Uniform distribution; Normal distribution; Exponential distribution; Binomial distribution; Cumulative distribution function; Descriptive statistics; Measures of central tendency; Mean/average; Median; Mode; Measures of dispersion; Standard deviation; Variance; Skewness; Kurtosis; Types of kurtosis; Calculating percentiles; Quartiles; Visualizing quartiles; Summary; Further reading
Chapter 6: Grouping Datasets: Technical requirements; Understanding groupby(); Groupby mechanics; Selecting a subset of columns; Max and min; Mean; Data aggregation; Group-wise operations; Renaming grouped aggregation columns; Group-wise transformations; Pivot tables and cross-tabulations; Pivot tables; Cross-tabulations; Summary; Further reading
Chapter 7: Correlation: Technical requirements; Introducing correlation; Types of analysis; Understanding univariate analysis; Understanding bivariate analysis; Understanding multivariate analysis; Discussing multivariate analysis using the Titanic dataset; Outlining Simpson's paradox; Correlation does not imply causation; Summary; Further reading
Chapter 8: Time Series Analysis: Technical requirements; Understanding the time series dataset; Fundamentals of TSA; Univariate time series; Characteristics of time series data; TSA with Open Power System Data; Data cleaning; Time-based indexing; Visualizing time series; Grouping time series data; Resampling time series data; Summary; Further reading
Section 3: Model Development and Evaluation
Chapter 9: Hypothesis Testing and Regression: Technical requirements; Hypothesis testing; Hypothesis testing principle; statsmodels library; Average reading time; Types of hypothesis testing; T-test; p-hacking; Understanding regression; Types of regression; Simple linear regression; Multiple linear regression; Nonlinear regression; Model development and evaluation; Constructing a linear regression model; Model evaluation; Computing accuracy; Understanding accuracy; Implementing a multiple linear regression model; Summary; Further reading
Chapter 10: Model Development and Evaluation: Technical requirements; Types of machine learning; Understanding supervised learning; Regression; Classification; Understanding unsupervised learning; Applications of unsupervised learning; Clustering using MiniBatch K-means clustering; Extracting keywords; Plotting clusters; Word cloud; Understanding reinforcement learning; Difference between supervised and reinforcement learning; Applications of reinforcement learning; Unified machine learning workflow; Data preprocessing; Data collection; Data analysis; Data cleaning, normalization, and transformation; Data preparation; Training sets and corpus creation; Model creation and training; Model evaluation; Best model selection and evaluation; Model deployment; Summary; Further reading
Chapter 11: EDA on Wine Quality Data Analysis: Technical requirements; Disclosing the wine quality dataset; Loading the dataset; Descriptive statistics; Data wrangling; Analyzing red wine; Finding correlated columns; Alcohol versus quality; Alcohol versus pH; Analyzing white wine; Red wine versus white wine; Adding a new attribute; Converting into a categorical column; Concatenating dataframes; Grouping columns; Univariate analysis; Multivariate analysis on the combined dataframe; Discrete categorical attributes; 3-D visualization; Model development and evaluation; Summary; Further reading
Appendix A: Appendix: String manipulation; Creating strings; Accessing characters in Python; String slicing; Deleting/updating from a string; Escape sequencing in Python; Formatting strings; Using pandas vectorized string functions; Using string functions with a pandas DataFrame; Using regular expressions; Further reading
Other Books You May Enjoy
Index
Exploratory Data Analysis with Python
Exploratory Data Analysis (EDA) is an approach to data analysis that involves the application of diverse techniques to gain insights into a dataset. This book will help you gain practical knowledge of the main pillars of EDA - data cleaning, data preparation, data exploration, and data visualization.
You'll start by performing EDA using open source datasets and perform simple to advanced analyses to turn data into meaningful insights. You'll then learn various descriptive statistical techniques to describe the basic characteristics of data and progress to performing EDA on time-series data. As you advance, you'll learn how to implement EDA techniques for model development and evaluation and build predictive models to visualize results. Using Python for data analysis, you'll work with real-world datasets, understand data, summarize its characteristics, and visualize it for business intelligence.
By the end of this EDA book, you'll have developed the skills required to carry out a preliminary investigation on any dataset, yield insights into data, present your results with visual aids, and build a model that correctly predicts future outcomes.
Things you will learn:
• Import, clean, and explore data to perform preliminary analysis using powerful Python packages
• Identify and transform erroneous data using different data wrangling techniques
• Explore the use of multiple regression to describe non-linear relationships
• Discover hypothesis testing and explore techniques of time-series analysis
• Understand and interpret results obtained from graphical analysis
• Build, train, and optimize predictive models to estimate results
• Perform complex EDA techniques on open source datasets
|
any_adam_object | 1 |
author | Mukhiya, Suresh Kumar Ahmed, Usman |
author_GND | (DE-588)1202455956 |
author_facet | Mukhiya, Suresh Kumar Ahmed, Usman |
author_role | aut aut |
author_sort | Mukhiya, Suresh Kumar |
author_variant | s k m sk skm u a ua |
building | Verbundindex |
bvnumber | BV047207505 |
classification_rvk | ST 600 |
ctrlnum | (OCoLC)1250468951 (DE-599)BVBBV047207505 |
discipline | Informatik |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01824nam a2200361 c 4500</leader><controlfield tag="001">BV047207505</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20210526 </controlfield><controlfield tag="007">t|</controlfield><controlfield tag="008">210322s2020 xx a||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781789537253</subfield><subfield code="9">978-1-78953-725-3</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)1250468951</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV047207505</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-355</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 600</subfield><subfield code="0">(DE-625)143681:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Mukhiya, Suresh Kumar</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1202455956</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Hands-on exploratory data analysis with Python</subfield><subfield code="b">perform EDA techniques to understand, summarize, and investigate your data</subfield><subfield code="c">Suresh Kumar Mukhiya, Usman Ahmed</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Birmingham ; Mumbai</subfield><subfield code="b">Packt</subfield><subfield code="c">March 2020</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">vii, 336 
Seiten</subfield><subfield code="b">Illustrationen, Diagramme</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Explorative Datenanalyse</subfield><subfield code="0">(DE-588)4128896-8</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Python</subfield><subfield code="g">Programmiersprache</subfield><subfield code="0">(DE-588)4434275-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Explorative Datenanalyse</subfield><subfield code="0">(DE-588)4128896-8</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Python</subfield><subfield code="g">Programmiersprache</subfield><subfield code="0">(DE-588)4434275-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Ahmed, Usman</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Regensburg - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032612379&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield 
code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Regensburg - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032612379&sequence=000003&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Klappentext</subfield></datafield><datafield tag="943" ind1="1" ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-032612379</subfield></datafield></record></collection> |
id | DE-604.BV047207505 |
illustrated | Illustrated |
indexdate | 2024-12-20T19:12:49Z |
institution | BVB |
isbn | 9781789537253 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-032612379 |
oclc_num | 1250468951 |
open_access_boolean | |
owner | DE-355 DE-BY-UBR |
owner_facet | DE-355 DE-BY-UBR |
physical | vii, 336 Seiten Illustrationen, Diagramme |
publishDate | 2020 |
publishDateSearch | 2020 |
publishDateSort | 2020 |
publisher | Packt |
record_format | marc |
spellingShingle | Mukhiya, Suresh Kumar Ahmed, Usman Hands-on exploratory data analysis with Python perform EDA techniques to understand, summarize, and investigate your data Explorative Datenanalyse (DE-588)4128896-8 gnd Python Programmiersprache (DE-588)4434275-5 gnd |
subject_GND | (DE-588)4128896-8 (DE-588)4434275-5 |
title | Hands-on exploratory data analysis with Python perform EDA techniques to understand, summarize, and investigate your data |
title_auth | Hands-on exploratory data analysis with Python perform EDA techniques to understand, summarize, and investigate your data |
title_exact_search | Hands-on exploratory data analysis with Python perform EDA techniques to understand, summarize, and investigate your data |
title_full | Hands-on exploratory data analysis with Python perform EDA techniques to understand, summarize, and investigate your data Suresh Kumar Mukhiya, Usman Ahmed |
title_fullStr | Hands-on exploratory data analysis with Python perform EDA techniques to understand, summarize, and investigate your data Suresh Kumar Mukhiya, Usman Ahmed |
title_full_unstemmed | Hands-on exploratory data analysis with Python perform EDA techniques to understand, summarize, and investigate your data Suresh Kumar Mukhiya, Usman Ahmed |
title_short | Hands-on exploratory data analysis with Python |
title_sort | hands on exploratory data analysis with python perform eda techniques to understand summarize and investigate your data |
title_sub | perform EDA techniques to understand, summarize, and investigate your data |
topic | Explorative Datenanalyse (DE-588)4128896-8 gnd Python Programmiersprache (DE-588)4434275-5 gnd |
topic_facet | Explorative Datenanalyse Python Programmiersprache |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032612379&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032612379&sequence=000003&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT mukhiyasureshkumar handsonexploratorydataanalysiswithpythonperformedatechniquestounderstandsummarizeandinvestigateyourdata AT ahmedusman handsonexploratorydataanalysiswithpythonperformedatechniquestounderstandsummarizeandinvestigateyourdata |