Python data cleaning cookbook: prepare your data for analysis with pandas, NumPy, Matplotlib, scikit-learn, and OpenAI
Contributor: | Walker, Michael (author) |
---|---|
Format: | Electronic e-book |
Language: | English |
Published: | Birmingham: Packt Publishing Ltd., May 2024 |
Edition: | Second edition |
Series: | Expert Insight |
Links: | https://ebookcentral.proquest.com/lib/hsansbach/detail.action?docID=31355096 https://portal.igpublish.com/iglibrary/search/PACKT0007203.html |
Abstract: | Cover -- Copyright -- Contributors -- Table of Contents -- Preface -- Chapter 1: Anticipating Data Cleaning Issues When Importing Tabular Data with pandas -- Technical requirements -- Importing CSV files -- Importing Excel files -- Importing data from SQL databases -- Importing SPSS, Stata, and SAS data -- Importing R data -- Persisting tabular data -- Summary -- Chapter 2: Anticipating Data Cleaning Issues When Working with HTML, JSON, and Spark Data -- Technical requirements -- Importing simple JSON data -- Importing more complicated JSON data from an API -- Importing data from web pages -- Working with Spark data -- Persisting JSON data -- Versioning data -- Summary -- Chapter 3: Taking the Measure of Your Data -- Technical requirements -- Getting a first look at your data -- Selecting and organizing columns -- Selecting rows -- Generating frequencies for categorical variables -- Generating summary statistics for continuous variables -- Using generative AI to display descriptive statistics -- Summary -- Chapter 4: Identifying Outliers in Subsets of Data -- Technical requirements -- Identifying outliers with one variable -- Identifying outliers and unexpected values in bivariate relationships -- Using subsetting to examine logical inconsistencies in variable relationships -- Using linear regression to identify data points with significant influence -- Using k-nearest neighbors to find outliers -- Using Isolation Forest to find anomalies -- Using PandasAI to identify outliers -- Summary -- Chapter 5: Using Visualizations for the Identification of Unexpected Values -- Technical requirements -- Using histograms to examine the distribution of continuous variables -- Using boxplots to identify outliers for continuous variables -- Using grouped boxplots to uncover unexpected values in a particular group. |
Extent: | 1 online resource (xvii, 453 pages), illustrations |
ISBN: | 9781803246291 |
Internal format
MARC
LEADER | 00000nam a22000001c 4500 | ||
---|---|---|---|
001 | BV049746289 | ||
003 | DE-604 | ||
005 | 20250211 | ||
007 | cr|uuu---uuuuu | ||
008 | 240618s2024 xx a||| o|||| 00||| eng d | ||
020 | |a 9781803246291 |9 978-1-80324-629-1 | ||
035 | |a (ZDB-30-PQE)31355096 | ||
035 | |a (ZDB-221-PDA)9781803246291 | ||
035 | |a (OCoLC)1443577560 | ||
035 | |a (DE-599)KEP103609776 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-1102 |a DE-706 |a DE-91 |a DE-573 | ||
100 | 1 | |a Walker, Michael |e Verfasser |4 aut | |
245 | 1 | 0 | |a Python data cleaning cookbook |b prepare your data for analysis with pandas, NumPy, Matplotlib, scikit-learn, and OpenAI |c Michael Walker |
250 | |a Second edition | ||
264 | 1 | |a Birmingham |b Packt Publishing Ltd. |c May 2024 | |
300 | |a 1 Online-Ressource (xvii, 453 Seiten) |b Illustrationen | ||
336 | |b txt |2 rdacontent | ||
337 | |b c |2 rdamedia | ||
338 | |b cr |2 rdacarrier | ||
490 | 0 | |a Expert Insight | |
520 | 3 | |a Cover -- Copyright -- Contributors -- Table of Contents -- Preface -- Chapter 1: Anticipating Data Cleaning Issues When Importing Tabular Data with pandas -- Technical requirements -- Importing CSV files -- Importing Excel files -- Importing data from SQL databases -- Importing SPSS, Stata, and SAS data -- Importing R data -- Persisting tabular data -- Summary -- Chapter 2: Anticipating Data Cleaning Issues When Working with HTML, JSON, and Spark Data -- Technical requirements -- Importing simple JSON data -- Importing more complicated JSON data from an API -- Importing data from web pages -- Working with Spark data -- Persisting JSON data -- Versioning data -- Summary -- Chapter 3: Taking the Measure of Your Data -- Technical requirements -- Getting a first look at your data -- Selecting and organizing columns -- Selecting rows -- Generating frequencies for categorical variables -- Generating summary statistics for continuous variables -- Using generative AI to display descriptive statistics -- Summary -- Chapter 4: Identifying Outliers in Subsets of Data -- Technical requirements -- Identifying outliers with one variable -- Identifying outliers and unexpected values in bivariate relationships -- Using subsetting to examine logical inconsistencies in variable relationships -- Using linear regression to identify data points with significant influence -- Using k-nearest neighbors to find outliers -- Using Isolation Forest to find anomalies -- Using PandasAI to identify outliers -- Summary -- Chapter 5: Using Visualizations for the Identification of Unexpected Values -- Technical requirements -- Using histograms to examine the distribution of continuous variables -- Using boxplots to identify outliers for continuous variables -- Using grouped boxplots to uncover unexpected values in a particular group. | |
776 | 0 | 8 | |i Erscheint auch als |n Druck-Ausgabe |z 978-1-80323-987-3 |
856 | 4 | 0 | |u https://portal.igpublish.com/iglibrary/search/PACKT0007203.html |x Verlag |z URL des Erstveröffentlichers |3 Volltext |
912 | |a ZDB-30-PQE | ||
912 | |a ZDB-221-PDA | ||
943 | 1 | |a oai:aleph.bib-bvb.de:BVB01-035088081 | |
966 | e | |u https://ebookcentral.proquest.com/lib/hsansbach/detail.action?docID=31355096 |l DE-1102 |p ZDB-30-PQE |q FAN_Einzelkauf_2024 |x Aggregator |3 Volltext | |
966 | e | |u https://portal.igpublish.com/iglibrary/search/PACKT0007203.html |l DE-573 |p ZDB-221-PDA |x Aggregator |3 Volltext | |
966 | e | |u https://portal.igpublish.com/iglibrary/search/PACKT0007203.html |l DE-91 |p ZDB-221-PDA |q TUM_Paketkauf_2025 |x Verlag |3 Volltext | |
966 | e | |u https://portal.igpublish.com/iglibrary/search/PACKT0007203.html |l DE-706 |p ZDB-221-PDA |x Verlag |3 Volltext |
Record in the search index
DE-BY-TUM_katkey | 2839415 |
---|---|
_version_ | 1823807853118357505 |
adam_text | |
any_adam_object | |
author | Walker, Michael |
author_facet | Walker, Michael |
author_role | aut |
author_sort | Walker, Michael |
author_variant | m w mw |
building | Verbundindex |
bvnumber | BV049746289 |
collection | ZDB-30-PQE ZDB-221-PDA |
ctrlnum | (ZDB-30-PQE)31355096 (ZDB-221-PDA)9781803246291 (OCoLC)1443577560 (DE-599)KEP103609776 |
edition | Second edition |
format | Electronic eBook |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>00000nam a22000001c 4500</leader><controlfield tag="001">BV049746289</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20250211</controlfield><controlfield tag="007">cr|uuu---uuuuu</controlfield><controlfield tag="008">240618s2024 xx a||| o|||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781803246291</subfield><subfield code="9">978-1-80324-629-1</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(ZDB-30-PQE)31355096</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(ZDB-221-PDA)9781803246291</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)1443577560</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)KEP103609776</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-1102</subfield><subfield code="a">DE-706</subfield><subfield code="a">DE-91</subfield><subfield code="a">DE-573</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Walker, Michael</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Python data cleaning cookbook</subfield><subfield code="b">prepare your data for analysis with pandas, NumPy, Matplotlib, scikit-learn, and OpenAI</subfield><subfield code="c">Michael Walker</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">Second edition</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield 
code="a">Birmingham</subfield><subfield code="b">Packt Publishing Ltd.</subfield><subfield code="c">May 2024</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">1 Online-Ressource (xvii, 453 Seiten)</subfield><subfield code="b">Illustrationen</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">c</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">cr</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="0" ind2=" "><subfield code="a">Expert Insight</subfield></datafield><datafield tag="520" ind1="3" ind2=" "><subfield code="a">Cover -- Copyright -- Contributors -- Table of Contents -- Preface -- Chapter 1: Anticipating Data Cleaning Issues When Importing Tabular Data with pandas -- Technical requirements -- Importing CSV files -- Importing Excel files -- Importing data from SQL databases -- Importing SPSS, Stata, and SAS data -- Importing R data -- Persisting tabular data -- Summary -- Chapter 2: Anticipating Data Cleaning Issues When Working with HTML, JSON, and Spark Data -- Technical requirements -- Importing simple JSON data -- Importing more complicated JSON data from an API -- Importing data from web pages -- Working with Spark data -- Persisting JSON data -- Versioning data -- Summary -- Chapter 3: Taking the Measure of Your Data -- Technical requirements -- Getting a first look at your data -- Selecting and organizing columns -- Selecting rows -- Generating frequencies for categorical variables -- Generating summary statistics for continuous variables -- Using generative AI to display descriptive statistics -- Summary -- Chapter 4: Identifying Outliers in Subsets of Data -- Technical requirements -- Identifying outliers with one variable -- Identifying outliers and unexpected values 
in bivariate relationships -- Using subsetting to examine logical inconsistencies in variable relationships -- Using linear regression to identify data points with significant influence -- Using k-nearest neighbors to find outliers -- Using Isolation Forest to find anomalies -- Using PandasAI to identify outliers -- Summary -- Chapter 5: Using Visualizations for the Identification of Unexpected Values -- Technical requirements -- Using histograms to examine the distribution of continuous variables -- Using boxplots to identify outliers for continuous variables -- Using grouped boxplots to uncover unexpected values in a particular group.</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Druck-Ausgabe</subfield><subfield code="z">978-1-80323-987-3</subfield></datafield><datafield tag="856" ind1="4" ind2="0"><subfield code="u">https://portal.igpublish.com/iglibrary/search/PACKT0007203.html</subfield><subfield code="x">Verlag</subfield><subfield code="z">URL des Erstveröffentlichers</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">ZDB-30-PQE</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">ZDB-221-PDA</subfield></datafield><datafield tag="943" ind1="1" ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-035088081</subfield></datafield><datafield tag="966" ind1="e" ind2=" "><subfield code="u">https://ebookcentral.proquest.com/lib/hsansbach/detail.action?docID=31355096</subfield><subfield code="l">DE-1102</subfield><subfield code="p">ZDB-30-PQE</subfield><subfield code="q">FAN_Einzelkauf_2024</subfield><subfield code="x">Aggregator</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="966" ind1="e" ind2=" "><subfield code="u">https://portal.igpublish.com/iglibrary/search/PACKT0007203.html</subfield><subfield code="l">DE-573</subfield><subfield 
code="p">ZDB-221-PDA</subfield><subfield code="x">Aggregator</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="966" ind1="e" ind2=" "><subfield code="u">https://portal.igpublish.com/iglibrary/search/PACKT0007203.html</subfield><subfield code="l">DE-91</subfield><subfield code="p">ZDB-221-PDA</subfield><subfield code="q">TUM_Paketkauf_2025</subfield><subfield code="x">Verlag</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="966" ind1="e" ind2=" "><subfield code="u">https://portal.igpublish.com/iglibrary/search/PACKT0007203.html</subfield><subfield code="l">DE-706</subfield><subfield code="p">ZDB-221-PDA</subfield><subfield code="x">Verlag</subfield><subfield code="3">Volltext</subfield></datafield></record></collection> |
id | DE-604.BV049746289 |
illustrated | Illustrated |
indexdate | 2025-02-11T15:01:18Z |
institution | BVB |
isbn | 9781803246291 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-035088081 |
oclc_num | 1443577560 |
open_access_boolean | |
owner | DE-1102 DE-706 DE-91 DE-BY-TUM DE-573 |
owner_facet | DE-1102 DE-706 DE-91 DE-BY-TUM DE-573 |
physical | 1 Online-Ressource (xvii, 453 Seiten) Illustrationen |
psigel | ZDB-30-PQE ZDB-221-PDA ZDB-30-PQE FAN_Einzelkauf_2024 ZDB-221-PDA TUM_Paketkauf_2025 |
publishDate | 2024 |
publishDateSearch | 2024 |
publishDateSort | 2024 |
publisher | Packt Publishing Ltd. |
record_format | marc |
series2 | Expert Insight |
spellingShingle | Walker, Michael Python data cleaning cookbook prepare your data for analysis with pandas, NumPy, Matplotlib, scikit-learn, and OpenAI |
title | Python data cleaning cookbook prepare your data for analysis with pandas, NumPy, Matplotlib, scikit-learn, and OpenAI |
title_auth | Python data cleaning cookbook prepare your data for analysis with pandas, NumPy, Matplotlib, scikit-learn, and OpenAI |
title_exact_search | Python data cleaning cookbook prepare your data for analysis with pandas, NumPy, Matplotlib, scikit-learn, and OpenAI |
title_full | Python data cleaning cookbook prepare your data for analysis with pandas, NumPy, Matplotlib, scikit-learn, and OpenAI Michael Walker |
title_fullStr | Python data cleaning cookbook prepare your data for analysis with pandas, NumPy, Matplotlib, scikit-learn, and OpenAI Michael Walker |
title_full_unstemmed | Python data cleaning cookbook prepare your data for analysis with pandas, NumPy, Matplotlib, scikit-learn, and OpenAI Michael Walker |
title_short | Python data cleaning cookbook |
title_sort | python data cleaning cookbook prepare your data for analysis with pandas numpy matplotlib scikit learn and openai |
title_sub | prepare your data for analysis with pandas, NumPy, Matplotlib, scikit-learn, and OpenAI |
url | https://portal.igpublish.com/iglibrary/search/PACKT0007203.html |
work_keys_str_mv | AT walkermichael pythondatacleaningcookbookprepareyourdataforanalysiswithpandasnumpymatplotlibscikitlearnandopenai |