Data analytics with Hadoop: an introduction for data scientists
Contributors: | Bengfort, Benjamin ; Kim, Jenny |
---|---|
Format: | Electronic e-book |
Language: | English |
Published: | Sebastopol, CA: O'Reilly Media, 2016 |
Edition: | First edition. |
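Subjects: | Apache Hadoop / Electronic data processing ; Distributed processing / Cluster analysis ; Data processing |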
Links: | https://learning.oreilly.com/library/view/-/9781491913734/?ar |
Summary: | Ready to use statistical and machine-learning techniques across large data sets? This practical guide shows you why the Hadoop ecosystem is perfect for the job. Instead of deployment, operations, or software development usually associated with distributed computing, you'll focus on particular analyses you can build, the data warehousing techniques that Hadoop provides, and higher-order data workflows this framework can produce. Data scientists and analysts will learn how to perform a wide range of techniques, from writing MapReduce and Spark applications with Python to using advanced modeling and data management with Spark MLlib, Hive, and HBase. You'll also learn about the analytical processes and data systems available to build and empower data products that can handle, and actually require, huge amounts of data. Understand core concepts behind Hadoop and cluster computing; use design patterns and parallel analytical algorithms to create distributed data analysis jobs; learn about data management, mining, and warehousing in a distributed context using Apache Hive and HBase; use Sqoop and Apache Flume to ingest data from relational databases; program complex Hadoop and Spark applications with Apache Pig and Spark DataFrames; perform machine learning techniques such as classification, clustering, and collaborative filtering with Spark's MLlib.--Provided by publisher. |
Description: | Includes index. - Includes bibliographical references and index. - Online resource; title from title page (Safari, viewed June 13, 2016) |
Extent: | 1 online resource, illustrations |
ISBN: | 9781491913765 1491913762 9781491913758 1491913754 |
Internal format
MARC
LEADER | 00000cam a22000002 4500 | ||
---|---|---|---|
001 | ZDB-30-ORH-047612177 | ||
003 | DE-627-1 | ||
005 | 20240228120121.0 | ||
007 | cr uuu---uuuuu | ||
008 | 191023s2016 xx |||||o 00| ||eng c | ||
020 | |a 9781491913765 |c electronic bk. |9 978-1-4919-1376-5 | ||
020 | |a 1491913762 |c electronic bk. |9 1-4919-1376-2 | ||
020 | |a 9781491913758 |c electronic bk. |9 978-1-4919-1375-8 | ||
020 | |a 1491913754 |c electronic bk. |9 1-4919-1375-4 | ||
035 | |a (DE-627-1)047612177 | ||
035 | |a (DE-599)KEP047612177 | ||
035 | |a (ORHE)9781491913734 | ||
035 | |a (DE-627-1)047612177 | ||
040 | |a DE-627 |b ger |c DE-627 |e rda | ||
041 | |a eng | ||
072 | 7 | |a COM |2 bisacsh | |
082 | 0 | |a 004.36 |2 23 | |
100 | 1 | |a Bengfort, Benjamin |e VerfasserIn |4 aut | |
245 | 1 | 0 | |a Data analytics with Hadoop |b an introduction for data scientists |c Benjamin Bengfort and Jenny Kim |
250 | |a First edition. | ||
264 | 1 | |a Sebastopol, CA |b O'Reilly Media |c 2016 | |
300 | |a 1 Online-Ressource |b illustrations | ||
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Includes index. - Includes bibliographical references and index. - Online resource; title from title page (Safari, viewed June 13, 2016) | ||
520 | |a Ready to use statistical and machine-learning techniques across large data sets? This practical guide shows you why the Hadoop ecosystem is perfect for the job. Instead of deployment, operations, or software development usually associated with distributed computing, you'll focus on particular analyses you can build, the data warehousing techniques that Hadoop provides, and higher order data workflows this framework can produce. Data scientists and analysts will learn how to perform a wide range of techniques, from writing MapReduce and Spark applications with Python to using advanced modeling and data management with Spark MLlib, Hive, and HBase. You'll also learn about the analytical processes and data systems available to build and empower data products that can handle-and actually require-huge amounts of data. Understand core concepts behind Hadoop and cluster computing Use design patterns and parallel analytical algorithms to create distributed data analysis jobs Learn about data management, mining, and warehousing in a distributed context using Apache Hive and HBase Use Sqoop and Apache Flume to ingest data from relational databases Program complex Hadoop and Spark applications with Apache Pig and Spark DataFrames Perform machine learning techniques such as classification, clustering, and collaborative filtering with Spark's MLlib.--Provided by publisher. | ||
630 | 2 | 0 | |a Apache Hadoop |
650 | 0 | |a Electronic data processing |x Distributed processing | |
650 | 0 | |a Cluster analysis |x Data processing | |
650 | 4 | |a Apache Hadoop | |
650 | 4 | |a Traitement réparti | |
650 | 4 | |a Classification automatique (Statistique) ; Informatique | |
650 | 4 | |a COMPUTERS ; Data Processing | |
650 | 4 | |a Cluster analysis ; Data processing | |
650 | 4 | |a Electronic data processing ; Distributed processing | |
700 | 1 | |a Kim, Jenny |e VerfasserIn |4 aut | |
776 | 1 | |z 9781491913703 | |
776 | 0 | 8 | |i Erscheint auch als |n Druck-Ausgabe |z 9781491913703 |
966 | 4 | 0 | |l DE-91 |p ZDB-30-ORH |q TUM_PDA_ORH |u https://learning.oreilly.com/library/view/-/9781491913734/?ar |m X:ORHE |x Aggregator |z lizenzpflichtig |3 Volltext |
912 | |a ZDB-30-ORH | ||
912 | |a ZDB-30-ORH | ||
951 | |a BO | ||
912 | |a ZDB-30-ORH | ||
049 | |a DE-91 |
Record in the search index
DE-BY-TUM_katkey | ZDB-30-ORH-047612177 |
---|---|
_version_ | 1821494872857640960 |
adam_text | |
any_adam_object | |
author | Bengfort, Benjamin Kim, Jenny |
author_facet | Bengfort, Benjamin Kim, Jenny |
author_role | aut aut |
author_sort | Bengfort, Benjamin |
author_variant | b b bb j k jk |
building | Verbundindex |
bvnumber | localTUM |
collection | ZDB-30-ORH |
ctrlnum | (DE-627-1)047612177 (DE-599)KEP047612177 (ORHE)9781491913734 |
dewey-full | 004.36 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 004 - Computer science |
dewey-raw | 004.36 |
dewey-search | 004.36 |
dewey-sort | 14.36 |
dewey-tens | 000 - Computer science, information, general works |
discipline | Informatik |
edition | First edition. |
format | Electronic eBook |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>03446cam a22005412 4500</leader><controlfield tag="001">ZDB-30-ORH-047612177</controlfield><controlfield tag="003">DE-627-1</controlfield><controlfield tag="005">20240228120121.0</controlfield><controlfield tag="007">cr uuu---uuuuu</controlfield><controlfield tag="008">191023s2016 xx |||||o 00| ||eng c</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781491913765</subfield><subfield code="c">electronic bk.</subfield><subfield code="9">978-1-4919-1376-5</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">1491913762</subfield><subfield code="c">electronic bk.</subfield><subfield code="9">1-4919-1376-2</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781491913758</subfield><subfield code="c">electronic bk.</subfield><subfield code="9">978-1-4919-1375-8</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">1491913754</subfield><subfield code="c">electronic bk.</subfield><subfield code="9">1-4919-1375-4</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627-1)047612177</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)KEP047612177</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(ORHE)9781491913734</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627-1)047612177</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="072" ind1=" " ind2="7"><subfield code="a">COM</subfield><subfield code="2">bisacsh</subfield></datafield><datafield tag="082" ind1="0" ind2=" 
"><subfield code="a">004.36</subfield><subfield code="2">23</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Bengfort, Benjamin</subfield><subfield code="e">VerfasserIn</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Data analytics with Hadoop</subfield><subfield code="b">an introduction for data scientists</subfield><subfield code="c">Benjamin Bengfort and Jenny Kim</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">First edition.</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Sebastopol, CA</subfield><subfield code="b">O'Reilly Media</subfield><subfield code="c">2016</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">1 Online-Ressource</subfield><subfield code="b">illustrations</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">Computermedien</subfield><subfield code="b">c</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Online-Ressource</subfield><subfield code="b">cr</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Includes index. - Includes bibliographical references and index. - Online resource; title from title page (Safari, viewed June 13, 2016)</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Ready to use statistical and machine-learning techniques across large data sets? This practical guide shows you why the Hadoop ecosystem is perfect for the job. 
Instead of deployment, operations, or software development usually associated with distributed computing, you'll focus on particular analyses you can build, the data warehousing techniques that Hadoop provides, and higher order data workflows this framework can produce. Data scientists and analysts will learn how to perform a wide range of techniques, from writing MapReduce and Spark applications with Python to using advanced modeling and data management with Spark MLlib, Hive, and HBase. You'll also learn about the analytical processes and data systems available to build and empower data products that can handle-and actually require-huge amounts of data. Understand core concepts behind Hadoop and cluster computing Use design patterns and parallel analytical algorithms to create distributed data analysis jobs Learn about data management, mining, and warehousing in a distributed context using Apache Hive and HBase Use Sqoop and Apache Flume to ingest data from relational databases Program complex Hadoop and Spark applications with Apache Pig and Spark DataFrames Perform machine learning techniques such as classification, clustering, and collaborative filtering with Spark's MLlib.--Provided by publisher.</subfield></datafield><datafield tag="630" ind1="2" ind2="0"><subfield code="a">Apache Hadoop</subfield></datafield><datafield tag="650" ind1=" " ind2="0"><subfield code="a">Electronic data processing</subfield><subfield code="x">Distributed processing</subfield></datafield><datafield tag="650" ind1=" " ind2="0"><subfield code="a">Cluster analysis</subfield><subfield code="x">Data processing</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Apache Hadoop</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Traitement réparti</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Classification automatique (Statistique) ; Informatique</subfield></datafield><datafield tag="650" ind1=" " 
ind2="4"><subfield code="a">COMPUTERS ; Data Processing</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Cluster analysis ; Data processing</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Electronic data processing ; Distributed processing</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Kim, Jenny</subfield><subfield code="e">VerfasserIn</subfield><subfield code="4">aut</subfield></datafield><datafield tag="776" ind1="1" ind2=" "><subfield code="z">9781491913703</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Druck-Ausgabe</subfield><subfield code="z">9781491913703</subfield></datafield><datafield tag="966" ind1="4" ind2="0"><subfield code="l">DE-91</subfield><subfield code="p">ZDB-30-ORH</subfield><subfield code="q">TUM_PDA_ORH</subfield><subfield code="u">https://learning.oreilly.com/library/view/-/9781491913734/?ar</subfield><subfield code="m">X:ORHE</subfield><subfield code="x">Aggregator</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">ZDB-30-ORH</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">ZDB-30-ORH</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">BO</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">ZDB-30-ORH</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-91</subfield></datafield></record></collection> |
id | ZDB-30-ORH-047612177 |
illustrated | Illustrated |
indexdate | 2025-01-17T11:21:17Z |
institution | BVB |
isbn | 9781491913765 1491913762 9781491913758 1491913754 |
language | English |
open_access_boolean | |
owner | DE-91 DE-BY-TUM |
owner_facet | DE-91 DE-BY-TUM |
physical | 1 Online-Ressource illustrations |
psigel | ZDB-30-ORH TUM_PDA_ORH ZDB-30-ORH |
publishDate | 2016 |
publishDateSearch | 2016 |
publishDateSort | 2016 |
publisher | O'Reilly Media |
record_format | marc |
spelling | Bengfort, Benjamin VerfasserIn aut Data analytics with Hadoop an introduction for data scientists Benjamin Bengfort and Jenny Kim First edition. Sebastopol, CA O'Reilly Media 2016 1 Online-Ressource illustrations Text txt rdacontent Computermedien c rdamedia Online-Ressource cr rdacarrier Includes index. - Includes bibliographical references and index. - Online resource; title from title page (Safari, viewed June 13, 2016) Ready to use statistical and machine-learning techniques across large data sets? This practical guide shows you why the Hadoop ecosystem is perfect for the job. Instead of deployment, operations, or software development usually associated with distributed computing, you'll focus on particular analyses you can build, the data warehousing techniques that Hadoop provides, and higher order data workflows this framework can produce. Data scientists and analysts will learn how to perform a wide range of techniques, from writing MapReduce and Spark applications with Python to using advanced modeling and data management with Spark MLlib, Hive, and HBase. You'll also learn about the analytical processes and data systems available to build and empower data products that can handle-and actually require-huge amounts of data. Understand core concepts behind Hadoop and cluster computing Use design patterns and parallel analytical algorithms to create distributed data analysis jobs Learn about data management, mining, and warehousing in a distributed context using Apache Hive and HBase Use Sqoop and Apache Flume to ingest data from relational databases Program complex Hadoop and Spark applications with Apache Pig and Spark DataFrames Perform machine learning techniques such as classification, clustering, and collaborative filtering with Spark's MLlib.--Provided by publisher. 
Apache Hadoop Electronic data processing Distributed processing Cluster analysis Data processing Traitement réparti Classification automatique (Statistique) ; Informatique COMPUTERS ; Data Processing Cluster analysis ; Data processing Electronic data processing ; Distributed processing Kim, Jenny VerfasserIn aut 9781491913703 Erscheint auch als Druck-Ausgabe 9781491913703 |
spellingShingle | Bengfort, Benjamin Kim, Jenny Data analytics with Hadoop an introduction for data scientists Apache Hadoop Electronic data processing Distributed processing Cluster analysis Data processing Traitement réparti Classification automatique (Statistique) ; Informatique COMPUTERS ; Data Processing Cluster analysis ; Data processing Electronic data processing ; Distributed processing |
title | Data analytics with Hadoop an introduction for data scientists |
title_auth | Data analytics with Hadoop an introduction for data scientists |
title_exact_search | Data analytics with Hadoop an introduction for data scientists |
title_full | Data analytics with Hadoop an introduction for data scientists Benjamin Bengfort and Jenny Kim |
title_fullStr | Data analytics with Hadoop an introduction for data scientists Benjamin Bengfort and Jenny Kim |
title_full_unstemmed | Data analytics with Hadoop an introduction for data scientists Benjamin Bengfort and Jenny Kim |
title_short | Data analytics with Hadoop |
title_sort | data analytics with hadoop an introduction for data scientists |
title_sub | an introduction for data scientists |
topic | Apache Hadoop Electronic data processing Distributed processing Cluster analysis Data processing Traitement réparti Classification automatique (Statistique) ; Informatique COMPUTERS ; Data Processing Cluster analysis ; Data processing Electronic data processing ; Distributed processing |
topic_facet | Apache Hadoop Electronic data processing Distributed processing Cluster analysis Data processing Traitement réparti Classification automatique (Statistique) ; Informatique COMPUTERS ; Data Processing Cluster analysis ; Data processing Electronic data processing ; Distributed processing |
work_keys_str_mv | AT bengfortbenjamin dataanalyticswithhadoopanintroductionfordatascientists AT kimjenny dataanalyticswithhadoopanintroductionfordatascientists |