Availability: Data analytics with Hadoop | Technische Universität München

Data analytics with Hadoop: an introduction for data scientists

Bibliographic details
Contributors:	Bengfort, Benjamin (author); Kim, Jenny (author)
Format:	Electronic (e-book)
Language:	English
Published:	Sebastopol, CA: O'Reilly Media, 2016
Edition:	First edition.
Subjects:	Apache Hadoop | Electronic data processing > Distributed processing | Cluster analysis > Data processing | Traitement réparti | Classification automatique (Statistique) ; Informatique | COMPUTERS ; Data Processing | Cluster analysis ; Data processing | Electronic data processing ; Distributed processing
Links:	https://learning.oreilly.com/library/view/-/9781491913734/?ar
Summary:	Ready to use statistical and machine-learning techniques across large data sets? This practical guide shows you why the Hadoop ecosystem is perfect for the job. Instead of deployment, operations, or software development usually associated with distributed computing, you'll focus on particular analyses you can build, the data warehousing techniques that Hadoop provides, and higher-order data workflows this framework can produce. Data scientists and analysts will learn how to perform a wide range of techniques, from writing MapReduce and Spark applications with Python to using advanced modeling and data management with Spark MLlib, Hive, and HBase. You'll also learn about the analytical processes and data systems available to build and empower data products that can handle, and actually require, huge amounts of data. Understand core concepts behind Hadoop and cluster computing. Use design patterns and parallel analytical algorithms to create distributed data analysis jobs. Learn about data management, mining, and warehousing in a distributed context using Apache Hive and HBase. Use Sqoop and Apache Flume to ingest data from relational databases. Program complex Hadoop and Spark applications with Apache Pig and Spark DataFrames. Perform machine learning techniques such as classification, clustering, and collaborative filtering with Spark's MLlib. --Provided by publisher.
Description:	Includes bibliographical references and index. - Online resource; title from title page (Safari, viewed June 13, 2016)
Extent:	1 online resource, illustrations
ISBN:	9781491913765 1491913762 9781491913758 1491913754

Internal format

MARC


LEADER	00000cam a22000002 4500
001	ZDB-30-ORH-047612177
003	DE-627-1
005	20240228120121.0
007	cr uuu---uuuuu
008	191023s2016 xx |||||o 00| ||eng c
020			|a 9781491913765 |c electronic bk. |9 978-1-4919-1376-5
020			|a 1491913762 |c electronic bk. |9 1-4919-1376-2
020			|a 9781491913758 |c electronic bk. |9 978-1-4919-1375-8
020			|a 1491913754 |c electronic bk. |9 1-4919-1375-4
035			|a (DE-627-1)047612177
035			|a (DE-599)KEP047612177
035			|a (ORHE)9781491913734
035			|a (DE-627-1)047612177
040			|a DE-627 |b ger |c DE-627 |e rda
041			|a eng
072		7	|a COM |2 bisacsh
082	0		|a 004.36 |2 23
100	1		|a Bengfort, Benjamin |e VerfasserIn |4 aut
245	1	0	|a Data analytics with Hadoop |b an introduction for data scientists |c Benjamin Bengfort and Jenny Kim
250			|a First edition.
264		1	|a Sebastopol, CA |b O'Reilly Media |c 2016
300			|a 1 Online-Ressource |b illustrations
336			|a Text |b txt |2 rdacontent
337			|a Computermedien |b c |2 rdamedia
338			|a Online-Ressource |b cr |2 rdacarrier
500			|a Includes index. - Includes bibliographical references and index. - Online resource; title from title page (Safari, viewed June 13, 2016)
520			|a Ready to use statistical and machine-learning techniques across large data sets? This practical guide shows you why the Hadoop ecosystem is perfect for the job. Instead of deployment, operations, or software development usually associated with distributed computing, you'll focus on particular analyses you can build, the data warehousing techniques that Hadoop provides, and higher order data workflows this framework can produce. Data scientists and analysts will learn how to perform a wide range of techniques, from writing MapReduce and Spark applications with Python to using advanced modeling and data management with Spark MLlib, Hive, and HBase. You'll also learn about the analytical processes and data systems available to build and empower data products that can handle-and actually require-huge amounts of data. Understand core concepts behind Hadoop and cluster computing Use design patterns and parallel analytical algorithms to create distributed data analysis jobs Learn about data management, mining, and warehousing in a distributed context using Apache Hive and HBase Use Sqoop and Apache Flume to ingest data from relational databases Program complex Hadoop and Spark applications with Apache Pig and Spark DataFrames Perform machine learning techniques such as classification, clustering, and collaborative filtering with Spark's MLlib.--Provided by publisher.
630	2	0	|a Apache Hadoop
650		0	|a Electronic data processing |x Distributed processing
650		0	|a Cluster analysis |x Data processing
650		4	|a Apache Hadoop
650		4	|a Traitement réparti
650		4	|a Classification automatique (Statistique) ; Informatique
650		4	|a COMPUTERS ; Data Processing
650		4	|a Cluster analysis ; Data processing
650		4	|a Electronic data processing ; Distributed processing
700	1		|a Kim, Jenny |e VerfasserIn |4 aut
776	1		|z 9781491913703
776	0	8	|i Erscheint auch als |n Druck-Ausgabe |z 9781491913703
966	4	0	|l DE-91 |p ZDB-30-ORH |q TUM_PDA_ORH |u https://learning.oreilly.com/library/view/-/9781491913734/?ar |m X:ORHE |x Aggregator |z lizenzpflichtig |3 Volltext
912			|a ZDB-30-ORH
912			|a ZDB-30-ORH
951			|a BO
912			|a ZDB-30-ORH
049			|a DE-91

Record in search index

DE-BY-TUM_katkey	ZDB-30-ORH-047612177
_version_	1821494872857640960
adam_text
any_adam_object
author	Bengfort, Benjamin Kim, Jenny
author_facet	Bengfort, Benjamin Kim, Jenny
author_role	aut aut
author_sort	Bengfort, Benjamin
author_variant	b b bb j k jk
building	Verbundindex
bvnumber	localTUM
collection	ZDB-30-ORH
ctrlnum	(DE-627-1)047612177 (DE-599)KEP047612177 (ORHE)9781491913734
dewey-full	004.36
dewey-hundreds	000 - Computer science, information, general works
dewey-ones	004 - Computer science
dewey-raw	004.36
dewey-search	004.36
dewey-sort	14.36
dewey-tens	000 - Computer science, information, general works
discipline	Informatik
edition	First edition.
format	Electronic eBook
id	ZDB-30-ORH-047612177
illustrated	Illustrated
indexdate	2025-01-17T11:21:17Z
institution	BVB
isbn	9781491913765 1491913762 9781491913758 1491913754
language	English
open_access_boolean
owner	DE-91 DE-BY-TUM
owner_facet	DE-91 DE-BY-TUM
physical	1 Online-Ressource illustrations
psigel	ZDB-30-ORH TUM_PDA_ORH ZDB-30-ORH
publishDate	2016
publishDateSearch	2016
publishDateSort	2016
publisher	O'Reilly Media
record_format	marc
title	Data analytics with Hadoop an introduction for data scientists
title_auth	Data analytics with Hadoop an introduction for data scientists
title_exact_search	Data analytics with Hadoop an introduction for data scientists
title_full	Data analytics with Hadoop an introduction for data scientists Benjamin Bengfort and Jenny Kim
title_fullStr	Data analytics with Hadoop an introduction for data scientists Benjamin Bengfort and Jenny Kim
title_full_unstemmed	Data analytics with Hadoop an introduction for data scientists Benjamin Bengfort and Jenny Kim
title_short	Data analytics with Hadoop
title_sort	data analytics with hadoop an introduction for data scientists
title_sub	an introduction for data scientists
topic	Apache Hadoop Electronic data processing Distributed processing Cluster analysis Data processing Traitement réparti Classification automatique (Statistique) ; Informatique COMPUTERS ; Data Processing Cluster analysis ; Data processing Electronic data processing ; Distributed processing
topic_facet	Apache Hadoop Electronic data processing Distributed processing Cluster analysis Data processing Traitement réparti Classification automatique (Statistique) ; Informatique COMPUTERS ; Data Processing Cluster analysis ; Data processing Electronic data processing ; Distributed processing
work_keys_str_mv	AT bengfortbenjamin dataanalyticswithhadoopanintroductionfordatascientists AT kimjenny dataanalyticswithhadoopanintroductionfordatascientists
