Availability: Interactive Spark using PySpark | Technische Universität München

Interactive Spark using PySpark:


Bibliographic details
Contributors:	Bengfort, Benjamin (Author), Kim, Jenny (Author)
Format:	Electronic E-Book
Language:	English
Published:	[Place of publication not identified] O'Reilly Media, Inc. 2016
Edition:	1st edition
Subjects:	Python (Computer program language); Python (Langage de programmation)
Links:	https://learning.oreilly.com/library/view/-/9781491965313/?ar
Summary:	Apache Spark is an in-memory framework that allows data scientists to explore and interact with big data much more quickly than with Hadoop. Python users can work with Spark using an interactive shell called PySpark. Why is it important? PySpark makes the large-scale data processing capabilities of Apache Spark accessible to data scientists who are more familiar with Python than Scala or Java. This also allows for reuse of a wide variety of Python libraries for machine learning, data visualization, numerical analysis, etc. What you'll learn, and how you can apply it: Compare the different components provided by Spark and the use cases they fit. Learn how to use RDDs (resilient distributed datasets) with PySpark. Write Spark applications in Python and submit them to the cluster as Spark jobs. Get an introduction to the Spark computing framework. Apply this approach to a worked example to determine the most frequent airline delays in a specific month and year. This lesson is for you because: You're a data scientist, familiar with Python coding, who needs to get up and running with PySpark. You're a Python developer who needs to leverage the distributed computing resources available on a Hadoop cluster without learning Java or Scala first. Prerequisites: Familiarity with writing Python applications. Some familiarity with bash command-line operations. Basic understanding of how to use simple functional programming constructs in Python, such as closures, lambdas, maps, etc. Materials or downloads needed in advance: Apache Spark. This lesson is taken from Data Analytics with Hadoop by Jenny Kim and Benjamin Bengfort.
Extent:	1 online resource (20 pages)
ISBN:	9781491965313 1491965312

Internal format

MARC


LEADER	00000cam a22000002 4500
001	ZDB-30-ORH-048560308
003	DE-627-1
005	20240228120417.0
007	cr uuu---uuuuu
008	191206s2016 xx |||||o 00| ||eng c
020			|a 9781491965313 |9 978-1-4919-6531-3
020			|a 1491965312 |9 1-4919-6531-2
035			|a (DE-627-1)048560308
035			|a (DE-599)KEP048560308
035			|a (ORHE)9781491965313
035			|a (DE-627-1)048560308
040			|a DE-627 |b ger |c DE-627 |e rda
041			|a eng
100	1		|a Bengfort, Benjamin |e VerfasserIn |4 aut
245	1	0	|a Interactive Spark using PySpark |c Bengfort, Benjamin
250			|a 1st edition
264		1	|a [Erscheinungsort nicht ermittelbar] |b O'Reilly Media, Inc. |c 2016
300			|a 1 Online-Ressource (20 Seiten)
336			|a Text |b txt |2 rdacontent
337			|a Computermedien |b c |2 rdamedia
338			|a Online-Ressource |b cr |2 rdacarrier
520			|a Apache Spark is an in-memory framework that allows data scientists to explore and interact with big data much more quickly than with Hadoop. Python users can work with Spark using an interactive shell called PySpark. Why is it important? PySpark makes the large-scale data processing capabilities of Apache Spark accessible to data scientists who are more familiar with Python than Scala or Java. This also allows for reuse of a wide variety of Python libraries for machine learning, data visualization, numerical analysis, etc. What you'll learn--and how you can apply it Compare the different components provided by Spark, and what use cases they fit. Learn how to use RDDs (resilient distributed datasets) with PySpark. Write Spark applications in Python and submit them to the cluster as Spark jobs. Get an introduction to the Spark computing framework. Apply this approach to a worked example to determine the most frequent airline delays in a specific month and year. This lesson is for you because ... You're a data scientist, familiar with Python coding, who needs to get up and running with PySpark You're a Python developer who needs to leverage the distributed computing resources available on a Hadoop cluster, without learning Java or Scala first Prerequisites Familiarity with writing Python applications Some familiarity with bash command-line operations Basic understanding of how to use simple functional programming constructs in Python, such as closures, lambdas, maps, etc. Materials or downloads needed in advance Apache Spark This lesson is taken from Data Analytics with Hadoop by Jenny Kim and Benjamin Bengfort.
650		0	|a Python (Computer program language)
650		4	|a Python (Langage de programmation)
650		4	|a Python (Computer program language)
700	1		|a Kim, Jenny |e VerfasserIn |4 aut
966	4	0	|l DE-91 |p ZDB-30-ORH |q TUM_PDA_ORH |u https://learning.oreilly.com/library/view/-/9781491965313/?ar |m X:ORHE |x Aggregator |z lizenzpflichtig |3 Volltext
912			|a ZDB-30-ORH
912			|a ZDB-30-ORH
951			|a BO
912			|a ZDB-30-ORH
049			|a DE-91

Record in the search index

DE-BY-TUM_katkey	ZDB-30-ORH-048560308
_version_	1821494850788261888
adam_text
any_adam_object
author	Bengfort, Benjamin Kim, Jenny
author_facet	Bengfort, Benjamin Kim, Jenny
author_role	aut aut
author_sort	Bengfort, Benjamin
author_variant	b b bb j k jk
building	Verbundindex
bvnumber	localTUM
collection	ZDB-30-ORH
ctrlnum	(DE-627-1)048560308 (DE-599)KEP048560308 (ORHE)9781491965313
edition	1st edition
format	Electronic eBook
id	ZDB-30-ORH-048560308
illustrated	Not Illustrated
indexdate	2025-01-17T11:20:56Z
institution	BVB
isbn	9781491965313 1491965312
language	English
open_access_boolean
owner	DE-91 DE-BY-TUM
owner_facet	DE-91 DE-BY-TUM
physical	1 Online-Ressource (20 Seiten)
psigel	ZDB-30-ORH TUM_PDA_ORH ZDB-30-ORH
publishDate	2016
publishDateSearch	2016
publishDateSort	2016
publisher	O'Reilly Media, Inc.
record_format	marc
spellingShingle	Bengfort, Benjamin Kim, Jenny Interactive Spark using PySpark Python (Computer program language) Python (Langage de programmation)
title	Interactive Spark using PySpark
title_auth	Interactive Spark using PySpark
title_exact_search	Interactive Spark using PySpark
title_full	Interactive Spark using PySpark Bengfort, Benjamin
title_fullStr	Interactive Spark using PySpark Bengfort, Benjamin
title_full_unstemmed	Interactive Spark using PySpark Bengfort, Benjamin
title_short	Interactive Spark using PySpark
title_sort	interactive spark using pyspark
topic	Python (Computer program language) Python (Langage de programmation)
topic_facet	Python (Computer program language) Python (Langage de programmation)
work_keys_str_mv	AT bengfortbenjamin interactivesparkusingpyspark AT kimjenny interactivesparkusingpyspark

Availability

Read online