Learning Spark, 2nd Edition
Contributors: | Damji, Jules; Lee, Denny; Wenig, Brooke; Das, Tathagata |
---|---|
Corporate body: | Safari, an O'Reilly Media Company |
Format: | Electronic e-book |
Language: | English |
Published: | [Place of publication not identified] O'Reilly Media, Inc. 2020 |
Edition: | 2nd edition. |
Links: | https://learning.oreilly.com/library/view/-/9781492050032/?ar |
Summary: | Data is getting bigger, arriving faster, and coming in varied formats, and it all needs to be processed at scale for analytics or machine learning. How can you process such varied data workloads efficiently? Enter Apache Spark. Updated to emphasize new features in Spark 2.x, this second edition shows data engineers and scientists why structure and unification in Spark matter. Specifically, this book explains how to perform simple and complex data analytics and employ machine-learning algorithms. Through discourse, code snippets, and notebooks, you'll be able to: learn the Python, SQL, Scala, or Java high-level APIs (DataFrames and Datasets); peek under the hood of the Spark SQL engine to understand Spark transformations and performance; inspect, tune, and debug your Spark operations with Spark configurations and the Spark UI; connect to data sources such as JSON, Parquet, CSV, Avro, ORC, Hive, S3, or Kafka; perform analytics on batch and streaming data using Structured Streaming; build reliable data pipelines with open source Delta Lake and Spark; develop machine learning pipelines with MLlib and productionize models using MLflow; and use the open source pandas framework Koalas and Spark for data transformation and feature engineering. |
Description: | Online resource; title from title page (viewed January 25, 2020) |
Extent: | 1 online resource (300 pages) |
Internal format
MARC
LEADER | 00000cam a22000002 4500 | ||
---|---|---|---|
001 | ZDB-30-ORH-048571644 | ||
003 | DE-627-1 | ||
005 | 20240228120852.0 | ||
007 | cr uuu---uuuuu | ||
008 | 191206s2020 xx |||||o 00| ||eng c | ||
035 | |a (DE-627-1)048571644 | ||
035 | |a (DE-599)KEP048571644 | ||
035 | |a (ORHE)9781492050032 | ||
035 | |a (DE-627-1)048571644 | ||
040 | |a DE-627 |b ger |c DE-627 |e rda | ||
041 | |a eng | ||
100 | 1 | |a Damji, Jules |e VerfasserIn |4 aut | |
245 | 1 | 0 | |a Learning Spark, 2nd Edition |c Damji, Jules |
250 | |a 2nd edition. | ||
264 | 1 | |a [Erscheinungsort nicht ermittelbar] |b O'Reilly Media, Inc. |c 2020 | |
300 | |a 1 Online-Ressource (300 Seiten) | ||
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Online resource; Title from title page (viewed January 25, 2020) | ||
520 | |a Data is getting bigger, arriving faster, and coming in varied formats-and it all needs to be processed at scale for analytics or machine learning. How can you process such varied data workloads efficiently? Enter Apache Spark. Updated to emphasize new features in Spark 2.x., this second edition shows data engineers and scientists why structure and unification in Spark matters. Specifically, this book explains how to perform simple and complex data analytics and employ machine-learning algorithms. Through discourse, code snippets, and notebooks, you'll be able to: Learn Python, SQL, Scala, or Java high-level APIs: DataFrames and Datasets Peek under the hood of the Spark SQL engine to understand Spark transformations and performance Inspect, tune, and debug your Spark operations with Spark configurations and Spark UI Connect to data sources: JSON, Parquet, CSV, Avro, ORC, Hive, S3, or Kafka Perform analytics on batch and streaming data using Structured Streaming Build reliable data pipelines with open source Delta Lake and Spark Develop machine learning pipelines with MLlib and productionize models using MLflow Use open source Pandas framework Koalas and Spark for data transformation and feature engineering. | ||
700 | 1 | |a Lee, Denny |e VerfasserIn |4 aut | |
700 | 1 | |a Wenig, Brooke |e VerfasserIn |4 aut | |
700 | 1 | |a Das, Tathagata |e VerfasserIn |4 aut | |
710 | 2 | |a Safari, an O'Reilly Media Company. |e MitwirkendeR |4 ctb | |
966 | 4 | 0 | |l DE-91 |p ZDB-30-ORH |q TUM_PDA_ORH |u https://learning.oreilly.com/library/view/-/9781492050032/?ar |m X:ORHE |x Aggregator |z lizenzpflichtig |3 Volltext |
912 | |a ZDB-30-ORH | ||
912 | |a ZDB-30-ORH | ||
951 | |a BO | ||
912 | |a ZDB-30-ORH | ||
049 | |a DE-91 |
Record in the search index
DE-BY-TUM_katkey | ZDB-30-ORH-048571644 |
---|---|
_version_ | 1821494849615953920 |
adam_text | |
any_adam_object | |
author | Damji, Jules Lee, Denny Wenig, Brooke Das, Tathagata |
author_corporate | Safari, an O'Reilly Media Company |
author_corporate_role | ctb |
author_facet | Damji, Jules Lee, Denny Wenig, Brooke Das, Tathagata Safari, an O'Reilly Media Company |
author_role | aut aut aut aut |
author_sort | Damji, Jules |
author_variant | j d jd d l dl b w bw t d td |
building | Verbundindex |
bvnumber | localTUM |
collection | ZDB-30-ORH |
ctrlnum | (DE-627-1)048571644 (DE-599)KEP048571644 (ORHE)9781492050032 |
edition | 2nd edition. |
format | Electronic eBook |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>02500cam a22003732 4500</leader><controlfield tag="001">ZDB-30-ORH-048571644</controlfield><controlfield tag="003">DE-627-1</controlfield><controlfield tag="005">20240228120852.0</controlfield><controlfield tag="007">cr uuu---uuuuu</controlfield><controlfield tag="008">191206s2020 xx |||||o 00| ||eng c</controlfield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627-1)048571644</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)KEP048571644</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(ORHE)9781492050032</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627-1)048571644</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Damji, Jules</subfield><subfield code="e">VerfasserIn</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Learning Spark, 2nd Edition</subfield><subfield code="c">Damji, Jules</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">2nd edition.</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">[Erscheinungsort nicht ermittelbar]</subfield><subfield code="b">O'Reilly Media, Inc.</subfield><subfield code="c">2020</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">1 Online-Ressource (300 Seiten)</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield 
code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">Computermedien</subfield><subfield code="b">c</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Online-Ressource</subfield><subfield code="b">cr</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Online resource; Title from title page (viewed January 25, 2020)</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Data is getting bigger, arriving faster, and coming in varied formats-and it all needs to be processed at scale for analytics or machine learning. How can you process such varied data workloads efficiently? Enter Apache Spark. Updated to emphasize new features in Spark 2.x., this second edition shows data engineers and scientists why structure and unification in Spark matters. Specifically, this book explains how to perform simple and complex data analytics and employ machine-learning algorithms. 
Through discourse, code snippets, and notebooks, you'll be able to: Learn Python, SQL, Scala, or Java high-level APIs: DataFrames and Datasets Peek under the hood of the Spark SQL engine to understand Spark transformations and performance Inspect, tune, and debug your Spark operations with Spark configurations and Spark UI Connect to data sources: JSON, Parquet, CSV, Avro, ORC, Hive, S3, or Kafka Perform analytics on batch and streaming data using Structured Streaming Build reliable data pipelines with open source Delta Lake and Spark Develop machine learning pipelines with MLlib and productionize models using MLflow Use open source Pandas framework Koalas and Spark for data transformation and feature engineering.</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Lee, Denny</subfield><subfield code="e">VerfasserIn</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Wenig, Brooke</subfield><subfield code="e">VerfasserIn</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Das, Tathagata</subfield><subfield code="e">VerfasserIn</subfield><subfield code="4">aut</subfield></datafield><datafield tag="710" ind1="2" ind2=" "><subfield code="a">Safari, an O'Reilly Media Company.</subfield><subfield code="e">MitwirkendeR</subfield><subfield code="4">ctb</subfield></datafield><datafield tag="966" ind1="4" ind2="0"><subfield code="l">DE-91</subfield><subfield code="p">ZDB-30-ORH</subfield><subfield code="q">TUM_PDA_ORH</subfield><subfield code="u">https://learning.oreilly.com/library/view/-/9781492050032/?ar</subfield><subfield code="m">X:ORHE</subfield><subfield code="x">Aggregator</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">ZDB-30-ORH</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield 
code="a">ZDB-30-ORH</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">BO</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">ZDB-30-ORH</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-91</subfield></datafield></record></collection> |
id | ZDB-30-ORH-048571644 |
illustrated | Not Illustrated |
indexdate | 2025-01-17T11:20:55Z |
institution | BVB |
language | English |
open_access_boolean | |
owner | DE-91 DE-BY-TUM |
owner_facet | DE-91 DE-BY-TUM |
physical | 1 Online-Ressource (300 Seiten) |
psigel | ZDB-30-ORH TUM_PDA_ORH ZDB-30-ORH |
publishDate | 2020 |
publishDateSearch | 2020 |
publishDateSort | 2020 |
publisher | O'Reilly Media, Inc. |
record_format | marc |
spelling | Damji, Jules VerfasserIn aut Learning Spark, 2nd Edition Damji, Jules 2nd edition. [Erscheinungsort nicht ermittelbar] O'Reilly Media, Inc. 2020 1 Online-Ressource (300 Seiten) Text txt rdacontent Computermedien c rdamedia Online-Ressource cr rdacarrier Online resource; Title from title page (viewed January 25, 2020) Data is getting bigger, arriving faster, and coming in varied formats-and it all needs to be processed at scale for analytics or machine learning. How can you process such varied data workloads efficiently? Enter Apache Spark. Updated to emphasize new features in Spark 2.x., this second edition shows data engineers and scientists why structure and unification in Spark matters. Specifically, this book explains how to perform simple and complex data analytics and employ machine-learning algorithms. Through discourse, code snippets, and notebooks, you'll be able to: Learn Python, SQL, Scala, or Java high-level APIs: DataFrames and Datasets Peek under the hood of the Spark SQL engine to understand Spark transformations and performance Inspect, tune, and debug your Spark operations with Spark configurations and Spark UI Connect to data sources: JSON, Parquet, CSV, Avro, ORC, Hive, S3, or Kafka Perform analytics on batch and streaming data using Structured Streaming Build reliable data pipelines with open source Delta Lake and Spark Develop machine learning pipelines with MLlib and productionize models using MLflow Use open source Pandas framework Koalas and Spark for data transformation and feature engineering. Lee, Denny VerfasserIn aut Wenig, Brooke VerfasserIn aut Das, Tathagata VerfasserIn aut Safari, an O'Reilly Media Company. MitwirkendeR ctb |
spellingShingle | Damji, Jules Lee, Denny Wenig, Brooke Das, Tathagata Learning Spark, 2nd Edition |
title | Learning Spark, 2nd Edition |
title_auth | Learning Spark, 2nd Edition |
title_exact_search | Learning Spark, 2nd Edition |
title_full | Learning Spark, 2nd Edition Damji, Jules |
title_fullStr | Learning Spark, 2nd Edition Damji, Jules |
title_full_unstemmed | Learning Spark, 2nd Edition Damji, Jules |
title_short | Learning Spark, 2nd Edition |
title_sort | learning spark 2nd edition |
work_keys_str_mv | AT damjijules learningspark2ndedition AT leedenny learningspark2ndedition AT wenigbrooke learningspark2ndedition AT dastathagata learningspark2ndedition AT safarianoreillymediacompany learningspark2ndedition |