Delta Lake: the Definitive Guide
Main Authors: | Lee, Denny; Das, Tathagata; Jaiswal, Vini |
---|---|
Corporate Author: | Safari, an O'Reilly Media Company |
Format: | Electronic eBook |
Language: | English |
Published: | [Place of publication not identified] O'Reilly Media, Inc. 2022 |
Edition: | 1st edition. |
Links: | https://learning.oreilly.com/library/view/-/9781098189549/?ar |
Summary: | Analysis and machine learning models are only as good as the data they're built on. Querying processed data and getting insights from it requires a robust data pipeline--and an effective storage solution that ensures data quality, data integrity, and performance. This guide introduces you to Delta Lake, an open-source format that enables building a lakehouse architecture on top of existing storage systems such as S3, ADLS, GCS, and HDFS. Delta Lake enhances Apache Spark and makes it easy to store and manage massive amounts of complex data by supporting data integrity, data quality, and performance. Data engineers, data scientists, and data practitioners will learn how to build reliable data lakes and data pipelines at scale using Delta Lake. Understand key data reliability challenges and how to tackle them. Learn how to use Delta Lake to realize data reliability improvements. Concurrently run streaming and batch jobs against your data lake. Execute update, delete, and merge commands against your data lake. Use time travel to roll back and examine previous versions of your data. Learn best practices to build effective, high-quality end-to-end data pipelines for real-world use cases. Integrate with other data technologies like Presto, Athena, Redshift, and other BI tools. Learn how thousands of companies are processing exabytes of data per month with their lakehouse architecture using Delta Lake. |
Item Description: | Online resource; Title from title page (viewed April 25, 2022) |
Physical Description: | 1 online resource (106 pages) |
Staff View
MARC
LEADER | 00000nam a22000002c 4500 | ||
---|---|---|---|
001 | ZDB-30-ORH-109653114 | ||
003 | DE-627-1 | ||
005 | 20241107103330.0 | ||
007 | cr uuu---uuuuu | ||
008 | 241107s2022 xx |||||o 00| ||eng c | ||
035 | |a (DE-627-1)109653114 | ||
035 | |a (DE-599)KEP109653114 | ||
035 | |a (ORHE)9781098189549 | ||
035 | |a (DE-627-1)109653114 | ||
040 | |a DE-627 |b ger |c DE-627 |e rda | ||
041 | |a eng | ||
100 | 1 | |a Lee, Denny |e VerfasserIn |4 aut | |
245 | 1 | 0 | |a Delta Lake |b the Definitive Guide |c Lee, Denny |
250 | |a 1st edition. | ||
264 | 1 | |a [Erscheinungsort nicht ermittelbar] |b O'Reilly Media, Inc. |c 2022 | |
300 | |a 1 Online-Ressource (106 Seiten) | ||
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Online resource; Title from title page (viewed April 25, 2022) | ||
520 | |a Analysis and machine learning models are only as good as the data they're built on. Querying processed data and getting insights from it requires a robust data pipeline--and an effective storage solution that ensures data quality, data integrity, and performance. This guide introduces you to Delta Lake, an open-source format that enables building a lakehouse architecture on top of existing storage systems such as S3, ADLS, GCS, and HDFS. Delta Lake enhances Apache Spark and makes it easy to store and manage massive amounts of complex data by supporting data integrity, data quality, and performance. Data engineers, data scientists, and data practitioners will learn how to build reliable data lakes and data pipelines at scale using Delta Lake. Understand key data reliability challenges and how to tackle them Learn how to use Delta Lake to realize data reliability improvements Concurrently run streaming and batch jobs against your data lake Execute update, delete, and merge commands against your data lake Use time travel to roll back and examine previous versions of your data Learn best practices to build effective, high-quality end-to-end data pipelines for real world use cases Integrate with other data technologies like Presto, Athena, Redshift and other BI tools Learn how thousands of companies are processing exabytes of data per month with their lakehouse architecture using Delta Lake. | ||
700 | 1 | |a Das, Tathagata |e VerfasserIn |4 aut | |
700 | 1 | |a Jaiswal, Vini |e VerfasserIn |4 aut | |
710 | 2 | |a Safari, an O'Reilly Media Company. |e MitwirkendeR |4 ctb | |
966 | 4 | 0 | |l DE-91 |p ZDB-30-ORH |q TUM_PDA_ORH |u https://learning.oreilly.com/library/view/-/9781098189549/?ar |m X:ORHE |x Aggregator |z lizenzpflichtig |3 Volltext |
912 | |a ZDB-30-ORH | ||
951 | |a BO | ||
912 | |a ZDB-30-ORH | ||
049 | |a DE-91 |
Record in the Search Index
DE-BY-TUM_katkey | ZDB-30-ORH-109653114 |
---|---|
_version_ | 1831287140452925441 |
adam_text | |
any_adam_object | |
author | Lee, Denny Das, Tathagata Jaiswal, Vini |
author_corporate | Safari, an O'Reilly Media Company |
author_corporate_role | ctb |
author_facet | Lee, Denny Das, Tathagata Jaiswal, Vini Safari, an O'Reilly Media Company |
author_role | aut aut aut |
author_sort | Lee, Denny |
author_variant | d l dl t d td v j vj |
building | Verbundindex |
bvnumber | localTUM |
collection | ZDB-30-ORH |
ctrlnum | (DE-627-1)109653114 (DE-599)KEP109653114 (ORHE)9781098189549 |
edition | 1st edition. |
format | Electronic eBook |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>02611nam a22003492c 4500</leader><controlfield tag="001">ZDB-30-ORH-109653114</controlfield><controlfield tag="003">DE-627-1</controlfield><controlfield tag="005">20241107103330.0</controlfield><controlfield tag="007">cr uuu---uuuuu</controlfield><controlfield tag="008">241107s2022 xx |||||o 00| ||eng c</controlfield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627-1)109653114</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)KEP109653114</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(ORHE)9781098189549</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627-1)109653114</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Lee, Denny</subfield><subfield code="e">VerfasserIn</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Delta Lake</subfield><subfield code="b">the Definitive Guide</subfield><subfield code="c">Lee, Denny</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">1st edition.</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">[Erscheinungsort nicht ermittelbar]</subfield><subfield code="b">O'Reilly Media, Inc.</subfield><subfield code="c">2022</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">1 Online-Ressource (106 Seiten)</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield 
code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">Computermedien</subfield><subfield code="b">c</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Online-Ressource</subfield><subfield code="b">cr</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Online resource; Title from title page (viewed April 25, 2022)</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Analysis and machine learning models are only as good as the data they're built on. Querying processed data and getting insights from it requires a robust data pipeline--and an effective storage solution that ensures data quality, data integrity, and performance. This guide introduces you to Delta Lake, an open-source format that enables building a lakehouse architecture on top of existing storage systems such as S3, ADLS, GCS, and HDFS. Delta Lake enhances Apache Spark and makes it easy to store and manage massive amounts of complex data by supporting data integrity, data quality, and performance. Data engineers, data scientists, and data practitioners will learn how to build reliable data lakes and data pipelines at scale using Delta Lake. 
Understand key data reliability challenges and how to tackle them Learn how to use Delta Lake to realize data reliability improvements Concurrently run streaming and batch jobs against your data lake Execute update, delete, and merge commands against your data lake Use time travel to roll back and examine previous versions of your data Learn best practices to build effective, high-quality end-to-end data pipelines for real world use cases Integrate with other data technologies like Presto, Athena, Redshift and other BI tools Learn how thousands of companies are processing exabytes of data per month with their lakehouse architecture using Delta Lake.</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Das, Tathagata</subfield><subfield code="e">VerfasserIn</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Jaiswal, Vini</subfield><subfield code="e">VerfasserIn</subfield><subfield code="4">aut</subfield></datafield><datafield tag="710" ind1="2" ind2=" "><subfield code="a">Safari, an O'Reilly Media Company.</subfield><subfield code="e">MitwirkendeR</subfield><subfield code="4">ctb</subfield></datafield><datafield tag="966" ind1="4" ind2="0"><subfield code="l">DE-91</subfield><subfield code="p">ZDB-30-ORH</subfield><subfield code="q">TUM_PDA_ORH</subfield><subfield code="u">https://learning.oreilly.com/library/view/-/9781098189549/?ar</subfield><subfield code="m">X:ORHE</subfield><subfield code="x">Aggregator</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">ZDB-30-ORH</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">BO</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">ZDB-30-ORH</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-91</subfield></datafield></record></collection> |
id | ZDB-30-ORH-109653114 |
illustrated | Not Illustrated |
indexdate | 2025-05-05T13:25:11Z |
institution | BVB |
language | English |
open_access_boolean | |
owner | DE-91 DE-BY-TUM |
owner_facet | DE-91 DE-BY-TUM |
physical | 1 Online-Ressource (106 Seiten) |
psigel | ZDB-30-ORH TUM_PDA_ORH ZDB-30-ORH |
publishDate | 2022 |
publishDateSearch | 2022 |
publishDateSort | 2022 |
publisher | O'Reilly Media, Inc. |
record_format | marc |
spelling | Lee, Denny VerfasserIn aut Delta Lake the Definitive Guide Lee, Denny 1st edition. [Erscheinungsort nicht ermittelbar] O'Reilly Media, Inc. 2022 1 Online-Ressource (106 Seiten) Text txt rdacontent Computermedien c rdamedia Online-Ressource cr rdacarrier Online resource; Title from title page (viewed April 25, 2022) Analysis and machine learning models are only as good as the data they're built on. Querying processed data and getting insights from it requires a robust data pipeline--and an effective storage solution that ensures data quality, data integrity, and performance. This guide introduces you to Delta Lake, an open-source format that enables building a lakehouse architecture on top of existing storage systems such as S3, ADLS, GCS, and HDFS. Delta Lake enhances Apache Spark and makes it easy to store and manage massive amounts of complex data by supporting data integrity, data quality, and performance. Data engineers, data scientists, and data practitioners will learn how to build reliable data lakes and data pipelines at scale using Delta Lake. Understand key data reliability challenges and how to tackle them Learn how to use Delta Lake to realize data reliability improvements Concurrently run streaming and batch jobs against your data lake Execute update, delete, and merge commands against your data lake Use time travel to roll back and examine previous versions of your data Learn best practices to build effective, high-quality end-to-end data pipelines for real world use cases Integrate with other data technologies like Presto, Athena, Redshift and other BI tools Learn how thousands of companies are processing exabytes of data per month with their lakehouse architecture using Delta Lake. Das, Tathagata VerfasserIn aut Jaiswal, Vini VerfasserIn aut Safari, an O'Reilly Media Company. MitwirkendeR ctb |
spellingShingle | Lee, Denny Das, Tathagata Jaiswal, Vini Delta Lake the Definitive Guide |
title | Delta Lake the Definitive Guide |
title_auth | Delta Lake the Definitive Guide |
title_exact_search | Delta Lake the Definitive Guide |
title_full | Delta Lake the Definitive Guide Lee, Denny |
title_fullStr | Delta Lake the Definitive Guide Lee, Denny |
title_full_unstemmed | Delta Lake the Definitive Guide Lee, Denny |
title_short | Delta Lake |
title_sort | delta lake the definitive guide |
title_sub | the Definitive Guide |
work_keys_str_mv | AT leedenny deltalakethedefinitiveguide AT dastathagata deltalakethedefinitiveguide AT jaiswalvini deltalakethedefinitiveguide AT safarianoreillymediacompany deltalakethedefinitiveguide |