Preprocessing unstructured data for LLMs and RAG systems:
Other contributors: | Dichone, Paulo |
---|---|
Format: | Electronic Video |
Language: | English |
Published: | [Birmingham, United Kingdom] Packt Publishing [2024] |
Edition: | [First edition]. |
Subjects: | |
Links: | https://learning.oreilly.com/library/view/-/9781836642930/?ar |
Summary: | This course offers an in-depth exploration of preprocessing unstructured data for large language models and retrieval-augmented generation systems. You'll start by setting up your development environment and configuring essential APIs, ensuring a solid technical foundation. Next, you'll dive into data preprocessing techniques, tackling challenges like content extraction, cleaning, and data normalization, making your data ready for advanced AI models. As you progress, the course provides hands-on experience with various document types such as PDFs, HTML, and PPTX files. You'll learn to transform these unstructured formats into structured data that AI systems can easily process. Advanced modules cover chunking, metadata extraction, and handling complex documents using cutting-edge techniques like visual transformers and document layout detectors. The final section guides you in building a complete RAG system using the skills acquired throughout the course. You'll preprocess diverse documents, implement semantic similarity searches, and save elements to a vector database. By the end, you'll be equipped to create intelligent data pipelines and interact with your documents using AI, significantly enhancing your data-driven projects. |
Description: | Online resource; title from title details screen (O'Reilly, viewed October 9, 2024) |
Extent: | 1 online resource (1 video file (3 hr., 2 min.)) sound, color. |
ISBN: | 9781836642930 1836642938 |
Internal format
MARC
LEADER | 00000ngm a22000002 4500 | ||
---|---|---|---|
001 | ZDB-30-ORH-109654609 | ||
003 | DE-627-1 | ||
005 | 20241107103332.0 | ||
006 | m o | | | ||
007 | cr uuu---uuuuu | ||
008 | 241107s2024 xx ||| |o o ||eng c | ||
020 | |a 9781836642930 |c electronic video |9 978-1-83664-293-0 | ||
020 | |a 1836642938 |c electronic video |9 1-83664-293-8 | ||
035 | |a (DE-627-1)109654609 | ||
035 | |a (DE-599)KEP109654609 | ||
035 | |a (ORHE)9781836642930 | ||
035 | |a (DE-627-1)109654609 | ||
040 | |a DE-627 |b ger |c DE-627 |e rda | ||
041 | |a eng | ||
082 | 0 | |a 006.3/5 |2 23/eng/20241009 | |
245 | 1 | 0 | |a Preprocessing unstructured data for LLMs and RAG systems |
250 | |a [First edition]. | ||
264 | 1 | |a [Birmingham, United Kingdom] |b Packt Publishing |c [2024] | |
300 | |a 1 Online-Ressource (1 video file (3 hr., 2 min.)) |b sound, color. | ||
336 | |a zweidimensionales bewegtes Bild |b tdi |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Online resource; title from title details screen (O'Reilly, viewed October 9, 2024) | ||
520 | |a This course offers an in-depth exploration of preprocessing unstructured data for large language models and retrieval-augmented generation systems. You'll start by setting up your development environment and configuring essential APIs, ensuring a solid technical foundation. Next, you'll dive into data preprocessing techniques, tackling challenges like content extraction, cleaning, and data normalization, making your data ready for advanced AI models. As you progress, the course provides hands-on experience with various document types such as PDFs, HTML, and PPTX files. You'll learn to transform these unstructured formats into structured data that AI systems can easily process. Advanced modules cover chunking, metadata extraction, and handling complex documents using cutting-edge techniques like visual transformers and document layout detectors. The final section guides you in building a complete RAG system using the skills acquired throughout the course. You'll preprocess diverse documents, implement semantic similarity searches, and save elements to a vector database. By the end, you'll be equipped to create intelligent data pipelines and interact with your documents using AI, significantly enhancing your data-driven projects. | ||
650 | 0 | |a Natural language processing (Computer science) | |
650 | 0 | |a Artificial intelligence | |
650 | 0 | |a Machine learning | |
650 | 4 | |a Traitement automatique des langues naturelles | |
650 | 4 | |a Intelligence artificielle | |
650 | 4 | |a Apprentissage automatique | |
650 | 4 | |a artificial intelligence | |
650 | 4 | |a Instructional films | |
650 | 4 | |a Nonfiction films | |
650 | 4 | |a Internet videos | |
650 | 4 | |a Films de formation | |
650 | 4 | |a Films autres que de fiction | |
650 | 4 | |a Vidéos sur Internet | |
700 | 1 | |a Dichone, Paulo |e MitwirkendeR |4 ctb | |
710 | 2 | |a Packt Publishing, |e Verlag |4 pbl | |
966 | 4 | 0 | |l DE-91 |p ZDB-30-ORH |q TUM_PDA_ORH |u https://learning.oreilly.com/library/view/-/9781836642930/?ar |m X:ORHE |x Aggregator |z lizenzpflichtig |3 Volltext |
912 | |a ZDB-30-ORH | ||
935 | |c vide | ||
951 | |a BO | ||
912 | |a ZDB-30-ORH | ||
049 | |a DE-91 |
Record in the search index
DE-BY-TUM_katkey | ZDB-30-ORH-109654609 |
---|---|
_version_ | 1821494925578993664 |
adam_text | |
any_adam_object | |
author2 | Dichone, Paulo |
author2_role | ctb |
author2_variant | p d pd |
author_facet | Dichone, Paulo |
building | Verbundindex |
bvnumber | localTUM |
collection | ZDB-30-ORH |
ctrlnum | (DE-627-1)109654609 (DE-599)KEP109654609 (ORHE)9781836642930 |
dewey-full | 006.3/5 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 006 - Special computer methods |
dewey-raw | 006.3/5 |
dewey-search | 006.3/5 |
dewey-sort | 16.3 15 |
dewey-tens | 000 - Computer science, information, general works |
discipline | Informatik |
edition | [First edition]. |
format | Electronic Video |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>03182ngm a22005412 4500</leader><controlfield tag="001">ZDB-30-ORH-109654609</controlfield><controlfield tag="003">DE-627-1</controlfield><controlfield tag="005">20241107103332.0</controlfield><controlfield tag="006">m o | | </controlfield><controlfield tag="007">cr uuu---uuuuu</controlfield><controlfield tag="008">241107s2024 xx ||| |o o ||eng c</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781836642930</subfield><subfield code="c">electronic video</subfield><subfield code="9">978-1-83664-293-0</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">1836642938</subfield><subfield code="c">electronic video</subfield><subfield code="9">1-83664-293-8</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627-1)109654609</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)KEP109654609</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(ORHE)9781836642930</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627-1)109654609</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">006.3/5</subfield><subfield code="2">23/eng/20241009</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Preprocessing unstructured data for LLMs and RAG systems</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">[First edition].</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">[Birmingham, United Kingdom]</subfield><subfield 
code="b">Packt Publishing</subfield><subfield code="c">[2024]</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">1 Online-Ressource (1 video file (3 hr., 2 min.))</subfield><subfield code="b">sound, color.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">zweidimensionales bewegtes Bild</subfield><subfield code="b">tdi</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">Computermedien</subfield><subfield code="b">c</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Online-Ressource</subfield><subfield code="b">cr</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Online resource; title from title details screen (O'Reilly, viewed October 9, 2024)</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">This course offers an in-depth exploration of preprocessing unstructured data for large language models and retrieval-augmented generation systems. You'll start by setting up your development environment and configuring essential APIs, ensuring a solid technical foundation. Next, you'll dive into data preprocessing techniques, tackling challenges like content extraction, cleaning, and data normalization, making your data ready for advanced AI models. As you progress, the course provides hands-on experience with various document types such as PDFs, HTML, and PPTX files. You'll learn to transform these unstructured formats into structured data that AI systems can easily process. Advanced modules cover chunking, metadata extraction, and handling complex documents using cutting-edge techniques like visual transformers and document layout detectors. The final section guides you in building a complete RAG system using the skills acquired throughout the course. 
You'll preprocess diverse documents, implement semantic similarity searches, and save elements to a vector database. By the end, you'll be equipped to create intelligent data pipelines and interact with your documents using AI, significantly enhancing your data-driven projects.</subfield></datafield><datafield tag="650" ind1=" " ind2="0"><subfield code="a">Natural language processing (Computer science)</subfield></datafield><datafield tag="650" ind1=" " ind2="0"><subfield code="a">Artificial intelligence</subfield></datafield><datafield tag="650" ind1=" " ind2="0"><subfield code="a">Machine learning</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Traitement automatique des langues naturelles</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Intelligence artificielle</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Apprentissage automatique</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">artificial intelligence</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Instructional films</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Nonfiction films</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Internet videos</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Films de formation</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Films autres que de fiction</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Vidéos sur Internet</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Dichone, Paulo</subfield><subfield code="e">MitwirkendeR</subfield><subfield code="4">ctb</subfield></datafield><datafield tag="710" ind1="2" ind2=" "><subfield code="a">Packt Publishing,</subfield><subfield code="e">Verlag</subfield><subfield 
code="4">pbl</subfield></datafield><datafield tag="966" ind1="4" ind2="0"><subfield code="l">DE-91</subfield><subfield code="p">ZDB-30-ORH</subfield><subfield code="q">TUM_PDA_ORH</subfield><subfield code="u">https://learning.oreilly.com/library/view/-/9781836642930/?ar</subfield><subfield code="m">X:ORHE</subfield><subfield code="x">Aggregator</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">ZDB-30-ORH</subfield></datafield><datafield tag="935" ind1=" " ind2=" "><subfield code="c">vide</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">BO</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">ZDB-30-ORH</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-91</subfield></datafield></record></collection> |
id | ZDB-30-ORH-109654609 |
illustrated | Not Illustrated |
indexdate | 2025-01-17T11:22:07Z |
institution | BVB |
isbn | 9781836642930 1836642938 |
language | English |
open_access_boolean | |
owner | DE-91 DE-BY-TUM |
owner_facet | DE-91 DE-BY-TUM |
physical | 1 Online-Ressource (1 video file (3 hr., 2 min.)) sound, color. |
psigel | ZDB-30-ORH TUM_PDA_ORH ZDB-30-ORH |
publishDate | 2024 |
publishDateSearch | 2024 |
publishDateSort | 2024 |
publisher | Packt Publishing |
record_format | marc |
spelling | Preprocessing unstructured data for LLMs and RAG systems [First edition]. [Birmingham, United Kingdom] Packt Publishing [2024] 1 Online-Ressource (1 video file (3 hr., 2 min.)) sound, color. zweidimensionales bewegtes Bild tdi rdacontent Computermedien c rdamedia Online-Ressource cr rdacarrier Online resource; title from title details screen (O'Reilly, viewed October 9, 2024) This course offers an in-depth exploration of preprocessing unstructured data for large language models and retrieval-augmented generation systems. You'll start by setting up your development environment and configuring essential APIs, ensuring a solid technical foundation. Next, you'll dive into data preprocessing techniques, tackling challenges like content extraction, cleaning, and data normalization, making your data ready for advanced AI models. As you progress, the course provides hands-on experience with various document types such as PDFs, HTML, and PPTX files. You'll learn to transform these unstructured formats into structured data that AI systems can easily process. Advanced modules cover chunking, metadata extraction, and handling complex documents using cutting-edge techniques like visual transformers and document layout detectors. The final section guides you in building a complete RAG system using the skills acquired throughout the course. You'll preprocess diverse documents, implement semantic similarity searches, and save elements to a vector database. By the end, you'll be equipped to create intelligent data pipelines and interact with your documents using AI, significantly enhancing your data-driven projects. 
Natural language processing (Computer science) Artificial intelligence Machine learning Traitement automatique des langues naturelles Intelligence artificielle Apprentissage automatique artificial intelligence Instructional films Nonfiction films Internet videos Films de formation Films autres que de fiction Vidéos sur Internet Dichone, Paulo MitwirkendeR ctb Packt Publishing, Verlag pbl |
spellingShingle | Preprocessing unstructured data for LLMs and RAG systems Natural language processing (Computer science) Artificial intelligence Machine learning Traitement automatique des langues naturelles Intelligence artificielle Apprentissage automatique artificial intelligence Instructional films Nonfiction films Internet videos Films de formation Films autres que de fiction Vidéos sur Internet |
title | Preprocessing unstructured data for LLMs and RAG systems |
title_auth | Preprocessing unstructured data for LLMs and RAG systems |
title_exact_search | Preprocessing unstructured data for LLMs and RAG systems |
title_full | Preprocessing unstructured data for LLMs and RAG systems |
title_fullStr | Preprocessing unstructured data for LLMs and RAG systems |
title_full_unstemmed | Preprocessing unstructured data for LLMs and RAG systems |
title_short | Preprocessing unstructured data for LLMs and RAG systems |
title_sort | preprocessing unstructured data for llms and rag systems |
topic | Natural language processing (Computer science) Artificial intelligence Machine learning Traitement automatique des langues naturelles Intelligence artificielle Apprentissage automatique artificial intelligence Instructional films Nonfiction films Internet videos Films de formation Films autres que de fiction Vidéos sur Internet |
topic_facet | Natural language processing (Computer science) Artificial intelligence Machine learning Traitement automatique des langues naturelles Intelligence artificielle Apprentissage automatique artificial intelligence Instructional films Nonfiction films Internet videos Films de formation Films autres que de fiction Vidéos sur Internet |
work_keys_str_mv | AT dichonepaulo preprocessingunstructureddataforllmsandragsystems AT packtpublishing preprocessingunstructureddataforllmsandragsystems |