Transformers for Natural Language Processing and Computer Vision: Explore Generative AI and Large Language Models with Hugging Face, ChatGPT, GPT-4V, and DALL-E 3
Saved in:

Author: | Rothman, Denis |
---|---|
Format: | Electronic eBook |
Language: | English |
Published: | Birmingham : Packt Publishing, Limited, 2024 |
Edition: | 3rd ed |
Subjects: | ChatGPT ; Artificial intelligence-Data processing ; Cloud computing |
Links: | https://portal.igpublish.com/iglibrary/search/PACKT0007072.html ; https://ebookcentral.proquest.com/lib/hwr/detail.action?docID=31196765 |
Physical description: | 1 online resource (xxxii, 693 pages) |
ISBN: | 9781805123743 |
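The ISBN-13 above is internally consistent: its last digit is the standard ISBN-13 check digit (weights alternate 1 and 3 over the first twelve digits, and the check digit is the value that brings the weighted sum to a multiple of 10). A minimal Python sketch, using the ISBN from this record; the function name is illustrative, not part of any library:

```python
def isbn13_check_digit(first12: str) -> int:
    """Compute the ISBN-13 check digit for the first 12 digits."""
    # Weights alternate 1, 3, 1, 3, ... across the first 12 digits.
    total = sum((1 if i % 2 == 0 else 3) * int(d) for i, d in enumerate(first12))
    return (10 - total % 10) % 10

isbn = "9781805123743"  # ISBN from this record (hyphenated as 978-1-80512-374-3 in the 020 field below)
assert isbn13_check_digit(isbn[:12]) == int(isbn[-1])  # check digit is 3, so the ISBN validates
```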
Internal format: MARC
LEADER 00000nam a2200000zc 4500
001 BV049873815
003 DE-604
005 20250214
007 cr|uuu---uuuuu
008 240919s2024 xx o|||| 00||| eng d
020 |a 9781805123743 |9 978-1-80512-374-3
035 |a (ZDB-30-PQE)EBC31196765
035 |a (ZDB-30-PAD)EBC31196765
035 |a (ZDB-89-EBL)EBL31196765
035 |a (OCoLC)1456030850
035 |a (DE-599)BVBBV049873815
040 |a DE-604 |b ger |e rda
041 0 |a eng
049 |a DE-2070s |a DE-706 |a DE-91 |a DE-573
082 0 |a 658.0563
100 1 |a Rothman, Denis |e Verfasser |4 aut
245 1 0 |a Transformers for Natural Language Processing and Computer Vision |b Explore Generative AI and Large Language Models with Hugging Face, ChatGPT, GPT-4V, and DALL-E 3
250 |a 3rd ed
264 1 |a Birmingham |b Packt Publishing, Limited |c 2024
264 4 |c ©2024
300 |a 1 Online-Ressource (xxxii, 693 Seiten)
336 |b txt |2 rdacontent
337 |b c |2 rdamedia
338 |b cr |2 rdacarrier
650 4 |a ChatGPT.
650 4 |a Artificial intelligence-Data processing
650 4 |a Cloud computing
776 0 8 |i Erscheint auch als |n Druck-Ausgabe |a Rothman, Denis |t Transformers for Natural Language Processing and Computer Vision |d Birmingham : Packt Publishing, Limited,c2024 |z 9781805128724
856 4 0 |u https://portal.igpublish.com/iglibrary/search/PACKT0007072.html |x Verlag |z URL des Erstveröffentlichers |3 Volltext
912 |a ZDB-30-PQE
912 |a ZDB-221-PDA
943 1 |a oai:aleph.bib-bvb.de:BVB01-035213273
966 e |u https://portal.igpublish.com/iglibrary/search/PACKT0007072.html |l DE-573 |p ZDB-221-PDA |x Verlag |3 Volltext
966 e |u https://ebookcentral.proquest.com/lib/hwr/detail.action?docID=31196765 |l DE-2070s |p ZDB-30-PQE |q HWR_PDA_PQE |x Aggregator |3 Volltext
966 e |u https://portal.igpublish.com/iglibrary/search/PACKT0007072.html |l DE-91 |p ZDB-221-PDA |q TUM_Paketkauf_2025 |x Verlag |3 Volltext
966 e |u https://portal.igpublish.com/iglibrary/search/PACKT0007072.html |l DE-706 |p ZDB-221-PDA |x Verlag |3 Volltext
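The record above is plain MARC 21; catalogs typically also serve it as MARCXML (namespace http://www.loc.gov/MARC21/slim). A minimal, stdlib-only sketch of pulling the title, ISBN, and full-text URLs out of such a file, assuming a hypothetical local copy named record.xml with the usual collection/record wrapper; the subfields helper is illustrative, not a library function:

```python
import xml.etree.ElementTree as ET

NS = {"marc": "http://www.loc.gov/MARC21/slim"}

def subfields(record, tag, code):
    """Yield the values of subfield `code` in every datafield with the given tag."""
    for field in record.findall(f"marc:datafield[@tag='{tag}']", NS):
        for sf in field.findall(f"marc:subfield[@code='{code}']", NS):
            yield sf.text

# "record.xml" is a hypothetical MARCXML export of this record.
record = ET.parse("record.xml").getroot().find(".//marc:record", NS)

title = next(subfields(record, "245", "a"))  # Transformers for Natural Language Processing ...
isbn = next(subfields(record, "020", "a"))   # 9781805123743
urls = list(subfields(record, "856", "u"))   # full-text link(s)
print(title, isbn, urls)
```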
Record in the search index
DE-BY-TUM_katkey | 2839401 |
---|---|
_version_ | 1824079657007316992 |
adam_text | |
any_adam_object | |
author | Rothman, Denis |
author_facet | Rothman, Denis |
author_role | aut |
author_sort | Rothman, Denis |
author_variant | d r dr |
building | Verbundindex |
bvnumber | BV049873815 |
collection | ZDB-30-PQE ZDB-221-PDA |
contents | Cover -- Copyright -- Contributors -- Table of Contents -- Preface -- Chapter 1: What Are Transformers? -- How constant time complexity O(1) changed our lives forever -- O(1) attention conquers O(n) recurrent methods -- Attention layer -- Recurrent layer -- The magic of the computational time complexity of an attention layer -- Computational time complexity with a CPU -- Computational time complexity with a GPU -- Computational time complexity with a TPU -- TPU-LLM -- A brief journey from recurrent to attention -- A brief history -- From one token to an AI revolution -- From one token to everything -- Foundation Models -- From general purpose to specific tasks -- The role of AI professionals -- The future of AI professionals -- What resources should we use? -- Decision-making guidelines -- The rise of transformer seamless APIs and assistants -- Choosing ready-to-use API-driven libraries -- Choosing a cloud platform and transformer model -- Summary -- Questions -- References -- Further reading -- Chapter 2: Getting Started with the Architecture of the Transformer Model -- The rise of the Transformer: Attention Is All You Need -- The encoder stack -- Input embedding -- Positional encoding -- Sublayer 1: Multi-head attention -- Sublayer 2: Feedforward network -- The decoder stack -- Output embedding and position encoding -- The attention layers -- The FFN sublayer, the post-LN, and the linear layer -- Training and performance -- Hugging Face transformer models -- Summary -- Questions -- References -- Further reading -- Chapter 3: Emergent vs Downstream Tasks: The Unseen Depths of Transformers -- The paradigm shift: What is an NLP task? -- Inside the head of the attention sublayer of a transformer -- Exploring emergence with ChatGPT -- Investigating the potential of downstream tasks -- Evaluating models with metrics -- Accuracy score -- F1-score MCC -- Human evaluation -- Benchmark tasks and datasets -- Defining the SuperGLUE benchmark tasks -- Running downstream tasks -- The Corpus of Linguistic Acceptability (CoLA) -- Stanford Sentiment TreeBank (SST-2) -- Microsoft Research Paraphrase Corpus (MRPC) -- Winograd schemas -- Summary -- Questions -- References -- Further reading -- Chapter 4: Advancements in Translations with Google Trax, Google Translate, and Gemini -- Defining machine translation -- Human transductions and translations -- Machine transductions and translations -- Evaluating machine translations -- Preprocessing a WMT dataset -- Preprocessing the raw data -- Finalizing the preprocessing of the datasets -- Evaluating machine translations with BLEU -- Geometric evaluations -- Applying a smoothing technique -- Translations with Google Trax -- Installing Trax -- Creating the Original Transformer model -- Initializing the model using pretrained weights -- Tokenizing a sentence -- Decoding from the Transformer -- De-tokenizing and displaying the translation -- Translation with Google Translate -- Translation with a Google Translate AJAX API Wrapper -- Implementing googletrans -- Translation with Gemini -- Gemini's potential -- Summary -- Questions -- References -- Further reading -- Chapter 5: Diving into Fine-Tuning through BERT -- The architecture of BERT -- The encoder stack -- Preparing the pretraining input environment -- Pretraining and fine-tuning a BERT model -- Fine-tuning BERT -- Defining a goal -- Hardware constraints -- Installing Hugging Face Transformers -- Importing the modules -- Specifying CUDA as the device for torch -- Loading the CoLA dataset -- 
Creating sentences, label lists, and adding BERT tokens -- Activating the BERT tokenizer -- Processing the data -- Creating attention masks -- Splitting the data into training and validation sets Converting all the data into torch tensors -- Selecting a batch size and creating an iterator -- BERT model configuration -- Loading the Hugging Face BERT uncased base model -- Optimizer grouped parameters -- The hyperparameters for the training loop -- The training loop -- Training evaluation -- Predicting and evaluating using the holdout dataset -- Exploring the prediction process -- Evaluating using the Matthews correlation coefficient -- Matthews correlation coefficient evaluation for the whole dataset -- Building a Python interface to interact with the model -- Saving the model -- Creating an interface for the trained model -- Interacting with the model -- Summary -- Questions -- References -- Further reading -- Chapter 6: Pretraining a Transformer from Scratch through RoBERTa -- Training a tokenizer and pretraining a transformer -- Building KantaiBERT from scratch -- Step 1: Loading the dataset -- Step 2: Installing Hugging Face transformers -- Step 3: Training a tokenizer -- Step 4: Saving the files to disk -- Step 5: Loading the trained tokenizer files -- Step 6: Checking resource constraints: GPU and CUDA -- Step 7: Defining the configuration of the model -- Step 8: Reloading the tokenizer in transformers -- Step 9: Initializing a model from scratch -- Exploring the parameters -- Step 10: Building the dataset -- Step 11: Defining a data collator -- Step 12: Initializing the trainer -- Step 13: Pretraining the model -- Step 14: Saving the final model (+tokenizer + config) to disk -- Step 15: Language modeling with FillMaskPipeline -- Pretraining a Generative AI customer support model on X data -- Step 1: Downloading the dataset -- Step 2: Installing Hugging Face transformers -- Step 3: Loading and filtering the data -- Step 4: Checking Resource Constraints: GPU and CUDA -- Step 5: Defining the configuration of the model Step 6: Creating and processing the dataset -- Step 7: Initializing the trainer -- Step 8: Pretraining the model -- Step 9: Saving the model -- Step 10: User interface to chat with the Generative AI agent -- Further pretraining -- Limitations -- Next steps -- Summary -- Questions -- References -- Further reading -- Chapter 7: The Generative AI Revolution with ChatGPT -- GPTs as GPTs -- Improvement -- Diffusion -- New application sectors -- Self-service assistants -- Development assistants -- Pervasiveness -- The architecture of OpenAI GPT transformer models -- The rise of billion-parameter transformer models -- The increasing size of transformer models -- Context size and maximum path length -- From fine-tuning to zero-shot models -- Stacking decoder layers -- GPT models -- OpenAI models as assistants -- ChatGPT provides source code -- GitHub Copilot code assistant -- General-purpose prompt examples -- Getting started with ChatGPT - GPT-4 as an assistant -- 1. GPT-4 helps to explain how to write source code -- 2. GPT-4 creates a function to show the YouTube presentation of GPT-4 by Greg Brockman on March 14, 2023 -- 3. GPT-4 creates an application for WikiArt to display images -- 4. GPT-4 creates an application to display IMDb reviews -- 5. GPT-4 creates an application to display a newsfeed -- 6. 
GPT-4 creates a k-means clustering (KMC) algorithm -- Getting started with the GPT-4 API -- Running our first NLP task with GPT-4 -- Steps 1: Installing OpenAI and Step 2: Entering the API key -- Step 3: Running an NLP task with GPT-4 -- Key hyperparameters -- Running multiple NLP tasks -- Retrieval Augmented Generation (RAG) with GPT-4 -- Installation -- Document retrieval -- Augmented retrieval generation -- Summary -- Questions -- References -- Further reading -- Chapter 8: Fine-Tuning OpenAI GPT Models -- Risk management Fine-tuning a GPT model for completion (generative) -- 1. Preparing the dataset -- 1.1. Preparing the data in JSON -- 1.2. Converting the data to JSONL -- 2. Fine-tuning an original model -- 3. Running the fine-tuned GPT model -- 4. Managing fine-tuned jobs and models -- Before leaving -- Summary -- Questions -- References -- Further reading -- Chapter 9: Shattering the Black Box with Interpretable Tools -- Transformer visualization with BertViz -- Running BertViz -- Step 1: Installing BertViz and importing the modules -- Step 2: Load the models and retrieve attention -- Step 3: Head view -- Step 4: Processing and displaying attention heads -- Step 5: Model view -- Step 6: Displaying the output probabilities of attention heads -- Streaming the output of the attention heads -- Visualizing word relationships using attention scores with pandas -- exBERT -- Interpreting Hugging Face transformers with SHAP -- Introducing SHAP -- Explaining Hugging Face outputs with SHAP -- Transformer visualization via dictionary learning -- Transformer factors -- Introducing LIME -- The visualization interface -- Other interpretable AI tools -- LIT -- PCA -- Running LIT -- OpenAI LLMs explain neurons in transformers -- Limitations and human control -- Summary -- Questions -- References -- Further reading -- Chapter 10: Investigating the Role of Tokenizers in Shaping Transformer Models -- Matching datasets and tokenizers -- Best practices -- Step 1: Preprocessing -- Step 2: Quality control -- Step 3: Continuous human quality control -- Word2Vec tokenization -- Case 0: Words in the dataset and the dictionary -- Case 1: Words not in the dataset or the dictionary -- Case 2: Noisy relationships -- Case 3: Words in a text but not in the dictionary -- Case 4: Rare words -- Case 5: Replacing rare words Exploring sentence and WordPiece tokenizers to understand the efficiency of subword tokenizers for transformers |
ctrlnum | (ZDB-30-PQE)EBC31196765 (ZDB-30-PAD)EBC31196765 (ZDB-89-EBL)EBL31196765 (OCoLC)1456030850 (DE-599)BVBBV049873815 |
dewey-full | 658.0563 |
dewey-hundreds | 600 - Technology (Applied sciences) |
dewey-ones | 658 - General management |
dewey-raw | 658.0563 |
dewey-search | 658.0563 |
dewey-sort | 3658.0563 |
dewey-tens | 650 - Management and auxiliary services |
discipline | Wirtschaftswissenschaften |
edition | 3rd ed |
format | Electronic eBook |
id | DE-604.BV049873815 |
illustrated | Not Illustrated |
indexdate | 2025-02-14T09:02:03Z |
institution | BVB |
isbn | 9781805123743 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-035213273 |
oclc_num | 1456030850 |
open_access_boolean | |
owner | DE-2070s DE-706 DE-91 DE-BY-TUM DE-573 |
owner_facet | DE-2070s DE-706 DE-91 DE-BY-TUM DE-573 |
physical | 1 Online-Ressource (xxxii, 693 Seiten) |
psigel | ZDB-30-PQE ZDB-221-PDA ZDB-30-PQE HWR_PDA_PQE ZDB-221-PDA TUM_Paketkauf_2025 |
publishDate | 2024 |
publishDateSearch | 2024 |
publishDateSort | 2024 |
publisher | Packt Publishing, Limited |
record_format | marc |
title | Transformers for Natural Language Processing and Computer Vision Explore Generative AI and Large Language Models with Hugging Face, ChatGPT, GPT-4V, and DALL-E 3 |
title_auth | Transformers for Natural Language Processing and Computer Vision Explore Generative AI and Large Language Models with Hugging Face, ChatGPT, GPT-4V, and DALL-E 3 |
title_exact_search | Transformers for Natural Language Processing and Computer Vision Explore Generative AI and Large Language Models with Hugging Face, ChatGPT, GPT-4V, and DALL-E 3 |
title_full | Transformers for Natural Language Processing and Computer Vision Explore Generative AI and Large Language Models with Hugging Face, ChatGPT, GPT-4V, and DALL-E 3 |
title_fullStr | Transformers for Natural Language Processing and Computer Vision Explore Generative AI and Large Language Models with Hugging Face, ChatGPT, GPT-4V, and DALL-E 3 |
title_full_unstemmed | Transformers for Natural Language Processing and Computer Vision Explore Generative AI and Large Language Models with Hugging Face, ChatGPT, GPT-4V, and DALL-E 3 |
title_short | Transformers for Natural Language Processing and Computer Vision |
title_sort | transformers for natural language processing and computer vision explore generative ai and large language models with hugging face chatgpt gpt 4v and dall e 3 |
title_sub | Explore Generative AI and Large Language Models with Hugging Face, ChatGPT, GPT-4V, and DALL-E 3 |
topic | ChatGPT. Artificial intelligence-Data processing Cloud computing |
topic_facet | ChatGPT. Artificial intelligence-Data processing Cloud computing |
url | https://portal.igpublish.com/iglibrary/search/PACKT0007072.html |
work_keys_str_mv | AT rothmandenis transformersfornaturallanguageprocessingandcomputervisionexploregenerativeaiandlargelanguagemodelswithhuggingfacechatgptgpt4vanddalle3 |
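The field names in this dump (title_sort, _version_, work_keys_str_mv, the facet fields, and so on) are characteristic of a VuFind-style Solr index. A minimal sketch of fetching such a document by its id over Solr's JSON select API; the host and core name are hypothetical placeholders, not endpoints published by this catalog:

```python
import json
import urllib.parse
import urllib.request

# Hypothetical Solr endpoint; a real VuFind install exposes something similar internally.
SOLR_SELECT = "http://localhost:8983/solr/biblio/select"

params = urllib.parse.urlencode({
    "q": 'id:"DE-604.BV049873815"',                 # the id field shown in the dump above
    "fl": "id,title,author,isbn,publishDate,url",   # restrict the response to a few fields
    "wt": "json",
})

with urllib.request.urlopen(f"{SOLR_SELECT}?{params}") as resp:
    docs = json.load(resp)["response"]["docs"]

for doc in docs:
    print(doc.get("title"), doc.get("isbn"), doc.get("url"))
```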