How Criteo optimized and sped up its TensorFlow models by 10x and served them under 5 ms

Bibliographic Details
Contributors:	Kowalski, Nicolas (author), Antoniotti, Axel (author)
Corporate body:	Safari, an O'Reilly Media Company (contributor)
Format:	Electronic video
Language:	English
Published:	[Place of publication not identified] O'Reilly Media, Inc. 2020
Edition:	1st edition.
Subjects:	Machine learning; Artificial intelligence; Apprentissage automatique; Intelligence artificielle; Electronic videos
Links:	https://learning.oreilly.com/library/view/-/0636920372547/?ar
Summary:	When you access a web page, bidders such as Criteo must decide within a few dozen milliseconds whether to purchase the advertising space on the page. At that moment, a real-time auction takes place, and once all the communication delays are removed, only a handful of milliseconds remain to compute exactly how much to bid. In the past year, Criteo has put a large amount of effort into reshaping the in-house machine learning stack responsible for making such predictions, in particular opening it to new technologies such as TensorFlow. Unfortunately, even for simple logistic regression models and small neural networks, Criteo's initial TensorFlow implementations saw inference time increase by a factor of 100, from 300 microseconds to 30 milliseconds. Nicolas Kowalski and Axel Antoniotti outline how Criteo approached this issue, discussing how Criteo profiled its model to understand its bottleneck; why commonly shared solutions such as optimizing the TensorFlow build for the target hardware, freezing and cleaning up the model, and using accelerated linear algebra (XLA) ended up being lackluster; and how Criteo rewrote its models from scratch, reimplementing cross-features and hashing functions using low-level TF operations in order to factorize as many of the TensorFlow nodes in its model as possible. Prerequisite knowledge: a basic understanding of how TensorFlow and TensorFlow Serving work; experience optimizing TensorFlow models for serving (useful but not required). What you'll learn: understand how to optimize a TensorFlow model before serving it online; discover how to profile a TensorFlow model with a complex preprocessing architecture; learn how and when to replace feature columns with custom cross-features and hashing functions to factorize and drastically reduce the number of nodes in the model. This session is from the 2019 O'Reilly TensorFlow World Conference in Santa Clara, CA.
Description:	Online resource; title from title screen (viewed February 28, 2020)
Extent:	1 online resource (1 video file, approximately 38 min.)
Format:	Mode of access: World Wide Web.

Internal format

MARC


LEADER	00000cgm a22000002c 4500
001	ZDB-30-ORH-050573659
003	DE-627-1
005	20240228121008.0
006	m o \| \|
007	cr uuu---uuuuu
008	200324s2020 xx \|\|\| \|o o \|\|eng c
035			\|a (DE-627-1)050573659
035			\|a (DE-599)KEP050573659
035			\|a (ORHE)0636920372547
035			\|a (DE-627-1)050573659
040			\|a DE-627 \|b ger \|c DE-627 \|e rda
041			\|a eng
082	0		\|a E VIDEO
100	1		\|a Kowalski, Nicolas \|e VerfasserIn \|4 aut
245	1	0	\|a How Criteo optimized and sped up its TensorFlow models by 10x and served them under 5 ms \|c Kowalski, Nicolas
250			\|a 1st edition.
264		1	\|a [Erscheinungsort nicht ermittelbar] \|b O'Reilly Media, Inc. \|c 2020
264		2	\|a Boston, MA \|b Safari.
300			\|a 1 Online-Ressource (1 video file, approximately 38 min.)
336			\|a zweidimensionales bewegtes Bild \|b tdi \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Online resource; Title from title screen (viewed February 28, 2020)
520			\|a When you access a web page, bidders such as Criteo must determine in a few dozens of milliseconds if they want to purchase the advertising space on the page. At that moment, a real-time auction takes place, and once you remove all the communication exchange delays, it leaves a handful of milliseconds to compute exactly how much to bid. In the past year, Criteo has put a large amount of effort into reshaping its in-house machine learning stack responsible for making such predictions-in particular, opening it to new technologies such as TensorFlow. Unfortunately, even for simple logistic regression models and small neural networks, Criteo's initial TensorFlow implementations saw inference time increase by 100, going from 300 microseconds to 30 milliseconds. Nicolas Kowalski and Axel Antoniotti outline how Criteo approached this issue, discussing how Criteo profiled its model to understand its bottleneck; why commonly shared solutions such as optimizing TensorFlow build for the target hardware, freezing and cleaning up the model, and using accelerated linear algebra (XLA) ended up being lackluster; and how Criteo rewrote is models from scratch, reimplementing cross-features and hashing functions using low-level TF operations in order to factorize as much as possible all TensorFlow nodes in its model. Prerequisite knowledge A basic understanding of how TensorFlow and TensorFlow Serving work Experience optimizing TensorFlow models for serving (useful but not required) What you'll learn Understand how to optimize a TensorFlow model before serving it online Discover how to profile a TensorFlow model with a complex preprocessing architecture Learn how and when to replace feature columns with custom cross-features and hashing functions to factorize and drastically reduce the number of nodes in the model This session is from the 2019 O'Reilly TensorFlow World Conference in Santa Clara, CA.
538			\|a Mode of access: World Wide Web.
650		0	\|a Machine learning
650		0	\|a Artificial intelligence
650		4	\|a Apprentissage automatique
650		4	\|a Intelligence artificielle
650		4	\|a artificial intelligence
650		4	\|a Electronic videos
700	1		\|a Antoniotti, Axel \|e VerfasserIn \|4 aut
710	2		\|a Safari, an O'Reilly Media Company. \|e MitwirkendeR \|4 ctb
966	4	0	\|l DE-91 \|p ZDB-30-ORH \|q TUM_PDA_ORH \|u https://learning.oreilly.com/library/view/-/0636920372547/?ar \|m X:ORHE \|x Aggregator \|z lizenzpflichtig \|3 Volltext
912			\|a ZDB-30-ORH
935			\|c vide
951			\|a BO
912			\|a ZDB-30-ORH
049			\|a DE-91

Record in the search index

DE-BY-TUM_katkey	ZDB-30-ORH-050573659
_version_	1835903263868715008
adam_text
any_adam_object
author	Kowalski, Nicolas Antoniotti, Axel
author_corporate	Safari, an O'Reilly Media Company
author_corporate_role	ctb
author_facet	Kowalski, Nicolas Antoniotti, Axel Safari, an O'Reilly Media Company
author_role	aut aut
author_sort	Kowalski, Nicolas
author_variant	n k nk a a aa
building	Verbundindex
bvnumber	localTUM
collection	ZDB-30-ORH
ctrlnum	(DE-627-1)050573659 (DE-599)KEP050573659 (ORHE)0636920372547
dewey-raw	E VIDEO
dewey-search	E VIDEO
edition	1st edition.
format	Electronic Video
fullrecord	<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>03586cgm a22004692c 4500</leader><controlfield tag="001">ZDB-30-ORH-050573659</controlfield><controlfield tag="003">DE-627-1</controlfield><controlfield tag="005">20240228121008.0</controlfield><controlfield tag="006">m o \| \| </controlfield><controlfield tag="007">cr uuu---uuuuu</controlfield><controlfield tag="008">200324s2020 xx \|\|\| \|o o \|\|eng c</controlfield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627-1)050573659</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)KEP050573659</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(ORHE)0636920372547</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627-1)050573659</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">E VIDEO</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Kowalski, Nicolas</subfield><subfield code="e">VerfasserIn</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">How Criteo optimized and sped up its TensorFlow models by 10x and served them under 5 ms</subfield><subfield code="c">Kowalski, Nicolas</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">1st edition.</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">[Erscheinungsort nicht ermittelbar]</subfield><subfield code="b">O'Reilly Media, Inc.</subfield><subfield code="c">2020</subfield></datafield><datafield tag="264" ind1=" " ind2="2"><subfield code="a">Boston, 
MA</subfield><subfield code="b">Safari.</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">1 Online-Ressource (1 video file, approximately 38 min.)</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">zweidimensionales bewegtes Bild</subfield><subfield code="b">tdi</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">Computermedien</subfield><subfield code="b">c</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Online-Ressource</subfield><subfield code="b">cr</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Online resource; Title from title screen (viewed February 28, 2020)</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">When you access a web page, bidders such as Criteo must determine in a few dozens of milliseconds if they want to purchase the advertising space on the page. At that moment, a real-time auction takes place, and once you remove all the communication exchange delays, it leaves a handful of milliseconds to compute exactly how much to bid. In the past year, Criteo has put a large amount of effort into reshaping its in-house machine learning stack responsible for making such predictions-in particular, opening it to new technologies such as TensorFlow. Unfortunately, even for simple logistic regression models and small neural networks, Criteo's initial TensorFlow implementations saw inference time increase by 100, going from 300 microseconds to 30 milliseconds. 
Nicolas Kowalski and Axel Antoniotti outline how Criteo approached this issue, discussing how Criteo profiled its model to understand its bottleneck; why commonly shared solutions such as optimizing TensorFlow build for the target hardware, freezing and cleaning up the model, and using accelerated linear algebra (XLA) ended up being lackluster; and how Criteo rewrote is models from scratch, reimplementing cross-features and hashing functions using low-level TF operations in order to factorize as much as possible all TensorFlow nodes in its model. Prerequisite knowledge A basic understanding of how TensorFlow and TensorFlow Serving work Experience optimizing TensorFlow models for serving (useful but not required) What you'll learn Understand how to optimize a TensorFlow model before serving it online Discover how to profile a TensorFlow model with a complex preprocessing architecture Learn how and when to replace feature columns with custom cross-features and hashing functions to factorize and drastically reduce the number of nodes in the model This session is from the 2019 O'Reilly TensorFlow World Conference in Santa Clara, CA.</subfield></datafield><datafield tag="538" ind1=" " ind2=" "><subfield code="a">Mode of access: World Wide Web.</subfield></datafield><datafield tag="650" ind1=" " ind2="0"><subfield code="a">Machine learning</subfield></datafield><datafield tag="650" ind1=" " ind2="0"><subfield code="a">Artificial intelligence</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Apprentissage automatique</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Intelligence artificielle</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">artificial intelligence</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Electronic videos</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Antoniotti, Axel</subfield><subfield 
code="e">VerfasserIn</subfield><subfield code="4">aut</subfield></datafield><datafield tag="710" ind1="2" ind2=" "><subfield code="a">Safari, an O'Reilly Media Company.</subfield><subfield code="e">MitwirkendeR</subfield><subfield code="4">ctb</subfield></datafield><datafield tag="966" ind1="4" ind2="0"><subfield code="l">DE-91</subfield><subfield code="p">ZDB-30-ORH</subfield><subfield code="q">TUM_PDA_ORH</subfield><subfield code="u">https://learning.oreilly.com/library/view/-/0636920372547/?ar</subfield><subfield code="m">X:ORHE</subfield><subfield code="x">Aggregator</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">ZDB-30-ORH</subfield></datafield><datafield tag="935" ind1=" " ind2=" "><subfield code="c">vide</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">BO</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">ZDB-30-ORH</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-91</subfield></datafield></record></collection>
id	ZDB-30-ORH-050573659
illustrated	Not Illustrated
indexdate	2025-06-25T12:16:30Z
institution	BVB
language	English
open_access_boolean
owner	DE-91 DE-BY-TUM
owner_facet	DE-91 DE-BY-TUM
physical	1 Online-Ressource (1 video file, approximately 38 min.)
psigel	ZDB-30-ORH TUM_PDA_ORH ZDB-30-ORH
publishDate	2020
publishDateSearch	2020
publishDateSort	2020
publisher	O'Reilly Media, Inc.
record_format	marc
spelling	Kowalski, Nicolas VerfasserIn aut How Criteo optimized and sped up its TensorFlow models by 10x and served them under 5 ms Kowalski, Nicolas 1st edition. [Erscheinungsort nicht ermittelbar] O'Reilly Media, Inc. 2020 Boston, MA Safari. 1 Online-Ressource (1 video file, approximately 38 min.) zweidimensionales bewegtes Bild tdi rdacontent Computermedien c rdamedia Online-Ressource cr rdacarrier Online resource; Title from title screen (viewed February 28, 2020) When you access a web page, bidders such as Criteo must determine in a few dozens of milliseconds if they want to purchase the advertising space on the page. At that moment, a real-time auction takes place, and once you remove all the communication exchange delays, it leaves a handful of milliseconds to compute exactly how much to bid. In the past year, Criteo has put a large amount of effort into reshaping its in-house machine learning stack responsible for making such predictions-in particular, opening it to new technologies such as TensorFlow. Unfortunately, even for simple logistic regression models and small neural networks, Criteo's initial TensorFlow implementations saw inference time increase by 100, going from 300 microseconds to 30 milliseconds. Nicolas Kowalski and Axel Antoniotti outline how Criteo approached this issue, discussing how Criteo profiled its model to understand its bottleneck; why commonly shared solutions such as optimizing TensorFlow build for the target hardware, freezing and cleaning up the model, and using accelerated linear algebra (XLA) ended up being lackluster; and how Criteo rewrote is models from scratch, reimplementing cross-features and hashing functions using low-level TF operations in order to factorize as much as possible all TensorFlow nodes in its model. 
Prerequisite knowledge A basic understanding of how TensorFlow and TensorFlow Serving work Experience optimizing TensorFlow models for serving (useful but not required) What you'll learn Understand how to optimize a TensorFlow model before serving it online Discover how to profile a TensorFlow model with a complex preprocessing architecture Learn how and when to replace feature columns with custom cross-features and hashing functions to factorize and drastically reduce the number of nodes in the model This session is from the 2019 O'Reilly TensorFlow World Conference in Santa Clara, CA. Mode of access: World Wide Web. Machine learning Artificial intelligence Apprentissage automatique Intelligence artificielle artificial intelligence Electronic videos Antoniotti, Axel VerfasserIn aut Safari, an O'Reilly Media Company. MitwirkendeR ctb
spellingShingle	Kowalski, Nicolas Antoniotti, Axel How Criteo optimized and sped up its TensorFlow models by 10x and served them under 5 ms Machine learning Artificial intelligence Apprentissage automatique Intelligence artificielle artificial intelligence Electronic videos
title	How Criteo optimized and sped up its TensorFlow models by 10x and served them under 5 ms
title_auth	How Criteo optimized and sped up its TensorFlow models by 10x and served them under 5 ms
title_exact_search	How Criteo optimized and sped up its TensorFlow models by 10x and served them under 5 ms
title_full	How Criteo optimized and sped up its TensorFlow models by 10x and served them under 5 ms Kowalski, Nicolas
title_fullStr	How Criteo optimized and sped up its TensorFlow models by 10x and served them under 5 ms Kowalski, Nicolas
title_full_unstemmed	How Criteo optimized and sped up its TensorFlow models by 10x and served them under 5 ms Kowalski, Nicolas
title_short	How Criteo optimized and sped up its TensorFlow models by 10x and served them under 5 ms
title_sort	how criteo optimized and sped up its tensorflow models by 10x and served them under 5 ms
topic	Machine learning Artificial intelligence Apprentissage automatique Intelligence artificielle artificial intelligence Electronic videos
topic_facet	Machine learning Artificial intelligence Apprentissage automatique Intelligence artificielle artificial intelligence Electronic videos
work_keys_str_mv	AT kowalskinicolas howcriteooptimizedandspedupitstensorflowmodelsby10xandservedthemunder5ms AT antoniottiaxel howcriteooptimizedandspedupitstensorflowmodelsby10xandservedthemunder5ms AT safarianoreillymediacompany howcriteooptimizedandspedupitstensorflowmodelsby10xandservedthemunder5ms
