Princeton University Library Catalog
How Criteo optimized and sped up its TensorFlow models by 10x and served them under 5 ms / Nicolas Kowalski.
Author
Kowalski, Nicolas
Format
Video/Projected medium
Language
English
Edition
1st edition.
Published/Created
O'Reilly Media, Incorporated, 2020.
Description
1 online resource.
Details
Subject(s)
Artificial intelligence
Machine learning
Author
Antoniotti, Axel
Related name
Safari, an O'Reilly Media Company
Library of Congress genre(s)
Video recordings
Series
Safari Books Online (Series)
Summary note
When you access a web page, bidders such as Criteo must decide within a few dozen milliseconds whether to purchase the advertising space on the page. At that moment, a real-time auction takes place, and once all communication delays are removed, only a handful of milliseconds remain to compute exactly how much to bid. In the past year, Criteo has put a large amount of effort into reshaping its in-house machine learning stack responsible for making such predictions, in particular opening it to new technologies such as TensorFlow. Unfortunately, even for simple logistic regression models and small neural networks, Criteo's initial TensorFlow implementations saw inference time increase by a factor of 100, going from 300 microseconds to 30 milliseconds. Nicolas Kowalski and Axel Antoniotti outline how Criteo approached this issue, discussing how Criteo profiled its model to understand its bottleneck; why commonly shared solutions such as optimizing the TensorFlow build for the target hardware, freezing and cleaning up the model, and using accelerated linear algebra (XLA) ended up being lackluster; and how Criteo rewrote its models from scratch, reimplementing cross-features and hashing functions using low-level TF operations in order to factorize as many of the TensorFlow nodes in its model as possible.
Prerequisite knowledge: A basic understanding of how TensorFlow and TensorFlow Serving work; experience optimizing TensorFlow models for serving (useful but not required).
What you'll learn: How to optimize a TensorFlow model before serving it online; how to profile a TensorFlow model with a complex preprocessing architecture; how and when to replace feature columns with custom cross-features and hashing functions to factorize and drastically reduce the number of nodes in the model.
This session is from the 2019 O'Reilly TensorFlow World Conference in Santa Clara, CA.
Copyright note
Copyright © O'Reilly Media, Incorporated.
Issuing body
Made available through: Safari, an O'Reilly Media Company.
Source of description
Online resource; Title from title screen (viewed February 28, 2020)
Participant(s)/Performer(s)
Presenter, Nicolas Kowalski; Axel Antoniotti.
OCLC
1143018298
Other standard number
0636920372547
Statement on responsible collection description
Princeton University Library aims to describe library materials in a manner that is respectful to the individuals and communities who create, use, and are represented in the collections we manage.
Supplementary Information
Other versions
How Criteo optimized and sped up its TensorFlow models by 10x and served them under 5 ms / Nicolas Kowalski, Axel Antoniotti.
id
99130929495406421