LinkedIn's Vision
Create economic opportunity for every member of the global workforce
Find work
Realize your dream job
Be great at what you do
Overview
Infrastructure scaling
Developer productivity scaling
Result quality scaling
LinkedIn: 100s of Millions
Lucene
Galene (Lucene based)
Galene (Custom)
[Figure: two example documents, numbered 1 and 2, containing the terms "BLAH", "Jeff", and "Reid"]
Inverted Index (with Posting Lists)
Forward Index
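The two structures named above can be sketched in a few lines. This is a minimal illustration, not Galene's implementation: the function names `build_indexes` and `search_and` are hypothetical, and the two document texts are made up from the terms in the figure.

```python
from collections import defaultdict

def build_indexes(docs):
    """Build an inverted index and a forward index from {doc_id: text}."""
    inverted = defaultdict(list)  # term -> sorted posting list of doc ids
    forward = {}                  # doc id -> terms stored for that document
    for doc_id in sorted(docs):
        terms = docs[doc_id].lower().split()
        forward[doc_id] = terms
        for term in sorted(set(terms)):
            inverted[term].append(doc_id)
    return dict(inverted), forward

def search_and(inverted, *terms):
    """Intersect the posting lists of all query terms (AND semantics)."""
    postings = [set(inverted.get(t.lower(), [])) for t in terms]
    return sorted(set.intersection(*postings)) if postings else []

# Hypothetical document contents, for illustration only.
docs = {1: "BLAH BLAH Jeff Reid", 2: "Reid BLAH BLAH"}
inverted, forward = build_indexes(docs)
```

The inverted index answers "which documents contain this term?", while the forward index answers "what is stored for this document?", which is what a scorer needs once a candidate has been retrieved.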
[Figure: Live Update, Snapshot, Base Index]
[Figure: In-Memory Live Updates, with labels B1, S1, S2, L1, L2, L3]
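One way to picture how a base index, snapshots, and in-memory live updates could be combined at read time is sketched here. The slides do not spell out the merge rules, so this assumes that the newest tier wins; the class `TieredIndex` and its methods are hypothetical names.

```python
class TieredIndex:
    """Toy three-tier index: base, snapshots, live updates (assumed semantics)."""

    def __init__(self, base):
        self.base = dict(base)  # built offline, treated as read-only here
        self.snapshots = []     # flushed live updates, oldest first
        self.live = {}          # in-memory live updates

    def update(self, doc_id, doc):
        """Record a live update in memory."""
        self.live[doc_id] = doc

    def flush_snapshot(self):
        """Freeze the current live updates into a new snapshot."""
        if self.live:
            self.snapshots.append(self.live)
            self.live = {}

    def get(self, doc_id):
        """Look up a document, checking the newest tier first (assumption)."""
        for tier in [self.live, *reversed(self.snapshots), self.base]:
            if doc_id in tier:
                return tier[doc_id]
        return None

idx = TieredIndex({1: "a", 2: "b"})
idx.update(2, "b2")      # supersedes the base copy of document 2
idx.flush_snapshot()     # live updates become a snapshot
idx.update(3, "c")       # a new live update after the snapshot
```

Checking tiers from newest to oldest keeps the base index untouched between rebuilds, at the cost of consulting several tiers per lookup.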
Going Forward
Very efficient custom index in C++
Base index build can be run in a distributed manner
BSL (base, snapshot, live update) supported at a more fundamental level
Faceting
Types of facets supported:
discoverable (e.g. current company)
static values (e.g. network)
supplied values (e.g. my groups)
Legacy stack had no early termination, which allowed exact facet counting (at a cost)
Current Galene stack applies heuristics to determine counts in an approximate manner
Going forward, custom posting list format will encode facet details for more efficient facet count estimation
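To make the exact-versus-approximate trade-off concrete, here is one plausible heuristic: count facets over only the first `limit` matches and scale up. This is an illustration, not necessarily what Galene does. The function name `estimate_facet_counts` is hypothetical, and the sketch assumes the total match count is known or separately estimated.

```python
from collections import Counter

def estimate_facet_counts(matches, facet_of, limit):
    """Count facets over the first `limit` matches, then extrapolate.

    matches:  list of matching doc ids (total size assumed known)
    facet_of: doc id -> facet value (e.g. current company)
    limit:    number of matches scanned before terminating early
    """
    scanned = matches[:limit]
    if not scanned:
        return {}
    sample = Counter(facet_of[d] for d in scanned)
    scale = len(matches) / len(scanned)  # extrapolation factor
    return {facet: round(n * scale) for facet, n in sample.items()}

# Hypothetical data: 100 matches split evenly between two companies.
matches = list(range(100))
facet_of = {d: "A" if d % 2 == 0 else "B" for d in matches}
approx = estimate_facet_counts(matches, facet_of, limit=10)
exact = estimate_facet_counts(matches, facet_of, limit=len(matches))
```

With `limit` equal to the number of matches the result is exact, which corresponds to the behaviour without early termination; smaller limits trade accuracy for less work.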
Relevance framework
Infrastructure to support common scoring needs
Provides framework to evaluate relevance changes
Enables rapid iterations over relevance experiments
Allows relevance engineers to focus on building features
[Figure: a Query and Rewriter State pass through a chain of Rewriter Modules, each with its own Data Model, producing a Rewritten Query and Top Results]
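The chain of rewriter modules in the figure can be sketched as a list of functions applied in order to a shared state, each consulting its own data model. All names here (`RewriterState`, `lowercase_module`, `synonym_module`, `rewrite`) and the synonym data are hypothetical, chosen only to illustrate the pattern.

```python
from dataclasses import dataclass, field

@dataclass
class RewriterState:
    """State passed from one rewriter module to the next (hypothetical)."""
    query: str
    annotations: dict = field(default_factory=dict)

def lowercase_module(state, model):
    """Normalize the query; this module needs no data model."""
    state.query = state.query.lower()
    return state

def synonym_module(state, model):
    """Expand terms using a data model mapping term -> list of synonyms."""
    terms = []
    for t in state.query.split():
        alts = model.get(t, [])
        if alts:
            state.annotations.setdefault("expanded", []).append(t)
            terms.append("(" + " OR ".join([t, *alts]) + ")")
        else:
            terms.append(t)
    state.query = " ".join(terms)
    return state

def rewrite(query, pipeline):
    """Run each (module, data model) pair in order over the shared state."""
    state = RewriterState(query)
    for module, model in pipeline:
        state = module(state, model)
    return state

pipeline = [(lowercase_module, {}), (synonym_module, {"engineer": ["developer"]})]
result = rewrite("Software Engineer", pipeline)
```

Keeping each module small and independent is what lets relevance engineers add or reorder rewriting steps without touching the rest of the pipeline.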
[Figure: the Rewritten Query is used to Retrieve a Document from the INDEX and Score the Document, producing Top Results From Shard]
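The retrieve-then-score loop in the figure can be sketched with a bounded min-heap that keeps only the best `k` documents seen so far. The function `top_results_from_shard`, the document fields, and the scoring function are assumptions made for illustration.

```python
import heapq

def top_results_from_shard(candidates, forward_index, score, k):
    """Retrieve each candidate, score it, and keep the k best.

    candidates:    doc ids retrieved from the index for the rewritten query
    forward_index: doc id -> stored document fields
    score:         function mapping document fields to a number
    """
    heap = []  # min-heap of (score, doc_id); the weakest kept result is heap[0]
    for doc_id in candidates:
        entry = (score(forward_index[doc_id]), doc_id)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif entry > heap[0]:
            heapq.heapreplace(heap, entry)  # evict the weakest result
    return [(doc_id, s) for s, doc_id in sorted(heap, reverse=True)]

# Hypothetical documents and scoring function.
forward_index = {
    1: {"connections": 5},
    2: {"connections": 50},
    3: {"connections": 20},
}
top = top_results_from_shard([1, 2, 3], forward_index,
                             lambda doc: doc["connections"], k=2)
```

Each shard only needs to return its own top results, so the memory used per query stays proportional to `k` rather than to the number of matching documents.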
Summary
Infrastructure scaling
Developer productivity scaling
Result quality scaling