Long Tails and Navigation
Networked Life
CIS 112
Spring 2008
Prof. Michael Kearns
One More (Structural) Property…
A properly tuned α-model can simultaneously explain
small diameter
high clustering coefficient
other models can, too (e.g. cycle+random rewirings)
But what about connectors and heavy-tailed degree distributions?
α-model and simple variants will not explain this
intuitively, no bias towards large degree evolves
all vertices are created equal
Can concoct many bad generative models to explain
generate NW according to Erdos-Renyi, reject if tails not heavy
describe fixed NWs with heavy tails
all connected to v1; N/2 connected to v2; etc.
not clear we can get a precise power law
not modeling variation
why would the world evolve this way?
As always, we want a natural model
Quantifying Connectors: Heavy-Tailed Distributions
Heavy-tailed Distributions
Pareto or power law distributions:
for random variables assuming integer values > 0
probability of value x ~ 1/x^a
typically 0 < a < 2; smaller a gives heavier tail
here are some examples
sometimes also referred to as being scale-free
For binomial, normal, and Poisson distributions the tail probabilities approach 0 exponentially fast
Inverse polynomial decay vs. inverse exponential decay
What kind of phenomena does this distribution model?
What kind of process would generate it?
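The "inverse polynomial vs. inverse exponential decay" contrast can be checked numerically. A small Python sketch; the support cutoff N, exponent a, and threshold t = 100 are illustrative choices, not from the lecture:

```python
# Inverse polynomial vs. inverse exponential tail decay, checked numerically.
N = 100_000
a = 1.5                                      # power law: Pr[X = x] ~ 1/x^a
power = [x ** (-a) for x in range(1, N + 1)]
geom = [0.5 ** x for x in range(1, N + 1)]   # a stand-in with exponential tails

Zp, Zg = sum(power), sum(geom)

def tail(masses, Z, t):
    """Pr[X > t] for the distribution with the given (unnormalized) masses."""
    return sum(masses[t:]) / Z

# Polynomial decay leaves real mass far out in the tail; exponential does not
print(tail(power, Zp, 100))
print(tail(geom, Zg, 100))
```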
Distributions vs. Data
All these distributions are idealized models
In practice, we do not see distributions, but data
Thus, there will be some largest value we observe
Also, can be difficult to eyeball data and choose model
So how do we distinguish between Poisson, power law, etc?
Typical procedure:
might restrict our attention to a range of values of interest
accumulate counts of observed data into equal-sized bins
look at counts on a loglog plot
note that
power law:
log(Pr[X = x]) = log(1/x^a) = -a log(x)
linear, slope -a
Normal:
log(Pr[X = x]) = log(a exp(-x^2/b)) = log(a) - x^2/b
nonlinear, concave near mean
Poisson:
log(Pr[X = x]) = log(exp(-λ) λ^x / x!)
also nonlinear
Let's look at the assigned paper on dollar bill migration (http://www.kitp.ucsb.edu/community/news/Hufnagel_Paper.pdf)
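The "typical procedure" above can be sketched in a few lines of Python: draw samples with Pr[X = x] proportional to 1/x^a on a finite range, accumulate counts, and fit the slope of the log-log plot. Sample size, range, and a = 1.5 are illustrative choices, not from the assigned data:

```python
import math
import random

random.seed(0)
a = 1.5
support = list(range(1, 201))
weights = [x ** (-a) for x in support]
xs = random.choices(support, weights=weights, k=200_000)

# Accumulate counts of observed data (unit-width bins here)
counts = {}
for x in xs:
    counts[x] = counts.get(x, 0) + 1

pts = [(math.log(k), math.log(c)) for k, c in sorted(counts.items())]

# Least-squares slope through the log-log points: should come out near -a
n = len(pts)
mx = sum(px for px, _ in pts) / n
my = sum(py for _, py in pts) / n
slope = sum((px - mx) * (py - my) for px, py in pts) / sum((px - mx) ** 2 for px, _ in pts)
print(round(slope, 2))
```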
Zipf's Law
Look at the frequency of English words:
'the' is the most common, followed by 'of', 'to', etc.
claim: frequency of the nth most common ~ 1/n (power law, a = 1)
General theme:
rank events by their frequency of occurrence
resulting distribution often is a power law!
Other examples:
North American city sizes
personal income
file sizes
genus sizes (number of species)
let's look at log-log plots of these
People seem to dither over exact form of these distributions
e.g. value of a
but not over heavy tails
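The rank-by-frequency bookkeeping above, on a toy snippet of English. A snippet this small cannot exhibit a power law; it only illustrates the procedure:

```python
import re
from collections import Counter

text = """
the quick brown fox jumps over the lazy dog and the dog barks at the fox
while the cat watches the birds in the tree near the house of the farmer
"""

words = re.findall(r"[a-z]+", text.lower())
ranked = Counter(words).most_common()   # events ranked by frequency of occurrence

for rank, (word, freq) in enumerate(ranked[:3], start=1):
    print(rank, word, freq)
```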
Generating Heavy-Tailed Degrees: (Just) One Model
Preferential Attachment
Let's warm up with a little Matlab demo…
Start with (say) two vertices connected by an edge
For i = 3 to N:
for each 1 <= j < i, let d(j) be degree of vertex j (so far)
let Z = Σ d(j) (sum of all degrees so far)
add new vertex i with k edges back to {1,…,i-1}:
i is connected back to j with probability d(j)/Z
might need to do some smoothing to deal with zero-degree vertices
Vertices j with high degree are likely to get more links!
'Rich get richer' or 'Matthew Effect'
Natural model for many processes:
hyperlinks on the web
new business and social contacts
transportation networks
Generates a power law distribution of degrees
exponent depends on value of k
Let's look at the NetLogo simulation (http://ccl.northwestern.edu/netlogo/models/run.cgi?PreferentialAttachment.820.527)
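The loop above in Python, as a stand-in for the Matlab demo. The start configuration and the connection probability d(j)/Z follow the slide; resampling into a set (to avoid duplicate targets) is an implementation choice:

```python
import random

def preferential_attachment(N, k=1, seed=0):
    rng = random.Random(seed)
    degree = {1: 1, 2: 1}   # start with two vertices connected by an edge
    edges = [(1, 2)]
    for i in range(3, N + 1):
        verts = list(degree)
        degs = [degree[v] for v in verts]  # choices normalizes, so Pr[j] = d(j)/Z
        targets = set()
        while len(targets) < min(k, len(verts)):
            targets.add(rng.choices(verts, weights=degs)[0])
        for j in targets:                  # add vertex i with k edges back
            edges.append((i, j))
            degree[j] += 1
        degree[i] = len(targets)
    return degree, edges

degree, edges = preferential_attachment(2000, k=1)
# Heavy tail: the maximum degree dwarfs the average degree (about 2 here)
print(max(degree.values()), 2 * len(edges) / len(degree))
```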
Two Out of Three Isn't Bad…
Preferential attachment explains
heavy-tailed degree distributions
small diameter (~log(N), via 'hubs')
Will not generate high clustering coefficient
no bias towards local connectivity, but towards hubs
Can we simultaneously capture all three properties?
probably, but we'll stop here
soon there will be a fourth property anyway…
Search and Navigation
Finding the Short Paths
Milgram's experiment, Columbia Small Worlds, E-R, α-model…
all emphasize existence of short paths between pairs
How do individuals find short paths?
in an incremental, next-step fashion
using purely local information about the NW and location of target
note: shortest path might require taking steps away from the target!
This is not a structural question, but an algorithmic one
statics vs. dynamics
Navigability may impose additional restrictions on formation model!
Briefly investigate two alternatives:
a local/long-distance mixture model [Kleinberg]
a social identity model [Watts, Dodds, Newman]
Kleinberg's Model
Start with an n by n grid of vertices (so N = n^2)
add local connections: all vertices within grid distance p (e.g. 2)
add distant connections:
q additional connections
probability of connection at distance d: ~ (1/d)^r
c.f. dollar bill migration paper
so full model given by choice of p, q and r
large r: heavy bias towards more local long-distance connections
small r: approach uniformly random
Kleinberg s question:
what value of r permits effective search?
Assume parties know only:
grid address of target
addresses of their own direct links
Algorithm: pass message to neighbor closest to target
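A minimal Python sketch of the model and the greedy algorithm. The grid is small and p = q = 1, r = 2 are just one illustrative setting; Manhattan distance stands in for "grid distance":

```python
import random

def grid_dist(u, v):
    return abs(u[0] - v[0]) + abs(u[1] - v[1])

def build_graph(n, p=1, q=1, r=2, seed=0):
    rng = random.Random(seed)
    verts = [(x, y) for x in range(n) for y in range(n)]
    nbrs = {v: set() for v in verts}
    for u in verts:
        # local connections: all vertices within grid distance p
        for v in verts:
            if u != v and grid_dist(u, v) <= p:
                nbrs[u].add(v)
        # q long-distance connections, Pr[connect at distance d] ~ (1/d)^r
        others = [v for v in verts if v != u]
        weights = [grid_dist(u, v) ** (-r) for v in others]
        for v in rng.choices(others, weights=weights, k=q):
            nbrs[u].add(v)
            nbrs[v].add(u)
    return nbrs

def greedy_route(nbrs, src, dst):
    """Pass the message to the neighbor closest to the target; count steps."""
    cur, steps = src, 0
    while cur != dst:
        cur = min(nbrs[cur], key=lambda v: grid_dist(v, dst))
        steps += 1
    return steps

nbrs = build_graph(n=20)
steps = greedy_route(nbrs, (0, 0), (19, 19))
print(steps)  # never more than the grid distance of 38
```

With p = 1 the walk can always move strictly closer to the target, so greedy routing terminates; the long-distance links only shorten the trip.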
Kleinberg's Result
Intuition:
if r is too large (strong local bias), then long-distance connections never help much; short paths may not even exist
if r is too small (no local bias), we may quickly get close to the target; but then we'll have to use local links to finish
think of a transport system with only long-haul jets or donkey carts
effective search requires a delicate mixture of link distances
The result (informally):
r = 2 is the only value that permits rapid navigation (~log(N) steps)
any other value of r will result in time ~ N^c for 0 < c <= 1
N^c >> log(N) for large N
a 'critical value' phenomenon or 'knife's edge'; very sensitive
contrast with 1/d^(1.59) from dollar bill migration paper
Note: locality of information crucial to this argument
centralized algorithm may compute short paths at r < 2
can recognize when backwards steps are beneficial
Later in the course: What happens when distance-d edges cost d^r?
Navigation via Identity
Watts et al.:
we don't navigate social networks by purely geographic information
we don't use any single criterion; recall Dodds et al. on Columbia SW
different criteria used at different points in the chain
Represent individuals by a vector of attributes
profession, religion, hobbies, education, background, etc…
attribute values have distances between them (tree-structured)
distance between individuals: minimum distance in any attribute
only need one thing in common to be close!
Algorithm:
given attribute vector of target
forward message to neighbor closest to target
Permits fast navigation under broad conditions
not as sensitive as Kleinberg's model
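A toy sketch of the identity distance. The profession tree borrows the slide figure's categories (all jobs, split into scientists with chemistry/CS and athletes with baseball/tennis); the "hobby" attribute and the individuals are hypothetical:

```python
# Each attribute value maps to its path of categories below the root.
profession = {
    "chemistry": ("scientists", "chemistry"),
    "CS": ("scientists", "CS"),
    "baseball": ("athletes", "baseball"),
    "tennis": ("athletes", "tennis"),
}
hobby = {
    "chess": ("indoor", "chess"),
    "poker": ("indoor", "poker"),
    "surfing": ("outdoor", "surfing"),
}
trees = (profession, hobby)

def tree_dist(path_u, path_v):
    """Steps below the deepest common ancestor in the attribute tree."""
    common = 0
    for a, b in zip(path_u, path_v):
        if a != b:
            break
        common += 1
    return max(len(path_u), len(path_v)) - common

def identity_dist(u, v):
    # minimum distance in ANY attribute: one thing in common makes you close
    return min(tree_dist(t[u[i]], t[v[i]]) for i, t in enumerate(trees))

alice = ("chemistry", "chess")
bob = ("baseball", "poker")      # distant profession, similar hobby
carol = ("baseball", "surfing")  # distant in every attribute

print(identity_dist(alice, bob))    # -> 1
print(identity_dist(alice, carol))  # -> 2
```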
[Figure: example attribute tree for the identity model: all jobs, split into scientists (chemistry, CS) and athletes (baseball, tennis)]