image Home       image Fowles,       image Fitzgerald,       image r04 06 (9)       image R 22MP (3)       image 45 (3)       

Linki

[ Pobierz całość w formacie PDF ]
Journal of Computer and System Sciences 72 (2006) 1077–1093
www.elsevier.com/locate/jcss
Inoculation strategies for victims of viruses
and the sum-of-squares partition problem

James Aspnes
1
, Kevin Chang
2
, Aleksandr Yampolskiy

,
3
Department of Computer Science, Yale University, 51 Prospect Street, New Haven, CT 06520-8285, USA
Received 31 October 2004; received in revised form 9 February 2006
Available online 17 April 2006
Abstract
We propose a simple game for modeling containment of the spread of viruses in a graph of
n
nodes. Each node must choose
to either install anti-virus software at some known cost
C
, or risk infection and a loss
L
if a virus that starts at a random initial
point in the graph can reach it without being stopped by some intermediate node. We prove many game theoretic properties of the
model, including an easily applied characterization of Nash equilibria, culminating in our showing that a centralized solution can
give a much better total cost than an equilibrium solution. Though it is
NP
-hard to compute such a social optimum, we show that
the problem can be reduced to a previously unconsidered combinatorial problem that we call the
sum-of-squares partition problem
.
Using a greedy algorithm based on sparse cuts, we show that this problem can be approximated to within a factor of
O(
log
1
.
5
n)
.
©
2006 Elsevier Inc. All rights reserved.
Keywords:
Computer virus model; Economics of security; Security externalities; Price of anarchy; Sum-of-squares partition
1. Introduction
Consider a system in which
n
machines, each of which may or may not be protected against viruses, are connected
by a network in the form of a graph, and any virus that infects some machine eventually infects all of its unprotected
neighbors. If anti-virus software is available, a natural response would be to protect all the machines—but perhaps
the anti-virus software itself creates costs, both in money and time to purchase and install the software and in reduced
efficiency or usability of the protected machine. Suppose that protecting a machine by installing anti-virus software
costs the owner of the machine
C
, but that having the machine be infected costs
L
, which may or may not be greater
than
C
. If the virus spreads from some initial machine chosen uniformly at random, on which machines does it make
sense to install the anti-virus software?

A preliminary version of this paper appeared in the proceedings of Sixteenth Annual ACM–SIAM Symposium on Discrete Algorithms, January,
2005.
*
Corresponding author.
E-mail addresses:
aspnes@cs.yale.edu (J. Aspnes), kchang@cs.yale.edu (K. Chang), aleksandr.yampolskiy@yale.edu (A. Yampolskiy).
1
Supported in part by NSF grants CCR-0098078, CNS-0305258, and CNS-0435201.
2
Supported by NSF grant CCR-0331548.
3
Supported by NSF grants CCR-0098078, ANI-0207399, CNS-0305258, and CNS-0435201.
0022-0000/$ – see front matter © 2006 Elsevier Inc. All rights reserved.
doi:10.1016/j.jcss.2006.02.003
 1078
J. Aspnes et al. / Journal of Computer and System Sciences 72 (2006) 1077–1093
The answer will depend on whether we adopt the perspective of the owner of a single machine or of the society as a
whole. When the anti-virus software costs more than the loss from infection, no economically rational machine owner
will install the anti-virus software, every machine will be infected, and the system will incur a social cost of
Ln
.But
for many graphs, selective inoculation of a few central machines can limit the spread of infection to a small subset of
the graph, greatly reducing the total cost of infection in return for a small investment in anti-virus software. We can
ask how much of an improvement a centralized solution can provide, and how easy it is to find a good centralized
solution.
After discussing some previous work on related problems (in Section 2), we give a complete characterization of
the Nash equilibria for an anti-virus software installation game in which each machine’s owner separately chooses
whether or not to install the software, without regard to the effect on other machines. (This game is defined in Sec-
tion 3.) We show (in Section 4) that finding either the most or least expensive equilibrium is
NP
-hard, but that some
Nash equilibrium can be computed in
O(n
3
)
time and that any population of nodes will quickly converge to a Nash
equilibrium by updating their strategies locally based on the other nodes’ strategies. Unfortunately, the cost of any
such Nash equilibrium may be badly suboptimal; the
price of anarchy
forthisgameis
Θ(n)
in the worst case. This
shows that for many graphs and values of
C
and
L
, letting the users choose individually whether or not to inoculate
their machines will give bad results.
We then consider (in Section 5) the possibility of a centralized solution in which a dictator computes and enforces
an optimal inoculation plan. We show that essentially the same argument that shows that extreme Nash equilibria
are hard to find applies to the optimal solution as well. However, we show that the problem of finding an optimal
inoculation plan reduces to a graph partition problem in which we are asked to partition the graph by removing
m
nodes; the quality of the partition is measured by the sum of the squares of the sizes of its components. We give
(in Section 6) a polynomial-time approximation algorithm that removes
O(
log
1
.
5
n)m
nodes in order to achieve a
partition with quality within
O(
1
)
of the optimum. We complement our algorithm with results on the hardness of
approximation of the sum-of-squares partition problem.
Conclusions and open problems appear in Section 7.
2. Related work
In this section, we describe three classes of work related to this paper: virus propagation models, economic models
of investment in security, and game-theoretic models of security. We then discuss some work on graph partitioning
algorithms that are related to the sum-of-squares partition problem we consider in Section 6.
2.1. Virus propagation models
Traditional epidemiological models characterize the viral infection in terms of birth rate and death rate of the
virus [1,2]. Usually, these models assume that an infected individual is equally likely to infect any other individual in
the population; in contrast, computer viruses usually spread via localized interactions. Kephart and White extended the
traditional model by transferring it onto a directed random graph [3]. Later work (e.g., [4–6]) studied virus propagation
over other kinds of graphs, including Internet-like power-law graphs [7–9]. We do not restrict the network topology in
any way and consider a general undirected graph. Our model is in some ways closer to models in percolation theory
(see [10]): an infected node infects all of its unprotected neighbors, spreading infection throughout the graph until it
is blocked by an anti-virus software.
2.2. Economic models of security
Our work is motivated in part by an observation that security technologies exhibit network externalities [11].
Specifically, the benefit obtained by using security technology (anti-virus software in our case) does not accrue only
to the user of the security technology but rather to all users of the network. Aspnes et al. [12] also consider anti-virus
immunization, and proposed studying how to encourage highly connected nodes to use anti-viral techniques.
We assume that costs of installation and infection are known. Alternatively, one could use risk analysis to estimate
the costs and benefits from installing a security technology (see, for example, [13]), or estimate values based on
empirical studies of the costs of security breaches [14,15].
J. Aspnes et al. / Journal of Computer and System Sciences 72 (2006) 1077–1093
1079
2.3. Game-theoretic models of security
Application of game theory to network security has yielded interesting results [16–18]. For example, Bell uses a
simple game to study network reliability. In the game, the router tries to find a least cost path and a network tester
tries to maximize this cost by failing links [19]. Kunreuther and Heal recently introduced the notion of
interdependent
security
(IDS) games, in which decisions to adopt security technology by one agent affect other agents [20]. Kearns
and Ortiz subsequently extended their paper and gave an algorithm for finding approximate Nash equilibria in this
model [21].
Our work is similar to work on IDS games in certain respects: each agent in both our game and an IDS game
makes a decision whether or not to invest money in a security technology, and this decision affects other agents. The
main differences are that we assume that installing anti-virus software protects against all bad effects of viruses, while
the IDS work concentrates on negative side-effects of security breaches even on protected parties; and we assume a
restricted network topology that contains the spread of viruses, while the IDS work assumes a complete topology.
2.4. Graph partition problems
In Section 6, we describe and provide an approximate solution for a new graph partitioning problem. Previous work
on other forms of graph partitioning includes the approximation algorithm of Leighton and Rao [22] for
sparsest cut
,
from which the same authors derive a pseudo-approximation algorithm for
b-balanced cuts
, where each side of the cut
must have size
b
|
|
V
or greater. Arora et al. [23] recently improved the approximation ratios of these results. The case
of
b
1
/
2is
graph bisection
, for which Feige and Krauthgamer [24] give a good approximation algorithm. Even et
al. [25] give
O(
log
n)
-ratio pseudo-approximation algorithms for several balanced partitioning problems, including
the
ρ-separator problem
and the
k-balanced partitioning problem
.
=
3. Our model
We represent network topology by an undirected graph
G
=
(V , E)
, where
V
={
0
,
1
,...,n

1
}
is a set of network
hosts and
E
V
is a set of (bidirectional) communication links. Our basic model for installing anti-virus software
is a one-round game with the following features:

V
×
Players.
Our game has
n
players corresponding to nodes of the graph. Initially, all nodes are insecure and vulnerable
to infection.
Strategies.
We denote the
strategy
of
i
by
a
i
. Each node
i
has two possible actions:
do nothing
and risk being infected
or
inoculate itself
by installing anti-virus software. Node
i
’s strategy
a
i
is the probability that it inoculates
itself.
Nodes’ choices can be summarized in a
strategy profile
n
.If
a
i
is 0 or 1, we say that node
i
adopts a
pure strategy
; otherwise, its strategy is
mixed
. We call nodes that install anti-virus software
secure
and denote the set of such nodes by
I
a
∈[
0
,
1
]
a
. It is essentially
the network graph with secure nodes and their edges removed (see also Fig. 1). Note that both
I
a
a
. We associate an
attack graph G
=
G

I
a
with
a
and
G
a
are
random variables unless all strategies are pure.
Attack model.
After the nodes made their choices, the adversary picks some node uniformly at random as a starting
point for infection. Infection then propagates through the network graph. Node
i
gets infected if it has no
anti-virus software installed and if any of its neighbors become infected.
Individual costs.
Suppose it costs
C
to install anti-virus software. If a node is infected, it suffers a loss equal to
L
.
For simplicity, we assume that both
C
and
L
are known and are the same for all nodes; we discuss possible
consequences of removing these assumptions in Section 7.
The cost of a mixed strategy
n
a
∈[
0
,
1
]
to node
i
is
=
+

·
cost
i
(
a)
a
i
C
(
1
a
i
)L
p
i
(
a).
Here
p
i
(
a
,
conditioned on the event
that node i does not install the anti-virus software
. It is equal to the probability that some vulnerable node
reachable from
i
without passing through a secure node is the initial point of infection. For pure strategies,
this is just
k
i
/n
, where
k
i
is the size of the connected component containing
i
in the attack graph
G
a)
is the probability of node
i
being infected given the strategy profile
a
.
1080
J. Aspnes et al. / Journal of Computer and System Sciences 72 (2006) 1077–1093
Fig. 1. Sample graph
G
and its attack graph
G
a
for
a
=
010100.
Social cost.
The total social cost of a strategy profile is the sum of the individual costs. For pure strategies, there is
a simple characterization of the total social cost in terms of the attack graph
G
a
. Because each node in the
same component of
G
a
has the same chance of infection, the
k
i
nodes in the
i
th component between them
face a loss of
k
i
·
(L/n)k
i
. So the social cost is
=
(Lk
i
/n)
n

1
n

1
l
L
n
k
i
,
cost
(
a)
=
cost
j
(
a)
=
a
j
C
+
(
1

a
j
)L
·
p
j
(
a)
=
C
|
I
|+
a
j
=
0
j
=
0
i
=
1
where
k
1
,k
2
,...,k
are the sizes of the components in
G
a
.
4. Nash equilibria
We consider first the choices that the nodes will make in the absence of coordination, by examining the Nash
equilibria of the game defined in Section 3. The assumption that the nodes will reach a Nash equilibrium is a very
strong one, as it requires assuming that the nodes are aware of each other’s choices to install or not and that the nodes
can evaluate
C
(printed on the box for the anti-virus software) and
L
(which is more problematic). It also assumes
that the nodes can compute a Nash equilibrium in a reasonable amount of time, which is not always possible for some
games. However, we can show that Nash equilibria for our game are easily characterized in terms of the sizes of the
components of the attack graph (Section 4.1), and that a population will converge to some Nash equilibrium quickly
even though finding the best or worst
pure
equilibrium as measured by total cost is
NP
-hard (Section 4.2).
We can further imagine that some of the difficulties of limited information could be overcome by considering an
iterated game where nodes pay
C
to rent the anti-virus software in each round and update their strategies based on
observations of losses to viruses and the strategies of other nodes in previous rounds; though we do not analyze this
multi-round game formally, a simplified version is implicit in our convergence result. We also show that the hardness
of finding the worst-case equilibrium does not prevent obtaining further information about its behavior; for example,
its total cost is non-decreasing as a function of the inoculation cost
C
(Section 4.3).
Unfortunately, selfish behavior proves to be highly undesirable, because the cost of a Nash equilibrium solution
may be very far from the social optimum. In Section 4.4, we prove that while the
price of anarchy
, defined as the ratio
of total cost between the worst Nash equilibrium and the social optimum never exceeds
n
, this bound is tight up to
constant factors for some graphs and choices of
C
and
L
.
4.1. Characterization of mixed and pure equilibria
A useful feature of the Nash equilibrium for our game is its simple characterization: there is always a component-
size threshold
t
Cn/L
such that each node will install the anti-virus software if it would otherwise end up in
a component of vulnerable nodes with expected size greater than
t
, and will not install the software if it would
otherwise end up in a component with expected size less than
t
. When the expected component size equals
t
, the node
is indifferent between installing and not installing and may adopt a mixed strategy. The threshold arises in a natural
=
J. Aspnes et al. / Journal of Computer and System Sciences 72 (2006) 1077–1093
1081
way: it is the break-even point at which the cost
C
of installing the software equals the expected loss
L(t/n)
of not
installing.
We define
a
[
i/x
]
to be the strategy vector that is identical to
a
, except the
i
th component
a
i
is replaced by
x
.Note
that attack graph
G
is the attack graph in which player
i
never installs the anti-virus software.
a
[
i/
0
]
Theorem 1
(Characterization of mixed equilibria). Suppose S(i) is the expected size of the insecure component that
contains node i of the attack graph G
(
i.e., S(i)
=
np
i
(
a)
)
.
a
[
i/
0
]
Fix C,L. Let the threshold be t
=
Cn/L. A strategy profile
a is a Nash equilibrium if and only if
(a)
for all i such that a
i
=
1
, S(i)
t
;
(b)
for all i such that a
i
=
;
(c)
for all i such that
0
<a
i
<
1
, S(i)
0
, S(i)
t
=
t .
Proof.
Suppose
a
is a Nash equilibrium and consider node
i
. The expected cost to node
i
is
a
i
C
+
(
1

a
i
)(L/n)S(i)
.
(1) Suppose
a
i
=
0. Then node
i
has expected cost
(L/n)S(i)
.If
(L/n)S(i) > C
, then
i
will want find the situation
a
i
=
1 with cost
C
preferable. Thus, we must have
S(i)
CL/n
=
t
.
(2) Suppose
a
i
=
1. Then node
i
has expected cost
C
.If
(L/n)S(i) < C
, then
i
would find the situation
a
i
=
0 with
expected cost
(L/n)S(i) < C
preferable. Thus, we must have
S(i)
CL/n
=
t
.
(3) Suppose 0
<a
i
<
1. If
(L/n)S(i) > C
, then
i
will find the situation
a
i
=
1 preferable. If
(L/n)S(i) < C
, then
i
will find the situation
a
i
=
=
=
0 preferable. Thus, we must have
S(i)
CL/n
t
.
Thus,
a
satisfies condition (a), (b), and (c) above.
Conversely, suppose
a
satisfies conditions (a), (b) and (c) of the theorem. Consider node
i
.
(1) Suppose
a
i
=
0. Then node
i
will have expected cost
(L/n)S(i) < C
, and thus will not want to switch to a
different
a
i
that puts any weight on installing at cost
C
.
(2) Suppose
a
i
=
1. Then node
i
will have cost
C
, and thus will not want to switch to a different
a
i
that puts any
weight on being insecure at expected cost
(L/n)S(i)
C
.
(3) Suppose 0
<a
i
<
1. Then node
i
will have expected cost
a
i
C
+
(
1

a
i
)(L/n)S(i)
=
C
. Switching to any other
strategy will have the same expected cost.
Thus,
a
is a Nash equilibrium.
A special case of Theorem 1 is the following characterization for pure Nash equilibria. Because nodes in a pure
Nash equilibrium do not make randomized choices, the attack graph is not a random object, but a determined graph.
We have the same threshold conditions as before, but the removal of randomness simplifies the situation.
Corollary 2
(Characterization of pure equilibria). Fix C,L. Let the threshold be t
=
Cn/L. A strategy profile
a is a
pure Nash equilibrium if and only if
(a)
every component in attack graph G
a
has size at most t
;
(b)
inserting any secure node j

I
a
and its edges into G
a
yields a component of size at least t .
=
=
=
=
For example, let
C
0
.
5 and
L
1, and consider the graph in Fig. 1. The threshold for this graph is
t
Cn/L
3.
Then Corollary 2 tells us that pure strategy
a
=
010100 must be a Nash equilibrium for these
C
and
L
.
4.2. Computing pure Nash equilibria
Designing algorithms for finding mixed Nash equilibria or proving hardness results for finding optimized mixed
equilibria would most likely involve estimating or otherwise manipulating the expected value of the sizes of com-
ponents in the attack graph, which is at the very least a non-trivial problem. Furthermore, in the absence of central
[ Pobierz całość w formacie PDF ]

  • zanotowane.pl
  • doc.pisz.pl
  • pdf.pisz.pl
  • zolka.keep.pl