˖ڍመڙጲ
竞争双边拍卖定价策略的博弈论分析
石兵 ¹,黄亚龙 ¹, 王瑾雯 ¹
¹ 武汉理工大学计算机科学与技术学院,武汉,430070
摘要:本论文使用博弈论对竞争双边拍卖市场机制中的定价策略进行分析,在此设定下,交易
智能体可以任意选择市场,其市场选择和报价策略与市场的定价策略互相影响,因此本论文使
用一种基于 fictitious play 的联合学习算法来分析此问题。本论文分析了双竞争拍卖市场如何
在差异定价和均衡定价策略下选择合适的定价策略并设置合适的 k。通过数值实验发现,当双
市场均使用同种定价策略时,他们可以共存,但当使用不同种的定价策略时,差异定价获胜,
并吸引所有的交易智能体,此时双市场不能共存。
关键词:人工智能,多智能体系统,定价策略,交易策略,纳什均衡
中图分类号: TP18
A Game-Theoretic Analysis of Pricing Strategies
in Competing Double Auctions
SHI Bing¹,HUANG Yalong¹, WANG Jinwen¹
¹ School of Computer Science and Technology, Wuhan University of Technology, Wuhan,
430070
Abstract: In this paper, we use game theory to analyse the pricing policies of competing
double auction marketplaces, where traders can choose to participate in any of the markets.
The market selection and bidding strategies of traders are affected by the pricing policies and
vice versa, and so we propose a co-learning algorithm based on fictitious play to analyse this
problem. In more detail, we investigate a setting with two competing marketplaces who can
adopt one of two types of pricing policies: equilibrium k pricing policy and discriminatory k
pricing policy, where k is a parameter which determines the transaction price. We find that,
when both marketplaces use the same type of pricing policy, they can co-exist in equilibrium,
and they have an extreme bias to one side the market (buyers or sellers) when setting k.
When both marketplaces adopt different types of policies, we find that all traders tend to
converge to the discriminatory k pricing policy and so the two competing marketplaces can no
longer co-exist.
Key words: Artificial Intelligence, Multi-Agent Systems, Prcing Strategy, Trading Strategy,
Nash Equilibrium
Foundations: National Natural Science Foundation of China (No. 61402344), Specialized Research Fund for the Doctoral
Program of Higher Education of China (No. 20120092120033)
Author Introduction: Correspondence author: Shi Bing (1982-), male, associate professor, major research direction:
multi-agent systems. Huang Yalong (1992-), male, master student, major research direction: multi-agent systems. Wang
Jinweng (1995-), female, master student, major research direction: multi-agent systems.
- 1 -
˖ڍመڙጲ
0 Introduction
A double auction marketplace allows multiple buyers and sellers to trade goods simulta-
neously [1]. As a highly efficient market mechanism, it has been widely adopted for trading
commodities and financial instruments, but also by online exchanges (. Internet Ad ex-
changes). Due to the globalised economy, such marketplaces typically do not exist in isolation
and compete with other marketplaces. Hence, they require effective market policies to attract
both sides of the market, . buyers and sellers. Indeed, marketplaces which work well in an
isolation may not attract sufficient trade when faced with competition. In particular, market-
places need to decide when to clear the markets (. the clearing policy) and how to set the
transaction prices (. the pricing policy) [2]. The transaction price is always set inbetween
a buyer’s bid offer (its willingness to pay) and a seller’s ask offer (its minimum selling price).
Depending on the pricing policy, there could be a bias in favour of the seller or the buyer.
Arguably, the pricing policy is the most important when considering competition since this can
significantly affect a trader’s profits and, thus, their choice of marketplaces. Therefore, in this
paper, we will analyse how double auction marketplaces determine the transaction prices when
they compete against each other.
Specifically, in this paper, we considering so-called clearing houses, where matching of
buyers and sellers occurs once all traders have submitted their offers. For such markets the
pricing policies can be divided into two categories: discriminatory pricing and uniform pricing
[2]. A discriminatory pricing policy may set different prices for different transactions during a
single clearing event, whereas a uniform pricing policy sets the same price for all simultaneous
transactions. A typical example the former is the so-called discriminatory k pricing policy.
Here, the marketplace sorts the bids descendingly and the asks ascendingly, and matches the
buyer with the vth highest bid with the seller with the vth lowest ask if the ask is not greater
than the bid. The transaction price is set at point inbetween the bid and ask as determined
by a parameter k 2 [0; 1] where, in the extreme cases, if k = 1, the bid is used and, if k = 0,
the ask is used. In contrast, a uniform pricing policy typically uses the equilibrium price [3]
as the transaction price. The equilibrium price is determined as follows. We define the ask
quote as the minimum of the lowest tentatively matchable bid and the lowest unmatchable ask,
and define the bid quote as the the maximum of the highest tentatively matchable ask and the
highest unmatchable bid. Then the bid quote and ask quote define the range of the equilibrium
price (note that the bid quote is not greater than the ask quote). Similar to the disciminatory
pricing policy, a parameter k is used to determine the transaction price, but now using the
ask and bid quotes instead. When k = 0, the bid quote is used as the equilibrium price, and
when k = 1, the ask quote is used. In this paper, we refer this policy as the equilibrium
k pricing policy. We assume that marketplaces adopt either equilibrium k pricing policy or
discriminatory k pricing policy, and they can set the values of k 2 [0; 1].
- 2 -
˖ڍመڙጲ
A number of existing works have analysed the pricing policies for double auction market-
places. The work by [4] was one of the first to game-theoretically analyse a k-pricing policy for
bilateral trade. The efficiency of markets using a discriminatory pricing policy was experimen-
tally analysed in [5]. A genetic algorithm was used in [6] to find the best price to maximise the
market’s allocative efficiency and traders’ market power when the marketplace uses a discrimi-
natory pricing policy. A uniform pricing policy was proposed in [3] to choose the bid/ask quote
as the transaction price. There exist many works which analyse pricing policies in different
types of double auctions, such as [7, 8, 9]. However, crucially, none of these papers analyse
competing double auctions. In [10], researchers designed pricing policies empirically in a setting
with competing marketplaces. However, there is no theoretical analysis. In contrast, in this
paper we apply a game-theoretic approach for deriving the equilibria of both the traders and
the market policies. Similar to this work, the work in [11] analysed the Bayes-Nash equilibrium
bidding strategy in double auctions, but did not analyse the pricing policy.
Specifically, our analysis considers heterogeneous traders with continuous, privately-known
types (the type describes the trader’s monetary value for the item it is selling or buying), and we
focus on the competition between two double auction marketplaces. We propose a co-learning
algorithm based on fictitious play to analyse how marketplaces set transaction prices and how
traders select marketplaces and submit offers in Bayes-Nash equilibrium, in which each agent
maximises its expected payoff given the beliefs about the other agent’s types and the strategies
played by the other agents. In so doing, this is the first work which analyses transaction prices
in the competing environment from a theoretical perspective while taking into account the
interactions between traders’ market selection and bidding strategies and marketplaces’ pricing
strategies.
In more detail, this paper advances the state of art in the following ways. We derive
the equations to calculate the expected utilities of traders and marketplaces. We then use
a co-learning algorithm based on fictitious play to anlayse how competing marketplaces set
transaction prices and how traders select marketplace and submit offers in equilibrium. Firstly,
we analyse the case where both marketplaces use the same type of pricing policy and we find
that, for this setting, both marketplaces can co-exist in equilibrium. In order to attract traders,
each marketplace has an extreme bias in favour of one of the sides of the market (., buyers
or sellers) when setting transaction prices. We then extend the analysis to the cases where
marketplaces can use different types of pricing policies. We find that the discriminatory k
pricing policy has a better performance than the equilibrium k pricing policy. As a result, all
traders will converge to the former, which means that the two competing marketplaces can no
longer co-exist.
The remainder of the paper is structured as follows. In Section 1, we describe the formal
model and derive the expected utility equations. In Section 2, we introduce the co-learning
- 3 -
˖ڍመڙጲ
algorithm. In Section 3, we use this algorithm to analyse the pricing strategies. Finally, we
conclude in Section 4.
1 The Framework
In this section, we first introduce the basic settings for analysing our problems, and then
derive the equations to calculate the expected utilities of the traders and marketplaces.
Basic Settings
Let B = f1; 2; :::Bg and S = f1; 2; :::Sg denote the set of buyers and sellers respectively. We
assume that each buyer and each seller can trade one unit of a good and all goods are identical.
Each buyer and seller has a type denoted by �b and �s respectively. The type of a buyer is its
limit price, ., the highest price it is willing to buy the item for, and the type of a seller is its
cost price, ., the lowest price it is willing to sell the item for. We assume that the types of
all buyers are drawn from a cumulative distribution function F b, with support [0; 1], and
the types of all sellers are drawn from a cumulative distribution function F s, with support
[0; 1]. The distributions F b and F s are assumed to be common knowledge and differentiable.
The probability density functions are f b and f s respectively. In our setting, the type of each
specific trader is not known to other traders, and so constitutes private information.
In addition, we assume that there exist two competing double auction marketplaces M =
f1; 2g, that offer places for trade and determine the transaction prices for the buyers and sellers.
The marketplace is cleared when all traders have submitted their offers, and can adopt one of
two types of pricing policies: equilibrium k pricing policy and discriminatory k pricing policy.
We now describe how traders select marketplaces and submit offers. The discretise the offer
space and the allowed offers of buyers and sellers are given by the set = f0; 1
D
; 2
D
; :::; D�1
D
; 1g[
f g, . there are D+1 possible offers from 0 to 1 with step size 1/D (D is a natural number),
and means not submitting an offer in the marketplace (. not choosing the marketplace).
In addition to choosing how much to bid, traders can also choose where to bid. We refer to
this combination as a trader’s action. For a buyer this is given by a tuple �b = hdb1; db2i where
the buyer bids dbm 2 in marketplace m if dbm 6= , and does not choose marketplace m if
dbm = . Similarly, a seller’s action is given by �s = hds1; ds2i. In our setting, traders are only
allowed to bid in one marketplace at a time. Therefore at least one of the bids in the tuple
is equal to . The set of all allowed actions constitutes the action space, which is defined as
� = f� 2 M : 9i 2 M; 8j 6= i; di 2 and dj = g. Note that both buyers and sellers have
the same action space.
A trader’s action choice depends on its type. Hence, a strategy, is defined as a mapping
from the set of types to the action space. Formally, we use �b : [0; 1]! � and �s : [0; 1]! � to
- 4 -
˖ڍመڙጲ
denote the buyer and the seller trading strategies respectively. Note that the expected utility
of a trader is directly dependent on its beliefs about other traders’ action choices. Therefore,
instead of looking at traders’ strategies, in what follows, the expected utility is expressed directly
in terms of traders’ action distributions. Specifically, we use !bi to denote the probability of
action �bi being chosen by a buyer, and use !si to denote the probability of action �si being
chosen by a seller. Furthermore, we use
b =
�
!b1; !
b
2; :::; !
b
j�j
�
, Pj�ji=1 !bi = 1, to represent the
probability distribution of buyers’ actions, and
s = (!s1; !s2; :::; !sj�j) for the sellers’ action
distribution. Note that, given a trader’s strategy and the type distribution function, we can
derive the probability of a certain action being played by the trader. Specifically, we use
��b(�bi ) � [0; 1] to denote the set of buyer types using action �bi . Then the probability of
the action �bi being played is !bi =
R
x2��b(�bi ) f
b(x)dx. The calculation of the seller action
distribution is analogous.
The marketplace’s pricing strategy is defined as follows. We use P = fequilibrium k
pricing policy, discriminatory k pricing policyg to denote the allowed pricing policies. For the
pricing parameter k 2 [0; 1], we discretise it from 0 to 1 with step size K. Then the allowed
space of pricing parameter k is K = f0; 1
K
; 2
K
; :::; K�1
K
; 1g. When setting the transaction price,
the marketplace needs to select a pricing policy from P, and choose k from K. Therefore,
the pricing action of marketplace m can be defined as �km = hp; ki 2 �P where �P = P � K
is the allowed pricing action space. Now the mixed pricing strategy of marketplace m can be
defined as m : �P ! [0; 1] where
P
�km2 �P m(
�km) = 1. The pricing actions of both marketplaces
constitute a pricing action tuple �k = h�k1; �k2i 2 �K where �K = �P � �P is the allowed pricing action
tuple space.
The Trader’s Expected Utility
We now derive equations to calculate expected utilities of traders given fixed pricing poli-
cies. In what follows, we derive the expected utility of a buyer, but the seller’s is calculated
analogously. A buyer’s expected utility depends on its type, its own bid, marketplaces’ pricing
actions, and its beliefs about action choices of other traders. We calculate the expected utility
of a buyer with type �b when using �b = hdb1; db2i given the other buyers’ action distribution
b and the sellers’ action distribution
s, and given the marketplaces’ pricing action tuple
�k = h�k1; �k2i.
For the buyer’s action �b = hdb1; db2i, when db1 = db2 = , . the buyer does not enter
any marketplace, the buyer’s expected utility is zero. When the buyer chooses to only enter
marketplace m (. dbm 6= ), its expected utility over action �b is the expected profits when
the buyer bid dbm in marketplace m with pricing action �km, which is computed as follows.
Firstly, we need to sort the buyers’ bids descendingly and sellers’ asks ascendingly in the
marketplaces. The reason is as follows. When the marketplace adopts discriminatory k pricing
- 5 -
˖ڍመڙጲ
policy, we need to know the position of the buyer’s bid in the marketplace, which determines
its matching with sellers. When the marketplace adopts equilibrium k pricing policy, we need
to sort them to find the ask quote and bid quote, in order to decide the equilibrium price range.
Specifically, we use a j�j-tuple �x = hx1; :::xj�ji 2 X to represent the number of buyers choosing
different actions, where xi is the number of buyers choosing action �bi , X is the set of all such
possible tuples and we havePj�ji=1 xi = B�1 (note that we need to exclude the buyer for which
we are calculating the expected utility). The probability of exactly xi buyers choosing action
�bi is
�
!bi
�xi , and the probability of this tuple appearing is:
�b(�x) =
B � 1
x1; :::; xj�j
!
�
j�jY
i=1
�
!bi
�xi
(1)
Now for a particular �x, we determine the buyer’s position as follows. Firstly, we obtain the
number of other buyers whose bids are greater than the buyer’s bid dbm, which is given by:
X>(�x; dbm) =
X
�bi2�:dbim>dbm
xi (2)
where dbim is the bid placed by the ith action �bi in marketplace m. Similarly, we use X=(�x; dbm)
to represent the number of buyers whose bids are equal to the buyer’s bid (excluding the buyer
itself):
X=(�x; dbm) =
X
�bi2�:dbim=dbm
xi (3)
Due to having discrete bids and given X>(�x; dbm) buyers bidding higher than the buyer’s bid
dbm and X=(�x; dbm) buyers bidding equal to dbm, the buyer’s position v�x given �x in marketplace
m could be anywhere from X>(�x; dbm) + 1 to X>(�x; db) + X=(�x; dbm) + 1, which constitutes
the buyer’s position range. We use V�x = fX>(�x; dbm) + 1; :::; X>(�x; dbm) + X=(�x; dbm) + 1g to
denote the position range. Since X=(�x; dbm) + 1 buyers have the same bid, a tie-breaking rule
is needed to determine the buyer’s position. We adopt a standard rule where each of these
possible positions occurs with equal probability, . 1/(X=(�x; dbm) + 1).
The buyer’s expected utility also depends on sellers’ action choices. Specifically, we use
a j�j-tuple �y = hy1; :::yj�ji 2 Y to represent the number of sellers choosing different actions,
where yi is the number of sellers choosing action �si , and Y is the set of all such possible tuples
and we have Pj�ji=1 yi = S. The probability of this tuple appearing is:
�s(�y) =
S
y1; :::; yj�j
!
�
j�jY
i=1
�
!si
�yi
(4)
Now given the buyer’s positions v�x and the number of sellers choosing different actions
�y, we are ready to calculate the buyer’s expected utility. When the pricing action �km chooses
discriminatory k pricing policy and sets the pricing parameter as km, given the tuple �y, we can
sort the asks of the sellers in marketplace m descendingly. Then the ask which is vth�x highest
- 6 -
˖ڍመڙጲ
will be matched with the buyer’s bid. We denote this ask as dsm. Now the buyer’s expected
utility can be calculated:
U(v�x; �y; �
b
; d
b
m;
B
;
S
; �k) =
(
�b � TP if dbm � dsm
0 if dbm < dsm
where TP = dbm � km + dsm � (1� km) is the transaction price.
When the pricing action �km chooses the equilibrium k pricing policy and sets the pricing
parameter as km, the calculation is as follows. Given the number of buyers choosing different
actions �x and the number of sellers choosing different actions �y, we sort the bid descendingly
and asks ascendingly, and then obtain the bid quote and ask quote in marketplace m, which
are denoted as Qbm and Qsm respectively. Then the equilibrium price of the marketplace is
EP = Qsm � km +Qbm � (1� km). Furthermore, according to the sorted bids and asks, we can
get the exact number of transactions that will be made in the marketplace, which is denoted
as T . Now the buyer’s expected utility is given by:
U(v�x; �y; �
b
; d
b
m;
B
;
S
; �k) =
(
0 if v�x > T
�b � EP if v�x � T
Finally, by considering all possible numbers of sellers choosing different actions, all possible
positions and all possible numbers of buyers choosing different actions, the buyer’s expected
utility is given by:
~U(�
b
; �
b
;
b
;
s
; �k) =
X
�x2X
�
b
(�x)�
X
v�x2V�x
1
X=(�x; db) + 1
�
X
�y2Y
�
s
(�y)� U(v�x; �y; �b; db;
B;
S ; �k) (5)
The Marketplace’s Expected Utility
After deriving equations to calculate the expected utilities of traders, we now calculate the
expected utilities of marketplaces. In this paper we assume that the marketplace’s expected
utility is equal to the expected number of transactions made in that marketplace, which means
marketplaces aim to maximise the number of transactions when making decisions. In the
following, we calculate the expected utility of marketplace m when it takes the pricing action
�km. Firstly, we calculate its expected utility given the action distributions of buyers and sellers,
b and
s, which are conditional on the pricing action tuple �k. In the following, we use
bj�k
and
sj�k to represent
b and
s conditional on the pricing action tuple �k respectively.
Intuitively, we can see that the expected number of transactions depends on the number
of traders choosing each allowed action. Similarly, we use a j�j-tuple �x = hx1; :::; xj�ji 2 X 0,Pj�j
i=1 xi = B, to denote the number of buyers choosing different actions, where xi is the exact
number of buyers choosing action �bi , and X 0 is the set of all such possible tuples. We use
�y = hy1; :::; yj�ji 2 Y,
Pj�j
i=1 yi = S, to denote the number of sellers choosing different actions.
Given the number of buyers and sellers choosing different actions, �x and �y, we will know what
exact bids and asks are placed in the marketplace. Then the marketplace’s expected number of
transactions is calculated as follows. We sort the bids descendingly and asks ascendingly in the
- 7 -
˖ڍመڙጲ
marketplace, and will know how many transactions will be made. Specifically, we assume that
there are T transactions made in the marketplace. Furthermore, the probability of �x appearing
is:
%b(�x) =
B
x1; :::; xj�j
!
�
j�jY
i=1
�
!bi
�xi
(6)
and the probability of �y appearing is:
%s(�y) =
S
y1; :::; yj�j
!
�
j�jY
i=1
�
!si
�yi
(7)
At this moment, we can get the marketplace’s expected number of transactions given the
conditional action distributions
bj�k and
sj�k:
~U
�
bj�k;
sj�k� = X
�x2X 0
%b(�x)�
X
�y2Y
%s(�y)� T (8)
Then the expected utility of marketplace m taking pricing action �km is:
~U
�
�km
�
=
X
�km02 �P
m0(�km0)� ~U
�
bj�k;
sj�k� (9)
where �k = h�km; �km0i, �km0 is the pricing action of marketplace m0.
2 A Co-Learning Algorithm
In this section, we introduce the co-learning algorithm based on fictitious play to derive
the Bayes-Nash equilibrium pricing strategies of marketplaces and trading strategies of traders.
In the standard FP algorithm [12, 13], opponents are assumed to play a mixed strategy. By
observing relative frequencies of different actions, the player can estimate their opponents’
mixed strategies, and take the best response to those strategies. The observed frequencies of
opponents’ actions are termed FP beliefs. In each round, all players estimate their opponents’
mixed strategies and update their FP beliefs, and play a best response to their FP beliefs. All
players continually iterate this process until it converges, which reaches a Bayes-Nash equi-
librium. The standard FP algorithm is suitable for analysing marketplaces’ pricing strategies
since the observed pricing action frequency can be considered as the mixed pricing strategies.
However, the standard one is not suitable for analysing Bayesian games with incomplete in-
formation about traders’ types. By observing the frequency of opponents’ actions, we cannot
know the actual strategy of a player since we do not know which type performs which action.
The same as the work done in [11], we use a generalised fictitious play algorithm [14] to analyse
traders’ strategies with continuous types and a finite action space. Moreover, in reality, it is
impossible to run the algorithm to convergence since it involves an infinite number of iteration
- 8 -
˖ڍመڙጲ
rounds. Therefore, it is often used to approximate the Bayes-Nash equilibrium (. deriving
the �-Bayes-Nash equilibrium) by running the fictitious play algorithm for a limited number of
rounds.
We now introduce the co-learning algorithm based on the generalised fictitious play al-
gorithm which takes into account the interaction of trading strategies of traders and pricing
strategies of marketplaces in detail. Firstly, we initialise one set of FP beliefs for the traders
and one set of FP beliefs for the marketplaces. Secondly, according to the FP beliefs of traders,
marketplaces predicate traders’ behavior when they adopt each possible pricing action tuple.
Specifically, they compute the best response action to maximise the buyer’s expected utility un-
der each possible pricing action tuple �k (and do similar for sellers), and then predicate traders’
FP beliefs conditional on the corresponding pricing action Thirdly, based on the pred-
icated FP beliefs of traders conditional on each possible pricing action tuple, the marketplace
can find the best response pricing action to maximise the number of transactions. This pricing
action is actually chosen by the marketplace in the current FP iteration round. According to
the actual pricing actions chosen by marketplaces in the current round, traders compute the
best response actions, and update FP beliefs. Finally, we check whether it is converged, and if
not, repeat this process from the second step. The sketch of this algorithm is shown in Figure
1. In the following we introduce this algorithm in detail.
Recall that we use
b and
s to denote the probability distributions of buyers’ and sellers’
actions respectively. In the co-learning algorithm, we use them to represent FP beliefs about
the buyers’ and sellers’ actions respectively. We also use 1 and 2 to represent the FP beliefs
of pricing actions of marketplace 1 and 2 respectively.
In the beginning of each iteration round, according to traders’ FP beliefs
b,
s, each
marketplace computes traders’ best response action of market selection and bidding under
each possible pricing action tuple. From this, marketplaces can predicate FP beliefs of traders
for the next iteration round when they adopt a specific pricing action tuple. Based on the
predicated FP beliefs, the marketplace can compute its expected number of transactions when
adopting a specific pricing action, and then can determine the best pricing action to maximise
the number of transactions. In more detail, marketplaces compute buyers’ best response against
each possible pricing action tuple �k = h�k1; �k2i as follows (and similarly for sellers):
�
b�
(�
bj
b;
s; �k) = argmax�b2� ~U(�b; �b;
b;
s; �k) (10)
where �b� is the best response action of the buyer with type �b given FP beliefs
b,
s and
pricing action tuple �k. From the buyer’s expected utility equations (see Equation 5) we note
that the buyer’s expected utility ~U(�b; �b;
b;
s; �k) is linear in its type �b for a given action.
Given this, and given a finite number of actions, the best response function is the upper envelope
of a finite set of linear functions, and thus is piecewise linear. Note that for each possible �k, we
need to calculate the best response action for traders.
1Note that at this step, the marketplaces do not actually determine the pricing actions, but only predicate traders’ behavior
under each possible pricing action tuple. Based on the prediction of traders’ FP beliefs, they can choose the best pricing
action to maximise the number of transactions.
- 9 -
˖ڍመڙጲ
1. Initialise FP beliefs of traders’ actions and FP beliefs
of marketplaces’ pricing actions.
2. Marketplaces compute traders’ best response actions
under each possible pricing action tuple, and then can
predicate FP beliefs of traders conditional on each
possible pricing action tuple.
4. According to the actually chosen pricing actions in this
round, traders compute the best response actions and
update the FP beliefs.
if converged
4. Reach Nash equilibrium.
Yes
No
3. According to the predicated FP beliefs of traders
conditional on each possible action tuple, each
marketplace can find the best response pricing action to
maximise the number of transactions. This pricing action
is actually chosen by the marketplace in the current FP
iteration round. Then marketplaces update their FP beliefs.
图 1: The co-learning algorithm.
Based on the computed best response actions for traders, marketplaces can predicate FP
beliefs of traders, which is conditional on �k. From the set of piecewise linear functions, we
obtain the sets of types corresponding to the best response actions. Based on this, we can
calculate the best response action distribution of buyers, which is done as follows. Given that
the upper envelope is a piece-wise linear function, the set of buyer types corresponding to the
best response action �bi is ��b(�bi ). Then the probability of action �bi being played by a buyer is
!bi =
R
x2��b(�bi ) f
b(x)dx. By calculating the probability of each action being used, marketplaces
can obtain the current best response action distribution of buyers, denoted as
bbrj�k. They can
predicate the FP beliefs of buyers’ actions given �k, which is given by:
bpredicatedj�k =
�
� + 1
�
b� +
1
� + 1
�
bbrj�k
where
bpredicatedj�k is the predicated FP beliefs of the buyers’ actions when marketplaces adopt
- 10 -
˖ڍመڙጲ
pricing action tuple �k,
b� is the FP beliefs in the current iteration round � , and
bbrj�k is
the probability distribution of best response actions conditional on �k against FP beliefs
b� .
The computation of the sellers’ best response function and the predication of belief updates is
analogous.
After predicating the traders’ FP beliefs conditional on each possible �k, each marketplace
now computes the best response pricing action against the predicated FP beliefs conditional
on �k,
bpredicatedj�k and
spredicatedj�k, which is:
�k�m = argmax�km2 �P ~U(�km) (11)
where �k�m is the best response action for marketplace m, and ~U(�km) is the expected number of
transactions for marketplace m when adopting the pricing action �km given the predicated FP
beliefs of traders
bpredicatedj�k and
spredicatedj�k. Now marketplaces can actually determine the
best pricing action for the current iteration round. Then the FP beliefs of marketplace m can
be updated as:
m(�k) =
(
�� m(�k)
�+1
if �k 6= �k�m
�� m(�k)+1
�+1
if �k = �k�m
Now according to the actual chosen pricing actions by marketplaces, which constitute a
pricing action tuple �k�, buyers compute the best response actions as follows:
�
b�
(�
bj
b;
s; �k�) = argmax�b2� ~U(�b; �b;
b;
s; �k�) (12)
The probability of action �b�i being played by a buyer is !bi =
R
x2��b(�b�i ) f
b(x)dx. By calculating
the probability of each action being used, we obtain the current best response action distribution
of buyers, denoted as
bbrj�k�. Then traders update FP beliefs of buyers for the next iteration
round:
b�+1 =
�
� + 1
�
b� +
1
� + 1
�
bbrj�k�
The computation of the sellers’ best response function and belief updates is analogous.
Finally, if the difference between the expected utility of a buyer(seller) in the current
round and in the next round is not greater than � for each possible pricing action tuple, and
the difference between the expected utility of a marketplace in the current round and in the
next round is not greater than �, the co-learning process converges. If not, we have to repeat
this process.
3 Analysis of Competing Pricing Strategy
Based on the above co-learning algorithm, we now investigate the pricing strategies and
trading strategies in equilibrium. The following analysis empirically shows that, in our setting,
the FP algorithm consistently converges, which allows the derivation of the Bayes-Nash equi-
librium For illustrative purposes, in the following analysis, we show our results in a
2Note that, in general, the FP algorithm is not guaranteed to converge [15].
- 11 -
˖ڍመڙጲ
0
1
0 1
Bid
s/A
sks
Types
Bids in Marketplace 1
Asks in Marketplace 1
Bids in Marketplace 2
Asks in Marketplace 2
图 2: The traders’ Bayes-Nash equilibrium strategies when both marketplaces adopt equilibrium
k pricing policy.
specific setting with 5 buyers and 5 sellers, and 11 allowable bids(asks) plus . Furthermore, we
assume that both buyer and seller types are independently drawn from a uniform distribution.
In addition, we set � = 0:00001. The pricing parameter k is discretised from 0 to 1 with step
size .
Homogeneous Pricing Policy
We start by analysing the cases where both marketplaces adopt the same type of pricing
policy. We first analyse the case where both marketplaces use equilibrium k pricing policy.
Dependent on the initial FP beliefs of traders’ actions and marketplaces’ pricing actions, mar-
ketplaces may converge two pure-strategy Bayes-Nash equilibria where marketplace 1 sets k = 0
and marketplace 2 sets k = 1, or marketplace 1 sets k = 1 and marketplace 2 sets k = 0. Specif-
ically, when marketplace 1 chooses k = 0 and marketplace 2 chooses k = 1, marketplace 1 sets
the bid quote and marketplace 2 sets the ask quote as the equilibrium prices. This means
that marketplace 1 has an extreme bias to buyers when setting the uniform transaction price,
and marketplace 2 has an extreme bias to sellers. Furthermore, traders’ Bayes-Nash equilibria
are shown in Figure 2. We find that, in equilibrium, traders distribute in both marketplaces
(. both marketplaces can co-exist). This is different from the work in [11], where traders
converge to one marketplace. The co-existence is caused by the differentiation of competing
marketplaces adopting different pricing parameters. Specifically, because of the biased pricing
parameter k, marketplace 1 attracts buyers with high types, and marketplace 2 attracts sellers
- 12 -
˖ڍመڙጲ
0
1
0 1
Bid
s/A
sks
Types
Bids in Marketplace 1
Asks in Marketplace 1
Bids in Marketplace 2
Asks in Marketplace 2
图 3: The traders’ strategies in Bayes-Nash equilibrium when both marketplaces adopt dis-
criminatory k pricing policy.
with low types. However, marketplace 1 fails to attract buyers with low types because the
internal competition among buyers drive these buyers to leave (it is difficult for them to trans-
act when competing against buyers with high types), and marketplace 2 fails to attract sellers
with high types as well. Furthermore, we find that traders’ offers are close to their types. This
is because when using the uniform transaction price, shading offers does not guarantee more
profits for traders, but may reduce the probability of transacting. Moreover, the co-existence
shows that due to severe competition between marketplaces, it is difficult for one of the market-
places to attract all traders. Therefore they have to attract one side of traders by setting biased
transaction prices. This is a bit similar to the work in [16] where the competing marketplace
attracts one side of traders by charging a negative registration fee to subsidise them.
The second case is that both marketplaces adopt the discriminatory k pricing policy.
Dependent on the initial FP beliefs, in Bayes-Nash equilibrium, either marketplace 1 sets k = 0
and marketplace 2 sets k = 1, or the reverse. When marketplace 1 sets k = 0 and marketplace 2
sets k = 1, marketplace 1 has a bias towards buyers and marketplace 2 has a bias towards sellers
when setting transaction prices. Traders’ Bayes-Nash equilibrium market selection and bidding
strategies are shown in Figure 3. We still find that traders spread out over both marketplaces,
depending on their type. Buyers with high types prefer marketplace 1, and sellers with low types
prefer marketplace 2. Furthermore, we find buyers in marketplace 2 and sellers in marketplace
1 shade their offers. This is because in the discriminatory k pricing policy, by unilaterally
shading their offers, the transaction price will be moved towards the other side, and the traders
- 13 -
˖ڍመڙጲ
0
1
0 1
Bid
s/A
sks
Types
Bids in Marketplace 1
Asks in Marketplace 1
图 4: The traders’ strategies in Bayes-Nash equilibrium when marketplaces 1 adopts discrimi-
natory k pricing policy and marketplace 2 adopts equilibrium k pricing policy.
can obtain a higher profit. Furthermore, in contrast to the work in [6] where k is set to
to maximise the marketplace’s allocative efficiency and traders’ efficiency, in the competing
environment, marketplaces have to attract one side of traders by setting k in an extreme way
(. k = 0 or 1).
Heterogeneous Pricing Policy
We now consider the setting where both marketplaces use different pricing policies. First
we consider a setting where we fix the types of policies, . one market (marketplace 1) uses
a discriminatory k pricing policy, and the other market (marketplace 2) uses an equilibrium k
pricing policy. Depending on the initial FP beliefs, all traders converge to marketplace 1 or 2
in equilibrium, . the marketplaces no longer co-exist. In particular, depending on the initial
FP beliefs, in most cases traders will converge to marketplace with the discriminatory k pricing
policy. The reason for the superior result of the discriminatory k pricing policy is as follows.
When using a discriminatory k pricing policy, the marketplace sets the transaction price in
a range between the matched bid and ask. This range is usually larger than the equilibrium
price range. This means that marketplace 1 can provide a larger bias towards one side of
traders than marketplace 2 using equilibrium k pricing policy. Then, by attracting one side of
traders, marketplace 1 eventually attracts all traders. Dependent on the initial FP beliefs, in
equilibrium, marketplace 1 sets either k = 1 or k = 0. When k = 1, the traders’ Bayes-Nash
equilibrium bidding strategies are shown in Figure 4. We can see that buyers have to shade
- 14 -
˖ڍመڙጲ
0
1
0 50 100 150 200 250 300 350 400 450 500 550
k=
k=
Pro
bab
ilit
y o
f P
rici
ng
Str
ate
gie
s
Iteration Rounds
(a) Marketplace 1 using discriminatory k pricing policy
0
1
0 50 100 150 200 250 300 350 400 450 500 550
k=
k=
k=
Pro
bab
ilit
y o
f P
rici
ng
Str
ate
gie
s
Iteration Rounds
(b) Marketplace 2 using equilibrium k pricing policy
图 5: Dynamic changes of marketplaces’ FP beliefs when marketplace 1 uses the discriminatory
k pricing policy, marketplace 2 uses the equilibrium k pricing policy, and initial FP beliefs are
set uniformly.
their offers in order to retain profits since, when k = 1, marketplace 1 uses the buyer’s bid as
the transaction price. In addition, we should note that, although the sellers with types greater
than do place an ask in the marketplace, they do not have any opportunity to trade since
their asks are always greater than the highest bid (. ).
To better understand the behaviour of the marketplaces, we analyse the dynamics and how
marketplaces adapt their pricing strategies to beat each other. To illustrate, we consider the
setting where the initial FP beliefs are set uniformly (. initially traders and marketplaces
have no bias to any specific action, and each action is chosen by agents with equal probability).
The results are shown in Figure 5. From Figure 5(a), we can see that the probability of setting
k = 1 or k = 0 is increased. This means that in the first several rounds marketplace 1 with
the discriminatory k pricing policy may choose k = 1 or k = 0 as the best response pricing
action to attract one side of traders, and then prefers to attract sellers by setting k = 1. In
contrast, in the first several rounds of Figure 5(b), we can see that because, marketplace 1 has
attracted most sellers, marketplace 2 prefers to attract buyers by setting k = 0 or k = 0:1
- 15 -
˖ڍመڙጲ
0
0 50 100 150 200 250 300 350 400
discriminatory k pricing,k=
discriminatory k pricing,k=
Pr
ob
ab
ili
ty
o
f P
ric
in
g
St
ra
te
gi
es
Iteration Rounds
图 6: Dynamic changes of marketplaces’ FP beliefs when they can adopt both types of pricing
policies and initial FP beliefs are set uniformly.
(we can see that the probability of k = 0 and k = 0:1 is increased). However, because of the
relatively weaker ability of setting biased price to one side of traders, marketplace 2 fails to
attract buyers. Eventually, marketplace 1 beats marketplace 2 and attract all traders.
In the above analysis, we assumed that marketplaces can only adopt one type of pricing
policy during the co-learning process. Now we extend this setting whereby both marketplaces
can choose either discriminatory k pricing policy or equilibrium k pricing policy. We find that,
in equilibrium, all marketplaces eventually adopt the discriminatory k pricing policy. The
results are therefore the same as the above case where initially both marketplaces can only
adopt discriminatory k pricing policy (see Figure 3). Again we can see that the discriminatory
k pricing policy outperforms equilibrium k pricing policy. We also analyse the dynamic changes
of marketplaces’ FP beliefs when initial FP beliefs are set equally. The result is shown in Figure
6. Note that, since both marketplaces are symmetric, their dynamic changes of FP beliefs are
the same. We can see that, in the first several rounds, the probability of choosing discriminatory
k pricing policy with k = 0 and k = 1 increases. This means that both marketplaces prefer
discriminatory k pricing policy, and eventually they set k = 0 and k = 1 with equal probability
(. ).
- 16 -
˖ڍመڙጲ
4 Conclusion
In this paper, we use a co-learning algorithm based on fictitious play to analyse the Bayes-
Nash equilibrium pricing strategies of marketplaces and market selection and bidding strategies
of traders in the context of two competing double auction marketplaces. We find that, when
the two marketplaces adopt the same type of pricing policy, they can co-exist in Bayes-Nash
equilibrium. Furthermore, the competing marketplaces attract traders by setting extremely
biased transaction prices towards either buyers or sellers (. the pricing parameter k is set
to 0 or 1). Furthermore, when both marketplaces adopt different types of pricing policies,
although the equilibrium k pricing policy sets a uniform price and seems to be fair for traders,
all traders are more likely to converge to the marketplace using the discriminatory k pricing
policy, which means that the discriminatory k pricing policy has better performance than the
equilibrium k pricing policy in the competing environment. These results can provide useful
insights for guiding the design of the competing pricing strategies.
In this paper, in order to obtain tractable results, our work makes a number of simplifying
assumptions that need to be addressed in the future. The first limitation is that, in order to
obtain an equilibrium solution in a reasonable time, we consider a relatively small number of
traders and marketplaces. A possible way to address this limitation is to consider a higher
level of abstraction in which individual agents represent many agents of the same type. The
second limitation is that our analysis is focused on the clearing house mechanism. In reality,
some marketplaces will adopt the continuous double auction mechanism, which means that the
marketplace will try and match traders as soon as new offers arrive. In the future, we would
like to extend our work to analyse how competing marketplaces set effective clearing policies
and how different clearing policies can affect traders’ market selection and bidding strategies.
参考文献(References)
[1] Friedman D, Rust J. The double auction market: institutions, theories and evidence,
volume XIV of Santa Fe Institute Studies in the Science of Complexity [M]. Berkeley,
USA: Perseus Publishing, 1993.
[2] Wurman PR, Wellman MP, Walsh WE. Specifying rules for electronic auctions [J]. AI
Magazine, 2002, 23(3): 15–23.
[3] Wurman PR, Walsh WE, Wellman MP. Flexible double auction for electronic commerce:
Theory and implementation [J]. Decision Support System, 1998, 24: 17–27.
[4] Satterthwaite MA, Williams SR. Bilateral trade with the sealed bid k-double auction:
Existence and efficiency [J]. Journal of Economic Theory, 1989, 48(1): 107–133.
- 17 -
˖ڍመڙጲ
[5] Nicolaisen J, Petrov V, Tesfatsion L. Market power and efficiency in a computational
electricity market with discriminatory double auction pricing [J]. IEEE Transactions on
Evolutionary Computation, 2001, 5(5): 504–523.
[6] Phelps S, McBurney P, Parsons S, et al. Applying genetic programming to economic mech-
anism design: Evolving a pricing rule for a continuous double auction [A]. Proceedings of
the 2nd International Conference on Autonomous Agents and Multi-Agent Systems [C].
Melbourne, Australia: IFAAMAS, 2003, 1096�1097.
[7] Chakraborty M, Das S, Peabody J. Price evolution in a continuous double auction pre-
diction market with a scoring-rule based market make [A]. Proceedings of the 29th AAAI
Conference on Artificial Intelligence [C]. Austin Texas, USA: AAAI, 2015, 835�841.
[8] Ji M, Li H. Exploring price fluctuations in a double auction market [J]. Computational
Economics, 2015, 1-21.
[9] Li W, Wang S, Cheng X. Truthful multi-attribute auction with discriminatory pricing in
cognitive radio networks [J]. ACM SIGMOBILE Mobile Computing and Communications
Review, 2014, 18(1): 3–13.
[10] Niu J, Cai K, Parsons S, et al. What the 2007 tac market design game tells us about
effective auction mechanisms [J]. Autonomous Agents and Multiagent Systems, 2010, 21:
172–203.
[11] Shi B, Gerding EH, Vytelingum P, et al. An equilibrium analysis of competing double auc-
tion marketplaces using fictitious play [A]. Proceedings of the 19th European Conference
on Artificial Intelligence [C]. Lisbon, Portugal: ECAI, 2010, 575�580.
[12] Brown GW, Iterative solution of games by fictitious play [J]. Activity Analysis of Produc-
tion and Allocation, 1951, 13(1): 374–376.
[13] Neumann , Brown GW. Solutions of games by differential equations [J]. Contributions
to the Theory of Games, 1950, 24: 73–79.
[14] Rabinovich Z, Naroditskiy V, Gerding EH, et al. Computing pure bayesian-nash equilibria
in games with finite actions and continuous types [J]. Artificial Intelligence, 2013, 195:
106–139.
[15] Vijay K, Sjöström T. On the convergence of fictitious play [J]. Mathematics of Operations
Research, 1998, 23(2): 479–511.
[16] Caillaud B, Jullien B. Chicken & egg: Competition among intermediation service providers
[J], The RAND Journal of Economics, 2003, 34(2): 309–328.
- 18 -