On Power-Law Distributed Balls in Bins and its Applications to View Size Estimation
ATSONIOS, Ioannis
Algorithmics for computationally intensive applications over wide scale distributed platforms [CEPAGE]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Algorithmics for computationally intensive applications over wide scale distributed platforms [CEPAGE]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
BEAUMONT, Olivier
Algorithmics for computationally intensive applications over wide scale distributed platforms [CEPAGE]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Algorithmics for computationally intensive applications over wide scale distributed platforms [CEPAGE]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
HANUSSE, Nicolas
Algorithmics for computationally intensive applications over wide scale distributed platforms [CEPAGE]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
See more >
Algorithmics for computationally intensive applications over wide scale distributed platforms [CEPAGE]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
ATSONIOS, Ioannis
Algorithmics for computationally intensive applications over wide scale distributed platforms [CEPAGE]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Algorithmics for computationally intensive applications over wide scale distributed platforms [CEPAGE]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
BEAUMONT, Olivier
Algorithmics for computationally intensive applications over wide scale distributed platforms [CEPAGE]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Algorithmics for computationally intensive applications over wide scale distributed platforms [CEPAGE]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
HANUSSE, Nicolas
Algorithmics for computationally intensive applications over wide scale distributed platforms [CEPAGE]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
< Reduce
Algorithmics for computationally intensive applications over wide scale distributed platforms [CEPAGE]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Language
en
Communication dans un congrès
This item was published in
ISAAC, 2011-12, Yokohama. 2011-12
English Abstract
The view size estimation plays an important role in query optimization. It has been observed that many data follow a power law distribution. In this paper, we consider the balls in bins problem where we place balls into ...Read more >
The view size estimation plays an important role in query optimization. It has been observed that many data follow a power law distribution. In this paper, we consider the balls in bins problem where we place balls into $N$ bins when the bin selection probabilities follow a power law distribution. As a generalization to the coupon collector's problem, we address the problem of determining the expected number of balls that need to be thrown in order to have at least one ball in each of the $N$ bins. We prove that $\Theta(\frac{N^\alpha \ln N}{c_N^{\alpha}})$ balls are needed to achieve this where $\alpha$ is the parameter of the power law distribution and $c_N^{\alpha}=\frac{\alpha-1}{\alpha-N^{\alpha-1}}$ for $\alpha \neq 1$ and $c_N^{\alpha}=\frac{1}{\ln N}$ for $\alpha=1$. Next, when fixing the number of balls that are thrown to $T$, we provide closed form upper and lower bounds on the expected number of bins that have at least one occupant. For $n$ large and $\alpha>1$, we prove that our bounds are tight up to a constant factor of $\left(\frac{\alpha}{\alpha-1}\right)^{1-\frac{1}{\alpha}} \leq e^{1/e} \simeq 1.4$.Read less <
English Keywords
Power Law
View Size Estimation
Balls into Bins
Origin
Hal imported