Sure. Basically, the assumption I make is that some stats are a linear function of size.
For example, let's take the HP stat. The two variables I am dealing with are called X and Y. X is the independent variable (size here) and Y is the dependent variable (HP).
X Y
100.000% 2877
119.580% 3287
124.091% 3383
122.502% 3350
125.109% 3405
125.370% 3410
120.301% 3305
127.900% 3461
125.089% 3404
121.724% 3335
For reasons I will explain later, I took only the SA BM here and left out the QA BM.
With the data above, we can fit a linear regression: Y=a*X+b. This can be done in Excel, and the results we obtain are:
a=2098.8797959922
b=778.559245020504
This means that if you have the size, you can calculate the HP: HP=2098.8797959922*size+778.559245020504, with size written as a fraction (100% = 1). As a sanity check, we can see that with MirageGaogamon BM (size=100%), we recover the actual HP: the formula gives 2877.439041 (rounded down to 2877).
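If you would rather not use Excel, here is a minimal sketch of the same least-squares fit in plain Python. The names (sizes, hps, fit_line) are my own, and I am assuming sizes are entered as fractions (100% = 1.0) and that the table values are used exactly as listed. It should land very close to the a and b above, and it repeats the sanity check at size=100%.

```python
# Size (as a fraction, 100% = 1.0) and HP pairs copied from the table above.
sizes = [1.00000, 1.19580, 1.24091, 1.22502, 1.25109,
         1.25370, 1.20301, 1.27900, 1.25089, 1.21724]
hps = [2877, 3287, 3383, 3350, 3405, 3410, 3305, 3461, 3404, 3335]


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = a * xs + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx              # slope
    b = mean_y - a * mean_x    # intercept
    return a, b


a, b = fit_line(sizes, hps)

# Sanity check: predicted HP at size 100% (x = 1.0).
predicted_hp_at_100 = a * 1.0 + b

print(f"a = {a:.4f}, b = {b:.4f}")
print(f"HP at size 100% = {predicted_hp_at_100:.4f}")
```

To predict HP for another size, plug it in the same way, for example a * 1.25 + b for a size of 125%.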
We can do linear regression with any data, but the result is meaningless if there is no linear relationship between the two variables. That is why we need to look at some quality diagnostics. In this case, we have:
R-squared=99.99% --> the higher the better
p-values for a and b: 1.2789E-18 and 1.68007E-14 --> the lower the better
With the results we have here, we can conclude that we have a perfect linear relationship (or very nearly, depending on how picky you are). If you plot the points on a graph, you would see a straight line.
The reason I left out QA is that they have their own behaviour. Adding them to the sample would have distorted the results and given lower quality diagnostics.
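For anyone who wants to check the R-squared by hand, here is a rough sketch in plain Python. It reuses the same table and the a and b quoted above; the names (sizes, hps, r_squared) are my own. R-squared is 1 minus the ratio of the leftover squared error to the total squared variation of HP around its mean. I have left the p-values out because they need a t-distribution, which is easier to get from a statistics package.

```python
# Size (as a fraction, 100% = 1.0) and HP pairs copied from the table above.
sizes = [1.00000, 1.19580, 1.24091, 1.22502, 1.25109,
         1.25370, 1.20301, 1.27900, 1.25089, 1.21724]
hps = [2877, 3287, 3383, 3350, 3405, 3410, 3305, 3461, 3404, 3335]

# Coefficients quoted in the post.
a = 2098.8797959922
b = 778.559245020504


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination for the line y = slope * x + intercept."""
    mean_y = sum(ys) / len(ys)
    # Squared error left over after the fit.
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    # Total squared variation of y around its mean.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


r2 = r_squared(sizes, hps, a, b)
print(f"R-squared = {r2:.6f}")
```

The closer the printed value is to 1, the closer the points sit to a straight line.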
Other stats are a bit more difficult to deal with because they are more "discrete" while HP is more continuous. But the idea is the same.
More info at the usual place:
http://en.wikipedia.org/wiki/Linear_regression
Hope it helps
