How to calculate the intercept using numpy.linalg.lstsq

After running a multiple linear regression with numpy.linalg.lstsq, I get four return values, as described in the documentation, but it is not clear to me how to get the intercept value. Does anyone know how? I'm new to statistical analysis.
Here is my model:
import numpy as np

# a..g are the raw data sequences (not shown here)
X1 = np.array(a)
X2 = np.array(b)
X3 = np.array(c)
X4 = np.array(d)
X5 = np.array(e)
X6 = np.array(f)
X1l = np.log(X1)
X2l = np.log(X2)
X3l = np.log(X3)
X6l = np.log(X6)
Y = np.array(g)

A = np.column_stack([X1l, X2l, X3l, X4, X5, X6l, np.ones(len(a), float)])
result = np.linalg.lstsq(A, Y)

This is a sample of what my model is generating:
(array([  654.12744154,  -623.28893569,   276.50269246,    11.52493817,
  49.92528734,  -375.43282832,  3852.95023087]), array([  4.80339071e+11]),
  7, array([ 1060.38693842,   494.69470547,   243.14700033,   164.97697748,
  58.58072929,    19.30593045,    13.35948642]))

I believe the intercept is the second array, but I'm not sure, as its value seems far too high.


Answer 1:

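The intercept is the coefficient that corresponds to the column of ones. That column is the last one in A, so the intercept is the last element of the first returned array, which in this case is:

result[0][6]    (3852.95023087 in your sample output)

The second array (4.80339071e+11) is the sum of squared residuals, not the intercept, which is why its value looks so high.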
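As a small illustration of how the returned tuple breaks down, the sketch below copies the sample output from the question into a literal tuple and unpacks it. The variable names (coefficients, residual_ss, rank, singular_values, slopes, intercept) are chosen here for readability and are not part of NumPy's API.

```python
import numpy as np

# The tuple printed in the question, copied verbatim.
result = (
    np.array([654.12744154, -623.28893569, 276.50269246, 11.52493817,
              49.92528734, -375.43282832, 3852.95023087]),
    np.array([4.80339071e+11]),
    7,
    np.array([1060.38693842, 494.69470547, 243.14700033, 164.97697748,
              58.58072929, 19.30593045, 13.35948642]),
)

# lstsq returns: solution vector, sum of squared residuals,
# rank of A, and the singular values of A (names are illustrative).
coefficients, residual_ss, rank, singular_values = result

# The column of ones was stacked last, so the intercept is the last coefficient.
slopes = coefficients[:-1]
intercept = coefficients[-1]

print("slopes:", slopes)
print("intercept:", intercept)
```

Indexing with -1 rather than 6 keeps working if the number of predictors changes, as long as the column of ones stays last.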


To make this clearer, consider your regression, which has a form something like:

y = c1*x1 + c2*x2 + c3*x3 + c4*x4 + m

written in matrix form as:

[[y1],      [[x1_1,  x2_1,  x3_1, x4_1, 1],      [[c1],
 [y2],       [x1_2,  x2_2,  x3_2, x4_2, 1],       [c2],
 [y3],  =    [x1_3,  x2_3,  x3_3, x4_3, 1],  *    [c3],
 ...                      ...                     [c4],
 [yn]]       [x1_n,  x2_n,  x3_n, x4_n, 1]]       [m]]


 Y = A * C

where A is the so-called "coefficient" matrix and C is the vector containing the solution to your regression. Note that m is the entry of C that multiplies the column of ones.
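One way to convince yourself of this is to build Y from known coefficients and check that lstsq recovers them. The following is a minimal sketch on synthetic, noise-free data. The values in true_c and true_m, the sample size, and the random seed are arbitrary assumptions made for the demonstration and have nothing to do with the data in the question.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, arbitrary choice
n = 50                           # number of observations (assumed)

# Four predictors, matching the x1..x4 of the matrix form above.
X = rng.normal(size=(n, 4))

# Hypothetical "true" slopes c1..c4 and intercept m.
true_c = np.array([2.0, -1.0, 0.5, 3.0])
true_m = 7.0

# y = c1*x1 + c2*x2 + c3*x3 + c4*x4 + m, with no noise added.
y = X @ true_c + true_m

# Append the column of ones as the LAST column of A.
A = np.column_stack([X, np.ones(n)])

# rcond=None selects NumPy's machine-precision-based cutoff.
C, residual_ss, rank, sv = np.linalg.lstsq(A, y, rcond=None)

slopes = C[:-1]   # c1..c4
m = C[-1]         # intercept: the coefficient of the column of ones

print("recovered slopes:", slopes)
print("recovered intercept:", m)
```

If the column of ones were placed first in A instead, the intercept would be C[0]. Its position in the solution vector always matches the position of the ones column.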