
Linear regression - Advanced model

Scope

  • A finite number N of data points is available: find the best fit of a given parametric function passing through these N points.
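Concretely, the "best fit" below is taken in the least-squares sense. With basis functions $\phi_j$ and coefficients $\beta_j$, we minimize

$$\min_{\beta}\ \sum_{i=1}^{N}\left(y_i-\sum_{j=1}^{N_f}\beta_j\,\phi_j(x_i)\right)^2,$$

whose solution is $\beta=(X^{T}X)^{-1}X^{T}Y$ with $X_{ij}=\phi_j(x_i)$, which is exactly the expression evaluated in the regression cells below.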

Source
# setup
%load_ext autoreload
%matplotlib ipympl
%autoreload 2
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl

The data set: a synthetic data set with some noise

Nb_data_point = 200  # number of points
xmin, xmax = -0.5, 5.5  # x range

x = np.linspace(xmin, xmax, Nb_data_point)
x = x + 0.2 * np.random.rand(Nb_data_point)  # add noise

y = x**4 - 12 * x**3 + 47 * x**2 - 60 * x
y = y + 1 * np.random.randn(Nb_data_point)  # add noise
fig = plt.figure()
plt.plot(x, y, ".", label="Data")
plt.legend()
plt.grid()

We would like to fit this data with a piecewise-linear function

To do so we first need a basis; let's use the piecewise-linear "hat" functions used in the finite element (FE) method

def hat(x, xc, support=1):
    """
    Compute the hat function, which is a piecewise linear function that peaks at `xc` and
    decreases linearly to zero within the given `support`.

    Parameters:
    x (array-like): Input array of values.
    xc (float): Center of the peak.
    support (float, optional): Width of the support region. Default is 1.

    Returns:
    numpy.ndarray: Array of the same shape as `x`, containing the computed hat function values.
    """

    y = np.maximum(1 - np.abs((x - xc) / support), 0.0)
    return y
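As a quick sanity check (a small sketch, not part of the original notebook), the hat function should equal 1 at its center, decrease linearly, and vanish outside its support:

```python
import numpy as np

def hat(x, xc, support=1):
    # piecewise-linear "hat": 1 at xc, falling to 0 at xc ± support
    return np.maximum(1 - np.abs((x - xc) / support), 0.0)

print(hat(2.0, 2.0))  # peak value at the center: 1.0
print(hat(2.5, 2.0))  # halfway down the slope: 0.5
print(hat(4.0, 2.0))  # outside the support: 0.0
```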

Basis function settings

Nf = 9  # number of basis functions

xc_list = np.linspace(xmin, xmax, Nf)
xc_list.shape
support = (xmax - xmin) / (Nf - 1)

Draw the hat functions

N = 1000
x_hat = np.linspace(xmin, xmax, N)
fig = plt.figure()
for i in range(0, len(xc_list)):
    plt.plot(x_hat, hat(x_hat, xc_list[i], support))
plt.grid()

Fit the data using this basis of functions

Determination of the coefficients by regression

# Construction of the X matrix
X = np.zeros((Nf, len(x)))
for i in range(0, len(xc_list)):
    X[i, :] = hat(x, xc_list[i], support)
X = X.T

# Construction of the Y matrix
Y = y.T

# Computation of the least-squares estimator (normal equations)
beta = np.linalg.inv(X.T @ X) @ (X.T @ Y)

print("The fitted coeff:")
print(beta)
The fitted coeff:
[ 37.02385787 -16.274607   -25.90194326 -16.51738924  -3.97754072
   1.48193844   0.49586078  -3.29562981   9.43414501]
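Explicitly inverting `X.T @ X` works here, but NumPy also provides `np.linalg.lstsq`, which solves the same least-squares problem via an SVD and is numerically safer when the normal matrix is poorly conditioned. A minimal sketch on a tiny hypothetical data set (not the notebook's data):

```python
import numpy as np

# tiny example: fit y = 2*x + 1 with the design matrix [x, 1]
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2 * x + 1
X = np.column_stack([x, np.ones_like(x)])

# normal equations, as in the cell above
beta_normal = np.linalg.inv(X.T @ X) @ (X.T @ y)

# SVD-based solver: same solution, better numerical behavior
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_normal)  # both should recover slope 2 and intercept 1
print(beta_lstsq)
```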

Draw the results

N = 100
xi = np.linspace(xmin, xmax, N)

yi = np.zeros(xi.shape)
for i in range(0, len(xc_list)):
    yi = yi + beta[i] * hat(xi, xc_list[i], support)
fig = plt.figure()
plt.plot(x, y, "o", label="Data")
plt.plot(xi, yi, "r", label="Regression")
plt.legend()
plt.grid()
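To judge the fit beyond the eye test, one simple measure (a sketch on a small hypothetical data set, not computed in the original notebook) is the root-mean-square residual between the data and the regression curve:

```python
import numpy as np

# hypothetical tiny data set and straight-line design matrix
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])
X = np.column_stack([x, np.ones_like(x)])
beta = np.linalg.lstsq(X, y, rcond=None)[0]

# RMS residual: average distance from the data to the fitted curve
residual = y - X @ beta
rms = np.sqrt(np.mean(residual**2))
print(rms)
```

The same two lines applied to the notebook's `X`, `Y`, and `beta` give the RMS misfit of the hat-function regression.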

Work with other bases

The radial basis function (RBF) is a good candidate to fit the data. It produces a smooth curve, but because an RBF does not have finite support, the matrix to invert can be ill-conditioned.
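This can be checked directly by comparing the condition number of the normal matrix $X^T X$ for the two bases. The sketch below reuses the notebook's grid and centers but skips the noise for simplicity:

```python
import numpy as np

# same settings as the notebook, without the random noise
xmin, xmax, Nf = -0.5, 5.5, 9
x = np.linspace(xmin, xmax, 200)
xc_list = np.linspace(xmin, xmax, Nf)
support = (xmax - xmin) / (Nf - 1)

def hat(x, xc, support):
    # compactly supported piecewise-linear basis
    return np.maximum(1 - np.abs((x - xc) / support), 0.0)

def rbf(x, xc, lc):
    # Gaussian RBF: smooth, but nonzero everywhere
    return np.exp(-(((x - xc) / lc) ** 2))

X_hat = np.stack([hat(x, xc, support) for xc in xc_list], axis=1)
X_rbf = np.stack([rbf(x, xc, support) for xc in xc_list], axis=1)

# the overlapping RBFs yield a worse-conditioned normal matrix
print(np.linalg.cond(X_hat.T @ X_hat))
print(np.linalg.cond(X_rbf.T @ X_rbf))
```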

The Gaussian RBF is defined as follows:

$$\phi(x) = \exp\left(-\left(\frac{x - x_c}{l_c}\right)^2\right)$$
def hat(x, xc, support=1):
    # Gaussian RBF (reuses the `hat` name so the fitting code below runs unchanged)
    y = np.exp(-(((x - xc) / support) ** 2))
    return y
fig = plt.figure()
for i in range(0, len(xc_list)):
    plt.plot(x_hat, hat(x_hat, xc_list[i], support))
plt.grid()

Fit the data using this basis of functions

# Construction of the X matrix
X = np.zeros((Nf, len(x)))
for i in range(0, len(xc_list)):
    X[i, :] = hat(x, xc_list[i], support)
X = X.T

# Construction of the Y matrix
Y = y.T

# Computation of the least-squares estimator (normal equations)
beta = np.linalg.inv(X.T @ X) @ (X.T @ Y)

print("The fitted coeff:")
print(beta)
The fitted coeff:
[ 40.3808368  -24.28863137 -10.00502152 -13.56302099   1.67378011
  -1.39221328   3.96758795  -7.79903348  11.16852575]

Draw the results

N = 100
xi = np.linspace(xmin, xmax, N)

yi = np.zeros(xi.shape)
for i in range(0, len(xc_list)):
    yi = yi + beta[i] * hat(xi, xc_list[i], support)
fig = plt.figure()
plt.plot(x, y, "o", label="Data")
plt.plot(xi, yi, "r", label="Regression")
plt.legend()
plt.grid()