
Linear regression - Advanced model

Scope

  • A finite number N of data points is available: find the best fit of a given parametric function passing through these N points.
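Concretely, the "best fit" below is taken in the least-squares sense. With basis functions $\phi_j$ and coefficients $\beta_j$, we minimize

$$\min_{\beta}\ \sum_{i=1}^{N}\left(y_i-\sum_{j=1}^{N_f}\beta_j\,\phi_j(x_i)\right)^2,$$

whose solution is $\beta=(X^{T}X)^{-1}X^{T}Y$ with $X_{ij}=\phi_j(x_i)$, which is exactly the expression evaluated in the regression cells below.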

Source
# setup
%load_ext autoreload
%matplotlib ipympl
%autoreload 2
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl

The data set: a synthetic data set with some noise

Nb_data_point = 200  # number of points
xmin, xmax = -0.5, 5.5  # x range

x = np.linspace(xmin, xmax, Nb_data_point)
x = x + 0.2 * np.random.rand(Nb_data_point)  # add noise

y = x**4 - 12 * x**3 + 47 * x**2 - 60 * x
y = y + 1 * np.random.randn(Nb_data_point)  # add noise
fig = plt.figure()
plt.plot(x, y, ".", label="Data")
plt.legend()
plt.grid()

We would like to fit this data with a piecewise-linear function

To do so we first need a basis; let's use the piecewise-linear "hat" functions used in the finite element (FE) method

def hat(x, xc, support=1):
    """
    Compute the hat function, which is a piecewise linear function that peaks at `xc` and
    decreases linearly to zero within the given `support`.

    Parameters:
    x (array-like): Input array of values.
    xc (float): Center of the peak.
    support (float, optional): Width of the support region. Default is 1.

    Returns:
    numpy.ndarray: Array of the same shape as `x`, containing the computed hat function values.
    """

    y = np.maximum(1 - np.abs((x - xc) / support), 0.0)
    return y
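As a quick sanity check (a small sketch, not part of the original notebook), the hat function should equal 1 at its center, decrease linearly, and vanish outside its support:

```python
import numpy as np

def hat(x, xc, support=1):
    # piecewise-linear "hat": 1 at xc, falling to 0 at xc ± support
    return np.maximum(1 - np.abs((x - xc) / support), 0.0)

print(hat(2.0, 2.0))  # peak value at the center: 1.0
print(hat(2.5, 2.0))  # halfway down the slope: 0.5
print(hat(4.0, 2.0))  # outside the support: 0.0
```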

Basis function settings

Nf = 9  # number of basis functions

xc_list = np.linspace(xmin, xmax, Nf)
xc_list.shape
support = (xmax - xmin) / (Nf - 1)

Draw the hat functions

N = 1000
x_hat = np.linspace(xmin, xmax, N)
fig = plt.figure()
for i in range(0, len(xc_list)):
    plt.plot(x_hat, hat(x_hat, xc_list[i], support))
plt.grid()

Fit the data using this basis of functions

Determination of the coefficients by regression

# Construction of the X matrix
X = np.zeros((Nf, len(x)))
for i in range(0, len(xc_list)):
    X[i, :] = hat(x, xc_list[i], support)
X = X.T

# Construction of the Y matrix
Y = y.T

# Computation of the least-squares estimator (normal equations)
beta = np.linalg.inv(X.T @ X) @ (X.T @ Y)

print("The fitted coeff:")
print(beta)
The fitted coeff:
[ 37.02385787 -16.274607   -25.90194326 -16.51738924  -3.97754072
   1.48193844   0.49586078  -3.29562981   9.43414501]
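Explicitly inverting `X.T @ X` works here, but NumPy also provides `np.linalg.lstsq`, which solves the same least-squares problem via an SVD and is numerically safer when the normal matrix is poorly conditioned. A minimal sketch on a tiny hypothetical data set (not the notebook's data):

```python
import numpy as np

# tiny example: fit y = 2*x + 1 with the design matrix [x, 1]
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2 * x + 1
X = np.column_stack([x, np.ones_like(x)])

# normal equations, as in the cell above
beta_normal = np.linalg.inv(X.T @ X) @ (X.T @ y)

# SVD-based solver: same solution, better numerical behavior
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_normal)  # both should recover slope 2 and intercept 1
print(beta_lstsq)
```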

Draw the results

N = 100
xi = np.linspace(xmin, xmax, N)

yi = np.zeros(xi.shape)
for i in range(0, len(xc_list)):
    yi = yi + beta[i] * hat(xi, xc_list[i], support)
fig = plt.figure()
plt.plot(x, y, "o", label="Data")
plt.plot(xi, yi, "r", label="Regression")
plt.legend()
plt.grid()
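To judge the fit beyond the eye test, one simple measure (a sketch on a small hypothetical data set, not computed in the original notebook) is the root-mean-square residual between the data and the regression curve:

```python
import numpy as np

# hypothetical tiny data set and straight-line design matrix
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])
X = np.column_stack([x, np.ones_like(x)])
beta = np.linalg.lstsq(X, y, rcond=None)[0]

# RMS residual: average distance from the data to the fitted curve
residual = y - X @ beta
rms = np.sqrt(np.mean(residual**2))
print(rms)
```

The same two lines applied to the notebook's `X`, `Y`, and `beta` give the RMS misfit of the hat-function regression.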

Work with other bases

The radial basis function (RBF) is a good candidate to fit the data. It produces a smooth curve, but because an RBF does not have finite support, the matrix to invert can be ill-conditioned.
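This can be checked directly by comparing the condition number of the normal matrix $X^T X$ for the two bases. The sketch below reuses the notebook's grid and centers but skips the noise for simplicity:

```python
import numpy as np

# same settings as the notebook, without the random noise
xmin, xmax, Nf = -0.5, 5.5, 9
x = np.linspace(xmin, xmax, 200)
xc_list = np.linspace(xmin, xmax, Nf)
support = (xmax - xmin) / (Nf - 1)

def hat(x, xc, support):
    # compactly supported piecewise-linear basis
    return np.maximum(1 - np.abs((x - xc) / support), 0.0)

def rbf(x, xc, lc):
    # Gaussian RBF: smooth, but nonzero everywhere
    return np.exp(-(((x - xc) / lc) ** 2))

X_hat = np.stack([hat(x, xc, support) for xc in xc_list], axis=1)
X_rbf = np.stack([rbf(x, xc, support) for xc in xc_list], axis=1)

# the overlapping RBFs yield a worse-conditioned normal matrix
print(np.linalg.cond(X_hat.T @ X_hat))
print(np.linalg.cond(X_rbf.T @ X_rbf))
```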

The Gaussian RBF is defined as follows:

$$\phi(x) = \exp\left(-\left(\frac{x - x_c}{l_c}\right)^2\right)$$
def hat(x, xc, support=1):
    # Gaussian RBF (reuses the `hat` name so the fitting code below runs unchanged)
    y = np.exp(-(((x - xc) / support) ** 2))
    return y
fig = plt.figure()
for i in range(0, len(xc_list)):
    plt.plot(x_hat, hat(x_hat, xc_list[i], support))
plt.grid()

Fit the data using this basis of functions

# Construction of the X matrix
X = np.zeros((Nf, len(x)))
for i in range(0, len(xc_list)):
    X[i, :] = hat(x, xc_list[i], support)
X = X.T

# Construction of the Y matrix
Y = y.T

# Computation of the least-squares estimator (normal equations)
beta = np.linalg.inv(X.T @ X) @ (X.T @ Y)

print("The fitted coeff:")
print(beta)
The fitted coeff:
[ 40.3808368  -24.28863137 -10.00502152 -13.56302099   1.67378011
  -1.39221328   3.96758795  -7.79903348  11.16852575]

Draw the results

N = 100
xi = np.linspace(xmin, xmax, N)

yi = np.zeros(xi.shape)
for i in range(0, len(xc_list)):
    yi = yi + beta[i] * hat(xi, xc_list[i], support)
fig = plt.figure()
plt.plot(x, y, "o", label="Data")
plt.plot(xi, yi, "r", label="Regression")
plt.legend()
plt.grid()