Linear regression - Advanced model#
Note
This notebook follows the Linear regression - Linear model notebook.
Scope#
A finite number \(N\) of data points are available: find the best fit of a given parametric function going through these \(N\) points.
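Writing the model as a linear combination of \(N_f\) basis functions, \(f(x) = \sum_{j=1}^{N_f} \beta_j \, \phi_j(x)\), the best fit in the least-squares sense is

\[
\hat{\beta} = \underset{\beta}{\operatorname{argmin}} \; \lVert Y - X \beta \rVert^2 = (X^\top X)^{-1} X^\top Y,
\qquad X_{ij} = \phi_j(x_i),
\]

which is exactly what the code cells below compute for two choices of basis.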
# setup
%load_ext autoreload
%matplotlib ipympl
%autoreload 2
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
The data set: a synthetic data set with some noise#
Nb_data_point = 200 # number of points
xmin, xmax = -0.5, 5.5 # x range
x = np.linspace(xmin, xmax, Nb_data_point)
x = x + 0.2 * np.random.rand(Nb_data_point) # add noise
y = x**4 - 12 * x**3 + 47 * x**2 - 60 * x
y = y + 1 * np.random.randn(Nb_data_point) # add noise
fig = plt.figure()
plt.plot(x, y, ".", label="Data")
plt.legend()
plt.grid()
We would like to fit these data with a piecewise linear function#
To do so we first need a basis; let's use the piecewise linear hat functions used in the finite element (FE) method.
def hat(x, xc, support=1):
"""
Compute the hat function, which is a piecewise linear function that peaks at `xc` and
decreases linearly to zero within the given `support`.
Parameters:
x (array-like): Input array of values.
xc (float): Center of the peak.
support (float, optional): Width of the support region. Default is 1.
Returns:
numpy.ndarray: Array of the same shape as `x`, containing the computed hat function values.
"""
y = np.maximum(1 - np.abs((x - xc) / support), 0.0)
return y
Basis function settings#
Nf = 9  # number of basis functions
xc_list = np.linspace(xmin, xmax, Nf)
xc_list.shape
support = (xmax - xmin) / (Nf - 1)
Draw the hat functions#
N = 1000
x_hat = np.linspace(xmin, xmax, N)
fig = plt.figure()
for i in range(0, len(xc_list)):
plt.plot(x_hat, hat(x_hat, xc_list[i], support))
plt.grid()
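Because the centres are spaced exactly one support apart, each point of \([x_{min}, x_{max}]\) is covered by at most two overlapping hats whose values sum to one. A minimal sanity check, reusing the variables defined above:

# On [xmin, xmax] the hat functions form a partition of unity: their sum is 1 everywhere
total = np.zeros_like(x_hat)
for xc in xc_list:
    total = total + hat(x_hat, xc, support)
print(np.allclose(total, 1.0))  # expected: True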
Fit the data using this basis of functions#
Determination of the coefficients by regression#
# Construction of the X matrix: one row per basis function, then transposed to one column per basis function
X = np.zeros((Nf, len(x)))
for i in range(0, len(xc_list)):
    X[i, :] = hat(x, xc_list[i], support)
    # X = np.append(X, [hat(x, xc_list[i], support)], axis=0)
X = X.T
# Construction of the Y vector
Y = y.T
# Computation of the least-squares estimator
beta = np.dot(np.linalg.inv(np.dot(X.T, X)), np.dot(X.T, Y))
print("The fitted coeff:")
print(beta)
The fitted coeff:
[ 36.31190064 -16.17145781 -25.34095591 -16.13646887 -4.61444401
1.66433493 0.44416121 -3.0103039 9.93476961]
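Forming and inverting \(X^\top X\) explicitly works here, but it becomes numerically fragile when the basis functions are numerous or nearly collinear. A minimal sketch of the same fit with np.linalg.lstsq, which solves the least-squares problem without building the inverse, reusing the X and Y defined above:

# Solve min ||X beta - Y||^2 directly; beta_lstsq is an illustrative variable name
beta_lstsq, residuals, rank, sv = np.linalg.lstsq(X, Y, rcond=None)
print(np.allclose(beta, beta_lstsq))  # expected: True (up to round-off)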
Draw the results#
N = 100
xi = np.linspace(xmin, xmax, N)
yi = np.zeros(xi.shape)
for i in range(0, len(xc_list)):
yi = yi + beta[i] * hat(xi, xc_list[i], support)
fig = plt.figure()
plt.plot(x, y, "o", label="Data")
plt.plot(xi, yi, "r", label="Regression")
plt.legend()
plt.grid()
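Beyond the visual check, the fit quality can be quantified by the root-mean-square residual at the data points. A minimal sketch, reusing the design matrix X and the coefficients beta computed above:

# Evaluate the fitted model at the data points and measure the residual
y_fit = X @ beta  # X was built from the data points x, so this gives the model at each x_i
rmse = np.sqrt(np.mean((y - y_fit) ** 2))
print(f"RMSE on the data points: {rmse:.3f}")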
Work with other bases#
The radial basis function (RBF) is a good candidate to fit the data. It produces a smooth curve, but because the RBF does not have compact (finite) support, the matrix to invert will be ill-conditioned (a quick check of the condition number is sketched after the fit below).
The Gaussian RBF is defined as follows:
\[
\phi(x) = \exp\left(-\left(\frac{x - x_c}{l_c}\right)^2\right)
\]
def hat(x, xc, support=1):
    # Gaussian RBF centred at `xc` with characteristic length `support`.
    # The name `hat` is kept so the fitting code below can be reused unchanged.
    y = np.exp(-((np.abs(x - xc) / support) ** 2))
    return y
fig = plt.figure()
for i in range(0, len(xc_list)):
plt.plot(x_hat, hat(x_hat, xc_list[i], support))
plt.grid()
Fit the data using this basis of functions#
# Construction of the X matrix
X = np.zeros((Nf, len(x)))
for i in range(0, len(xc_list)):
    X[i, :] = hat(x, xc_list[i], support)
X = X.T
# Construction of the Y vector
Y = y.T
# Computation of the least-squares estimator
beta = np.dot(np.linalg.inv(np.dot(X.T, X)), np.dot(X.T, Y))
print("The fitted coeff:")
print(beta)
The fitted coeff:
[ 38.26460372 -22.61872396 -10.79303158 -12.36359775 0.35024098
-0.52272126 3.49815618 -7.48561946 11.53980679]
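The ill-conditioning mentioned above can be checked through the condition number of the normal-equation matrix \(X^\top X\). A minimal sketch, reusing the RBF design matrix X built above:

# A large condition number means inverting X.T @ X strongly amplifies round-off errors
print(f"cond(X.T X) = {np.linalg.cond(X.T @ X):.3e}")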
Draw the results#
N = 100
xi = np.linspace(xmin, xmax, N)
yi = np.zeros(xi.shape)
for i in range(0, len(xc_list)):
yi = yi + beta[i] * hat(xi, xc_list[i], support)
fig = plt.figure()
plt.plot(x, y, "o", label="Data")
plt.plot(xi, yi, "r", label="Regression")
plt.legend()
plt.grid()