Adaptive filtering in Stock Market prediction: a different approach (2024)

You have probably seen a lot of content on the internet about stock market prediction, but most of it is either a lot of talk with little result, or something that looks like magic or feels inaccessible.

Predicting stock market prices is part of a major area of data science called time series analysis, and here we will see how to tackle this problem with a non-conventional approach that is beautiful, simple, and gives a nice result: adaptive filtering.

Time Series

A time series is a series of data points indexed (or listed or graphed) in time order.

Being very important to fields like econometrics, statistics, and meteorology, the study of time series is one of the challenging problems that motivates a lot of work in signal processing, machine learning, and data analysis in general, since it can be used for clustering, classification, and, in our case, prediction.


For a more practical situation, we will use ABEV3 (ON) stock prices over 178 days of 2019. If you think this kind of data is hard to access, you are wrong: I got it from the B3 website, the official Brazilian stock exchange, and it's free :)


Nice. Once we have the data in hand, let's talk about the algorithm.

LMS filter

The LMS (least mean squares) filter is a kind of adaptive filter used for solving linear problems. The idea is to mimic a system (by finding the filter coefficients) while minimizing the mean square of the error signal.
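As a minimal illustration of that idea (a sketch of plain LMS system identification, not the code we will use for prediction, and the 2-tap system `h` is made up for the example): feed the same input to an "unknown" system and to the adaptive filter, and let the LMS update pull the coefficients toward the unknown ones.

```python
import numpy as np

rng = np.random.default_rng(0)
h = np.array([0.5, -0.3])           # "unknown" system we want to mimic
x = rng.standard_normal(1000)       # input signal
d = np.convolve(x, h)[:len(x)]      # desired output: the unknown system's response

w = np.zeros(2)                     # filter coefficients to adapt
mu = 0.01                           # step size (learning rate)
for n in range(1, len(x)):
    xn = np.array([x[n], x[n - 1]])  # current input window
    e = d[n] - w @ xn                # error against the unknown system's output
    w = w + 2 * mu * e * xn          # LMS coefficient update

# After adaptation, w is close to h
```

After enough iterations the coefficients `w` converge to `h`, which is exactly the "mimic a system" behavior we will exploit below.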


In general, we don't know in advance whether a problem can be solved well with a linear approach, so we usually test both a linear and a non-linear algorithm. Since the internet always shows non-linear approaches, we will use the LMS filter to show that stock market prediction can be done with linear algorithms with good precision.

But this filter mimics a system: once we apply it to our data, the filter coefficients are trained, and when we feed in a new input vector, the filter outputs (in the best case) the same response the original system would. So we only need one tricky modification to use this filter to predict data.

The system

First, we will delay our input vector by l positions, where l is the number of days we want to predict; these l new positions will be filled with zeros.
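For example, with a hypothetical 4-sample series and l = 2, the delayed, zero-padded column vector that the filter will consume looks like this:

```python
import numpy as np

x = np.array([10, 20, 30, 40])   # toy series, just for illustration
l = 2                            # number of positions to delay

# Prepend l zeros, then turn the row into a column vector
xd = np.block([np.zeros((1, l)), x]).T
print(xd.ravel())   # [ 0.  0. 10. 20. 30. 40.]
```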


When we apply the LMS filter, we train it on the first 173 samples (all but the last l). After that, we set the error to zero, so the system keeps producing outputs, as the original system would, for the last l values. We will call this tricky modification the LMSPred algorithm.


Finally, let’s code!

First we have to import the libraries we will use in this code:

import numpy as np
import matplotlib.pyplot as plt

The next step is the implementation of LMSPred. You can try it yourself first; here is my implementation:

def lmsPred(x, l, u, N):
    # Delay the input by l positions, padding with zeros
    xd = np.concatenate([np.zeros(l), x])
    M = len(xd)
    y = np.zeros(M)                      # filter output
    xn = np.zeros(N + 1)                 # current input window
    wn = np.random.rand(N + 1) / 10      # small random initial coefficients
    for n in range(M):
        # Shift the newest delayed sample into the window
        xn = np.concatenate([[xd[n]], xn[:N]])
        y[n] = wn @ xn
        if n > M - l - 1:
            e = 0.0                      # prediction phase: freeze the coefficients
        else:
            e = x[n] - y[n]              # training phase: error vs. the real series
        wn = wn + 2 * u * e * xn         # LMS coefficient update
    return y, wn

Now, we will define our vector x, which contains the 178 values of ABEV3 (ON):

x = np.array([1655, 1648, 1615, 1638, 1685, 1729, 1754, 1770, 1780, 1785, 1800, 1800, 1754, 1718, 1716, 1795, 1787, 1797, 1751, 1811, 1845, 1864, 1809, 1875, 1822, 1871, 1867, 1839, 1859, 1849, 1819, 1832, 1815, 1832, 1832, 1839, 1849, 1836, 1723, 1683, 1637, 1669, 1659, 1711, 1700, 1690, 1666, 1676, 1731, 1719, 1700, 1698, 1672, 1652, 1699, 1654, 1675, 1683, 1682, 1677, 1684, 1732, 1744, 1735, 1769, 1755, 1725, 1706, 1742, 1753, 1705, 1708, 1750, 1767, 1772, 1831, 1829, 1835, 1847, 1795, 1792, 1806, 1765, 1792, 1749, 1730, 1701, 1694, 1661, 1664, 1649, 1649, 1709, 1721, 1721, 1706, 1722, 1731, 1726, 1743, 1755, 1742, 1735, 1741, 1764, 1761, 1765, 1772, 1768, 1785, 1764, 1780, 1805, 1820, 1845, 1830, 1817, 1810, 1805, 1789, 1781, 1813, 1887, 1900, 1900, 1894, 1902, 1869, 1820, 1825, 1810, 1799, 1825, 1809, 1799, 1803, 1796, 1949, 1980, 2050, 2034, 2013, 2042, 2049, 2016, 2048, 2063, 2017, 2007, 1948, 1938, 1901, 1878, 1890, 1911, 1894, 1880, 1847, 1833, 1809, 1817, 1815, 1855, 1872, 1838, 1852, 1880, 1869, 1872, 1887, 1882, 1891, 1937, 1910, 1915, 1943, 1926, 1935]);

To train the system, we take the first 173 values, with a learning rate u = 2^(-30), filter order N = 60, and l = 5 days of prediction.

x_train = x[0:173]
u = 2**(-30)
l = 5
N = 60
y, wn = lmsPred(x_train, l, u, N)

To visualize the input data and the filter output, we plot both series:

plt.plot(x, color = 'black')
plt.plot(y, color = 'red')
plt.show()

And to evaluate the percentage error of our prediction:

pred = y[-l:]
realvalues = x[-l:]
error = 100 * (pred.T - realvalues) / realvalues
print(abs(error))
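As a sanity check of that error formula (with made-up toy numbers, not the ABEV3 data): predicting 102 when the real value is 100 should give a 2% error.

```python
import numpy as np

pred = np.array([100.0, 102.0])   # hypothetical predictions
real = np.array([101.0, 100.0])   # hypothetical real values

# Percentage error per day, element-wise
error = 100 * np.abs(pred - real) / real
print(error)   # roughly [0.99, 2.0] percent
```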

So, the full code:

import numpy as np
import matplotlib.pyplot as plt

def lmsPred(x, l, u, N):
    # Delay the input by l positions, padding with zeros
    xd = np.concatenate([np.zeros(l), x])
    M = len(xd)
    y = np.zeros(M)                      # filter output
    xn = np.zeros(N + 1)                 # current input window
    wn = np.random.rand(N + 1) / 10      # small random initial coefficients
    for n in range(M):
        # Shift the newest delayed sample into the window
        xn = np.concatenate([[xd[n]], xn[:N]])
        y[n] = wn @ xn
        if n > M - l - 1:
            e = 0.0                      # prediction phase: freeze the coefficients
        else:
            e = x[n] - y[n]              # training phase: error vs. the real series
        wn = wn + 2 * u * e * xn         # LMS coefficient update
    return y, wn

x = np.array([1655, 1648, 1615, 1638, 1685, 1729, 1754, 1770, 1780, 1785, 1800, 1800, 1754, 1718, 1716, 1795, 1787, 1797, 1751, 1811, 1845, 1864, 1809, 1875, 1822, 1871, 1867, 1839, 1859, 1849, 1819, 1832, 1815, 1832, 1832, 1839, 1849, 1836, 1723, 1683, 1637, 1669, 1659, 1711, 1700, 1690, 1666, 1676, 1731, 1719, 1700, 1698, 1672, 1652, 1699, 1654, 1675, 1683, 1682, 1677, 1684, 1732, 1744, 1735, 1769, 1755, 1725, 1706, 1742, 1753, 1705, 1708, 1750, 1767, 1772, 1831, 1829, 1835, 1847, 1795, 1792, 1806, 1765, 1792, 1749, 1730, 1701, 1694, 1661, 1664, 1649, 1649, 1709, 1721, 1721, 1706, 1722, 1731, 1726, 1743, 1755, 1742, 1735, 1741, 1764, 1761, 1765, 1772, 1768, 1785, 1764, 1780, 1805, 1820, 1845, 1830, 1817, 1810, 1805, 1789, 1781, 1813, 1887, 1900, 1900, 1894, 1902, 1869, 1820, 1825, 1810, 1799, 1825, 1809, 1799, 1803, 1796, 1949, 1980, 2050, 2034, 2013, 2042, 2049, 2016, 2048, 2063, 2017, 2007, 1948, 1938, 1901, 1878, 1890, 1911, 1894, 1880, 1847, 1833, 1809, 1817, 1815, 1855, 1872, 1838, 1852, 1880, 1869, 1872, 1887, 1882, 1891, 1937, 1910, 1915, 1943, 1926, 1935])

x_train = x[0:173]
u = 2**(-30)
l = 5
N = 60
y, wn = lmsPred(x_train, l, u, N)

plt.plot(x, color='black')
plt.plot(y, color='red')
plt.show()

pred = y[-l:]
realvalues = x[-l:]
error = 100 * (pred.T - realvalues) / realvalues
print(abs(error))

Results

[Figure: real data (black) vs. LMSPred prediction (red)]

We plotted the real data in black and our prediction in red. As you can see, they differ a lot at first, but after roughly the number of samples corresponding to the filter order (60, in this case) the two curves become very close.

And, for this case, the percentage error per predicted day is:

[[0.79837693 1.12168626 1.24557245 2.24050302 3.16604697]]

So the 5th day has about 3.17% error, which is a pretty nice value considering we are using a very simple method.

It is important to highlight that stock market prediction is not very good for large values of l, since we want to analyse the market in a steady-state regime, that is, without accounting for possible future crises, political problems, and so on. For that reason, it is safer to use stock market prediction with small values of l.
