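Table of contents

  • LSTM time series forecasting
    • Stock prediction case study
      • Data features
      • Predicting the closing price (Close) as a single feature
        • 1. Importing the data
        • 2. Visualizing the closing price (Close)
        • 3. Feature engineering
        • 4. Building the dataset
        • 5. Building the model
        • 6. Training the model
        • 7. Visualizing the results
        • 8. Validating the model
      • Complete code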

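LSTM time series forecasting

Stock prediction case study

Data features

  • Date: trading date
  • Open: opening price
  • High: highest price
  • Low: lowest price
  • Close: closing price
  • Adj Close: adjusted closing price
  • Volume: trading volume

Predicting the closing price (Close) as a single feature

The data from the previous n days is used to predict the value on day n+1.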
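As a minimal sketch of this windowing idea, each run of n consecutive values becomes one input and the value that follows it becomes the target. The toy series and the helper name `make_windows` are illustrative assumptions and are not part of the article's code.

```python
import numpy as np

def make_windows(series, n):
    """Pair every run of n consecutive values with the value that follows it."""
    series = np.asarray(series, dtype=float)
    inputs = np.array([series[i:i + n] for i in range(len(series) - n)])
    targets = series[n:]
    return inputs, targets

toy_series = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]  # made-up prices
inputs, targets = make_windows(toy_series, n=3)
print(inputs)   # three windows of length 3
print(targets)  # the value that follows each window
```

With six values and n = 3 there are three windows. The article's own split_data function follows the same pattern but stores the target as the last element of each window.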

1. Importing the data

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import MinMaxScaler

filepath = 'D:/pythonProjects/LSTM/data/rlData.csv'
data = pd.read_csv(filepath)
# Sort by date so that the time series is in ascending order
data = data.sort_values('Date')
# Print the first few rows
print(data.head())
# Print the shape (rows, columns)
print(data.shape)
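Sorting on a Date column that was read as plain strings is only chronological when the dates are written in a year-first format such as 2020-01-02. The date format of the original CSV is not shown, so the sketch below uses made-up rows and an assumed month/day/year format to show how the column could be parsed before sorting.

```python
import io
import pandas as pd

# Made-up rows in month/day/year format, deliberately out of order
csv_text = """Date,Close
12/31/2019,101.0
01/02/2020,102.5
06/15/2019,99.0
"""

data = pd.read_csv(io.StringIO(csv_text))
# Sorting the raw strings would put 01/02/2020 first, which is not chronological
data['Date'] = pd.to_datetime(data['Date'], format='%m/%d/%Y')
data = data.sort_values('Date').reset_index(drop=True)
print(data)
```

If the file already uses year-first dates, this parsing step changes nothing about the order and can be skipped.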

2. Visualizing the closing price (Close)

# Set the figure size
plt.figure(figsize=(15, 9))
plt.plot(data[['Close']])
# Label every 20th row with its date (iloc selects by position)
plt.xticks(range(0, data.shape[0], 20), data['Date'].iloc[::20], rotation=45)
plt.title("****** Stock Price", fontsize=18, fontweight='bold')
plt.xlabel('Date', fontsize=18)
plt.ylabel('Close Price (USD)', fontsize=18)
plt.savefig('StockPrice.jpg')
plt.show()

3. Feature engineering

# Select Close as the only feature; copy() avoids pandas' SettingWithCopyWarning
# when the column is overwritten below
price = data[['Close']].copy()
# Print summary information
print(price.info())

Output:

<class 'pandas.core.frame.DataFrame'>
Int64Index: 252 entries, 0 to 251
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Close   252 non-null    float64
dtypes: float64(1)
memory usage: 3.9 KB
None

This shows that price is a DataFrame, along with its structure and memory usage.

# Normalize the data by scaling it to the range [-1, 1]
scaler = MinMaxScaler(feature_range=(-1, 1))
price['Close'] = scaler.fit_transform(price['Close'].values.reshape(-1, 1))
print(price['Close'])
print(price['Close'].info())
print(price['Close'].shape)

Output:

0      0.345034
1      0.324272
2      0.302654
3      0.320206
4      0.361515
         ...
247    0.310788
248    0.255565
249    0.300514
250    0.311216
251    0.207192
Name: Close, Length: 252, dtype: float64
<class 'pandas.core.series.Series'>
Int64Index: 252 entries, 0 to 251
Series name: Close
Non-Null Count  Dtype
--------------  -----
252 non-null    float64
dtypes: float64(1)
memory usage: 3.9 KB
None
(252,)

The output shows the scaled values of price['Close'], that price['Close'] is a Series, and that its shape is (252,).
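To make the scaling step concrete, here is a small sketch on made-up prices. It compares MinMaxScaler with the same mapping written out by hand and shows that inverse_transform recovers the original values, which is what the later steps rely on to turn predictions back into prices.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up closing prices, shaped (n_samples, 1) as the scaler expects
close = np.array([160.0, 180.0, 200.0]).reshape(-1, 1)

scaler = MinMaxScaler(feature_range=(-1, 1))
scaled = scaler.fit_transform(close)
# The same mapping written out: x_scaled = 2 * (x - min) / (max - min) - 1
by_hand = 2 * (close - close.min()) / (close.max() - close.min()) - 1
restored = scaler.inverse_transform(scaled)

print(scaled.ravel())    # minimum maps to -1, maximum maps to 1
print(restored.ravel())  # back to the original prices
```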

4. Building the dataset

# Each window holds lookback consecutive closing prices: the first
# lookback - 1 values are the input and the last value is the target
# lookback is the length of the observation window
def split_data(stock, lookback):
    # Convert stock to an ndarray
    data_raw = stock.to_numpy()
    data = []

    # lookback (the sequence length) can be tuned freely
    # Slide a window of length lookback over the series
    for index in range(len(data_raw) - lookback):
        data.append(data_raw[index: index + lookback])

    # data.shape is (232, 20, 1) for 252 rows and lookback = 20
    data = np.array(data)

    # Split into training and test sets at a ratio of 8:2
    test_set_size = int(np.round(0.2 * data.shape[0]))
    train_set_size = data.shape[0] - test_set_size

    x_train = data[:train_set_size, :-1, :]
    y_train = data[:train_set_size, -1, :]
    x_test = data[train_set_size:, :-1, :]
    y_test = data[train_set_size:, -1, :]

    return [x_train, y_train, x_test, y_test]


lookback = 20
x_train, y_train, x_test, y_test = split_data(price, lookback)
print('x_train.shape = ', x_train.shape)
print('y_train.shape = ', y_train.shape)
print('x_test.shape = ', x_test.shape)
print('y_test.shape = ', y_test.shape)

Output:

x_train.shape =  (186, 19, 1)
y_train.shape =  (186, 1)
x_test.shape =  (46, 19, 1)
y_test.shape =  (46, 1)
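The printed shapes can be checked with a few lines of arithmetic. The sketch below only uses the numbers from the article (252 rows, lookback = 20, an 8:2 split); the variable names are illustrative.

```python
import numpy as np

n_rows, lookback = 252, 20  # values taken from the article

n_windows = n_rows - lookback                # one window per start index
test_size = int(np.round(0.2 * n_windows))   # 20% of the windows
train_size = n_windows - test_size

# Window i covers rows i .. i + lookback - 1, so its target is row i + lookback - 1
first_test_target = train_size + lookback - 1
last_test_target = (n_windows - 1) + lookback - 1

print(n_windows, train_size, test_size)
print(first_test_target, last_test_target)
```

This gives 232 windows, 186 for training and 46 for testing, each with 19 input steps. Under this indexing the test targets are rows 205 to 250, so the final row (index 251) is never used as a target.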

5. Building the model

import torch
import torch.nn as nn

x_train = torch.from_numpy(x_train).type(torch.Tensor)
x_test = torch.from_numpy(x_test).type(torch.Tensor)
# Ground-truth targets
y_train_lstm = torch.from_numpy(y_train).type(torch.Tensor)
y_test_lstm = torch.from_numpy(y_test).type(torch.Tensor)
# Not used in this article
y_train_gru = torch.from_numpy(y_train).type(torch.Tensor)
y_test_gru = torch.from_numpy(y_test).type(torch.Tensor)

# Input dimension is 1: only the closing price (Close)
input_dim = 1
# Dimension of the hidden state
hidden_dim = 32
# Number of stacked LSTM layers
num_layers = 2
# Output dimension is 1: the next day's closing price
output_dim = 1
num_epochs = 100


class LSTM(nn.Module):
    def __init__(self, input_dim, hidden_dim, num_layers, output_dim):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.num_layers = num_layers

        # batch_first=True: input shape is (batch, seq_len, input_dim)
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # Zero initial hidden and cell states, created fresh for every call
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim)
        out, (hn, cn) = self.lstm(x, (h0, c0))
        # Predict from the output of the last time step
        out = self.fc(out[:, -1, :])
        return out


model = LSTM(input_dim=input_dim, hidden_dim=hidden_dim, output_dim=output_dim, num_layers=num_layers)
criterion = torch.nn.MSELoss()
optimiser = torch.optim.Adam(model.parameters(), lr=0.01)
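To see how tensors flow through this kind of model, the following sketch pushes a random batch through an nn.LSTM and a linear layer configured with the article's sizes. The batch size of 4 and the random input are assumptions made only for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Sizes taken from the article: 19 input steps, 1 feature, 32 hidden units, 2 layers
batch, seq_len, input_dim, hidden_dim, num_layers = 4, 19, 1, 32, 2

lstm = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True)
fc = nn.Linear(hidden_dim, 1)

x = torch.randn(batch, seq_len, input_dim)  # random stand-in for x_train
out, (hn, cn) = lstm(x)                     # initial states default to zeros when omitted
pred = fc(out[:, -1, :])                    # keep only the last time step

print(out.shape)   # (batch, seq_len, hidden_dim)
print(hn.shape)    # (num_layers, batch, hidden_dim)
print(pred.shape)  # (batch, 1)
```

For a unidirectional LSTM, the last time step of out is the same tensor as the final hidden state of the top layer (hn[-1]), which is why slicing out[:, -1, :] is a common way to summarize the whole window.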

6. Training the model

import time

hist = np.zeros(num_epochs)
start_time = time.time()
lstm = []

for t in range(num_epochs):
    # Forward pass on the whole training set
    y_train_pred = model(x_train)

    loss = criterion(y_train_pred, y_train_lstm)
    print("Epoch ", t, "MSE: ", loss.item())
    hist[t] = loss.item()

    # Backward pass and parameter update
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()

training_time = time.time() - start_time
print("Training time: {}".format(training_time))

predict = pd.DataFrame(scaler.inverse_transform(y_train_pred.detach().numpy()))
print(predict)  # predicted values
original = pd.DataFrame(scaler.inverse_transform(y_train_lstm.detach().numpy()))
print(original)  # actual values

Output (the predicted values vary somewhat from run to run, but this does not affect the overall result). The torch.Size([186, 1]) lines are not printed by the code shown above; they appear to be the shape of y_train_pred from an extra debug print.

torch.Size([186, 1])
Epoch  0 MSE:  0.19840142130851746
torch.Size([186, 1])
Epoch  1 MSE:  0.17595666646957397
torch.Size([186, 1])
Epoch  2 MSE:  0.1369851976633072
torch.Size([186, 1])
...
Epoch  96 MSE:  0.008701297454535961
torch.Size([186, 1])
Epoch  97 MSE:  0.008698086254298687
torch.Size([186, 1])
Epoch  98 MSE:  0.008688708767294884
torch.Size([186, 1])
Epoch  99 MSE:  0.008677813224494457
Training time: 3.884610414505005
              0
0    196.654999
1    201.095291
2    200.198563
3    200.494003
4    195.120148
..          ...
181  171.398987
182  171.508194
183  172.992401
184  169.850494
185  165.566605

[186 rows x 1 columns]
              0
0    201.999985
1    201.500015
2    201.740005
3    196.350006
4    198.999985
..          ...
181  171.919998
182  173.369995
183  170.169998
184  165.979996
185  160.470001

[186 rows x 1 columns]
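The run-to-run differences come largely from the random initialization of the model weights. One common way to reduce them is to seed the random number generators before the model is created. The helper name `set_seed` and the seed value 42 below are assumptions for illustration; the article's code does not set a seed, and full determinism can still depend on hardware and library versions.

```python
import random
import numpy as np
import torch

def set_seed(seed):
    """Seed the Python, NumPy and PyTorch random number generators."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)

set_seed(42)
first = torch.nn.Linear(4, 1).weight.detach().clone()
set_seed(42)
second = torch.nn.Linear(4, 1).weight.detach().clone()

# The same seed gives the same initial weights
print(torch.equal(first, second))
```

Calling a helper like this once, before the model is constructed, would make repeated runs start from the same weights.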

7. Visualizing the results

import seaborn as sns
sns.set_style("darkgrid")

fig = plt.figure()
fig.subplots_adjust(hspace=0.2, wspace=0.2)

# Left panel: actual vs. predicted closing price on the training set
plt.subplot(1, 2, 1)
ax = sns.lineplot(x=original.index, y=original[0], label="Data", color='royalblue')
ax = sns.lineplot(x=predict.index, y=predict[0], label="Training Prediction (LSTM)", color='tomato')
ax.set_title('Stock price', size=14, fontweight='bold')
ax.set_xlabel("Days", size=14)
ax.set_ylabel("Cost (USD)", size=14)
# Hide the x tick labels
ax.tick_params(axis='x', labelbottom=False)

# Right panel: training loss per epoch
plt.subplot(1, 2, 2)
ax = sns.lineplot(data=hist, color='royalblue')
ax.set_xlabel("Epoch", size=14)
ax.set_ylabel("Loss", size=14)
ax.set_title("Training Loss", size=14, fontweight='bold')
fig.set_figheight(6)
fig.set_figwidth(16)
plt.show()

8. Validating the model

This step summarizes the results and examines them further.

import math
from sklearn.metrics import mean_squared_error

# Make predictions on the test set
y_test_pred = model(x_test)

# Invert the scaling to get back to prices
y_train_pred = scaler.inverse_transform(y_train_pred.detach().numpy())
y_train = scaler.inverse_transform(y_train_lstm.detach().numpy())
y_test_pred = scaler.inverse_transform(y_test_pred.detach().numpy())
y_test = scaler.inverse_transform(y_test_lstm.detach().numpy())

# Calculate the root mean squared error (RMSE)
trainScore = math.sqrt(mean_squared_error(y_train[:, 0], y_train_pred[:, 0]))
print('Train Score: %.2f RMSE' % (trainScore))
testScore = math.sqrt(mean_squared_error(y_test[:, 0], y_test_pred[:, 0]))
print('Test Score: %.2f RMSE' % (testScore))
lstm.append(trainScore)
lstm.append(testScore)
lstm.append(training_time)

# Shift the training predictions for plotting. The target of window i is
# row i + lookback - 1, so the first prediction belongs to row lookback - 1
trainPredictPlot = np.full(price.shape, np.nan)
trainPredictPlot[lookback - 1:len(y_train_pred) + lookback - 1, :] = y_train_pred

# Shift the test predictions for plotting
testPredictPlot = np.full(price.shape, np.nan)
testPredictPlot[len(y_train_pred) + lookback - 1:len(price) - 1, :] = y_test_pred

original = scaler.inverse_transform(price['Close'].values.reshape(-1, 1))

# Columns: 0 = train prediction, 1 = test prediction, 2 = actual value
predictions = np.append(trainPredictPlot, testPredictPlot, axis=1)
predictions = np.append(predictions, original, axis=1)
result = pd.DataFrame(predictions)

import plotly.graph_objects as go

fig = go.Figure()
fig.add_trace(go.Scatter(x=result.index, y=result[0], mode='lines', name='Train prediction'))
fig.add_trace(go.Scatter(x=result.index, y=result[1], mode='lines', name='Test prediction'))
fig.add_trace(go.Scatter(x=result.index, y=result[2], mode='lines', name='Actual Value'))
fig.update_layout(
    xaxis=dict(showline=True, showgrid=True, showticklabels=False,
               linecolor='white', linewidth=2),
    yaxis=dict(
        title=dict(text='Close (USD)',
                   font=dict(family='Rockwell', size=12, color='white')),
        showline=True, showgrid=True, showticklabels=True,
        linecolor='white', linewidth=2, ticks='outside',
        tickfont=dict(family='Rockwell', size=12, color='white'),
    ),
    showlegend=True,
    template='plotly_dark',
)

annotations = []
annotations.append(dict(xref='paper', yref='paper', x=0.0, y=1.05,
                        xanchor='left', yanchor='bottom',
                        text='Results (LSTM)',
                        font=dict(family='Rockwell', size=26, color='white'),
                        showarrow=False))
fig.update_layout(annotations=annotations)

fig.show()
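The train and test scores above are root mean squared errors. As a small sketch on made-up numbers, the following compares the sklearn-based calculation used in the article with the same quantity written out by hand.

```python
import math
import numpy as np
from sklearn.metrics import mean_squared_error

# Made-up actual and predicted prices
actual = np.array([200.0, 202.0, 198.0, 201.0])
predicted = np.array([199.0, 204.0, 197.0, 201.0])

rmse_sklearn = math.sqrt(mean_squared_error(actual, predicted))
# The same quantity written out: square the errors, average them, take the root
rmse_by_hand = float(np.sqrt(np.mean((actual - predicted) ** 2)))

print('%.2f RMSE' % rmse_sklearn)
```

Because the scaling is inverted before scoring, the RMSE is expressed in the same units as the prices, which makes it easier to interpret than an error on the [-1, 1] scale.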

Complete code

import math
import time

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import torch
import torch.nn as nn
import plotly.graph_objects as go
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import MinMaxScaler

filepath = 'D:/pythonProjects/LSTM/data/rlData.csv'
data = pd.read_csv(filepath)
data = data.sort_values('Date')
print(data.head())
print(data.shape)

sns.set_style("darkgrid")
plt.figure(figsize=(15, 9))
plt.plot(data[['Close']])
plt.xticks(range(0, data.shape[0], 20), data['Date'].iloc[::20], rotation=45)
plt.title("****** Stock Price", fontsize=18, fontweight='bold')
plt.xlabel('Date', fontsize=18)
plt.ylabel('Close Price (USD)', fontsize=18)
plt.show()

# 1. Feature engineering
# Select Close as the only feature; copy() avoids SettingWithCopyWarning
price = data[['Close']].copy()
print(price.info())

# Scale the data to the range [-1, 1]
scaler = MinMaxScaler(feature_range=(-1, 1))
price['Close'] = scaler.fit_transform(price['Close'].values.reshape(-1, 1))
print(price['Close'].shape)


# 2. Building the dataset
# Each window holds lookback consecutive closing prices: the first
# lookback - 1 values are the input and the last value is the target
def split_data(stock, lookback):
    data_raw = stock.to_numpy()
    data = []

    # lookback (the sequence length) can be tuned freely
    for index in range(len(data_raw) - lookback):
        data.append(data_raw[index: index + lookback])

    data = np.array(data)
    test_set_size = int(np.round(0.2 * data.shape[0]))
    train_set_size = data.shape[0] - test_set_size

    x_train = data[:train_set_size, :-1, :]
    y_train = data[:train_set_size, -1, :]
    x_test = data[train_set_size:, :-1, :]
    y_test = data[train_set_size:, -1, :]

    return [x_train, y_train, x_test, y_test]


lookback = 20
x_train, y_train, x_test, y_test = split_data(price, lookback)
print('x_train.shape = ', x_train.shape)
print('y_train.shape = ', y_train.shape)
print('x_test.shape = ', x_test.shape)
print('y_test.shape = ', y_test.shape)

# Note: by default PyTorch's nn.LSTM expects input of shape
# (seq_length, batch_size, input_size); with batch_first=True, as used
# below, the shape is (batch_size, seq_length, input_size)
# 3. Building the model (LSTM)
x_train = torch.from_numpy(x_train).type(torch.Tensor)
x_test = torch.from_numpy(x_test).type(torch.Tensor)
y_train_lstm = torch.from_numpy(y_train).type(torch.Tensor)
y_test_lstm = torch.from_numpy(y_test).type(torch.Tensor)
# Not used in this article
y_train_gru = torch.from_numpy(y_train).type(torch.Tensor)
y_test_gru = torch.from_numpy(y_test).type(torch.Tensor)

# Input dimension is 1: only the closing price (Close)
input_dim = 1
# Dimension of the hidden state
hidden_dim = 32
# Number of stacked LSTM layers
num_layers = 2
# Output dimension is 1: the next day's closing price
output_dim = 1
num_epochs = 100


class LSTM(nn.Module):
    def __init__(self, input_dim, hidden_dim, num_layers, output_dim):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.num_layers = num_layers

        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # Zero initial hidden and cell states, created fresh for every call
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim)
        out, (hn, cn) = self.lstm(x, (h0, c0))
        # Predict from the output of the last time step
        out = self.fc(out[:, -1, :])
        return out


model = LSTM(input_dim=input_dim, hidden_dim=hidden_dim, output_dim=output_dim, num_layers=num_layers)
criterion = torch.nn.MSELoss()
optimiser = torch.optim.Adam(model.parameters(), lr=0.01)

# 4. Training the model
hist = np.zeros(num_epochs)
start_time = time.time()
lstm = []

for t in range(num_epochs):
    y_train_pred = model(x_train)

    loss = criterion(y_train_pred, y_train_lstm)
    print("Epoch ", t, "MSE: ", loss.item())
    hist[t] = loss.item()

    optimiser.zero_grad()
    loss.backward()
    optimiser.step()

training_time = time.time() - start_time
print("Training time: {}".format(training_time))

# 5. Visualizing the results
predict = pd.DataFrame(scaler.inverse_transform(y_train_pred.detach().numpy()))
original = pd.DataFrame(scaler.inverse_transform(y_train_lstm.detach().numpy()))

sns.set_style("darkgrid")

fig = plt.figure()
fig.subplots_adjust(hspace=0.2, wspace=0.2)

plt.subplot(1, 2, 1)
ax = sns.lineplot(x=original.index, y=original[0], label="Data", color='royalblue')
ax = sns.lineplot(x=predict.index, y=predict[0], label="Training Prediction (LSTM)", color='tomato')
ax.set_title('Stock price', size=14, fontweight='bold')
ax.set_xlabel("Days", size=14)
ax.set_ylabel("Cost (USD)", size=14)
# Hide the x tick labels
ax.tick_params(axis='x', labelbottom=False)

plt.subplot(1, 2, 2)
ax = sns.lineplot(data=hist, color='royalblue')
ax.set_xlabel("Epoch", size=14)
ax.set_ylabel("Loss", size=14)
ax.set_title("Training Loss", size=14, fontweight='bold')
fig.set_figheight(6)
fig.set_figwidth(16)
plt.show()

# 6. Validating the model
# Make predictions on the test set
y_test_pred = model(x_test)

# Invert the scaling to get back to prices
y_train_pred = scaler.inverse_transform(y_train_pred.detach().numpy())
y_train = scaler.inverse_transform(y_train_lstm.detach().numpy())
y_test_pred = scaler.inverse_transform(y_test_pred.detach().numpy())
y_test = scaler.inverse_transform(y_test_lstm.detach().numpy())

# Calculate the root mean squared error (RMSE)
trainScore = math.sqrt(mean_squared_error(y_train[:, 0], y_train_pred[:, 0]))
print('Train Score: %.2f RMSE' % (trainScore))
testScore = math.sqrt(mean_squared_error(y_test[:, 0], y_test_pred[:, 0]))
print('Test Score: %.2f RMSE' % (testScore))
lstm.append(trainScore)
lstm.append(testScore)
lstm.append(training_time)

# Shift the training predictions for plotting. The target of window i is
# row i + lookback - 1, so the first prediction belongs to row lookback - 1
trainPredictPlot = np.full(price.shape, np.nan)
trainPredictPlot[lookback - 1:len(y_train_pred) + lookback - 1, :] = y_train_pred

# Shift the test predictions for plotting
testPredictPlot = np.full(price.shape, np.nan)
testPredictPlot[len(y_train_pred) + lookback - 1:len(price) - 1, :] = y_test_pred

original = scaler.inverse_transform(price['Close'].values.reshape(-1, 1))

# Columns: 0 = train prediction, 1 = test prediction, 2 = actual value
predictions = np.append(trainPredictPlot, testPredictPlot, axis=1)
predictions = np.append(predictions, original, axis=1)
result = pd.DataFrame(predictions)

fig = go.Figure()
fig.add_trace(go.Scatter(x=result.index, y=result[0], mode='lines', name='Train prediction'))
fig.add_trace(go.Scatter(x=result.index, y=result[1], mode='lines', name='Test prediction'))
fig.add_trace(go.Scatter(x=result.index, y=result[2], mode='lines', name='Actual Value'))
fig.update_layout(
    xaxis=dict(showline=True, showgrid=True, showticklabels=False,
               linecolor='white', linewidth=2),
    yaxis=dict(
        title=dict(text='Close (USD)',
                   font=dict(family='Rockwell', size=12, color='white')),
        showline=True, showgrid=True, showticklabels=True,
        linecolor='white', linewidth=2, ticks='outside',
        tickfont=dict(family='Rockwell', size=12, color='white'),
    ),
    showlegend=True,
    template='plotly_dark',
)

annotations = []
annotations.append(dict(xref='paper', yref='paper', x=0.0, y=1.05,
                        xanchor='left', yanchor='bottom',
                        text='Results (LSTM)',
                        font=dict(family='Rockwell', size=26, color='white'),
                        showarrow=False))
fig.update_layout(annotations=annotations)

fig.show()
