LSTM Keras input shape confusion

I am trying to build a predictive model on stock prices. From what I’ve read, LSTM is a good layer to use. I can’t fully understand what my input_shape needs to be for my model though.

Here is the tail of my DataFrame

I then split the data into train / test

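from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

labels = df['close'].values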
x_train_df = df.drop(columns=['close'])
x_train, x_test, y_train, y_test = train_test_split(x_train_df.values, labels, test_size=0.2, shuffle=False)

min_max_scaler = MinMaxScaler()
x_train = min_max_scaler.fit_transform(x_train)
x_test = min_max_scaler.transform(x_test)

print('y_train', y_train.shape)
print('y_test', y_test.shape)
print('x_train', x_train.shape)
print('x_test', x_test.shape)
print(x_train)

This yields:

[image: printed shapes of y_train, y_test, x_train and x_test, followed by the scaled x_train array]

Here’s where I am getting confused. Running the simple example, I get the following error:

ValueError: Input 0 of layer lstm_15 is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 1, 4026, 5]

I’ve tried various combinations of input_shape and have come to the conclusion that I have no idea how to determine the input shape.

model = Sequential()
model.add(LSTM(32, input_shape=(1, x_train.shape[0], x_train.shape[1])))

model.compile(optimizer='adam', loss='mse', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10)
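
A note on where the ndim=4 in the error seems to come from, as a rough sketch rather than a definitive diagnosis: Keras prepends the batch dimension to whatever is passed as input_shape, so a three-element tuple describes a 4-D input. The snippet below only mimics that bookkeeping with plain tuples. The 4026 and 5 are taken from the error message, and `with_batch_dim` is a hypothetical helper for illustration, not a Keras function.

```python
def with_batch_dim(input_shape):
    """Mimic how Keras prepends an unspecified batch dimension (None)."""
    return (None,) + tuple(input_shape)

# Shape passed in the question: (1, x_train.shape[0], x_train.shape[1])
passed = with_batch_dim((1, 4026, 5))
# Shape an LSTM layer accepts: (timesteps, features), here 1 timestep of 5 features
accepted = with_batch_dim((1, 5))

print(passed, len(passed))      # (None, 1, 4026, 5) 4
print(accepted, len(accepted))  # (None, 1, 5) 3
```

The first tuple matches the "Full shape received" in the error. The second has the three dimensions an LSTM layer expects.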

Given my DataFrame, what should my input_shape be? I understand that the input shape is (batch size, timesteps, data dim), but it's not clear to me how to map those terms onto my actual data, since the values I assumed turned out to be wrong.

I was thinking:

  • Batch Size: Number of records I’m passing in (4026)
  • Time Steps: 1 (I’m not sure if this is supposed to be the same value as batch size?)
  • Data Dimension: 1, since my data is one-dimensional (I think?)
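
One possible way to map those three terms onto a 2-D array like x_train is sketched below, under stated assumptions. The samples are sliding windows over consecutive rows, timesteps is the window length, and data dim is the number of feature columns (5, going by the error message). `make_windows`, the window length of 10, and the random stand-in data are all hypothetical choices for illustration, not part of the original code.

```python
import numpy as np

def make_windows(features, targets, timesteps):
    """Slice a 2-D (rows, n_features) array into overlapping windows.

    Returns X with shape (samples, timesteps, n_features) and y with shape
    (samples,), where each y is the target on the last row of its window.
    """
    n_rows, n_features = features.shape
    n_samples = n_rows - timesteps + 1
    X = np.stack([features[i:i + timesteps] for i in range(n_samples)])
    y = targets[timesteps - 1:]
    return X, y

# Stand-in data with the same shape as x_train in the question (4026 rows, 5 columns).
rng = np.random.default_rng(0)
x_train_demo = rng.random((4026, 5))
y_train_demo = rng.random(4026)

X, y = make_windows(x_train_demo, y_train_demo, timesteps=10)
print(X.shape)  # (4017, 10, 5)
print(y.shape)  # (4017,)

# The LSTM would then be declared with input_shape=X.shape[1:], i.e. (10, 5).
```

With timesteps=1 this reduces to x_train.reshape(-1, 1, 5), which is the closest match to the guess in the list above. The batch size is not part of input_shape at all. It is set separately through the batch_size argument of model.fit.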
