Parking Birmingham occupancy
Source of data: https://archive.ics.uci.edu/ml/datasets/Parking+Birmingham
import pandas as pd
df = pd.read_csv('c:/TF/ParkingBirmingham.csv')
df.LastUpdated = pd.to_datetime(df.LastUpdated)
df['month'] = df.LastUpdated.dt.month
df['hour'] = df.LastUpdated.dt.hour
df['weekday_name'] = df.LastUpdated.dt.day_name()
df['weekday'] = df.LastUpdated.dt.weekday
df = df.loc[df['SystemCodeNumber']=='BHMMBMMBX01']
import tensorflow as tf
Step 1: Convert Data
Numeric variables must be converted to the format TensorFlow expects. For continuous variables, TensorFlow provides the conversion method tf.feature_column.numeric_column().
FEATURES = ['month', 'hour', 'weekday']
LABEL = 'Occupancy'
PKS = [tf.feature_column.numeric_column(k) for k in FEATURES]
Step 2: Defining the estimator
TensorFlow will automatically create a directory called "ABC" in your working directory, where it saves the model. Point TensorBoard at this path to inspect the training run. The estimator receives the feature columns, which describe the independent variables.
estimator = tf.estimator.LinearRegressor( feature_columns=PKS, model_dir="ABC")
To tell TensorFlow how to feed the model, you can use pandas_input_fn. It takes five parameters:
- x: feature data
- y: label data
- batch_size: batch size (default 128)
- num_epochs: number of epochs (default 1)
- shuffle: whether to shuffle the data (default None)
def get_input_fn(data_set, num_epochs=None, n_batch = 128, shuffle=True):
    return tf.estimator.inputs.pandas_input_fn(
        x=pd.DataFrame({k: data_set[k].values for k in FEATURES}),
        y=pd.Series(data_set[LABEL].values),
        batch_size=n_batch, num_epochs=num_epochs, shuffle=shuffle)
Step 3: Model training
- To feed the model, use the function created above: get_input_fn.
- Then instruct the model to iterate 1000 times (steps=1000).
- The number of epochs (num_epochs) is left as None, so the input function keeps supplying data until the step limit is reached.
- It is better to set num_epochs to None and define the number of iterations instead, so that training always stops after a fixed number of steps.
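The relationship between steps, batch size and epochs described in the list above can be sketched with a little arithmetic. The row count used here is a made-up placeholder, not the size of the actual training set.

```python
import math

# Hypothetical training-set size; in the notebook this would be len(df_train).
n_rows = 1000
batch_size = 128
steps = 1000

# Each training step consumes one batch, so this many rows are read in total.
rows_seen = steps * batch_size

# With num_epochs=None the input function keeps cycling over the data,
# so the data is effectively passed through this many times.
effective_epochs = rows_seen / n_rows

# With num_epochs=1 the input would be exhausted after this many steps
# (the last batch may be smaller than batch_size).
steps_per_epoch = math.ceil(n_rows / batch_size)

print(rows_seen, effective_epochs, steps_per_epoch)
```

In other words, fixing the number of steps makes the amount of training independent of how many rows the training set happens to contain.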
To test the model, we must divide the data set into a test set and a training set.
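The code that performs this split is not shown here. One possible way to do it, assuming a random 80/20 split is acceptable, is sketched below. The helper name split_train_test, the test fraction and the seed are all assumptions, and the small frame is synthetic stand-in data.

```python
import pandas as pd

def split_train_test(df, test_frac=0.2, seed=42):
    """Randomly split df into (train, test). Hypothetical helper."""
    # Draw the test rows at random, reproducibly.
    df_test = df.sample(frac=test_frac, random_state=seed)
    # Everything that was not drawn becomes the training set.
    df_train = df.drop(df_test.index)
    return df_train, df_test

# Tiny synthetic frame standing in for the filtered parking data.
demo = pd.DataFrame({
    "month": [10] * 10,
    "hour": list(range(8, 18)),
    "weekday": [1] * 10,
    "Occupancy": list(range(100, 200, 10)),
})
demo_train, demo_test = split_train_test(demo)
print(demo_train.shape, demo_test.shape)
```

With the real data, calling split_train_test(df) would yield the df_train and df_test frames used in the following cells. Because the observations are time-stamped, a chronological split may be preferable to a random one, depending on the goal.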
print(df_train.shape, df_test.shape)
estimator.train(input_fn=get_input_fn(df_train, num_epochs=None, n_batch = 128, shuffle=False), steps=1000)
Step 4: Model evaluation
To evaluate the model on the test set, use the following code:
ev = estimator.evaluate( input_fn=get_input_fn(df_test, num_epochs=1, n_batch = 128, shuffle=False))
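The evaluate call returns a dictionary of metrics. For a LinearRegressor this normally includes average_loss, which is the mean squared error, so a root mean squared error can be derived from it. The sketch below assumes that key is present. The helper name rmse_from_eval is hypothetical and the numbers are placeholders, not results from this model.

```python
import math

def rmse_from_eval(metrics):
    """Convert the mean squared error reported by evaluate() into RMSE."""
    return math.sqrt(metrics["average_loss"])

# Placeholder numbers for illustration only, not output of the model above.
ev_example = {"average_loss": 400.0, "loss": 51200.0, "global_step": 1000}
print("RMSE:", rmse_from_eval(ev_example))
```

With the real result, rmse_from_eval(ev) would give the error in the same units as Occupancy.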
Step 5: Calculation of R Square
Calculation of R Square parameter using Tensorflow
First, I make predictions on the test set.
y = estimator.predict(
    input_fn=get_input_fn(df_test, num_epochs=1, n_batch = 256, shuffle=False))
import itertools
predictions = list(p["predictions"] for p in itertools.islice(y, 1871))
#print("Predictions: {}".format(str(predictions)))
import numpy as np
conc = np.vstack(predictions)
ZHP = pd.DataFrame(conc)
ZHP.rename(columns={0:'y_pred'}, inplace=True)
kot = ZHP['y_pred'].values
kot = kot.astype('float32')
Now I create an array of the actual y values from the test set.
y = df_test['Occupancy'].values
y = y.astype('float32')
Now I create a dataframe with y-real and y-predicted variables.
PZU = pd.DataFrame({'y': y, 'y_pred': kot })
def R_squared(y, y_pred):
    residual = tf.reduce_sum(tf.square(tf.subtract(y,y_pred)))
    total = tf.reduce_sum(tf.square(tf.subtract(y, tf.reduce_mean(y))))
    r2 = tf.subtract(1.0, tf.div(residual, total))
    return r2
To use this function, both variables must have the same data type.
# y and kot were both cast to float32 above
r2 = R_squared(y, kot)
sess = tf.Session()
a = sess.run(r2)
print('R Square parameter: ',a)
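For reference, the quantity computed above can be written out as follows. The symbols are introduced here for illustration: y_i are the observed values, \hat{y}_i the predictions and \bar{y} the mean of the observed values.

```latex
R^2 = 1 - \frac{\sum_{i} (y_i - \hat{y}_i)^2}{\sum_{i} (y_i - \bar{y})^2}
    = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}
```

The numerator is the residual sum of squares and the denominator is the total sum of squares.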
Calculation of R Square parameter using Pandas
Point 1. We calculate the squared difference between the empirical values y and the predicted values y_pred
PZU['SSE'] = (PZU['y'] - PZU['y_pred'])**2
Point 2. We calculate the average empirical value of y
PZU['ave_y'] = PZU['y'].mean()
Point 3. We calculate the squared difference between the empirical values y and the average of the empirical values y
PZU['SST'] = (PZU['y'] - PZU['ave_y'])**2
Point 4. We calculate the difference between the sum of SST and the sum of SSE
Sum_SST = PZU['SST'].sum()
print('Sum_SST :',Sum_SST)
Sum_SSE = PZU['SSE'].sum()
print('Sum_SSE :',Sum_SSE)
SSR = Sum_SST - Sum_SSE
Point 5. We calculate the R Square parameter
r2 = SSR/Sum_SST
print('R Square parameter: ',r2)
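The two formulations used in this section, 1 - SSE/SST and (SST - SSE)/SST, are algebraically identical. A small self-contained check of that, using made-up numbers rather than the parking data (the function name r_squared is only for this sketch):

```python
def r_squared(y_true, y_pred):
    """Return (1 - SSE/SST, (SST - SSE)/SST) for two equal-length sequences."""
    mean_y = sum(y_true) / len(y_true)
    # Sum of squared errors between observed and predicted values.
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares around the mean of the observed values.
    sst = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - sse / sst, (sst - sse) / sst

# Made-up values for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
r2_a, r2_b = r_squared(y_true, y_pred)
print(r2_a, r2_b)
```

Both values should agree up to floating-point rounding, which is why the TensorFlow and Pandas calculations are expected to report the same R Square.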