LinearRegressor with TensorFlow

Now it’s time to train our first model. Before that, let me tell you what TensorFlow is.
TensorFlow is a free and open-source software library for dataflow and differentiable programming across a range of tasks. It is a symbolic math library, and it is also used for machine learning applications such as neural networks. Now let’s jump to our ML model. In the last post we loaded our linear data and defined our features and labels.

Oh! Before proceeding we have to randomise our data. If randomisation is not done, our model may simply memorise outputs (or pick up spurious ordering effects in the data) rather than genuinely improving itself. To randomise the data we do the following:
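As a sketch, the rows of a pandas DataFrame can be shuffled by reindexing with a random permutation of its index (the tiny DataFrame below is just a stand-in; in this series the data is the California housing set loaded in the previous post):

```python
import numpy as np
import pandas as pd

# A small stand-in DataFrame; in the blog this is the California housing data.
Data_Frame = pd.DataFrame({
    "total_rooms": [5612.0, 7650.0, 720.0, 1501.0],
    "median_house_value": [66900.0, 80100.0, 85700.0, 73400.0],
})

# Reindex with a random permutation of the row indices to shuffle the rows.
Data_Frame = Data_Frame.reindex(np.random.permutation(Data_Frame.index))
```

The shuffle only reorders rows; no values are changed or dropped.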

Now we’ll configure a linear regression model using LinearRegressor. We’ll train this model using GradientDescentOptimizer, which implements Mini-Batch Stochastic Gradient Descent (SGD). The learning_rate argument controls the size of the gradient step. We also apply gradient clipping via clip_gradients_by_norm. Gradient clipping ensures the magnitude of the gradients does not become too large during training, which can cause gradient descent to fail.

Now we configure our LinearRegressor with the feature columns and the optimizer.
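As a sketch, assuming the TensorFlow 1.x estimator API used throughout this series, the optimizer with clipping and the regressor might be configured like this:

```python
import tensorflow as tf

# Feature column for our single numeric input feature (defined in the previous post).
feature_columns = [tf.feature_column.numeric_column("total_rooms")]

# Mini-batch SGD; the very small learning rate here is a tunable choice.
my_optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.0000001)

# Clip gradients to a maximum norm of 5.0 so they cannot blow up during training.
my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)

# Configure the linear regression model with our feature columns and optimizer.
linear_regressor = tf.estimator.LinearRegressor(
    feature_columns=feature_columns,
    optimizer=my_optimizer,
)
```

Note that `tf.contrib` exists only in TensorFlow 1.x; this snippet will not run unchanged on TensorFlow 2.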

Defining an Input Function

To import our data into our LinearRegressor, we need to define an input function, which instructs TensorFlow how to preprocess the data, as well as how to batch, shuffle, and repeat it during model training.

To do this, we’ll first convert our pandas feature data into a dict of NumPy arrays. We can then use the TensorFlow Dataset API to construct a dataset object from our data, break it into batches of batch_size, and repeat it for the specified number of epochs (num_epochs).

So the complete code for the input function will be as follows.

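A sketch of the input function, assuming the TensorFlow 1.x Dataset API:

```python
import numpy as np
import tensorflow as tf
from tensorflow.python.data import Dataset

def my_input_fn(features, targets, batch_size=1, shuffle=True, num_epochs=None):
    """Feeds data to a linear regression model of one feature.

    Args:
      features: pandas DataFrame of features
      targets: pandas Series of targets
      batch_size: size of batches to be passed to the model
      shuffle: whether to shuffle the data
      num_epochs: number of epochs to repeat the data; None = repeat indefinitely
    Returns:
      Tuple of (features, labels) for the next data batch
    """
    # Convert pandas data into a dict of NumPy arrays.
    features = {key: np.array(value) for key, value in dict(features).items()}

    # Construct a dataset, then configure batching and repeating.
    ds = Dataset.from_tensor_slices((features, targets))
    ds = ds.batch(batch_size).repeat(num_epochs)

    # Shuffle the data, if specified.
    if shuffle:
        ds = ds.shuffle(buffer_size=10000)

    # Return the next batch of data.
    features, labels = ds.make_one_shot_iterator().get_next()
    return features, labels
```

`make_one_shot_iterator()` is part of the 1.x graph-mode Dataset API, so this too is specific to TensorFlow 1.x.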

The Next Step is to Train our Model

We can now call train() on our linear_regressor to train the model. We’ll wrap my_input_fn in a lambda so we can pass in my_feature and targets as arguments, and to start, we’ll train for 100 steps.
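A sketch of the training call, assuming `linear_regressor`, `my_input_fn`, `my_feature`, and `targets` are defined as in the earlier steps:

```python
# Wrap my_input_fn in a lambda so we can pass my_feature and targets as arguments.
_ = linear_regressor.train(
    input_fn=lambda: my_input_fn(my_feature, targets),
    steps=100,
)
```

The lambda is needed because train() expects a zero-argument callable, while my_input_fn takes the data as parameters.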

Next, let’s Predict and Evaluate our Model
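A sketch of prediction and a simple mean-squared-error evaluation (assuming scikit-learn’s `metrics` module for the error computation, and the same `my_input_fn` as above):

```python
import numpy as np
from sklearn import metrics

# For prediction we make a single pass over the data, in order, with no shuffling.
prediction_input_fn = lambda: my_input_fn(
    my_feature, targets, num_epochs=1, shuffle=False)

# predict() returns a generator of per-example prediction dicts.
predictions = linear_regressor.predict(input_fn=prediction_input_fn)
predictions = np.array([item['predictions'][0] for item in predictions])

# Mean squared error between the predictions and the true targets.
mean_squared_error = metrics.mean_squared_error(predictions, targets)
print("Mean Squared Error (on training data): %0.3f" % mean_squared_error)
```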

I hope all of you are following the topic. If you have any doubts, just comment and I will get back to you soon. In the next post we will see how well our model is working. Note that this is a very basic linear regression problem; in further posts I will be covering more complex ones.

Link to my previous post : https://machinelearningpower.home.blog/2019/03/30/our-data-with-tensorflow-ml/

Link to my Blog : http://machinelearningpower.home.blog

Our Data with TensorFlow (ML)

Now that we have imported our data, the next step is to see what our data is made of. Before that, let me give you a brief explanation of what features and labels are.

Features : Any values in our data which are used/helpful in making predictions, i.e. values based on which we can make good predictions, are known as features. There can be one or many features in our data. They are usually represented by ‘x’.

Labels : Values which are to be predicted are called labels or target values. These are usually represented by ‘y’.

Getting to know your Data

Before starting to write any code you should know what your aim/result is. To know what your data is made of we will use some commands. Let’s start.

We can use the Data_Frame.head() command to see the top few rows of our data.


We can also use the Data_Frame.tail() command to see the bottom few rows of our data, and Data_Frame.describe() to show some useful statistics about it.

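For example, on a small stand-in DataFrame (in the blog these commands run on the California housing data):

```python
import pandas as pd

# A small stand-in DataFrame; in the blog this is the California housing data.
Data_Frame = pd.DataFrame({
    "total_rooms": [5612.0, 7650.0, 720.0, 1501.0, 1454.0],
    "median_house_value": [66900.0, 80100.0, 85700.0, 73400.0, 65500.0],
})

print(Data_Frame.head(3))      # first 3 rows (default is 5)
print(Data_Frame.tail(2))      # last 2 rows (default is 5)
print(Data_Frame.describe())   # count, mean, std, min, quartiles, max per column
```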

Deciding our Features and Labels

Here I am going to use only one feature, total_rooms, and the label we are going to predict will be median_house_value. Now let’s define our features and labels.

In order to import our training data into TensorFlow, we need to specify what type of data each feature contains. There are two main types of data we’ll use in this and future exercises: Categorical Data and Numeric Data

In TensorFlow, we indicate a feature’s data type using a construct called a feature column. Feature columns store only a description of the feature data; they do not contain the feature data itself.

To start, we’re going to use just one numeric input feature, total_rooms. The following code pulls the total_rooms data from our Data_Frame and defines the feature column using numeric_column, which specifies that its data is numeric. (When using more than one feature, use a comma to separate the features.)
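A sketch of this step, assuming the TensorFlow 1.x `feature_column` API and the `Data_Frame` loaded earlier:

```python
import tensorflow as tf

# Pull the input feature out of the DataFrame
# (double brackets keep the result a DataFrame rather than a Series).
my_feature = Data_Frame[["total_rooms"]]

# Describe the feature as numeric; add more numeric_column entries here,
# comma-separated, when using more than one feature.
feature_columns = [tf.feature_column.numeric_column("total_rooms")]
```

Remember that a feature column only describes the data; `my_feature` still holds the actual values.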

Defining Labels

Defining our labels is a one-liner:

```python
targets = Data_Frame["median_house_value"]
```

Now we have also defined our labels.

In the next post we will start writing our LinearRegressor model. Note that I have changed the data; it’s not the same data I used in my previous blog.

Links

Link for the data : https://download.mlcc.google.com/mledu-datasets/california_housing_train.csv

Link for my previous Blog Importing Data for ML : https://machinelearningpower.home.blog/2019/03/30/importing-data-ml/