Our Data with TensorFlow (ML)

Now that we have imported our data, the next step is to see what it is made of. Before that, let me briefly explain what features and labels are.

Features : Any values in our data that help us make good predictions are known as features. There can be one or many features in our data. They are usually represented by ‘x’.

Labels : The values we want to predict are called labels or target values. These are usually represented by ‘y’.
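To make the two terms concrete, here is a small sketch using pandas. The rows are made-up values for illustration only, and the population column is an assumption added just to show what more than one feature looks like.

```python
import pandas as pd

# Made-up rows for illustration only (not real housing data).
data = pd.DataFrame({
    "total_rooms": [1000.0, 2000.0, 3000.0],
    "population": [300.0, 600.0, 900.0],
    "median_house_value": [100000.0, 150000.0, 200000.0],
})

# Features (x): the columns we use to make a prediction.
x = data[["total_rooms", "population"]]

# Labels (y): the column we want to predict.
y = data["median_house_value"]

print(x.shape)  # (3, 2): three rows, two features
print(y.shape)  # (3,): one label per row
```

Each row of x lines up with one value in y, so the model sees the features of a row and learns to predict that row's label.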

Getting to know your Data

Before starting to write any code, you should know what your aim is. To find out what our data is made of, we will use a few commands. Let’s start.

We can use the Data_Frame.head() command to see the first few rows of our data (five by default), with all of their columns.

Data_Frame.head() to see the top rows of our data

We can also use the Data_Frame.tail() command to see the last few rows of our data. Data_Frame.describe() shows summary statistics for each numeric column: count, mean, standard deviation, minimum, quartiles and maximum.

Data_Frame.tail() command
Data_Frame.describe()
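Put together, the three commands look like the sketch below. It builds a tiny Data_Frame by hand (made-up values) so that it runs without the CSV file; with the real data you would call the same three methods on the Data_Frame you imported earlier.

```python
import pandas as pd

# Tiny stand-in for the real data; the values are made up for illustration.
Data_Frame = pd.DataFrame({
    "total_rooms": [1000.0, 2000.0, 3000.0, 4000.0, 5000.0, 6000.0],
    "median_house_value": [100000.0, 150000.0, 200000.0,
                           250000.0, 300000.0, 350000.0],
})

top = Data_Frame.head()        # first 5 rows by default
bottom = Data_Frame.tail(2)    # last 2 rows; pass a number to change the count
stats = Data_Frame.describe()  # count, mean, std, min, quartiles, max per column

print(top)
print(bottom)
print(stats)
```

In a Jupyter Notebook you can skip print() and simply put Data_Frame.head() on the last line of a cell to get a nicely formatted table.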

Deciding our Features and Labels

Here I am going to use only one feature, total_rooms, and the label we are going to predict will be median_house_value. Now let’s define our features and labels.

In order to import our training data into TensorFlow, we need to specify what type of data each feature contains. There are two main types of data we’ll use in this and future exercises: categorical data and numeric data.

In TensorFlow, we indicate a feature’s data type using a construct called a feature column. Feature columns store only a description of the feature data; they do not contain the feature data itself.

To start, we’re going to use just one numeric input feature, total_rooms. The following code pulls the total_rooms data from our Data_Frame and defines the feature column using numeric_column, which specifies that its data is numeric. (When using more than one feature, use a comma to separate the different features.)
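A minimal sketch of that step is shown here. The names my_feature and feature_columns are my own choices, and the Data_Frame is a made-up stand-in for the real data. tf.feature_column.numeric_column belongs to TensorFlow's older feature column API, which is deprecated in recent releases, so the sketch sets feature_columns to None if TensorFlow or that API is not available.

```python
import pandas as pd

# Stand-in Data_Frame with made-up values; use your real data here.
Data_Frame = pd.DataFrame({
    "total_rooms": [1000.0, 2000.0, 3000.0],
    "median_house_value": [100000.0, 150000.0, 200000.0],
})

# Double brackets keep the result a one-column DataFrame, not a Series.
my_feature = Data_Frame[["total_rooms"]]

try:
    import tensorflow as tf
    # Describes total_rooms as numeric; it holds no data itself.
    # For several features, separate the columns with commas inside the list.
    feature_columns = [tf.feature_column.numeric_column("total_rooms")]
except (ImportError, AttributeError):
    # TensorFlow is not installed, or this older API is no longer shipped.
    feature_columns = None

print(my_feature.shape)
```

Notice that my_feature holds the actual numbers, while feature_columns only tells TensorFlow what kind of data to expect under the name total_rooms.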

Defining Labels

>>targets = Data_Frame["median_house_value"]

Now we have also defined our labels.

In the next post we will start writing our LinearRegressor model. Note that I have changed the data; it is not the same data that I used in my previous blog.

Links

Link for the data : https://download.mlcc.google.com/mledu-datasets/california_housing_train.csv

Link for my previous Blog Importing Data for ML : https://machinelearningpower.home.blog/2019/03/30/importing-data-ml/

Setting up environment for ML

Now let’s download and install all the required software/packages to get started with our ML model.

First, download the latest version of Python from https://www.python.org/downloads/ . Download according to your operating system. Second, download an IDE to write our code. There are many IDEs, but I suggest you use Anaconda. Download Anaconda from here https://www.anaconda.com/distribution/ . Open Anaconda and install Jupyter Notebook.

Now you are all set up to start coding your ML model.

You also have to install some libraries into Anaconda. Open the Anaconda Prompt and type these commands to install the required libraries.

Pandas Library :
pandas is a software library written for the Python programming language for data manipulation and analysis. In particular, it offers data structures and operations for manipulating numerical tables and time series.

< conda install -c anaconda pandas > Type this in the Anaconda Prompt to install pandas (without the brackets).

NumPy Library :
NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.

< conda install -c anaconda numpy > Type this in the Anaconda Prompt to install NumPy (without the brackets).
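Once both installs finish, a quick check like this sketch confirms that the libraries can be imported. Run it in a Jupyter Notebook cell or at a Python prompt; the version numbers printed will depend on what conda installed on your machine.

```python
import numpy as np
import pandas as pd

# If either import fails, that library is not installed in this environment.
print("pandas version:", pd.__version__)
print("numpy version:", np.__version__)

# A tiny smoke test: build a NumPy array and wrap it in a pandas Series.
values = np.array([1.0, 2.0, 3.0])
series = pd.Series(values)
print(series.sum())
```

If you see an error such as ModuleNotFoundError, run the install command for that library again in the Anaconda Prompt.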

That’s it for now. If any other packages are required, I will guide you through installing them as we build our model.

STAY WITH ME TO LEARN MACHINE LEARNING, and share this blog with your friends.