Building Multi-Layer Perceptron Neural Network Models
In this article, I am going to discuss Building Multi-Layer Perceptron Neural Network Models. Please read our previous article where we discussed Deep Neural Networks.
Keras is a Python deep learning library that builds models as a series of layers. In this post, you will learn the basic building blocks that Keras provides for constructing neural networks and simple deep learning models. Keras is the high-level API of TensorFlow, so the two are installed together, and you can import Keras directly from the tensorflow package.
The Sequential class defines the simplest model as a linear stack of layers. You first import the class, then create a Sequential model and add each layer to it, for instance:
from tensorflow.keras.models import Sequential

model = Sequential()
model.add(...)
model.add(...)
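As a minimal sketch of the pattern above, the placeholders can be filled with concrete layers. The layer sizes, the activations, and the explicit Input object are illustrative assumptions, not values prescribed by this article.

```python
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

# Layer sizes and activations below are illustrative assumptions.
model = Sequential()
model.add(Input(shape=(8,)))                # 8 input features per sample
model.add(Dense(11, activation="relu"))     # hidden layer with 11 neurons
model.add(Dense(1, activation="sigmoid"))   # single output neuron

print(model.output_shape)  # (None, 1): one output value per input row
```

The leading None in the output shape stands for the batch dimension, which is only fixed when data is passed to the model.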
Input Layer of Model –
Your model’s initial layer must define the input’s shape. The input_dim option defines this as the number of input attributes. An integer is expected for this argument.
For a Dense layer, specifying input_dim defines the input layer and the first hidden layer together. For instance, if you have eight inputs and you want 11 neurons in your hidden layer, you may describe it as follows:
from tensorflow.keras.layers import Dense

Dense(11, input_dim=8)
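To see what this layer creates, the sketch below builds it inside a small model and inspects its weights. It declares the eight inputs with an explicit Input object rather than input_dim; that substitution is an assumption made so the example runs across Keras versions, and both forms describe the same eight input attributes.

```python
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

hidden = Dense(11)                               # 11 neurons in the first hidden layer
model = Sequential([Input(shape=(8,)), hidden])  # 8 input attributes

# Each of the 8 inputs connects to each of the 11 neurons, plus one bias per neuron.
kernel, bias = hidden.get_weights()
print(kernel.shape)  # (8, 11)
print(bias.shape)    # (11,)
```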
A few characteristics shared by layers of various types include their activation functions and weight initialization process.
Weight Initialization –
The kernel_initializer argument (called init in early Keras releases) specifies the kind of initialization applied to a layer's weights. Common choices include:
- Uniform (random_uniform): Weights are initialized to small, uniformly distributed random values between -0.05 and 0.05.
- Normal (random_normal): Weights are initialized to small Gaussian random values (zero mean and standard deviation of 0.05).
- Zero (zeros): All weights are set to zero.
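The initialization schemes listed above can be compared with a short sketch. It assumes the string identifiers random_uniform, random_normal, and zeros accepted by the kernel_initializer argument in current Keras releases; the layer sizes are illustrative.

```python
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

kernels = {}
for name in ("random_uniform", "random_normal", "zeros"):
    layer = Dense(11, kernel_initializer=name)
    Sequential([Input(shape=(8,)), layer])    # building the model creates the weights
    kernels[name] = layer.get_weights()[0]    # kernel matrix of shape (8, 11)

# Print the smallest and largest initial weight for each scheme.
for name, kernel in kernels.items():
    print(name, float(kernel.min()), float(kernel.max()))
```

The random schemes print different values on every run, while the zeros scheme always reports 0.0 for both bounds.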
Activation Function –
A variety of common neuron activation functions, including softmax, rectified linear, tanh, and sigmoid, are supported by Keras. The activation argument, which accepts a string value, is often used to specify the kind of activation function a layer will utilize.
- For Regression problems, you can use the ReLU activation function in the hidden layers; the output layer usually uses a linear activation (or ReLU when the targets cannot be negative).
- For Classification problems, ReLU is preferable in the hidden layers, while the output layer should use sigmoid for binary classification or softmax for multi-class classification.
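To make the differences between these activation functions concrete, the sketch below applies four of them to the same small input. The input values are arbitrary and chosen only for illustration.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import activations

x = tf.constant([[-2.0, 0.0, 2.0]])  # one row with three arbitrary values

relu = np.asarray(activations.relu(x))        # negative inputs become 0
sigmoid = np.asarray(activations.sigmoid(x))  # squashes each value into (0, 1)
tanh = np.asarray(activations.tanh(x))        # squashes each value into (-1, 1)
softmax = np.asarray(activations.softmax(x))  # each row sums to 1

for name, values in [("relu", relu), ("sigmoid", sigmoid),
                     ("tanh", tanh), ("softmax", softmax)]:
    print(name, values)
```

In a layer, the same functions are selected by name, for example Dense(11, activation="relu").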
Model Compilation –
After the model is defined, it must be compiled. Compilation creates the efficient structures that TensorFlow needs to train the model; specifically, TensorFlow converts the model into a graph so that training runs quickly.
The compile() function is used to compile your model, and it accepts three important arguments:
- Model optimizer
- Loss function
- Metrics
model.compile(optimizer=…, loss=…, metrics=…)
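A minimal sketch of a complete compile call is shown here. The model architecture and the particular choices of adam, mse, and mae are assumptions for illustration; any valid optimizer, loss, and metric names could be used instead.

```python
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

# A small regression-style model; sizes are illustrative assumptions.
model = Sequential([
    Input(shape=(8,)),
    Dense(11, activation="relu"),
    Dense(1),
])

# String names select the optimizer, loss, and metrics with default settings.
model.compile(optimizer="adam", loss="mse", metrics=["mae"])

print(type(model.optimizer).__name__)  # Adam
```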
The optimizer is the search technique used to update the weights in your model. You can create an optimizer object and pass it to the compile function through the optimizer argument. This lets you customize the optimization process with specific arguments such as the learning rate. For instance –
from tensorflow.keras.optimizers import SGD

model.compile(optimizer=SGD())
You can choose from a number of well-known gradient descent optimizers, including –
- SGD – stochastic gradient descent, with support for momentum.
- RMSprop – an adaptive learning rate optimization method.
- Adam – adaptive moment estimation, which also takes advantage of adaptive learning rates.
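The optimizers listed above can each be created as an object and configured before being passed to compile. The learning rate and momentum values in this sketch are illustrative assumptions, not recommendations.

```python
from tensorflow.keras.optimizers import SGD, RMSprop, Adam

# Hyperparameter values are illustrative assumptions.
optimizers = {
    "sgd": SGD(learning_rate=0.01, momentum=0.9),
    "rmsprop": RMSprop(learning_rate=0.001),
    "adam": Adam(learning_rate=0.001),
}

# get_config() reports the settings each optimizer was created with.
for name, opt in optimizers.items():
    print(name, opt.get_config()["learning_rate"])
```

Any of these objects can then be supplied as model.compile(optimizer=optimizers["sgd"], ...).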
Loss Function –
The loss function, also known as the objective function, is the measure of model error that the optimizer uses to navigate the weight space. The loss argument of the compile function allows you to specify the name of the loss function to use. Typical examples include:
- mse – mean squared error.
- binary_crossentropy – binary logarithmic loss.
- categorical_crossentropy – multi-class logarithmic loss.
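To show what these loss functions compute, the sketch below evaluates each one on a few hand-made predictions. The target and prediction values are invented for illustration, and the loss classes are used directly rather than through compile.

```python
import numpy as np
from tensorflow.keras.losses import (
    BinaryCrossentropy,
    CategoricalCrossentropy,
    MeanSquaredError,
)

# Two samples with one target each (values invented for illustration).
y_true = np.array([[1.0], [0.0]], dtype="float32")
y_pred = np.array([[0.9], [0.2]], dtype="float32")

mse = float(MeanSquaredError()(y_true, y_pred))    # mean of 0.1**2 and 0.2**2
bce = float(BinaryCrossentropy()(y_true, y_pred))  # mean of -log(0.9) and -log(0.8)

# One sample with three classes, one-hot encoded; the true class is index 1.
y_true_cat = np.array([[0.0, 1.0, 0.0]], dtype="float32")
y_pred_cat = np.array([[0.1, 0.8, 0.1]], dtype="float32")
cce = float(CategoricalCrossentropy()(y_true_cat, y_pred_cat))  # -log(0.8)

print(round(mse, 4), round(bce, 4), round(cce, 4))
```

Lower values mean the predictions are closer to the targets, which is the quantity the optimizer tries to reduce.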
Model Metrics –
Metrics are evaluated by the model during training. Keras offers the following metrics for use with regression problems –
- mean_squared_error or mse
- mean_absolute_error or mae
- mean_absolute_percentage_error or mape
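The regression metrics listed above can be checked by hand on a tiny example. This sketch uses the metric classes from tensorflow.keras.metrics directly; the target and prediction values are invented so the arithmetic is easy to follow.

```python
import numpy as np
from tensorflow.keras.metrics import (
    MeanAbsoluteError,
    MeanAbsolutePercentageError,
    MeanSquaredError,
)

# Targets 2 and 4, predictions 1 and 5: each prediction is off by 1.
y_true = np.array([[2.0], [4.0]], dtype="float32")
y_pred = np.array([[1.0], [5.0]], dtype="float32")

metrics = {
    "mse": MeanSquaredError(),
    "mae": MeanAbsoluteError(),
    "mape": MeanAbsolutePercentageError(),
}

results = {}
for name, metric in metrics.items():
    metric.update_state(y_true, y_pred)
    results[name] = float(metric.result())

print(results)  # mse 1.0, mae 1.0, mape 37.5 (mean of 50% and 25%)
```

During training, the same metrics are requested by name, for example metrics=["mae", "mape"] in the compile call.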
Model Training –
The model is trained using the fit() function, for instance –
model.fit(X, y, epochs=…, batch_size=…)
Both the batch size and the number of epochs are specified during training.
- The number of epochs (epochs) indicates how many times the full training dataset is presented to the model.
- The batch size (batch_size) determines how many training examples are presented to the model before a weight update is carried out.
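The relationship between epochs, batch size, and weight updates can be sketched on synthetic data. The dataset, the model, and the values of epochs and batch_size are all assumptions chosen for illustration.

```python
import math

import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

# Synthetic regression data: 100 samples with 8 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8)).astype("float32")
y = X.sum(axis=1, keepdims=True)

model = Sequential([Input(shape=(8,)), Dense(11, activation="relu"), Dense(1)])
model.compile(optimizer="adam", loss="mse")

epochs, batch_size = 5, 32
history = model.fit(X, y, epochs=epochs, batch_size=batch_size, verbose=0)

# 100 samples in batches of 32 means 4 weight updates per epoch (the last batch is smaller).
updates_per_epoch = math.ceil(len(X) / batch_size)
print(updates_per_epoch, len(history.history["loss"]))  # 4 5
```

A smaller batch size gives more weight updates per epoch, at the cost of noisier gradient estimates.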
The fit function can also evaluate the model during training, for example on a held-out portion of the training data set aside with the validation_split argument. Fitting the model returns a history object containing the loss and metrics calculated at each epoch, which you can use to plot model performance.
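As a hedged sketch of evaluation during training, the example below assumes the validation_split argument of fit() is used to hold out part of the data, and reads the per-epoch values from the returned history object. The data and model are synthetic and illustrative.

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8)).astype("float32")
y = X.sum(axis=1, keepdims=True)

model = Sequential([Input(shape=(8,)), Dense(11, activation="relu"), Dense(1)])
model.compile(optimizer="adam", loss="mse")

# Hold out the last 20% of the rows for validation after each epoch.
history = model.fit(X, y, epochs=5, batch_size=16,
                    validation_split=0.2, verbose=0)

# history.history maps each tracked quantity to a list with one value per epoch.
losses = history.history["loss"]
val_losses = history.history["val_loss"]
for epoch, (loss, val_loss) in enumerate(zip(losses, val_losses), start=1):
    print(f"epoch {epoch}: loss={loss:.4f} val_loss={val_loss:.4f}")
```

These two lists are what you would pass to a plotting library to graph training and validation loss against the epoch number.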
Once your model has been trained, you can use it to predict outcomes using test data or fresh data. Your trained model can produce a variety of distinct output kinds, each of which is generated using a separate function call on your model object. For instance:
- model.evaluate(): To calculate the loss (and any compiled metrics) for the given input data.
- model.predict(): To produce the network output for the given input data.
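The two calls above can be sketched on a small trained model. The synthetic data, the architecture, and the choice of mae as a metric are assumptions for illustration; in practice you would evaluate on data the model has not seen.

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8)).astype("float32")
y = X.sum(axis=1, keepdims=True)

model = Sequential([Input(shape=(8,)), Dense(11, activation="relu"), Dense(1)])
model.compile(optimizer="adam", loss="mse", metrics=["mae"])
model.fit(X, y, epochs=3, batch_size=32, verbose=0)

# evaluate() returns the loss followed by each compiled metric.
loss, mae = model.evaluate(X, y, verbose=0)
print(f"loss={loss:.4f} mae={mae:.4f}")

# predict() returns one row of network output per input row.
predictions = model.predict(X[:5], verbose=0)
print(predictions.shape)  # (5, 1)
```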
Once you are satisfied with your model, you can finalize it. You may also want to output a model summary, which you can display by calling the summary function, as in model.summary().
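A minimal sketch of the summary call follows, using an illustrative model with eight inputs, 11 hidden neurons, and one output. The architecture is an assumption; the printed table lists each layer with its output shape and parameter count.

```python
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

model = Sequential([
    Input(shape=(8,)),
    Dense(11, activation="relu"),   # 8 * 11 weights + 11 biases = 99 parameters
    Dense(1),                       # 11 * 1 weights + 1 bias = 12 parameters
])

model.summary()                     # prints a table of layers and parameter counts
print(model.count_params())         # 111
```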
In the next article, I am going to discuss the Multi-Layer Perceptron Digit-Classifier using TensorFlow. Here, in this article, I try to explain Building Multi-Layer Perceptron Neural Network Models. I hope you enjoy this article. Please post your feedback, suggestions, and questions about it.
About the Author: Pranaya Rout
Pranaya Rout has published more than 3,000 articles in his 11-year career. Pranaya Rout has very good experience with Microsoft Technologies, Including C#, VB, ASP.NET MVC, ASP.NET Web API, EF, EF Core, ADO.NET, LINQ, SQL Server, MYSQL, Oracle, ASP.NET Core, Cloud Computing, Microservices, Design Patterns and still learning new technologies.