Activation Functions and Types of Neural Network Activation Functions

Why do we need Activation Functions?

Without an activation function, the weights and biases can only perform a linear transformation, so the neural network is just a linear regression model. A linear equation is a polynomial of degree one, which is simple to solve but limited in its ability to model complex problems or higher-degree relationships.

In contrast, adding an activation function lets the neural network apply a non-linear transformation to its input, making it capable of solving complex problems such as language translation and image classification.
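As a minimal sketch of why this matters (the layer weights below are made up for illustration), stacking two layers without an activation function collapses into a single linear layer:

import numpy as np

# hypothetical weights and biases for two layers with no activation in between
W1, b1 = np.array([[2.0, -1.0], [0.5, 3.0]]), np.array([1.0, 0.0])
W2, b2 = np.array([[1.0, 1.0], [-2.0, 0.5]]), np.array([0.0, 2.0])

x = np.array([1.5, -0.5])

# two linear layers applied one after the other
two_layers = W2 @ (W1 @ x + b1) + b2

# the same result from a single combined linear layer
W_combined, b_combined = W2 @ W1, W2 @ b1 + b2
one_layer = W_combined @ x + b_combined

print(np.allclose(two_layers, one_layer))  # True

No matter how many such layers we stack, the result is still one linear transformation, which is why the non-linearity has to come from the activation function.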

3 Types of Neural Network Activation Functions:-

  • 1-Binary Step Function

  • 2-Linear Activation Function

  • 3-Non-Linear Activation Functions

1. Binary Step Function:-

This activation function is very basic, and it comes to mind whenever we want to bound the output. It is essentially a threshold-based classifier: we choose a threshold value to decide whether the neuron should be activated or deactivated.

f(x) = 1 if x >= 0  else 0


code in python:-

input = [0, 2, 1, 3.3, -2.7, 1.1, 2.2]
output = []
for i in input:
    if i >= 0:
        output.append(1)
    else:
        output.append(0)
print(output)  # [1, 1, 1, 1, 0, 1, 1]

2. Linear Activation Function:-

It is a simple straight-line activation function in which the output is directly proportional to the weighted sum of the inputs. A linear activation function gives a wide range of activations, and a line with a positive slope increases the firing rate as the input increases.

Y = mZ       where m is the slope of the line, i.e. (y2 - y1)/(x2 - x1)
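As a small sketch in the same loop style as the other examples (the slope m = 1 is an assumed value), the linear activation simply scales the weighted sum:

m = 1  # assumed slope for this sketch
input = [0, 2, 1, 3.3, -2.7, 1.1, 2.2]
output = []
for i in input:
    output.append(m * i)  # output is directly proportional to the input
print(output)

Because the derivative of Y = mZ is the constant m, the gradient carries no information about the input, which is one reason purely linear activations are rarely used in hidden layers.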

3. Non-Linear Activation Functions:-

1. ReLU (Rectified Linear Unit) Activation Function:-

The Rectified Linear Unit (ReLU) is the most widely used activation function right now. Its output ranges from 0 to infinity: all negative values are converted to zero. Because such neurons output zero for every negative input, they can stop contributing to learning (the 'dying ReLU' problem), but where there is a problem there is a solution. ReLU is used very often in the hidden layers of deep networks.

f(x) = x if x >= 0  else 0, i.e. f(x) = max(0, x)

code in python

input = [0, 2, 1, 3.3, -2.7, 1.1, 2.2]
output = []
for i in input:
    if i >= 0:
        output.append(i)
    else:
        output.append(0)

or, equivalently:

output = []
for i in input:
    output.append(max(0, i))
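As an aside, the same ReLU output can be computed in one vectorized numpy call instead of a loop (a small sketch of an alternative):

import numpy as np

output = np.maximum(0, np.array(input))  # element-wise max with 0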

2. Leaky ReLU Activation Function:-

The Leaky ReLU activation function was introduced to solve the 'dying ReLU' problem. As discussed for ReLU, all negative input values are turned into zero. With Leaky ReLU we do not set negative inputs to zero but to a small value near zero, which solves the major issue of the ReLU activation function.

f(x) = x if x >= 0  else a*x, where a is a small slope used for negative inputs (commonly around 0.01)

code in python

a = 0.01  # small slope used for negative inputs
input = [0, 2, 1, 3.3, -2.7, 1.1, 2.2]
output = []
for i in input:
    if i >= 0:
        output.append(i)
    else:
        output.append(a * i)

or, equivalently:

output = []
for i in input:
    output.append(max(a * i, i))

3. Sigmoid (Logistic) Activation Function:-

This function takes any real value as input and outputs values in the range of 0 to 1. The larger (more positive) the input, the closer the output will be to 1, whereas the smaller (more negative) the input, the closer the output will be to 0, as shown below.

f(x) = 1/(1 + e**-x)

code in python

import math

input = [0, 2, 1, 3.3, -2.7, 1.1, 2.2]
output = []
for i in input:
    output.append(1 / (1 + math.exp(-i)))

4. Hyperbolic Tangent (Tanh) Activation Function:-

This activation function is slightly better than the sigmoid function. Like the sigmoid, it is also used to predict or differentiate between two classes, but it maps negative inputs to negative outputs, and its range is between -1 and 1.

tanh(x) = sinh(x)/cosh(x) = (e**(2x) - 1)/(e**(2x) + 1) = -i*tan(i*x)

code in python

import math

input = [0, 2, 1, 3.3, -2.7, 1.1, 2.2]
output = []
for i in input:
    output.append(math.tanh(i))


5. Softmax Activation Function:-

The softmax activation applies the softmax function across the channel (class) dimension of the input data. It normalizes the values along that dimension so that they sum to one, so the output of the softmax function can be regarded as a probability distribution.

It combines two operations:

1- exponentiation: each value is mapped to e**x

2- normalization: f(x_i) = exp(x_i)/(sum of exp-values)

code in python

import numpy as np

class Softmax:
    def forward(self, inputs):
        # subtract the row-wise max for numerical stability
        exp_values = np.exp(inputs - np.max(inputs, axis=1, keepdims=True))
        # normalize so each row sums to one
        probabilities = exp_values / np.sum(exp_values, axis=1, keepdims=True)
        self.output = probabilities
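A quick usage sketch (the 2x3 batch of scores below is made up for illustration):

scores = np.array([[1.0, 2.0, 3.0],
                   [2.0, 2.0, 2.0]])
softmax = Softmax()
softmax.forward(scores)
print(softmax.output)               # each row is a probability distribution
print(softmax.output.sum(axis=1))   # [1. 1.]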

6- Swish

It is a self-gated activation function developed by researchers at Google. Swish consistently matches or outperforms the ReLU activation function on deep networks applied to challenging domains such as image classification and machine translation.

f(x) = x*sigmoid(x) = x/(1 + e**-x)

code in python

import math

input = [0, 2, 1, 3.3, -2.7, 1.1, 2.2]
output = []
for i in input:
    output.append(i / (1 + math.exp(-i)))  # x * sigmoid(x)


7-Scaled Exponential Linear Unit (SELU):-

SELU was defined for self-normalizing networks and takes care of internal normalization, which means each layer preserves the mean and variance from the previous layers. SELU enables this normalization by adjusting the mean and variance. SELU has both positive and negative values to shift the mean, which is impossible for the ReLU activation function since it cannot output negative values. Gradients can be used to adjust the variance; the activation function needs a region with a gradient larger than one to increase it. SELU is a relatively newer activation function and needs more research on architectures such as CNNs and RNNs, where it is comparatively less explored.

f(x) = λ*x if x >= 0  else λ*α*(e**x - 1), where λ ≈ 1.0507 and α ≈ 1.6733 are fixed constants chosen so that activations self-normalize
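Following the same loop style as the earlier snippets, a minimal sketch of SELU using the standard constants λ ≈ 1.0507 and α ≈ 1.6733:

import math

lam = 1.0507    # scale λ from the SELU paper
alpha = 1.6733  # α from the SELU paper

input = [0, 2, 1, 3.3, -2.7, 1.1, 2.2]
output = []
for i in input:
    if i >= 0:
        output.append(lam * i)
    else:
        output.append(lam * alpha * (math.exp(i) - 1))
print(output)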

