Neural Network
Neural networks are a family of machine learning models inspired by the human brain, consisting of interconnected layers of nodes (neurons): an input layer, one or more hidden layers, and an output layer.
There are various types of neural networks, such as feedforward neural networks (FNN), deep neural networks (DNN), convolutional neural networks (CNN), and recurrent neural networks (RNN).
The Spelled-out Intro to Neural Networks and Backpropagation
```python
import math

class Value:
    def __init__(self, data, _children=(), _op='', label=''):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(_children)
        self._op = _op
        self.label = label

    def __repr__(self):
        return f"Value(data={self.data})"

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other), '+')
        def _backward():
            self.grad += 1.0 * out.grad
            other.grad += 1.0 * out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other), '*')
        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def tanh(self):
        x = self.data
        t = (math.exp(2*x) - 1) / (math.exp(2*x) + 1)
        out = Value(t, (self,), 'tanh')
        def _backward():
            self.grad += (1 - t**2) * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # topological order: all children come before their parents
        topo = []
        visited = set()
        def build_topo(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build_topo(child)
                topo.append(v)
        build_topo(self)
        # seed the output gradient, then apply the chain rule in reverse order
        self.grad = 1.0
        for node in reversed(topo):
            node._backward()

a = Value(2.0, label='a')
b = Value(-3.0, label='b')
c = Value(10.0, label='c')
e = a*b; e.label = 'e'
d = e + c; d.label = 'd'
f = Value(-2.0, label='f')
L = d * f; L.label = 'L'
L
```
```python
from graphviz import Digraph

def trace(root):
    # builds a set of all nodes and edges in a graph
    nodes, edges = set(), set()
    def build(v):
        if v not in nodes:
            nodes.add(v)
            for child in v._prev:
                edges.add((child, v))
                build(child)
    build(root)
    return nodes, edges

def draw_dot(root):
    dot = Digraph(format='svg', graph_attr={'rankdir': 'LR'})  # LR = left to right
    nodes, edges = trace(root)
    for n in nodes:
        uid = str(id(n))
        # for any value in the graph, create a rectangular ('record') node for it
        dot.node(name=uid,
                 label="{ %s | data %.4f | grad %.4f }" % (n.label, n.data, n.grad),
                 shape='record')
        if n._op:
            # if this value is a result of some operation, create an op node for it
            dot.node(name=uid + n._op, label=n._op)
            # and connect this node to it
            dot.edge(uid + n._op, uid)
    for n1, n2 in edges:
        # connect n1 to the op node of n2
        dot.edge(str(id(n1)), str(id(n2)) + n2._op)
    return dot

draw_dot(L)
```
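Calling `L.backward()` populates each node's `grad` with dL/d(node). As a sanity check, here is a sketch of the same gradients computed by hand with plain floats and the chain rule, no `Value` class needed:

```python
# Forward pass with plain floats (mirrors the Value example above)
a, b, c, f = 2.0, -3.0, 10.0, -2.0
e = a * b          # -6.0
d = e + c          #  4.0
L = d * f          # -8.0

# Backward pass by hand, applying the chain rule from L down to the leaves.
# For a * node, the gradient flowing in is multiplied by the other operand;
# for a + node, it passes through unchanged.
dL_dL = 1.0
dL_dd = f * dL_dL        # -2.0
dL_df = d * dL_dL        #  4.0
dL_de = 1.0 * dL_dd      # -2.0
dL_dc = 1.0 * dL_dd      # -2.0
dL_da = b * dL_de        #  6.0
dL_db = a * dL_de        # -4.0
```

These match the `grad` fields that `draw_dot(L)` displays after running `L.backward()`.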
Types of Neural Networks
Feedforward Neural Network (FNN):
- Simplest type where connections between nodes do not form cycles
- Information moves in one direction, from input to output
- Use Cases: Classification and regression
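A single feedforward neuron can be sketched in a few lines: a weighted sum of the inputs plus a bias, pushed through a nonlinear activation. The input, weight, and bias values below are made up for illustration:

```python
import math

x = [1.0, -2.0]   # inputs (illustrative values)
w = [0.5, 0.25]   # weights: strengths of the connections
b = 0.1           # bias: constant added to the weighted sum

# weighted sum of inputs plus bias, then a nonlinear activation
z = sum(wi * xi for wi, xi in zip(w, x)) + b
out = math.tanh(z)
```

A full FNN just stacks many such neurons into layers and feeds each layer's outputs forward as the next layer's inputs.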
Deep Neural Network (DNN):
- Type of feedforward neural network with multiple hidden layers (hence "deep").
- DNNs can learn complex patterns in data by stacking layers of neurons.
- Use Cases: Speech recognition, Image classification, NLP, …
Convolutional Neural Network (CNN):
- Specialized type designed for processing structured grid data, such as images.
- Uses convolutional layers to automatically detect and learn spatial hierarchies of features (like edges, textures, and shapes).
- Use Cases: Image recognition, object detection, video analysis
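The core CNN operation can be sketched as sliding a small kernel over a 2D grid and taking a dot product at each position. The image and kernel values below are made up for illustration (in a real CNN the kernel weights are learned):

```python
# a tiny 4x4 "image" (illustrative values)
image = [
    [1, 2, 0, 1],
    [0, 1, 3, 1],
    [2, 1, 0, 0],
    [1, 0, 1, 2],
]
# a 2x2 kernel that responds to horizontal changes (illustrative, not learned)
kernel = [[1, -1],
          [1, -1]]

def conv2d(img, k):
    # slide the kernel over every valid position and take a dot product
    kh, kw = len(k), len(k[0])
    out_h, out_w = len(img) - kh + 1, len(img[0]) - kw + 1
    out = [[0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            out[i][j] = sum(k[a][b] * img[i + a][j + b]
                            for a in range(kh) for b in range(kw))
    return out

features = conv2d(image, kernel)   # 3x3 feature map
```

Stacking such layers lets later kernels combine earlier features into the spatial hierarchy (edges → textures → shapes) mentioned above.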
Recurrent Neural Network (RNN):
- Designed for sequential data.
- Connections loop back on themselves, allowing the network to maintain a memory of previous inputs.
- Suitable for tasks where context is important.
- Use Cases: Time series prediction, NLP, and speech recognition.
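The RNN recurrence can be sketched with a single scalar hidden state: each step mixes the current input with the previous hidden state, which is how the "memory" arises. The weights and input sequence below are made-up illustration values:

```python
import math

w_x, w_h, b = 0.5, 0.8, 0.0   # input weight, recurrent weight, bias (illustrative)
h = 0.0                        # initial hidden state

for x in [1.0, 0.0, -1.0]:     # a short input sequence
    # the new hidden state depends on both the current input and the old state
    h = math.tanh(w_x * x + w_h * h + b)
```

After the loop, `h` still carries information from every earlier input, not just the last one.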
Terminology
- MLP: Multi-Layer Perceptron, the classic type of neural network
- Layer: A group of neurons
- Weights: Strengths of connections between neurons
- Bias: A constant added to a neuron's weighted sum of inputs before the activation function
- Epoch: Going once through the entire training data
- Activation Function: A nonlinear function that the output goes through
- Binary Classifiers: Take a vector of floats in and output one of two classes (e.g. 0 or 1)
- Supervised Learning: Training by giving input and expected output
- Mini-Batching: Updating weights and biases several times during an epoch, using a small subset of the training data for each update
- Learning Rate: How much to update the weights during training
- Gradient Descent: An optimization algorithm that repeatedly updates the weights in the direction that reduces the cost function; the standard way neural networks are trained.
- Cost Function: Measures how far the network's output is from the expected output; training minimizes it.
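Several of these terms can be tied together in one sketch: supervised pairs, a cost function, a learning rate, epochs, and gradient descent, here fitting a single weight `w` so that `y ≈ w * x`. The data, learning rate, and epoch count are made-up illustration values:

```python
# supervised learning: pairs of (input, expected output); here y = 2x
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = 0.0                # initial weight
learning_rate = 0.05   # how much to update the weight per step

for epoch in range(100):                # one epoch = one pass over the data
    # gradient of the mean-squared-error cost with respect to w
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= learning_rate * grad           # gradient descent: step against the gradient
```

After training, `w` is close to 2.0, the value that minimizes the cost. Mini-batching would compute `grad` on a subset of `data` and update several times per epoch instead of once.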