andersch.dev

<2024-02-28 Wed>
[ ai ]

Neural Network

Neural networks are a class of machine learning models inspired by the human brain, consisting of interconnected layers of nodes (neurons). They are composed of an input layer, one or more hidden layers, and an output layer.
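
To make the layer structure concrete, here is a minimal forward pass through a tiny 2-3-1 network in plain Python (a sketch of my own, not from the lecture below; the weights are made-up illustrative values):

import math

x  = [0.5, -1.0]                               # input layer: 2 inputs
W1 = [[0.2, -0.4], [0.7, 0.1], [-0.5, 0.3]]    # hidden layer: 3 neurons, 2 weights each
b1 = [0.0, 0.1, -0.1]
W2 = [[0.6, -0.2, 0.8]]                        # output layer: 1 neuron
b2 = [0.05]

# each neuron computes a weighted sum of its inputs plus a bias, then applies tanh
hidden = [math.tanh(sum(w*v for w, v in zip(ws, x)) + b) for ws, b in zip(W1, b1)]
output = [math.tanh(sum(w*v for w, v in zip(ws, hidden)) + b) for ws, b in zip(W2, b2)]
print(output)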

There are various types of neural networks, such as feedforward neural networks (FNN), convolutional neural networks (CNN), deep neural networks (DNN) and recurrent neural networks (RNN).

The Spelled-out Intro to Neural Networks and Backpropagation

https://github.com/karpathy/nn-zero-to-hero/blob/master/lectures/micrograd/micrograd_lecture_first_half_roughly.ipynb

import math  # needed by Value.tanh below

class Value:

  def __init__(self, data, _children=(), _op='', label=''):
    self.data = data
    self.grad = 0.0
    self._backward = lambda: None
    self._prev = set(_children)
    self._op = _op
    self.label = label

  def __repr__(self):
    return f"Value(data={self.data})"

  def __add__(self, other):
    out = Value(self.data + other.data, (self, other), '+')

    def _backward():
      # addition routes the gradient through unchanged: d(out)/d(self) = d(out)/d(other) = 1
      self.grad += 1.0 * out.grad
      other.grad += 1.0 * out.grad
    out._backward = _backward

    return out

  def __mul__(self, other):
    out = Value(self.data * other.data, (self, other), '*')

    def _backward():
      # chain rule for a product: d(out)/d(self) = other.data, d(out)/d(other) = self.data
      self.grad += other.data * out.grad
      other.grad += self.data * out.grad
    out._backward = _backward

    return out

  def tanh(self):
    x = self.data
    t = (math.exp(2*x) - 1)/(math.exp(2*x) + 1)
    out = Value(t, (self, ), 'tanh')

    def _backward():
      self.grad += (1 - t**2) * out.grad  # d/dx tanh(x) = 1 - tanh(x)**2
    out._backward = _backward

    return out

  def backward(self):
    # run every node's _backward in reverse topological order, so a node's
    # gradient is complete before it is propagated to its children
    topo = []
    visited = set()
    def build_topo(v):
      if v not in visited:
        visited.add(v)
        for child in v._prev:
          build_topo(child)
        topo.append(v)
    build_topo(self)

    self.grad = 1.0  # the gradient of the root with respect to itself is 1
    for node in reversed(topo):
      node._backward()


a = Value(2.0, label='a')
b = Value(-3.0, label='b')
c = Value(10.0, label='c')
e = a*b; e.label = 'e'
d = e + c; d.label = 'd'
f = Value(-2.0, label='f')
L = d * f; L.label = 'L'
print(L)  # Value(data=-8.0)
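
To check the implementation, we can run backpropagation from L and inspect the gradients; the expected values follow from the chain rule (e.g. dL/da = dL/de * de/da = f*b = 6.0):

L.backward()
print(a.grad, b.grad, c.grad, f.grad)  # 6.0 -4.0 -2.0 4.0
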
from graphviz import Digraph

def trace(root):
  # builds a set of all nodes and edges in a graph
  nodes, edges = set(), set()
  def build(v):
    if v not in nodes:
      nodes.add(v)
      for child in v._prev:
        edges.add((child, v))
        build(child)
  build(root)
  return nodes, edges

def draw_dot(root):
  dot = Digraph(format='svg', graph_attr={'rankdir': 'LR'}) # LR = left to right

  nodes, edges = trace(root)
  for n in nodes:
    uid = str(id(n))
    # for any value in the graph, create a rectangular ('record') node for it
    dot.node(name = uid, label = "{ %s | data %.4f | grad %.4f }" % (n.label, n.data, n.grad), shape='record')
    if n._op:
      # if this value is a result of some operation, create an op node for it
      dot.node(name = uid + n._op, label = n._op)
      # and connect this node to it
      dot.edge(uid + n._op, uid)

  for n1, n2 in edges:
    # connect n1 to the op node of n2
    dot.edge(str(id(n1)), str(id(n2)) + n2._op)

  return dot
draw_dot(L)
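
Outside a notebook, the SVG can be written to disk instead of displayed inline (using graphviz's render method; the filename is arbitrary):

draw_dot(L).render('graph')  # writes the DOT source to 'graph' and the image to 'graph.svg'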

https://youtu.be/VMj-3S1tku0?feature=shared&t=1925

Types of Neural Networks

Feedforward Neural Network (FNN):

  • Simplest type; connections between nodes do not form cycles.
  • Information moves in one direction, from the input layer to the output layer.
  • Use Cases: Classification and regression

Deep Neural Network (DNN):

  • Type of feedforward neural network with multiple hidden layers (hence "deep").
  • DNNs can learn complex patterns in data by stacking layers of neurons.
  • Use Cases: Speech recognition, Image classification, NLP, …

Convolutional Neural Network (CNN):

  • Specialized type designed for processing structured grid data, such as images.
  • Uses convolutional layers to automatically detect and learn spatial hierarchies of features (like edges, textures, and shapes).
  • Use Cases: Image recognition, object detection, video analysis

Recurrent Neural Network (RNN):

  • Designed for sequential data.
  • Have connections that loop back on themselves, allowing them to maintain a memory of previous inputs.
  • Suitable for tasks where context is important.
  • Use Cases: Time series prediction, NLP, and speech recognition.
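
As a concrete reference, the four types map onto standard PyTorch building blocks. This is a sketch assuming PyTorch is installed; the layer sizes are arbitrary:

import torch
import torch.nn as nn

# FNN/DNN: stacked fully connected layers, no cycles;
# "deep" just means more than one hidden layer
dnn = nn.Sequential(nn.Linear(784, 128), nn.Tanh(),
                    nn.Linear(128, 64),  nn.Tanh(),
                    nn.Linear(64, 10))

# CNN: convolutions detect local spatial features in grid data such as images
cnn = nn.Sequential(nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
                    nn.Flatten())

# RNN: a hidden state is carried across the steps of a sequence
rnn = nn.RNN(input_size=10, hidden_size=32, batch_first=True)

print(dnn(torch.randn(1, 784)).shape)      # torch.Size([1, 10])
print(cnn(torch.randn(1, 3, 8, 8)).shape)  # torch.Size([1, 1024])
out, h = rnn(torch.randn(1, 5, 10))        # batch of 1, 5 time steps
print(out.shape, h.shape)                  # torch.Size([1, 5, 32]) torch.Size([1, 1, 32])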

Terminology

  • MLP: Multi-Layer Perceptron, the classic type of neural network, built from fully connected layers
  • Layer: A group of neurons that operate on the same inputs
  • Weights: Strengths of the connections between neurons
  • Bias: A per-neuron constant added to the weighted sum of the neuron's inputs
  • Epoch: One full pass through the entire training data
  • Activation Function: A nonlinear function applied to a neuron's weighted sum (e.g. tanh)
  • Binary Classifiers: Models that map an input vector to one of two classes
  • Supervised Learning: Training on pairs of inputs and expected outputs
  • Mini-Batching: Updating weights and biases several times per epoch, once per small batch of samples
  • Learning Rate: A factor that scales how far the weights move on each update
  • Gradient Descent: The optimization algorithm used to train neural networks: repeatedly nudge each weight against its gradient to reduce the cost (see the sketch after this list)
  • Cost Function: Measures how wrong the network's output is for a given input; training minimizes it
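
Most of these terms can be tied together with the Value class from above. This is a minimal sketch (not from the lecture notebook excerpted above) that trains a single tanh neuron on one example with gradient descent; with a one-example dataset, every loop iteration is also an epoch:

import random

w1, w2 = Value(random.uniform(-1, 1)), Value(random.uniform(-1, 1))  # weights
b = Value(0.0)                                                       # bias

x1, x2 = Value(1.0), Value(-2.0)  # one training input
target = 1.0                      # expected output (supervised learning)

for epoch in range(50):
    out = (x1*w1 + x2*w2 + b).tanh()   # weighted sum + bias, then activation
    diff = out + Value(-target)
    loss = diff * diff                 # cost function: squared error
    for p in (w1, w2, b):
        p.grad = 0.0                   # reset gradients before backprop
    loss.backward()
    for p in (w1, w2, b):
        p.data += -0.1 * p.grad        # gradient descent with learning rate 0.1

print(out.data, loss.data)  # out approaches the target, loss approaches 0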

Resources