ANN Building Blocks part 1

Bengt Sennblad, NBIS

Biological Neurons

[Figure: biological neuron]

Algorithm

  • Multiple inputs (on/off):
    • from 1-several neurons
  • Processing:
    • Combination: of inputs
    • Activation: on or off state
  • Single output (on/off): to 1-several neurons

Artificial neurons

[Figure: artificial neuron]

Algorithm

  • Multiple inputs:
    • from 1-several neurons
  • Processing:
    • Combination: of inputs – linear model
    • Activation: activation function
  • Single output: to 1-several neurons



Weighted linear combination of inputs:

\[ \begin{eqnarray*} z_j &=& \sum_{i} w_{i,j} a'_{i} + b_j\\ \textrm{weights}&& w_{i,j}\\ \textrm{bias}&& b_j \end{eqnarray*} \]

Activation function:
  • e.g., the Sigmoid (logistic) activation function
\[a_j = \sigma(z_j)\]
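A minimal numpy sketch of this computation (the input values, weights, and bias below are made-up illustration values, not from the slides):

```python
import numpy as np

def sigmoid(z):
    """Sigmoid (logistic) activation: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron(a_in, w, b):
    """One artificial neuron: weighted linear combination + activation."""
    z = np.dot(w, a_in) + b   # z_j = sum_i w_{i,j} a'_i + b_j
    return sigmoid(z)         # a_j = sigma(z_j)

# made-up inputs and parameters, purely for illustration
a_in = np.array([0.2, 0.9, 0.4])
w = np.array([0.5, -0.3, 0.8])
b = 0.1
print(neuron(a_in, w, b))
```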

The Sigmoid Neuron




Weighted linear combination of inputs:
  • \(z_j = \sum_{i} w_{i,j} a'_{i} + b_j\)
Sigmoid/logistic activation function
  • \(a_j=\sigma(z_j) = \frac{1}{1+e^{-z_j}}\)
Compare with logistic GLM
  • Weighted linear combination of inputs:
    • \(z = \sum_{i} \beta_{i} x_{i} + \alpha\)
  • Sigmoid/logistic link function
    • \(Pr[y=1|x] = p = \sigma(z) = \frac{1}{1+e^{-z}}\)

… or equivalently

\(\begin{eqnarray} \sigma^{-1}(p) &=& \log\left(\frac{p}{1-p}\right) = \mathrm{logit}(p)\\ \mathrm{logit}(p) &=& \sum_{i}\beta_{i} x_{i} + \alpha \end{eqnarray}\)
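A quick numerical check of this equivalence (the value of \(z\) is arbitrary): the logit is the inverse of the sigmoid, so applying it to \(p\) recovers the linear predictor.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):
    return np.log(p / (1.0 - p))

z = 1.7                          # arbitrary linear predictor value
p = sigmoid(z)                   # Pr[y=1|x] under the logistic model
print(np.isclose(logit(p), z))   # True: logit(p) recovers z
```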

Example

[Figure: example neuron with weights and bias]

Let inputs be:
\(\begin{eqnarray} a'_1&=&1\\ a'_2&=&0\\ a'_3&=&1 \end{eqnarray}\)


and we have
\(\begin{eqnarray} z_1 &=& \sum_i w_{i,1}a'_i + b_1\\ a_1 &=& \sigma(z_1) \end{eqnarray}\)


With weights \(w_{1,1}=0.3\), \(w_{2,1}=0.8\), \(w_{3,1}=0.2\) and bias \(b_1=-0.5\) (as in the figure):

\(z_1 = 0.3 \times 1 + 0.8 \times 0 + 0.2 \times 1 - 0.5 = 0\)

\(a_1 = \sigma(z_1) = \frac{1}{1+e^{-0}} = 0.5\)
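The same arithmetic checked in code (weights and bias as in the example above):

```python
import numpy as np

a_prime = np.array([1.0, 0.0, 1.0])   # inputs a'_1, a'_2, a'_3
w = np.array([0.3, 0.8, 0.2])         # weights w_{i,1}
b = -0.5                              # bias b_1

z1 = np.dot(w, a_prime) + b           # 0.3*1 + 0.8*0 + 0.2*1 - 0.5 = 0.0
a1 = 1.0 / (1.0 + np.exp(-z1))        # sigma(0) = 0.5
print(z1, a1)                         # 0.0 0.5
```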

So, if a sigmoid artificial neuron is just another way of doing logistic regression …

… then what’s all the fuss about?

The fuss happens when you connect several neurons into a network

[Figure: neurons connected into a network]

Feed-forward artificial neural networks (ffANN)

Layers

  • “Columns” of 1-many neurons

  • A single Input layer

    • Input neurons receive data input and pass it to the next layer
  • 1-many Hidden layer(s)

    • Artificial neurons process their input and deliver output to the next layer
  • A single Output layer

    • Artificial neurons process their input and deliver the final output \(\hat{y}\)
      • output \(\hat{y}_j = a_j\)
      • Continuous \(\hat{y}\): Regression
      • Discrete \(\hat{y}\): Classification

Connectivity between layers

  • ffANNs are fully connected (“dense” layers)
    • each neuron in a layer is connected to each neuron in the next layer (see the sketch below)

[Figure: ANN1x]
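A minimal sketch of such a fully connected forward pass in numpy; the layer sizes and the random weights are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dense(a_in, W, b):
    """Fully connected layer: every input neuron feeds every neuron here."""
    return sigmoid(W @ a_in + b)

# made-up architecture: 3 inputs -> 4 hidden -> 2 hidden -> 1 output
sizes = [3, 4, 2, 1]
params = [(rng.normal(size=(m, n)), rng.normal(size=m))
          for n, m in zip(sizes[:-1], sizes[1:])]

a = rng.normal(size=3)   # input layer activations
for W, b in params:      # two hidden layers, then the output layer
    a = dense(a, W, b)
print(a)                 # final output y_hat
```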

ffANN examples

[Figure: ANN1]

Other drawing style, omitting \(w\) and \(b\).

[Figure: ANN1alt]

[Figure: ANN2]

Often layers are ‘boxed’

[Figure: ANN2alt]

ffANN examples

Layers with >1 dimension (e.g., images) – (messy!)

[Figure: multi-dimensional layers with all nodes and arrows drawn]

Simplify! Nodes and arrows implicit.

[Figure: simplified drawing, nodes and arrows implicit]

Collect similar layers into ‘blocks’.

[Figure: similar layers collected into blocks]

ffANN examples

Also other types of layers/blocks (cf. coming lectures)

[Figure: other layer/block types]

Hidden Layers

Intuitive function of hidden layers?

  • Each layer can be viewed as transforming the original data to a new multi-dimensional space.
  • A hidden layer should, in practice, have at least two neurons to be meaningful
    • A single-neuron layer collapses information and forms a bottleneck
    • An early bottleneck heavily constrains the NN (see the sketch below)
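A shape-level illustration of the bottleneck point (layer sizes made up): a single-neuron hidden layer squeezes everything before it into one number, which is all that later layers get to see.

```python
import numpy as np

rng = np.random.default_rng(1)

x = rng.normal(size=10)               # 10-dimensional input
W1 = rng.normal(size=(1, 10))         # bottleneck: a hidden layer with ONE neuron
h = 1.0 / (1.0 + np.exp(-(W1 @ x)))   # shape (1,): all information collapsed
print(h.shape)                        # (1,) -- later layers see a single value
```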

Depth of ANN

  • the number of hidden layers + the output layer

Deep Learning

  • Formally, ANNs with depth > 1
    • (often including more advanced layer types as well)

Why Deep Learning?

For Regression

  • Single layer \(\approx\) logistic regression
  • More layers \(\rightarrow\)
    • more complex, non-linear models



For Classification

  • Single layer \(\approx\) one hyperplane
  • Adding layers \(\rightarrow\)
    • more hyperplanes \(\rightarrow\)
    • more advanced classification
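As a concrete illustration, here is a hand-wired sketch (weights picked by hand, not trained) of a one-hidden-layer network computing XOR, a classification that no single hyperplane can achieve:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hand-picked weights: hidden unit 1 ~ OR, hidden unit 2 ~ AND,
# output ~ (OR and not AND), i.e. XOR. Large weights saturate the sigmoids.
W1 = np.array([[20.0, 20.0],
               [20.0, 20.0]])
b1 = np.array([-10.0, -30.0])
W2 = np.array([20.0, -20.0])
b2 = -10.0

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    h = sigmoid(W1 @ np.array(x, dtype=float) + b1)   # hidden layer
    y = sigmoid(W2 @ h + b2)                          # output layer
    print(x, round(float(y)))   # 0, 1, 1, 0: XOR
```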

Mini exercise

  • http://playground.tensorflow.org/
    • Try different input “problems”
    • Investigate how depth and width affect classification
      • number of hidden layers (depth)
      • number of neurons per layer (width)
    • Run for several epochs (=iterations)