Neural Networks in Python YouTube tutorial

https://www.youtube.com/watch?v=aBIGJeHRZLQ

Hyperparameters

  • Batch size: How many data points pass through the network during each training step
  • Number of Hidden Layers
  • Number of Neurons per layer
  • Learning Rate: How much we update the network's weights at each step
  • Optimizer: Algorithm used to update the neural network
    • Adam is very popular
  • Dropout: Probability that nodes are randomly disconnected during training. If we drop out nodes randomly, the rest of the network has to compensate. Our training data will never be complete, and dropout helps simulate those unknowns.
  • Epochs: How many full passes we make through the training data
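
A minimal Keras sketch showing where each of these hyperparameters plugs in. The toy data, layer sizes, and values below are illustrative assumptions, not recommendations:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical toy data: 1,000 samples, 20 features, 4 classes.
X = np.random.rand(1000, 20).astype("float32")
y = np.random.randint(0, 4, size=1000)

model = keras.Sequential([
    keras.Input(shape=(20,)),
    layers.Dense(64, activation="relu"),    # hidden layer 1: 64 neurons
    layers.Dropout(0.2),                    # dropout: ~20% of nodes dropped each step
    layers.Dense(64, activation="relu"),    # hidden layer 2: 64 neurons
    layers.Dense(4, activation="softmax"),  # output layer: one probability per class
])

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),  # optimizer + learning rate
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Batch size and epochs are set when training starts.
model.fit(X, y, batch_size=32, epochs=10, validation_split=0.2)
```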

How do we choose layers, neurons, and hyperparameters

  • Use training performance (with a validation split) to guide your decisions
    • High accuracy on the training set but not on validation means you are overfitting - reduce the number of parameters.
    • Low accuracy on training and validation alike means you are underfitting - increase the number of parameters.
  • Automatically search for the best hyperparameters with a grid search over learning rate, batch size, optimizer, dropout, etc. (see the sketch below).
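
A hand-rolled grid search might look like the sketch below: rebuild and retrain the model for every combination of values, and keep whichever scores best on the validation split. The search space and the `build_model` factory are hypothetical:

```python
import itertools
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def build_model(learning_rate, dropout):
    # Hypothetical factory: rebuilds the model for each hyperparameter combination.
    model = keras.Sequential([
        keras.Input(shape=(20,)),
        layers.Dense(64, activation="relu"),
        layers.Dropout(dropout),
        layers.Dense(4, activation="softmax"),
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

X = np.random.rand(500, 20).astype("float32")
y = np.random.randint(0, 4, size=500)

best_params, best_acc = None, 0.0
for lr, bs, drop in itertools.product([1e-2, 1e-3], [16, 32], [0.0, 0.2]):
    model = build_model(lr, drop)
    history = model.fit(X, y, batch_size=bs, epochs=5,
                        validation_split=0.2, verbose=0)
    val_acc = history.history["val_accuracy"][-1]
    if val_acc > best_acc:
        best_params, best_acc = (lr, bs, drop), val_acc

print("best hyperparameters:", best_params, "validation accuracy:", best_acc)
```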

Activation functions

Activation functions introduce non-linearity into our neural net calculations. Without them, stacked layers collapse into a single linear transformation; with them, the network can fit more complex data and compute more complex functions.

Examples:

  • Sigmoid
  • Tanh
  • ReLU
  • Leaky ReLU
  • Maxout
  • ELU
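
For reference, here is how the functions above can be written in plain NumPy. This is a sketch: the `alpha` values are common defaults (an assumption), and `maxout` is shown over precomputed linear projections.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))         # squashes to (0, 1)

def tanh(x):
    return np.tanh(x)                        # squashes to (-1, 1), zero-centered

def relu(x):
    return np.maximum(0.0, x)                # zero for negatives, identity for positives

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)     # small slope for negatives instead of zero

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))  # smooth negative saturation

def maxout(*projections):
    # Maxout takes the elementwise max over several linear projections of the input.
    return np.maximum.reduce(projections)

x = np.linspace(-3.0, 3.0, 7)
print(relu(x))   # [0. 0. 0. 0. 1. 2. 3.]
```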

Hidden Layers

To start, ReLU isn't a bad way to go in your hidden layers. It avoids the vanishing gradient problem and is usually a safe bet, though your mileage may vary.

Output Layer

The softmax function is good for single-label classification, where exactly one class is correct. (Ex: Is it red, yellow, blue, or green?)

Sigmoid is good for multi-label classification, where each label is predicted independently. (Ex: What is the color and shape? Label 1: color, Label 2: shape)
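
A sketch of both choices as Keras output layers (the output sizes are hypothetical):

```python
from tensorflow.keras import layers

# Single-label: exactly one of, say, 4 colors is correct.
# Softmax makes the outputs a probability distribution that sums to 1.
single_label_output = layers.Dense(4, activation="softmax")

# Multi-label: each label (e.g. 4 colors AND 3 shapes) is scored independently.
# Sigmoid gives each output its own probability in (0, 1).
multi_label_output = layers.Dense(7, activation="sigmoid")  # 4 color + 3 shape outputs
```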

Keras vs PyTorch

Keras (uses TensorFlow under the hood)

  • Great for getting started quickly & rapid experimentation.
  • Lacks control & customization for more complex projects.

TensorFlow

  • Historically the most popular framework in industry
  • Can get pretty complicated & documentation isn't always consistent.

PyTorch

  • Favorite of the research / academic community
  • Very Pythonic syntax; you can easily access values throughout the network.
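
For comparison with the Keras sketches above, here is roughly the same small classifier and one training step in PyTorch (sizes and values are again illustrative assumptions):

```python
import torch
from torch import nn

# Roughly the same small classifier as the Keras examples above.
model = nn.Sequential(
    nn.Linear(20, 64),   # hidden layer: 20 features in, 64 neurons out
    nn.ReLU(),
    nn.Dropout(p=0.2),
    nn.Linear(64, 4),    # output layer: raw logits for 4 classes
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()  # applies log-softmax to the logits internally

# One hypothetical training step on a batch of 32 random samples.
x = torch.randn(32, 20)
y = torch.randint(0, 4, (32,))

optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()      # gradients and intermediate values are directly inspectable
optimizer.step()
print(loss.item())
```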