Neural Networks in Python YouTube tutorial

https://www.youtube.com/watch?v=aBIGJeHRZLQ

Hyperparameters

  • Batch size: How many data points pass through the network during each training step
  • Number of Hidden Layers
  • Number of Neurons per layer
  • Learning Rate: How much we update the network's weights at each step
  • Optimizer: Algorithm used to update the neural network
    • Adam is very popular
  • Dropout: Probability that nodes are randomly disconnected during training. If we drop out nodes randomly, the rest of the network has to compensate. Our training data will never be complete, and dropout helps simulate those unknowns.
  • Epochs: How many full passes we make through the training data
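
A minimal Keras sketch showing where each of these hyperparameters plugs in. The toy data, layer sizes, and values below are illustrative assumptions, not recommendations:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical toy data: 1,000 samples, 20 features, 4 classes.
X = np.random.rand(1000, 20).astype("float32")
y = np.random.randint(0, 4, size=1000)

model = keras.Sequential([
    keras.Input(shape=(20,)),
    layers.Dense(64, activation="relu"),    # hidden layer 1: 64 neurons
    layers.Dropout(0.2),                    # dropout: ~20% of nodes dropped each step
    layers.Dense(64, activation="relu"),    # hidden layer 2: 64 neurons
    layers.Dense(4, activation="softmax"),  # output layer: one probability per class
])

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),  # optimizer + learning rate
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Batch size and epochs are set when training starts.
model.fit(X, y, batch_size=32, epochs=10, validation_split=0.2)
```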

How do we choose layers, neurons, and hyperparameters

  • Use training performance (with a validation split) to guide your decisions
    • High accuracy on the training set but not on validation means you are overfitting - reduce the number of parameters.
    • Low accuracy on training and validation alike means you are underfitting - increase the number of parameters.
  • Automatically search for the best hyperparameters with a grid search over learning rate, batch size, optimizer, dropout, etc. (see the sketch below).
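
A hand-rolled grid search might look like the sketch below: rebuild and retrain the model for every combination of values, and keep whichever scores best on the validation split. The search space and the `build_model` factory are hypothetical:

```python
import itertools
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def build_model(learning_rate, dropout):
    # Hypothetical factory: rebuilds the model for each hyperparameter combination.
    model = keras.Sequential([
        keras.Input(shape=(20,)),
        layers.Dense(64, activation="relu"),
        layers.Dropout(dropout),
        layers.Dense(4, activation="softmax"),
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

X = np.random.rand(500, 20).astype("float32")
y = np.random.randint(0, 4, size=500)

best_params, best_acc = None, 0.0
for lr, bs, drop in itertools.product([1e-2, 1e-3], [16, 32], [0.0, 0.2]):
    model = build_model(lr, drop)
    history = model.fit(X, y, batch_size=bs, epochs=5,
                        validation_split=0.2, verbose=0)
    val_acc = history.history["val_accuracy"][-1]
    if val_acc > best_acc:
        best_params, best_acc = (lr, bs, drop), val_acc

print("best hyperparameters:", best_params, "validation accuracy:", best_acc)
```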

Activation functions

Activation functions introduce non-linearity into our neural net calculations. Without them, stacked layers collapse into a single linear transformation; with them, the network can fit more complex data and compute more complex functions.

Examples:

  • Sigmoid
  • Tanh
  • ReLU
  • Leaky ReLU
  • Maxout
  • ELU
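
For reference, here is how the functions above can be written in plain NumPy. This is a sketch: the `alpha` values are common defaults (an assumption), and `maxout` is shown over precomputed linear projections.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))         # squashes to (0, 1)

def tanh(x):
    return np.tanh(x)                        # squashes to (-1, 1), zero-centered

def relu(x):
    return np.maximum(0.0, x)                # zero for negatives, identity for positives

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)     # small slope for negatives instead of zero

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))  # smooth negative saturation

def maxout(*projections):
    # Maxout takes the elementwise max over several linear projections of the input.
    return np.maximum.reduce(projections)

x = np.linspace(-3.0, 3.0, 7)
print(relu(x))   # [0. 0. 0. 0. 1. 2. 3.]
```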

Hidden Layers

To start, ReLU isn't a bad way to go in your hidden layers. It avoids the vanishing gradient problem and is usually a safe bet, though your mileage may vary.

Output Layer

The softmax function is good for single-label classification, where exactly one class is correct. (Ex: Is it red, yellow, blue, or green?)

Sigmoid is good for multi-label classification, where each label is predicted independently. (Ex: What is the color and shape? Label 1: color, Label 2: shape)
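
A sketch of both choices as Keras output layers (the output sizes are hypothetical):

```python
from tensorflow.keras import layers

# Single-label: exactly one of, say, 4 colors is correct.
# Softmax makes the outputs a probability distribution that sums to 1.
single_label_output = layers.Dense(4, activation="softmax")

# Multi-label: each label (e.g. 4 colors AND 3 shapes) is scored independently.
# Sigmoid gives each output its own probability in (0, 1).
multi_label_output = layers.Dense(7, activation="sigmoid")  # 4 color + 3 shape outputs
```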

Keras vs PyTorch

Keras (uses TensorFlow under the hood)

  • Great for getting started quickly & rapid experimentation.
  • Lacks control & customization for more complex projects.

TensorFlow

  • Historically the most popular framework in industry
  • Can get pretty complicated & documentation isn't always consistent.

PyTorch

  • Favorite of the research / academic community
  • Very Pythonic syntax; you can easily access values throughout the network.
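
For comparison with the Keras sketches above, here is roughly the same small classifier and one training step in PyTorch (sizes and values are again illustrative assumptions):

```python
import torch
from torch import nn

# Roughly the same small classifier as the Keras examples above.
model = nn.Sequential(
    nn.Linear(20, 64),   # hidden layer: 20 features in, 64 neurons out
    nn.ReLU(),
    nn.Dropout(p=0.2),
    nn.Linear(64, 4),    # output layer: raw logits for 4 classes
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()  # applies log-softmax to the logits internally

# One hypothetical training step on a batch of 32 random samples.
x = torch.randn(32, 20)
y = torch.randint(0, 4, (32,))

optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()      # gradients and intermediate values are directly inspectable
optimizer.step()
print(loss.item())
```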