1. Assumptions of a linear regression model
The relationship between the independent and dependent variables is linear.
The observations are independent of each other.
There is no multicollinearity, which means the independent variables are not highly correlated with each other.
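One rough way to screen for the multicollinearity assumption above is to compute pairwise Pearson correlations between the predictors and flag pairs above a cutoff. The sketch below is a minimal illustration, not a full diagnostic: the column names, the data values, and the 0.9 threshold are all made up for the example, and pairwise correlation can miss collinearity that involves three or more variables.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def highly_correlated_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for predictor pairs with |r| above the threshold."""
    names = list(columns)
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = pearson(columns[names[i]], columns[names[j]])
            if abs(r) > threshold:
                flagged.append((names[i], names[j], r))
    return flagged

# Hypothetical predictors: size_sqft and size_sqm carry the same information.
predictors = {
    "size_sqft": [800, 950, 1100, 1400, 2000],
    "size_sqm": [74.3, 88.3, 102.2, 130.1, 185.8],
    "age_years": [30, 5, 18, 42, 11],
}
flagged = highly_correlated_pairs(predictors)
```

Here only the two size columns should be flagged, which suggests dropping one of them before fitting the regression.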
2. Types of neural networks and how they are applied in the real world
Generative Adversarial Networks (GANs) - Consist of a generator and a discriminator that are trained simultaneously in a competitive manner: the generator learns to produce realistic data while the discriminator learns to tell generated data from real data.
Used in image generation and image conversion
Transformer - Relies solely on attention mechanisms, eliminating the need for recurrence and convolutions
Used in language translation and human-like text generation
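To make the attention mechanism a little more concrete, here is a minimal sketch of scaled dot-product attention, softmax(QK^T / sqrt(d))V, written with plain Python lists. It is a toy under stated assumptions: the vectors are invented, and it leaves out the learned projections, multiple heads, and masking that a real Transformer layer would have.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention on lists of vectors.

    Returns the output vectors and the attention weights for each query.
    """
    d = len(keys[0])  # key dimension, used for scaling
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Toy, made-up vectors: the query points the same way as the first key.
queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
outputs, weights = attention(queries, keys, values)
```

Because the query matches the first key more closely, the first value vector receives the larger weight and dominates the output.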
3. Types of clustering algorithms
Clustering is an example of unsupervised learning, where data points are grouped into clusters without using labels.
K-Means Clustering - Algorithm separates data into k groups by assigning each point to its nearest center point, then moving each center to the mean of its group, and repeating until the centers change very little.
Hierarchical Clustering - Algorithm creates a tree of clusters by measuring how far apart data points are from each other. In the bottom-up method, it starts with each point as its own cluster and merges the closest clusters step by step.
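Both algorithms can be sketched in a few lines of plain Python. The code below is an illustrative sketch rather than a production implementation: the points and starting centers are made up, the K-Means function takes its initial centers as an argument instead of choosing them randomly, and the bottom-up function uses single linkage (distance between the two closest members) as one possible way to measure how far apart clusters are.

```python
import math

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def k_means(points, centers, tol=1e-6, max_iter=100):
    """Assign points to the nearest center, recompute centers, stop when they barely move."""
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Keep the old center if a group ends up empty.
        new_centers = [mean_point(g) if g else c for g, c in zip(groups, centers)]
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, groups

def agglomerative(points, n_clusters):
    """Bottom-up clustering: start with singletons and merge the two closest clusters."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: gap between the closest pair of members.
                gap = min(math.dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or gap < best[0]:
                    best = (gap, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters

# Made-up 2-D points forming two obvious groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centers, groups = k_means(points, centers=[points[0], points[3]])
clusters = agglomerative(points, n_clusters=2)
```

On this toy data both methods should recover the same two groups. Stopping the merge loop at a chosen number of clusters is equivalent to cutting the cluster tree at that level.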
4. Types of decision trees
Classification Trees - Produce categorical outputs. Designed to assign inputs to one of two or more classes using rules learned from training data.
Regression Trees - Produce numeric outputs. Each leaf predicts a single value, representing the expected value of the response variable for inputs that reach that leaf.
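As a small illustration of how a regression tree arrives at its leaf values, the sketch below fits a one-split tree (a "stump") on a single feature: it tries each midpoint between neighboring x values, keeps the threshold with the lowest total squared error, and predicts the mean response on each side. The data is invented and the function names are hypothetical. A classification tree would follow the same pattern but choose splits by an impurity measure and predict the majority class in each leaf.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

def sse(values):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(xs, ys):
    """Find the threshold on one feature that minimises total squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    ordered = sorted(set(xs))
    for lo, hi in zip(ordered, ordered[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        error = sse(left) + sse(right)
        if best is None or error < best[0]:
            best = (error, threshold, mean(left), mean(right))
    return best[1:]  # (threshold, left leaf prediction, right leaf prediction)

def predict(stump, x):
    """Return the leaf value for the side of the threshold that x falls on."""
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value

# Made-up data: the response jumps once x passes 3.
xs = [1, 2, 3, 4, 5, 6]
ys = [5.0, 6.0, 5.5, 20.0, 21.0, 19.0]
stump = best_split(xs, ys)
```

A full regression tree would apply the same split search recursively to each side until a stopping rule (such as a minimum leaf size) is met.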
5. Types of random forest methods
To avoid the overfitting that a single decision tree is prone to, this technique builds multiple decision trees and lets them vote on how to classify inputs. (Data Science from Scratch by Joel Grus, 2015)
Bootstrap Aggregating - Instead of training each tree on all the inputs, you train each tree on a bootstrap sample, drawn with replacement from the training data. Combining trees trained on different samples reduces variance and overfitting.
Feature Randomness - Instead of searching for the best split among all features at each node, each tree searches only a random subset of the features. This increases diversity among the trees, which typically makes the combined model more accurate.
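The two sources of randomness, plus the final vote, can be sketched with the standard library alone. This is a minimal illustration and not a working forest: the rows are invented, the helper names are hypothetical, and the tree-building step itself is left out, so the votes at the end are hard-coded stand-ins for tree predictions.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement; some rows repeat, others are left out."""
    return [rng.choice(rows) for _ in rows]

def random_feature_subset(n_features, k, rng):
    """Pick k distinct feature indices to consider at one split."""
    return sorted(rng.sample(range(n_features), k))

def majority_vote(predictions):
    """Return the most common label among the trees' predictions."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Made-up rows: (weather, temperature, label).
rows = [
    ("sunny", 30, "no"),
    ("rainy", 18, "yes"),
    ("cloudy", 22, "yes"),
    ("sunny", 25, "no"),
]

# Bootstrap aggregating: one resampled training set per tree.
samples = [bootstrap_sample(rows, rng) for _ in range(3)]

# Feature randomness: only a random subset of features is considered at a split.
candidates = random_feature_subset(n_features=2, k=1, rng=rng)

# Stand-in predictions from three trees, combined by voting.
vote = majority_vote(["yes", "no", "yes"])
```

Each tree would be trained on its own entry in `samples`, calling `random_feature_subset` again at every split it considers.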
GAN - Anime Image Creation
GAN - Image Conversion
Transformer - Language Translation
Transformer - Text Generation
Clustering - Bottom Up Hierarchical
Clustering - K-Means
Decision - Regression Tree
Decision - Classification Tree
Random Forest - Bootstrap Aggregation
Random Forest - Feature Randomness