### Data modelling

There are many techniques you can use to model your data; which one you choose depends on the problem you’re trying to solve:

- Classification can be used to answer a Yes/No question or, more generally, to assign an observation to a category
- Regression can be used to predict a numerical value
- Clustering can group observations so that those in the same group are similar to one another
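
To make the regression case concrete, here is a minimal sketch that fits a straight line by ordinary least squares using only the Python standard library. The function name `fit_line` and the data values are made up for illustration; a real analysis would normally use a statistics or machine learning library.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x, and how x and y vary together
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Illustrative data lying exactly on the line y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

intercept, slope = fit_line(xs, ys)

# Predict a numerical value for an x that was not in the data
prediction = intercept + slope * 4.0
```

Because the illustrative points sit exactly on a line, the fit recovers that line; with real, noisy data the fitted line would only approximate the observations.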

### Predictive modelling

When you already have a specific outcome or definition to group your data by, predictive modelling can be a useful alternative to clustering. Variables found to be statistically significant predictors of that outcome can then be used to define the segments for your analysis.

### Statistical learning

Statistical learning places more emphasis on mathematics and statistical models, including how those models are interpreted and how precise they are.

### Data mining

The technology used for collecting, storing, processing, transforming and analysing raw data in order to make it useful for gaining insights.

### Overfitting

A term describing a model that has been fitted so closely to the data it was built and tuned on, often through many rounds of iteration, that it performs more accurately on that data than it would on any new data.
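
The effect can be sketched with a deliberately over-flexible model. The example below is an assumption-laden illustration, not a recommended method: it uses a 1-nearest-neighbour classifier, which simply memorises its training points, and synthetic data in which 20% of the labels are randomly flipped. All names (`make_data`, `nearest_label`, `accuracy`) are hypothetical.

```python
import random


def make_data(rng, n):
    """Synthetic data: the label is 1 when x > 0.5, but 20% of labels are flipped."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < 0.2:
            label = 1 - label  # label noise the model should not learn
        data.append((x, label))
    return data


def nearest_label(x, train):
    """Predict the label of the training point closest to x (1-nearest-neighbour)."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def accuracy(data, train):
    """Share of points in `data` whose label is predicted correctly."""
    hits = sum(nearest_label(x, label_source) == label
               for (x, label), label_source in ((d, train) for d in data))
    return hits / len(data)


rng = random.Random(0)
train = make_data(rng, 50)
new = make_data(rng, 50)

# Each training point is its own nearest neighbour, so the noise is memorised
train_accuracy = accuracy(train, train)
# New data does not share that noise, so accuracy here is usually lower
new_accuracy = accuracy(new, train)
```

The gap between `train_accuracy` and `new_accuracy` is the symptom of overfitting: the model has learned the random noise in its own data as well as the underlying pattern.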

### Regularisation

Regularisation, also referred to as ‘shrinkage’, is the process of adding information to your model, typically a penalty on large coefficients, as a technique for avoiding overfitting in machine learning.
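
One common form of regularisation is a ridge (squared) penalty on the model coefficients. As a minimal sketch, assuming a one-variable linear model with no intercept, minimising `sum((y - slope * x)^2) + penalty * slope^2` gives the closed-form slope used below. The function name `ridge_slope` and the data are illustrative only.

```python
def ridge_slope(xs, ys, penalty):
    """Slope of y = slope * x that minimises
    sum((y - slope * x)^2) + penalty * slope^2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # The penalty is added to the denominator, pulling the slope towards zero
    return sxy / (sxx + penalty)


# Illustrative data, roughly y = 2x
xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]

unpenalised = ridge_slope(xs, ys, penalty=0.0)  # ordinary least squares
shrunk = ridge_slope(xs, ys, penalty=5.0)       # regularised estimate
```

The larger the penalty, the more the coefficient is shrunk towards zero, which is where the name ‘shrinkage’ comes from. Choosing the size of the penalty is itself a tuning decision.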

### Cross validation

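Cross validation is a process for evaluating a machine learning algorithm by repeatedly holding back part of the data for testing and training on the rest; it is also a technique to help prevent overfitting. Nested cross validation tunes the parameters of an algorithm in an inner loop, while an outer loop estimates how well the tuned algorithm performs.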
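
A minimal sketch of k-fold cross validation, using only the Python standard library, is shown here. It assumes a deliberately trivial model that always predicts the mean of its training data, and it splits the data into contiguous folds without shuffling; the names `k_fold_indices` and `cross_validate` and the data values are hypothetical.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds = []
    start = 0
    for i in range(k):
        # The first n % k folds take one extra item
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(ys, k):
    """Average held-out mean squared error of a 'predict the training mean' model."""
    scores = []
    for held_out in k_fold_indices(len(ys), k):
        held = set(held_out)
        train = [y for i, y in enumerate(ys) if i not in held]
        prediction = sum(train) / len(train)  # "fit" on the training folds only
        errors = [(ys[i] - prediction) ** 2 for i in held_out]
        scores.append(sum(errors) / len(errors))
    return sum(scores) / len(scores)


ys = [3.0, 5.0, 4.0, 6.0, 5.0, 7.0]
folds = k_fold_indices(len(ys), 3)
cv_error = cross_validate(ys, 3)
```

Every observation is held out exactly once, so the averaged score reflects performance on data the model did not see while it was being fitted. In practice the data is usually shuffled before it is split.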