# Month: May 2020

## Applied statistics

Applied statistics is used to solve practical problems with data. For each type of experiment, specific steps must be followed to collect, analyse and interpret the data. Though applied statistics and data science can overlap at times, it is usually the case that applied statistics is being used by …

## Nonparametric statistics

When your data do not follow a normal distribution, nonparametric statistical methods are generally needed. A number of nonparametric tests use ordinal rather than numerical data, with the values ranked and sorted in order. For example, nonparametric statistics could analyse survey responses ranked on a Likert scale: Strongly Agree, Agree, Neither agree nor …
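The ranking step can be illustrated with a short sketch in plain Python. It assumes a five-point coding of the Likert scale (the labels below "Neither agree nor disagree" are an assumption, since the excerpt is cut off) and uses the Mann-Whitney U statistic as one example of a rank-based test. The function names `midranks` and `mann_whitney_u` and the two response groups are hypothetical.

```python
# Assumed five-point coding; the lower two labels are an assumption,
# because the excerpt above is truncated.
LIKERT = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Neither agree nor disagree": 3,
    "Agree": 4,
    "Strongly Agree": 5,
}


def midranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """U statistic for group_a, computed from the rank sum of the pooled data."""
    ranks = midranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    rank_sum_a = sum(ranks[:n_a])
    return rank_sum_a - n_a * (n_a + 1) / 2


# Hypothetical survey responses from two groups.
group_a = [LIKERT[r] for r in ["Strongly Agree", "Agree", "Agree"]]
group_b = [LIKERT[r] for r in ["Neither agree nor disagree", "Disagree", "Agree"]]
u_statistic = mann_whitney_u(group_a, group_b)  # 8.0 out of a maximum of 9
```

Only the order of the responses matters here, not the distance between the codes, which is why no normality assumption is needed. Turning U into a p-value requires exact tables or a normal approximation, which this sketch leaves out.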

## Natural language processing (NLP)

Natural language processing (NLP) is a branch of artificial intelligence that uses computer algorithms to understand human language. NLP aims to read, understand and learn from human languages and to provide insights about the data.

## Z-tests

A one-sample z-test compares the sample mean of a variable with a population mean, whilst a two-sample z-test compares the means of two different groups. The difference is compared with the estimated standard error to conclude whether there is evidence that the population means differ. It is generally accepted that a sample size …
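As a rough illustration of the one-sample case, the sketch below divides the difference between the sample mean and the population mean by the standard error, then converts the result to a two-sided p-value. It assumes the population standard deviation is known; the function name `one_sample_z_test` and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, population_mean, population_sd):
    """Return (z, two-sided p-value), assuming the population SD is known."""
    standard_error = population_sd / sqrt(len(sample))
    z = (mean(sample) - population_mean) / standard_error
    # Probability of a value at least this extreme in either tail.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: 16 observations with a sample mean of 52.
sample = [50, 54] * 8
z, p = one_sample_z_test(sample, population_mean=50, population_sd=8)
# z is 1.0 here: the sample mean lies one standard error above 50.
```

A small p-value would count as evidence that the population mean differs from the hypothesised value; here the p-value is roughly 0.32, so there is no such evidence.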

## T-tests

T-tests can be used with any sample size, and neither the mean nor the standard deviation of the population needs to be known. Although the t-test relies on the assumption of a normal distribution, its probability values are based on the t-distribution. The test is appropriate when either the population is normal or the …
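A minimal sketch of the one-sample t statistic follows. Unlike the z-test, the standard error is estimated from the sample itself. The function name `one_sample_t_statistic` and the data are hypothetical, and the sketch stops at the statistic and its degrees of freedom, because the Python standard library does not provide the t-distribution needed for a p-value.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t_statistic(sample, hypothesised_mean):
    """Return (t, degrees of freedom) using the sample standard deviation."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev uses the n - 1 divisor
    t = (mean(sample) - hypothesised_mean) / standard_error
    return t, n - 1


# Hypothetical sample of eight measurements.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
t, degrees_of_freedom = one_sample_t_statistic(sample, hypothesised_mean=4)
```

The statistic would then be compared against the t-distribution with `degrees_of_freedom` degrees of freedom, for example with a statistics package such as SciPy.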

## Confidence intervals

A confidence interval describes a range of values that is likely to include the true value for a population. The upper and lower confidence limits are the two numbers that bound the interval. Confidence intervals do not provide certain answers; they are estimates based on a sample. …
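The sketch below shows one common way to compute the lower and upper confidence limits for a mean. It assumes the normal approximation is acceptable; for small samples a t-based critical value would usually be preferred. The function name `mean_confidence_interval` and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Return (lower, upper) limits for the mean using a normal approximation."""
    # Critical value, about 1.96 for a 95% interval.
    critical = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = critical * stdev(sample) / sqrt(len(sample))
    centre = mean(sample)
    return centre - margin, centre + margin


# Hypothetical sample of eight measurements with a mean of 5.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
lower, upper = mean_confidence_interval(sample)
```

Raising `confidence` widens the interval, and a larger sample narrows it.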

## Analysis techniques

Time series analysis: A general term for tracking trends in findings over a period of time, usually in graphical form, which can be used to forecast future values.

Survival analysis: A statistical method of estimating the expected time until an event occurs. Example: a human being's expected lifespan being assessed using …
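For the time series entry, a moving average is one simple way to expose a trend and produce a rough forecast. The sketch below is only an illustration of that idea, not a full forecasting method; the function names `moving_average` and `naive_forecast` and the monthly figures are hypothetical.

```python
def moving_average(series, window):
    """Smooth a series by averaging each run of `window` consecutive values."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and the series length")
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


def naive_forecast(series, window):
    """Forecast the next value as the mean of the last `window` observations."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and the series length")
    return sum(series[-window:]) / window


# Hypothetical monthly figures.
monthly_values = [10, 12, 14, 16, 18]
smoothed = moving_average(monthly_values, window=3)   # [12.0, 14.0, 16.0]
next_value = naive_forecast(monthly_values, window=3)  # 16.0
```

Note that this forecast lags behind a steadily rising series, which is one reason more elaborate models are used in practice.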

## Chart types

Histogram: Similar in appearance to a bar graph, except that the bars touch each other. Histograms are formed from grouped data and display the frequency or relative frequency (percentage) of each class in a sample.

Scattergrams: A method of displaying the correlation between two or more variables, including a line of best …
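For the histogram entry, the grouping step behind the chart can be sketched as follows. The code assumes equal-width classes that include their lower bound and exclude their upper bound; the function name `histogram_counts` and the data are hypothetical.

```python
def histogram_counts(data, class_width, start=0):
    """Group values into equal-width classes and count them.

    Returns {class lower bound: (frequency, relative frequency)}.
    Classes with no observations are omitted.
    """
    counts = {}
    for value in data:
        # Index of the class [lower, lower + class_width) containing value.
        index = int((value - start) // class_width)
        lower = start + index * class_width
        counts[lower] = counts.get(lower, 0) + 1
    total = len(data)
    return {
        lower: (count, count / total)
        for lower, count in sorted(counts.items())
    }


# Hypothetical sample grouped into classes of width 5.
sample = [1, 2, 2, 3, 5, 6, 6, 9]
classes = histogram_counts(sample, class_width=5)
# {0: (4, 0.5), 5: (4, 0.5)}
```

Each key would become one bar, with the bar height taken from either the frequency or the relative frequency.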

## JavaScript

A language commonly used in web design, JavaScript complements and integrates with HTML. Despite its name, it is a separate language from Java. JavaScript packages include numerous libraries for charts, graphs and other data visualisations. Its syntax resembles that of C-family languages such as C#.

## Python

Python is an open-source language used for detailed statistical analysis, testing and modelling. It is considered object-oriented and is often used for building reusable code patterns. Popular Python packages for data science include: NumPy (Numerical Python, for performing calculations over entire arrays); Matplotlib (for data visualisations); SciPy (for scientific and technical computing) …