THE THREE-SIGMA RULE
By RAFAŁ WAŚKO (Predictive Solutions) The three-sigma rule is an important tool in statistics and quality management. In the context of data analysis, it allows the identification of outlier points that are significantly different from the rest of the data. The use of...
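As a rough illustration of the idea in this teaser, the rule can be sketched as flagging every value that lies more than three standard deviations from the mean. The function name `three_sigma_outliers` and the choice of the population standard deviation are assumptions made for this sketch, not details taken from the article.

```python
import statistics


def three_sigma_outliers(values):
    """Return the values lying more than 3 standard deviations from the mean.

    Assumption: the population standard deviation (pstdev) is used; the
    sample standard deviation would be an equally reasonable choice.
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    # When sd is 0 every deviation is 0, so nothing is flagged.
    return [x for x in values if abs(x - mean) > 3 * sd]


if __name__ == "__main__":
    data = [10.0] * 20 + [100.0]
    print(three_sigma_outliers(data))
```

Note that with very small samples no point can exceed the three-sigma threshold, so a check like this is only meaningful for data sets with more than about ten observations.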
SEGMENTATION: FROM GROUPING TO CLASSIFICATION
By RAFAŁ WAŚKO (Predictive Solutions) Segmentation is a key process in data analysis, dividing a data set into relatively homogeneous groups based on specific criteria. The purpose of segmentation is to identify hidden patterns, differences and similarities between...
OUTLIER OR ANOMALY? DETECTION OF ABNORMAL OBSERVATIONS
By NATALIA GOLONKA (Predictive Solutions) Can one abnormal occurrence cause concern? Based on one deviation from the norm, should a red light start flashing? Of course! In many industries and businesses, an anomaly is a sign that must be reacted to quickly and...
ENTROPY
By NATALIA GOLONKA (Predictive Solutions) Entropy is a measure of disorder or uncertainty in a probability distribution. The concept was first introduced in 1854 by the physicist Rudolf Clausius, dealing with thermodynamic issues, and in this sense the definition of...
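For the probability-distribution sense mentioned in this teaser, a minimal sketch is the Shannon entropy of a discrete distribution, H = -Σ p·log2(p), measured in bits. The function name `shannon_entropy` and the choice of base 2 are assumptions made for illustration; the teaser does not say which definition the article goes on to use.

```python
import math


def shannon_entropy(probabilities):
    """Shannon entropy in bits of a discrete probability distribution.

    Assumption: the probabilities are non-negative and sum to 1.
    Zero-probability outcomes are skipped, since they contribute nothing.
    """
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


if __name__ == "__main__":
    # A fair coin is maximally uncertain for two outcomes.
    print(shannon_entropy([0.5, 0.5]))
    # A certain outcome carries no uncertainty.
    print(shannon_entropy([1.0]))
```

The more evenly the probability is spread across outcomes, the higher the value; a distribution concentrated on a single outcome has entropy zero.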
STATISTICAL INFERENCE
By NATALIA GOLONKA (Predictive Solutions) Statistical inference is the branch of statistics through which it becomes possible to describe, analyse and make inferences about the whole population on the basis of a sample. Studying the entire population can be a very...
LEVELS OF MEASUREMENT
By NATALIA GOLONKA (Predictive Solutions) The level of measurement is one of the most important properties of variables. It determines which statistical tests will be available to the researcher during the course of the analysis. But what information does it convey to...
PEARSON’S CHI-SQUARE CORRELATION TEST
By RAFAŁ WAŚKO (Predictive Solutions) Popular statistical tests include Pearson's chi-square tests. It is worth noting at the outset that this test has more than one application. In this material, I will discuss the main differences between the tests and introduce the...
OUTLIER CASES. IDENTIFICATION AND SIGNIFICANCE IN DATA ANALYSIS
By RAFAŁ WAŚKO (Predictive Solutions) In data analysis, it is important to identify unusual observations that are significantly different from the others. Such values, called outliers or outlier cases, can affect the results of statistical analysis and lead to...
LOGISTIC REGRESSION
By WIKTORIA KORYGA (Predictive Solutions) In practice, the simplest and most commonly used type of regression is the linear regression model, whose parameters are estimated using the Least Squares Method. However, linear regression is only used to predict a continuous...
GINI INDEX
By WIKTORIA KORYGA (Predictive Solutions) The Gini index is a measure of the concentration of a variable's distribution. In statistics it is commonly used to describe the concentration (unevenness) of the distribution of a random variable, while its most popular use...