
Written by Rafal Wasko (Predictive Solutions)

Before starting a more complex data analysis, it is worth taking a closer look at the distributions of the variables of interest.

Frequency tables are one option, but charts such as a histogram or a box plot offer a quicker look at the data. Despite its simplicity, the box plot conveys a lot of important information, often referred to as Tukey's five-number summary of the data set: the minimum, the lower quartile, the median, the upper quartile, and the maximum. Another chart that carries a lot of information about the variable of interest is the violin plot.
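To make the five numbers concrete, here is a minimal Python sketch that computes them with the standard library. The function name `five_number_summary` and the height values are invented for illustration, and quartile conventions differ between tools, so other software may report slightly different quartiles for the same data.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    data = sorted(values)
    # method="inclusive" interpolates between observed values, so the
    # quartiles never fall outside the range of the data.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, q2, q3, data[-1]

# Hypothetical heights in centimeters, invented for illustration only.
heights_cm = [158, 162, 165, 167, 170, 172, 175, 178, 181, 186]
summary = five_number_summary(heights_cm)
print(summary)
```

A box plot draws exactly these values: the box spans the two quartiles, the line inside it marks the median, and the whiskers reach toward the extremes.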

A violin plot can be regarded as a variant of the box plot. The difference is that a violin plot also shows the shape of the distribution, so we can quickly see, for example, whether the distribution of the presented variable is multimodal.

To begin with, let’s see what a simple violin plot looks like for a single quantitative variable: height in centimeters.

The next chart presents the same data as the one above, this time broken down by sex.

The chart shows the distribution of a quantitative variable within each category of a qualitative variable. As one can see, a violin plot has several elements in common with a box plot. Additional statistics can be displayed on the plot, such as a median marker and the interquartile range.
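For readers who want to reproduce a chart of this kind outside PS IMAGO PRO, the following is a rough sketch using Python's matplotlib library, which is an assumption of this example and not the tool used in the article. The heights are randomly generated with arbitrary means and spreads, and the output file name is hypothetical.

```python
import random

import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt

# Synthetic heights in centimeters, generated for illustration only;
# the means and spreads are arbitrary and not taken from the article's data.
rng = random.Random(0)
heights_by_sex = {
    "female": [rng.gauss(165, 6) for _ in range(200)],
    "male": [rng.gauss(178, 7) for _ in range(200)],
}

fig, ax = plt.subplots(figsize=(5, 4))
parts = ax.violinplot(
    list(heights_by_sex.values()),
    showmedians=True,              # draw a marker at each group's median
    quantiles=[[0.25, 0.75]] * 2,  # mark the lower and upper quartiles
)
# Violins are placed at x = 1, 2, ... by default, so label those positions.
ax.set_xticks([1, 2])
ax.set_xticklabels(list(heights_by_sex.keys()))
ax.set_xlabel("Sex")
ax.set_ylabel("Height (cm)")
fig.savefig("violin_by_sex.png")  # hypothetical output file name
```

Each violin is drawn for one category, and the horizontal markers inside it correspond to the median and the quartiles, the same elements a box plot would show.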

To sum up, a violin plot is helpful when analyzing the distribution of data. It conveys more information than a box plot and is still simple to interpret. It also allows the data to be broken down into subsets and shows where observations are more or less concentrated.

Contact us to find out more about these and all other great features of PS IMAGO PRO!