
A violin plot depicts distributions of numeric data for one or more groups using density curves. The width of each curve corresponds with the approximate frequency of data points in each region. Densities are frequently accompanied by an overlaid chart type, such as a box plot, to provide additional information.

The example violin plot above depicts the results of a fictional experiment with one control group and two experimental conditions. In the middle of each density curve is a small box plot, with the rectangle showing the ends of the first and third quartiles and the central dot marking the median. We can see from the plot that the two experimental techniques provided different benefits compared to the control. However, the second experimental condition (B) has a much more elongated distribution compared to the other two groups, without a distinct peak. The latter fact would have been missed with the box plot alone.

Violin plots are used when you want to observe the distribution of numeric data, and are especially useful when you want to compare distributions between multiple groups. The peaks, valleys, and tails of each group’s density curve can be compared to see where groups are similar or different. Extra elements, like box plot quartiles, are often added to a violin plot to provide additional ways of comparing groups, and will be discussed below.

The density curve, also known as a kernel density plot or kernel density estimate (KDE), is a less frequently encountered depiction of data distribution than the more common histogram. Below, we’ll give a brief explanation of how density curves are built.

In a KDE, each data point contributes a small area around its true value. The shape of this area is called the kernel function. Kernels can take different shapes, from smooth bell curves to sharp triangular peaks. In addition, kernels can have different widths, or bandwidths, which affect the influence of each individual data point. Bandwidth size is usually determined by mathematical rules of thumb, but it can be tweaked depending on the shape and skew of the data to be plotted.

To build the final density curve, the areas for all data points are stacked up into a complete whole. Each data point has an equivalent influence on the final distribution. The more data points there are in a region, the higher the density curve will be in that area. Kernel density estimation is best used when a fair amount of data is available, since that results in more stable density estimates.
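
The way a density curve is built from individual kernels can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names (`gaussian_kernel`, `silverman_bandwidth`, `kde`) and the sample values are made up for this example, the kernel is assumed to be a Gaussian bell curve, and the bandwidth uses Silverman's rule, which is one common rule of thumb rather than the one any particular plotting tool necessarily applies.

```python
import math
import statistics


def gaussian_kernel(u):
    """Standard normal density: one possible bell-shaped kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def silverman_bandwidth(data):
    """Silverman's rule of thumb (an assumed default, not the only option)."""
    n = len(data)
    std = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)
    spread = min(std, (q3 - q1) / 1.34)
    return 0.9 * spread * n ** (-1 / 5)


def kde(data, x, bandwidth):
    """Density estimate at x: one equally weighted kernel per data point."""
    total = sum(gaussian_kernel((x - xi) / bandwidth) for xi in data)
    return total / (len(data) * bandwidth)


# Made-up sample values, for illustration only.
sample = [2.1, 2.4, 2.5, 2.7, 3.0, 3.1, 3.3, 4.8, 5.0, 5.2]
h = silverman_bandwidth(sample)

# Evaluate the curve on an evenly spaced grid that extends past the data.
lo, hi = min(sample) - 3 * h, max(sample) + 3 * h
steps = 200
grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
density = [kde(sample, x, h) for x in grid]

# Riemann-sum check: the area under the curve should be close to 1.
area = sum(density) * (hi - lo) / steps
print(f"bandwidth={h:.3f}, area={area:.3f}")
```

Each point contributes the same kernel, so regions where many points cluster end up with a taller curve. Passing a larger or smaller `bandwidth` to `kde` makes the resulting curve smoother or more jagged, which is the kind of tweaking described in the text.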
