Exploratory Data Analysis [1]

Exploratory Data Analysis (EDA) is an approach to data analysis that employs a variety of descriptive statistics, mostly graphical, to explore data without the a priori assumptions that inferential approaches require. EDA is not a fixed set of techniques but an attitude, or philosophy, about how a data analysis should be carried out.

The particular graphical techniques employed in EDA are often quite simple.
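As a rough illustration of how simple such descriptive tools can be, the sketch below computes a five-number summary and draws a text histogram using only the Python standard library. The function names, the sample values, and the bin width are all hypothetical choices made for this example; they are not taken from the sources cited in this entry.

```python
import statistics
from collections import Counter

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, med, q3, max(values))

def text_histogram(values, bin_width):
    """Return one text line per bin, from the lowest to the highest occupied bin."""
    counts = Counter(int(v // bin_width) for v in values)
    lines = []
    for b in range(min(counts), max(counts) + 1):
        lower_edge = b * bin_width
        lines.append(f"{lower_edge:>6.1f} | " + "#" * counts.get(b, 0))
    return lines

# Hypothetical sample data, invented for illustration only.
sample = [2.1, 2.4, 2.5, 3.0, 3.1, 3.3, 3.6, 4.2, 4.8, 9.7]
summary = five_number_summary(sample)
histogram = text_histogram(sample, bin_width=1.0)

print(summary)
print("\n".join(histogram))
```

Even this crude display makes the isolated value near 9.7 stand out from the rest of the sample, which is the kind of feature an exploratory look is meant to surface before any model is assumed.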

In his seminal work on the subject, Exploratory Data Analysis (Addison-Wesley, 1977), John Tukey suggests thinking of exploratory analysis as the first step in a two-step process similar to that used in criminal investigations. In the first step, one searches for evidence using all of the investigative tools that are available. In the second step, that of confirmatory data analysis, one evaluates the strength of the evidence and judges its merits and applicability. It is in this second step that one would likely apply the techniques of inferential statistics.
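The two-step idea can be sketched in code, with some strong assumptions. In the sketch below, the exploratory step simply compares group means on one half of the data, and the confirmatory step applies a permutation test to the other half. The permutation test, the half-and-half split, the function names, and the data are all assumptions made for this illustration; the source does not prescribe any of them.

```python
import random
import statistics

def mean_difference(a, b):
    """Difference between the means of two samples."""
    return statistics.mean(a) - statistics.mean(b)

def permutation_p_value(a, b, n_permutations=2000, seed=0):
    """Approximate two-sided permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(mean_difference(a, b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        shuffled_gap = mean_difference(pooled[:len(a)], pooled[len(a):])
        if abs(shuffled_gap) >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical measurements for two groups, invented for illustration only.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.4, 5.7, 6.1]
group_b = [4.2, 4.8, 4.5, 5.0, 4.4, 4.9, 4.6, 4.3]

# Step 1 (exploratory): look for evidence in the first half of each group.
explore_a, confirm_a = group_a[:4], group_a[4:]
explore_b, confirm_b = group_b[:4], group_b[4:]
exploratory_gap = mean_difference(explore_a, explore_b)

# Step 2 (confirmatory): weigh that evidence on data not used in step 1.
p_value = permutation_p_value(confirm_a, confirm_b)

print(exploratory_gap, p_value)
```

Holding back part of the data for the second step is one possible design choice, made here so that the hypothesis suggested by exploration is not tested on the same observations that suggested it.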


1. This definition is based on www.itl.nist.gov/div898/handbook/toolaids/pff/1-eda.pdf, www.datamology.com/eda.shtml, and www.statistics4u.info/fundstat_eng/.