Controlled Vocabulary [1]

A controlled vocabulary is a coding protocol for transforming qualitative information into a countable set of data categories as consistently as possible. For example, when the Human Rights Data Analysis Group (HRDAG) transforms qualitative human rights statements into quantitative information, violations are coded via a controlled vocabulary into categories such as "killing," "forced displacement," or "sexual assault." Many countries likewise employ a controlled vocabulary to classify adverse reactions to vaccines and store the coded reports in a database suitable for analysis.

To ensure the quality of the data, every definition within a controlled vocabulary must satisfy the following five properties:

  • Mutually exclusive: No single item to be coded can fit into more than one definition in the controlled vocabulary.
  • Exhaustive: A definition must exist for every possible item that can occur in the qualitative data being studied.
  • Distinguished: Each definition must have an explicit characteristic that distinguishes it from all others in the controlled vocabulary.
  • Exemplified: Each definition must be accompanied by examples showing how to apply the definition in a specific situation.
  • Countable: Each definition must contain a counting rule explicitly stating how items are enumerated.
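
The five properties can be checked partly by machine. The following is a minimal sketch in Python, not HRDAG's method: the names Definition, check_definitions, and check_coding are hypothetical, and the distinguishing characteristics, examples, and counting rules shown are invented for illustration only. Three properties (distinguished, exemplified, countable) are checked on the definitions themselves. Mutual exclusivity and exhaustiveness cannot be proven from the definitions alone, so the sketch assumes they are tested empirically against a sample of items that coders have already coded.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    code: str            # category label, e.g. "killing"
    distinguishing: str  # explicit characteristic that sets it apart
    examples: tuple      # examples of applying the definition
    counting_rule: str   # how items in this category are enumerated


def check_definitions(vocabulary):
    """Check the distinguished, exemplified, and countable properties."""
    problems = []
    seen = {}  # distinguishing characteristic -> code that first used it
    for d in vocabulary:
        if not d.distinguishing.strip():
            problems.append(f"{d.code}: no distinguishing characteristic")
        elif d.distinguishing in seen:
            problems.append(
                f"{d.code}: same characteristic as {seen[d.distinguishing]}")
        else:
            seen[d.distinguishing] = d.code
        if not d.examples:
            problems.append(f"{d.code}: no examples")
        if not d.counting_rule.strip():
            problems.append(f"{d.code}: no counting rule")
    return problems


def check_coding(vocabulary, coded_items):
    """Check a coded sample for exclusivity and exhaustiveness.

    coded_items maps an item id to the list of codes a coder assigned.
    """
    known = {d.code for d in vocabulary}
    problems = []
    for item, codes in coded_items.items():
        valid = [c for c in codes if c in known]
        if len(valid) == 0:
            problems.append(f"{item}: fits no definition (not exhaustive)")
        elif len(valid) > 1:
            problems.append(
                f"{item}: fits {len(valid)} definitions (not mutually exclusive)")
    return problems


# Hypothetical definitions; the wording is illustrative, not HRDAG's.
vocabulary = [
    Definition("killing", "the act results in the victim's death",
               ("statement reports a person shot dead",),
               "one count per victim"),
    Definition("forced displacement", "the victim is compelled to leave home",
               ("statement reports a family ordered out of a village",),
               "one count per person displaced"),
]

# Hypothetical coded sample: item-2 is ambiguous, item-3 fits nothing.
coded = {
    "item-1": ["killing"],
    "item-2": ["killing", "forced displacement"],
    "item-3": [],
}

definition_problems = check_definitions(vocabulary)
coding_problems = check_coding(vocabulary, coded)

# Tally only the items that received exactly one code.
counts = Counter(c[0] for c in coded.values() if len(c) == 1)
```

In this sketch the definitions pass all three structural checks, while the coded sample flags item-2 as a violation of mutual exclusivity and item-3 as a violation of exhaustiveness. Only unambiguously coded items reach the tally, which is the point of requiring all five properties before counting.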


1. This definition is based on the Controlled Vocabulary definition given on the HRDAG Web site.