COW Data Set Hosting: Review Procedures


COW's criteria for releasing an updated data set are internal consistency, comparability to and compatibility with existing data sets, and high data quality. Hosts can meet these criteria by carefully following the coding rules defined at the beginning of a project, and by working with the office of the Director and Associate Director to ensure that the data set's format and structure are consistent.

When a host believes that a data set is ready for release as an updated COW data set, he or she will submit the data to the COW Director and Associate Director. A series of checks will then be undertaken before a version number is assigned and the updated data released.

  1. A series of automated checks will be conducted to ensure that all countries and years have been included in the data set (where a data set is cross-national and cross-time), that all data points are unique (no duplicate records or values), and that country codes and data points included in the data set match the Correlates of War National System Membership lists.
  2. Variable names and value codes will be examined for uniqueness, descriptive accuracy, and consistency. For example, whenever possible variable names must match names from prior data sets, and must accurately describe the variables' content. Dummy variables will be coded as 0=no and 1=yes. Missing value codes will be consistent (typically -9 when possible) and clearly described in the documentation. Names and categories that are meant to be unique will be checked for duplicates.
  3. A review of procedures will be done to ensure that coding rules have been followed.
  4. Spot checks of individual data points collected by the individual host will be conducted to verify data values and source identification.
  5. Documentation will be reviewed, and source lists will be examined to ensure that every new data point can be traced to a point of origin.
  6. The format of the data set (e.g. unit of analysis [country-year, monad-year], file type [Excel file, Access file, flat text]) will be examined and made consistent with other data sets.
  7. In case of problems, the data set may be updated by COW, or may be returned to the host for further work.
  8. The COW Advisory Board may be routinely consulted on issues of data set structure, coding rules, case coding, and other issues that arise in the course of data set review.
  9. In the case of disagreement between the host and COW about the release status of the data set, whether the disagreement concerns format or substantive coding decisions, the COW Advisory Board is available for consultation and problem resolution.
  10. The target for final data set release is no more than six months after a candidate final release data set is submitted.
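
As an illustration only, the automated checks in step 1 and the dummy-variable rule in step 2 could be sketched as follows. This is not COW's actual tooling. The row layout (`ccode`, `year`), the function names, the variable `ally`, and the membership table are all hypothetical, and the country codes and years are made-up values rather than entries from the real membership list.

```python
from collections import Counter

MISSING = -9  # missing-value code mentioned in step 2 (assumed to apply here)


def check_country_years(rows, membership):
    """Run the three step-1 checks on country-year rows.

    rows: list of dicts with "ccode" and "year" keys (hypothetical layout).
    membership: dict mapping ccode -> (first_year, last_year), inclusive.
    Returns a dict of problem lists; an empty list means that check passed.
    """
    keys = [(r["ccode"], r["year"]) for r in rows]
    counts = Counter(keys)

    # Every country-year implied by the membership table.
    expected = {
        (ccode, year)
        for ccode, (first, last) in membership.items()
        for year in range(first, last + 1)
    }
    observed = set(keys)

    return {
        "duplicates": sorted(k for k, n in counts.items() if n > 1),
        "missing": sorted(expected - observed),
        "not_in_membership": sorted(observed - expected),
    }


def check_dummy(rows, name):
    """Return (ccode, year, value) for values other than 0, 1, or MISSING."""
    allowed = {0, 1, MISSING}
    return [
        (r["ccode"], r["year"], r[name]) for r in rows if r[name] not in allowed
    ]


# Toy inputs: made-up codes and years, NOT the real COW membership list.
membership = {9001: (1990, 1992), 9002: (1991, 1992)}
rows = [
    {"ccode": 9001, "year": 1990, "ally": 1},
    {"ccode": 9001, "year": 1991, "ally": 0},
    {"ccode": 9001, "year": 1991, "ally": 0},   # duplicate record
    {"ccode": 9002, "year": 1991, "ally": -9},
    {"ccode": 9002, "year": 1990, "ally": 2},   # outside membership, bad code
]

report = check_country_years(rows, membership)
bad_dummies = check_dummy(rows, "ally")
print(report)
print(bad_dummies)
```

On the toy input, the report flags one duplicated country-year, two country-years that the membership table expects but the data lack, and one country-year that falls outside the membership window. The dummy check flags the single value coded 2. A real check would read the membership list and the submitted data from files rather than from inline literals.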