COW Data Set Hosting Program

A major practical issue in collecting and releasing updated cross-national and cross-time data sets such as those collected by the Correlates of War (COW) Project has been the amount of resources needed to simultaneously maintain a large number of data sets. As a result, COW has implemented a distributed system of data set hosting based on the notion of “coordinated decentralization.” The goal is for each COW data set to obtain a semi-permanent “home” and “host,” that is, an institution and an individual who will agree to maintain a data set and the related documentation for a period of time. The care given to a data set by its host follows a set of guidelines designed to ensure continued consistency with COW standards. The Director and the COW Advisory Board are responsible for monitoring data sets and hosts.


 There are several guidelines for the adoption of a data set by an institution:

  1. The host agrees to comply with standards set by the COW project with respect to data collection procedures, coding rules, structure and format of the data set, and documentation procedures. These standards are described here.

  2. The host of a COW data set takes responsibility for revising and routinely updating the data set, documentation, and related archival material in his or her care for a period of 3-5 years. The host will keep track of reported errors and questions, and will release revised versions at regular intervals, typically every six months if minor errors are discovered and corrected, and at longer intervals as appropriate for major revisions and updates.

  3. Data set hosts must be experienced with the collection of quantitative data sets, and should have experience with the data set in question. Sufficient institutional resources should be available to support the hosting, possibly including relevant computer resources, research support, or (especially in the case of junior faculty) assurance that proper credit will be given to the host.

  4. The host agrees to serve as the primary contact person and deal with substantive questions concerning the data set (i.e., the host will be listed on the COW website, with an email address, as the person to contact with data set questions).

  5. COW data sets will be released only through the COW website (not by individual hosts), and only after the data are final. Procedures for data set review are described on the website. The purpose of this rule is to avoid a proliferation of partial, unofficial, or inconsistent data sets throughout the research community.

  6. The host agrees not to publish any analytical results based on the resulting updated COW data set before the data are officially released by the COW project. Exceptions may be made for descriptive papers at conferences and for dissertations, but it must be noted that such results represent analysis based on work in progress and on possibly incomplete data sets, and cannot be said to use official COW data. The purpose of this rule is to avoid a proliferation of non-replicable or frequently revised results throughout the research community.

  7. When a major revision or update of a data set is complete, the host agrees to compose and publish an “article of record” concerning the new data set (for instance, in a professional journal). We expect all scholars who use the resulting data set to cite this article of record and to clearly state the data set version used for analysis.

  8. The host agrees to submit a yearly status report on the data set to the Director.

  9. The host shall maintain all available documentation associated with the data sets, and that documentation should be accessible to the Director, Associate Director, and Advisory Board.

  10. Hosts are expected to attend periodic COW meetings during annual meetings of the professional associations.

  11. A subset of data hosts serve on the COW Advisory Board on a rotating basis.

  12. Hosts are expected to work with the Director to secure external grant funds for data collection and updating.

  13. The Advisory Board reserves the right to remove a data host if there are significant problems with data collection or updating.

Related Documents

Current Data Hosts