Creating the Database Tables


New files were created from the original file and named as described below. These pieces were saved in the appropriate folder (i.e., “Manuscripts,” “Review Cover Sheets,” “Anonymous Review,” or “Response to Reviewers”).

The name of each manuscript file includes an MSN (manuscript number), followed by a period and an RN (review number). The first submission of a manuscript is “R0,” though this will not be reflected in the original file name. Extraneous materials are coded similarly, except that they are appended with a period and an ‘X’ (e.g., MSN.RN.X). Review cover sheets are marked with a period, an ‘N,’ and a stable reviewer number (e.g., MSN.RN.N1). Anonymous reviews are marked with a period, an ‘A,’ and the stable reviewer number (e.g., MSN.RN.A1). Responses to reviewers are coded with ‘RR’ (e.g., MSN.RN.RR). The reviewer number is stable across revisions because the same person often reviews multiple revisions of the same manuscript.

When a manuscript has been accepted following revisions, there will often be two copies of the final manuscript, one submitted by the author(s) and one that has been copyedited for print. In these cases, one discrete manuscript was coded as 03-095.R3.0 and the other as 03-095.R3.1. Multiple copies of the same anonymous review are similarly coded, e.g., 03-095.R3.A1.0 and 03-095.R3.A1.1. Using this exact format is important, as it allows the files to be accurately linked into the database. To clarify this process, see the following tables.


Manuscript Number 03-095
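
The coding convention described above can be checked mechanically. The following is a minimal sketch, not the project's own tooling: it assumes the code is the file name without its extension, and the function name, field names, and category labels are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Pattern for codes such as 03-095.R3, 03-095.R3.X, 03-095.R3.N1,
# 03-095.R3.A1.1 and 03-095.R3.RR (file extension assumed already removed).
CODE_RE = re.compile(
    r"(?P<msn>\d{2}-\d{3})"                  # manuscript number, e.g. 03-095
    r"\.R(?P<rn>\d+)"                        # review number, e.g. R3
    r"(?:\.(?:(?P<tag>RR|X)"                 # response to reviewers, or extraneous
    r"|(?P<rtag>[NA])(?P<reviewer>\d+)))?"   # cover sheet / anonymous review + reviewer number
    r"(?:\.(?P<copy>\d+))?"                  # optional copy index (.0, .1, ...)
)

# Hypothetical labels for the document types named in the text.
KINDS = {
    None: "manuscript",
    "X": "extraneous",
    "N": "review_cover_sheet",
    "A": "anonymous_review",
    "RR": "response_to_reviewers",
}


class FileCode(NamedTuple):
    msn: str
    rn: int
    kind: str
    reviewer: Optional[int]
    copy: Optional[int]


def parse_code(code: str) -> FileCode:
    """Split an archive file code into its components, or raise ValueError."""
    m = CODE_RE.fullmatch(code)
    if m is None:
        raise ValueError(f"not a valid archive code: {code!r}")
    tag = m["tag"] or m["rtag"]
    reviewer = int(m["reviewer"]) if m["reviewer"] is not None else None
    copy = int(m["copy"]) if m["copy"] is not None else None
    return FileCode(m["msn"], int(m["rn"]), KINDS[tag], reviewer, copy)


print(parse_code("03-095.R3.A1.1"))
```

A code that does not follow the format exactly (for example, an ‘X’ followed by a reviewer number) raises an error instead of being silently mislinked.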


Digital Archive Database Meta Files (Journal Builder Files)

As noted, during the period covered by the Archive Project, between 1990 and 2010, ASR editors used the Journal Builder tool to manage the submission, review, and publication process. Manuscripts, reviews, and correspondence were organized in paper files, which became the source of the DA scanned files (discussed below). Journal Builder, on the other hand, provides key metadata about the manuscripts and reviews making up the DA.

Journal Builder provided the project with an accurate record of every version of every manuscript submitted during this time period: its title, author(s), reviewers, transaction dates, reviewer decisions, and the outcome associated with each version of a manuscript. All versions of a manuscript share a common manuscript number, with each version designated by a revision number (R0 through Rn). In the Journal Builder files, which the project obtained as a series of Excel workbooks, each row is a version of a manuscript. See Table 1 for an example of a Journal Builder file. The manuscript number begins with two digits, representing the year of initial submission, followed by a dash and then a three-digit number, which is a sequential index of manuscripts entered into Journal Builder in a given year. This number is then followed by the revision number in parentheses.
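
As an illustration of this numbering scheme, the sketch below splits a Journal Builder identifier into its parts. It rests on two assumptions that are not confirmed by the Journal Builder files themselves: that the revision number follows the manuscript number in parentheses, as in the hypothetical cell value `89-016(R1)`, and that two-digit years of 50 or above belong to the 1900s. The function name is hypothetical.

```python
import re

# Assumed cell layout: YY-NNN(Rn), e.g. "89-016(R1)".
JB_ID_RE = re.compile(r"(\d{2})-(\d{3})\s*\(R(\d+)\)")


def parse_jb_id(cell: str) -> dict:
    """Return the manuscript number, submission year, sequence, and revision."""
    m = JB_ID_RE.fullmatch(cell.strip())
    if m is None:
        raise ValueError(f"unrecognized manuscript identifier: {cell!r}")
    yy, seq, rev = (int(g) for g in m.groups())
    # Assumption: the archive spans 1989-2010, so 50-99 -> 19xx and 00-49 -> 20xx.
    year = 1900 + yy if yy >= 50 else 2000 + yy
    return {
        "msn": f"{yy:02d}-{seq:03d}",
        "year": year,
        "sequence": seq,
        "revision": rev,
    }


print(parse_jb_id("89-016(R1)"))
```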

The manuscripts contained in Table 1 illustrate an important feature of the DA. Note that the manuscript numbers begin with 89, representing the year 1989, and in each case the revision number is R1. Each record therefore represents a revision of an original manuscript (revision number R0) that was submitted before the DA’s 1990-2010 time period. These records are left-censored in the DA, and it is up to users to decide whether to include them in their analyses. A researcher interested in review decisions during this time period regardless of revision number may decide to include them. On the other hand, a researcher concerned with “the life course” of a manuscript may want to exclude these cases, as the origin of the manuscript is left-censored.
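
One way a user might identify such cases is sketched below. It assumes, as a working definition, that a manuscript is left-censored when the data contain no R0 record for it; the function name and the sample records are hypothetical.

```python
from collections import defaultdict


def left_censored_msns(records):
    """Return the manuscript numbers that have no R0 record in `records`.

    `records` is an iterable of (msn, revision_number) pairs.
    """
    revisions = defaultdict(set)
    for msn, rn in records:
        revisions[msn].add(rn)
    return {msn for msn, rns in revisions.items() if 0 not in rns}


# Hypothetical records: 89-016 enters the archive at R1, 90-001 at R0.
records = [("89-016", 1), ("90-001", 0), ("90-001", 1)]
censored = left_censored_msns(records)

# For a "life course" analysis, drop every version of a censored manuscript.
life_course = [(msn, rn) for msn, rn in records if msn not in censored]
print(censored, life_course)
```

An analysis of review decisions regardless of revision number would simply use `records` unfiltered.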


Table 1: Sample Journal Builder File

The Journal Builder files were used to create four new files: a manuscript file, a reviewer file, an author file, and a person file. (See Tables 2a, 2b, 2c, and 2d for examples of the files created from the Journal Builder files.) Key variables are created to later match the four files for purposes of analysis. Manuscript (Mx) and version (Vy) numbers are the keys linking the Manuscript File with the Reviewer and Author Files. PIDz numbers link the Person File with the Author and Reviewer Files.

Before looking at each of these files individually, it is important to understand the rationale for breaking up the Journal Builder files into different types of files. Separating manuscript, author, reviewer, and person data into separate files gives researchers maximum flexibility in working with the data, particularly with regard to the selection of the unit of analysis.

Moreover, data from the Journal Builder files not only had to be separated into these four types of data, but multiple reviewer records and possibly multiple author records had to be created from each record in the Journal Builder file. For example, one sees in the first row of Table 1 that the reviewer column contains the names of three reviewers separated by semi-colons (Person 1; Person 2; Person 3;), and the columns “Sent,” “Return date,” and “Rec” each also contain three data points separated by semi-colons, indicating when each reviewer was sent the manuscript, when it was returned to ASR, and the recommendation of that particular reviewer. In this case, three records would have to be generated for the reviewer file, each containing the information for one reviewer. Similarly, looking at the fourth row in Table 1, four records would be created in the author file for revision (R1) of manuscript 89-016. The editorial decision and decision date, which are common to all reviewer and author records for a given version, are stored separately in the manuscript file, along with the manuscript title.
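
The expansion of one Journal Builder row into several reviewer records can be sketched as follows. This is an illustration under stated assumptions rather than the procedure the project used: the column names “Sent,” “Return date,” and “Rec” come from the text, while “Reviewer,” “MSN,” “RN,” “ReviewerSeq,” the function names, and all sample values are hypothetical.

```python
def split_field(cell):
    """Split a semicolon-delimited cell, dropping only the piece after a trailing ';'.

    Empty values between semicolons are kept so that positions stay aligned
    across columns (for example, a review that was never returned).
    """
    parts = [part.strip() for part in cell.split(";")]
    if parts and parts[-1] == "":
        parts.pop()
    return parts


def expand_reviewers(row):
    """Create one reviewer record per reviewer listed in a Journal Builder row."""
    names = split_field(row["Reviewer"])
    sent = split_field(row["Sent"])
    returned = split_field(row["Return date"])
    recs = split_field(row["Rec"])
    if not (len(names) == len(sent) == len(returned) == len(recs)):
        raise ValueError(f"misaligned reviewer fields in {row['MSN']} R{row['RN']}")
    return [
        {
            "MSN": row["MSN"],
            "RN": row["RN"],
            "ReviewerSeq": seq,  # sequential reviewer number within this version
            "Reviewer": name,
            "Sent": s,
            "Return date": r,
            "Rec": rec,
        }
        for seq, (name, s, r, rec) in enumerate(zip(names, sent, returned, recs), start=1)
    ]


# Hypothetical row; dates and recommendations are invented for illustration.
row = {
    "MSN": "89-016",
    "RN": 1,
    "Reviewer": "Person 1; Person 2; Person 3;",
    "Sent": "1/5/1990; 1/5/1990; 1/8/1990;",
    "Return date": "2/1/1990; ; 2/20/1990;",
    "Rec": "Reject; ; Revise;",
}
for record in expand_reviewers(row):
    print(record)
```

Checking that all four columns yield the same number of entries guards against a dropped semi-colon shifting one reviewer's dates onto another. The same approach would apply to the author column.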


Manuscript File

The manuscript number (MSN) is the primary identifier, with the revision number (RN) as a secondary identifier. The manuscript file contains one record for each revision of a manuscript, identified by the manuscript number and the version number. For each version of the manuscript, the file contains the manuscript title, submission date, editorial decision (accept, reject, or revise and resubmit), and the date the decision was made. If no author or reviewer has given permission, the title is replaced by key words from the title on the manuscript. If permission has been given, the original title is used.

Table 2a: Overview of the Manuscript File Content



Author File

This file contains the name of each author for each revision of the manuscript (along with the manuscript number). As with reviewers, authors may change over time (sometimes being added, sometimes removed). As with the manuscript file, the manuscript number (MSN) is the primary identifier, with the revision number (RN) as a secondary identifier. With these identifiers, data from the author file can be linked to the manuscript file or to the reviewer file. The personal ID (PersonId) may be used to link an author to his or her demographic and institutional data contained in the person file. If the author does not agree to have his or her name included in the file, only a person identification number will be provided.
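
The linkage described above can be sketched with plain dictionaries. The key names MSN, RN, and PersonId follow the text; the function name, the remaining field names, and the sample values are hypothetical, and the sketch assumes each file has already been read into a list of records.

```python
def link_authors(authors, manuscripts, persons):
    """Attach manuscript-level and person-level data to each author record.

    Manuscripts are matched on (MSN, RN) and persons on PersonId. An author
    with no matching person record keeps only the author and manuscript fields.
    """
    by_version = {(m["MSN"], m["RN"]): m for m in manuscripts}
    by_person = {p["PersonId"]: p for p in persons}
    linked = []
    for author in authors:
        row = dict(author)
        row.update(by_version.get((author["MSN"], author["RN"]), {}))
        row.update(by_person.get(author["PersonId"], {}))
        linked.append(row)
    return linked


# Hypothetical sample records.
manuscripts = [{"MSN": "89-016", "RN": 1, "Decision": "reject"}]
authors = [
    {"MSN": "89-016", "RN": 1, "PersonId": 101},
    {"MSN": "89-016", "RN": 1, "PersonId": 102},
]
persons = [{"PersonId": 101, "PhDYear": 1985}]  # no person record for 102

for row in link_authors(authors, manuscripts, persons):
    print(row)
```

Because the unit of the result is the author record, the same manuscript data are repeated for each co-author; a manuscript-level analysis would instead start from the manuscript file.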

Table 2b: Overview of the Author File Content



Reviewer File

The reviewer file contains one record for each review of each revision of the manuscript (along with the manuscript number). During the revise and resubmit process, the reviewers can vary from version to version. Because the name of the reviewer is redacted from the file, a sequential number is assigned to each reviewer. The reviewer file also includes the dates the manuscript was sent and returned, along with the reviewer’s recommendation for each version of the manuscript. If the reviewer does not wish his or her name to be included in the file, a person identification number will be used instead.

Table 2c: Overview of the Reviewer File Content



Person File

There is one record for each author or reviewer. The Person file contains the person’s name from the Journal Builder file; if permission to use the name is not given, the name is deleted and a person ID is substituted. Demographic information, including race, ethnicity, and gender, derived from the ASA membership data and other sources, is included along with information about affiliation and rank over time from the Grad Guides and from additional internet searches.

The purpose of the Person File (PFILE) is to record all reviewers and authors who participated in the ASR submission and review process. Each person appears only once, with the demographic variables, Ph.D.-granting institution, degree major, degree-granting year, affiliated institution in the year of publication, and affiliated institution in 2016-2017.

Each person is identified by an identification number (PID) that links the Reviewer Files, Author Files, and curated manuscript PDF files. If a person has not given permission for a particular submission or review, the corresponding author or reviewer record is not linked to the PFILE.

PFILE was created from the master data, which were constructed from the list of authors/reviewers provided by ASA, the Manuscript File, the Curation Log files, the ASA membership data, the Grad Guides, and the Qualtrics survey data.

Currently, PID links individuals in the PFILE, the Author File, and the Reviewer File, but it will reveal an author’s identity only if the author has granted us permission to use his or her name. We plan to provide a different PID for reviewers in the public versions of the PFILE and the Reviewer File.

Table 2d: Overview of the Person File Content



Permissions Survey Files

As noted, the project sought permission from authors and reviewers to place their manuscripts and reviews in the DA in the form of identifiable files that allow for research on changing paradigms, demographics, networks, and academic careers. We sent out the permissions survey and five (5) follow-ups. In addition, we sent personalized emails to those co-authors, authors, and reviewers who had never responded to the permission surveys and whom Bobbie Spalter-Roth, Jim Witte, or Jean Shin knew. This latter technique was highly effective, and we have permission for 2,465 manuscripts, or about 40 percent of the manuscripts with email addresses.