Animate v.2004

Custom Data Integration

File Type Description
The four essential file types used in VoteWorld (JVW) are called CRR, RCL, INF, and DTL, after the file extensions they use, e.g. 1e.crr or 28u.dtl. The number and letter combination before the extension identifies the session/parliament of the data. For example, US House files are named using the convention “#.ext”, such as 1.dtl for the DTL file of the first session, while for the European Parliament an ‘e’ is added, giving 1e.dtl for the first session. Likewise, the naming convention for the UN is “#u.ext” and for the US Senate is “#s.ext”.
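As a rough illustration of the naming convention, the following Python sketch splits a file name into its session number, legislature suffix, and extension. The function name parse_jvw_filename is an assumption made for this example and is not part of JVW.

```python
import re

# The four extensions named in this section.
EXTENSIONS = {"crr", "rcl", "inf", "dtl"}

# Session number, optional single-letter legislature suffix, extension.
_NAME_RE = re.compile(r"^(\d+)([a-z]?)\.([a-z]+)$")


def parse_jvw_filename(filename):
    """Return (session, suffix, extension) for names such as '1e.crr'.

    The suffix is '' for the US House, 's' for the US Senate,
    'e' for the European Parliament and 'u' for the UN.
    """
    match = _NAME_RE.match(filename.lower())
    if match is None:
        raise ValueError(f"not a '#.ext' style name: {filename!r}")
    session, suffix, ext = match.groups()
    if ext not in EXTENSIONS:
        raise ValueError(f"unknown extension: {ext!r}")
    return int(session), suffix, ext


print(parse_jvw_filename("28u.dtl"))
```

A custom legislature would presumably use its own suffix letter, so the sketch returns the suffix as found rather than rejecting unknown letters.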
Each of these file types holds different data, as described below. Where possible, each data value is separated by a “space-comma-space” combination. For example:
“G , ANGLADE” or “1402 , 6”
If there is no space on either side of a comma, or if the commas are not present, the data will be considered invalid and the program will not run. The only exception to this rule is the US Congress data, which was formatted before the other legislatures and is assumed to be the only unique case. Where data is not available, such as DW-nominate scores, n/a is used. Where an integer is used to refer to a party id or a country id (as described below), a code table explaining what the values stand for should also be provided, usually in Excel or text format.
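The separator rule can be checked before data is handed to JVW. The sketch below is only an illustration: split_fields is a hypothetical helper, and it assumes that no field value itself contains a comma, which holds for the CRR, RCL, and legend formats described in this document.

```python
SEPARATOR = " , "   # the "space-comma-space" combination
MISSING = "n/a"     # marker for unavailable data


def split_fields(line):
    """Split one record on ' , ' and map 'n/a' to None.

    Raises ValueError when a comma appears without surrounding spaces
    or when no separator is present at all.
    """
    line = line.strip()
    if SEPARATOR not in line:
        raise ValueError(f"no 'space-comma-space' separator in: {line!r}")
    fields = [field.strip() for field in line.split(SEPARATOR)]
    for field in fields:
        if "," in field:
            raise ValueError(f"comma without surrounding spaces in: {field!r}")
    return [None if field == MISSING else field for field in fields]


print(split_fields("G , ANGLADE"))
print(split_fields("1402 , 6"))
```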

Custom Data Format
The file format used for custom data for legislatures not already a part of JVW follows. Once all file types are prepared, they need to be zipped together for ease of integration into JVW.

CRR Files
CRR files contain personal or specific information about a politician or a country, such as where the politician is from, party affiliations, and DW-nominate scores. Each line of the crr file holds the following information.

polid(int) , tokenRep(string) , name with spaces(string) , geographic name(string) , x1float(float) , y1float(float) , descriptor1(int) , descriptor2(int) , descriptor3(int)
ex:
50 , G , ANGLADE Magdeleine , Spain , 0.363 , -0.848 , 5 , 1402 , 6
52 , M , ANSART Gustave , Italy , n/a , n/a , 7 , 1409 , 5

polid - is the unique identification number for each politician. In the event that the politician is present in multiple sessions, their id never changes across sessions. This number should be an integer. This value may not be absent.
tokenRep – is a character representation of the party this politician belongs to that will be used to display the politician on the ideological map. This should be a single letter character and may not be absent.
name – the politician’s full name. Spaces are allowed between first, last, and/or middle name, and no commas should be used to separate name parts. This value may not be absent.
geographic name – the geographical full name with which to associate the politician. Spaces are allowed between multiple name parts and no commas should be used to separate name parts. In the case that a map is not being used, these values should be unique and not n/a since politicians need to be grouped in some unique fashion. This value may not be absent.
x1float/y1float – are floating point values for the DW-nominate scores in the first and second dimension respectively. These values may be absent.
descriptor1/descriptor2/descriptor3 – are other optional values by which politicians may be grouped (party, region, etc.) in addition to the mandatory grouping by geographic name. These values are integers and are supplied with Legend Tables that explain what each integer stands for. These values may be absent.
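To make the nine-field layout concrete, here is a minimal Python sketch of a reader for one CRR line. The names CrrRecord and parse_crr_line are hypothetical and chosen for this example only; the sample line is the first example above, written with a space on both sides of every comma.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CrrRecord:
    polid: int
    token_rep: str
    name: str
    geographic_name: str
    x1: Optional[float]                  # DW-nominate, first dimension
    y1: Optional[float]                  # DW-nominate, second dimension
    descriptors: Tuple[Optional[int], ...]  # descriptor1..descriptor3


def _optional(text, convert):
    """Convert a field, treating 'n/a' as an absent value."""
    return None if text == "n/a" else convert(text)


def parse_crr_line(line):
    fields = [field.strip() for field in line.strip().split(" , ")]
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, found {len(fields)}")
    polid, token, name, geo, x1, y1, d1, d2, d3 = fields
    if len(token) != 1:
        raise ValueError(f"tokenRep must be a single character: {token!r}")
    return CrrRecord(
        polid=int(polid),
        token_rep=token,
        name=name,
        geographic_name=geo,
        x1=_optional(x1, float),
        y1=_optional(y1, float),
        descriptors=tuple(_optional(d, int) for d in (d1, d2, d3)),
    )


record = parse_crr_line(
    "50 , G , ANGLADE Magdeleine , Spain , 0.363 , -0.848 , 5 , 1402 , 6"
)
print(record.name, record.x1, record.descriptors)
```

Because the field count is checked first, a line with a missing or extra value is rejected before any conversion is attempted.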

RCL Files
RCL files contain the actual votes of each politician, and should not be used to store additional descriptive data about the politician. The minimum data needed on each line is the politician/country id number, followed by a list of vote data. The vote data uses the numbers 0 – 9 to describe the votes and follows the convention:
0 – politician was not a member to vote (occurs when a seat is vacated or taken in the middle of a session)
1 – yes vote
2 – pair yes vote
3 – Ann. yes vote
4 – Ann. no vote
5 – pair no vote
6 – no vote
7 – blank
8 – present but didn’t vote
9 – abstain vote

polid(int) , v1(int) , ... , vn(int)
ex: 2 , 6 , 6 , 9 , 9 , 9 , 9 , 9 , 9 , 9 , 1 , 9 , 9 , 9 , 1

polid – the unique integer identification number for the politician.
v1-vn – the first through nth vote of the politician. Note that there is no terminating comma.
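A line of an RCL file can be read and tallied along the following lines. This is a sketch only; parse_rcl_line is a hypothetical name, and the tally simply counts how often each code occurs without interpreting it.

```python
from collections import Counter


def parse_rcl_line(line):
    """Return (polid, votes) for one RCL line.

    A terminating comma leaves a non-numeric last field, so int()
    rejects such a line with ValueError.
    """
    fields = [field.strip() for field in line.strip().split(" , ")]
    if len(fields) < 2:
        raise ValueError("expected a polid followed by at least one vote")
    polid = int(fields[0])
    votes = [int(value) for value in fields[1:]]
    invalid = [value for value in votes if not 0 <= value <= 9]
    if invalid:
        raise ValueError(f"vote codes must be 0-9, found {invalid}")
    return polid, votes


polid, votes = parse_rcl_line(
    "2 , 6 , 6 , 9 , 9 , 9 , 9 , 9 , 9 , 9 , 1 , 9 , 9 , 9 , 1"
)
tally = Counter(votes)
print(polid, len(votes), tally[1], tally[6], tally[9])  # 2 14 2 2 10
```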

INF Files
INF files, like CRR files, hold descriptive information, in this case about a particular vote or issue on the table, such as the date, the number of yes/no votes, and most importantly the cutting line points.

"date" , voteNumber(int) , "title" , numYes(int) , numNo(int) , numAbstain(int) , d1float , z1float , d2float , z2float(floats, may be absent)
ex:
"19-Jul-79" , 2 , "Order of Business" , 78 , 218 , 26 , n/a , n/a , n/a ,
"19-Jul-79" , 3 , "Order of Business" , 29 , 205 , 6 , 0.528 , -0.234 , 0.253 , 0.127

“date” – is the date stored as a string literal surrounded by quotation marks. The day, month, and year components are separated by hyphens with no spaces, as in "19-Jul-79". This value may not be absent.
voteNumber – is the vote within the session and not throughout all sessions. This value is an integer and should not be absent.
“title” – is a short descriptor of the vote, enclosed in quotation marks. Spaces are allowed between words and this value may not be absent.
numYes – is an integer specifying the number of politicians that voted yes on this issue. This value may not be absent.
numNo - is an integer specifying the number of politicians that voted no on this issue. This value may not be absent.
numAbstain - is an integer specifying the number of politicians that abstained on this issue. This value may not be absent.
d1float, z1float, d2float, z2float – are the values used in calculating the cutting line along the first and second dimensions. d1 is the distance from z1 to the center of the cutting line along the first dimension, and d2 is the distance from z2 to the center of the cutting line along the second dimension.
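The ten-field INF layout can be read in much the same way as a CRR line. The sketch below uses hypothetical names (InfRecord, parse_inf_line) and assumes that the quoted title never contains a “space-comma-space” sequence of its own. The date is kept as text rather than converted.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class InfRecord:
    date: str
    vote_number: int
    title: str
    num_yes: int
    num_no: int
    num_abstain: int
    cutting: Tuple[Optional[float], ...]  # (d1, z1, d2, z2)


def _unquote(text):
    """Strip the surrounding double quotes from a string literal."""
    if len(text) < 2 or text[0] != '"' or text[-1] != '"':
        raise ValueError(f"expected a quoted string, found {text!r}")
    return text[1:-1]


def _optional_float(text):
    return None if text == "n/a" else float(text)


def parse_inf_line(line):
    fields = [field.strip() for field in line.strip().split(" , ")]
    if len(fields) != 10:
        raise ValueError(f"expected 10 fields, found {len(fields)}")
    date, number, title, yes, no, abstain = fields[:6]
    return InfRecord(
        date=_unquote(date),
        vote_number=int(number),
        title=_unquote(title),
        num_yes=int(yes),
        num_no=int(no),
        num_abstain=int(abstain),
        cutting=tuple(_optional_float(value) for value in fields[6:]),
    )


vote = parse_inf_line(
    '"19-Jul-79" , 3 , "Order of Business" , 29 , 205 , 6 , '
    "0.528 , -0.234 , 0.253 , 0.127"
)
print(vote.title, vote.num_yes, vote.cutting)
```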

DTL Files
Sometimes you may want to give more detailed information about a vote in the form of sentences, rather than have the user interpret what particular data codes mean. In this case only, a DTL file accompanies the INF file, and this is where you can include as detailed a “string literal” description as you like. Each description should appear on the same line number as the corresponding vote in the INF file. That is, if a vote detail is absent or not needed in the DTL file, that line should contain “” rather than the description of the next vote. This one-to-one correspondence between INF and DTL files is essential.

"some long vote description for each vote on each line"
ex:
"1st vote - Order of Business"
"2nd vote - Order of Business"

Legend Files
In the case where the optional descriptors are used in the CRR Files, JVW will prompt for the files that hold this legend information. The order in which they are loaded into JVW must match the order of their appearance in the CRR File. That is, the file holding the code table for descriptor1 should be loaded first, descriptor2 second, and descriptor3 third. Usually, these files are simple text files with the following format.

integerID(unique integer) , tokenRep(string) , fullDescriptorName(string)
ex:
1302 , B , Bloque Nacionalista Gallego
1204 , G , Bündnis 90/Die Grünen

integerID – is the unique integer code used in the CRR file. This value may not be absent.
tokenRep – is the character token used to depict this descriptor on the ideological map. This value may not be absent.
fullDescriptorName – the full name of the descriptor (spaces are allowed). This value may not be absent.
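A legend file maps naturally onto a lookup table keyed by the integer code. The sketch below follows the two example lines above (no terminating character at the end of a line); parse_legend is a hypothetical name used only for illustration.

```python
def parse_legend(lines):
    """Build {integerID: (tokenRep, fullDescriptorName)} from legend lines."""
    legend = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        fields = [field.strip() for field in line.split(" , ")]
        if len(fields) != 3:
            raise ValueError(f"expected 3 fields, found {len(fields)}: {line!r}")
        code = int(fields[0])
        if code in legend:
            raise ValueError(f"integerID {code} is not unique")
        legend[code] = (fields[1], fields[2])
    return legend


legend = parse_legend([
    "1302 , B , Bloque Nacionalista Gallego",
    "1204 , G , Bündnis 90/Die Grünen",
])
print(legend[1302])
```

A descriptor value read from a CRR line can then be decoded with a plain dictionary lookup such as legend[1302].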

Map Files
JVW will allow researchers the option of viewing both ideological and geographical maps. While no additional material is required on the part of the researcher to view ideological maps, the option to view geographical maps will require shapefiles.
Shapefiles are a Geographic Information System (GIS) data format developed by ESRI. For more information or an introduction to GIS, see www.esri.com or www.gis.com. Shapefiles, the map files JVW works with, come in three essential file types: shp, dbf, and shx. Besides esri.com, sources of shapefiles include http://nationalatlas.gov. A simple Google search for “Europe shapefile” or “America shapefile” will also find usable map files, but even then they need to be formatted as follows.
First, you will need to open up the dbf file in Excel. This is a screen shot of an unformatted dbf file:

You want to format the dbf file so it has only 3 columns. The first column holds the geographic name, and has the header NAME. The second column may hold any other descriptor for the area, such as the country's capital. If there is no other descriptor you wish to use, simply copy the contents of the first column. The third column has the header VOTE. The numbers in this column should be unique for each entry; a simple approach is to number the rows sequentially starting from 0. This is what a formatted dbf file would look like:

Once you have these three columns, you will want to save your dbf file with the changes. To do this, you will need to go to Insert->Name->Define.

Select ‘Database’ from the drop down window at the top, then fill in the row/column parameters at the bottom, or click on the little square icon on the bottom right of the window, and select all rows and columns with data.


Click ‘OK’ and save your changes. When you close Excel, answer yes to any prompts about saving changes and overwriting existing files. Make sure your dbf, shx, and shp files all have the same name (they usually do by default). The final step in preparing your map files for use with JVW is to zip the shp, dbf, and shx files together. In addition, remember to zip all your crr, rcl, inf, and dtl files together.
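The zipping step can be done with any archive tool. For those who prefer to script it, the sketch below uses Python's standard zipfile module; the function name zip_map_files and the file names in the usage lines are placeholders, not names JVW expects.

```python
import zipfile
from pathlib import Path

MAP_EXTENSIONS = (".shp", ".dbf", ".shx")


def zip_map_files(folder, base_name, archive_path):
    """Zip base_name.shp/.dbf/.shx from folder into archive_path.

    Checks first that all three files exist under the same base name,
    then returns the names stored in the archive.
    """
    folder = Path(folder)
    members = [folder / f"{base_name}{ext}" for ext in MAP_EXTENSIONS]
    missing = [member.name for member in members if not member.is_file()]
    if missing:
        raise FileNotFoundError(f"missing map files: {', '.join(missing)}")
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for member in members:
            # Store only the file name, not the full folder path.
            archive.write(member, arcname=member.name)
    return [member.name for member in members]


# Example usage (placeholder paths):
# zip_map_files("maps", "europe", "europe_map.zip")
```

The same approach, with the crr, rcl, inf, and dtl extensions in place of the map extensions, would cover the data files.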