Distributed Cataloging on the Internet: the RePEc project

José Manuel Barrueco Cruz and Thomas Krichel

José Manuel Barrueco Cruz
Biblioteca de Ciències Socials "Gregori Maians"
Universitat de València
Campus dels Tarongers s/n
46071 València
Spain
jose.barrueco@uv.es

Thomas Krichel
Department of Economics
University of Surrey
Stag Hill
Guildford GU2 7XH
United Kingdom
T.Krichel@surrey.ac.uk

Abstract
Cataloging online documents requires a finer level of granularity than cataloging many other objects. Collecting that data is a costly process to achieve and manage. One possible approach towards cataloging these resources is to get a community of providers involved in cataloging the materials that they provide. This paper introduces RePEc, at http://netec.wust.edu/RePEc, as an example of such an approach. RePEc is mainly a catalog of research papers in Economics. As of May 1999, it is based on a set of over 80 archives which all work independently yet are interoperable. Together they provide data about almost 60,000 preprints and over 10,000 published articles. In principle each institution participating in RePEc provides its own papers by providing and maintaining an archive. The key issue of the paper is to evaluate the success of that decentralized approach in providing data of reasonable quality.

José Manuel Barrueco Cruz is a librarian at the Universitat de València. Thomas Krichel is a lecturer in Economics at the University of Surrey. Both welcome comments on this paper; write to wopec@netec.mcc.ac.uk. This paper is available in PDF.

1: Introduction

The electronic dissemination of Economics working papers can be traced back to the start of the Working Papers in Economics (WoPEc) project in April 1993. By May 1999 this single archive had grown into an interconnected network of over 80 archives holding over 14,000 downloadable working papers and over 50,000 descriptions of offline papers from close to 1,000 series. This is the largest decentralized collection of free online scientific documents in the world. The network of archives is called RePEc. This term was initially conceived to stand for "Research Papers in Economics". Nowadays it is best understood as a literal, because the objectives of RePEc go way beyond a database of scientific papers. In Section 2 we will introduce some of the broader aspects of RePEc.

RePEc data is freely available, in the sense that the provider pays for the provision of the data, not the user. In order to make such a system viable without public subsidy, the cost of providing the data must be spread among many agents (an agent is understood here and in the rest of the paper as a person or institution). This requirement has been a feature of RePEc right from the start of the collection in May 1997. Each participating provider sets up an archive on an http or ftp server. The archive supports the storage of structural data about objects relevant to Economics, and possibly the storage of some of the objects themselves. All objects in RePEc are uniquely identified by handles.

RePEc data can be accessed through a plethora of user services. Some are heavily used, for example the "IDEAS" user service had one million hits in just over 2 months in 1999. The main interest of this paper is to examine the collection aspect of the data. The idea that a coherent literature catalog can be put together by a large group of people who are physically dispersed and have very little personal communication, without the need for extensive training or intensive coordination, remains to be demonstrated. In May 1999, RePEc is two years old. We feel that this is a good time to review the operations of RePEc and the data that it has collected. Clearly the RePEc data is in a constant state of flux. To keep matters simple we took a dump of the data on 1 May 1999. In this paper we only refer to the state of the data on that date.

There are some aspects of RePEc that this paper does not discuss. We eschew any mention of the data on software, books, etc. to concentrate on the collection of traditional academic papers, be they preprints or published articles. This data forms the bulk of the present collection. We also leave out the personal and institutional data which are not included in the paper and article templates. We aim to use such data to build a fully relational database system that describes Economics as a discipline. We will report on such efforts in future papers.

In Section 3 we describe the setup of an individual archive through the use of an example. In Section 4 we look at the actual network of archives and the social fabric that supports them. In Section 5 we look at some aggregative aspects of the complete RePEc dataset. Section 6 concludes this paper with an overall evaluation of the methods used by RePEc.

2: RePEc

The nature of RePEc is not precisely defined. Most people think about it as a collection of archives and services that provide data about Economics. More precisely, RePEc is most commonly understood as referring to three things. First it is a collection of archives that provide data about Economics. Second it is the data that is found on these archives. Third, it is often also understood to represent the group of people who build archives and channel the data from the archives to the users. In that latter sense RePEc has no formal management structure.

RePEc has two aims. The "cataloging aim" is to provide a complete description of the Economics discipline that is available on the Internet. The "publishing aim" is to provide free access to Economics resources on the Internet.

The basic principle of RePEc can be summarized as follows:

Many archives ---> One dataset ---> Many services

The basic RePEc concepts are: archive, site and service.

An "archive" is a space on a public access computer system which makes data available. It is a place where original data enters the system. There is no need to run any software other than an ftp or http daemon that makes the files in the archive available upon request.

A "site" is a collection of archives on the same computer system. It usually consists of a local archive augmented by frequently updated ("mirrored") copies of remote archives.

A "service" is a rendering of RePEc data in a form that is available to the end user.

All archives hold papers and metadata about papers, as well as software that is useful to maintain archives. Everything contained in an archive may be mirrored. For example, if the full text of a paper is in the archive, it may be mirrored. If the archive does not wish the full text to be mirrored, it can store the papers outside the archive. The advantage of this "remote storage" is that the archive maintainer will get a complete set of access logs to the file. The disadvantage is that every request for the file will have to be served from the local archive rather than from the RePEc site that the user is accessing. Of course an archive may also contain data about documents that are exclusively available in print.

An obvious way to organize the mirroring process would be to mirror the data of all archives to a central location. This central location would in turn be mirrored to the other RePEc sites. The founders of RePEc did not adopt that solution, because it would be quite vulnerable to mistakes at the central site. Instead each site installs the mirroring software and mirrors "on its own", so to speak. Not all of them adopt the same frequency of updating. Many update every night, but a minority only updates every week. It is therefore not known how long it takes for a new item to be propagated through the system.

There is no need for every site to mirror the complete contents of every archive in the system. To conserve disk space and bandwidth some sites only mirror bibliographic information rather than the documents that an archive may contain. Others mirror all the files of an archive. Others may mirror only parts of a few archives. The software that is used to mirror the archive is provided at RePEc:all. It first mirrors the central archive. This software then reads a configuration file and then writes batch calls to the popular "mirror" program for ftp and the "w3mir" script for http archives.
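The choice of mirroring tool described above can be illustrated with a minimal Python sketch. The configuration layout (`SITE_CONFIG`, the `scope` key and the second archive entry) is hypothetical and is not taken from the RePEc software; only the pairing of "mirror" with ftp archives and "w3mir" with http archives comes from the text. No command-line options are shown, because none are given here.

```python
from urllib.parse import urlparse

# Hypothetical site configuration: which archives to mirror and how much.
SITE_CONFIG = [
    {"handle": "RePEc:sur",
     "url": "ftp://www.econ.surrey.ac.uk/pub/RePEc/sur",
     "scope": "all"},
    {"handle": "RePEc:arc",                       # placeholder archive
     "url": "http://example.org/RePEc/arc",       # placeholder URL
     "scope": "metadata"},
]


def mirror_tool(url):
    """Return the name of the program used for this kind of archive."""
    scheme = urlparse(url).scheme
    if scheme == "ftp":
        return "mirror"
    if scheme == "http":
        return "w3mir"
    raise ValueError(f"unsupported archive URL: {url}")


def mirror_plan(config):
    """List (handle, tool, scope) for every archive in the configuration."""
    return [(entry["handle"], mirror_tool(entry["url"]), entry["scope"])
            for entry in config]


plan = mirror_plan(SITE_CONFIG)
```

A real site would turn each entry of such a plan into a batch call to the corresponding program.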

Each service has its own name. A service that is based on mirrored scripts may run at many locations. Within reason, all services are free to use any part of the RePEc data as they see fit. For example one service may only show papers that are available electronically; others may restrict the choice further to act as quality filters. In this way services implement constraints on the data, whether they be availability constraints or quality constraints.

There is no official list of all user services but http://netec.wustl.edu/WoPEc.html has a fairly comprehensive list. The ability of RePEc data to be used in many user services is the single most important reason behind its success. However in this paper we concentrate on the quality of the data itself, rather than on the implementation and use of user services.

3: The structure of an archive

RePEc stands on two pillars. The first is an attribute:value template metadata format called ReDIF. This acronym stands for Research Documentation Information Format but it is best understood as a literal. ReDIF defines a number of templates. Each template describes an object in RePEc. It has a set of allowable fields, some mandatory and some repeatable. The second pillar is the Guildford protocol. It fixes rules on how to store ReDIF in an archive. It basically indicates which files may contain which templates. It is possible to deploy ReDIF without using the Guildford protocol. But in the following we will ignore this conceptual distinction, because it is easiest to understand the structure and contents of an archive through an example. Therefore we will list files in the way required by the protocol as well as the contents of the files, which are in fact written in ReDIF. This is done in Subsection 3.1. We return to technical aspects of ReDIF in Subsection 3.2.

3.1: The Guildford Protocol

RePEc identifies each archive by a simple identifier or handle. Here we look at the archive RePEc:sur which lives at ftp://www.econ.surrey.ac.uk/pub/RePEc/sur. On the root directory of the archive, there are two mandatory files. The file surarch.rdf contains a single ReDIF archive template.

Template-type: ReDIF-Archive 1.0
Name: University of Surrey Economics Department
Maintainer-Email: T.Krichel@surrey.ac.uk
Description: This archive provides research papers from the
 Department of Economics of the University of Surrey, 
 in the U.K.
URL: ftp://www.econ.surrey.ac.uk/pub/RePEc/sur
Homepage: http://www.econ.surrey.ac.uk
Handle: RePEc:sur

In this file we find basic information about the archive. The other mandatory file is surseri.rdf. It must contain one or more series templates.

Template-Type: ReDIF-Series 1.0
Name:  Surrey Economics Online Papers
Publisher-Name: University of Surrey, Department of Economics
Publisher-Homepage: http://www.econ.surrey.ac.uk
Maintainer-Name: Thomas Krichel
Maintainer-Email: T.Krichel@surrey.ac.uk
Handle: RePEc:sur:surrec

These two files are the only mandatory files in the Guildford protocol. If these are the only files present in the archive then all the archive is doing is reserving the archive and the series codes. All documents have to be in a series. The papers for the series RePEc:sur:surrec are confined to a directory called surrec. It may contain files of any type. Any file ending in ".rdf" is considered to contain ReDIF templates. Let us consider one of them, surrec/surrec9601.rdf. (We suppress the Abstract: field to conserve space.)

Template-Type: ReDIF-Paper 1.0
Title: Dynamic Aspect of Growth and Fiscal Policy
Author-Name: Thomas Krichel
Author-Email: T.Krichel@surrey.ac.uk
Author-Name: Paul Levine
Author-Email: P.Levine@surrey.ac.uk
Author-WorkPlace-Name: University of Surrey
Classification-JEL: C61; E21; E23; E62; O41
File-URL: ftp://www.econ.surrey.ac.uk/pub/
 RePEc/sur/surrec/surrec9601.pdf
File-Format: application/pdf
Creation-Date: 199603
Revision-Date: 199711
Handle: RePEc:sur:surrec:9601

Note that we have two authors here. The "Author-WorkPlace-Name" attribute only applies to the second author. We discuss this point next.
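The naming conventions visible in this example can be summarized in a short Python sketch. It assumes, by generalizing from surarch.rdf and surseri.rdf, that the two mandatory files are named after the archive code, and that handles are colon-separated with up to four parts (naming authority, archive, series, number). The names `Handle`, `parse_handle` and `mandatory_files` are illustrative only and are not part of ReDIF or the Guildford protocol.

```python
from typing import NamedTuple, Optional


class Handle(NamedTuple):
    authority: str
    archive: str
    series: Optional[str] = None
    number: Optional[str] = None


def parse_handle(handle):
    """Split a handle such as RePEc:sur:surrec:9601 into its parts."""
    # At most three splits, so anything after the series stays in `number`.
    parts = handle.split(":", 3)
    if len(parts) < 2 or not all(parts):
        raise ValueError(f"malformed handle: {handle!r}")
    return Handle(*parts)


def mandatory_files(archive_code):
    """Assumed names of the archive and series files for an archive code."""
    return [f"{archive_code}arch.rdf", f"{archive_code}seri.rdf"]


paper = parse_handle("RePEc:sur:surrec:9601")
files = mandatory_files(paper.archive)
```

Under these assumptions the archive handle RePEc:sur, the series handle RePEc:sur:surrec and the paper handle RePEc:sur:surrec:9601 all parse with the same function, with the missing parts left as `None`.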

3.2: The ReDIF metadata format

The ReDIF metadata format is mainly an extension of the Karlsson and Krichel (1999) commonly known as the IAFA templates. In particular it borrows the idea of clusters from the draft:

There are certain classes of data elements, such as contact information, which occur every time an individual, group or organization needs to be described. Such data as names, telephone numbers, postal and email addresses etc. fall into this category. To avoid repeating these common elements explicitly in every template below, we define "clusters" which can then be referred to in a shorthand manner in the actual template definitions.

ReDIF takes a slightly different approach to clusters. A cluster is a group of fields that jointly describe a repeatable attribute of the resource. This is best understood by an example. A paper may have several authors. For each author we may have several fields that we are interested in: the name, email address, homepage etc. If we have several authors then we have several such groups of attributes. In addition each author may be affiliated with several institutions. Here each institution may be described by several attributes for its name, homepage etc. Thus a nested data structure is required. It is evident that this requirement is best served in a syntax that explicitly allows for it such as XML. However in 1997--when ReDIF was designed--XML was not available. We are still convinced that the template syntax is more human readable and more easily understood. However the computer cannot find which attributes correspond to the same cluster unless some ordering is introduced. Therefore we proceed as follows. For each group of attributes that make up a cluster we specify one attribute as the "key" attribute. Whenever the key attribute appears a new cluster is supposed to begin. For example if the cluster describes a person then the name is the key. If an "author-email" appears without an "author-name" preceding it the parsing software aborts the processing of the template.
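The key-attribute rule can be sketched in a few lines of Python. This is an illustration of the idea, not the actual ReDIF parsing software: the function names are hypothetical, only the author cluster is handled (the nested workplace fields are simply stored with their author), and joining continuation lines with a single space is an assumption.

```python
SAMPLE = """\
Template-Type: ReDIF-Paper 1.0
Title: Dynamic Aspect of Growth and Fiscal Policy
Author-Name: Thomas Krichel
Author-Email: T.Krichel@surrey.ac.uk
Author-Name: Paul Levine
Author-Email: P.Levine@surrey.ac.uk
Author-WorkPlace-Name: University of Surrey
Handle: RePEc:sur:surrec:9601
"""


def parse_fields(text):
    """Return the template as a list of (attribute, value) pairs."""
    fields = []
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0].isspace() and fields:
            # Continuation line: append to the previous value (assumed rule).
            name, value = fields[-1]
            fields[-1] = (name, value + " " + line.strip())
            continue
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not an attribute:value line: {line!r}")
        fields.append((name.strip().lower(), value.strip()))
    return fields


def group_authors(fields):
    """Group author-* fields into clusters, each opened by author-name."""
    authors = []
    for name, value in fields:
        if name == "author-name":
            # The key attribute starts a new cluster.
            authors.append({"name": value})
        elif name.startswith("author-"):
            if not authors:
                # Mirrors the rule that parsing aborts in this case.
                raise ValueError(f"{name} appears before any author-name")
            authors[-1][name[len("author-"):]] = value
    return authors


authors = group_authors(parse_fields(SAMPLE))
```

Applied to the sample, the workplace ends up only in the cluster of the second author, because that is the cluster that was open when the field appeared.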

Note that the designation of key attributes is not a feature of ReDIF. It is a feature of the template syntax of ReDIF. It is only the syntax that makes nesting more involved. We do not think that this is an important shortcoming. In fact we believe that the nested structure involving the persons and organizations should not be included in the document templates. What should be done instead is to separate the personal information out of the document templates into separate person templates.

4: The archives

The main issue of this paper is to evaluate whether the decentralized collection of conventional bibliographic data works on the Internet. Given the requirements on the data that we have sketched in the previous section, will the staff at the providing institutions be able to supply it? This question is addressed in this section and in the next one.

In order to understand how the archives work in practice, we look at the social structure supporting the archives. We set out our understanding of how archives are maintained internally and supported externally. In theory all providers are equal. They all read the same documentation; then they all register once with the same central archive etc. In practice some are more equal than others. These people are not only involved in the operation of a local archive but they also:

act as consultants for the erection of new archives

work on implementation software

develop and maintain RePEc services

For the purpose of this paper it is therefore useful to think about these persons as a small "RePEc team" whose members share roughly the same knowledge and enthusiasm. It consists of

José Manuel Barrueco Cruz, Assistant Librarian, Universitat de València

Christopher Baum, Associate Professor of Economics, Boston College

Sune Karlsson, Assistant Professor, Stockholm School of Economics

Thomas Krichel, lecturer in Economics, University of Surrey

Sergei I. Parinov, Head of the Department for Information Technologies in the Economic System at the Institute of Economics and Industrial Engineering of the Siberian Branch of the Russian Academy of Sciences

Robert P. Parks, Professor of Economics, Washington University St. Louis

Geoff Shuetrim, Researcher, Reserve Bank of Australia

Christian Zimmermann, Assistant Professor of Economics, University of Québec at Montréal.

The level of involvement of these persons varies. But for the purpose of this study these people are assumed to be equal. This assumption allows us to bring forward the following definition. We say that an archive is "controlled" if it is managed by a member of the RePEc team or managed by a person working under the supervision of a team member. The other archives will be called "loose" archives. These archives are not under the direct control of a member of the RePEc team. We will address the controlled and loose archives in Subsection 4.1 and Subsection 4.3 respectively.

4.1: The controlled archives

Within the controlled archives we distinguish five categories.

"archives of controlled collections" bring together material collected from different sources. The data collection effort is made by a member of the RePEc team or an agent working directly under her control.
"archives of hosted collections" combine material collected from different sources. The data collection effort is made by a person outside the RePEc team but the RePEc archive where the data is held in the form of ReDIF is on a disk to which a member of the RePEc team has write access.
"archives of hosted sources" are archives for data from a single source that is provided by a person outside the RePEc team. The RePEc archive where the data is held in the form of ReDIF is on a disk to which a member of the RePEc team has write access.
"archives of controlled sources" are archives for data from a single source that is provided by a person who belongs to the RePEc team.
"dead archives" are archives that are no longer being maintained on the host where they lived originally and that have been moved to a site that is controlled by a member of the RePEc team.

Table 1 lists the archives within these categories.

Each archive is identified by a three-letter code. Some elementary metadata about the archive like its name, its url and some basic contents information are polled by a special central archive with the handle RePEc:all, where "RePEc" is the naming authority and "all" is the archive code.

Table 1: Controlled archives by category

type                     codes of archives
controlled collections   RePEc:wpa RePEc:wop RePEc:wuk RePEc:hhs RePEc:hhb
hosted collections       RePEc:fth RePEc:fip
hosted sources           RePEc:bbk RePEc:cpr RePEc:nbr RePEc:red RePEc:ecm RePEc:els
controlled sources       RePEc:val RePEc:sur RePEc:apr RePEc:boc RePEc:nos RePEc:ecm RePEc:cre
dead archives            RePEc:bru RePEc:tex RePEc:tcd

4.1.1: Controlled collections

The RePEc:wop archive contains the data from the WoPEc project. This project was opened by Thomas Krichel in April 1993 on a gopher server. At that time he placed the first electronic working paper in the discipline. During the following two years the emphasis of the archive shifted towards the description of remote documents, i.e. those which are not available on the local server. In 1998, as a result of a major collection and awareness raising effort undertaken by the WoPEc project in the United Kingdom the part of the RePEc:wop data that related to series in this country was taken out to form the RePEc:wuk archive. The maintenance of the series from outside the United Kingdom was contracted out to a team at the Siberian Branch of the Russian Academy of Sciences working under the leadership of Sergei Parinov.

RePEc:wpa is the data from the Economics working paper archive at WUStL (EconWPA) as operated by Bob Parks. EconWPA opened in September 1993. It is a collection of electronic papers. It operates through author submissions in much the same way as the xxx archive operates for Physics. Authors upload their papers to the archive and supply the bibliographic information themselves. In 1997 a converter was written by the Siberian team that converts these internal data to ReDIF.

RePEc:hhs and RePEc:hhb are archives maintained by the S-WoPEc and S-WoBa projects. These are Swedish national collections that gather data from many institutes. Section 4 of Karlsson and Krichel (1999) contains a comprehensive description of the way that these projects operate.

4.1.2: Hosted collections

RePEc:fth is by far the largest archive. It describes over 28,000 papers but none of them are downloadable. It is a database produced by Fethy Mili, the Economics Departmental Librarian at Université de Montréal. He collects data on printed working papers that he receives. His collection is one of the largest on this planet. He receives about 300 series. In order to update the holdings in the archive he uploads an ASCII dump of the recent additions to his database. He then activates three scripts. These convert his data to ReDIF, add the new data to the existing stock, remove duplications (that involves some manual work) and clean the file space. These scripts have been written by Thomas Krichel, who has direct access to Fethy Mili's files. The two also collaborate to ensure that there are not many duplications between the data in RePEc:fth and the data provided in other archives. This effort requires the collaboration of the other archives as well. Clearly, data supplied directly by the provider of the papers is preferable to data supplied by an intermediary. However the data of the providers does not stretch as far back as Fethy's data. Ideally there should be a transfer of the data from the RePEc:fth archive to the archive of the original provider. The provider should then consolidate their data with the data from RePEc. There are cases where such a consolidation has been successful. However there are some cases where this consolidation is still forthcoming. In those cases there are double holdings of data.

RePEc:fip contains the Federal Reserve of the United States' "Fed in Print" database. This is a centrally collected database of all the documents that the Fed publishes. Every regional branch contributes data to the collection. These contributions are sent as a text dump via email to Thomas Krichel who then runs a converter into the ReDIF format. A similar process has been working since 1994, when the US Fed became the first governmental organization to contribute to the collection.

4.1.3: Controlled sources

These are archives that make material available from a single provider. RePEc:nbr and RePEc:els are based on the machine of the provider. For others the providers use a variety of means to bring the data to a machine of the RePEc team member. RePEc:cpr uses the most primitive way, physically delivering floppy disks to Thomas Krichel.

4.1.4: Dead archives

RePEc:bru, RePEc:tex and RePEc:tcd are dead. These are archives that are no longer being maintained and are kept at a site under the control of a member of the RePEc team. It is not clear at what stage the archive is no longer being maintained. The best indication comes when an archive maintainer informs a member of the RePEc team that (s)he is leaving the workplace where (s)he built the archive. Another indication is that the archive can no longer be found at the url indicated in the archive template. (Note that it is possible to move an archive without notifying the keeper of the RePEc:all archive. To do that the old and new archives need to be held open simultaneously and the archive url must be set to the new archive on both locations until the RePEc:all archive has passed and taken a copy of the archive file with the new location.) Finally another indicator is when the archive has not been updated for a long time. There is currently no policy towards dead archives. It is likely that the number of dead archives will increase initially because the number of archives is increasing. Later the problem is likely to decline if participation in RePEc becomes a common practice.

4.2: Provider support

The decentralized way of operating implies that any archive operator could write anything they like in the template. Racist, pornographic, arty-farty etc. expressions should be rare within the collection because each archive has to register with the keeper of the RePEc:all archive before the material from the archive can enter the RePEc distribution channels. All archives are based at recognized institutions or run by well-known academics. Any check on the semantics of the contents has to be performed by the archive maintainer herself. Maintainers should be quite competent at performing this task since they possess the local knowledge that is required. However to encode that information into the templates some "remote" or "global" knowledge is required. We provide this knowledge in the documentation. We also aim to give the archive maintainers feedback on their data. This is what we deal with in this section. In particular we aim to provide the contributors with reports on the validity of their data.

When we undertook this study we sent an email to all providers of loose archives. This email contained details about the checks. The checks were not publicized before in this way. They were included in the guide for archive maintainers that Christian Zimmermann maintains at http://ideas.uqam.ca/ideas/maintain.html. But it can be confidently assumed that when the data were collected the error files had not been used.

Let arc be the three-letter code of an arbitrary archive. Then at ftp://netec.mcc.ac.uk/pub/RePEc/help/arc.rech we store the result of a syntax check. It would be too long and too tedious to list the contents here. However we invite the reader to inspect those files. They are updated within a few hours of a change in the archive; therefore the time stamp on the file gives a good indication of the time when the archive was last modified.

The syntactic control is not the only one that we need to do. Since the templates are in fact within a relational framework we need to make sure that the relationships are valid. Even without the fully relational features that the dataset is aimed to eventually cover there are some relational features that need checking. For example, a paper handle RePEc:arc:xyzxyx:1999-1 would make no sense if there is no series RePEc:arc:xyzxyx defined in the dataset. If the series is missing it is likely that a user service on the web, for example, would have at least one broken link pointing from the paper to its series or vice-versa. Note that the relational control is computationally much more involved than the syntactical check. The syntax check only needs to look at an individual template. The relational check needs to be aware of the complete dataset. This has some important implications for the idea of a decentralized and relational database. The full extent of these implications has yet to be revealed. For the moment we provide a relational check at ftp://netec.mcc.ac.uk/pub/RePEc/help/arc.rela.
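The paper-to-series example above can be sketched in Python as follows. This is a minimal illustration, not the check that produces the arc.rela files; the function names are hypothetical, and the series handle is assumed to be the first three colon-separated parts of the paper handle.

```python
def series_of(paper_handle):
    """Series handle implied by a paper handle (first three parts)."""
    return ":".join(paper_handle.split(":")[:3])


def orphan_papers(paper_handles, series_handles):
    """Return the paper handles whose series is not defined anywhere.

    Both arguments must cover the complete dataset: a series template
    may live in a different file from the papers that refer to it.
    """
    known = set(series_handles)
    return [h for h in paper_handles if series_of(h) not in known]


papers = ["RePEc:sur:surrec:9601", "RePEc:arc:xyzxyx:1999-1"]
series = ["RePEc:sur:surrec"]
orphans = orphan_papers(papers, series)
```

Here the second paper is reported because no series RePEc:arc:xyzxyx is defined. The sketch also shows why this check cannot be run template by template: the set of known series has to be built from all the data first.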

The last check controls the URLs in the dataset. This is useful to find broken links to the full text of documents.

A dataset as large as RePEc will never be completely error-free. There are many mistakes which are not machine-understandable and which therefore need human control. Dealing with machine-understandable errors before they ever reach the user would be a desirable aim. At the moment only the syntax check is performed at the time when the ReDIF data is read. Ideally one would like to see a check of the relational features as well as the control of the URLs be performed when the data is read. The only way that this could be done is to compute centrally a list of blacklisted templates. This is an area for future work.

Finally there is an email discussion list for RePEc. It is called repec-admin. This is essentially meant for the common concerns of archive providers. There is not a lot of traffic on the list. Most discussions concern general policy issues such as copyright and data on usage. We estimate that only about one in three archives have a representative on the list.

4.3: The loose archives

In order to gather some data about the loose archives we sent an email to all 66 loose archives. It was sent to the addresses that are listed in the Maintainer-Email field of the archive template. The mail informed about the maintainer support that we detail in Subsection 4.2. (The text of the mail and the responses are available from the authors on request.) After sending a reminder, we received 22 responses. The small number of replies that we received is cause for concern. For anonymous questionnaires a response rate of one in three would be a great success. However we are dealing with a situation where the maintainers have some knowledge about what we are doing. There must be a common concern between us and the maintainers. Recall that almost all of them took the initiative to open the archive in the first place.

The first question concerned the personal address of the respondent. In the second question we were interested in the professional status of the respondent:

How would you describe your function? ("academic faculty", "researcher", "computing officer", "publications officer", "secretary", "student" etc).

Eleven respondents are academic staff, either in teaching or in full-time research positions, and three are professional researchers outside academia--we will group these as faculty respondents in the following. There are also three students, two computer systems administrators, one librarian, one secretary, and one publications officer. The academic faculty were the first to respond; most of the answers we received from the others came after the reminder email.

The next question concerned the origin of the archive:

Briefly describe how your RePEc archive was initiated. For example: who had the idea, why did you open the archive...

Here all the faculty respondents write that they opened the archives themselves. Eight of them mention that they have been approached by a member of the RePEc team. Six of them mention general awareness of RePEc. One student maintainer says that he was approached by the webmaster. All other non-faculty respondents report that they have been approached by faculty members.

We can already offer one interpretation. It seems that work on the archive goes best when the academics take the initiative.

Briefly describe how the archive is maintained. For example: who is involved in maintaining it, do you update the files regularly, how do you collect the data...

The answers to this question are very difficult to summarize. Suffice it to say that we did not find two institutions that have the same procedures. Some involve as many as three persons in the process. Some use in-house scripts to update the server or to input the data. Even when academics maintain the archive on their own, every single one of them seems to find a different way to do it. We conclude that trying to write software that would help to automate the process would be very difficult because the software would have to cope with a large variety of individual organizational structures.

Are the researchers aware that their work is disseminated through IDEAS/NEP/WoPEc etc?

Here we have 16 answers that broadly say "yes" versus 6 that broadly say "no". A word of caution is in order here. An archive maintainer can easily state that the researchers are aware of it if she told them once that the research was disseminated. As one maintainer put it: "I keep reminding them but my guess is, less than 30% remember".

Do you find it difficult/cumbersome to make the updates? What could we do to help you?

All respondents write that it was easy or fairly easy to perform the updates. Many add that they did not need any further help. Suggestions are few and far between. None of them suggests procedural changes to the provision of the data. While this is a favorable result, we must not forget that only one in three maintainers replied. It is possible that the other maintainers are rather confused and therefore did not bother to reply. We have some anecdotal evidence to support this idea.

5: The total dataset

Table 2: Fields in the document templates

                      ReDIF-paper   ReDIF-article
field                   all   max       all   max
template-type         58254     1     10112     1
handle                58251     2     10110     1
title                 58235     2     10110     1
author-name           98321    14     13855     6
creation-date         52730     1      8819     1
revision-date           536     8
publication-date                        510     1
abstract              22984     3      1896     1
classification-jel    20194     2       436     1
keywords              39219     3      9084     1
keywords-attent         457     1
publication-status     6227     3      1568     1
note                   9011     1      1479     2
series                 4124     2
number                16021     2      1501     1
price                  4175     3
file-url              17259    22      1853     2
order-url              2417     1
contact-email          1141     1
availability           7169     2
length                33342    12
pages                                  7920     1
month                                   489     1
issue                                  8705     1
volume                                 1293     1
year                                   1293     1
journal                                 488     1
paper-handle                             19     1

5.1: Aggregate Contents

In Table 2, we examine the document data in RePEc as of 1 May 1999. For each field we give the total number of occurrences of the field in the "all" column and the maximum number of occurrences that the field has within a single template in the "max" column. The document data appear in the ReDIF-paper and the ReDIF-article templates.
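The two columns can be computed in a few lines. The following Python sketch is only an illustration: it assumes that each template has already been parsed into a list of (field, value) pairs, a representation chosen here for convenience, and the function name field_statistics and the sample handles are hypothetical.

```python
from collections import Counter


def field_statistics(templates):
    """Return {field: (all, max)} over a list of templates.

    Each template is assumed to be a list of (field, value) pairs,
    a hypothetical in-memory form used only for this illustration.
    """
    total = Counter()    # the "all" column: occurrences over all templates
    maximum = Counter()  # the "max" column: most occurrences in one template
    for template in templates:
        per_template = Counter(field for field, _ in template)
        total.update(per_template)
        for field, count in per_template.items():
            maximum[field] = max(maximum[field], count)
    return {field: (total[field], maximum[field]) for field in total}


# Two made-up templates; the handles are placeholders, not real data.
sample = [
    [("template-type", "ReDIF-Paper 1.0"), ("handle", "RePEc:xxx:wpaper:1"),
     ("title", "A"), ("author-name", "X"), ("author-name", "Y")],
    [("template-type", "ReDIF-Paper 1.0"), ("handle", "RePEc:xxx:wpaper:2"),
     ("title", "B"), ("author-name", "Z")],
]
stats = field_statistics(sample)
```

In this sample the author-name field occurs three times in total and at most twice within one template, which is how a row such as "author-name 98321 14" in Table 2 should be read.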

Total numbers for documents are given by the "template-type" and "handle" fields. Since each template should have exactly one type and exactly one handle, the tiny difference between the two numbers is made up of mistakes in the dataset. The title field is also required. It is encouraging to see that most documents have a creation date attached to them, because as the dataset grows it will become increasingly important to distinguish between recent and dated documents; only the former are likely to be of much interest. By contrast "revision-date" information is rare. Articles may also have a "publication-date". The difference between this field and the "creation-date" field is not clear. We consider this to be a design error in the template structure.

Let us consider the elements that refine the description of contents. We encourage contributors to provide abstracts. The presence of abstracts for about one in three papers is very positive. The abstract field can be repeated, which is desirable when there are abstracts in different languages. A large number of the papers have a Journal of Economic Literature (JEL) classification code attached to them. However, almost all papers in RePEc:fth, an archive that holds only offline papers, have the codes, and that explains a very large proportion of the classified material. This data has been compiled by a librarian. Among the electronic papers, only two in five have a classification field. We agree that this is a serious limitation to the quality of the data.

It would have been possible to require a classification number for each paper right from the start, but this would have hampered the collection effort. In particular it would have made it impossible for the WoPEc team to "snarf" bibliographic data from sites where JEL data was not available. There is also some concern among economists that their areas of work do not match these codes. The use of more complete and sophisticated classification schemes would not be possible. The main argument against requiring JEL classification codes was, however, that there is considerable opposition to the scheme in the heterodox Economics community, whose members feel that the JEL classification scheme reflects the view of the orthodoxy. Requiring JEL classification codes would have meant excluding these contributors. Then as now, only a tiny part of the collection could be grouped as heterodox. However, our aim is that RePEc be a broad church. This was the decisive argument against requiring the use of JEL codes.

A large number of templates have keywords. About 50% of these templates come from RePEc:fip, where each paper has a keyword. ReDIF allows for both free and qualified controlled vocabulary. The latter facility is used for the internal keyword scheme of the Attent: Research Memoranda database; these keywords are used only by the RePEc:dgr archive.

The "publication-status" field can be used to indicate where the paper has been submitted and where it has been formally published. This field appears in the data from large research bodies that have been issuing a series of papers for many years and that have data about the formal publication of the papers. The fields "series" and "number" are somewhat redundant since this information should also be available from the handle. The "price" field normally refers to the delivery of a printed copy. The mode of delivery is often just expressed in the "price" field. The "file-url" field gives the location of the full text or of a part of it. Usually it is the complete full text.

The document may have several components in addition to the full text. These can be listed as several "file" clusters. Each may carry an uncontrolled field about its function within the paper. For example the author may wish to supply a computer program that was used to produce the paper. In that case a whole series of files may be made available. However that is not the way the option of having many files is actually exercised. Most of the time it is used to include elements like graphics or tables that the author did not manage to include in the main document file.

The "order-url" field is used to point to an intermediate page that sits between our description and the files of the document. In that case we do not know whether the resource actually exists online. "order-url" may be used in conjunction with the "file-url" attribute. Note that there is no "order-email" field in the document templates. Such a field figures in the series template, because the ordering procedure should be the same for all papers in the series. The "contact-email" field may otherwise be used to contact somebody who has any connection with the paper. This field is only used by the contributors to the RePEc:wpa archive. The "availability" field is used most of the time to signal that the paper is no longer in print.

Finally a "length" attribute can be used to indicate how many pages the reader has to go through to read the paper. This field is present in all templates provided by RePEc:fth and it seems to appear in a surprisingly large number of other templates.

Articles have a number of specific attributes that are listed at the bottom of the table. Strictly speaking these are not descriptive elements of the articles themselves; rather, they relate to the position of the article within the journal. Finally, the "paper-handle" makes it possible to point from the preprint version to the article template.

Table 3: Attributes in the clusters of the document templates

file                      person                         organization
name           all  max   name                 all  max   name        all  max
url          19112    1   name              112176    1   name       8598    1
format       19024    1   postal                 8    1   postal     2118    2
size          2630    1   homepage            1557    2   homepage    596    2
function      1661    1   email               3166    2   email      1451    3
restriction   2548    1   phone                282    1   phone       164    1
                          fax                  259    1   fax         197    1
                          workplace-name      8598    4

5.2: The clustered data

The data available in Table 2 is not the complete set of information available in the dataset. It only lists the individual attributes and the key attributes of clusters in the paper and article templates. In Table 3 we have the data that is contained in the clusters in the document subset of the RePEc data. This data is therefore consistent with the data in Table 2.

There are three types of clusters, "file", "organization" and "person". The numbers that are present suggest that there are significant possibilities for a relational structure in the dataset between persons and their organizations. An interesting consideration in the person cluster is the high number of workplace templates. Providers of the data seem to attribute more importance to the workplace of a person rather than to her strictly personal data, e.g. her homepage. The only explanation that we can offer here is that most likely the data is provided by an agent of the workplace. The low number of homepages is an indicator which also suggests that in most cases the provider is not the author herself. Note also that the workplace information--when it is present--is much more complete than the corresponding data for the individuals.

5.3: Mistakes in the data

When a dataset is provided by various parties at the same time it is not a good idea to use the data directly. Recall that mirroring software is used to keep updated copies of archives on sites that unite more than one archive. An archive could provide a site with an arbitrary stream of nonsense data. To prevent such data from appearing in a user service we need to check the data at the time of reading. There are two types of problems that the software notices when it reads ReDIF templates: "errors" and "warnings". An "error" results in the template being discarded. A "warning" results in the value of the field being ignored. We deal with both problems below. Note that the distinction between errors and warnings is not part of the published ReDIF documentation. It is specified in the specification file of the ReDIF checking software, see Krichel and Kurmanov (1999).
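The difference between the two outcomes can be sketched as follows. This is a minimal Python illustration and not the ReDIF checking software itself: the (field, value) representation, the function name read_template, the short list of known fields and the two-code JEL subset are all assumptions made for the example. An unknown attribute stands in for an error and an invalid JEL value for a warning.

```python
# Fields the toy checker accepts; a small illustrative subset only.
KNOWN_FIELDS = {"template-type", "handle", "title", "author-name",
                "classification-jel"}
# A tiny illustrative subset of JEL codes, not the full scheme.
KNOWN_JEL = {"C61", "E52"}


def read_template(template):
    """Return the usable fields of a template, or None if it is discarded.

    template is assumed to be a list of (field, value) pairs.
    """
    kept = []
    for field, value in template:
        if field not in KNOWN_FIELDS:
            return None      # "error": the whole template is discarded
        if field == "classification-jel" and value not in KNOWN_JEL:
            continue         # "warning": only this value is ignored
        kept.append((field, value))
    return kept


with_warning = [("handle", "RePEc:xxx:wpaper:1"), ("title", "T"),
                ("classification-jel", "XX")]
with_error = [("handle", "RePEc:xxx:wpaper:2"), ("colour", "red")]

cleaned = read_template(with_warning)
discarded = read_template(with_error)
```

The first template survives without its classification; the second is dropped as a whole, so none of its data can reach a user service.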

5.3.1: ReDIF Errors

In Table 4 we show the errors that are in the data. At the start of the template, the first element has to be the template type. The border of the template is the zero-width space between the template-type statement and what precedes it. The checking software is able to signal the error that the template is not valid because it has an invalid start.

Another type of error is the omission of a required field. The handle field is of course a mandatory field. For papers and articles we also require the name of at least one author, and the title. The requirement that leads to most errors is the specification of the file format. A document--whether a paper or an article--will be represented in a number of files. For each of those files we require a statement about the file format (say PDF, postscript etc) as well as some indication about the compression or archiving software that is being used.

This requirement was inspired by our experience with user services. The early user services of RePEc faced the problem of teaching users what to do with the files once they had downloaded them. If the format of the file was documented, it became possible to link each file to a documentation page that would explain to users how to deal with the file. When such links were introduced at the WoPEc project in 1996 the number of queries fell dramatically. The requirement to state file formats was therefore introduced into the ReDIF specification.

The next errors listed in Table 4 concern fields that are repeated although ReDIF does not allow them to be repeated. No template may have two handles and no document may have two titles.
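The rules on required and non-repeatable fields lend themselves to a short check. The Python sketch below is an illustration under stated assumptions, not the actual checker: templates are taken to be lists of (field, value) pairs, and the function name structural_errors and the form of its return value are hypothetical. The file-format requirement inside file clusters is left out to keep the example short.

```python
REQUIRED = ("handle", "title", "author-name")  # required for papers and articles
NOT_REPEATABLE = ("handle", "title")           # may occur at most once


def structural_errors(template):
    """List (kind, field) errors for one document template.

    template is assumed to be a list of (field, value) pairs.
    """
    counts = {}
    for field, _ in template:
        counts[field] = counts.get(field, 0) + 1
    errors = []
    for field in REQUIRED:
        if counts.get(field, 0) == 0:
            errors.append(("absent", field))
    for field in NOT_REPEATABLE:
        if counts.get(field, 0) > 1:
            errors.append(("repeated", field))
    return errors


faulty = [("handle", "RePEc:xxx:wpaper:1"), ("title", "A"), ("title", "B")]
found = structural_errors(faulty)
```

Any template for which the list is non-empty would be discarded, since both kinds of problem count as errors rather than warnings.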

Table 4: ReDIF errors

description of error                                          number
Start template with 'template-type' attribute                    122
Required attribute <handle> is absent                              6
Required attribute <title> is absent                              23
Required attribute <author-name> is absent                         8
Required attribute <format> is absent in <file>                  264
You cannot repeat (2) this attr <handle> in templ above            1
You cannot repeat (2) this attr <title> in templ above             1
Attribute misplaced; valid in cluster <file>                     104
Attribute misplaced; valid in cluster <organization>              32
Attribute misplaced; valid in cluster <person>                     2
Attribute misplaced; valid in template <redif-archive 1.0>        14
Attribute misplaced; valid in template <redif-person 1.0>          4
INVALID (unknown) attribute                                      126
Invalid value of type <email> (attr: contact-email, eval)          1
Invalid value of type <fileformat> (attr: file-format, regex)    112
Invalid value of type <handle> (attr: handle, regex)              10
Invalid value of type <url> (attr: author-homepage, eval)         82
Invalid value of type <url> (attr: file-url, eval)                77
Bad line:                                                       1189

An important source of mistakes is the clustering of attributes. A cluster must be introduced with the key attribute for the cluster; if that is not the case then the template is in error. However, given that the cluster concept is more involved than the meaning of individual data elements, we are surprised that there are not more clustering mistakes.
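A simplified version of the cluster rule can be written down as follows. The Python sketch rests on several assumptions that go beyond the text: templates are lists of (field, value) pairs, cluster membership is inferred from the prefix of the attribute name, the organization cluster nested inside the person cluster is not treated separately, and any attribute outside a cluster is taken to close the cluster that was open. The function name misplaced_attributes is hypothetical.

```python
# Attribute-name prefix -> key attribute that must open the cluster.
# Prefix matching is a simplification made for this illustration.
CLUSTER_KEYS = {"file-": "file-url", "author-": "author-name"}


def misplaced_attributes(template):
    """Return cluster attributes that appear without their key attribute."""
    misplaced = []
    open_cluster = None  # prefix of the cluster currently open, if any
    for field, _ in template:
        prefix = next((p for p in CLUSTER_KEYS if field.startswith(p)), None)
        if prefix is None:
            open_cluster = None        # a plain attribute closes any cluster
        elif field == CLUSTER_KEYS[prefix]:
            open_cluster = prefix      # the key attribute opens a new cluster
        elif open_cluster != prefix:
            misplaced.append(field)    # cluster attribute without its key
    return misplaced


well_formed = [("author-name", "A"), ("author-homepage", "http://example.org/"),
               ("title", "T"), ("file-url", "http://example.org/p.pdf"),
               ("file-format", "application/pdf")]
ill_formed = [("title", "T"), ("file-format", "application/pdf"),
              ("file-url", "http://example.org/p.pdf")]
```

In the second template the format statement precedes the URL that should introduce the file cluster, which corresponds to the "Attribute misplaced; valid in cluster <file>" error.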

Each attribute may be checked for whether its syntax corresponds to a regular expression. Email addresses have to be of the standard Internet form user@site. URLs have to follow the syntax that is specified in RFC1738. The format indicator that is required in the file-format attribute is an in-house product. It is based on the MIME types as defined in RFC1521 but it has additional qualifiers that specify archiving and compression of files. It is therefore not astonishing that this specification produces quite a few mistakes.

Finally, an important source of mistakes is bad lines. This error typically occurs in lines that continue the value of a field that has started on a previous line. Any such line must be indented by at least one blank character. Note that the number quoted here is the actual number of bad lines. If an abstract that goes over several lines has no indentation, this problem affects all the lines. The number of templates that are affected by the problem is therefore much smaller than the number reported here. Our calculations show that 170 templates are invalid through bad lines. Note that the same remark also applies to the "missing template-type" error.
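The gap between the number of bad lines and the number of affected templates is easy to reproduce. The Python sketch below assumes a "name: value" line syntax with a template-type line opening each template; that syntax, the pattern FIELD_START and the function name count_bad_lines are assumptions for illustration and are not taken from the checking software.

```python
import re

# Assumed syntax for a line that starts a new field: "name: value".
FIELD_START = re.compile(r"^[A-Za-z][A-Za-z-]*:")


def count_bad_lines(text):
    """Return (bad_lines, affected_templates) for a block of template text."""
    bad_lines = 0
    affected = set()
    template_no = 0
    for line in text.splitlines():
        if not line.strip():
            continue                   # blank lines are ignored here
        if line.lower().startswith("template-type:"):
            template_no += 1           # a new template begins
        if line[0].isspace() or FIELD_START.match(line):
            continue                   # indented continuation or new field
        bad_lines += 1                 # unindented continuation line
        affected.add(template_no)
    return bad_lines, len(affected)


sample = (
    "Template-Type: ReDIF-Paper 1.0\n"
    "Title: A paper\n"
    "Abstract: First line\n"
    "second line without indentation\n"
    "third line without indentation\n"
    "Template-Type: ReDIF-Paper 1.0\n"
    "Title: Another paper\n"
    "Abstract: First line\n"
    " properly indented continuation\n"
)
result = count_bad_lines(sample)
```

Here two bad lines belong to a single template, which mirrors the relation between the 1189 bad lines and the 170 invalid templates reported above.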

5.3.2: ReDIF warnings

When a warning is issued, the template is not rejected but the value of the field is ignored. We list the warning messages in Table 4. Dates have to be of the form yyyy-mm-dd. The JEL classification values are all known to the checking software. If a key attribute is empty then the rest of the cluster data is ignored.
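A date check of this kind might look as follows in Python. The sketch takes the form yyyy-mm-dd stated above literally and additionally rejects impossible calendar dates; the names DATE and is_valid_date are hypothetical, and the actual checking software may proceed differently.

```python
import re
from datetime import date

# The form yyyy-mm-dd, taken literally from the text.
DATE = re.compile(r"^(\d{4})-(\d{2})-(\d{2})$")


def is_valid_date(value):
    """True if value has the form yyyy-mm-dd and names a real calendar day."""
    match = DATE.match(value.strip())
    if match is None:
        return False
    try:
        date(*map(int, match.groups()))   # rejects e.g. month 13
    except ValueError:
        return False
    return True
```

A value that fails the check would lead to a warning, so the date would be ignored while the rest of the template is kept.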

5.3.3: Guildford protocol mistakes

When preparing this paper we also found some violations of the Guildford protocol. Recall that the Guildford protocol sets rules on how ReDIF metadata should be stored on a RePEc archive. (When ReDIF is checked these errors are not detected because they are not violations of ReDIF.) Three files contain series templates without being called ???seri.rdf and placed in the directory of the individual series. There is also a file that contains a mixture of ReDIF-paper and ReDIF-software templates. This is not permitted because no file may contain templates of different types.

Since we have been working very thoroughly through the data we are confident that there are no further Guildford protocol mistakes in the data. To prove that we would need to write special software.
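One of these rules, that no file may contain templates of different types, could be checked along the following lines. This Python sketch is an illustration only: it assumes that each template opens with a line of the form "Template-Type: type version", and the function names template_types and mixes_template_types are hypothetical. The naming and placement of series files is not covered.

```python
def template_types(text):
    """Return the set of template types declared in one file's text."""
    types = set()
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "template-type":
            parts = value.split()
            if parts:
                # Keep the type and drop the version, e.g. "ReDIF-Paper 1.0".
                types.add(parts[0].lower())
    return types


def mixes_template_types(text):
    """True if the file declares more than one template type."""
    return len(template_types(text)) > 1


mixed = (
    "Template-Type: ReDIF-Paper 1.0\n"
    "Title: A\n"
    "Template-Type: ReDIF-Software 1.0\n"
    "Title: B\n"
)
uniform = "Template-Type: ReDIF-Paper 1.0\nTitle: A\n"
```

Running such a check over every file of every archive would replace the manual inspection described above.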

5.4: The composition of the dataset into archives

In Tables 5 and 6 (in the Appendix) we report the contents of the total document-related datasets for loose and controlled archives respectively. For each archive, we give its name first; two names have been shortened in order to save space. It is clear that the name of the archive is not really a well defined concept. For most loose archives the choice of the name reflects the name of the provider or sometimes simply the name of the type of papers.

The largest archives tend to be found among the controlled archives. The reasons are simple enough. Many of the controlled archives come from large providers. A member of the RePEc team has been working to make their contents available, because the bargaining power of large providers is such that they have little incentive to provide data but RePEc has strong incentives to include them. The majority of loose archives are run by small institutions, in general for their departmental paper series. There are some personal archives; we will look at them separately.

description of warning                                number
A bad date value format:                                 273
An invalid JEL value                                    3344
Empty value of a key attr. <author-name>                 301
Empty value of a key attr. <author-workplace-name>      1411
Empty value of a key attr. <file-url>                    176

5.4.1: Controlled versus loose archives

There are two basic reasons why mistakes occur in an archive. Either the maintainer is incompetent or (s)he is lazy. In the case of the controlled archives we exclude the incompetence cause. We assume that these people are competent but that they have not yet corrected the mistakes because they have been too lazy, too busy with other things, or for whatever other reason. The main issue for us is whether people who have neither a formal cataloging background in general nor personal training in particular are able to catalog the documents. If they can do that as well as people who have perfect knowledge of the cataloging procedure, then it is possible to build a catalog simply by pointing a willing individual to the documentation.

It is therefore important to consider the comparative number of mistakes in these two groups of archives. Because the controlled archives are so much larger, and because the large collections are generated from ASCII dumps of bibliographic databases, a comparison of the quality of the controlled versus the loose archives would be misleading if it were based solely on statistical measures such as the total number of errors divided by the total number of documents.

From an examination of the data in Table 5, there seems to be an accumulation of mistakes in certain archives whereas others are almost completely error-free. Even some of the controlled archives are affected by a series of problems. It therefore seems fair to write that in principle the decentralized cataloging works for a large number of archives. However a small number are affected by clusters of mistakes. There is some human intervention required to fix these mistakes. Note that because web cataloging is still in its infancy, we can be hopeful that sooner or later the archives that are affected by serial errors will take corrective action.
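The concentration of errors can be illustrated with six of the loose archives listed in the Appendix. The Python sketch below copies their document and error counts from the Appendix; the selection of archives and the function names are choices made for this example only.

```python
from statistics import median

# (archive, documents, errors), copied from the Appendix.
ARCHIVES = [
    ("RePEc:aah", 200, 162),
    ("RePEc:anu", 14, 0),
    ("RePEc:bca", 47, 0),
    ("RePEc:fem", 54, 371),
    ("RePEc:upf", 227, 0),
    ("RePEc:yor", 190, 0),
]


def pooled_rate(archives):
    """Total errors divided by total documents."""
    return sum(e for _, _, e in archives) / sum(d for _, d, _ in archives)


def per_archive_rates(archives):
    """Errors per document, computed separately for each archive."""
    return [e / d for _, d, e in archives if d > 0]


pooled = pooled_rate(ARCHIVES)
typical = median(per_archive_rates(ARCHIVES))
error_free = sum(1 for _, _, e in ARCHIVES if e == 0)
```

For this selection the pooled figure is roughly 0.73 errors per document, although four of the six archives have no errors at all and the median archive has an error rate of zero. A pooled ratio therefore says little about the typical archive.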

5.4.2: Personal archives

Four archives are in fact personal archives, in which only the papers of one person and his co-authors appear. Three of them take their name from the person who is the main author. RePEc is ill-prepared for the widespread use of personal RePEc archives. Since all archives have to be registered with a central archive, the central archive would come under considerable strain if many authors opened such archives. The three archives are operated by leading gurus, and the RePEc team therefore found it difficult to refuse them permission to open an archive. If authors were to make more material available on homepages, one would need to find intermediate collectors that would gather the ReDIF data not through the Guildford protocol but through a protocol that is more suitable for the storage of ReDIF data in homepages.

One important motivation to store papers in personal archives is that many journals allow authors to redistribute their work as part of a collection that is their own. Providers of personal archives provide reprints of their papers and hope that these reprints are protected by such clauses.

6: Conclusions

The free provision of educational material can be implemented through a central institution. Such an institution needs to be subsidized by central funds. The alternative is to provide the resources by a large number of agents. Then the cost of providing access can be absorbed within each institution. In that case the question of a comprehensive catalog arises. Such a catalog is needed to provide access to the collection in a unified way.

In this paper we have dealt with the provision of a key resource i.e. academic papers. We have presented a collection of metadata that is provided by decentralized archives. We have found that it is possible to build such a collection to a reasonable degree of accuracy if some archives where mistakes occur are aided by others. There needs to be a small group of people who actively support the collection. However this support can be given in decentralized fashion without the need for much coordination between supporters.

The academic library community in the United Kingdom as a whole has made an important contribution to RePEc by donating funds to the work of the WoPEc project. This has allowed the WoPEc project to collect metadata about papers that are published by institutions that are not yet contributing to RePEc. This was a vital aspect of the WoPEc project. The data collected by WoPEc constituted 90% of the RePEc data when RePEc was founded. However, nowadays that proportion is falling. The funding for WoPEc has run out but the WoPEc web site continues to expand because of the contributions made by RePEc archives. The software is maintained by volunteers.

Librarians should carefully consider the vision of the project. This is a kind of academic self-organization where academics publish and catalog their own work. RePEc benefits from network externalities. The more academics join, the more those who have not joined will feel pressure to join. If the data is freely available then authors can communicate with their peers without the need for intermediaries. The providers of intermediation services have every reason to be worried. They include publishers and librarians. If librarians do not play a more active part by supporting developments like RePEc there will be no more rôle for them in the future.

The work discussed here has received financial support from the Joint Information Systems Committee of the UK Higher Education Funding Councils through its Electronic Library Programme. We are grateful to Christopher F. Baum, Robert P. Parks, Thorsten Wichmann and Christian Zimmermann for comments on the questionnaires. William L. Goffe and Christian Zimmermann made many helpful suggestions. Sophie C. Rigny kindly pointed out many stylistic and grammatical errors in an earlier version.

Appendix: The composition of the dataset by archive

Table 5: Loose archives

School of Economics and Management, University of Aarhus, Denmark

RePEc:aah 2 series 200 papers 0 articles 162 errors 5 warnings

Ecological Economics Program RePEc Archive

RePEc:anu 1 series 14 papers 0 articles 0 errors 0 warnings

Bank of Canada

RePEc:bca 2 series 47 papers 0 articles 0 errors 0 warnings

Brazilian Electronic Journal of Economics

RePEc:bej 2 series 0 papers 3 articles 20 errors 0 warnings

RePEc archive at Berlecon Research

RePEc:ber 1 series 4 papers 0 articles 0 errors 0 warnings

Working Papers Archive of the Bank of England

RePEc:boe 1 series 94 papers 0 articles 25 errors 0 warnings

RePEc archive at Bonn University

RePEc:bon 2 series 682 papers 0 articles 86 errors 348 warnings

Brown University, Department of Economics

RePEc:bro 1 series 56 papers 0 articles 1 error 14 warnings

Banking & Finance Conference Papers

RePEc:caf 1 series 6 papers 0 articles 1 error 0 warnings

Department of Applied Economics Working Paper Archive

RePEc:cam 3 series 81 papers 0 articles 0 errors 0 warnings

Concordia University, Department of Economics

RePEc:ccd 1 series 2 papers 0 articles 0 errors 0 warnings

Centre for Economic Performance, London School of Economics

RePEc:cep 3 series 362 papers 0 articles 57 errors 613 warnings

CEPII

RePEc:cii 1 series 81 papers 0 articles 42 errors 0 warnings

Canadian Journal of Economics

RePEc:cje 1 series 0 papers 76 articles 0 errors 0 warnings

Levine's Working Paper Archive

RePEc:cla 1 series 43 papers 0 articles 1 error 0 warnings

CEPREMAP Archive

RePEc:cpm 1 series 770 papers 0 articles 1 error 7 warnings

Department of Economics Discussion Papers Archive at City University

RePEc:cty 1 series 34 papers 0 articles 0 errors 0 warnings

Working Papers of the Department of Economics at Dalhousie University

RePEc:dal 1 series 0 papers 0 articles 0 errors 0 warnings

DEGREE

RePEc:dgr 18 series 1203 papers 0 articles 663 errors 292 warnings

Archive for RePEc at UPV-EHU

RePEc:ehu 1 series 7 papers 0 articles 2 errors 1 warning

Fondazione Eni Enrico Mattei

RePEc:fem 1 series 54 papers 0 articles 371 errors 0 warnings

Financial markets Group, London School of Economics

RePEc:fmg 3 series 429 papers 0 articles 75 errors 449 warnings

Universitaet Frankfurt, Wirtschaftswissenschaften

RePEc:fra 1 series 31 papers 0 articles 5 errors 0 warnings

Economics Working Paper Archive at the University of Glasgow

RePEc:gla 1 series 54 papers 0 articles 30 errors 0 warnings

Heriot Watt University, Edinburgh

RePEc:hwe 2 series 62 papers 0 articles 1 error 0 warnings

IGIER-Innocenzo Gasparini Institute for Economic Research

RePEc:igi 1 series 58 papers 0 articles 0 errors 0 warnings

Economic Working Paper Archive at University of Jena

RePEc:jen 2 series 56 papers 0 articles 0 errors 1 warning

CoFE Discussion Paper

RePEc:knz 1 series 6 papers 0 articles 78 errors 1 warning

University of Leicester Discussion Papers in Economics

RePEc:lec 3 series 89 papers 0 articles 2 errors 0 warnings

Lester Ingber Papers Archive

RePEc:lei 1 series 50 papers 0 articles 0 errors 0 warnings

Universite Laval, Departement d'economique

RePEc:lvl 2 series 152 papers 0 articles 1 error 1 warning

Economics Working Paper Archive NUIM

RePEc:may 1 series 31 papers 0 articles 5 errors 0 warnings

Non-monetary economics

RePEc:mce 1 series 2 papers 0 articles 0 errors 0 warnings

Economics Working Paper Archive at McGill

RePEc:mcl 1 series 7 papers 0 articles 0 errors 0 warnings

McMaster University Department of Economics Working Paper Series

RePEc:mcm 5 series 274 papers 0 articles 4 errors 1 warning

Economics Working Paper Archive at Universidad Publica de Navarra (UPNA)

RePEc:nav 1 series 1 paper 0 articles 0 errors 0 warnings

Economics works by Nir Dagan and Oscar Volij

RePEc:nid 2 series 25 papers 0 articles 0 errors 0 warnings

Netnomics

RePEc:nnm 0 series 0 papers 0 articles 0 errors 0 warnings

Economics Discussion Papers Archive at Nottingham

RePEc:not 2 series 110 papers 0 articles 0 errors 0 warnings

National Institute of Economic and Social Research

RePEc:nsr 1 series 126 papers 0 articles 7 errors 0 warnings

SUNY-Oswego Economics Department Working Paper Archive

RePEc:nyo 1 series 5 papers 0 articles 0 errors 0 warnings

Ohio State University, Department of Economics

RePEc:osu 1 series 18 papers 0 articles 1 error 0 warnings

Working Papers and Published Papers by Peter Cramton and Co-Authors

RePEc:pcc 1 series 38 papers 0 articles 0 errors 0 warnings

Queen Elizabeth House Papers Archive at Oxford University

RePEc:qeh 1 series 21 papers 0 articles 5 errors 0 warnings

Centre for International Business Centre

RePEc:sbu 3 series 4 papers 0 articles 2 errors 11 warnings

CSEF-Centre for Studies in Economics and Finance Working Papers

RePEc:sef 1 series 20 papers 0 articles 1 error 9 warnings

University of Liège (Belgium), Faculty of Economics, Management and Social Sciences, Department of Economics, Service of International and Interregional Economics (SEII)

RePEc:sei 1 series 7 papers 0 articles 0 errors 0 warnings

Economics Discussion Paper Archive at the University of Siegen, Germany

RePEc:sie 1 series 19 papers 0 articles 0 errors 0 warnings

Economics Working Paper Archive at the University of Sussex

RePEc:sus 1 series 51 papers 0 articles 0 errors 0 warnings

University of Toronto, Department of Economics

RePEc:tor 1 series 91 papers 0 articles 2 errors 0 warnings

Economic Working Paper Archive at UPO

RePEc:uca 1 series 2 papers 0 articles 0 errors 0 warnings

Department of Economics, University of Iowa

RePEc:uia 1 series 135 papers 0 articles 0 errors 0 warnings

Department of Economics Discussion Papers Archive at the University of Kent

RePEc:ukc 1 series 58 papers 0 articles 0 errors 0 warnings

Electronic Working Papers, Department of Economics, University of Maryland

RePEc:umd 1 series 11 papers 0 articles 34 errors 3 warnings

Economics Working Paper Archive at Universitat Pompeu Fabra

RePEc:upf 3 series 227 papers 0 articles 0 errors 0 warnings

Department of Economics, University of Victoria

RePEc:vic 2 series 16 papers 0 articles 2 errors 0 warnings

Centre for the Study of Globalisation and Regionalisation (CSGR)

RePEc:wck 2 series 51 papers 0 articles 78 errors 0 warnings

SFB504 Working Paper Archive

RePEc:xrs 1 series 110 papers 0 articles 6 errors 3 warnings

Economics Working Paper Archive at York

RePEc:yor 1 series 190 papers 0 articles 0 errors 0 warnings

Table 6: Controlled archives

Australian Prudential Regulation Authority

RePEc:apr 1 series 4 papers 0 articles 0 errors 0 warnings

Birkbeck College Economics Working Paper Archive

RePEc:bbk 3 series 111 papers 0 articles 1 error 4 warnings

Boston College Department of Economics

RePEc:boc 3 series 373 papers 0 articles 0 errors 1 warning

Economics Working Paper Archive at Brunel University

RePEc:bru 3 series 119 papers 0 articles 0 errors 0 warnings

C. E. P. R.

RePEc:cpr 2 series 1406 papers 0 articles 0 errors 0 warnings

Research Center on Economic Fluctuations and Employment (CREFE)

RePEc:cre 5 series 142 papers 0 articles 0 errors 1 warning

Department of Economics and Finance Discussion Papers Archive at the University of Durham

RePEc:dur 1 series 54 papers 0 articles 0 errors 0 warnings

Econometrica

RePEc:ecm 1 series 0 papers 346 articles 0 errors 0 warnings

ELSE - ESRC Center for Economic Learning and Social Evolution

RePEc:els 1 series 57 papers 0 articles 0 errors 0 warnings

Fed in Print

RePEc:fip 107 series 4682 papers 8819 articles 0 errors 0 warnings

BibEc Project

RePEc:fth 400 series 28759 papers 0 articles 0 errors 0 warnings

S-WoBA, Swedish Working Papers in Business Administration

RePEc:hhb 2 series 12 papers 0 articles 2 errors 0 warnings

S-WoPEc, Swedish Working Papers in Economics

RePEc:hhs 9 series 536 papers 0 articles 2 errors 0 warnings

National Bureau of Economic Research, Inc.

RePEc:nbr 4 series 7684 papers 0 articles 1 error 2966 warnings

Russian Working Paper Archive for Economists and Sociologists

RePEc:nos 1 series 8 papers 0 articles 0 errors 0 warnings

Reserve Bank of Australia

RePEc:rba 1 series 249 papers 0 articles 334 errors 758 warnings

Review of Economic Dynamics

RePEc:red 2 series 0 papers 62 articles 0 errors 4 warnings

University of Surrey Economics Department

RePEc:sur 1 series 10 papers 0 articles 0 errors 0 warnings

Trinity College Dublin

RePEc:tcd 2 series 8 papers 0 articles 0 errors 1 warning

Economics Working Paper Series at Texas

RePEc:tex 1 series 20 papers 0 articles 6 errors 1 warning

Universidad de Valencia, Facultad de Ciencias Economicas

RePEc:val 7 series 151 papers 0 articles 0 errors 0 warnings

WoPEc Project

RePEc:wop 205 series 4763 papers 804 articles 58 errors 12 warnings

WUSTL archive

RePEc:wpa 25 series 1142 papers 0 articles 1 error 0 warnings

WoPEc UK Archive

RePEc:wuk 46 series 1585 papers 0 articles 0 errors 0 warnings






file                          person                           organization
name         all      max     name            all      max     name       all     max
url          19112    1       name            112176   1       name       8598    1
format       19024    1       postal          8        1       postal     2118    2
size         2630     1       homepage        1557     2       homepage   596     2
function     1661     1       email           3166     2       email      1451    3
restriction  2548     1       phone           282      1       phone      164     1
                              fax             259      1       fax        197     1
                              workplace-name  8598     4


description of error                                            number
Start template with 'template-type' attribute                   122
Required attribute <handle> is absent                           6
Required attribute <title> is absent                            23
Required attribute <author-name> is absent                      8
Required attribute <format> is absent in <file>                 264
You cannot repeat (2) this attr <handle> in templ above         1
You cannot repeat (2) this attr <title> in templ above          1
Attribute misplaced; valid in cluster <file>                    104
Attribute misplaced; valid in cluster <organization>            32
Attribute misplaced; valid in cluster <person>                  2
Attribute misplaced; valid in template <redif-archive 1.0>      14
Attribute misplaced; valid in template <redif-person 1.0>       4
INVALID (unknown) attribute                                     126
Invalid value of type <email> (attr: contact-email, eval)       1
Invalid value of type <fileformat> (attr: file-format, regex)   112
Invalid value of type <handle> (attr: handle, regex)            10
Invalid value of type <url> (attr: author-homepage, eval)       82
Invalid value of type <url> (attr: file-url, eval)              77
Bad line:                                                       1189


description of error                                  number
A bad date value format:                              273
An invalid JEL value                                  3344
Empty value of a key attr. <author-name>              301
Empty value of a key attr. <author-workplace-name>    1411
Empty value of a key attr. <file-url>                 176


School of Economics and Management, University of Aarhus, Denmark

RePEc:aah	2	series	200	papers	0	articles	162	errors	5	warnings

Ecological Economics Program RePEc Archive

RePEc:anu	1	series	14	papers	0	articles	0	errors	0	warnings

Bank of Canada

RePEc:bca	2	series	47	papers	0	articles	0	errors	0	warnings

Brazilian Electronic Journal of Economics

RePEc:bej	2	series	0	papers	3	articles	20	errors	0	warnings

RePEc archive at Berlecon Research

RePEc:ber	1	series	4	papers	0	articles	0	errors	0	warnings

Working Papers Archive of the Bank of England

RePEc:boe	1	series	94	papers	0	articles	25	errors	0	warnings

RePEc archive at Bonn University

RePEc:bon	2	series	682	papers	0	articles	86	errors	348	warnings

Brown University, Department of Economics

RePEc:bro	1	series	56	papers	0	articles	1	error	14	warnings

Banking & Finance Conference Papers

RePEc:caf	1	series	6	papers	0	articles	1	error	0	warnings

Department of Applied Economics Working Paper Archive

RePEc:cam	3	series	81	papers	0	articles	0	errors	0	warnings

Concordia University, Department of Economics

RePEc:ccd	1	series	2	papers	0	articles	0	errors	0	warnings

Centre for Economic Performance, London School of Economics

RePEc:cep	3	series	362	papers	0	articles	57	errors	613	warnings

CEPII

RePEc:cii	1	series	81	papers	0	articles	42	errors	0	warnings

Canadian Journal of Economics

RePEc:cje	1	series	0	papers	76	articles	0	errors	0	warnings

Levine's Working Paper Archive

RePEc:cla	1	series	43	papers	0	articles	1	error	0	warnings

CEPREMAP Archive

RePEc:cpm	1	series	770	papers	0	articles	1	error	7	warnings

Department of Economics Discussion Papers Archive at City University

RePEc:cty	1	series	34	papers	0	articles	0	errors	0	warnings

Working Papers of the Department of Economics at Dalhousie University

RePEc:dal	1	series	0	papers	0	articles	0	errors	0	warnings

DEGREE

RePEc:dgr	18	series	1203	papers	0	articles	663	errors	292	warnings

Archive for RePEc at UPV-EHU

RePEc:ehu	1	series	7	papers	0	articles	2	errors	1	warning

Fondazione Eni Enrico Mattei

RePEc:fem	1	series	54	papers	0	articles	371	errors	0	warnings

Financial Markets Group, London School of Economics

RePEc:fmg	3	series	429	papers	0	articles	75	errors	449	warnings

Universitaet Frankfurt, Wirtschaftswissenschaften

RePEc:fra	1	series	31	papers	0	articles	5	errors	0	warnings

Economics Working Paper Archive at the University of Glasgow

RePEc:gla	1	series	54	papers	0	articles	30	errors	0	warnings

Heriot Watt University, Edinburgh

RePEc:hwe	2	series	62	papers	0	articles	1	error	0	warnings

IGIER-Innocenzo Gasparini Institute for Economic Research

RePEc:igi	1	series	58	papers	0	articles	0	errors	0	warnings

Economic Working Paper Archive at University of Jena

RePEc:jen	2	series	56	papers	0	articles	0	errors	1	warning

CoFE Discussion Paper

RePEc:knz	1	series	6	papers	0	articles	78	errors	1	warning

University of Leicester Discussion Papers in Economics

RePEc:lec	3	series	89	papers	0	articles	2	errors	0	warnings

Lester Ingber Papers Archive

RePEc:lei	1	series	50	papers	0	articles	0	errors	0	warnings

Universite Laval, Departement d'economique

RePEc:lvl	2	series	152	papers	0	articles	1	error	1	warning

Economics Working Paper Archive NUIM

RePEc:may	1	series	31	papers	0	articles	5	errors	0	warnings

Non-monetary economics

RePEc:mce	1	series	2	papers	0	articles	0	errors	0	warnings

Economics Working Paper Archive at McGill

RePEc:mcl	1	series	7	papers	0	articles	0	errors	0	warnings

McMaster University Department of Economics Working Paper Series

RePEc:mcm	5	series	274	papers	0	articles	4	errors	1	warning

Economics Working Paper Archive at Universidad Publica de Navarra (UPNA)

RePEc:nav	1	series	1	paper	0	articles	0	errors	0	warnings

Economics works by Nir Dagan and Oscar Volij

RePEc:nid	2	series	25	papers	0	articles	0	errors	0	warnings

Netnomics

RePEc:nnm	0	series	0	papers	0	articles	0	errors	0	warnings

Economics Discussion Papers Archive at Nottingham

RePEc:not	2	series	110	papers	0	articles	0	errors	0	warnings

National Institute of Economic and Social Research

RePEc:nsr	1	series	126	papers	0	articles	7	errors	0	warnings

SUNY-Oswego Economics Department Working Paper Archive

RePEc:nyo	1	series	5	papers	0	articles	0	errors	0	warnings

Ohio State University, Department of Economics

RePEc:osu	1	series	18	papers	0	articles	1	error	0	warnings

Working Papers and Published Papers by Peter Cramton and Co-Authors

RePEc:pcc	1	series	38	papers	0	articles	0	errors	0	warnings

Queen Elizabeth House Papers Archive at Oxford University

RePEc:qeh	1	series	21	papers	0	articles	5	errors	0	warnings

Centre for International Business Centre

RePEc:sbu	3	series	4	papers	0	articles	2	errors	11	warnings

CSEF-Centre for Studies in Economics and Finance Working Papers

RePEc:sef	1	series	20	papers	0	articles	1	error	9	warnings

University of Liège (Belgium), Faculty of Economics, Management and Social Sciences, Department of Economics, Service of International and Interregional Economics (SEII)

RePEc:sei	1	series	7	papers	0	articles	0	errors	0	warnings

Economics Discussion Paper Archive at the University of Siegen, Germany

RePEc:sie	1	series	19	papers	0	articles	0	errors	0	warnings

Economics Working Paper Archive at the University of Sussex

RePEc:sus	1	series	51	papers	0	articles	0	errors	0	warnings

University of Toronto, Department of Economics

RePEc:tor	1	series	91	papers	0	articles	2	errors	0	warnings

Economic Working Paper Archive at UPO

RePEc:uca	1	series	2	papers	0	articles	0	errors	0	warnings

Department of Economics, University of Iowa

RePEc:uia	1	series	135	papers	0	articles	0	errors	0	warnings

Department of Economics Discussion Papers Archive at the University of Kent

RePEc:ukc	1	series	58	papers	0	articles	0	errors	0	warnings

Electronic Working Papers, Department of Economics, University of Maryland

RePEc:umd	1	series	11	papers	0	articles	34	errors	3	warnings

Economics Working Paper Archive at Universitat Pompeu Fabra

RePEc:upf	3	series	227	papers	0	articles	0	errors	0	warnings

Department of Economics, University of Victoria

RePEc:vic	2	series	16	papers	0	articles	2	errors	0	warnings

Centre for the Study of Globalisation and Regionalisation (CSGR)

RePEc:wck	2	series	51	papers	0	articles	78	errors	0	warnings

SFB504 Working Paper Archive

RePEc:xrs	1	series	110	papers	0	articles	6	errors	3	warnings

Economics Working Paper Archive at York

RePEc:yor	1	series	190	papers	0	articles	0	errors	0	warnings


Australian Prudential Regulation Authority

RePEc:apr	1	series	4	papers	0	articles	0	errors	0	warnings

Birkbeck College Economics Working Paper Archive

RePEc:bbk	3	series	111	papers	0	articles	1	error	4	warnings

Boston College Department of Economics

RePEc:boc	3	series	373	papers	0	articles	0	errors	1	warning

Economics Working Paper Archive at Brunel University

RePEc:bru	3	series	119	papers	0	articles	0	errors	0	warnings

C. E. P. R.

RePEc:cpr	2	series	1406	papers	0	articles	0	errors	0	warnings

Research Center on Economic Fluctuations and Employment (CREFE)

RePEc:cre	5	series	142	papers	0	articles	0	errors	1	warning

Department of Economics and Finance Discussion Papers Archive at the University of Durham

RePEc:dur	1	series	54	papers	0	articles	0	errors	0	warnings

Econometrica

RePEc:ecm	1	series	0	papers	346	articles	0	errors	0	warnings

ELSE - ESRC Center for Economic Learning and Social Evolution

RePEc:els	1	series	57	papers	0	articles	0	errors	0	warnings

Fed in Print

RePEc:fip	107	series	4682	papers	8819	articles	0	errors	0	warnings

BibEc Project

RePEc:fth	400	series	28759	papers	0	articles	0	errors	0	warnings

S-WoBA, Swedish Working Papers in Business Administration

RePEc:hhb	2	series	12	papers	0	articles	2	errors	0	warnings

S-WoPEc, Swedish Working Papers in Economics

RePEc:hhs	9	series	536	papers	0	articles	2	errors	0	warnings

National Bureau of Economic Research, Inc.

RePEc:nbr	4	series	7684	papers	0	articles	1	error	2966	warnings

Russian Working Paper Archive for Economists and Sociologists

RePEc:nos	1	series	8	papers	0	articles	0	errors	0	warnings

Reserve Bank of Australia

RePEc:rba	1	series	249	papers	0	articles	334	errors	758	warnings

Review of Economic Dynamics

RePEc:red	2	series	0	papers	62	articles	0	errors	4	warnings

University of Surrey Economics Department

RePEc:sur	1	series	10	papers	0	articles	0	errors	0	warnings

Trinity College Dublin

RePEc:tcd	2	series	8	papers	0	articles	0	errors	1	warning

Economics Working Paper Series at Texas

RePEc:tex	1	series	20	papers	0	articles	6	errors	1	warning

Universidad de Valencia, Facultad de Ciencias Economicas

RePEc:val	7	series	151	papers	0	articles	0	errors	0	warnings

WoPEc Project

RePEc:wop	205	series	4763	papers	804	articles	58	errors	12	warnings

WUSTL archive

RePEc:wpa	25	series	1142	papers	0	articles	1	error	0	warnings

WoPEc UK Archive

RePEc:wuk	46	series	1585	papers	0	articles	0	errors	0	warnings