|
| Working towards an Open Library for Economics: The RePEc project
|
| 1: The main user services
| I list them by order of historical appearance.
|
| BibEc & WoPEc provide static HTML pages for all working papers that are
| only available in print (BibEc) and all papers that are available
| electronically (WoPEc). Both datasets use the same search engines. There
| are three search engines: a full-text WAIS engine, a fielded search engine
| based on the mySQL relational database, and a ROADS fielded search engine.
| The mySQL database is also used for the control of the relational
| components in the RePEc dataset. BibEc and WoPEc are based at Manchester
| Computing, with mirrors in Japan and the United States.
|
| EDIRC provides Web pages that represent the complete institutional
| information in RePEc.
|
| IDEAS provides an Excite index
| of static html pages that represent all Paper, Article and Software
| templates. This is by far the most popular RePEc user interface.
|
| NEP: New Economics Papers is a set of reports on new additions of papers to
| RePEc. Each report is edited by a subject specialist who receives
| information on all new additions and then filters out the papers that are
| relevant to the subject of the report. These subject specialists are PhD
| students and junior researchers. They work as volunteers. On 14 March 2000,
| there were 2753 different email addresses subscribed to at least one list.
|
| Tilburg University working papers & research memoranda: this site also
| operates a Z39.50 server for all downloadable papers in RePEc, available
| at dbiref.kub.nl:9997. The name of the database is "repref". The attribute
| set is Bib-1, and the record syntaxes supported are USMARC, SUTRS and GRS-1
| (only string tags, tag type 3).
|
| RuPEc is a server in Russian. It
| offers search facilities to Russian users. Its maintainers also provide
| archival facilities for Russian contributors.
|
| INOMICS not only provides
| an index of RePEc data but also allows simultaneous searches in indexes of
| other Web pages related to Economics.
|
| HoPEc provides a personal registration service for authors and allows users
| to search for personal data.
|
| The "Tilburg University working papers & research memoranda"
| service is operated by a library-based group that has received funding
| from the European Union. INOMICS is operated by the Economics
| consultancy Berlecon Research. All the
| other user services are operated by junior academics.
|
| 2: The ReDIF metadata format
| The ReDIF metadata format is inspired by the draft of Deutsch et alii,
| commonly known as the IAFA templates. In particular it borrows the idea of
| clusters from the draft:
|
|     There are certain classes of data elements, such as contact
|     information, which occur every time an individual, group or
|     organization needs to be described. Such data as names, telephone
|     numbers, postal and email addresses etc. fall into this category. To
|     avoid repeating these common elements explicitly in every template
|     below, we define "clusters" which can then be referred to in a
|     shorthand manner in the actual template definitions.
|
| ReDIF takes a slightly different approach to clusters. A cluster is
| a group of fields that jointly describe a repeatable attribute of the
| resource. This is best understood by an example. A paper may have
| several authors. For each author we may have several fields that we
| are interested in: the name, email address, homepage etc. If we have
| several authors then we have several such groups of attributes. In
| addition, each author may be affiliated with several institutions. Here
| each institution may be described by several attributes for its name,
| homepage etc. Thus a nested data structure is required. It is evident
| that this requirement is best served by a syntax that explicitly
| allows for it, such as XML. However, in 1997, when ReDIF was
| designed, XML was not available. We are still convinced that
| the template syntax is more human-readable and easier to understand.
| The computer, however, cannot find which attributes correspond to the
| same cluster unless some ordering is introduced. Therefore we proceed
| as follows. For each group of attributes that make up a cluster we
| specify one attribute as the "key" attribute. Whenever the key
| attribute appears, a new cluster is supposed to begin. For example, if
| the cluster describes a person then the name is the key. If an
| "author-email" appears without an "author-name" preceding it, the
| parsing software aborts the processing of the template.
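| The key-attribute rule can be illustrated with a minimal Python sketch. It
| is not the ReDIF parsing software itself: the function name, the exception
| class and the assumption that every attribute starting with "author-"
| belongs to the author cluster are hypothetical simplifications. The input
| is a list of (attribute, value) pairs in the order in which they appear in
| a template.
|

```python
class TemplateError(ValueError):
    """Raised when a cluster attribute appears before its key attribute."""


KEY_ATTRIBUTE = "author-name"   # the key attribute of the author cluster
CLUSTER_PREFIX = "author-"      # assumption: marks attributes of the cluster


def group_author_clusters(fields):
    """Group ordered (attribute, value) pairs into one dict per author."""
    clusters = []
    for attribute, value in fields:
        attribute = attribute.lower()
        if attribute == KEY_ATTRIBUTE:
            # The key attribute always opens a new cluster.
            clusters.append({attribute: value})
        elif attribute.startswith(CLUSTER_PREFIX):
            if not clusters:
                # No key attribute seen yet: abort processing the template.
                raise TemplateError(
                    f"{attribute} without preceding {KEY_ATTRIBUTE}"
                )
            # Attach the attribute to the most recently opened cluster.
            clusters[-1][attribute] = value
        # Attributes outside the cluster (e.g. a title) are ignored here.
    return clusters


# Hypothetical template content, used only for illustration.
template = [
    ("Title", "An example paper"),
    ("Author-Name", "First Author"),
    ("Author-Email", "first@example.org"),
    ("Author-Name", "Second Author"),
]
authors = group_author_clusters(template)
print(authors)
```

| In this sketch a repeated attribute within one cluster overwrites the
| earlier value. Nested clusters, such as the institutions of an author,
| would need the same rule applied again with a second key attribute.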
| Note that the designation of key attributes is not a feature of
| ReDIF. It is a feature of the template syntax of ReDIF. It is only the
| syntax that makes nesting more involved. I do not think that this is
| an important shortcoming. Instead I believe that the nested structure
| involving the persons and organizations should not be included in the
| document templates. What should be done instead is to separate the
| personal information out of the document templates into separate
| person templates. This approach is discussed extensively in the main
| body of the paper.
| The examples of Subsection 3.1 emphasize that ReDIF is not
| just a format to encode preprints or other online publications. Of
| course it is suitable for that, but its range of objects of description
| is far broader. The ReDIF vocabulary as currently defined aims at the
| output aspects of an academic discipline. These include the resources
| (digital or digitisable objects) that it creates, the creators of
| these resources as well as the institutions that support the creation
| process. ReDIF allows one to describe things in the world that are
| important to the work of an academic discipline. We will use the word
| "item" to refer to any such thing. ReDIF describes three classes of
| items: resources, bodies and collections. A "resource" is something
| that is digital or can be digitized. For example, a book is a
| resource. Resource templates include ReDIF-paper and ReDIF-article.
| A "body" is something that is neither digital nor can be
| digitized. Body templates include ReDIF-person and
| ReDIF-institution. A "collection" is something that has a manifold
| nature that makes it difficult to say if it is digitisable or not.
| ReDIF-archive and ReDIF-series belong to the collection category.
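| The grouping of template types into item classes can be written down as a
| small lookup table. The following Python sketch is only an illustration:
| the enum, the dictionary and the function name are hypothetical, and only
| the template types named in the text are listed.
|

```python
from enum import Enum


class ItemClass(Enum):
    RESOURCE = "resource"      # digital or digitisable
    BODY = "body"              # neither digital nor digitisable
    COLLECTION = "collection"  # manifold nature


# Only the template types mentioned in the text; the list is not complete.
TEMPLATE_CLASSES = {
    "redif-paper": ItemClass.RESOURCE,
    "redif-article": ItemClass.RESOURCE,
    "redif-person": ItemClass.BODY,
    "redif-institution": ItemClass.BODY,
    "redif-archive": ItemClass.COLLECTION,
    "redif-series": ItemClass.COLLECTION,
}


def item_class(template_type):
    """Return the item class of a template type, ignoring letter case."""
    return TEMPLATE_CLASSES[template_type.strip().lower()]


print(item_class("ReDIF-Paper"))
```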
| As far as version 1 of ReDIF is concerned, the grouping of record types is
| not important. The record type structure is in fact a flat finite list of
| template types. This is the main limitation of the current version of
| ReDIF. It is hoped that version 2 of ReDIF will be object-oriented.
| Record types will be classes. For example, a preprint record type
| will be defined as a subclass of a document record type. Thus communities
| using ReDIF will be able to refine existing types as they wish. ReDIF
| records will be syntax independent. Version 2 of ReDIF is an important area
| of further research.
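| The idea of record types as classes can be sketched in Python. This is
| purely illustrative and does not describe an actual ReDIF version 2
| design: the class names, the field names and the handle value are
| hypothetical.
|

```python
from dataclasses import dataclass, field


@dataclass
class DocumentRecord:
    """A general document record type (hypothetical fields)."""
    handle: str
    title: str
    authors: list = field(default_factory=list)


@dataclass
class PreprintRecord(DocumentRecord):
    """A refinement of the document type that adds a series name."""
    series: str = ""


preprint = PreprintRecord(
    handle="example:0001",
    title="An example preprint",
    series="Example working paper series",
)
# Software written for documents also accepts the refined type.
print(isinstance(preprint, DocumentRecord))
```

| A community that needs a further refinement would define its own subclass
| of an existing type rather than add a new entry to a flat list.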
| ReDIF is a metadata format that comes with tools to make it easy to use in
| a framework where the metadata is harvested. A file that is simply
| harvested from a computer system could contain any type of digital
| content. Therefore the harvested data must be parsed by special software
| that filters the data. This task is accomplished by the
| rr.pm module written
| by Ivan V. Kurmanov. It parses ReDIF data and validates its syntax. For
| example, any date within ReDIF has to be of the ISO 8601 form
| yyyy-mm-dd. A date like "14 Juillet 1789"
| would not be recognized by the ReDIF reading software and would not be
| passed on to the application software that a service provider would use.
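| The kind of check described here can be sketched in a few lines of Python.
| This is not the rr.pm code, which is written in Perl; the function name is
| hypothetical and the sketch assumes that only the full yyyy-mm-dd form is
| accepted.
|

```python
import re
from datetime import date

# Exactly four, two and two digits separated by hyphens.
_ISO_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def is_valid_iso_date(text):
    """Return True if text is a real calendar date written as yyyy-mm-dd."""
    match = _ISO_DATE.fullmatch(text.strip())
    if match is None:
        return False
    year, month, day = (int(part) for part in match.groups())
    try:
        # Rejects impossible dates such as 2000-02-30.
        date(year, month, day)
    except ValueError:
        return False
    return True


print(is_valid_iso_date("1789-07-14"))
print(is_valid_iso_date("14 Juillet 1789"))
```

| A service provider would only ever see values that pass such a check, so
| application software does not need to handle free-form dates.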
| The rr.pm software uses a formal syntax specification,
| redif.spec. This
| formal specification is itself encoded in a purpose-built format code-named
| spefor. Therefore it
| is possible for ReDIF-using communities to change the syntax restrictions
| or even design a whole new metadata vocabulary from
| scratch.
|
|
|