Martin Paul Eve

Professor of Literature, Technology and Publishing at Birkbeck, University of London


One of the hardest parts of typesetting articles for scholarly publication in the JATS standard, especially when using homemade tools, is the bibliography. JATS (and its NLM predecessors) expects references to be broken down into their constituent components where possible in order to be semantically rich. For example:

<ref id="royall">
        <element-citation publication-type="bookchapter">
          <person-group person-group-type="author">
            <name>
              <surname>Tyler</surname>
              <given-names>Royall</given-names>
            </name>
          </person-group>
          <article-title>The Contrast</article-title>
          <source>The Norton Anthology of American Literature, Vol. A: Beginnings to 1820</source>
          <person-group person-group-type="editor">
            <name>
              <surname>Franklin</surname>
              <given-names>Wayne</given-names>
            </name>
            <name>
              <surname>Gura</surname>
              <given-names>Philip F.</given-names>
            </name>
            <name>
              <surname>Krupat</surname>
              <given-names>Arnold</given-names>
            </name>
            <name>
              <surname>Baym</surname>
              <given-names>Nina</given-names>
            </name>
          </person-group>
          <publisher-name>W. W. Norton &amp; Company</publisher-name>
          <publisher-loc>New York</publisher-loc>
          <fpage>765</fpage>
          <lpage>805</lpage>
        </element-citation>
      </ref>

This is all very well, but it also creates a problem: how do we get from the author's plaintext citation to this structured format? Parsing references is hard. Very hard. My closest effort in the past has been to write a cascading regular expression engine, meCite, to which anybody is welcome to contribute. I do intend to do more on this at some point.
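To give a flavour of what "cascading" means here, the following is a minimal Python sketch. It is not meCite's actual code: the pattern list, the field names and the `parse_citation` function are all hypothetical, and the two patterns only cover one simple MLA-like layout. The idea is just that patterns are tried from most to least specific and the first one that matches wins.

```python
import re

# Hypothetical patterns, ordered from most to least specific.
# Illustrative only: these are assumptions, not taken from meCite.
PATTERNS = [
    # Surname, Given. "Chapter." Book. City: Publisher, Year. 10-20.
    ("bookchapter", re.compile(
        r'^(?P<surname>[^,]+),\s*(?P<given>[^.]+)\.\s*'
        r'"(?P<title>[^"]+?)\.?"\s*'
        r'(?P<source>[^.]+)\.\s*'
        r'(?P<loc>[^:]+):\s*(?P<publisher>[^,]+),\s*(?P<year>\d{4})\.\s*'
        r'(?P<fpage>\d+)-(?P<lpage>\d+)\.?$'
    )),
    # Surname, Given. Book. City: Publisher, Year.
    ("book", re.compile(
        r'^(?P<surname>[^,]+),\s*(?P<given>[^.]+)\.\s*'
        r'(?P<source>[^.]+)\.\s*'
        r'(?P<loc>[^:]+):\s*(?P<publisher>[^,]+),\s*(?P<year>\d{4})\.?$'
    )),
]


def parse_citation(text):
    """Return a dict of fields from the first matching pattern, or None."""
    text = text.strip()
    for pub_type, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return {"publication-type": pub_type, **match.groupdict()}
    return None  # nothing matched: leave the reference for a human
```

Even this toy version shows why the problem is hard: a given name containing an initial ("Philip F.") or a title containing a full stop already defeats both patterns, and every such case needs another rule in the cascade.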

Late last year, Martin Fenner was investigating whether CSL could be used to generate a JATS bibliography; his efforts were in the pandoc-JATS repository. These efforts stopped, however, following a discussion on the xbiblio mailing list, where it was decided that CSL was not ideal for generating structured XML.

This may be true. However, there is a lack of viable alternatives for typesetting references. Furthermore, Zotero and Mendeley (both of which use CSL to generate their citations) have vast, publicly available databases of the scholarly literature. If we could use CSL to generate valid JATS XML, this would substantially reduce the time needed to typeset a JATS bibliography. To that end, I have taken on maintenance of a fork of Martin's original efforts. Last night, with the first commits, I fixed DOI display, added book chapter support, added support for editors and changed the book title field to the correct "source" implementation. My fork can be found at the JATS-CSL repo.
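The mapping involved is easier to see outside CSL itself. The following Python sketch is not how the CSL style works (a CSL style emits the tags as part of its formatted output); it only illustrates, under my own assumptions, which bibliographic fields end up in which JATS elements. The input uses CSL-JSON variable names (`author`, `editor`, `title`, `container-title`, `publisher`, `publisher-place`, `page`, `DOI`); the function names and the example record are hypothetical.

```python
import xml.etree.ElementTree as ET


def add_names(parent, group_type, people):
    """Append a <person-group> holding one <name> per person."""
    group = ET.SubElement(parent, "person-group",
                          {"person-group-type": group_type})
    for person in people:
        name = ET.SubElement(group, "name")
        ET.SubElement(name, "surname").text = person["family"]
        ET.SubElement(name, "given-names").text = person["given"]


def csl_to_jats_ref(item):
    """Build a JATS <ref> element from one CSL-JSON style record."""
    ref = ET.Element("ref", {"id": item["id"]})
    # Assumption: CSL's "chapter" type maps to the "bookchapter" label.
    csl_type = item.get("type", "other")
    pub_type = "bookchapter" if csl_type == "chapter" else csl_type
    cite = ET.SubElement(ref, "element-citation",
                         {"publication-type": pub_type})
    if item.get("author"):
        add_names(cite, "author", item["author"])
    if item.get("container-title"):
        # A part of a larger work: the container is the <source>.
        ET.SubElement(cite, "article-title").text = item.get("title", "")
        ET.SubElement(cite, "source").text = item["container-title"]
    elif item.get("title"):
        # A standalone work such as a book: its own title is the <source>.
        ET.SubElement(cite, "source").text = item["title"]
    if item.get("editor"):
        add_names(cite, "editor", item["editor"])
    if item.get("publisher"):
        ET.SubElement(cite, "publisher-name").text = item["publisher"]
    if item.get("publisher-place"):
        ET.SubElement(cite, "publisher-loc").text = item["publisher-place"]
    if item.get("page"):
        # Split "10-20" (or an en-dash range) into first and last page.
        first, _, last = item["page"].replace("\u2013", "-").partition("-")
        ET.SubElement(cite, "fpage").text = first
        if last:
            ET.SubElement(cite, "lpage").text = last
    if item.get("DOI"):
        ET.SubElement(cite, "pub-id", {"pub-id-type": "doi"}).text = item["DOI"]
    return ref


# Made-up record for illustration only.
example = {
    "id": "doe", "type": "chapter",
    "author": [{"family": "Doe", "given": "Jane"}],
    "title": "An Example Chapter",
    "container-title": "An Example Book",
    "publisher": "Example & Co", "publisher-place": "London",
    "page": "10-20",
}
print(ET.tostring(csl_to_jats_ref(example), encoding="unicode"))
```

Because the elements are built with an XML library rather than by string concatenation, characters such as the ampersand in a publisher name are escaped automatically, which is one of the things a text-oriented formatter like CSL has to be coaxed into doing.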

While the approach may not be recommended, it is far better than nothing and I'll push it as far as I can in an effort to save some time!