As I have been thrashing around with RDF in the context of TDWG over the past couple of years, I have wondered if there was anyone at
Vanderbilt besides me who was working on anything remotely related to RDF,
Linked Data, or the Semantic Web. I
never searched systematically, but when I brought the issue up with science and
computer people, I usually encountered blank stares and the question
"What?"
Recently, I started following Clifford Anderson
(@andersoncliffb) and David Michelson (@davidamichelson) on Twitter and got
interested in the "Topics In Digital Humanities" course
they were teaching this semester. I
decided to get out of my biodiversity informatics silo and attend part of their
final student presentations
this past Monday. I couldn't stay for
the whole thing, but I was fascinated by the part I saw.
The three talks I saw were related to digitizing metadata
and images related to early Christian artifacts - particularly in the context
of the syriaca.org project. Although it
seems like there would be little relationship between those projects and
biodiversity informatics, one thing that really struck me as I watched the
presentations was how similar the problems they faced were to those involved in
digitizing natural history museum specimens and recording species occurrence
metadata. They struggled to find terms
in controlled vocabularies to describe their artifacts. They dealt with demarcating segments
of an image that documented several features of interest. They were working out how to collaborate
on common data sources.
At the same time, I was struck by how different the tools they used were
from those that are used or talked about in TDWG.
First and foremost, all of their work involved using
XML. I've heard almost nothing positive
about XML in the context of TDWG: it's too verbose and takes too long to transmit,
it's confusing and not readable, etc. So
I was surprised to see that it was central to what they were doing. There seems to be a simple reason for this:
it enables them to mark up text using very simple tags (which looked to me at a
glance like XHTML) and then use existing technology (primarily XQuery, I think)
to search the marked-up text. In other
words, they are immediately accomplishing useful things using off-the-shelf
technology. This is in marked contrast
to the biodiversity informatics community, where years have been spent arguing
about whether GUIDs and RDF are going to solve our problems or are a
useless waste of time, with nothing functional to show for all of
the arguing and effort.
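
To make the "simple tags plus off-the-shelf search" idea concrete, here is a minimal sketch of what I understood them to be doing. It is not their code: I use Python's standard-library ElementTree in place of XQuery, and the tag names (persName, placeName) and the sample sentences are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample text. The tag names are assumptions for illustration,
# not the actual markup used in the student projects.
SAMPLE = """
<text>
  <p><persName>Person A</persName> wrote a letter from <placeName>Town X</placeName>.</p>
  <p><persName>Person B</persName> travelled from <placeName>Town Y</placeName> to <placeName>Town X</placeName>.</p>
</text>
"""


def find_tagged(xml_text, tag):
    """Return the text of every element with the given tag, in document order."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter(tag)]


def paragraphs_mentioning(xml_text, tag, value):
    """Return the plain text of each <p> containing a <tag> element equal to value."""
    root = ET.fromstring(xml_text)
    hits = []
    for p in root.iter("p"):
        if any(el.text == value for el in p.iter(tag)):
            # itertext() flattens the paragraph, dropping the markup
            hits.append("".join(p.itertext()))
    return hits


if __name__ == "__main__":
    print(find_tagged(SAMPLE, "placeName"))
    print(paragraphs_mentioning(SAMPLE, "placeName", "Town Y"))
```

The point is only that once the text carries even very light markup, a few lines of stock tooling can answer questions like "which passages mention this place?" without any new infrastructure.
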
The second thing that struck me was how little emphasis
there was on URIs or any sort of GUID, including DOIs. I was a bit surprised by that. I asked a question about URIs and it seemed
to go right past the speaker. I suppose
this is a function of the fact that the documents on which they are working
exist in a local database and there isn't a requirement at this point for them
to link to records elsewhere. But it
seems that they will have to face that issue at some point.
The final thing that seemed really odd to me was the whole
identification as "digital humanists". I have to say that I don't exactly
understand what that means, but after looking at things like https://www.hastac.org/
and https://my.vanderbilt.edu/digitalhumanities/ I'm getting a better
idea. I think that one reason why this
puzzles me is that the Linked Data world (with which I'm more familiar) is
fixated on connecting all information of all kinds and therefore Linked Data
advocates in the biodiversity informatics community aren't interested in
calling themselves "Digital Museum Curators", "Digital Scientists", or something like that
because they consider their interests to include agents, literature references,
and geography in addition to collections.
I think that some of the differences in approach I've seen
here are related to a difference in scale: biodiversity informatics
involves assembling many small individual records that are scattered in many
places vs. digital humanists marking up larger works that are localized in a
few places. In any case, I'm impressed
with what the digital humanists at Vanderbilt have accomplished and I'm looking
forward to learning more from them.