''XML-Centered Digital Library Metadata Management for the Repository, the Catalog, and Beyond''
//ALCTS Exchange, May 2017//
This hypertext-style poster illustrates the use of XML schemas and technologies for digital library metadata management at the Dartmouth College Library. Click one of the links below to get started, and explore from there to learn more about our workflows, implementations, and skill development opportunities. Thanks for reading!
—Shaun Akhtar, Metadata Librarian
[[Instructions for first-time readers->Instructions]]
[[The Beginning]]
[[Acknowledgments]]
Contact me at (link-repeat: "shaun.y.akhtar@dartmouth.edu")[(open-url: "mailto:shaun.y.akhtar@dartmouth.edu")] or (link-repeat: "@ShaunAkhtar")[(open-url: "https://twitter.com/ShaunAkhtar")]
<img alt="ALCTS Exchange logo" width="250px" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAfQAAACcCAMAAAC+2Kk7AAAA81BMVEW/v79/f39AQEC/6ebi8M/7
z8395Mf+8/L81qr0iILxZFz4+/Pv+fnv7+9gyMG324f5u3MQEBD3mSyf3drT6befn58gICDf39/9
6tX++PGh0GMwt68QrKKSyUv95+bPz88wMDBQUFDw+Off9PNwcHCvr69gYGAgsajp9NuazFfwWE/+
8ePP7uyPj4/3oDr5t7PvTEPF4p9/0s32n5r4rVb6yI71lI7b7cP6w8Co1G/ycGhAvLXzfHVwzceP
2NRQwrv4q6f83bi+3pOv4+D4p0j5tGXM5quv13v7z5z6woH829nuQDYAppyLxT/2kh4AAAD////q
WvsUAAAAAWJLR0QAiAUdSAAAAAlwSFlzAAALEwAACxMBAJqcGAAAAAd0SU1FB+EEGxQpGsFfeeQA
ABelSURBVHja7Z2JVhpLEIZhQNEgICqgRtkXN9yyGDXGGzXrQOj3f5rbe1cvM4Bo4tJ17rlJEMdh
vunqv/6uHhPIx6uLhL8EHroPD92Hh+7DQ/fhofvw0H146D48dB8eug8P3YeH7sND9+Gh+/DQfXjo
Ph4b+tFihv0ls3jkL+5rgF79ePwmk5kjf/2Uybw5/lj11/eFQ/90EYaY+ZvqHI4q+VsYfv7kr/DL
hZ55txcy5p/R4tzcIvrMqId77zL+Ir9I6HOLYciZh3PVcG4uzHwKOfUwXJzzl/mlQc8c7YWK+R56
R6D/h/YU9fDNkR/uLwl6dfE4BMwpbQx9D/0XAurh8aIXdS8F+tFFGGrMj3FeJ9DDT5lQox6GF76G
ewHQq+/2QoN5uIguGHSs50zqWNT54f68oc99DkOLefi7GjLoYfV3aFH3ou45Q8/8txe6mL9BHwX0
j+iNi3q4958Xdc8R+m8p3gzm4VHmWEA/RkehkzoWdb/9dX9m0I8MhoArB02h8xvARd2LuucFvfrx
OIxkLlI6g85SfQT10BvzzwX6p89hGMOcizcOnYm6SOremH8e0OlSihUZ+fpv/rdMhv5RReItVZRx
fqsf7M/RnPHx8kf64nFMbmfWK0jv1JKNyfDemn0G0DNVtbjiYikZC+hk8SWS+t4Rnhd8yf7kR/oi
qbT4MqqDJLHdNehkmTXivcSZO7pY9Nf/KUPPZrNra2iR2ufSczc4os8mdGbAW9RJY0X13TH+jtXV
paUlD+FpQl9Lp9fQ9hpl+HmOtUaZGVsNawk9rM7Zc8HFJ27cY+Y9tJpKrXoKTxL6Zjq9iU6GjDq1
z6sfL3Tm0ouB0N+hPZ36xceqMO4x88Ep6qVSl57CU4S+M0ynh+nmUFDn9vlHKMyO0LENfY8b8KFw
7IhxL2aD1cGglRqkUoMDj+EJQi8Q6AX0Q1Eno1YTdccZhVdBlwY8xYxvFJkdKPNrlCfQ8xOdaDKZ
01+pJJMR782V2kEQNJI1+N1WVORXO+TtQbvjoQMVNyTQh9mzoaT+hve1yzXWj+jCBf1C3CRsSjgW
OoAyH6RaAwJ9axItVx6N+sapj0butyaLIx6BwJ4c2SE+eqcsXimXPHQR3xn0E7QrqAt2xD5nhjy3
3U3oYbWqv499J2PeQ6cU+uDLBOdJmEwEPVfH70wmEolOF8MvGSO9PAqMkd7Gb2/jtyfaGH4j56Gz
2GXQdwl9Sh3odj6CL2BxBqEvootjXOZlZGsV+V7GHLPuMei98efQGHXV2IyDjpkXRUrINSR1EcHI
mBM6o1EXqXxQ99BpYN4U+vAsO2TUdd1OjJYqOnZDP8Zf0gwdqgYo862l2wGDPkiNO4fKaFTpjhoT
QA9GRTWVI+tOsaDXR4H6R8mcQl4t9B8C+g+s6HBsImCvMo6ftHYaBZ2q/E+G6/4fQm8J9Dy6FNDH
SrkkRoMhV8ZCLxmUG/VaLPSc/v6Kn9NpNIcC+rC5Q5hns2cwmbN6/B0QddJ7J/b6O/MOWUS3S0uE
+kFrIKAPWmNlXIn8LzkWelklawY1F5/eI9Xg64Z+oqDfoE3CfFNVbsB5U8Y8g87sdd2A57r9LaH+
Fl0p6Kfx59AZFfH/+7qUc/GqGdnAkf0t6DkP3YptBX0brVHmQ4P6InfquDFPdrgoj1558qpWI9RX
l7YU9N44GddmqbgzBnp7nBKz5vTR65jGp4O+NlTQh2tZxtygrgYzHd7VqrYal5kzmVPq/G8M+mB1
nIxjsqwxBrrFdOwbbIHvoaNzCP0H4sw16nDdPNyjnW9w3R0Y8II5oY6uIfRYA77NFbYu5VzQi6PO
lNAruJgPSjkPHcTOEEDfzDJZZ1AHHTJig2oGNErLjhrAnEg3quYE9FgpVxSDUZNyLuhmhTZBKqgE
xIyrtzs5D51HAUDH8/l3PvIhdWC7M/E2N6d3W9DNjAbza/SFaXgB/Sr6HEpUxiEq5YoPDh0fqMt8
20bJQyeR3VbQiYbbpga8Tn2R2+7cnGPqHdpwF+J9ivngdolpeAk9xoAPqIzjVXXp4aFTm5aO93LN
QxcyjkJnuv07NeA16r8Ne13U6YYxrzHvoS9Mw0vo0VKuAmbyLvDP3NA794JO68LuSHPzXi30XQmd
12q7rG4H1C/Qx1BbcFOOnBj7ZOeLxpzZ7pS6hP42WsaNAhFlcAM8jHrXqvyiuajzGqGnBd10U9Rq
6eZQp36EPhv9ktqCy28yy6MjnfmA2u6UektAjzLgc0V9RbT9sHW6adXUXj30goQua7Uf6kVG3dDq
BnSi5+mhNOZ5Xq+Ryi2lXoyScQkVbSXlXNA70zpyVp2QeO3Q5aDezMr6fNhMA+a7GHrG3NSoQyeb
FQn0HoCeag1kvc4qNzr8nSdRV2Nbl3JO27xoeO9TQq976CeSuUr0vJeCjfo0fZux+mJBD1l/e+pa
kma2O0/qkrqzl8J005WUc0JPGgm6ZLRFxEPPjTz0XTnOwfDe5ZJ+96SJmic3qGCuvljQ8XyeR1en
S6h1usWF+hYY9HKsOw14qNf1ade9QFYflXOaNOvHQq8VS9otU3ztQu5M5fY0yOlrpHg/x3k9XWCN
FSZ1AzrVcES55bFUW70kJTmY4FMpleFvnTLOsEzk4qkbOlHg8kN1iqawM6E3sDLMKfkwTvy/fOjn
aj6H0M/RSaGJsmu7soIzqOvQmW5nNVpvFQ/3/BfaPSGhq3n9Os6NQ8qVywnoQOElaor6qEs81Vwp
wMXemPV00lFVpm2zuU7wSrql4qA3gYaD0Dfx13YKzKrjXo1OXYPOazXaBEmMtzwx2XsadEXdNuDL
mozTpFxCL+UCyJFHMTleyJXKoBzMvXboN0C3K+iFtPgSCeHKrkU3RvJUnloSmInLnspD6JK6ZcBX
gsAqwZIBuw9qgRbg5qi0ialabPRthu3ANthL3Tp5e5B8Hc1ScdCp7S5qNQ599ySLxdtuVrizdAOE
RR1AV54M2djA/dal3mkLLZ32wNIqp76FfPxT6GuwPqfQz8+wePtBCfN1mB3XSiuArq2lpoDtPrjG
ou72Uq2ycep+M+O/hb4JPZl0evsGi7fvu7xqO+Gz+83QQV1C17zXU9YDSzYtMvhfsKi72hLeO6P+
1hP5l9B3NB8unSXibaiseM55e+igLqDrfvsW/5cy27fyB2hpKaV5c34z47+EXgDMCzs4228Cvmwu
35Zzu06dQzfWWAasEzIv53ZKGufzgzygnvdI/h30rGS++x2Lt2xawzuk/yYd0S7qvAXaYD64pLiV
7S5UPRZ1X3qC+pZ/MMW/g/6dMyf2+tm5VqdLA765Mxy6qFPoFnO2u0HU67BOv77Fou6aU//imfwz
6D8I8+0TLN5OdodDCzrZzHiuL7Iq6gS6gzndx7QKjRlZsvW4MY+pX3sm/y69ZzeFvT50QCee+1l2
e+ikjqG7mBPPXbPdYQs0N+bf+ucO/UshdybsdTf0c/Sd7Vx2UCe/osvBnFThmu2uQSeibgkd3Hok
/7JObxbAQLagD5sItkiavTRO5mTttDWIhI5TwZVpvucSkSGXzWvqNc12rUS8To/bSXYD+oiSksN7
VUeMOBm3XwtOJOE08cEZ6UcYc2CU6CeZ1Zx0H7gWfZlyU0FPw8hm00Y0jXfI2CHfnHIHhm6+hAt1
/R366Y0iQy6wgEeLaF0zgVp30T95rl+HBypbnrv6zoiTcS7A5uBBnS304Ez15bzYA5ca2ueuO7bj
BNGXKTEV9KeRiCaAnis7P2JJvar1UeSSRetY3dwDQC/BIzbGQNfPKebAjrMtmo9cejjoazB72+k9
jdC2M73fkG/OO9P7FkKpuPQ+6K3eAzrphrRfBE202qjqFF0HK3Zmh64PyNwY6Fr2iTxwre787OXE
40DfUdLdBX0XpaHxDs26hfX1BTf1U5QyKzYInQj4g3tAh5dbptW2+3Mnow5XmhV6RT9efwx0bSqK
OnBpNMnpPiD0TdoDtxsBfQ1t7zTdzP/Mz/9xU2+l2BKbCzor1Xv3gl5Ro1c0yFUcjfKIPoNmgst4
L+h9/XD1cdAhjogDRzPXP9VDQb9BO9vCjnNAJ7a7WkzXmRPoTupkSf12acsF/fKWdstuHaCre0CH
VzMZp+LiriKgfi/oZiKujINeHwc99mwjUtNs0AuEOjPeb7Yt6AXSQAc3MwLmFLqL+u2SMOB16KRS
o/Y7Zp6/F3SktFyxYkzz4OLUjCMEAZzgi5VZoFdih6JzbknGQzfOth4E9SicDwS9OWTUh9uFHZRd
axrQm82hsZlRMWfQberMdm8dGNBbcqGNMDfa5NR17kbW6cbN0dBvgiBiMDbYJ08EDgj3gd62xNZY
6Ooucx4Yni1/bEIONPTBhn51wrPV6eeC+nC4uYZQtqD7cQVrM6NkzqFb1FlD7JXuyeWXEFpl3RWU
+TWaRjA7J+uEdnVrznQJ1LqQ8+Cl+0AvW2OsNg66qutcBy65VSE/htauH3XC09uwQ0V9uI21vGib
ka3v8mmCBnMB3aTONi1C9500z6ADPslT5mbr+8TQtQoNCLu2ew6oWWnUPXAmPhmZi8sNp1PkLh0S
MdDLEVKd3gxGu+hDQSepW1FPp3+c4RuBK7dtvslF28womUvoOnWxaVFub2FtcvKhBJR5D90TOhTP
pa4l5o15vm9+a3nWOl3+yLYcocXx0MvR0CPP1rUl48GgnwwBdfps2BMm6sB2NjDVK+YKukb9oAU3
sm2JhlgOnTG3nic3OXQwBQJ11nHOAIF5zdozO3LyZ3Zyzh8eYRIkI6E3oou/ZA09FvTmEFDn6p00
va+dK9YnsncGMAfQAXW1aTHVGlyuytZ3Bp0ztzauTgE9MU7hF+ON8Zmgd8DwbkRasTZ0ruUcB578
bB8SOkvdnLos2TbX8HCXWV3keY05hK6oq02LeYQH+epbWKcL5nl0f+i2ftZrZVUBFdGDQ++CibwU
acUmo25L+8CJyc/2QaHzwU2pgzp9+waIOq7oNOYadEFd9MLS7Wwov6WZM4K5/TCKyJKtEqvlXGw6
kdk95hpqP7MfCV1L6bnIMaqgB/ocYB+4H790M9kJJxJTQ+cTN6UOzZlddCZ7aljtpjPXoXPqV/zJ
z6Q/JmXsZZPM306+ypYct8xlVcrJyZNGMBoTyaifXNSm4yASuqovqNS0DzzF2cad8PTQZere2YbQ
iSezK4x5YsAbzA3ojDrxZMhuphYWb5oBn0pJ5o79LVNBNz97IuKKlx4cul6m9aOsWHUKBtQ4f0h+
inZgxKNAFz1wmPoOSO/cfWWiroDOTeYmdEL9Ev93Kx9GcbsEoB9I5o725+mg10YxS9pBrDM5E/SK
nqwrUcUWgA6KjcqE0INoog8IXeq1AnjOFLPdh6Ijnsg6g7kFHVPH0k10t2ubGek0no9+0tB00PUP
3/5r0PuG6KpHVFsQeg1KjKcEfUeBbsqOCbiiSox5tHE3DvodIvb61sDazIhHd0vdAK1ZoSfifNBH
hF43TLh+xClA6KDY6Dwp6OpBsGnpzW3qvRO0T2Z+OQ768jx5zxV0ZLmso7o9pXbAzAq9HFOlA7fj
oaFXTDumEpFtNOiqy6uYe1LQ5V61dFZQX9O63fF8Po8+7KOND4du6Ic/N9D+LzSv+/CigMPM5QZG
5zblqaBbZXD/76j3tlVS18cVEHoN2X5S0GUbXDotVlrRmrHGsoy+/fn2FaGv32zo/HXyFp0628xI
dLtsl+rFJ+zxdXrFqtO1PrTkKHolZLY6Xf0yP/Er3wK3FatDN5rqxpZsE0GfuU5H6kly9NduEura
pkWm2/fn5YhegdBXfuEM8JNkgPl9c/XlkqR7WqtJ6Kfx0MeXq47bves8VHnyI01yMrW4nNCNgW67
SbHmDC/Z6rHQZ3XkpAHPbVhKHW5a5LXaB8RS+/I6Qgt3Avr7BTnXH6IP1prbQYt7MhJ6a0bonTF9
I7m4le5ZoHfjoBdjoJttdW4b1jRz+48OXbTBUXMGUy+AxVRRnx+in0KlL2yg9eWV+fkVfANsLNzJ
im3FWnPLo2tWn8u9bGg26KD7veHufq5H5vdabRboxYlb2UzodnJydFCZXlL78aGfwW7YAv1lu9a6
2sLGipjEaUrHIVM9eW1jwV5pJUZMHnbDpmaE3gYrVyWn5OpHDXV8v+hrq1NB78RrvkYc9Fo09EbU
bFR+fOjcgOc27I2q14EP9x4ta04Mjp+wZEPv7ZVWXJ9fwRboHpoNek27cIGjDw22LtZzZjGnbR+Y
Cnp3jNLPxUC3Co6kayGhHbHA8HjQv0PoJ0hUbpr3ur8uKzRcvu3Tkf5BjvT9fXulFc/n/OkEHPqX
GaEH2sBIOEda1029azUaTgM9N66868dBz5Uja8Gi8xC14t+AnoXQs2e8ctP99l98LHPxhoUcMWQW
3jvzQF7Uai3YRLE0G/S+od26Li0Hu5Tlr0pP1O1myWmgl8ZBr8dBN42IpNt1aIh8VSqO/gZ0ptzE
nP6DaXhjjYXO2ivLeDpfOBTq/RCLuv3lFW3GF9Spbs/DOT0/RTuMS5kXjZHt2PNiJtNyFxfU7bJr
TE4DvRHR7FB2LLU5oBuzQ9KpTMnHSnYSpXZ5WnPG3TwwHvqOgk6fJIap35hrLAvoPRPuWp1O74L3
9nuvmG5nBjyDfjAb9K7VLJN0Xsj6ZJJrCuiVqJLA4bI5oevFenLCz/7Y0JF8Nuwm82qwhjfX1T4g
mc01R47ke1akg/jKdTt7iiCFfolmgp6wrxocJ2qo5WKo1+81p/ej2iBrDvXtgq7PD8kx3VV/Dfqa
gM6fFYihr8OMffhhA79rxe29r+Avacb8n5V1Dr1HDXgKfXU26GVHLnfuX46hXr9f33s9siGubBeI
TugasSSa2PZ5XOiI98jxZwXi+XwZUGf2OnHXndC/oeVvWNR9fQ+YL3MNTw142iOHZoKedLY8B85X
c42I3J67V8lWiW5la9sllxt6JXodyTXW6/W/Av1G2LDnQrcL6spe3/jqhj6/oRvzhLnQ8JfChr2a
CToQbYG7ctcWXkplh1lauqcjF9OD5ei+dUOP0B/sR1onG+SCvwK9yaBT253rdkr9Ds/Y67we+4kO
XdAPxYzOjXnGXFAnjXMEemsm6I2I7cHdqCaakpHjy8ncfW3YcmR2h4V2Jx46mCPswrSvYS/2wed9
VOiIPjGSNr7KWg1TF+srhgGvQQf3Ar1F1kXRTqmTpwmSJ0ZG/+RKMjo44Zx6RddS4AvWb1LuN/iF
Lgf2fhFUcn+fOpmE9SMczZYl6+2JiPOpWe/Uv8p3VNe7He0orh9lRem+0M8IdNI9IZmThI1+Qjn3
dcMBfQVm/T8rPxGfDES9vrRKoPtnx/3lmOzpUrvpNPnNXII5lWYfdA3/TRlvCjqw3dl8/kF1WxDq
q2grlep5Ck8S+kk6fYJ2GfMV0R21rFPfn7ehr+/rzJdZgceMeUz9LTpNpfwjgJ8m9Gyz2UxT5qA3
wqD+C92Z0O/QL5O56JQkVs4CMeBbLf802KcJnWp4tKDsdQf1Fcf+dGC7K+aq24L0w/t4ytAxbr6E
8sdNXSIW0FeAXasxl8b8hgfwxKFDe91F/b1I5gK6SvgWc2nM+3ji0Pfn7dgAr6IN/pr4E8kv7Yuv
abHvr//TH+k+XuFI/7Xyxw6V4YUVw9O7tGtcuZ0b9z6ewUhfuIujzk1XBl0as27md35Gfz7pfX05
mjpfXmHQxR4IF3PSF+/jOc3p0j63qdP9Sxw6z/UO5tS49/HchNz8Nzd1ZsBT6Lytwmb+bd5f9Weq
3kFfO6RODXgKnQ16kznpi/fxjEs2zarh1OlETqDf0endYO7tmBdQp0NTllE/VA/5XzGYE+PVx0sw
Z8DyC6NODHgMnW1/AMwPvc/+YqAj8MAZSp1sYsLQyUYnwHzZi7eXBZ2IukNFfX+dQMf/l8xZX7yP
lwUdib52Qn0ZvZ+fx+NdMH//1V/kFwqdG/OY+uHGwvz8wsYhZe7t9ZcNHdHnjWDqX9HGBlogzL29
/vKhI2LML4u/Lnt7/ZVAR2SUa3/6eAXQfXjoPjx0Hx66Dw/dh4fuw0P34aH78NB9eOg+PHQfHrqH
7sND9+Gh+/DQfbyE+B/dGtZuY+5MVgAAAABJRU5ErkJggg==">Welcome! The (link-repeat: "Digital Library Program (DLP)")[(open-url: "https://www.dartmouth.edu/~library/digital/")], a cross-departmental committee, has recently approved proposals for multiple new projects, and work is underway.
DLP members assessed each proposal in multiple impact areas, including digital production, conservation, metadata, rights, technology, text markup, and digital preservation.
These new projects have progressed through conservation and digital production stages. In addition, we are reviewing the metadata for earlier DLP projects to ensure they meet our current standards. We're ready to initiate metadata work on one of these collections. Where would you like to begin?
[[The Occom Circle]]
[[Jewelry Design Books of Jaques and Marcus]]
[[Sanborn Fire Insurance Maps]]

MODS, the Metadata Object Description Schema, is an XML schema that has been maintained by the Library of Congress since 2002. We use a locally developed implementation influenced by the //DLF/Aquifer Implementation Guidelines for Shareable MODS Records// and the RDA-to-MODS mapping. For a given project, both collection-level and item-level records are created in this schema.
(link-repeat: "Dartmouth College Library MODS Documentation")[(open-url: "https://www.dartmouth.edu/~library/catmet/metadata_nonmarc/mods_docs/")]
(if: $project is "The Occom Circle" or $project is "Sanborn Fire Insurance Maps")[[[Where will we write and run our XSLT?->Oxygen XML Editor]]](else-if: $project is "Jewelry Design Books of Jaques and Marcus")[[[How will we create MODS?->Oxygen XML Editor]]]
Current project: $project

TEI, the (link-repeat: "Text Encoding Initiative")[(open-url: "http://www.tei-c.org/index.html")], is an XML schema widely used to accurately represent the content of manuscript material. We have applied TEI to encode the text of letters, diaries, codices, and reports.
Although TEI is primarily for content, not metadata, the header of a TEI document created according to local standards will contain a significant amount of descriptive information—including titles, names, summaries, and genre headings—that can be transformed into other schemas.
In order to efficiently convert hundreds of documents from TEI to MODS, we use an XML technology called XSLT.
[[More about XSLT->More About XSLT]]
Current project: $project

(either: "The records validated successfully!", "The validation found a couple of minor errors, but we have cleaned them up.") Now we can upload them to our repository.
Content and metadata for our digital collections are stored in a locally developed repository called XCDAS ("XML Collections and the Directory-based Archive System"). It integrates multiple open-source tools to provide version control on text documents, run fixity checks on binary assets, create derivative versions of content for display, and deliver them through interfaces optimized for image, TEI, and EPUB collections.
XML Collections, which stores metadata and encoded text, has a web interface that supports batch upload and download. A file or set of files must be locked before it can be downloaded, which prevents update conflicts between users.
[[Upload MODS->Next Steps]]
Current project: $project

MODS records for (unless: $project is "The Occom Circle")[the] $project have been added to our metadata repository. As we make changes in response to the evolution of our metadata implementation guidelines, or if we have data errors to correct, we will initiate the process by downloading the records from XML Collections.
Now we can use (if: $project is "Jewelry Design Books of Jaques and Marcus")[an XML technology called ]XSLT to export metadata from MODS to other formats, as necessary. One primary example is MARCXML, which can be losslessly converted to binary MARC for loading into our local catalog. Another is XML for Crossref, our DOI (Digital Object Identifier) registration agency.
(if: $project is "Jewelry Design Books of Jaques and Marcus")[[[More about XSLT->More About XSLT]]
][[Let's create MARCXML->Create MARCXML from MODS]]
[[Let's create Crossref XML->Create Crossref XML from MODS]]
Current project: $project

//(print: (passage:)'s name)//
XSLT (Extensible Stylesheet Language Transformations) is a programming language optimized for processing XML documents and producing text output, such as new XML documents, HTML web pages, or delimited formats (e.g. CSV) for data transfer or spreadsheet applications.
Constructs called "templates" are written to match specified portions of the input document(s), select and process the data therein, and format the result within the output document(s). A related XML technology, XPath, defines the syntax used to address parts of the document and the functions that perform data manipulation.
XSLT is itself an XML schema and can be created using XML editors. XSLT processors can easily apply a single transformation file to a large batch of input documents.
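To make the match-select-format pattern concrete, here is a minimal sketch in Python's standard library rather than in XSLT itself. The record structure, element names, and sample values are invented for illustration and are not drawn from our collections. The `findall` path stands in for a template's match pattern, and `findtext` stands in for an XPath selection.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical input: a tiny batch of records in an invented flat schema.
SAMPLE = """<records>
  <record><title>Map of Hanover</title><date>1884</date></record>
  <record><title>Map of Lebanon</title><date>1892</date></record>
</records>"""


def records_to_csv(xml_text):
    """Select each <record> and write its title and date as one CSV row."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["title", "date"])
    # "./record" plays the role of a template's match pattern;
    # findtext() plays the role of an XPath selection inside that template.
    for record in root.findall("./record"):
        writer.writerow([
            record.findtext("title", default=""),
            record.findtext("date", default=""),
        ])
    return out.getvalue()


print(records_to_csv(SAMPLE))
```

An actual XSLT stylesheet would express the same steps declaratively, and an XSLT processor would apply it to a whole batch of input files.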
(if: $project is "The Occom Circle" or $project is "Sanborn Fire Insurance Maps")[[[More about MODS->More About MODS]]](else-if: $project is "Jewelry Design Books of Jaques and Marcus")[[[Let's create MARCXML->Create MARCXML from MODS]]
[[Let's create Crossref XML->Create Crossref XML from MODS]]]
Current project: $project

This poster was assembled using a popular hypertext tool called Twine. Each section contains text, and one or more links that allow you to view another section. There is an "undo" button to the left of each section title that will return you to your previous location.
Internal links, which connect sections of the poster, will appear in blue if you have not previously selected them, or purple if you have:
[[Back to the introduction->Introduction]]
Sometimes you will encounter multiple options, based on branches in our workflow. Select your favorite! There's no wrong way to go, and in this poster, no dead ends.
Following a direct path from start to finish should take between 5 and 10 minutes. To learn about other parts of our metadata management processes, return to an earlier location and make different selections.
External links, such as those going to supplemental information, will appear underlined in orange, and open in a new tab:
(link-repeat: "More about Twine")[(open-url: "https://twinery.org/")]
Twine requires JavaScript to be enabled on this page.
That's it! [[Let's get going!->The Beginning]]

My Dartmouth College Library colleagues in Cataloging and Metadata Services, especially Bill Ghezzi, Mina Rakhra, and Cecilia Tittemore
Rose Reynolds (Smith College) and Amanda Wise Pizzollo (Simmons College and Amherst College), for inspiring examples of applying Twine to library contexts
Members of the Dartmouth College Library Research Collaborative, for invaluable feedback and suggestions
[[Back to the introduction->Introduction]]

That's it! Congratulations! Descriptive metadata has been created for (unless: $project is "The Occom Circle")[the] $project collection in MODS and transformed to multiple formats to improve the discoverability and reuse of these resources. The MODS records are preserved in our local metadata repository.
Please consider joining the ALCTS Exchange conversation in the Poster Discussion Forum at 4:30 EDT on Thursday, May 11. Which metadata schemas and technologies do you employ in your own work? How do your workflows for digital library metadata vary?
Feel free to contact me with any questions, comments, or feedback over (link-repeat: "e-mail")[(open-url: "mailto:shaun.y.akhtar@dartmouth.edu")] or on (link-repeat: "Twitter")[(open-url: "https://twitter.com/ShaunAkhtar")].
Thank you for reading!
[[Acknowledgments]]
[[Return to the introduction->Introduction]]

(set: $project to "The Occom Circle")The Occom Circle includes 525 manuscripts related to the life of Samson Occom, an 18th-century Mohegan leader, minister, and author. These documents have been drawn from multiple archival print collections at Dartmouth's Rauner Special Collections Library.
The metadata assessment for this project determined that item-level metadata did not previously exist. However, the project did involve transcription of the imaged manuscripts by the Library's Text Markup Unit, using the TEI schema. As a result, the new metadata created during transcription can be extracted from these files, and converted to MODS, our primary metadata schema for local digital collections.
[[More about TEI->More About TEI]]
[[I want to work on a different project->The Beginning]]

(set: $project to "Jewelry Design Books of Jaques and Marcus")These books contain drawings of custom-made jewelry created by the Jaques and Marcus firm in New York City around the turn of the 20th century.
The metadata assessment for this project determined that while there is a catalog record for the collection, which is held in Dartmouth's Rauner Special Collections Library, there is no item-level metadata available for the eight individual volumes that comprise it.
That's okay! We will create original metadata for the books in MODS, our primary metadata schema for local digital collections.
[[More about MODS->More About MODS]]
[[I want to work on a different project->The Beginning]]

(set: $project to "Sanborn Fire Insurance Maps")The Sanborn Fire Insurance Maps collection contains detailed insurance maps of New Hampshire towns and cities published by the Sanborn Map Company in the late 19th and early 20th centuries. These maps are selections from a print collection held by Dartmouth's Evans Map Room in Baker-Berry Library.
This digital project was originally completed a few years ago. However, we are now enhancing the existing descriptive metadata to meet our current standards for new projects. We will upgrade the existing metadata and convert it to MODS, our primary metadata schema for local digital collections.
The metadata assessment for this collection determined that there are opportunities to record additional transcription elements, expand the use of authorized access points, and add coordinate data for the geographic subjects represented.
[[More about earlier projects->More About Earlier Projects]]
[[I want to work on a different project->The Beginning]]

Metadata for earlier digital projects may exist in a flat XML format such as Dublin Core, or a custom field structure, within an older content delivery platform or other repository. As metadata standards and delivery systems evolve, this metadata should periodically be reviewed for accuracy and completeness.
[[Where can we conduct this review?->OpenRefine]]
Current project: $project

We primarily create and process our XML documents using a desktop application called (link-repeat: "Oxygen XML Editor")[(open-url: "https://www.oxygenxml.com/")]. It includes a variety of XML-specific editing capabilities and an XSLT processor.
The most important Oxygen feature we use in working with MODS is the document type association, which provides a way to integrate a custom "framework" of files into the creation and management of each document that uses a particular XML schema. We have developed a framework for MODS that encodes and enforces the decisions of our local implementation guidelines.
(link-repeat: "MODS Oxygen Framework")[(open-url: "https://github.com/akhtars/mods-oxygen-framework")]
This framework also facilitates transforming MODS metadata records to multiple other formats for distribution across local and external systems.
(if: $project is "The Occom Circle")[[[Let's transform TEI to MODS->Create MODS from TEI]]](else-if: $project is "Jewelry Design Books of Jaques and Marcus")[[[Let's create original metadata in MODS->Create Original Metadata in MODS]]](else-if: $project is "Sanborn Fire Insurance Maps")[[[Let's transform the flat XML to MODS->Create MODS from Flat XML]]]
Current project: $project

Our TEI-to-MODS XSL transformation primarily extracts descriptive information from the header of the TEI document, much of which has been created according to the rules of DACS (Describing Archives: A Content Standard). It also retrieves authorized names for authors, recipients, and locations that are stored in separate contextual files created for the project. Some data elements, such as an extent statement with subunits, can be populated by processing the encoded text portion of the document.
Having created an initial set of MODS records, we can now review them, make any necessary changes to the transformation, and proceed with further batch processing.
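As a rough illustration of the header extraction described above, the sketch below pulls a title and an author from a TEI header and writes them into a skeletal MODS record. It uses Python's standard library as a stand-in for our XSLT. The sample document, the choice of fields, and the function name `tei_header_to_mods` are assumptions made for this example, and the output is far simpler than a record produced under our local guidelines.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)

# Invented sample document; real project files carry a much richer header.
SAMPLE_TEI = f"""<TEI xmlns="{TEI_NS}">
  <teiHeader><fileDesc><titleStmt>
    <title>Sample letter</title>
    <author>Sample Author</author>
  </titleStmt></fileDesc></teiHeader>
  <text><body><p>Body text omitted.</p></body></text>
</TEI>"""


def tei_header_to_mods(tei_text):
    """Copy the header title and author into a minimal MODS element."""
    ns = {"tei": TEI_NS}
    tei = ET.fromstring(tei_text)
    stmt = tei.find("tei:teiHeader/tei:fileDesc/tei:titleStmt", ns)
    if stmt is None:
        raise ValueError("TEI header has no titleStmt")
    mods = ET.Element(f"{{{MODS_NS}}}mods")
    title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
    title = ET.SubElement(title_info, f"{{{MODS_NS}}}title")
    title.text = stmt.findtext("tei:title", default="", namespaces=ns)
    name = ET.SubElement(mods, f"{{{MODS_NS}}}name")
    name_part = ET.SubElement(name, f"{{{MODS_NS}}}namePart")
    name_part.text = stmt.findtext("tei:author", default="", namespaces=ns)
    return mods


print(ET.tostring(tei_header_to_mods(SAMPLE_TEI), encoding="unicode"))
```

The lookup of authorized names in separate contextual files is left out here. It would add a second input document to the same pattern.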
[[How do we ensure these records adhere to our local standards?->Schematron]]
[[Extending MODS->MODS Extensions]]
Current project: $project

The framework also uses (link-repeat: "Schematron")[(open-url: "http://schematron.com/")], a validation language for XML documents. It can be applied in conjunction with schema definition files such as XSDs or DTDs. Schematron defines rulesets that are used to evaluate each input document, asserting the presence of individual elements or attributes, the co-occurrence of multiple elements or attributes, or the use of specific data values. Each test may be accompanied by a customized error message.
Schematron is especially valuable for a schema such as MODS, where the canonical XSD file provided by the Library of Congress permits a very broad range of technically-valid implementations. We use Schematron to enforce requirements established by the DLF/Aquifer best practices and our own local guidelines. The validation layer is combined with modifications to the XSD file itself that identify closed controlled vocabularies for certain elements.
Rules are associated with a particular XML context, such as a specific element or data value, and are only evaluated if that context is found within the input document. This facilitates designing efficient validation routines for application to metadata records that describe a variety of collection and item types.
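The context-and-assertion model can be imitated in a few lines of Python. This is not Schematron syntax, and the two rules below are hypothetical examples rather than rules from our framework. Each rule pairs a context path with a test and a message, and a rule is skipped when its context does not occur in the record.

```python
import xml.etree.ElementTree as ET

NS = "http://www.loc.gov/mods/v3"
MODS = "{" + NS + "}"

# Each rule is (context path, test on the context element, error message).
# Both rules are invented for illustration.
RULES = [
    (f".//{MODS}titleInfo",
     lambda el: el.find(f"{MODS}title") is not None,
     "titleInfo must contain a title"),
    (f".//{MODS}name",
     lambda el: el.get("type") in {"personal", "corporate"},
     "name type must be personal or corporate"),
]


def validate(record):
    """Return the messages of all failed assertions for one record."""
    errors = []
    for context, test, message in RULES:
        # If the context never matches, the rule is simply not evaluated.
        for element in record.findall(context):
            if not test(element):
                errors.append(message)
    return errors


GOOD = f'<mods xmlns="{NS}"><titleInfo><title>Sample</title></titleInfo></mods>'
BAD = f'<mods xmlns="{NS}"><titleInfo/><name type="unknown"/></mods>'

print(validate(ET.fromstring(GOOD)))
print(validate(ET.fromstring(BAD)))
```

Because the first record has no `name` element, the second rule never fires for it, which is the behavior that keeps one ruleset usable across varied record types.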
[[Let's validate our records->XCDAS]]
Current project: $project

The Library of Congress maintains XSLT for converting between MODS and MARCXML, in both directions. Our framework includes a customized version of the MODS-to-MARCXML stylesheet that is optimized for our implementations of the MODS, MARC, and RDA standards.
Before loading metadata into Sierra, our ILS, we want to contribute it to WorldCat and mint OCLC record numbers that can be stored in our local systems. We use the (link-repeat: "WorldCat Metadata API")[(open-url: "http://www.oclc.org/developer/develop/web-services/worldcat-metadata-api.en.html")] to upload batches of MARCXML to OCLC, and merge administrative metadata from the returned MARCXML into our MODS records to support future synchronization efforts. That MARCXML can now be converted to binary MARC using MarcEdit, and loaded into the catalog.
[[Upload MARC->The End]]
Current project: $project

For reviewing and remediating these records, we use an open-source data processing application called (link-repeat: "OpenRefine")[(open-url: "http://openrefine.org/")], which provides powerful methods for normalizing and transforming tabular data. It simplifies batch edits, the cleanup of typos, and combining or splitting fields.
OpenRefine also offers a feature called "reconciliation", which allows the lookup of local values against various external vocabularies, once services have been written for them. For example, Christina Harlow developed terrific reconciliation services for matching data to, and retrieving authoritative values and corresponding URIs from, (link-repeat: "GeoNames")[(open-url: "https://github.com/cmh2166/geonames-reconcile")] and (link-repeat: "multiple LC vocabularies")[(open-url: "https://github.com/cmh2166/lc-reconcile")].
We can apply OpenRefine to the $project by normalizing the preferred labels for names and places, and enhancing our metadata with point coordinates and authority record identifiers determined through reconciliation. Incorporating these identifiers as URIs helps prepare our records for use in future linked data environments.
Data may be exported from OpenRefine in multiple ways, including as a delimited text file or, using custom templating functions, as more complex text formats such as JSON, YAML, or an XML schema. We will export this metadata in a flat XML schema. In order to generate MODS records, we can use an XML technology called XSLT.
[[More about XSLT->More About XSLT]]
Current project: $project

Our framework includes multiple pre-defined record templates that support original metadata creation in MODS. We have templates for generic collection-level and item-level records, and can create new ones as needed for individual projects. For the eight volumes in this collection, we can use our regular item-level template, and record unique data elements, such as title and extent.
[[How do we ensure these records adhere to our local standards?->Schematron]]
[[Extending MODS->MODS Extensions]]
Current project: $project

One fundamental XML feature is the ability to extend a document created using one schema by importing elements and/or attributes from another schema. Each schema referenced in the document is identified by a unique prefix corresponding to a namespace URI, disambiguating the semantics of each XML node.
We can therefore easily extend MODS as necessary to record additional metadata properties and types. We use a local namespace to record certain additional administrative metadata. We have also begun implementing (link-repeat: "RightsStatements.org")[(open-url: "http://rightsstatements.org/en/")], a vocabulary developed by Europeana and the Digital Public Library of America specifically to identify the rights status of digital cultural heritage objects, using a subset of the California Digital Library's copyrightMD schema.
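Here is a small sketch of how a second namespace sits alongside MODS in one record. The namespace URI `http://example.org/local-admin`, the field names, and the helper `add_local_extension` are placeholders invented for this example. Placing the local elements inside the MODS `extension` wrapper is also an assumption of the sketch, not a description of our records.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
LOCAL_NS = "http://example.org/local-admin"  # hypothetical local namespace

# Bind readable prefixes so the serialized output uses mods: and local:.
ET.register_namespace("mods", MODS_NS)
ET.register_namespace("local", LOCAL_NS)


def add_local_extension(mods, field, value):
    """Add one element from the local namespace under mods:extension."""
    extension = mods.find(f"{{{MODS_NS}}}extension")
    if extension is None:
        extension = ET.SubElement(mods, f"{{{MODS_NS}}}extension")
    ET.SubElement(extension, f"{{{LOCAL_NS}}}{field}").text = value
    return mods


record = ET.Element(f"{{{MODS_NS}}}mods")
add_local_extension(record, "reviewStatus", "reviewed")
add_local_extension(record, "batch", "sample-batch")
print(ET.tostring(record, encoding="unicode"))
```

Each element carries its namespace URI, so a `local:batch` element cannot be confused with a MODS element of the same local name.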
[[Great! How do we ensure these records adhere to our local standards?->Schematron]]
Current project: $project

We generally assign DOIs at the collection and item levels for Digital Library Program materials. Crossref defines its own XML schema, which may be used to submit batches of descriptive metadata to register the assigned DOIs. This process activates the technical resolution of the DOI and permits us and any user to begin providing it as the cited URL for an object.
[[Upload Crossref XML->The End]]
Current project: $project

Our XSL transformation for reformatting flat XML documents maps existing data to the fairly high level of granularity available in MODS, sometimes by splitting fields that contain multiple values or certain kinds of (ISBD or non-ISBD) punctuation, and provides some boilerplate descriptive and administrative elements.
Having created an initial set of MODS records, we can now review them, make any necessary changes to the transformation, and proceed with further batch processing.
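The field-splitting step mentioned above might look like the following in outline. This is a Python sketch rather than our XSLT, and it assumes an invented flat record with a semicolon-delimited `subject` field. The real delimiters and target elements depend on the source data.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)

# Invented flat record with several subject values packed into one field.
SAMPLE_FLAT = (
    "<record><title>Sample map</title>"
    "<subject>Fire insurance; Maps ; Cities and towns</subject></record>"
)


def flat_to_mods(flat_text, delimiter=";"):
    """Map a flat record to MODS, splitting multi-valued subject fields."""
    flat = ET.fromstring(flat_text)
    mods = ET.Element(f"{{{MODS_NS}}}mods")
    title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
    title = ET.SubElement(title_info, f"{{{MODS_NS}}}title")
    title.text = flat.findtext("title", default="")
    for field in flat.findall("subject"):
        for value in (field.text or "").split(delimiter):
            value = value.strip()
            if value:  # skip empty pieces left by stray delimiters
                subject = ET.SubElement(mods, f"{{{MODS_NS}}}subject")
                ET.SubElement(subject, f"{{{MODS_NS}}}topic").text = value
    return mods


print(ET.tostring(flat_to_mods(SAMPLE_FLAT), encoding="unicode"))
```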
[[How do we ensure these records adhere to our local standards?->Schematron]]
[[Extending MODS->MODS Extensions]]
Current project: $project