How a Digital Repository Is Democratizing Science From a Duke Basement
October 18, 2022
Marie Claire Chelini, Trinity Communications
Doug Boyer was a hit at his daughter’s kindergarten show and tell.
The associate professor of Evolutionary Anthropology came armed with a life-sized, 3D-printed vertebra belonging to the world’s largest living snake, the green anaconda (Eunectes murinus). Once the students were done oohing and aahing over the plum-sized bone replica, he pulled a second vertebra, ten times larger than the anaconda’s, out of his bag. It was a life-sized replica of a vertebra belonging to Titanoboa, a snake that went extinct around 60 million years ago.
“If the anaconda is the length of a truck, can you imagine how big this one was?” he asked. The room erupted.
Thanks to MorphoSource, an NSF-funded and Duke-hosted digital repository of museum specimens’ 3D scans, he isn’t the only one able to pull this trick. And that is precisely why Boyer created MorphoSource: to democratize access to specimens previously hidden away in museum drawers.
From a basement at Duke to the world
Museum specimens are often rare, if not one of a kind. That’s why museum exhibits are such a hit. But what we see in these exhibits is a small fraction of what a museum holds, the tip of a massive collection iceberg hidden away and kept safe in drawers, vials and fire-proof cabinets.
To ensure the protection of these collections, museums restrict access to all except accredited researchers willing and able to jump through multiple bureaucratic hoops — and often buy a plane ticket — to visit them in person.
Though necessary, these safeguards prevent the public from seeing or learning from the vast majority of specimens in museum collections. Even among researchers with the correct credentials, geographic distance and cost of travel can be insurmountable obstacles blocking access to these resources.
MorphoSource houses what one would see in a typical natural history museum exhibit, such as skulls or shells (you can even find Sue the T-rex within its ranks), but also specimens like grains of pollen, battle wounds from the Civil War, live animals in their natural environment, and much, much more.
The repository currently houses scans of over 53,000 biological, paleontological and archeological specimens from over 1,000 museum collections on all six inhabited continents. Researchers can upload and download CT scans, 3D models, photos, X-rays and a variety of other file types. Data has been contributed or downloaded by over 17,000 researchers, students, teachers and artists all over the world. By the end of 2021, MorphoSource had been cited as a source of material in over 1,300 scientific publications. And it is still growing.
Originally envisioned as a way to store 3D scans produced by the lab Boyer worked in as a postdoctoral researcher, MorphoSource is now one of the world’s most important scientific data repositories. In a recent survey asking natural sciences researchers which repositories were most important for their work, it tied for first place with GenBank, the National Institutes of Health’s genetic sequence database holding all publicly available DNA sequences. And it reached this status in less than 10 years.
MorphoSource was kickstarted in 2013, soon after Boyer started his faculty position at Duke. With funding included in his job offer from Duke, Boyer hired a web development firm, and MorphoSource slowly went from an idea to a concrete product hosted by servers in the Biology department. While it solved the fundamental need to have a place to archive and access 3D research data, it was still relatively limited in capacity and usability.
Three years later, Boyer recruited a graduate school lab mate turned software developer, Julie Winchester, to join him in the project. Having worked extensively with museum specimens as a Ph.D. student, Winchester had become a passionate advocate for data sharing.
“The only reason I was able to work with 3D data as a grad student is because I got a grant to travel to museums in the United States and around the world to go gain access to the physical specimens and 3D scan them,” she said. “Not everyone has even the possibility of getting these grants.”
Having the means to travel to museums to scan specimens was only half of the problem. Winchester says that without a public repository, researchers who were fortunate enough to be able to collect their own 3D data had no way of sharing it.
“We knew so many people working in this field who had a stack of hard drives full of 3D data just sitting in someone’s lab,” she said. “Scientific data should be shared, especially since a lot of it is taxpayer funded.”
In 2017, Boyer and Winchester obtained a grant from the National Science Foundation, and MorphoSource took a big leap forward. The grant allowed them to hire a team of two additional developers and a digital curator, all with skills complementary to their own.
Simon Choy, a computer scientist, and Jocelyn Triplett, a library scientist with a master’s in classics, helped refactor the software underpinning MorphoSource from the initial proof-of-concept version to a more complete repository solution, in partnership with Duke Libraries. They also developed more efficient methods to upload and store large amounts of data in a searchable way, using widely adopted concepts and tools from other academic and industry data repositories.
Mackenzie Shepard, then an undergraduate student working in Boyer’s laboratory, joined the team to sort through troves of data and ensure that researchers and institutions upload their scans correctly.
“We basically started from ground zero,” said Winchester, who leads the development team. “It took us almost three years to rebuild, expand and improve.”
There was no shortage of motivation. “I liked working on commercial web applications,” said Choy, who used to work for TiVo, “but working on a product that is actually helping a community, helping researchers and educating kids gives me a much greater sense of accomplishment.”
From scientific resource to educational tool
Boyer and Winchester weren’t satisfied with giving users the ability to download data from MorphoSource’s website.
“If you download the data,” said Boyer, “then you just have a large file on your computer, which means you need to have software. You need to have a computer that has a powerful enough graphics card, processor, or what have you.”
“Internet connections aren’t always phenomenal, and teachers in schools often don’t have full ability, or sometimes any ability, to install software on educational computers,” said Winchester.
The new MorphoSource platform solved this problem with an online, interactive visualizer. Thanks to work by Winchester and an open-source software developer, MorphoSource can “optimize literally gigantic files for web viewing,” Boyer said. “We’ve provided a resource and functionality that isn’t replicated by anyone. Not even commercially, or outside of science — it’s unique.” Almost anyone can open the website, enter a keyword in the search bar or browse by object type, choose a sample and visualize it with no need to download the data.
“The interface is still not the greatest for a non-technical user,” added Boyer, “and we want to continue working on it to take MorphoSource from a scientific database to an educational tool in a much broader sense.”
Still, thanks to the online visualizer, any teacher with a computer and internet access can design lessons without the need for expensive software. Artists can get inspiration without leaving their ateliers. And researchers from anywhere in the world, with any type of budget or constraint, can study these specimens in detail.
If the possibilities already sounded interesting in 2017, imagine what such a tool can do in times when travel becomes severely restricted, museums around the world close their doors and classes move online.
In 2020, MorphoSource extended a free and accessible helping hand to researchers whose plans were halted by the COVID-19 pandemic, including graduate students on a timeline, as well as teachers scrambling to find educational yet novel content that could keep students engaged. The number of downloads per quarter almost tripled between early 2020 and early 2021, and that data doesn’t include the number of users who worked directly with the online viewer.
“We were getting a lot of emails asking how to download data, or just, ‘Hey I need access to these data because I’m not able to travel to this or that museum, can you help me?’” said Shepard. “There was a lot of connecting users to uploaders in order to get their requests approved or their questions answered.”
And users aren’t the only ones who benefited from MorphoSource. Museums and research institutions can specify conditions of use for the specimens they upload, such as prohibiting commercial or non-attributed use, to maximize reach and impact without compromising the legitimacy of their collection.
“Museums are only as valuable as the collections they hold,” said Boyer. “A digital presence through which you can trace, track and record how many people are using your collection offers a phenomenal opportunity to promote it and increase its value, even when its doors are closed.”
Reaching beyond the natural sciences
Now that the MorphoSource team has a solid system for managing biological specimens, they are expanding their reach towards another type of museum specimen: cultural heritage objects.
Truly unique, and therefore priceless, cultural heritage objects are subject to even more protection than many biological specimens. By expanding its repository into these specimens, MorphoSource is becoming a vital resource for researchers beyond the natural sciences, and Duke institutes are eager to collaborate.
The Nasher Museum has now begun contributing museum object scans, and professors in the Department of Classical Studies, such as Maurizio Forte, have been uploading archeological findings from the Vulci excavation site in Italy.
Adding a completely different type of museum specimen may seem easy (they’re all objects, right?), but it requires overcoming big technical hurdles in the way the data associated with them gets entered in the database.
“The biggest challenge we’re facing right now is integrating the cultural heritage collection with the biological collections,” said Triplett, who has taken on the role of cultural heritage objects liaison.
Winchester exemplifies this by pointing out differences in the way biological and archeological objects are dated. Where an archaeologist working on historical timescales may be able to date an object with relative precision, a paleontologist uploading a dinosaur fossil may have a margin of error of a few million years.
“Making the system work really well for both types of objects isn’t going to be easy,” said Triplett. “But we have a very strong foundation and we can use it to move into new types of objects and expand on this strength.”
And the team already has its eyes set on the next set of challenges: objects that, well… aren’t objects.
Museum specimens can typically be picked up — albeit sometimes with a crane — and stored. But what about priceless architectural landmarks?
Using a technique called photogrammetry, researchers can produce the equivalent of a 3D scan of virtually any object, as well as entire monuments, architectural elements, cave walls or landscapes.
“We are trying to build a resource that uses the same general language to describe a fossil 150 million years old and a statue created 400 years ago,” said Winchester. “So we are always trying to work out how do you make these things speak the same language, how do you describe them with the same terms, where possible, and where do the terms need to be distinct.”
Ready to take flight
This July, the National Science Foundation awarded Boyer and Winchester over $1.6 million to continue serving researchers, museums and the public with their platform, and to ensure its continuous growth from a Duke-funded repository to a platform with long-term sustainability. What’s more, NSF recognized MorphoSource as its sole partner for 3D data and workflows in national, NSF-funded efforts to digitize non-federal museum collections.
“MorphoSource would not exist without Duke,” said Boyer. “Many people are surprised to hear that neither Duke nor NSF can support MorphoSource for the long term. But this grant will give us essentially a five-year runway to begin to shift the cost burden away from NSF and Duke.”
Boyer clarifies that users will never be charged to access or download data from MorphoSource. “It would obviously be diametrically opposed to our mission of increasing access to this data,” he said. Rather, their aim is to aggregate a voluntary consortium of paying organizations that benefit from and rely on MorphoSource, such as museums, journals and possibly scanning facilities.
“We’d like it to be voluntary, so larger institutions can carry a bigger burden and we can waive management fees for smaller museums that may not have the resources to pay,” said Boyer.
Boyer says his long-lasting love for museums and their capacity to awe compels him to further MorphoSource’s democratizing mission. No one visited his kindergarten classroom with giant vertebrae, but his parents never missed a chance to take him to natural history museums. While still in high school, Boyer had the opportunity to intern at the Museum of Paleontology at the University of Michigan. This internship led to a job, which he kept for six years, until graduate school took him out of his home state.
“Museums are basically in my blood,” he said. “And the sense that their doors have to be open, and their specimens have to be available is something that I have felt ever since.”