Science Museums Take Stock of 1.1B Objects from Around the World
Holy crap, if that number is even remotely accurate then museums have roughly 1 object for every 106 humans estimated to have ever lived in the last 192,000 years. Obviously a power law is in effect: only a tiny fraction of the people who have ever lived have objects that survive them to this day, and a significant fraction of those objects are animal fossils and other stuff unrelated to humans, but it's still a staggering statistic.
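For anyone who wants to sanity-check that ratio, here is a quick back-of-the-envelope sketch. It only uses the two figures quoted above (1.1B objects, 1 object per 106 humans) and works backwards to the total human count the comment implies; the variable names are just for illustration.

```python
# Figures taken from the headline and the comment above.
objects_in_museums = 1_100_000_000   # "1.1B objects"
humans_per_object = 106              # "1 object for every 106 humans"

# Implied estimate of humans who have ever lived, working backwards.
implied_humans = objects_in_museums * humans_per_object

print(f"Implied humans ever lived: {implied_humans / 1e9:.1f} billion")
```

So the comment is implicitly assuming somewhere a little under 120 billion people have ever lived.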
I figured there were at most thousands of museums with tens of thousands of objects each, on average.
A very significant fraction will be natural history specimens. A long time ago, I did the London Natural History Museum's MSc course in biodiversity. For one of the modules, in curation, we visited the museum's external storage site, a warehouse complex in southwest London. I barely have the language to describe the scale of the collections there except to say it was on an Indiana Jones 'Raiders of the Lost Ark' level and there were still vast unprocessed archives of material from biological expeditions dating back to the 1800s. There were entire slowly decomposing cetaceans in enormous metal chests, aisles of taxidermy zooming off to infinity, floors and floors of fossils, large to tiny. At the museum's main site the entomology collections alone have over 34 million items.
This collection only covers 73 natural history museums. One example from the article: "The Smithsonian Museum of Natural History alone holds 148,033,146 objects."
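To put those two numbers side by side, a rough sketch using only the figures quoted in this thread (73 museums, 1.1B objects, and the Smithsonian count from the article). The 1.1B total is rounded, so treat the results as approximate.

```python
# Figures quoted in the thread; the 1.1B total is a rounded headline number.
total_objects = 1_100_000_000
museum_count = 73
smithsonian_objects = 148_033_146

# Mean collection size across the surveyed museums.
average_per_museum = total_objects / museum_count

# Share of the (rounded) total held by the Smithsonian alone.
smithsonian_share = smithsonian_objects / total_objects

print(f"Average per museum: {average_per_museum / 1e6:.1f} million objects")
print(f"Smithsonian share of total: {smithsonian_share:.1%}")
```

That works out to an average on the order of 15 million objects per museum, with the Smithsonian alone holding roughly an eighth of the total, so these are very much the giants and not typical museums.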
Doesn't exhaustive cataloging explain much of this for human artifacts?
A relatively mundane "object" might result in a large number of separately cataloged objects, e.g. every ring and necklace in a jewelry box cataloged separately.
Flirted with this a decade ago, and we were then guesstimating maybe 2-3 billion extant records worldwide. Lots of big basements.
Does that mean I can keep my dream of finding a Gutenberg Bible alive? :-)
This reminds me of one of my favorite Reddit comments:
> I met a woman one evening at a dive bar in Fells Point years ago when I was playing the National Theatre with a broadway show. She enquired about getting a couple tickets to see the show that she would trade for a tour of the archives of the Smithsonian American History Museum where she was an archivist/ conservator.
> And that’s how I ended up blubbering like a little kid, tears in my eyes… holding Kermit the Frog.
There are many billions more waiting to be curated and digitized in collections around the world. See also GBIF, for tons of data on the tiny fraction that has been digitized. There are > 2 billion records there alone.
And now for a shameless plug. If you want to help one of the handful of open-source tools (there are numerous commercial tools as well) that are used to digitize these materials so that you can access all that goodness, please check out TaxonWorks (https://taxonworks.org). We don't serve many collections, but we're growing, and doing some cool things. As a hint of the scope of the problem: for the handful of collections we do serve, we have ~1 million specimens digitized from an estimated total of well over 20 million individuals. Lots of cool things to consider throughout, including this "AI" stuff.
A bit off topic but a reminder of the BBC's brilliant tour through history via 100 Objects in the British Museum:
You can listen to 100-odd of them in their original bitesize form or try the omnibus themed editions:
Original format first episode: https://www.bbc.co.uk/programmes/b00pwmgq
First Omnibus: https://www.bbc.co.uk/programmes/b00r2hky
Coincidentally I have just started listening to this keynote from the Everything Open conference that was held in Melbourne, Australia a few weeks ago. Seb Chan has had digital curation experience with the likes of the Smithsonian and the Sydney Powerhouse museum. An interesting talk so far, covering important issues like copyright, metadata and transparency. https://youtu.be/Lj6v-eX_D-0
Imagine if all of these were carefully annotated, digitally imaged, and assembled into a museum object corpus, both for searching and for training some large multimodal models.