Introducing the ACMI Public Collections API

Simon Loffler
Published in ACMI LABS · 7 min read · Nov 11, 2021


It has been just over a decade since the first public APIs appeared in the museum community. Usually released by institutions to encourage software developers and creative technologists to ‘play’ with the museum’s collection data, APIs have become much more commonplace in recent times, especially now that they sit within the IT back office of most medium to large institutions.

If we think of a website as an easy way for people to read our museum data, then we can also think of an API as an easy way for computers to read that same museum data.

ACMI first released its collection data as a downloadable spreadsheet in 2016. We released the descriptive metadata of the former lending collection of films under the most open license available — Creative Commons Zero. It was integrated with a few ‘union catalogues’ and became the subject of some interesting data visualisation experiments in 2017. Then the ACMI Renewal project took over, and an expansion of this data access was put on hold while we got the new galleries into shape. Through the Renewal project we also rearchitected the technology stack at ACMI and made significant strides in improving the underlying collection data.

Since re-opening the new ACMI in February 2021, the ACMI Labs creative technologists have been hard at work building the infrastructure for our first public collection API. It complements, and in many cases mirrors, the internal private APIs that drive our galleries and website, and that fall under the collective umbrella of XOS.

A screenshot of a website browser showing the ACMI Public API, api.acmi.net.au

We wanted to make it easier for researchers and software developers to explore our ACMI Collection data — and we also wanted to better automate the publishing of the collection data in simple downloadable formats.

We think it might be useful for things like:

  • Building exciting new ways to visualise our collection data, revealing absences, strengths, and biases in what we hold, and how the collection has been described over the past 80 years
  • Matching our collection data with collection data from other institutions or data sources such as TMDB, IMDB, Wikidata or VIAF
  • Creating applications that extend and transform our collection data

For some of these tasks it will be easier to use the TSV or JSON data dumps, and for others, it may be easier to build an application that connects to our API.
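For example, the whole TSV dump can be pulled straight into a dataframe for analysis. Here is a minimal sketch, assuming you have downloaded the dump locally (the file name here is our own placeholder; check the repository for the current file layout):

```python
import pandas as pd

# Load the ACMI works TSV dump (tab-separated values).
# "works.tsv" is an assumed local file name; download the latest
# dump from the repository at github.com/ACMILabs/acmi-api.
works = pd.read_csv("works.tsv", sep="\t", low_memory=False)

# See how many works there are and which metadata fields are available.
print(f"{len(works)} works")
print(works.columns.tolist())
```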

The first of our APIs to go public is the ACMI Collection API, which also includes a Search API.

The ACMI Collection API contains textual metadata for roughly 45,000 catalogued objects in the ACMI collection, spanning films, videogames, artworks, home movies, and objects. This represents the majority of works in ACMI’s collection but does not include carrier-level information — a specific work may appear on several carriers. For example, our 1960 instructional film, Mice Milking, has one work (title) record but six carriers: one analogue 16mm print and five digital surrogates, each of which has a different storage location.
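As a quick illustration, here is a sketch of fetching work records over HTTPS with Python’s requests library. The endpoint path and the JSON keys below are our assumptions about the API’s shape; see the documentation linked later in this post for the definitive routes:

```python
import requests

API_ROOT = "https://api.acmi.net.au"

# Fetch one page of works from the public API.
# The /works/ route and the "results" key are assumptions about
# the list endpoint; consult api.acmi.net.au for authoritative paths.
response = requests.get(f"{API_ROOT}/works/", timeout=10)
response.raise_for_status()
page = response.json()

# Print the ID and title of each work on this page.
for work in page.get("results", []):
    print(work.get("id"), work.get("title"))
```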

For complex licensing reasons, and because almost all works in our collection are still in copyright, we cannot provide access to images or videos. However, we do provide identifiers that will allow you to find images of the same work in other online sources such as TMDB, and videos, where we do have permissions, on ACMI’s YouTube channel.

Objects and works that are on loan to ACMI from other institutions and private collections are not included, nor are works from the wider world that are included in the Constellation interactive experience but do not also appear in our accessioned collection.

Some of the features we enjoyed building were:

  • Complete archive of JSON files included in the repository
  • TSV (tab separated values) spreadsheet of the entire dataset of works
  • Nightly updater to keep our public API data fresh
  • Docker development containers for researchers and software developers to run our API offline (including an ElasticSearch container)

Some playful examples

Before releasing it to the world, we thought it would be best to put ourselves in the place of a person using it. Here are a few experiments we built which you can play with, and pull apart.

We chose to present these as Jupyter Notebooks on Google Colab so you don’t need to spin up any software on your own computer to try them, or to see how they work.

Machine Dreaming stepping through its predictions of what Mad Max looks like based on the textual metadata written by our cataloguers and curators (no images were supplied)

Machine Dreaming is an experiment that uses the VQGAN+CLIP machine learning algorithms to process the textual description of a work and to ‘dream’ images based on that metadata. When you have a collection but cannot show any images, perhaps it might be interesting for a computer to generate some ‘alternative’ images instead!

This sort of ‘machine dreaming’ can also be quite revealing about what the image generation algorithm has been ‘trained’ on — showing specific biases that make some things appear closer to our own understandings of the meaning of a paragraph of text than others.
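The notebook feeds a work’s description to the model as a sequence of short phrases rather than one long paragraph. Here is a sketch of that kind of chunking step; the chunk size and helper name are our own, not the notebook’s exact code:

```python
def chunk_description(text, words_per_chunk=7):
    """Split a work's description into short phrases suitable
    for use as sequential text prompts for VQGAN+CLIP."""
    words = text.split()
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]

description = (
    "George Miller's Mad Max revolutionised movies when it tore "
    "through cinemas in 1979."
)
print(chunk_description(description))
```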

Here’s what the machine dreamt from our Mad Max metadata against the ImageNet training dataset:

['Mad Max', 'George Miller’s Mad Max revolutionised movies when', 'it tore through cinemas in 1979. Bursting', 'with stunning practical effects, explosive set pieces', 'and iconic costumes, the original Mad Max', 'trilogy (1979-85) cemented its place in pop', 'culture before roaring back to life 30', 'years later.']
Mad Max, as dreamt by a computer from the ACMI Public API Work metadata for Mad Max.

And here a rather Cronenberg-inspired body horror imagining of the aforementioned Mice Milking metadata:

[‘Mice Milking’, ‘Please be advised this film contains scenes’, ‘of animal restraint that some viewers may’, ‘find disturbing.This film, made by the Victorian’, ‘Department of Agriculture, documents the process by’, ‘which mice are milked in a laboratory’, ‘at the Safe Research Farm in Werribee,’, ‘for use in cancer research. The narrator’, ‘explains that in mice, leukemia and mammary’, ‘cancer are similar to cancer in humans’, ‘and the virus is present in their’, ‘milk. The mice milking machine in use’, ‘in the film was developed by research’, ‘officer Graeme Mein for demonstration in an’, ‘open day at Werribee, based on those’, ‘used by the National Cancer Institute, USA.’, ‘The film shows laboratory technicians attaching a’, ‘mouse to the machine and milking them.’]
Mice Milking as dreamt by a computer from the ACMI Public API Work metadata for Mice Milking.

You can try it for yourself in our Jupyter Notebook on Google Colab. You will need to enter a record ID, and you can choose different training datasets from ImageNet to WikiArt. A single record may take up to 20 minutes to run through the full generation process.

The output from Google Colab showing it matching Wikipedia biographies to our ACMI Public API’s creators

Wikipedia Biographies is a very simple search example that prompts for a Work ID and then looks for matches for all of its related credits in Wikidata/Wikipedia biographies. With the output of this you can build different indexes of named entities within ACMI’s collection data, complete with biographies and other external authority IDs which can be used to disambiguate people with common names.
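Under the hood, matching a credit name against Wikidata can be as simple as a call to the public wbsearchentities endpoint of the Wikidata API. A minimal sketch (the creator name is just an example, and this is our simplified version, not the notebook’s exact code):

```python
import requests

def search_wikidata(name):
    """Search Wikidata for entities matching a creator's name."""
    response = requests.get(
        "https://www.wikidata.org/w/api.php",
        params={
            "action": "wbsearchentities",
            "search": name,
            "language": "en",
            "format": "json",
        },
        timeout=10,
    )
    response.raise_for_status()
    return response.json().get("search", [])

# Disambiguating a common name via Wikidata IDs and descriptions.
for match in search_wikidata("George Miller"):
    print(match["id"], match.get("description", ""))
```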

Try it yourself in our Jupyter Notebook on Google Colab.

API documentation

There’s a page of documentation over on our website at www.acmi.net.au/api with instructions for exploring our API and collection data.

How did we build the API?

During the Renewal project the ACMI Labs team built a bunch of private APIs that run on XOS (ACMI’s eXperience Operating System). The computers in our centrepiece exhibition use these private APIs to talk to each other:

  • Media Players use our Playlists and Videos APIs to download and play moving image content
  • Digital Labels use our Playlists, Collection, and Labels APIs to display interactive museum labels
  • Lens Readers use our Taps API to allow visitors to collect objects in the exhibition they like
  • Constellation Tables use our Constellations API to show visitors everything they collected during their visit, and connections to other objects in our collection they might also like

Our ultimate goal is to make all of these private APIs public. To do this we needed to decouple the public API from the museum completely, both to allow it to be as accessible and fast as possible, and to keep our museum network of ~400 computers safe from outside attacks.

We wanted to use Python, the same programming language that we’ve used for XOS, the ACMI Website, and all of our SOMI gallery devices, so we chose Flask to build our public API.
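To give a flavour of the approach, here is a minimal Flask sketch in the same spirit, serving work records from a directory of per-work JSON files. This is our own simplified illustration, with an assumed file layout, not ACMI’s actual implementation; see the repository below for the real thing:

```python
import json
from pathlib import Path

from flask import Flask, abort, jsonify

app = Flask(__name__)
WORKS_DIR = Path("works")  # directory of per-work JSON files (assumed layout)

@app.route("/works/<int:work_id>/")
def work(work_id):
    """Serve a single work record from the JSON archive on disk."""
    path = WORKS_DIR / f"{work_id}.json"
    if not path.exists():
        abort(404)
    return jsonify(json.loads(path.read_text()))

if __name__ == "__main__":
    app.run(debug=True)
```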

We’ve open sourced our API server here: github.com/ACMILabs/acmi-api

If you get stuck, or find a bug, feel free to get in touch with us either via Twitter at @ACMILabs, or via the contact information on our website: https://www.acmi.net.au/contact/
