How different might museum catalogues be if they had been designed for public consumption from the start? A couple of weeks ago, Mia Ridge mused on this question via Twitter. Her timing was impeccable, for even as she asked, I was setting up a chat with Matthew Israel, Director of The Art Genome Project at Art.sy, a project that could be a good starting place for those interested in an answer.
Here, I speak to Matthew about the project. (Fair warning – this is a long post, but an interesting read.)

[S] Hi Matthew. Can you tell me a little more about The Art Genome Project, how it started and how it all works? What exactly do you mean when speaking of “genes” in an art context? How do they differ from traditional classifications?
[M] Hi Suse. Thanks so much for your interest in the project and for including the project and Art.sy on museumgeek. Before I begin I should say that the best way to learn about how Art.sy and The Art Genome Project work is to go to the website and if you’re interested in learning more of the specifics of The Art Genome Project, we have set up a tumblr.
In short, The Art Genome Project is sophisticated, nuanced metadata that informs and enables related art search. Genes are our names for this kind of metadata (you can also think about them as the possible characteristics that one might apply to art). Examples include art-historical movements, time periods, techniques, concepts, content, geographical regions and even aspects of an artwork’s appearance. There are currently over 500 “core” genes in the project and another 400+ which capture influences on artists of both other artists and art-historical movements.
It’s important to note that the genes we have created are not by any means just invented by us. Fortunately we are the beneficiaries of hundreds of years of art-historical scholarship; we source from discussions in books, periodicals and on the web surrounding contemporary art; and most importantly, we have established consistent communication with all of our partners (i.e. the galleries, museums, foundations, collections and estates that feature their work on Art.sy) in order to understand their thoughts on the genome and the genoming process.
What’s also really significant to explain is that every artist and artwork has their own genome in order to show how different, for example, Pablo Picasso’s oeuvre (his collected works) is in comparison to individual works he created and how greatly individual works can differ from each other. For example, in the case of Picasso, this enables us to explain to users the differences between Blue and Rose period works or between papier-collé works done in 1912 and his almost surrealist works of the 1930s.
Additionally, genes are not tags (though we have many tags on the site) because tags are binary (something is either tagged “dog” or not). Genes, in contrast, can range from 0 to 100, thus capturing how strongly a gene applies to a specific artist or artwork. While the specific numbers are not important, this enables us to explain to users that Warhol is highly related to the term Pop Art, while an artist who might have been associated with Pop at points during their career–say Ray Johnson–can be represented as less associated with Pop.
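To make the distinction between tags and genes concrete, here is a minimal Python sketch. It is an illustration only, not Art.sy's implementation: the gene scores, the helper names, and the choice of cosine similarity for ranking related artists are all assumptions made for the example.

```python
from math import sqrt

# Hypothetical genomes: gene name -> strength on a 0-100 scale.
# The numbers are invented for illustration; they are not Art.sy data.
GENOMES = {
    "Andy Warhol": {"Pop Art": 100, "Collage": 20},
    "Ray Johnson": {"Pop Art": 40, "Collage": 90},
    "Claude Monet": {"Impressionism": 100, "Light": 90},
}


def as_tags(genome):
    """Collapse a weighted genome into binary tags (present or absent)."""
    return {gene for gene, strength in genome.items() if strength > 0}


def similarity(a, b):
    """Cosine similarity between two genomes (an assumed measure)."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def related(name, genomes):
    """Rank every other artist by similarity to `name`, strongest first."""
    target = genomes[name]
    scores = [
        (other, similarity(target, genome))
        for other, genome in genomes.items()
        if other != name
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # As binary tags, Warhol and Johnson look identical.
    print(as_tags(GENOMES["Andy Warhol"]) == as_tags(GENOMES["Ray Johnson"]))
    # As weighted genes, Johnson is related to Warhol, but only partially.
    for artist, score in related("Andy Warhol", GENOMES):
        print(f"{artist}: {score:.2f}")
```

With these made-up values, the tag view cannot tell Warhol and Johnson apart, whereas the weighted view ranks Johnson as related to Warhol but only partially so, and Monet as unrelated.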
[S] I’ve been following The Art Genome Project Tumblr for a little while now. Many of the posts are about the evolution of the genes; about how and why particular genes come into being. For instance, on this post on “Double Genes” Holly Shen discusses how, in early work, you sometimes “combined two separate but related characteristics into one gene, knowing that eventually the time would come when each characteristic would gather enough artworks and artists on its own”, whilst this post speaks of the evolution of economics-related genes. What prompts an evolution in the ‘genetic code’? Were any genes initially created that you’ve since discarded?
[M] It’s really so great for us to hear you’ve been following the tumblr. Please feel free to comment on any of our posts! Your question is a good one. It gets to the crux of how a new format for organizing knowledge evolves (and must evolve) over time. I think at base, since this is the first time anyone has tried to systematize the vocabulary for art history in this particular way, our initial ideas always need to be re-evaluated, as with any significant research project. People say writing is 10% writing and 90% editing and I would say the same of the genome project. That’s the fascinating part. It’s like more traditional art-historical work. You’re given a set of objects, you create sets according to certain similarities and then see how they hold up to scholarly inquiry and then you re-evaluate and re-evaluate again, etc.
A basic example of how we have split up a gene is a gene we created for Light. We initially thought having works by Impressionists in which light is a central aspect of the image would be interesting to see next to contemporary works which use Light as a medium. However as time went on, we realized more and more that this was confusing for users and these were really satisfying as their own categories. And there were enough objects in each category to split them apart.
[S] How do you ensure consistency across the different genetic coders or mappers, particularly with a system that is still evolving?
[M] Consistency is a huge priority, maybe one of the top priorities of The Art Genome Project. Historically, the kind of data entry we are doing and the production of a shared set of knowledge for how to use a system such as ours has been undervalued or unsystematized because the set of users is a set of specialists working in disassociated institutions or contexts. Hopefully our focus on the public and emphasis on the fact that this vocabulary is an educational tool will bring more exposure to and appreciation for this type of work. Though we can improve on consistency at Art.sy, we are taking several steps to establish standards, such as setting up an Art Genome Wiki (a kind of knowledge base for genoming); holding weekly Genome Team meetings; and using programs like Basecamp, Trello, Pivotal Tracker and teamwide chat to keep revisions and discussions as transparent as possible. In this way, we have realized maintaining consistency is not just about top-down review and establishing rules, but it’s also about constant dialogue.
[S] When you and I recently spoke about the Project, you mentioned a desire to document aspects of art that are more integral to an artist’s practice or concerns than might be included in traditional classifications. What genes have emerged in response to this idea?
[M] Good questions. Regarding other criteria for art, traditional art object classification systems (and I generalize here, because there are various exceptions) really have focused on the specific details of objects (dimensions, medium, provenance) and one subject heading. The Art Genome Project, however, while we capture all of these more specific details, is focused much more on what is going on in the work and in an artist’s practice. It’s more like what one would lecture about to educate someone about an artist or artwork.
Regarding “new” genes, yes, we represent the well-known aspects of an artist’s works but also try to show additional aspects that maybe most users don’t know about in order to give voice to the diversity of ways to understand and interpret artists and works of art but also to the complexity of works of art in a way maybe traditional avenues have not had the ability to do. What’s also interesting is we have the advantage of not having to be held within the boundaries of a book or its formatting. This is definitely a liberation for art-historical thinking I think, yet at the same time it is something entirely different to create what such an educational experience looks like, feels like and translates to the user.
[S] Something else that resonated when we spoke was your expressed desire to create ‘valuable educational metadata’. There are a couple of things that I want to explore about this idea. The first is the implication that you are rethinking art classification with a public end user in mind; and more specifically, a public learner. What impact do you think this has had on the planning and execution of the Project?
[M] I would say this has been the major priority of the project. We see The Art Genome Project as the structure for a new pedagogical experience. Many of those involved in the project are educators or come from an educational background and I think this experience informs so much of what we do. One major example is the fact that we define our genes on the site, so that in the process of searching and clicking on things you like or gravitate towards or find interesting, you are being given educational texts that explain specifically why you might enjoy these connections. We also have made a real effort to create text on all parts of the site (but primarily in our artist biographies and gene definitions) that is very clear to the user but not “dumbed down.” I don’t think this kind of content is that available to the mass audience.
[S] You’ve written about mapping serendipity with the Project. Do you think that the Project could actually challenge or disrupt art education, by drawing equivalencies and parallels between works of art that might be “genetically” related, but not historically? In a Time Magazine article (behind paywall), the equivalencies drawn by The Art Genome Project are problematised thusly:
Another problem Art.sy faces is its classification system, which rubs some artists the wrong way. “I don’t think what I am doing has anything to do with Cindy Sherman,” says British artist Jonathan Smith after being told the site links his work to hers via a staged-photography gene. “That sounds like something a programmer would think of.”
Given that classification has played such a major role in the history of art, do you think drawing new equivalencies between historical works of art, or between historical and contemporary works, could have a disruptive effect? As an art historian, how do you feel about this?
[M] Honestly, while the term “disrupt” is often used with new websites to describe how they deal with a particular historical space, I don’t think about The Art Genome Project as disruptive. In truth, the job of art historians, and furthermore, art critics, curators, etc. is to draw new equivalences anyways and we are just doing the same thing in a different way. Yes, here there is the term “gene” or as you say “genetically related” but it’s not really all that different from an historian exploring new relationships between artworks. I also should say that our genoming is based on historical information so it would by no means contradict historical connections. I should also say that we have gotten some of our best feedback from art historians, which we have used to improve the site.
[S] Of course, the Art Genome Project isn’t being done with strictly educational outcomes in mind; it also has commercial ones. Do you think the overlapping interests of the Project could compromise its educational value?
[M] It’s funny, I don’t often get asked that. To be completely honest, there are really very few (if any) commercial constraints on The Art Genome Project. As I am sure you realize, this situation is extremely important as we have many non-profits involved (museums, foundations, estates). Art.sy’s goal is to make all the world’s art accessible to anyone with an Internet connection and that’s really been the focus of our efforts over these past few years.
[S] Finally, I am curious about the maintenance and scaling of such a labour-intensive approach to classification. Is the Project, and therefore Art.sy, limited in how big it can get? Do you have curators in the same way a museum does; making decisions about what is included in the Project?
[M] We’ve been thinking about this a lot lately. We do have a review process, of which I have already spoken, but we are definitely looking at how much we can scale without losing any of the quality we demand out of our genoming. What’s amazing about being here is that Art.sy is half engineers and half art professionals so we are not tackling this question alone (with the tools of art historians), but with significant input from our engineers. I’m honestly consistently amazed at how helpful those involved in tech (which again, is half of Art.sy) are in helping us deal with our problems, particularly regarding more efficient processes and workflow. One thing we are looking at is how much of the process we can automate so that each input we need to make is processed as efficiently as possible. We also accomplished a lot this summer in a collaboration between members of the genome team and an engineer on our appearance genes. The eventual goal was to use our data to train a program to recognize the visual characteristics of artworks more specifically, and we’re currently in the process of evaluating some of our conclusions. You can read more about this on our most recent blog post.
[S] Ok, now it’s your turn. What haven’t I asked about that excites you with the Project? What else should the museum community know about it?
[M] Great way to end. Hmmm. What excites me? I hope you don’t mind that I made a list…
- Creating a classification system that retains the nuances and mysteries within art and allows anyone with an Internet connection the opportunity to learn about art and art history.
- Open sourcing our research on our tumblr.
- Working on a truly collaborative project, with those from the arts but also with computer scientists, engineers and mathematicians.
- Research we have undertaken on The Art Genome’s roots, specifically the history of art classification systems.
- Giving people (who have wanted to learn about art but didn’t know where to start) a place to start.
- Trying to capture “happy accidents” in the classroom, i.e. mapping the serendipitous connections that happen when you teach, to help educate.
- Trying to create an educational experience that is active, exploratory, and self-motivated.
- The possibility of educators using Art.sy as a teaching tool, to explore the history of a movement or to see how a term’s interpretation has changed over time, such as Collage or Documentary Photography.
- Getting other people (like you) excited and involved in what we are doing!
[S] Thanks so much, Matthew! There is much to think about here.