An Interview with Dr. James Smithies, the Director of King’s Digital Lab at King’s College London – China Digital Humanities (中国数字人文)

Authors: James Smithies, Liu Feiying; reposted from the public account DH数字人文


James Smithies, Liu Feiying (刘菲英)

———————————————-

Introduction of the interviewee: James Smithies, the Director of King’s Digital Lab. He was previously Senior Lecturer in Digital Humanities and Associate Director of the UC CEISMIC Digital Archive at the University of Canterbury, New Zealand. He has worked in the government and commercial IT sectors as a technical writer and editor, business analyst, and project manager. In 2017 he published a monograph for Palgrave Macmillan titled Digital Humanities and the Digital Modern.

Introduction of the interviewer: Liu Feiying, PhD, Lau China Institute, King’s College London

Place: Room 2.50, Virginia Woolf Building, King’s College London

Time: November 12th, 2019

———————————————-

Liu: Can you tell us a bit about your research background and how you started to work in digital humanities?

Dr. James Smithies: I did my doctorate in New Zealand History of Ideas at the University of Canterbury in New Zealand. The subject was late 20th century New Zealand literature and culture. It didn’t have a computational aspect, but I have always been interested in the history of technology and computing. I lectured for a couple of years in New Zealand History after my doctorate, but couldn’t find a permanent position, and so decided to work in the IT industry. I did that for 5 years off and on, working in commercial and government information technology. I worked as a technical writer for small software development start-ups, including in Fintech and the energy sector, and as a senior business analyst and project manager in the Department of Solutions Delivery at the New Zealand Ministry.

But my first love has always been scholarship. Around the early 2000s, when digital humanities started to develop, or started to evolve out of what was previously called humanities computing, I was continuing to work on scholarly subjects and started to engage with the international digital humanities community. I built a website called university history. org that aggregated digitized historical sources; that was around maybe the early 2000s. It doesn’t exist anymore. I started engaging with the global digital humanities community in the United States because there wasn’t a lot in New Zealand, Australia and the UK. I’m grateful to the DH community for being so open and welcoming to someone working outside academia, because without their open attitude my scholarly career would have ended. I found my way to university digital humanities because of an earthquake. There was a large quake in Christchurch, New Zealand, in 2011. One of my colleagues, Paul Millar, an English professor from the University of Canterbury, came up to Wellington, where I was working, and asked what we could do to help, because the scientists were on TV talking about geoscience. It was a major event: almost 200 people were killed and a large part of the city was destroyed. I pointed my colleague towards the 9/11 archive, built by the Centre for History and New Media at George Mason University to store the digital content that was generated after the 9/11 attacks in New York. After these events (natural disasters, terrorist attacks, etc.), people immediately start taking photos and videos, but there wasn’t (and still isn’t) really a global or local infrastructure to store their content, so future historians won’t have sources to study the history. So I suggested to Paul that we should build an archive to store digital content related to the earthquakes. He pitched this idea to the vice-chancellor of the University of Canterbury (Rod Carr), who gave him a budget.
Paul then hired me to be New Zealand’s first senior lecturer in digital humanities and to set up the digital humanities archive. That archive still exists; it is called ceismic.org.nz. The CEISMIC archive is a national repository that aggregates c. 190,000 items, including articles, images, audio, research papers and videos, across different national agencies, such as national museums, local museums, national libraries, and archives. They all have their own repositories, and this surfaces the earthquake-related content within them. We also built a related archive called QuakeStudies for research-related content. This type of archive is called a federated archive. It stores digital content related to the Canterbury earthquakes of 2010 to 2011.

So that took a lot of my time, but at the same time I set up a national digital humanities programme in the university. In New Zealand any university course or programme needs to go through university level approvals but also needs to be approved at government level, so it was quite an involved task. I had to submit all sorts of documents describing the programme, including the graduate profile: all sorts of stuff to get the course to start. This is what I did in the University of Canterbury, and then a job opportunity came up at King’s College London to develop King’s Digital Lab.

When I was at Canterbury, I started to work more on the History of Technology and Computing. I am interested in science and technology studies, but in particular the intersection of humanities and computing, not only in terms of the computational methods we can apply to humanities research, but also the history of computing and technology.

In my philosophy, if we are going to apply computational methods to humanities research, we also need to understand the historical and cultural context of those tools as well: we should not just apply them blindly to humanities research. We need to do it in a critical way. I am interested in, for instance, using theories of science and technology studies to understand the digital humanities and make sure we deploy the tools in a critical way. To me that is the real intellectual core of digital humanities. A lot of people would say digital humanities isn’t very scholarly, or isn’t very intellectual, and it is too technical, but actually there are quite deep methodological and epistemological issues in play, in my opinion. They exist at the intersection of humanities and computing.

So as far as King’s Digital Lab (KDL) goes, they have been doing digital humanities at King’s, as you know, for I think 40 or 50 years. King’s has been leading digital humanities for 40-50 years, and we have encountered lots of problems that lots of other universities and countries will encounter as well. There are inevitable problems. When you put people together with computers, a range of issues crop up – they seem to crop up in lots of different contexts. I’ve seen the same issues emerging in New Zealand, Australia and now the UK, and in many ways the solutions are the same too.

Digital Humanities evolved at King’s in quite complex ways. In 1991 the Centre for Computing in the Humanities was established by Harold Short. He hired Willard McCarty, who is also still quite active in the community. They were mainly doing the sort of work that KDL does, which is using digital tools to help people answer their research questions. So they were working with historians to create prosopographical databases of the Middle Ages or digitized maps, and creating scholarly digital editions using the Text Encoding Initiative (TEI). Along with other universities in London, the UK and Europe, they really helped develop the technical foundations of digital humanities as we know it today. But they started teaching as well. They merged with the Centre of E-research in the Humanities in 2006 (they had developed out of King’s IT department), creating the Department of Digital Humanities (DDH).

2006 was a crucial year for digital humanities. I think it was that year that the NEH (National Endowment for the Humanities) in the United States put up something like one million dollars of funding for what they termed “digital humanities”, and the term started to get popularized around the world. So after 2006, students started to want to take digital humanities courses.

Liu: What is most special about digital humanities at King’s College London?

Dr. James Smithies: I think the reason why digital humanities got established so well at King’s was just the luck of having people like Harold Short and Willard McCarty around. Digital humanities relies on people, on humanists who are interested in technology and want to apply it. Most universities, especially in the early 2000s and before, had very few people like that. So the universities that did have people who wanted to explore the intersection of humanities and computing got a head start. But King’s has also got a strong tradition in arts and humanities, as well as the sciences, and it is close to the British Library, the British Museum, and London’s cultural heritage sectors, so the conditions were quite good for it to develop here. Then more and more students started coming to King’s and they continued to build projects, but it got difficult to manage the software engineering as well as the teaching. This is a real challenge for digital humanities world-wide, as academic departments don’t tend to be good at software engineering, and vice versa. To help get to the next stage, they took the software developers out of the academic Department of Digital Humanities and put them into a stand-alone lab, and in 2015 they hired me to develop the lab.

By doing this, we have a unique model of digital humanities at King’s, which we sometimes refer to as “DH at Scale”. So we have an academic department, composed of 40 to 50 lecturers and over 600 students (now primarily postgraduate but increasingly undergraduate), and they do a wide range of digital humanities, from pure critique, criticism or analysis of the digital world right through to building digital products, running the gamut from digital humanities to digital social science. It is also what some people refer to as “Big tent digital humanities”. So we have that on the academic side. There have been conversations in digital humanities about whether you need to code in order to be able to call yourself a digital humanist or not. King’s embraced the liberal definition of digital humanities, which I agree with, which basically says that so long as you want to engage with the digital world, understand it in a scholarly way, and bring some sort of technical awareness or knowledge to the technical domain that you are analysing, then it is fine. But equally, one of the benefits of digital humanities is that it can produce graduates who not only understand the humanities but also understand contemporary technology, to a certain degree. They might (and should, in my estimation) be able to do some basic coding, or understand how databases work too.

So King’s established a department and also a lab. The lab is set up like a contemporary software engineering team. We use industry-based practices for software engineering, with a team of 14 research software analysts, engineers, UI/UX designers, project managers, and a system manager. We manage about 200 virtual machines (200 servers) and inherited 100 digital projects from the Department of Digital Humanities when we were established. The lab is a practical implementation of the ideas in my book, and of the ideas of other people in the lab who think in similar ways. In sociotechnical terms, it is an interesting space or site. We are hosted by the Faculty of Arts and Humanities, but it is a site where engineering, business, scholarship, finance and technology collide. In a way, it is a metaphor for what is happening with the humanities all around the world. Even if you haven’t engaged with technology, you still use search engines, you still use electronic databases and digitized texts. The humanities are dealing with the collision between those different aspects of technology, culture and scholarship. But the lab is very concentrated: we are not just trying to understand the collision, we are also using the tools to advance the scholarship. There is an element of empowering humanists in what we do in the lab. We don’t want corporations to give us all their tools: we are quite happy to use them when they are good tools, but we also believe that scholarly questions are often best answered by bespoke, tailored digital tools. And in order to build those tools, you need technical knowledge, but you also need quite deep domain knowledge about the humanities research domain. So, by implementing a lab like KDL, universities and arts and humanities departments empower themselves to control the direction of technological development.

Liu: Do other universities have a similar kind of digital lab?

Dr. James Smithies: The short answer is no. There are some universities that have facilities like KDL, such as the University of Sheffield Digital Humanities Institute and the University of Exeter Digital Humanities Lab, but they are quite rare.

But a trend is developing. Urszula Pawlicka-Deger, from Aalto University, is studying the development of digital humanities labs around the world, and she has found that since the turn of the century there have been around 150 new labs. So between 1980 and the turn of the century there might have been a dozen or two, but since then there has been an explosion around the world of new digital humanities labs or engineering teams. I get contacted regularly by people in South Africa, across Asia or Australasia, wanting to build their own lab like KDL and understand how we do digital humanities at King’s. So I think the future is already here in many ways.

King’s is one of the first institutions to start doing the kind of work we are doing in King’s Digital Lab, but there are lots in the United States. The Centre for History and New Media at George Mason University was established in 1997; the University of Victoria in Canada has a very well developed programme; there is a Humanities Lab at Stanford; there is the Maryland Institute for Technology in the Humanities as well; and the University of Virginia has a team called Scholars’ Lab. In Europe, there’s been a lot of work at the University of Cologne; the University of Gothenburg has quite a rich history; and the University of Glasgow as well. This has been going on for decades, and it is only just coming into wider scholarly consciousness. A lot of the work has been quite bespoke or niche. In many ways KDL can be viewed as a contemporary reincarnation of things that have been going on for quite a long time. We’ve resolved a lot of the issues that teams like ours have come across as they grow: we had to find ways of working around them, and of dealing with the problem of complexity. The more digital projects you have to manage, the more complex the technical side of things grows, and the more it competes with the scholarly humanist side of things. In simple terms, that is the balance we have to strike: how do we efficiently manage infrastructure, efficiently manage the technical tools we use, and not override scholarly creativity? It is about technological determinism in some way: to what degree can we gain control over technology and not let it control us?

Liu: How does King’s Digital Lab interact with other departments in King’s?

Dr. James Smithies: The basic thing we do in the lab is to meet with colleagues who want to put in a grant application with a digital component. That might be a historical database that they need in order to answer their research questions, or the digitization of manuscripts they would not otherwise be able to handle or analyse; it might be virtual reality tools, or just an algorithm, or data visualization.

We have analysts (we call them Research Software Analysts), who perhaps have a PhD in Digital Humanities or English or History but also quite a bit of technical knowledge. They meet with an academic colleague, understand their scholarly problems and research questions, and recommend research tools and methods they could use to answer those questions. They then write what we call the “product quote”, which defines the technical requirements (basically what we need to build), as well as a prose overview of the technical research solutions we will build for them and the costs. Colleagues plug that into their grant proposal, and when they get the funding they come back to us and we build it, then maintain it, and archive it if necessary. That is basically what we do: we exist to increase digital capability across the arts and humanities at King’s.

All teaching is done in the Department of Digital Humanities, but we provide some guest lectures and we would like to do more. We also host some interns, and would like to host more. In our ideal model, King’s has an academic department and the lab, and the lab is almost a finishing school for humanities graduates who have some technical knowledge and want to move on and become Research Software Engineers. But our problem is capacity. We primarily get projects from King’s, because we get subsidies from King’s, but also some national, European and international projects. We are in very high demand, and we have to turn projects down quite regularly. So if time permits and we have a bigger capacity in the future, we would like to provide more lectures and internships for the students.

Liu: Can you talk a bit more about the digital humanities infrastructure building at King’s?

Dr. James Smithies: We inherited digital humanities infrastructure: rack servers that were hosted in the University of London Computer Centre. We upgraded two years ago. It is not very high-performance in terms of computer science, but in the context of digital humanities it is quite a significant setup: 200 virtual machines, 750 GB RAM, 40 TB of data, and we manage about 5 million digital objects. It is basically a web-hosting infrastructure, in simple terms. If we need to use any more high-performance machines, we go out to the university eResearch infrastructure.

Liu: So far, what is the biggest achievement of King’s Digital Lab?

Dr. James Smithies: The biggest achievement in KDL so far is the lab culture. When I started in 2015, we had one woman and six men, and now we’ve got six women and eight men, with nine countries of origin and eleven languages, so we are diversifying. We have developed our own processes and culture that help us to manage the lab quite efficiently. I think people are the most important thing in digital humanities and labs like ours, though. We’ve got something like 60 years of experience in digital humanities research. We don’t tend to lose staff: people want to work with us, and want to stay. We have flexible working hours, and staff get half a day a week to work on personal projects. We always struggle to find time to do it, but “hacking” culture is a big part of us. We also support permanent career paths for our staff. When we established the lab, the tradition was for technical digital humanities people to be on short-term contracts. They often worked on six- or twelve-month funded contracts, but the majority of our staff are on permanent contracts. It helps with team culture but it also helps with technical maintenance and knowledge maintenance: we don’t have people leaving with their expertise.

We have a particular focus on Research Software Engineer careers, a new initiative in the UK that is starting to be taken up internationally as well. It defines a new career path for Research Software Engineers. Research Software Engineers, or RSEs, work across all the disciplines. They work in bioinformatics and physics, and they occupy the same space that our lab members do, which is between pure academia and professional services or administration.

It is sort of a grey zone where you need technical and administrative skills but you also need a lot of domain knowledge. My argument has always been that universities that want to retain their world rankings are going to have to navigate the transition from analogue to digital research, and it is not trivial. We tried to do it a decade ago but didn’t achieve it. It is more complex than just plugging IT into the business world, because there is humanistic uncertainty, scientific uncertainty, and the domain is just highly complex. It is the people that resolve the problems. Research Software Engineers, these experts navigating between academia and technology, translate between humanistic and scientific technologies. They are the key to enabling the transition. There is a broader initiative called eResearch, which I think is common phrasing in the United States and Australasia and has started to be used in Europe as well. So my second role is as the deputy director of eResearch for King’s College, at university level, and our eResearch strategy provides a college-wide strategic umbrella that KDL is a part of. So KDL supports the Faculty of Arts and Humanities, but there are other hubs within this eResearch network that support other disciplines as well.

Liu: Among all the digital projects going on in KDL, do you have any collaboration with China?

Dr. James Smithies: As a university, we do have collaborations with Chinese universities. As for the lab, although there have been opportunities to collaborate with Chinese universities, we’ve been too busy. But the Department of Digital Humanities has a lot of Chinese students, and the university wants to build relationships with China. I would like to provide consultancy to Chinese universities to help develop labs like King’s Digital Lab if I can find the time, and perhaps even develop sister labs in China working with similar methods. One of the issues with the kind of digital humanities that we do is that achieving and maintaining scale is difficult. And international collaboration is perhaps more difficult, because you are not just dealing with international collaboration in terms of humanities questions, you are also trying to collaborate at a technical level. There is quite an interesting challenge there.

It’s worth noting that there is a large global digital humanities community enabled by the Alliance of Digital Humanities Organizations (ADHO), which is a global umbrella organization for digital humanities. There is also centerNet, an international network of digital humanities centres. The network is highly global: they have a yearly conference, and the last conference was in Utrecht, with over 1,000 people there, but I didn’t see many people from China. My understanding, through being involved with the global digital humanities community over the years, is that most of the activity is in North America, Europe and Australasia. Significant work has been done over the years in Japan, Singapore and (I have been told) China, but China hasn’t been as active in global DH organizations as some of the other countries. So the point is, I think, that there is a huge opportunity to collaborate more with Asia, and the global digital humanities community wants to increase its diversity.
We are working more in Asia, and working more in the Middle East as well. I think with China, in terms of capability and scale, there are some quite interesting collaborations to be had.

This might be especially the case after Brexit (although Brexit won’t be a big issue for our lab as we will still be able to collaborate as service providers), because the United Kingdom funding landscape is likely to become more international. We are looking beyond Europe more, and there are funding schemes and collaborations with different parts of the world we’d like to explore. Brexit is personally disappointing to me, but it also opens up new opportunities.

Liu: What is your take on the application of AI in digital humanities in general?

Dr. James Smithies: If you read my book, you will know that I am a little bit sceptical about the AI thing. It is hugely important, but not as straightforward as you might think. We have started to get involved in more machine learning, which so-called “AI” depends on, text mining legal documents, and other things. We’ve also produced visualizations and data analysis, another growth area. We have one project called “Applying AI to Storytelling”, in the context of a new movement or initiative called “digital creativity”, which looks at the application of digital tools to the creative industries. In deploying AI, there is a collaboration between King’s Digital Lab, the Department of Culture, Media and Creative Industries (CMCI), and a London start-up company called “To Play For”. They are a gaming start-up, and they have a product called charisma.ai, a writing platform that allows writers to create branching narratives. So we can apply AI to robotics and industry, and it is working OK, and has some amazing applications. But when you put it up against the humanities, it often looks inadequate, because a coded machine, no matter how sophisticated in terms of using neural nets, or natural language processing or whatever, always seems inadequate when we compare it to Shakespeare.

This is where the digital humanities become quite interesting, and sit at the forefront of the creative industries. These are the next-generation use cases for artificial intelligence technologies: if we can get artificial intelligence to render stories, or enable us to write stories, of the quality of Shakespeare, then we will have achieved something. But the gap, as you well know, between contemporary creative technologies and Shakespeare is massive at the moment, and it really illustrates the failures and inadequacy of the technology. I admit that some AI-generated texts, poems or songs are not bad, and they are getting better, but this is where we get to the question of epistemology and meaning: is it just, in a dumb way, picking stuff randomly out of a database, applying a few rules to it, and then creating something, or is it engaging in what we would understand as creativity, working from a blank canvas, generating ideas and meaning in a way that is understandable to humans? Is it a machine or is it a creative entity? I think that is where we get into philosophy – the vanishing point between humanities and computing – and that is why humanists who don’t have a real interest in technology can get interested in digital humanities, because that is the real reason we do digital humanities: to understand the relationship between humans and computers, and to work out how much meaning we can gain from machines. It isn’t just about how much practical utility we can gain from them in terms of GDP; it is whether we can work with them as creative equals.

I think we’ve failed currently, and that’s part of the reason why I am interested in collaborating with people in China: the linguistic differences, the different approaches you will have to take, for instance, to optical character recognition. Even at a very basic level you have to work in a different paradigm from people in the UK, Europe and the United States. I am not sure what that would imply, but it means collaboration could be very productive. You’ve also got different datasets, different cultural datasets, different cultural histories that could be drawn on. There are industrial applications that are important, but the cultural and humanistic outcomes are more intrinsic to us as humanists. There is a lot of computational analysis being done over large corpora of texts in the Western tradition. There is a corpus of 19th century British novels, every British novel ever written: people are doing computational analysis over it to answer research questions. What would the equivalent be in China? What questions do you ask that dataset and what answers do you get back? Western digital humanists are asking particular questions of a particular dataset and getting particular answers. I am interested in knowing what questions Chinese scholars are asking.

Liu: What is the biggest difficulty in digital humanities projects in KDL, or in digital humanities research in general, nowadays?

Dr. James Smithies: Because we’re dealing with human creativity, there are very difficult problems. We still have problems with Optical Character Recognition (digitizing texts and then rendering them into plain text), especially of handwritten texts. We are a lot better with typed texts, but we still have problems OCRing handwritten 18th century texts, for instance. We’re involved in a project called the Georgian Papers Programme, for example, that aims to digitize 400,000 pages of handwritten 18th century texts, and we haven’t been able to automatically transcribe them. We are using students in the United States, at the College of William & Mary, who are transcribing them. There are computational techniques we can use to help that process: computational models can achieve maybe eighty percent accuracy, but a hundred percent accuracy is very difficult. Apart from OCR, we also have problems in virtual reality and augmented reality, at the intersection of creativity and technology. There are a lot of potential applications, but designing those applications in a way that people want to use them or engage with them for more than one hour before they get bored and take the goggles off is really difficult. There are problems of creating high-quality narratives too: there are any number of humanist problems that we struggle with.

We also have lots of trouble with infrastructure, with implementing large-scale infrastructure that is computationally tractable. The dream, of course, is to have beautifully curated massive datasets of content that we can just run algorithms against and get answers from, but it is never that easy. The datasets are always fragmented and of uneven quality, in both the content of the dataset and the metadata that describes it. There are great problems and barriers to the kind of large-scale, computation-intensive work that we would like to do.

The problem with the digital humanities when it comes to machine learning and intensive computational analysis is that we are not working with straightforward structured datasets like a genome dataset. Scientific datasets, engineering datasets and chemistry datasets are very well structured and have discrete pieces of information, but humanities datasets often contain prose that needs to be interpreted. They also tend to be unstructured and not very well described, so in computational terms they are not very tractable for digital humanities analysis. Again, that is why I am intrigued by the potential for Chinese digital humanities, because it could be that you have better datasets, larger, more tractable datasets that can help us answer those scientific questions better.

Liu: As an expert in digital humanities research, do you think there is any misunderstanding about this area among the general public?

Dr. James Smithies: I don’t think they know what it is: the first question you always get is “what is digital humanities?” Once you explain what it is, they understand, but there is a very low level of awareness about what digital humanities is. Possibly they think it is more technical than it is, rather than running a spectrum from critique of digital culture through to software engineering; they might think it is more mathematical than it always is, or more complicated than it always is. But the main issue is just the lack of awareness: digital humanities isn’t a mainstream discipline, and I am not sure if it ever will be.

There are a lot of humanists who are against digital humanities, but that is a different question. They are against it because they associate it with a post-Enlightenment, post-Cartesian tradition of increasingly narrow rationalism. They position the humanities in quite a romantic way (poems, landscapes, what have you), and as a realm that is supposedly free of managerialism, technology and rationalism. Because of that they think the application of computational techniques to the humanities will diminish them by taking the soul out of them. I sympathize with them. Maybe in some far-flung future there is a risk that the digital humanities will take over all of the humanities; maybe we will just ask computers for the meaning of life and they will spit out an answer on a piece of card, but that is a cartoon image of what is possible, isn’t it? In reality, computers are really limited, humans have control of them, and we are applying them critically to humanities questions to help us improve our understanding of the world. In an equal sense, I think critics of DH just remind us that there is a conflict in our culture at the moment between human nature and machines. We are highly conflicted by the effects machines have on us, so putting these two together creates intellectual tension. But as scholars I think we should be embracing the complexity of this as an intellectual moment – I don’t really see DH as something inherently good or bad.

Thinking of the little bit of Chinese history that I have done, my understanding is that China doesn’t have as strong a tradition of a rigid demarcation (or post-Cartesian demarcation) between humans and technology. Perhaps I’m wrong about that, but if there is a different historical attitude towards technology in China it might mean you encounter less backlash to DH. So maybe the backlash against digital humanities is a Western phenomenon: currently it’s been seen in North America (in particular the United States) and Australasia, and to a lesser extent Europe. We need to be mindful of the backlash to digital humanities, because it is something very real that we need to be careful about, but I think we also need to see it as biased, and not particularly useful.

Editor | Xiao Shuang (肖爽)

Originally published in Digital Humanities (《数字人文》), 2020, Issue 2. Please contact the journal for permission to reprint.