About the CDT
Natural Language Processing (NLP) is an area of AI at the intersection of computer science, linguistics, and interaction design that has rapidly moved from the research lab to routine real-world deployment. NLP covers a range of applications that: synthesise, summarise, simplify and translate complex text (e.g., Genei); enable people to interact with services via spoken (e.g., Alexa) and written (e.g., Genesys) language; model text and speech to create personalised services (e.g., Netflix); and power multi-modal generative AI systems in which language inputs create new text (ChatGPT), speech (Speechify), images (Midjourney), and video (InVideo).
Mature NLP systems offer powerful capabilities for creating new products, services, and interactive experiences grounded in natural language. However, they also bring significant challenges to responsible and trustworthy design, adoption and deployment. These include:
- Biases arising from how data is gathered and processed, and subsequently how models are trained and deployed, which can cause unexpected harms to certain groups of users.
- The black-box nature of models, which makes interpretable representations and reliable explanations of outputs difficult, particularly for Large Language Models (LLMs).
- Underestimating social complexity (e.g., the issues faced by Microsoft’s Tay or Meta’s Galactica) and how language data, models, and interactions with people co-evolve over time.
- A disconnect between the rapid technical development of NLP systems and the much slower pace of developing governance frameworks to oversee these technologies.
- Concerns from researchers, the public and specific industries that such systems negatively impact human autonomy, interaction and skills, posing significant safety risks.
We believe that these challenges persist, in part, because developments in NLP have been driven by a focus on technical advances and have been conducted primarily in technical silos. To ensure NLP is truly responsible and trustworthy by design, a more radical interdisciplinary approach is needed. The NLP research and innovation teams of the future need not only expertise in the technical foundations of these systems, but also: an understanding of the social experience of NLP in real-world settings with real people; knowledge of how NLP should be governed and overseen; and expertise in building and deploying systems so that they are inspectable, accountable, and legible. Crucially, this expertise must be integrated into technical development, working in harmony rather than in silos.
Our Designing Responsible NLP PhD training programme aims to develop doctoral graduates who represent a new paradigm of interdisciplinary NLP researcher: ready to realise the full potential of NLP-based systems and to enable richer interactions that allow genuine partnerships between humans and AI. Our training will ensure students have foundational knowledge across five fundamental skills domains, while attaining deep expertise in at least one of them. The programme aims to create an NLP practitioner culture of responsibility, with graduates who are confident in integrating technical expertise with ethics, governance, and deep consideration of users and use contexts.
Students will train together in cohorts formed from a balance of disciplines and complementary backgrounds and expertise. Students will be supported to work together in collaborative “team science” style on applied NLP projects. In doing so, they will also have the chance to work with some of our 70+ industry, public, non-profit and research sector partners.
This cohort training approach ensures students experience working with the diverse disciplines that must come together in future NLP R&D teams, training together on real-world responsible NLP deployments with our partners.
To find out more about current studentship opportunities, please visit our call for applications page.