CDT Skills Domains

Our CDT focuses on training students across five “skills domains” – in simpler terms, topics and associated skills that are critical to ensuring NLP systems are responsibly designed, deployed, and trusted by diverse members of society.

At the start of the training programme, all students will go through a training needs analysis with the CDT team and their supervisor. This will explore which of our five skills domains each student wishes to develop deep expertise in, and which they would like to develop more foundational skills in. We expect most students will specialise in one skills domain, developing deep expertise in it, and build more foundational expertise across one or two others. Based on the training needs analysis, we will guide students to specialist elective courses at the University of Edinburgh to develop skills in these domains.

Below we provide an overview of each of the five skills domains:

Responsible NLP data & models: Fundamental to developing trusted NLP applications is the development and adoption of responsible data practices. This includes understanding data sources, the processing of data into models, and the future downstream uses and users of datasets and models. Students will develop techniques for building and curating inclusive and diverse datasets, as well as practices leading to fairer and less biased models. Our training will focus on locating bias in NLP data and models – including historical, representation, and measurement bias. NLP also needs to account for the environmental impacts of different techniques, and for the reproducibility of models, including documentation, dataset availability, and model sharing. Students will also engage with how NLP datasets and models translate (or not) across cultures and in non-Anglo-centric contexts. Critically, training will focus on deployed applications and uses of these techniques in-the-world.

We expect this skills domain will be of particular interest to students from computer science and informatics backgrounds, and students from other disciplines who have been exploring issues of bias and fairness in data and models from a more technical perspective.

Explainable NLP for users in-the-world: Trust requires user-legible systems. Recent advances in explainable AI have primarily focused on technical explainability – but to ensure trusted adoption, NLP systems must also be explainable to users in a range of different contexts. Visualisation techniques are important here, enabling end-users to examine datasets and models through explorable visualisations, visualised narratives, and interrogable interfaces. Our students will be trained to explore areas that cater to a wide range of users, from the general public, to specific populations who frequently engage with NLP-based systems, to domain experts (e.g., decision support systems in control rooms). We will also focus on new explainable model architectures that provide justifications for their predictions and attribute errors to specific components, and on how NLP-based technologies might provide human-readable explanations of datasets and models.

We expect this skills domain will be of particular interest to students from an information or data visualisation background, and students from technical disciplines who have been exploring issues of explainability and transparency in AI models.

Designing for human-AI partnership: As NLP becomes embedded in real-world applications, we need to bring more attention to how it will support new forms of human-AI interaction. This includes developing new techniques to design systems that adapt to evolving interactions between people and AI systems. NLP systems could better harness their natural-language basis as a capacity to work in partnership with people. Our students will develop new human-centred methods to create adaptable human-AI partnership systems and human-in-the-loop modelling techniques. This will include research into the usability, user experience, and social impact of NLP-based systems in everyday situations. Work in this area will also support exploration of new system-side models for the responsible monitoring of natural-language human-AI interactions, and the design of adaptable interfaces to support human-AI collaborations, teams, and hand-offs.

We expect this domain will be of particular interest to students from an interaction design or human-computer interaction background, or students from a behavioural or social science background who have been looking into the relationships between people and AI systems.

Governance & accountability: Governance of NLP is critical to establishing a responsible and trustworthy environment. Responsibly deploying NLP requires understanding the intersection of current (IPR, GDPR, Equalities, Copyright) and emerging (e.g., the EU AI Act) laws and governance, and where state-of-the-art techniques might take governance next. We will also engage with future governance developments – such as how EU AI regulation will connect with accessibility and human rights legislation, which prioritise intersectional rights, must balance competing or conflicting rights, and are deeply challenging in the context of NLP systems. We will also connect with sector-specific regulation challenges, and context-specific forms of governance that work in-the-world and not just on paper. We will train students to consider how NLP systems might balance citizen rights with the responsibilities of developers and organisations, and how such governance intersects with ethical positions on building values into NLP.

We expect this skills domain will be of particular interest to students from a law background who have been studying the relationships between law and AI, or students from a philosophy or ethics background who are looking to bridge ethical frameworks and moral philosophy with design and technical practices.

Co-creation and NLP futures: Developing NLP applications that are trusted and people-centred requires user involvement in their design. There is a need to ensure that people affected by and living with these technologies can define the problems and application use cases, and ensure they address real-world challenges. However, the complexity of the datasets and models, and the emergent qualities of more complex NLP systems, present a challenge to traditional co-creation and elicitation methods. Students will develop new methods to ensure diverse members of society can meaningfully shape technical and design decisions. This will set the direction of the application-oriented research conducted by students throughout their time with the CDT, and will focus on developing novel scenarios and application areas driven by the needs, interests, and aspirations of populations seldom engaged in AI-oriented research.

We expect this skills domain will be of particular interest to students from a design, human-computer interaction, or creative background with an interest in participatory practices, or students from a technical, legal, or behavioural/social science discipline looking to develop new ways of engaging users in technical decision-making processes.