Centre for Language Technology
Honours Units
We offer a number of Language Technology units at Honours Level. These are listed below with brief descriptions, starting with the core unit COMP448, which we encourage all students interested in Language Technology to take:
- COMP448: Advanced Topics in Natural Language Processing
- COMP449: Speech Recognition
- COMP450: Human Computer Interaction
- COMP451: Formal Languages and Grammars / Agent-Based Simulation
- COMP453: Question Answering
Note that only one of the two COMP451 topics will be offered, depending on relative demand.
COMP448: Advanced Topics in Natural Language Processing
Convenors: Robert Dale and Rolf Schwitter
When Offered: First Semester
This unit is the core honours-level Language Technology unit. Taking what you have learned in COMP348 and COMP349 as a base, the aim of the unit is to provide a rich backdrop against which the contents of other, more specialised honours-level units can be understood. We will investigate state-of-the-art algorithms and techniques for both text-based natural language processing systems and spoken-language dialogue agents. We will relate the technologies we investigate to one another, showing how they can best be applied to tasks such as spelling and grammar correction, information extraction, question answering, dialogue modelling, and natural language generation. This unit is ideal for students who want to learn more about computational techniques in natural language processing and the linguistic underpinnings of the field, and is essential for students whose honours projects are in the area of language technology.
Visit COMP448 Home page.
COMP449: Speech Recognition
Convenor: Steve Cassidy
When Offered: First Semester
This course covers the basic methods and algorithms used in modern speech recognition systems. The aim of the course is to give you an understanding of the inner workings of a speech recogniser and some experience in working with speech signals and digital signal processing. Topics covered will include phonetics and phonology, digital signal processing, pattern matching over time, hidden Markov models, neural networks in speech recognition, language modelling for speech recognition, and speaker identification.
The course will consist of one two-hour lecture per week, with associated readings from papers and book chapters, which will be supplied. Practical work will include recording and annotating your own speech, and training an HMM-based recogniser on Australian English speech. There may also be some experiments with speaker identification.
You may want to look at the notes for last year's offering of SLP806 for an idea of what this unit covers; this unit will add some prerequisite material that was assumed in SLP806.
Visit COMP449 Home page.
COMP450: Human Computer Interaction
Convenor: Debbie Richards
When Offered: First Semester
Human Computer Interaction is concerned with the design of systems for people who have specific tasks in mind and who want to use those systems in a way that fits seamlessly into their everyday work. This course will teach user-centred design and dispel a number of HCI myths. In particular, students will learn that designers, although human, are not their own users; that HCI issues are critical for the acceptance of systems; that HCI is not always intuitive and easy; and that most mistakes are the designer's fault, not the user's.
The course will include material from a range of disciplines, including psychology, cognitive science, sociology, linguistics, ergonomics, and computer science and engineering. This material will be brought together to provide a match between human capabilities and computer technologies. Topics to be covered include the human information processing system, models of interaction, and strategies for, and the process of, design and evaluation.
COMP451: Formal Languages and Grammars / Agent-Based Simulation
Convenor: Mark Dras
When Offered: Second Semester
Only one topic will be offered, depending on the relative demand for each.
Formal Languages and Grammars
One of the questions to ask in computational linguistics, and one of Chomsky's original motivations, is: What sorts of rules or structures should be used for describing natural language? Chomsky's early work was mathematically oriented, and looked at what sorts of languages particular kinds of grammar formalisms, such as context-free and context-sensitive grammars, could generate; many mathematical results that are now used elsewhere, for example in compilers, came out of it.
The first part of the unit will cover fundamental concepts and results such as pumping lemmas, closure properties, AFLs (abstract families of languages), etc; extending string languages to tree languages, including the properties of tree sets and tree automata (the extension of finite automata to trees); strong generative capacity -- what structures can be generated -- as opposed to weak generative capacity -- what string languages can be generated; and formalisms intermediate in the Chomsky hierarchy, including indexed grammars and mildly context-sensitive grammars.
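The distinction between weak and strong generative capacity can be illustrated with a small sketch. It is not taken from the unit materials: the two toy grammars and the helper names `left_tree`, `right_tree` and `yield_of` are assumptions chosen for illustration. Both grammars generate the same string language (one or more `a`s), so they are weakly equivalent, yet they assign different trees to the same string.

```python
# Two toy grammars for the string language {a, aa, aaa, ...}:
#   left-branching:  S -> S a | a
#   right-branching: S -> a S | a
# Trees are nested tuples whose first element is the node label.

def left_tree(n):
    """Derivation tree of a^n (n >= 1) under S -> S a | a."""
    tree = ("S", "a")
    for _ in range(n - 1):
        tree = ("S", tree, "a")
    return tree


def right_tree(n):
    """Derivation tree of a^n (n >= 1) under S -> a S | a."""
    tree = ("S", "a")
    for _ in range(n - 1):
        tree = ("S", "a", tree)
    return tree


def yield_of(tree):
    """Concatenate the leaves of a tree from left to right."""
    if isinstance(tree, str):
        return tree
    # tree[0] is the node label; the remaining elements are children.
    return "".join(yield_of(child) for child in tree[1:])


if __name__ == "__main__":
    for n in (1, 2, 3):
        left, right = left_tree(n), right_tree(n)
        print(yield_of(left), yield_of(right), left == right)
```

For every string of length two or more, the yields agree while the trees differ, which is the sense in which two grammars can share weak generative capacity without sharing strong generative capacity.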
The second part of the unit will relate this to the description of natural language. This will cover a range of formalisms used in computational linguistics: Tree Adjoining Grammar, Head-driven Phrase Structure Grammar, Lexical-Functional Grammar. We'll compare how each describes similar constructions, and the implementation of large-scale grammars in each.
A certain amount of mathematical confidence would be useful in this unit.
Agent-Based Simulation
Traditionally, models in economics, biology, ecology, etc. have been based on mathematical functions such as recurrence relations, differential equations, and so on. These types of models require individuals to be aggregated into homogeneous classes, and many simplifying assumptions to be made so that the mathematics is tractable: e.g. in traditional micro-economic models there are assumptions of perfect knowledge and perfect mobility, and of the existence of fictional entities like the Walrasian auctioneer.
Agent-based simulation is an alternative approach to building models. It is beginning to be used for describing complex systems, by modelling individuals and their interactions, and observing emergent behaviour of the population that is not explicitly programmed into the system. In modelling phenomena this way, it is possible to avoid making the traditional simplifying assumptions, and unexpected results can appear. For example, the neoclassical prediction that a price equilibrium is always reached, guaranteeing optimal allocation of resources, does not necessarily hold.
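To give a flavour of the approach, here is a minimal sketch in Python. It is not one of the models studied in the unit and does not use Swarm; the `Trader` class, the `haggle` rule and the parameter values are all assumptions made up for illustration. Each agent holds its own price expectation, agents meet in random pairs, and we simply observe a population-level statistic (the spread of prices) that no individual agent computes or controls.

```python
import random


class Trader:
    """An individual agent holding its own expected price for a good."""

    def __init__(self, price):
        self.price = price

    def haggle(self, other, adjust=0.5):
        # Both traders move part of the way towards the midpoint of
        # their two prices (a hypothetical local interaction rule).
        midpoint = (self.price + other.price) / 2
        self.price += adjust * (midpoint - self.price)
        other.price += adjust * (midpoint - other.price)


def price_spread(traders):
    """Population-level observation: highest price minus lowest price."""
    prices = [t.price for t in traders]
    return max(prices) - min(prices)


def simulate(n_traders=50, rounds=1000, seed=0):
    """Run random pairwise meetings and record the spread after each one."""
    rng = random.Random(seed)
    traders = [Trader(rng.uniform(1.0, 10.0)) for _ in range(n_traders)]
    history = [price_spread(traders)]
    for _ in range(rounds):
        first, second = rng.sample(traders, 2)
        first.haggle(second)
        history.append(price_spread(traders))
    return history


if __name__ == "__main__":
    history = simulate()
    print(f"initial spread {history[0]:.3f}, final spread {history[-1]:.3f}")
```

Changing the interaction rule, for instance by restricting who can meet whom, changes the population-level outcome without any change to the observation code, which is the kind of experiment this style of modelling is suited to.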
In this unit we'll look at the fundamentals of agent-based simulation, with a focus on two existing models, one of economic behaviour and one of the evolution of language (my own work), and we'll be extending their implementations. The unit will involve programming in Java or Objective-C, using the Swarm agent libraries. Assessment will be by a small project of your own choice.
COMP453: Question Answering
Convenor: Diego Molla Aliod
When Offered: Second Semester
This unit explores methodologies and approaches for the implementation of text-based natural language question-answering systems, that is, systems that, given an arbitrary question formulated in plain English, scan text documents and retrieve the answers to the question. These systems are a response to the increasing demand to find specific information in text documents (such as manuals, reports, or webpages). This unit will cover a varied range of approaches, from keyword-based approaches -- like those of most Web search engines -- through approaches that use partial or full syntactic information, to approaches that rely on the logical form of the questions and the target text. An important source of information will be the Question Answering track of the annual Text REtrieval Conference, where several research groups compete to build the question-answering system that finds the text fragments that best answer a large list of preselected questions.
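The simplest end of this range, the keyword-based approach, can be sketched in a few lines of Python. This is an illustration only, not a system covered in the unit: the stopword list, the sentence splitter and the function names `keywords` and `best_sentence` are all assumptions. The sketch returns the sentence that shares the most content words with the question.

```python
import re

# A tiny, hypothetical stopword list; real systems use much larger ones.
STOPWORDS = {
    "a", "an", "the", "is", "are", "was", "were", "of", "in", "to",
    "do", "does", "what", "who", "when", "where", "which", "how",
}


def keywords(text):
    """Lower-cased content words of a text, with stopwords removed."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def best_sentence(question, document):
    """Return the sentence sharing the most keywords with the question."""
    # Naive sentence splitting on whitespace that follows ., ! or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document)]
    sentences = [s for s in sentences if s]
    if not sentences:
        return None
    question_words = keywords(question)
    return max(sentences, key=lambda s: len(question_words & keywords(s)))


if __name__ == "__main__":
    text = ("The printer has two paper trays. "
            "The toner cartridge is replaced from the front panel. "
            "Power is supplied through the rear socket.")
    print(best_sentence("Where is the toner cartridge replaced?", text))
```

A scorer like this ignores word order and meaning altogether, which is the limitation that syntactic and logical-form approaches set out to address.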
Visit COMP453 Home page.