You understand the degree to which text (either written or spoken) can be processed automatically and the purposes for which this processing can be used. You are able to select the proper resources or algorithms for given tasks and situations, and can follow up on this selection by finding and obtaining relevant resources and software and judging their appropriateness. If the available resources or software are not fully suited to the task or situation at hand, you can describe how they should be adjusted or extended (note that this does not imply that you can implement such adjustments or extensions yourself). |
|
|
In this course, we examine the various linguistic levels that are processed automatically (e.g. syntax, speech recognition and speech synthesis), the information that can be derived from that processing (e.g. for purposes of information extraction and human-machine dialogue), and the techniques that are used for the actual processing (e.g. Hidden Markov Models, word embeddings and transformers).
As for literature, we are currently in a transition phase, with the seminal handbook Speech and Language Processing (2nd edition) by Jurafsky and Martin starting to show its age, the 3rd edition more up-to-date but not quite finished, and a similar handbook (Natural Language Processing by Jacob Eisenstein) also not quite finished. As a result, we will work through selected chapters from the publicly available drafts of all three of these handbooks, and do some of the exercises. In selected cases, you will obtain hands-on experience with available systems and/or techniques.
As for technology, we expect to use BERT, Kaldi, Whisper and wav2vec 2.0.
|
|
|
|
Some experience with programming (in general, not in any specific programming language) and with statistics.
|
|
|
|