Project and master thesis work in statistics supervised by Erlend Aune
In addition to the projects suggested on this page, I am open to project proposals within statistical learning. My main interests are within
- Learning when little training data is available
- Neural networks for time series modeling
- Applying models to novel datasets
The most relevant subjects for the theses I supervise are:
- Statistical learning: https://wiki.math.ntnu.no/tma4268/2019v
- General statistical methods: https://wiki.math.ntnu.no/ma8701/2019v/start
- Intelligent Text Analytics and Language Understanding: https://www.ntnu.edu/studies/courses/TDT4310#tab=omEmnet
- TMA4285 Time Series Models (autumn semester)
The suggestions for theses presented here have clear applications. However, many of the problems will require substantial modification of existing methodology that may generalize to other problems. Whether the focus of the project/thesis is mainly methodological or on the application is up to the individual student.
I will continuously update this page with specific project proposals.
Machine Learning with Food
What is a good recipe? Which ingredients are likely to match with each other? Can I generate a meaningful recipe that I would like to try out? These are possible questions that you could try to answer in this project.
A simple Google search for “Coq au vin” yields myriad recipes for this traditional French dish. In this project, you will work with tens of thousands of recipes, some of which have reviews, to extract useful information and analyse it to hopefully learn more about food. The specifics of the project are based on personal interests, but some examples are:
- Classifying how good individual recipes are, and extracting the essential information that makes a recipe “good” or “bad”.
- Extracting and harmonizing ingredients, quantities and other information of interest from recipes. This is closely related to Named Entity Recognition in natural language processing.
- Automatically generating cooking instructions for a recipe based on its ingredient list.
In this thesis, you will use state-of-the-art deep neural network models for text, e.g. LSTMs, trellis networks and/or transformers. Decent Python experience is expected.
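As a feel for the ingredient-extraction subtask: a minimal rule-based baseline is sketched below, assuming a hypothetical recipe-line format of "quantity unit ingredient". A neural NER model would replace these hand-written rules; the regex and unit list are illustrative assumptions, not part of the project data.

```python
import re

# Hypothetical rule-based baseline for pulling (quantity, unit, ingredient)
# out of a free-text recipe line. A trained NER model would replace this.
LINE_RE = re.compile(
    r"^\s*(?P<qty>\d+(?:[./]\d+)?)\s*"       # quantity: 200, 1/2, 0.5, ...
    r"(?P<unit>g|kg|ml|dl|tbsp|tsp|cups?)?"  # optional unit (toy list)
    r"\s+(?P<name>.+)$"                      # remainder: ingredient name
)

def parse_ingredient(line):
    """Return (quantity, unit, ingredient) or None if the line doesn't match."""
    m = LINE_RE.match(line.lower())
    if not m:
        return None
    return m.group("qty"), m.group("unit"), m.group("name").strip()

print(parse_ingredient("200 g smoked bacon"))
print(parse_ingredient("2 cups red wine"))
```

Baselines like this are easy to beat but useful for harmonizing obvious cases and for generating weak labels to bootstrap a statistical model.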
Machine Learning with Wine
Which wines are likely to outperform their peers? Are producers, regions, grape varieties and vintages king, or do name and label matter?
In this project, you will be working with wine metadata and sales numbers in the Norwegian market.
Example questions of interest are
- How early can we detect wine trends?
- What are important attributes for a wine to perform better than its peers?
- When is a wine likely to fall out of the basic selection?
This project will use data provided by Grapespot. It is likely that you will be using time series models, deep neural networks (such as LSTMs) or similar models in this project.
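Before reaching for an LSTM, a classical baseline for the sales forecasting side of this project could be simple exponential smoothing. The sketch below is illustrative only: the sales numbers are invented, and the smoothing parameter alpha is an assumed default, not tuned to any real Grapespot data.

```python
def ses_forecast(series, alpha=0.3):
    """Simple exponential smoothing: return the one-step-ahead forecast,
    i.e. the smoothed level after seeing the whole series."""
    level = series[0]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return level

# Toy weekly sales for one wine (invented numbers, for illustration only)
sales = [10, 12, 11, 15, 18, 21]
print(round(ses_forecast(sales), 2))
```

Comparing a neural forecaster against this kind of baseline makes it clear whether the extra model capacity actually pays off on short series.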
Active Learning
Active Learning is one way of dealing with limited training data. The underlying goal is to only label data points that a model is uncertain about. I’m primarily interested in the following aspects of Active Learning:
- What are good uncertainty measures for a model? Can we use statistics to find better uncertainty measures?
- One-Shot Active Learning (with heterogeneous data)
- Applying active learning to information extraction
The baseline models are typically flexible models, such as deep neural networks. It is well known that these are data intensive to train, and Active Learning may in many cases help with achieving good performance on such models with less training data.
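To make the labeling loop concrete: one common acquisition strategy is uncertainty sampling with predictive entropy, sketched below on toy class probabilities. The probability values are made up for illustration; in practice they would come from the baseline model's softmax outputs.

```python
import math

def entropy(probs):
    """Predictive entropy of a class-probability vector (a simple
    uncertainty measure; better measures are part of the project)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_queries(unlabeled_probs, k=2):
    """Return indices of the k unlabeled points the model is least
    certain about; these are the ones sent to a human for labeling."""
    ranked = sorted(range(len(unlabeled_probs)),
                    key=lambda i: entropy(unlabeled_probs[i]),
                    reverse=True)
    return ranked[:k]

# Toy predicted class probabilities for five unlabeled points
probs = [(0.98, 0.02), (0.55, 0.45), (0.80, 0.20), (0.50, 0.50), (0.90, 0.10)]
print(select_queries(probs, k=2))  # the two most uncertain points
```

Entropy is only one choice; margin sampling or ensemble disagreement slot into the same `select_queries` interface, which is exactly where better statistical uncertainty measures would enter.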
Deep learning for time series data
I am particularly interested in modeling for short multivariate time series. The modeling problems may, for instance, be: forecasting, change point detection, anomaly detection, imputation, denoising, and resampling/super-resolution.
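As a baseline for the anomaly-detection task in this list, a trailing-window z-score detector is sketched below; any neural detector for short multivariate series should at least beat it. The window size, threshold, and data are all assumptions for illustration.

```python
import statistics

def rolling_anomalies(series, window=5, threshold=3.0):
    """Flag indices where a point deviates from the mean of the trailing
    window by more than `threshold` standard deviations. A simple
    univariate baseline; a neural model would handle the multivariate case."""
    flagged = []
    for i in range(window, len(series)):
        past = series[i - window:i]
        mu = statistics.mean(past)
        sd = statistics.pstdev(past)
        if sd > 0 and abs(series[i] - mu) > threshold * sd:
            flagged.append(i)
    return flagged

# Toy series with a single obvious spike at index 5
data = [1.0, 1.1, 0.9, 1.0, 1.05, 5.0, 1.0, 0.95]
print(rolling_anomalies(data, window=5, threshold=3.0))
```

Note that the spike also inflates the window statistics for the points right after it, which masks them; handling that masking effect cleanly is one reason model-based detectors are interesting.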