Geometric Methods in Machine Learning (Spring 2019)

Projects (due May 2nd, 2019)

You can choose any one of the projects below. You can complete this assignment alone or as a group of two, but I will grade it accordingly (I will have higher expectations for an assignment completed jointly by two students). Please email me, at marcocuturicameto+assignment@gmail.com, a zip file containing your report as a pdf (not a .doc) and your code in whatever format you prefer (a notebook is fine).

Studying Time Series using Dynamic Time Warping

Download the Sales Transactions Weekly Dataset. You will recover roughly 800 time series of sales of different products, each with 52 timestamps (one per week). Alternatively, you can download any other dataset of your liking of the same type, i.e. labelled time series, with a few hundred or a few thousand time series of moderate length.
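As a starting point for this project, here is a minimal sketch of the classic dynamic time warping recursion between two univariate series. The function name `dtw_distance` and the squared difference used as local cost are assumptions made for illustration; the assignment does not prescribe either, and reading the dataset file is left out.

```python
import numpy as np

def dtw_distance(x, y):
    """DTW cost between two 1-D series, computed in O(len(x) * len(y)) time.

    The local cost (squared difference) is an illustrative assumption.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    # acc[i, j] = minimal accumulated cost of aligning x[:i] with y[:j]
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor alignments
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m]
```

For 52-week series the double loop is cheap enough to fill a full pairwise distance matrix over a few hundred series, although a vectorized or compiled version may be preferable for larger datasets.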
Sinkhorn Embeddings

Read and summarize the findings of a recent paper on Sinkhorn embeddings, a nice idea for visualizing datapoints as point clouds. You can use this idea to embed an arbitrary family of datapoints (e.g. the one introduced in the project above) by defining a simple MDS-type criterion whose aim is to compute point cloud representations in 2D. You need to use backpropagation (e.g. using autograd, tensorflow or pytorch) to achieve this.

Low rank factorization of kernel matrices

Read and summarize the method (not the theoretical guarantees) presented in this paper for low-rank factorization of kernel matrices, and implement it. Compare the method proposed by the authors with one baseline, the Nyström method, on the ijcnn1 dataset.
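For the baseline, a minimal sketch of the Nyström method is given below. It covers only the baseline, not the method of the paper. The Gaussian kernel, the bandwidth parameter `gamma`, uniform sampling of landmarks, and the helper names `rbf_kernel` and `nystrom_factor` are all assumptions made for illustration; loading ijcnn1 is left out.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_factor(X, n_landmarks, gamma, seed=0):
    """Return F such that F @ F.T approximates the full kernel matrix K.

    Landmarks are drawn uniformly without replacement (an assumption).
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=n_landmarks, replace=False)
    C = rbf_kernel(X, X[idx], gamma)  # n x m block of K
    W = C[idx]                        # m x m block on the landmarks
    # K is approximated by C W^+ C^T; factor W^+ through its eigendecomposition
    evals, evecs = np.linalg.eigh(W)
    keep = evals > 1e-10 * evals.max()  # drop numerically null directions
    return C @ (evecs[:, keep] / np.sqrt(evals[keep]))
```

The approximation error can then be measured, for example, as the Frobenius norm of `K - F @ F.T` on a subsample small enough for `K` to fit in memory, and compared across ranks with the method from the paper.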