{"id":1621,"date":"2022-05-05T12:20:44","date_gmt":"2022-05-05T12:20:44","guid":{"rendered":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/thomas-newman\/?p=1621"},"modified":"2022-05-05T14:56:36","modified_gmt":"2022-05-05T14:56:36","slug":"gaussian-processes-in-regression","status":"publish","type":"post","link":"https:\/\/www.lancaster.ac.uk\/stor-i-student-sites\/thomas-newman\/2022\/05\/05\/gaussian-processes-in-regression\/","title":{"rendered":"Gaussian Processes in Regression"},"content":{"rendered":"\t\t
\n\t\t\t\t\t\t
\n\t\t\t\t\t\t
\n\t\t\t\t\t
\n\t\t\t
\n\t\t\t\t\t\t
\n\t\t\t\t
\n\t\t\t\t\t\t\t\t\t

Let's assume we are interested in making predictions based on the set of data points shown in Figure 1. Naturally, if we want to make a prediction at a specific value x_* <\/span> where\u00a0 x <\/span><\/span>\u00a0is a continuous variable, then it is very unlikely that we have already made an observation at this new value\u00a0<\/span> x_* <\/span><\/span>. Thus, having only discrete data is very limiting. Ideally, we would like to find a function,\u00a0<\/span> f <\/span><\/span>, which can be used instead of the discrete data to make predictions. This can be done in several different ways, and one of them uses Gaussian Processes (GPs) in the context of regression. To obtain the desired function\u00a0<\/span> f <\/span><\/span>\u00a0which goes through all of our observed data points, we attribute a prior probability to every possible function, reflecting how likely we believe it is to represent our data. Assigning a prior probability individually to each of infinitely many functions is clearly infeasible, as it would potentially require an infinite amount of time. This problem is solved by Gaussian Processes, which can be used as a prior probability\u00a0<\/span>distribution over all the functions. Inference in the GP made on a finite subset of the points of the function f, while ignoring the infinite number of remaining points, will produce the same solution as if we had accounted for them <\/span>(Williams & Rasmussen 2006)<\/a>.<\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t

\n\t\t\t\t\t\t
\n\t\t\t\t\t
\n\t\t\t
\n\t\t\t\t\t\t
\n\t\t\t\t
\n\t\t\t\t\t\t\t\t\t\t\t\t
\n\t\t\t\t\t\t\t\t\t\t\"Dataset\"\t\t\t\t\t\t\t\t\t\t\t
Figure 1: Discrete time series data plot containing 13 observations.<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t
\n\t\t\t\t\t\t
\n\t\t\t\t\t
\n\t\t\t
\n\t\t\t\t\t\t
\n\t\t\t\t
\n\t\t\t\t\t

Definition (Gaussian Process)<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t
\n\t\t\t\t\t\t
\n\t\t\t\t\t
\n\t\t\t
\n\t\t\t\t\t\t
\n\t\t\t\t
\n\t\t\t\t\t\t\t\t\t

A Gaussian Process is a collection of random variables (indexed by time or space), any finite number of which have a joint Gaussian distribution (i.e. a multivariate normal distribution).\u00a0<\/p>
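To make the definition concrete, the following minimal sketch draws function values at a finite set of inputs from a multivariate normal distribution. The zero mean, the squared-exponential covariance, the input grid and the helper name squared_exponential are all assumptions chosen for illustration; they are not taken from this post.

```python
import numpy as np


def squared_exponential(xa, xb, length_scale=1.0, variance=1.0):
    """Covariance between two 1-D input arrays (assumed kernel, for illustration only)."""
    sq_dist = (xa[:, None] - xb[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)


rng = np.random.default_rng(0)

# Any finite set of index points will do; 13 evenly spaced inputs are an arbitrary choice.
x = np.linspace(0.0, 5.0, 13)

# Assumed zero mean and squared-exponential covariance for the finite collection f(x_1), ..., f(x_13).
mean = np.zeros_like(x)
cov = squared_exponential(x, x) + 1e-9 * np.eye(len(x))  # small jitter for numerical stability

# Each row is one joint draw of the 13 function values from the multivariate normal.
samples = rng.multivariate_normal(mean, cov, size=3)
print(samples.shape)
```

Restricting x to a subset of its points simply selects the matching rows and columns of cov, which is why working with finitely many points is consistent with the rest of the process.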

We denote a GP as follows:<\/p>

f(\\boldsymbol{x}) \\ \\sim \\ \\mathcal{GP}(m(\\boldsymbol{x}), k(\\boldsymbol{x}, \\boldsymbol{x'})),\u00a0 <\/span>\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0(1)<\/p>

where,\u00a0<\/p>