1.9 Exercises
1.9.1 Practice using the pnorm function
1.9.1.1 Part 1
Given a normal distribution with mean 57 and standard deviation 100, use the pnorm function to calculate the probability of obtaining values between 84 and 222 from this distribution.
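As a minimal sketch of the general pattern, the probability of falling between two bounds is the difference of two cumulative probabilities. The values below are illustrative only (they are not the numbers of this exercise), and the names lower, upper, and p_between are hypothetical.

```r
# Illustrative values only, not those of the exercise
mu <- 500
sigma <- 100
lower <- 400
upper <- 600

# P(lower < X < upper) = P(X < upper) - P(X < lower)
p_between <- pnorm(upper, mean = mu, sd = sigma) -
  pnorm(lower, mean = mu, sd = sigma)
p_between
```

Because the bounds here sit one standard deviation on either side of the mean, the result is roughly 0.68.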
1.9.1.2 Part 2
Calculate the following probabilities.
Given a normal distribution with mean 51 and standard deviation 4, what is the probability of getting
- a score of 41 or less
- a score of 41 or more
- a score of 54 or more
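A short sketch of how "or less" and "or more" probabilities relate in pnorm, using illustrative values rather than those of the exercise. By default pnorm returns the lower tail; setting lower.tail = FALSE returns the upper tail. The names p_less and p_more are hypothetical.

```r
# Illustrative values only, not those of the exercise
mu <- 100
sigma <- 15
q <- 115

# P(X <= q): lower tail (the default)
p_less <- pnorm(q, mean = mu, sd = sigma)

# P(X > q): upper tail
p_more <- pnorm(q, mean = mu, sd = sigma, lower.tail = FALSE)

# For a continuous distribution the two tails sum to 1
p_less + p_more
```

Computing the upper tail as 1 - pnorm(q, mean = mu, sd = sigma) gives the same value here; lower.tail = FALSE is usually preferred when the tail probability is very small.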
1.9.1.3 Part 3
Given a normal distribution with mean 51 and standard deviation 7, what is the probability of getting
- a score of 46 or less.
- a score between 48 and 54.
- a score of mu+1 or more, where mu is the mean of the distribution.
1.9.2 Practice using the qnorm function
1.9.2.1 Part 1
Consider a normal distribution with mean 1 and standard deviation 1.
Compute the lower and upper boundaries such that:
- the area (the probability) to the left of the lower boundary is 0.37.
- the area (the probability) to the left of the upper boundary is 0.94.
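The following sketch illustrates that qnorm is the inverse of pnorm: given a probability to the left of a boundary, it returns that boundary. The probability 0.975 and the standard normal parameters are illustrative choices, not the values of this exercise.

```r
# Illustrative values only, not those of the exercise
mu <- 0
sigma <- 1

# Boundary with probability 0.975 to its left
q <- qnorm(0.975, mean = mu, sd = sigma)
q

# Feeding the boundary back into pnorm recovers the probability
pnorm(q, mean = mu, sd = sigma)
```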
1.9.2.2 Part 2
Consider a normal distribution with mean 50.643 and standard deviation 0.736. There exist two quantiles, the lower quantile q1 and the upper quantile q2, that are equidistant from the mean 50.643, such that the area under the curve of the Normal density between q1 and q2 is 85%. Find q1 and q2.
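One way to approach problems of this kind, sketched here with illustrative values (a central area of 95%, not the 85% asked for above): because the normal density is symmetric about its mean, the area outside the interval is split equally between the two tails. The names central, tail_prob, and q are hypothetical.

```r
# Illustrative values only, not those of the exercise
mu <- 10
sigma <- 2
central <- 0.95

# The remaining area is shared equally by the two tails
tail_prob <- (1 - central) / 2

# Lower and upper quantiles enclosing the central area
q <- qnorm(c(tail_prob, 1 - tail_prob), mean = mu, sd = sigma)
q

# Check: the area between the two quantiles equals the central area
diff(pnorm(q, mean = mu, sd = sigma))
```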
1.9.3 Maximum likelihood estimation 1
The function dnorm gives the likelihood given a data point (or multiple data points) and a value for the mean and the standard deviation (sd). Using dnorm, compute
- the likelihood of the data point 11.664 assuming a mean of 12 and standard deviation 5.
- the likelihood of the data point 11.664 assuming a mean of 11 and standard deviation 5.
- the likelihood of the data point 11.664 assuming a mean of 10 and standard deviation 5.
- the likelihood of the data point 11.664 assuming a mean of 9 and standard deviation 5.
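Since dnorm is vectorized over its mean argument, several candidate means can be evaluated in a single call. The sketch below uses a made-up data point and made-up candidate means (not those of the exercise); the names y, candidate_means, and lik are hypothetical.

```r
# Made-up data point and candidate means, not those of the exercise
y <- 3.2
candidate_means <- c(1, 2, 3, 4)

# Likelihood of y under each candidate mean, with sd fixed at 1
lik <- dnorm(y, mean = candidate_means, sd = 1)
data.frame(mean = candidate_means, likelihood = lik)

# Candidate mean with the highest likelihood
best_mean <- candidate_means[which.max(lik)]
best_mean
```

With the standard deviation held fixed, the likelihood is largest for the candidate mean closest to the data point.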
1.9.4 Maximum likelihood estimation 2
You are given \(10\) independent and identically distributed data points that are assumed to come from a Normal distribution with unknown mean and unknown standard deviation:
## [1] 503 487 511 488 516 501 488 484 493 505
The function dnorm gives the likelihood given multiple data points and a value for the mean and the standard deviation. The log-likelihood can be computed by typing dnorm(...,log=TRUE).
The product of the likelihoods for two independent data points can be computed like this: Suppose we have two independent and identically distributed data points 5 and 10. Then, assuming that the Normal distribution they come from has mean 10 and standard deviation 2, the joint likelihood of these is:
## [1] 0.001748
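The value above can be reproduced, for instance, by multiplying the two individual likelihoods. This is a sketch consistent with the printed output; the variable name joint_lik is only illustrative.

```r
# Joint likelihood of the independent data points 5 and 10
# under a Normal distribution with mean 10 and sd 2
joint_lik <- dnorm(5, mean = 10, sd = 2) * dnorm(10, mean = 10, sd = 2)
joint_lik
```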
It is easier to do this on the log scale, because then one can add instead of multiplying. This is because \(\log(x\times y)= \log(x) + \log(y)\). For example:
## [1] 1.792
## [1] 1.792
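The two identical values above are consistent with evaluating both sides of the identity for x = 2 and y = 3; the choice of these two numbers is an inference from the output, not something stated in the text.

```r
# Both sides of log(x * y) = log(x) + log(y) for x = 2, y = 3
lhs <- log(2 * 3)
rhs <- log(2) + log(3)
lhs
rhs
```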
So the joint log likelihood of the two data points is:
## [1] -6.349
Even more compactly:
## [1] -6.349
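Both log-likelihood values shown above can be obtained along the following lines. This is a sketch that matches the printed output; the variable names are illustrative.

```r
# Sum of the two individual log-likelihoods
loglik_sum <- dnorm(5, mean = 10, sd = 2, log = TRUE) +
  dnorm(10, mean = 10, sd = 2, log = TRUE)
loglik_sum

# More compactly: pass both data points as a vector and sum
loglik_vec <- sum(dnorm(c(5, 10), mean = 10, sd = 2, log = TRUE))
loglik_vec
```

The same value results from taking the log of the product of the two likelihoods, but summing log-likelihoods is numerically safer when there are many data points.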
- Given the 10 data points above, calculate the maximum likelihood estimate (MLE) of the expectation.
- Compute the sum of the log-likelihoods of the data x, using the MLE from the sample as the mean and a standard deviation of 5.
- What is the sum of the log-likelihoods if the mean used to compute the log-likelihood is 495.6?
- Which value for the mean, the MLE or 495.6, gives the higher log-likelihood?
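To compare log-likelihoods across candidate means, it can help to wrap the computation in a function. The sketch below uses a small made-up data set (not the 10 data points of this exercise) and numerically searches for the mean that maximizes the summed log-likelihood with the base R function optimize; the names y, loglik, and fit are hypothetical, and the standard deviation is fixed at 1 purely for illustration.

```r
# Made-up data, not the data of the exercise
y <- c(4.2, 5.1, 3.9, 5.6, 4.8)

# Summed log-likelihood of y as a function of the mean (sd fixed at 1)
loglik <- function(mu) {
  sum(dnorm(y, mean = mu, sd = 1, log = TRUE))
}

# Numerical search for the maximizing mean within the range of the data
fit <- optimize(loglik, interval = range(y), maximum = TRUE)
fit$maximum

# Any other candidate mean gives a lower summed log-likelihood
loglik(fit$maximum) > loglik(fit$maximum + 0.5)
```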
1.9.5 Generating bivariate data
Generate 50 data points from two random variables X and Y, where \(X\sim Normal(50,100)\) and \(Y\sim Normal(100,20)\). The correlation between the random variables is 0.7. Plot the simulated data points from Y against those from X.
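A minimal sketch of one way to simulate correlated bivariate normal data in base R, using the Cholesky factor of the variance-covariance matrix. The parameter values are illustrative (not those of the exercise), and the sketch assumes that the second parameter of each Normal is a standard deviation, so that the covariance is the correlation times the product of the two standard deviations. Packaged samplers such as mvrnorm in MASS are an alternative.

```r
set.seed(1)

# Illustrative parameters, not those of the exercise
n <- 1000
mu <- c(10, 20)
sd_x <- 2
sd_y <- 3
rho <- 0.6

# Variance-covariance matrix: variances on the diagonal,
# covariance rho * sd_x * sd_y off the diagonal
Sigma <- matrix(c(sd_x^2, rho * sd_x * sd_y,
                  rho * sd_x * sd_y, sd_y^2), nrow = 2)

# Independent standard normal draws, one column per variable
z <- matrix(rnorm(n * 2), ncol = 2)

# chol(Sigma) is upper triangular with t(U) %*% U == Sigma, so z %*% U
# has covariance Sigma; sweep() then adds the mean to each column
xy <- sweep(z %*% chol(Sigma), 2, mu, "+")
colnames(xy) <- c("X", "Y")

# The sample correlation should be close to rho
cor(xy[, "X"], xy[, "Y"])
```

Calling plot(xy[, "X"], xy[, "Y"]) then displays the simulated points as a scatterplot.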
1.9.6 Generating multivariate data
The bivariate case can be generalized to more than two dimensions. Generate 50 data points from three random variables X, Y, and Z, where \(X\sim Normal(50,100)\), \(Y\sim Normal(100,20)\), and \(Z\sim Normal(200,50)\). The correlation between the random variables X and Y is 0.5, between X and Z is 0.2, and between Y and Z is 0.7. Here, you will have to define a \(3\times 3\) variance-covariance matrix, with the pairwise covariances in the off-diagonals. Plot the simulated data points as two-dimensional figures: Y against X, Y against Z, and X against Z.
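A \(3\times 3\) variance-covariance matrix can be assembled from a vector of standard deviations and a correlation matrix, as in the sketch below. The values are illustrative (not those of the exercise), and the names sds, R, and Sigma are hypothetical.

```r
# Illustrative standard deviations and correlations, not those of the exercise
sds <- c(2, 3, 5)
R <- matrix(c(1.0, 0.3, 0.1,
              0.3, 1.0, 0.4,
              0.1, 0.4, 1.0), nrow = 3, byrow = TRUE)

# Each entry of Sigma is correlation * sd_i * sd_j;
# the diagonal therefore holds the variances
Sigma <- diag(sds) %*% R %*% diag(sds)
Sigma

# A valid variance-covariance matrix has only positive eigenvalues
all(eigen(Sigma, symmetric = TRUE)$values > 0)

# Converting back recovers the correlation matrix
cov2cor(Sigma)
```

The resulting matrix can then be passed to a multivariate normal sampler in the same way as in the bivariate case.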