Question 1. Retail (25%)
Hagkaup is a retailer selling groceries, household items, clothes and more. Hagkaup sends a weekly newsletter with special offers. Individuals can register online or in store and then receive by email a list of products on offer. Hagkaup monitors sales and computes purchase probabilities for the various products as well as the variation of these probabilities across customer categories. The company has observed that most purchases involve only one item per product.
The binomial distribution can be used to describe uncertain situations of this kind, where there are two possible outcomes: a customer either buys a certain product or does not. The binomial distribution can be approximated by the normal distribution, such that the number of products sold is approximately normally distributed with a mean of n∙p and a standard deviation of \[\sqrt{np(1 - p)}\] where n is the number of customers and p is the purchase probability.
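The normal approximation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `prob_at_least` and the figures used (1,000 customers, a 2% purchase probability) are hypothetical and are not taken from the question, and no continuity correction is applied.

```python
from math import sqrt
from statistics import NormalDist


def prob_at_least(n, p, k):
    """Approximate P(X >= k) for X ~ Binomial(n, p) with a normal distribution.

    Uses mean n*p and standard deviation sqrt(n*p*(1 - p)), as in the text.
    No continuity correction is applied (an assumption of this sketch).
    """
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return 1 - NormalDist(mean, sd).cdf(k)


# Hypothetical figures for illustration only, not the question's data.
n_customers = 1000
purchase_prob = 0.02
units_needed = 25  # e.g. a revenue target divided by the unit price

print(prob_at_least(n_customers, purchase_prob, units_needed))
```

A revenue target can be converted into a unit target by dividing it by the price, which is why the function works with a number of units rather than with revenue directly.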
- Hagkaup wishes to analyse the potential revenue generated by the newsletter from its top‐seller unisex bike whenever it appears on the list in the weekly newsletter. The bike is priced at £220 and is offered frequently over the summer period. The purchase probability for this product over its peak‐demand period (based on historical analysis) is estimated at 0.5%. What is the probability that the company will achieve revenues of at least £25K from this product from a given newsletter, given that 20,000 customers are signed up for the newsletter? Which business actions and situations would help the company achieve these revenues? Please discuss briefly.
- Hagkaup is planning to attract more customers to sign up for the newsletter and has to decide upon its advertising strategy and budget. In particular, they wish to know the number of new customers they need to attract to sign up in order to achieve revenues of at least £30K from their top‐seller product, mentioned above, with 95% probability (each time the product appears on their list). Derive this number for them and comment on it.
- Which assumptions and limitations underlie the analysis? What would you recommend to the company in order to achieve the revenue target stated in part b)?
- While selecting products to include in the niche section of their weekly newsletter, Hagkaup has the option to choose between women’s and men’s bikes:
  - A men’s bike, which costs £240, with an estimated purchase probability of 0.7% across the 7,000 men on the email list.
  - A women’s bike, which costs £250, with an estimated purchase probability of 0.3% across the 15,000 women on the email list.
  What is the probability that each product will achieve revenues greater than the target of £12K for a niche product? Derive the 95% confidence interval for the expected revenues in each case. Which product would you select?
Question 2. Insurance Website (25%)
MT is an insurance firm that sells car insurance on its website. MT is choosing a new look for its website and has narrowed it down to two designs. The director of marketing is not sure which design will work better at attracting customers to purchase car insurance on the website. Therefore, she has decided to run both website designs at the same time and show Design A to part of the visitors to the website and Design B to the rest. This approach is called A/B testing and is often used in the context of Big Data.
Design A is more modern and has a more "stylish" look, which might please new and younger customers. However, Design B is simpler and cleaner, which might please older and less internet savvy customers.
MT ran the experiment on 120 visitors that loaded its website, such that 60 visitors saw Design A and 60 visitors saw Design B. The results can be found in the data file MT insurance.xlsx. The first column indicates which version was shown to the visitor of the website, the second column is 1 if the visitor started the process of purchasing insurance and 0 otherwise, and the last column lists the age of each visitor.
- The director of marketing believes simple designs of websites are more effective. Does the outcome of the experiment support her belief? Please provide arguments.
- Is there any difference between Design A and Design B in terms of the age of the visitors that start the purchasing process? Please provide arguments.
- In order to decide on which design of the website to adopt, what other data would you like to collect on the visitors to the website? How would you use the data?
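As a rough illustration of how the two designs in an A/B test of this kind might be compared, the sketch below runs a pooled two-proportion z-test. It is one common approach, not necessarily the method intended here, and the counts used are hypothetical rather than taken from MT insurance.xlsx. The function name `two_proportion_z_test` is also an assumption of this sketch.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x_a, n_a, x_b, n_b):
    """One-sided pooled z-test of H0: p_b == p_a against H1: p_b > p_a.

    x_a, x_b: visitors who started a purchase under each design.
    n_a, n_b: visitors shown each design.
    Returns the z statistic and the one-sided p-value.
    """
    p_a = x_a / n_a
    p_b = x_b / n_b
    pooled = (x_a + x_b) / (n_a + n_b)
    # Standard error under H0; zero if nobody (or everybody) converts.
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical counts for illustration only, not the experiment's results.
z, p_value = two_proportion_z_test(x_a=12, n_a=60, x_b=18, n_b=60)
print(z, p_value)
```

With samples as small as 60 per design, the normal approximation behind this test is itself an assumption worth checking.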
Question 3. Customer Profitability (25%)
Like other banks, Fjalla bank has offered online services to its customers for several years. However, the bank has never managed to find out whether online banking customers are more profitable than customers that do not use the bank’s online services. The bank recently sampled data on how profitable each customer has been for the bank, how long a customer has been with the bank, whether the customer uses the online services or not, and where the customer lives.
The data can be found in the file Fjalla bank.xlsx. Each line in the data table corresponds to one customer. The first data column has the data on the profit of each customer. The second column includes 1 if the customer uses the bank’s online services, otherwise 0. The third column includes information on how long a customer has been with the bank, measured in years. The last column lists the postcodes of the districts that the customers live in.
- Build and interpret a regression model that can be used to forecast the profitability of Fjalla bank’s customers.
- Are online customers more or less profitable than offline customers? By how much? Please provide arguments.
- Are customers in some districts more or less profitable than in others? By how much? Please provide arguments.
- Estimate the profit of a customer who has been with the bank for 15 years, lives in the 210 district, and uses the online services of the bank. Comment on your findings.
- If Fjalla bank were keen on continuing with this analysis, what would you suggest as the next step?
Question 4. Housing Starts (25%)
Housing starts is an economic indicator that describes the number of houses on which construction has started during a specified period. Construction activities are not necessarily the same throughout the year with weather and other external factors playing a role.
Table 1 shows an extract of data on quarterly housing starts over four years as well as mortgage rates (2-year fixed-rate mortgage with 75% LTV). The first column indicates the relevant year, the second column the quarter, the third column the housing starts, and the fourth column the mortgage rates; the last four columns hold dummy variables indicating each quarter.
Table 1. The Data
- Interpret the model coefficients. Why is Q4 not part of the regression results?
- Are construction activities seasonal? Provide arguments.
- Write down the equation for a regression model for forecasting housing starts in Q1.
- Evaluate the goodness‐of‐fit (quality) of the model. Explain briefly what measures you use for assessing the goodness‐of‐fit and what they mean.
- If the t Stat for Q3 had been 1.23 (instead of 4.19), how would you have proceeded with the model and how would that have affected your arguments for seasonality?
- The government believes that there has been an overall increase in housing starts over the four years considered. How can you modify the model to statistically explore this potential increase?
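For readers unfamiliar with quarterly dummy variables, the sketch below shows one common way of encoding them for a regression: one dummy fewer than the number of quarters, with the omitted quarter acting as the baseline. The function name `design_row`, the column order, and the mortgage rate of 3.5 are all hypothetical and are not taken from Table 1.

```python
QUARTERS = ("Q1", "Q2", "Q3", "Q4")


def design_row(quarter, mortgage_rate, baseline="Q4"):
    """Build one row of regressors: [intercept, mortgage rate, quarter dummies].

    The baseline quarter gets no dummy of its own; it is represented
    by all remaining dummies being zero.
    """
    if quarter not in QUARTERS:
        raise ValueError(f"unknown quarter: {quarter!r}")
    dummies = [1.0 if quarter == q else 0.0 for q in QUARTERS if q != baseline]
    return [1.0, mortgage_rate] + dummies


# Hypothetical mortgage rate, for illustration only.
print(design_row("Q1", 3.5))  # Q1 dummy switched on
print(design_row("Q4", 3.5))  # baseline quarter: all dummies zero
```

Each dummy coefficient estimated from rows like these is read relative to the baseline quarter.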
Deliverable: Word Document
