STAT 250 Fall 2019 Data Analysis Assignment 3

Your submitted document should include the following items. Points will be deducted if the following are not included.

1. Type your name and STAT 250 with your correct section number (e.g. STAT 250-xxx) right justified, and then Data Analysis Assignment #3 centered at the top of page 1 below your name. Then begin your document.
3. Your document should include the ANSWERS ONLY with each answer labeled by its corresponding number and subpart. Keep the answers in order. Do not include the questions in your submitted document.
4. Generate all requested graphs and tables using StatCrunch.
6. You may not work with other individuals on this assignment. It is an honor code violation if you do.

Elements of good technical writing:
Use complete and coherent sentences to answer the questions.
Graphs must be appropriately titled and should refer to the context of the question.
Graphical displays must include labels with units if appropriate for each axis.
Units should always be included when referring to numerical values.
When making a comparison you must use comparative language, such as “greater than”, “less than”, or “about the same as.”
Ensure that all graphs and tables appear on one page and are not split across two pages.
Type all mathematical calculations when directed to compute an answer ‘by-hand.’
Pictures of actual handwritten work are not accepted on this assignment.
When writing mathematical expressions into your document you may use either an equation editor or common shortcuts such as: √x can be written as sqrt(x), p̂ can be written as p-hat, x̄ can be written as x-bar.
Problem 1: Confidence Interval for Percent of Individuals Vaping (there is no data set)
From a July 2019 survey of 1186 randomly selected Americans ages 18-29, it was discovered that 248 of them vaped (used an e-cigarette) in the past week.
a) Calculate the sample proportion of Americans ages 18-29 who vaped in the past week. Round this value to four decimal places.

b) Write one sentence each to check the three conditions of the Central Limit Theorem. Show your work for the mathematical check needed to show a large sample size was taken.

c) Using the sample proportion obtained in (a), construct a 90% confidence interval to estimate the population proportion of Americans ages 18-29 who vaped in the past week. Please do this “by hand” using the formula and showing your work (please type your work; no images are accepted here). Round your confidence limits to four decimal places.

d) Verify your result from part (c) using Stat → Proportion Stats → One Sample → With Summary. Inside the box, enter the number of successes and the number of observations, select confidence interval, and click Compute! Copy and paste your StatCrunch result into your document.

e) Interpret the StatCrunch confidence interval in part (d) in one sentence using the context of the question.

f) Use the Confidence Interval applet (for a Proportion) in StatCrunch to simulate constructing one thousand 90% confidence intervals, assuming the population proportion of Americans ages 18-29 who vaped in the past week is p = 0.22 and the sample size is n = 1186. Once the window is open, click reset and select (or click) 1000 intervals. Copy and paste your image into your document.

g) Compare the “Prop. contained” value from part (f) to the confidence level associated with the simulation in one sentence.

h) Write a long-run interpretation for your confidence interval method in context in one sentence.
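
The interval and simulation steps in Problem 1 can be sketched in Python as a cross-check on the by-hand method. This is a minimal sketch, not the StatCrunch procedure: the function names are hypothetical, the data (120 successes in 400 trials) are made up for illustration, and the large-sample threshold of 10 successes and 10 failures is an assumption that should be replaced by the rule stated in your course materials.

```python
from math import sqrt
from random import Random
from statistics import NormalDist


def proportion_ci(successes, n, confidence):
    """Return (p_hat, lower, upper) for a one-proportion z-interval."""
    p_hat = successes / n
    # z* leaves (1 - confidence) / 2 of the area in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin


def large_sample_ok(successes, n, threshold=10):
    """Check for at least `threshold` successes and failures (assumed rule)."""
    return successes >= threshold and (n - successes) >= threshold


def coverage(p, n, confidence, intervals, seed=0):
    """Fraction of simulated intervals that contain the true proportion p."""
    rng = Random(seed)
    hits = 0
    for _ in range(intervals):
        # Draw one random sample of size n and count the successes.
        successes = sum(rng.random() < p for _ in range(n))
        _, lower, upper = proportion_ci(successes, n, confidence)
        hits += lower <= p <= upper
    return hits / intervals


# Hypothetical data, not the survey values from this problem.
p_hat, lower, upper = proportion_ci(120, 400, 0.95)
print(f"p-hat = {p_hat:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
print("large-sample check passes:", large_sample_ok(120, 400))
print("proportion of 1000 intervals containing p:",
      coverage(0.30, 400, 0.90, 1000))
```

The `coverage` function mirrors the idea behind the applet's “Prop. contained” value: over many repeated samples, the fraction of intervals that capture the true proportion should land near the stated confidence level.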
Problem 2: Food Delivery Robots

GMU began a robot food delivery service in January 2019. It is expected that your food or drink will be delivered in around 30 minutes. The management team is considering a new policy for 2020: if you do not receive your items in at most 45 minutes, you will not have to pay the delivery fee*. To test this, the management team collected a random sample of 432 orders and stored the data in StatCrunch. The responses are 0 = delivery took less than or equal to 45 minutes and 1 = delivery took more than 45 minutes, and the data set is called “Food Delivery Robots.” *(Please note, this is not a real policy under consideration.)

a) Obtain the sample proportion of deliveries that took “more than 45 minutes” using Stat → Tables → Frequency in StatCrunch. Only the value of the sample proportion is needed in your answer. Present this sample proportion as a decimal rounded to 4 decimal places.

The management team does not want to lose too much money, so they decide to test if more than 14% of deliveries take more than 45 minutes. Using α = 0.01, is there sufficient evidence to conclude that more than 14% of deliveries take longer than 45 minutes? Conduct a full hypothesis test by following the steps below.

b) Define the population parameter in context in one sentence.

c) State the null and alternative hypotheses using correct notation.

d) State the significance level for this problem.

e) Check the three conditions of the Central Limit Theorem that allow you to use the one-proportion z-test using one complete sentence for each condition. Show work for the numerical calculation. You can assume the population is large.

f) Calculate the test statistic “by-hand.” Show the work necessary to obtain the value by typing your work and provide the resulting test statistic. Do not round while doing the calculation. Then, round the test statistic to two decimal places after you complete the calculation.

g) Calculate the p-value using the standard Normal table and provide the answer. Use four decimal places for the p-value.

h) State whether you reject or do not reject the null hypothesis and the reason for your decision in one sentence.

i) State your conclusion in context of the problem (i.e. interpret your results and/or answer the question being posed) in one or two complete sentences.

j) Use StatCrunch (Stat → Proportion Stats → One Sample → With Data) to verify your test statistic and p-value. Note, a success is a “1.” Copy and paste this into your document.
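
The test-statistic and p-value steps in Problem 2 follow the one-proportion z-test, which can be sketched in Python as below. This is an illustration under stated assumptions: the function name is hypothetical, and the counts (60 successes in 300 trials, null value 0.15) are made up and are not the “Food Delivery Robots” data.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alternative="greater"):
    """Return (z, p_value) for a one-proportion z-test of H0: p = p0."""
    p_hat = successes / n
    # The standard error uses the null value p0, not p_hat.
    se0 = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se0
    cdf = NormalDist().cdf(z)
    if alternative == "greater":
        p_value = 1 - cdf
    elif alternative == "less":
        p_value = cdf
    else:  # two-sided alternative
        p_value = 2 * min(cdf, 1 - cdf)
    return z, p_value


# Hypothetical data: 60 successes in 300 trials, H0: p = 0.15, Ha: p > 0.15.
z, p_value = one_proportion_z_test(60, 300, 0.15, alternative="greater")
alpha = 0.01
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "do not reject H0")
```

Because the sketch keeps full precision in z, its p-value may differ slightly in the fourth decimal place from one read off the standard Normal table with z rounded to two decimal places.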
Problem 3: Credits Earned after Two Years of College
A midsized university collected data on all 10,128 seniors attending. One variable of interest was how many credits they earned after they completed two years of school (i.e. credits they earned by their junior year).

a) Graph the distribution of the population of credits earned after two years using a relative frequency histogram. Please use the width of 5 by entering 5 next to Width under Bins. Properly title and label this graph and copy it into your document.

b) Interpret the shape of the population distribution in one complete sentence. Also provide the proportion of individuals in the population who have earned 70 or more credits in another sentence. Obtain this probability by highlighting the histogram.

c) Use StatCrunch to obtain the mean and standard deviation for the credits variable by using Stat → Summary Stats → Columns. Present the mean and standard deviation in one table and round the values to two decimal places. Are these calculations parameters or statistics? Answer this question in one sentence.

d) Take one sample of size 7 from the population using Data → Sample and calculate the mean and standard deviation of this sample. Present the mean and standard deviation in one table and round the values to two decimal places. Are these calculations parameters or statistics? Answer this question in one sentence.

e) Does the sample mean calculated in (d) come from a normal sampling distribution? Check the three conditions of the Central Limit Theorem using one complete sentence for each condition.

f) Take one sample of size 28 from the population using Data → Sample and calculate the mean and standard deviation of this sample. Present the mean and standard deviation in one table and round the values to two decimal places. Are these calculations parameters or statistics? Answer this question in one sentence.

g) Does the sample mean calculated in (f) come from a normal sampling distribution? Check the three conditions of the Central Limit Theorem using one complete sentence for each condition.

h) Calculate the probability that, in a random sample of 28, the mean number of credits earned is greater than 61. First, draw a picture with the mean labeled, shade the area representing the desired probability, standardize, and use the Standard Normal Table to obtain this probability. Please take a picture of your hand-drawn sketch and upload it to your Word document. If you do not have this technology, you may use any other method (e.g. Microsoft Paint) to sketch the image. You must type the rest of your “by hand” work to earn full credit.

i) Verify your answer in part (h) using the StatCrunch Normal calculator and copy that image into your document. In addition, write one sentence to explain what the probability means in context of the question.

j) Compare your result in (h) and (i) with the probability you obtained in part (b). Use this comparison to comment on the difference between a population distribution and sampling distribution.
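
The standardization in parts (h) through (j) can be sketched in Python as follows. This is a minimal sketch with made-up values (mean 60, standard deviation 10, sample size 25, cutoff 62), not the population values from the credits data set, and the function name is hypothetical. The single-observation comparison additionally assumes a Normal population, which the histogram from part (a) may not support.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_mean_greater(mu, sigma, n, cutoff):
    """Return (z, P(x-bar > cutoff)) using the Normal sampling distribution."""
    # The standard error of the sample mean shrinks as n grows.
    standard_error = sigma / sqrt(n)
    z = (cutoff - mu) / standard_error
    return z, 1 - NormalDist().cdf(z)


# Hypothetical population values, not those from this problem.
mu, sigma, cutoff = 60, 10, 62

# Probability that the mean of a sample of 25 exceeds the cutoff.
z_mean, p_mean = prob_sample_mean_greater(mu, sigma, 25, cutoff)

# Probability that one individual exceeds the cutoff (n = 1),
# assuming a Normal population for the sake of comparison.
z_one, p_one = prob_sample_mean_greater(mu, sigma, 1, cutoff)

print(f"sample mean: z = {z_mean:.2f}, probability = {p_mean:.4f}")
print(f"individual:  z = {z_one:.2f}, probability = {p_one:.4f}")
```

The same cutoff sits further out in the tail of the sampling distribution than in the population distribution, because sample means vary less than individual values.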

Problem 4: File Size of Data Analysis 2
A random sample of 22 STAT 250 students was collected and the file size of Data Analysis 2 was recorded. The data was measured in megabytes. The instructors of the course claim that the file size will be different from 5 megabytes. Consider the population of all file sizes to be right skewed. Using α = 0.01, is there sufficient evidence to conclude that the mean file size of Data Analysis 2 is different from 5 megabytes? Conduct a full hypothesis test by following the steps below. Enter an answer for each of these steps in your document.

a) Define the population parameter in one sentence.

b) State the null and alternative hypotheses using correct notation.

c) State the significance level for this problem.

d) Create a histogram and a box plot of the sample data and copy these into your document.

e) Use the graphs created in part (d) to check the conditions that allow you to calculate the test statistic in one to two sentences.

f) Regardless of your results in parts (d) and (e), calculate the test statistic “by-hand.” Show the work necessary to obtain the value by typing your work and provide the resulting test statistic. Do not round during the calculation. Then, round the test statistic to two decimal places after you complete the calculation.

g) Use StatCrunch (Stat → T Stats → One Sample → With Data) to verify your test statistic. Copy and paste this box into your document.

h) State the p-value using the output provided in part (g). Use four decimal places for the p-value. In addition, state the degrees of freedom.

i) State whether you reject or do not reject the null hypothesis and the reason for your decision in one sentence.

j) State your conclusion in context of the problem (i.e. interpret your results and/or answer the question being posed) in one or two complete sentences.

k) Construct a 99% confidence interval using StatCrunch. Copy the output into your document as your answer.

l) Explain the connection between the confidence interval and the hypothesis test in this problem (discuss this in relation to the decision made from your hypothesis test and connect it to the confidence interval you constructed in part (k)). Answer this question in one to two sentences.
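
The one-sample t statistic and its link to the confidence interval can be sketched in Python as below. This is an illustration under stated assumptions: the ten file sizes are made up (the real sample has 22 values), the function names are hypothetical, and the critical value t* = 3.250 is read from a t table for 9 degrees of freedom at 99% confidence, since the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(data, mu0):
    """Return (t, df) for a one-sample t-test of H0: mu = mu0."""
    n = len(data)
    # stdev() divides by n - 1, giving the sample standard deviation.
    standard_error = stdev(data) / sqrt(n)
    return (mean(data) - mu0) / standard_error, n - 1


def t_interval(data, t_star):
    """Return (lower, upper) for a t-interval with critical value t_star."""
    margin = t_star * stdev(data) / sqrt(len(data))
    return mean(data) - margin, mean(data) + margin


# Hypothetical file sizes in megabytes, not the assignment data.
sizes = [4.8, 5.6, 5.1, 6.2, 4.9, 5.8, 5.3, 6.0, 5.5, 5.8]
mu0 = 5          # hypothesized mean from H0
t_star = 3.250   # assumed table value: df = 9, 99% confidence

t, df = one_sample_t(sizes, mu0)
lower, upper = t_interval(sizes, t_star)

print(f"t = {t:.2f}, df = {df}")
print(f"99% CI = ({lower:.4f}, {upper:.4f})")
# A two-sided test at alpha = 0.01 rejects H0 exactly when
# mu0 falls outside the 99% confidence interval.
print("reject H0:", abs(t) > t_star, "| mu0 outside CI:", not lower <= mu0 <= upper)
```

The last line shows the two decisions agreeing, which is the connection part (l) asks about: a two-sided test at significance level α and a (1 − α) confidence interval built from the same sample lead to the same conclusion.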