STAT 250 Spring 2022 Data Analysis Assignment 4
You may not upload this file to any online homework help sites. In addition, you may not discuss this assignment on any group chats with any individuals (either in this course or not). Please see our course syllabus for honor code rules. Thank you.

Your solutions document should include the following items. Points will be deducted if the following are not included.
- Type your Name and STAT 250 with your correct section number (e.g. STAT 250-xxx) right justified and then Data Analysis Assignment #4 centered on the top of page 1 below your name to begin your solutions document.
- Number your pages across your entire solutions document.
- Your solutions document should include the ANSWERS ONLY with each answer labeled by its corresponding number and subpart. Keep the answers in order.
- Generate all requested graphs and tables using StatCrunch.
- Upload your solutions document onto Blackboard as a pdf file using the link provided by your instructor. You are responsible for uploading a readable file.
- You may not work with other individuals on this assignment. It is an honor code violation if you do.
- What type of data are collected for each sample (categorical or numerical)? How many samples have been selected? Answer these questions in one sentence each.
- Have the samples been collected independently or dependently? Answer the question and provide a reason why in one sentence.
- Define the population parameter in the context of this question in one sentence.
- State the hypotheses you would use to test the claim stated in the question.
- Calculate the statistic you plan to use to estimate the stated parameter in part (b). Round this statistic to a whole number.
- Produce one image that displays both samples’ data using horizontal boxplots. Please title and label this graph correctly and copy the graph into your solutions.
- Write a one-sentence interpretation of this plot. Also, comment on whether it is appropriate to use the t-distribution for inference based on this interpretation and the conditions necessary.
- No matter your answer to part (g), use Stat → T Stats → Two Sample → With Data to compute the test statistic and p-value for this hypothesis test. Select Sample 1 as “After” and Sample 2 as “Before.” Copy the output into your solutions.
- Based on the p-value from the output produced in part (h), state the decision you would make in this hypothesis test. Provide a reason for this decision in one sentence.
- State your conclusion in the context of this hypothesis test. Write your answer in context in one or two sentences.
- Comment in one sentence if you believe your conclusions are accurate based on your interpretation in part (g).
- Provide at least one confounding variable that may have had an effect on this study’s results in one sentence.
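For readers who want to see the arithmetic behind a two-sample t procedure like the one requested above, here is a minimal Python sketch. It assumes the unpooled (Welch) form of the statistic, which may or may not match the option selected in StatCrunch. The function name `welch_t` and the data values are hypothetical and are not taken from the assignment's data set.

```python
import math
import statistics


def welch_t(sample1, sample2):
    """Unpooled two-sample t statistic for (mean of sample1 - mean of sample2).

    Returns the test statistic, the standard error, and the
    Welch-Satterthwaite approximate degrees of freedom.
    """
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = statistics.mean(sample1), statistics.mean(sample2)
    # statistics.variance uses the n - 1 (sample) denominator
    var1, var2 = statistics.variance(sample1), statistics.variance(sample2)

    se = math.sqrt(var1 / n1 + var2 / n2)
    t = (mean1 - mean2) / se
    df = (var1 / n1 + var2 / n2) ** 2 / (
        (var1 / n1) ** 2 / (n1 - 1) + (var2 / n2) ** 2 / (n2 - 1)
    )
    return t, se, df


# Made-up values for illustration only (NOT the assignment data)
after = [12, 15, 11, 14, 13]
before = [10, 9, 12, 11, 8]
t_stat, std_err, deg_free = welch_t(after, before)
print(t_stat, std_err, deg_free)
```

The standard error returned here is the quantity that a later part asks you to compare with the paired-design standard error.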
- Calculate the difference between the number of vehicles for the first Monday of each month using StatCrunch (go to Data → Compute → Expression). Please subtract (2018 – 2017). List the difference for each of the pairs in one column in your solutions document.
- Obtain the sample mean of these differences and the sample standard deviation of these differences using StatCrunch. Copy the table that you obtain from StatCrunch into your solutions document and round the values for the sample mean and sample standard deviation to two decimal places.
- Define the population parameter in context in one sentence.
- State the null and alternative hypotheses using correct notation.
- Create a frequency histogram overlaid with a Normal curve of the sample differences and create a horizontal boxplot of the sample differences. Title and label each graph appropriately and copy these graphs into your solutions.
- Provide a one-sentence comment on what these graphs show that allows us to continue conducting inference using the t-distribution in this problem.
- No matter your answer to part (f), calculate the test statistic “by hand” and be sure to show your work (please type this work). Use the rounded values you obtained in part (b). Round the test statistic to three decimal places.
- State the degrees of freedom for this test and show how you calculated the degrees of freedom.
- Calculate the p-value using the T-calculator in StatCrunch (Stat → Calculators → T). Present the T-Calculator image in your solutions document.
- Use StatCrunch Stat → T Stats → Paired and enter “2018” for Sample 1 and “2017” for Sample 2 to verify your test statistic from part (g) and your p-value in part (i). Copy the output into your solutions document.
- State your decision whether you reject or do not reject the null hypothesis and the reason for your decision in one sentence.
- State your conclusion in context of the problem in one or two complete sentences.
- Compare the standard error from the StatCrunch output in part (j) to the standard error obtained in part (h) of Investigation 1. How did the different study designs affect the outcome of each hypothesis test? Answer these in two or three sentences.
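The "by hand" paired calculation described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the null value of the mean difference is taken to be 0, the helper name `paired_t` is hypothetical, and the vehicle counts are made up rather than taken from the assignment's data. As in the instructions, the sample mean and standard deviation of the differences are rounded to two decimal places before the test statistic is computed.

```python
import math
import statistics


def paired_t(sample1, sample2):
    """Paired t statistic for the mean of (sample1 - sample2), null value 0."""
    diffs = [a - b for a, b in zip(sample1, sample2)]
    n = len(diffs)
    d_bar = round(statistics.mean(diffs), 2)   # sample mean of differences
    s_d = round(statistics.stdev(diffs), 2)    # sample SD of differences (n - 1)
    std_err = s_d / math.sqrt(n)               # standard error of d_bar
    t = round(d_bar / std_err, 3)              # test statistic, three decimals
    df = n - 1                                 # degrees of freedom
    return diffs, d_bar, s_d, t, df


# Made-up counts for illustration only (NOT the assignment data)
counts_2018 = [105, 110, 98, 120]
counts_2017 = [100, 104, 99, 112]
print(paired_t(counts_2018, counts_2017))
```

Because the pairing removes variation shared by both years, the standard error here depends only on the spread of the differences, which is why it can differ sharply from a two-independent-samples standard error.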
- Calculate and label the two sample proportions separately and round the values to four decimal places. Next, calculate the difference between these sample proportions by subtracting (Fairfax County – Prince William County). Type all of these calculations and label each of them.
- Check the Central Limit Theorem conditions for a confidence interval for the difference between two proportions. For the large sample condition, check and show that each sample has at least 10 successes and 10 failures.
- Construct the 95% confidence interval in StatCrunch using Stat → Proportion Stats → Two Sample → With Summary. Use Fairfax County as Sample 1 and Prince William County as Sample 2. Copy the output into your solutions document.
- Interpret the 95% confidence interval in context in one sentence. In your interpretation, round the confidence limits to four decimal places.
- Does your confidence interval capture 0? What does this indicate about the two samples?
- State the null and alternative hypotheses using correct notation if you were to test to see if there is a difference in the proportions for approval between all residents of Fairfax County and Prince William County.
- State whether you reject or do not reject the null hypothesis in part (f) and the reason for your decision in one sentence. State your conclusion in context of the problem (i.e. interpret your results and/or answer the question being posed) in one or two complete sentences.
- What is a possible confounding variable that might have introduced sampling bias into the results of the survey? Answer in one to two sentences.
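As a rough illustration of the two-proportion steps above, the sketch below checks the success/failure condition and builds a confidence interval for the difference p1 - p2 using the usual unpooled standard error. The function name `two_prop_ci` and the counts are hypothetical and are not the survey results from this assignment.

```python
import math
from statistics import NormalDist


def two_prop_ci(x1, n1, x2, n2, level=0.95):
    """Confidence interval for p1 - p2 from summary counts.

    x1, x2 are the numbers of successes; n1, n2 are the sample sizes.
    """
    p1, p2 = x1 / n1, x2 / n2
    # Large-sample condition: at least 10 successes and 10 failures per sample
    conditions_ok = all(c >= 10 for c in (x1, n1 - x1, x2, n2 - x2))

    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    std_err = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    margin = z_star * std_err
    return round(diff, 4), round(diff - margin, 4), round(diff + margin, 4), conditions_ok


# Made-up counts for illustration only (NOT the assignment data)
diff, lower, upper, ok = two_prop_ci(60, 100, 45, 100)
print(diff, lower, upper, ok)
# An interval that does not contain 0 suggests the population proportions differ
print("captures 0:", lower <= 0 <= upper)
```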
- Make two separate scatterplots where each scatterplot will present one of the explanatory variables graphed with the response variable “Disease”. Copy and paste the two scatterplots in your solutions document (use Graph → Scatter Plot in StatCrunch). Appropriately title and label each graph.
- Interpret the scatterplot of “Biking” and “Disease” using trend, strength, and shape (form) in one complete sentence.
- Interpret the scatterplot of “Smoking” and “Disease” using trend, strength, and shape (form) in one complete sentence.
- Calculate both correlation coefficients using Stat → Summary Stats → Correlation in StatCrunch. Each correlation will be calculated using one of the explanatory variables vs. the response variable “Disease”. Provide both of these correlation coefficient values in your solutions document.
- Which of the two explanatory variables would be the better predictor of “Disease”? Base your response on the scatterplots and the correlation coefficient. State your answer in one or two complete sentences including an explanation for your variable choice.
- Using the “Biking” variable as the explanatory variable and “Disease” as the response variable, run a Simple Linear Regression analysis in StatCrunch. Use Stat → Regression → Simple Linear. Copy and paste only the StatCrunch results output at the top (do not include the tables).
- Copy and paste the fitted line plot for “Biking” and “Disease” into your solutions document. This StatCrunch graph appears on page 2 of your StatCrunch output (i.e., click the right arrow at the bottom of your regression output to find the image).
- Type the regression equation for “Biking” and “Disease” in context into your solutions document. You may copy and paste it from your output in part (f).
- Interpret the slope of the regression line (in context of this data set) for “Biking” and “Disease”.
- Is it meaningful to interpret the y-intercept for “Biking” and “Disease”? Why or why not?
- State r² (i.e., the coefficient of determination) for “Biking” and “Disease” and explain what this value means in context of the data set.
- In one randomly selected location, the researcher found the biking rate was 70%. Use this information to predict the corresponding rate of heart disease in that location. Use the regression equation in part (h) to predict the rate of heart disease. Show the typed calculation in your solutions.
- Was the prediction you made for the researcher in part (l) an example of extrapolation? Why or why not? Write your response in one to two complete sentences with an explanation.
- Can we say that biking causes a reduction in the heart disease rate? Why or why not? If you cannot, provide an example of a confounding variable. Answer these questions in one or two sentences.
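The regression steps above (correlation, least-squares line, r², prediction, and the extrapolation check) can be sketched in Python as shown below. The helper names `fit_line` and `predict` are hypothetical, and the biking and disease values are made up for illustration; they are not the assignment's data set, so the fitted equation and the extrapolation verdict here say nothing about the real answers.

```python
import math
from statistics import mean


def fit_line(x, y):
    """Least-squares slope, intercept, and correlation coefficient r."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


def predict(slope, intercept, x_new, x_observed):
    """Predicted response plus a flag for extrapolation beyond the observed x range."""
    y_hat = intercept + slope * x_new
    is_extrapolation = not (min(x_observed) <= x_new <= max(x_observed))
    return y_hat, is_extrapolation


# Made-up percentages for illustration only (NOT the assignment data)
biking = [10, 20, 30, 40, 50]
disease = [18, 16, 13, 11, 9]

slope, intercept, r = fit_line(biking, disease)
print(f"Disease-hat = {intercept:.2f} + ({slope:.2f}) * Biking")
print(f"r = {r:.4f}, r^2 = {r * r:.4f}")
print(predict(slope, intercept, 70, biking))
```

Whether a biking rate of 70% counts as extrapolation depends entirely on the range of biking rates in the actual data, which is why the check compares the new value against the observed minimum and maximum.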
