__STAT 250 Spring 2022 Data Analysis Assignment 4__

__You may not upload this file to any online homework help sites. In addition, you may not discuss this assignment on any group chats with any individuals (either in this course or not). Please see our course syllabus for honor code rules. Thank you.__

Your solutions document should include the following items. Points will be deducted if the following are not included.

- Type your **Name** and **STAT 250** with your correct section number (e.g. STAT 250-xxx) right justified and then **Data Analysis Assignment #4** centered on the top of page 1 below your name to begin your solutions document.
- Number your pages across your entire solutions document.
- Your solutions document should include the **ANSWERS ONLY** with each answer labeled by its corresponding number and subpart. Keep the answers in order.
- Generate all requested graphs and tables using __StatCrunch__.
- Upload your solutions document onto Blackboard as a __pdf file__ using the link provided by your instructor. You are responsible for uploading a readable file.

**You may not work with other individuals on this assignment. It is an honor code violation if you do.**

*Please note: all StatCrunch Instructions provided in the parts of the problems will be presented in italics.*

**Elements of good technical writing:**

Use complete and coherent sentences to answer the questions. Graphs must be appropriately titled and should refer to the context of the question. Graphical displays must include __labels with units__ if appropriate for each axis. Units should always be included when referring to numerical values. When making a comparison you must use comparative language, such as “greater than”, “less than”, or “about the same as.”

Ensure that all graphs and tables appear on one page and are not split across two pages. Type __all__ mathematical calculations when directed to compute an answer ‘by-hand.’ Pictures of actual handwritten work are not accepted on this assignment. When writing mathematical expressions into your solutions document you may use either an equation editor or common shortcuts. For example, $\sqrt{x}$ can be written as sqrt(x), $\hat{p}$ can be written as p-hat, and $\bar{x}$ can be written as x-bar.

**Investigation 1: Ezpass on I-66**

Transportation engineers determined that to help ease traffic, they would set up tolls for solo drivers during rush hour times on Route I-66 inside the Capital Beltway (Route I-495). Carpooling drivers with the proper Ezpass would drive free. The toll program began on December 4, 2017. To test the claim that fewer vehicles would use Route I-66 after the program began, a count of the number of vehicles was collected from two random samples. One sample collected the number of daily vehicles for 12 randomly selected days prior to December 4, 2017 (“**Before**”) and a second sample collected the number of daily vehicles for another 12 randomly selected days after December 4, 2017 (“**After**”) (up to a year after). The data set is called *“Number of Vehicles 1.”*

**Use the significance level α = 0.05.**

- What type of data are collected for each sample (categorical or numerical)? How many samples have been selected? Answer these questions in one sentence each.
- Have the samples been collected independently or dependently? Answer the question and provide a reason why in one sentence.
- Define the population parameter in the context of this question in one sentence.
- State the hypotheses you would use to test the claim stated in the question.
- Calculate the statistic you plan to use to estimate the stated parameter in part (c). Round this statistic to a whole number.
- Produce one image that displays both samples’ data using horizontal boxplots. Please title and label this graph correctly and copy the graph into your solutions.
- Write a one-sentence interpretation of this plot. Also, comment on whether it is appropriate to use the t-distribution for inference based on this interpretation and the conditions necessary.
- No matter your answer to part (g), use *Stat → T Stats → Two Sample → With Data* to compute the test statistic and *p*-value for this hypothesis test. Select Sample 1 as “After” and Sample 2 as “Before.” Copy the output into your solutions.
- Based on the *p*-value from the output produced in part (h), state the decision you would make in this hypothesis test. Provide a reason for this decision in one sentence.
- State your conclusion in the context of this hypothesis test. Write your answer in context in one or two sentences.
- Comment in one sentence if you believe your conclusions are accurate based on your interpretation in part (g).
- Provide at least one confounding variable that may have had an effect on this study’s results in one sentence.
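
The two-sample test statistic requested above can also be sketched outside StatCrunch. The following is a minimal Python illustration, assuming unpooled (Welch) variances, which may or may not match the option selected in StatCrunch. The function name `welch_t` and the sample values are hypothetical placeholders, not the values in the “Number of Vehicles 1” data set.

```python
import math
import statistics


def welch_t(sample1, sample2):
    """Return (difference in means, standard error, t statistic, approximate df).

    Assumes two independent samples and unpooled (Welch) variances.
    """
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = statistics.mean(sample1), statistics.mean(sample2)
    # Each term is a sample variance divided by its sample size.
    v1 = statistics.variance(sample1) / n1
    v2 = statistics.variance(sample2) / n2
    diff = mean1 - mean2
    se = math.sqrt(v1 + v2)
    t = diff / se
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return diff, se, t, df


# Placeholder values only; these are NOT the assignment's data.
after = [10.0, 12.0, 14.0]
before = [13.0, 15.0, 17.0]

diff, se, t, df = welch_t(after, before)
print(f"difference = {diff:.3f}, SE = {se:.3f}, t = {t:.3f}, df = {df:.2f}")
```

Passing “After” first and “Before” second mirrors the Sample 1 / Sample 2 order the assignment asks for, so a negative difference corresponds to fewer vehicles after the toll program began.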

**Investigation 2: Ezpass on I-66 Continued**

Transportation engineers determined that to help ease traffic, they would set up tolls for solo drivers during rush hour times on Route I-66 inside the Capital Beltway (I-495). Carpooling drivers with the proper Ezpass would drive free. The toll program began on December 4, 2017. To test the claim that fewer vehicles would use Route I-66 after the program began, a count of the number of vehicles was collected from two random samples. Both samples were taken on the first Monday of each month for an entire year before and after the toll program was implemented. The data set is called *“Number of Vehicles 2”* **and we will use α = 0.05 in this investigation.**

- Calculate the difference between the number of vehicles for the first Monday of each month using StatCrunch (go to *Data → Compute → Expression*). Please subtract (**2018** – **2017**). List the difference for each of the pairs in one column in your solutions document.
- Obtain the sample mean of these differences and the sample standard deviation of these differences using StatCrunch. Copy the table that you obtain from StatCrunch into your solutions document and round the values for the sample mean and sample standard deviation to two decimal places.
- Define the population parameter in context in one sentence.
- State the null and alternative hypotheses using correct notation.
- Create a frequency histogram overlaid with a Normal curve of the sample differences and create a horizontal boxplot of the sample differences. Title and label each graph appropriately and copy these graphs into your solutions.
- Provide a one-sentence comment on what in these graphs allows us to continue conducting inference using the t-distribution in this problem.
- No matter your answer to part (f), calculate the test statistic “by hand” and be sure to show your work (please type this work). Use the rounded values you obtained in part (b). Round the test statistic to three decimal places.
- State the degrees of freedom for this test and show how you calculated the degrees of freedom.
- Calculate the *p*-value using the T-calculator in StatCrunch (*Stat → Calculators → T*). Present the T-Calculator image in your solutions document.
- Use StatCrunch *Stat → T Stats → Paired* and enter “**2018**” for Sample 1 and “**2017**” for Sample 2 to verify your test statistic from part (g) and your *p*-value in part (i). Copy the output into your solutions document.
- State your decision whether you reject or do not reject the null hypothesis and the reason for your decision in one sentence.
- State your conclusion in context of the problem in one or two complete sentences.
- Compare the standard error from the StatCrunch output in part (j) to the standard error obtained in part (h) of Investigation 1. How did the different study designs affect the outcome of each hypothesis test? Answer these in two or three sentences.
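
The paired calculation in this investigation (differences, their mean and standard deviation, the test statistic, and the degrees of freedom) can be sketched as follows. This is a minimal Python illustration; the function name `paired_t` and the monthly counts are hypothetical placeholders, not the values in the “Number of Vehicles 2” data set.

```python
import math
import statistics


def paired_t(later, earlier):
    """Return (mean difference, sd of differences, standard error, t, df).

    Differences are computed as later - earlier, matching (2018 - 2017).
    """
    diffs = [a - b for a, b in zip(later, earlier)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)   # sample standard deviation (n - 1 divisor)
    se = s_d / math.sqrt(n)         # standard error of the mean difference
    t = d_bar / se                  # test statistic under H0: mu_d = 0
    return d_bar, s_d, se, t, n - 1


# Placeholder values only; these are NOT the assignment's data.
counts_2018 = [8.0, 9.0, 7.0, 10.0]
counts_2017 = [10.0, 10.0, 10.0, 11.0]

d_bar, s_d, se, t, df = paired_t(counts_2018, counts_2017)
print(f"d-bar = {d_bar:.2f}, s_d = {s_d:.2f}, SE = {se:.3f}, t = {t:.3f}, df = {df}")
```

Because the paired design works with one column of differences, its standard error depends only on the spread of those differences, which is the quantity the final comparison with Investigation 1 asks about.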

**Investigation 3: I-66 Outside the Beltway (no data set)**

**The Transform 66 Outside the Beltway project, scheduled to be completed late 2022, is an expansion of the current highway that will provide new travel choices for drivers and hopefully congestion relief from the Capital Beltway (I-495) to Gainesville for a 22.5 mile stretch. This 22.5 mile stretch of I-66 will go through both Fairfax County and Prince William County. VDOT had team members from the project reach out through a phone survey for a random selection of 3,500 residents from both Fairfax County and Prince William County. One question asked on the survey was “Do you feel that the new I-66 Express Lanes will bring congestion relief?” Of the Fairfax County residents surveyed, 1,642 agreed they thought that the expansion will bring congestion relief while only 1,576 of the Prince William County residents surveyed agreed with the statement.**

- Calculate and label the two sample proportions separately and round the values to four decimal places. Next, calculate the difference between these sample proportions by subtracting (Fairfax County – Prince William County). Type all of these calculations and label each of them.
- Check the Central Limit Theorem conditions for a confidence interval for the difference between two proportions. For the large sample condition, check and show that each sample has at least 10 successes and 10 failures.
- Construct the 95% confidence interval in StatCrunch using *Stat → Proportion Stats → Two Sample → With Summary*. Use Fairfax County as Sample 1 and Prince William County as Sample 2. Copy the output into your solutions document.
- Interpret the 95% confidence interval in context in one sentence. In your interpretation, round the confidence limits to four decimal places.
- Does your confidence interval capture 0? What does this indicate about the two samples?
- State the null and alternative hypotheses using correct notation if you were to test to see if there is a difference in the proportions for approval between all residents of Fairfax County and Prince William County.
- State whether you reject or do not reject the null hypothesis in part (f) and the reason for your decision in one sentence. State your conclusion in context of the problem (i.e. interpret your results and/or answer the question being posed) in one or two complete sentences.
- What is a possible confounding variable that might have introduced sampling bias into the results of the survey? Answer in one to two sentences.
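
The steps in this investigation (sample proportions, the large sample check, and the confidence interval for the difference) can be sketched in Python as below. The sketch assumes that 3,500 residents were surveyed in each county, which is one reading of the problem statement, and it uses the standard unpooled interval for a difference of two proportions. The function names are hypothetical, and the output may differ slightly from StatCrunch because of rounding.

```python
import math
from statistics import NormalDist


def large_sample_ok(successes, n, minimum=10):
    """Check for at least `minimum` successes and `minimum` failures."""
    return successes >= minimum and (n - successes) >= minimum


def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Return (p1, p2, p1 - p2, (lower, upper)) using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value z*
    return p1, p2, diff, (diff - z * se, diff + z * se)


N_PER_COUNTY = 3500  # ASSUMPTION: 3,500 residents surveyed in each county

# Sample 1 = Fairfax County, Sample 2 = Prince William County.
p_fairfax, p_pw, diff, (lower, upper) = two_proportion_ci(
    1642, N_PER_COUNTY, 1576, N_PER_COUNTY
)

print("Fairfax condition met:", large_sample_ok(1642, N_PER_COUNTY))
print("Prince William condition met:", large_sample_ok(1576, N_PER_COUNTY))
print(f"p1 = {p_fairfax:.4f}, p2 = {p_pw:.4f}, difference = {diff:.4f}")
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
```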

**Investigation 4: Heart Disease**

A public health researcher is interested in some factors that influence heart disease. In a survey of 68 randomly selected localities, he gathered data on the percentage of people in each locality who bike to work “**Biking**”, the percentage of people in each locality who smoke “**Smoking**”, and the percentage of people in each locality who have heart disease “**Heart.Disease**”. The researcher wants to find which explanatory variable will be a better predictor of the response variable, “**Heart.Disease**”. Investigate the relationship between the explanatory variables and response variable to help the researcher find the better predictor. The dataset is called “*Heart Disease.*”

- Make two separate scatterplots where each scatterplot will present one of the explanatory variables graphed with the response variable “**Disease**”. Copy and paste the two scatterplots in your solutions document (use *Graph → Scatter Plot* in StatCrunch). Appropriately title and label each graph.
- Interpret the scatterplot of “**Biking**” and “**Disease**” using trend, strength, and shape (form) in one complete sentence.
- Interpret the scatterplot of “**Smoking**” and “**Disease**” using trend, strength, and shape (form) in one complete sentence.
- Calculate both correlation coefficients using *Stat → Summary Stats → Correlation* in StatCrunch. Each correlation will be calculated using one of the explanatory variables vs. the response variable “**Disease**”. Provide both of these correlation coefficient values in your solutions document.
- Which of the two explanatory variables would be the better predictor of “**Disease**”? Base your response on the scatterplots and the correlation coefficient. State your answer in one or two complete sentences including an explanation for your variable choice.
- Using the “**Biking**” variable as the explanatory variable and “**Disease**” as the response variable, run a Simple Linear Regression analysis in StatCrunch. Use *Stat → Regression → Simple Linear*. Copy and paste **only** the StatCrunch results output at the top (do not include the tables).
- Copy and paste the fitted line plot for “**Biking**” and “**Disease**” into your solutions document. This StatCrunch graph appears on page 2 of your StatCrunch output (i.e., click the right arrow at the bottom of your regression output to find the image).
- Type the regression equation for “**Biking**” and “**Disease**” in context into your solutions document. You may copy and paste it from your output in part (f).
- Interpret the slope of the regression line (in context of this data set) for “**Biking**” and “**Disease**”.
- Is it meaningful to interpret the *y*-intercept for “**Biking**” and “**Disease**”? Why or why not?
- State *r*² (i.e., the coefficient of determination) for “**Biking**” and “**Disease**” and explain what this value means in context of the data set.
- In one randomly selected location, the researcher found the biking rate was 70%. Use this information to predict the corresponding rate of heart disease in that location. Use the regression equation in part (h) to predict the rate of heart disease. Show the typed calculation in your solutions.
- Was the prediction you made for the researcher in part (l) an example of extrapolation? Why or why not? Write your response in one to two complete sentences with an explanation.
- Can we say that biking causes reduction in the heart disease rate? Why or why not? If you cannot, provide an example of a confounding variable. Answer these questions in one or two sentences.
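
The regression quantities used in this investigation (slope, intercept, correlation, *r*², a prediction, and a check for extrapolation) can be sketched with ordinary least squares in plain Python. The function names and the four data points are hypothetical placeholders, not the values in the “Heart Disease” data set, so the numbers printed here say nothing about the assignment's answers.

```python
import math


def simple_linear_regression(x, y):
    """Return (slope, intercept, r) for the least-squares line y = intercept + slope * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / math.sqrt(sxx * syy)  # correlation coefficient
    return slope, intercept, r


def is_extrapolation(x_new, x):
    """A prediction is extrapolation when x_new lies outside the observed x range."""
    return x_new < min(x) or x_new > max(x)


# Placeholder values only; these are NOT the assignment's data.
biking = [10.0, 20.0, 30.0, 40.0]    # percent who bike to work
disease = [16.0, 15.0, 11.0, 10.0]   # percent with heart disease

slope, intercept, r = simple_linear_regression(biking, disease)
r_squared = r ** 2
x_new = 70.0
predicted = intercept + slope * x_new

print(f"predicted Disease = {intercept:.2f} + ({slope:.2f}) * Biking")
print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
print(f"prediction at Biking = {x_new}: {predicted:.2f}")
print("extrapolation:", is_extrapolation(x_new, biking))
```

With the real data, whether a biking rate of 70% counts as extrapolation depends on the range of “Biking” values actually observed, which is why the check compares against the minimum and maximum of the explanatory variable.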
