January 31, 2023. The post Hypothesis Testing in R appeared first on Data Science Tutorials.


Hypothesis Testing in R. A hypothesis test is a formal statistical test used to reject or fail to reject a statistical hypothesis.

This tutorial shows how to perform the following hypothesis tests in R:

One-sample t-test

Two-sample t-test

Paired samples t-test

Each type of test can be run using the R function t.test().


The function uses the following basic syntax:

t.test(x, y = NULL, alternative = c("two.sided", "less", "greater"), mu = 0, paired = FALSE, var.equal = FALSE, conf.level = 0.95, ...)

Where:

x, y: two samples of data.

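alternative: The alternative hypothesis to be tested ("two.sided", "less", or "greater").

mu: The hypothesized value of the mean (or of the difference in means when two samples are given).

paired: Whether to run a paired t-test.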

var.equal: Whether to assume that the variances between samples are equal.

conf.level: The confidence level to use.
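As a rough sketch of how these arguments combine, the calls below run a two-sided and a one-sided one-sample test on a small made-up vector. The vector `x_demo` and the hypothesized mean of 10 are illustrative assumptions, not data from this tutorial.

```r
# Hypothetical data, used only to illustrate the arguments
x_demo <- c(9.8, 10.4, 10.1, 9.6, 10.9, 10.3, 9.9, 10.6)

# Two-sided test (the default): H1 is "true mean is not equal to 10"
two_sided <- t.test(x_demo, mu = 10, alternative = "two.sided")

# One-sided test: H1 is "true mean is greater than 10",
# reported with a 90% confidence bound instead of the default 95%
one_sided <- t.test(x_demo, mu = 10, alternative = "greater", conf.level = 0.90)

two_sided$p.value
one_sided$p.value
one_sided$conf.int   # lower bound only; the upper limit is Inf
```

Because the sample mean of `x_demo` is above 10, the one-sided p-value here is exactly half of the two-sided one.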

The following examples show how to use this function in practice.

Example 1: One-Sample T-Test in R

The one-sample t-test is used to determine whether the population mean is equal to a given value.

Suppose we want to determine whether the average weight of a particular species of turtle is 310 pounds. We go out and collect a simple random sample of turtles with the weights listed below.


Weight: 301, 305, 312, 315, 318, 319, 310, 318, 305, 313, 305, 305, 305

The following code shows how to perform this one-sample t-test in R:

# define vector of turtle weights
weights <- c(301, 305, 312, 315, 318, 319, 310, 318, 305, 313, 305, 305, 305)

# perform one-sample t-test
t.test(x = weights, mu = 310)

        One Sample t-test

data:  weights
t = 0.045145, df = 12, p-value = 0.9647
alternative hypothesis: true mean is not equal to 310
95 percent confidence interval:
 306.3644 313.7895
sample estimates:
mean of x
 310.0769

From the output we can see:

t-test statistic: 0.045145

Degrees of freedom: 12

p-value: 0.9647

95% confidence interval for true mean: [306.3644, 313.7895]

Sample mean turtle weight: 310.0769

Because the p-value of the test (0.9647) is not less than .05, we fail to reject the null hypothesis.

This means that we do not have enough evidence to conclude that the average weight of this species of turtle is different from 310 pounds.
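The object returned by t.test() has class "htest", so the values read off the printed output can also be extracted programmatically. The sketch below repeats the turtle example, stores the result in a variable called `result` (a name chosen here for illustration), and recomputes the t statistic by hand as t = (sample mean - mu) / (s / sqrt(n)).

```r
weights <- c(301, 305, 312, 315, 318, 319, 310, 318, 305, 313, 305, 305, 305)
result <- t.test(x = weights, mu = 310)

result$statistic   # t statistic
result$parameter   # degrees of freedom
result$p.value     # p-value
result$conf.int    # confidence interval for the true mean
result$estimate    # sample mean

# Recompute the t statistic manually from its definition
n <- length(weights)
t_manual <- (mean(weights) - 310) / (sd(weights) / sqrt(n))

# TRUE if the manual value matches the one reported by t.test()
all.equal(unname(result$statistic), t_manual)
```

Storing the result this way is convenient when the p-value or interval needs to be reused later, for example in a report or a loop over several samples.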

Example 2: Two Sample T-Test in R

To determine whether the means of two populations are equal, a two-sample t-test is employed.

Suppose we want to determine whether the average weights of two different species of turtles are equal. We test this by collecting a simple random sample of turtles from each species with the following weights.


Sample 1: 310, 311, 310, 315, 311, 319, 310, 318, 315, 313, 315, 311, 313

Sample 2: 335, 339, 332, 331, 334, 339, 334, 318, 315, 331, 317, 330, 325


The following code shows how to perform this two-sample t-test in R:

# define vector of turtle weights for each sample
sample1 <- c(310, 311, 310, 315, 311, 319, 310, 318, 315, 313, 315, 311, 313)
sample2 <- c(335, 339, 332, 331, 334, 339, 334, 318, 315, 331, 317, 330, 325)

# perform two-sample t-test
t.test(x = sample1, y = sample2)

        Welch Two Sample t-test

data:  sample1 and sample2
t = -6.7233, df = 15.366, p-value = 6.029e-06
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -21.16313 -10.99071
sample estimates:
mean of x mean of y
 313.1538  329.2308

We reject the null hypothesis because the p-value of the test (6.029e-06) is smaller than .05.

This means we have sufficient evidence to conclude that the mean weights of the two species are not equal.
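By default t.test() runs Welch's test, which does not assume equal variances (var.equal = FALSE). As a sketch of what the var.equal argument changes, the code below reruns the same comparison as a pooled-variance (Student's) t-test. The variable names `welch` and `pooled` are chosen here for illustration.

```r
sample1 <- c(310, 311, 310, 315, 311, 319, 310, 318, 315, 313, 315, 311, 313)
sample2 <- c(335, 339, 332, 331, 334, 339, 334, 318, 315, 331, 317, 330, 325)

# Welch's t-test (default): variances are not assumed equal
welch <- t.test(x = sample1, y = sample2)

# Student's t-test: variances are assumed equal and pooled
pooled <- t.test(x = sample1, y = sample2, var.equal = TRUE)

welch$parameter    # Welch-Satterthwaite degrees of freedom
pooled$parameter   # n1 + n2 - 2 = 24

# Compare the two p-values side by side
c(welch = welch$p.value, pooled = pooled$p.value)
```

Because both samples have 13 observations, the two tests produce the same t statistic and differ only in their degrees of freedom. Both reject the null hypothesis at the .05 level for these data.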

Example 3: Paired Samples T-Test in R

A paired samples t-test is used to compare the means of two samples when each observation in one sample can be paired with an observation in the other sample.

For example, let’s say we want to determine whether a particular training program can help basketball players increase their maximum vertical jump (in inches).


We can collect a small, random sample of 12 college basketball players to test this by measuring each player’s maximum vertical jump. Then, after each athlete has used the training regimen for a month, we can take another look at their maximum vertical jump.

The following information shows the maximum jump height (in inches) for each athlete before and after using the training program.

Before: 122, 124, 120, 119, 119, 120, 122, 125, 124, 123, 122, 121

After: 123, 125, 120, 124, 118, 122, 123, 128, 124, 125, 124, 120

The following code shows how to perform this paired samples t-test in R:

# define before and after max jump heights
before <- c(122, 124, 120, 119, 119, 120, 122, 125, 124, 123, 122, 121)
after <- c(123, 125, 120, 124, 118, 122, 123, 128, 124, 125, 124, 120)

# perform paired samples t-test
t.test(x = before, y = after, paired = TRUE)

        Paired t-test

data:  before and after
t = -2.5289, df = 11, p-value = 0.02803
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -2.3379151 -0.1620849
sample estimates:
mean of the differences
                  -1.25

We reject the null hypothesis because the p-value of the test (0.02803) is smaller than .05.


This means we have sufficient evidence to conclude that the mean jump height before and after the training program is not equal.
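A paired t-test is mathematically the same as a one-sample t-test on the pairwise differences with mu = 0. The sketch below checks this on the jump data; the names `diffs`, `paired_res`, and `diff_res` are illustrative.

```r
before <- c(122, 124, 120, 119, 119, 120, 122, 125, 124, 123, 122, 121)
after  <- c(123, 125, 120, 124, 118, 122, 123, 128, 124, 125, 124, 120)

# Paired samples t-test, as in the example above
paired_res <- t.test(x = before, y = after, paired = TRUE)

# Equivalent one-sample t-test on the per-player differences
diffs <- before - after
diff_res <- t.test(x = diffs, mu = 0)

# Both comparisons return TRUE when the two approaches agree
all.equal(unname(paired_res$statistic), unname(diff_res$statistic))
all.equal(paired_res$p.value, diff_res$p.value)

mean(diffs)   # mean of (before - after)
```

The mean difference is negative because it is computed as before minus after, which indicates that jump heights were higher on average after the training program.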
