# Udacity Project — A/B Testing

As part of my goal to change careers to enter the data field, I’m going back to school at Western Governors University for their Data Management/Data Analytics Bachelor’s Degree. The later classes in the program take you through Udacity’s Data Analyst Nanodegree, broken up into 5 classes. This is the project for class 2 of 5, the Practical Statistics class.

The data for this project consisted of two CSV files. The first, `ab_data.csv`, showed a user ID, whether the user was in the control or treatment group, whether they got the new page or the old page, and whether they converted. The second, `countries.csv`, held the same ID and showed the country the user was from.

After importing the libraries, I read in the `ab_data.csv` file, then got some information about the dataset.
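As a rough sketch of that first look, assuming pandas and assuming the columns are named `user_id`, `group`, `landing_page`, and `converted` (the names are a guess based on the description above), something like this would work. A tiny in-memory sample stands in for the real file so the snippet runs on its own.

```python
import io
import pandas as pd

# Tiny in-memory stand-in for ab_data.csv; the column names are assumed.
sample_csv = io.StringIO(
    "user_id,group,landing_page,converted\n"
    "1,control,old_page,0\n"
    "2,treatment,new_page,1\n"
    "3,treatment,new_page,0\n"
    "4,control,old_page,1\n"
)

# With the real file this would be pd.read_csv("ab_data.csv").
df = pd.read_csv(sample_csv)

n_rows = df.shape[0]                      # number of rows
n_unique_users = df["user_id"].nunique()  # number of distinct users
missing_per_column = df.isnull().sum()    # missing values per column

print(df.head())
print(n_rows, n_unique_users)
print(missing_per_column)
```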

There was also an issue with the data that needed to be looked at. If someone is in the `treatment` group, they should also have their landing page as `new_page`. I looked for the rows where the two didn't line up, then dropped those rows.
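One way to find and drop the mismatched rows is to compare two boolean masks. This is a sketch on made-up data, and the column names `group` and `landing_page` are assumptions.

```python
import pandas as pd

# Made-up rows; users 3 and 4 have a group/page mismatch.
df = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 5],
    "group": ["control", "treatment", "treatment", "control", "treatment"],
    "landing_page": ["old_page", "new_page", "old_page", "new_page", "new_page"],
    "converted": [0, 1, 0, 1, 0],
})

is_treatment = df["group"] == "treatment"
is_new_page = df["landing_page"] == "new_page"

# A row is consistent only when both masks agree.
mismatched = is_treatment != is_new_page
n_mismatched = int(mismatched.sum())

# Keep only the consistent rows.
df2 = df[~mismatched].copy()
print(n_mismatched, len(df2))
```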

Then I looked to see if there were any duplicate `user_id` rows. One was found, and it was dropped.
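A minimal sketch of the duplicate check, again on made-up data. Keeping the first occurrence is an arbitrary choice made for this example.

```python
import pandas as pd

# Made-up data where user 2 appears twice.
df2 = pd.DataFrame({
    "user_id": [1, 2, 2, 3],
    "landing_page": ["old_page", "new_page", "new_page", "old_page"],
    "converted": [0, 0, 0, 1],
})

# keep=False marks every copy of a repeated user_id, so both rows show up.
dup_rows = df2[df2["user_id"].duplicated(keep=False)]
print(dup_rows)

# Drop the repeat, keeping the first occurrence (an arbitrary choice here).
deduped = df2.drop_duplicates(subset="user_id", keep="first")
print(len(deduped))
```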

I looked into the probability of someone converting regardless of which page they received, then the probability of converting for the treatment group and for the control group.

I also looked at the split between people who got the new page and the old page, to see whether it was skewed in one direction or the other. The split was roughly 50/50, about as even a divide as you can get.
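Because `converted` is a 0/1 column, all of these probabilities come down to taking means. Here is a sketch on toy data, with assumed column names; the numbers are made up and are not the project's results.

```python
import pandas as pd

# Toy data: 4 control users and 4 treatment users.
df2 = pd.DataFrame({
    "group": ["control"] * 4 + ["treatment"] * 4,
    "landing_page": ["old_page"] * 4 + ["new_page"] * 4,
    "converted": [1, 0, 0, 0, 1, 1, 0, 0],
})

# Probability of converting regardless of page.
p_overall = df2["converted"].mean()

# Probability of converting within each group.
p_by_group = df2.groupby("group")["converted"].mean()

# Share of users who received the new page.
p_new_page = (df2["landing_page"] == "new_page").mean()

print(p_overall)
print(p_by_group)
print(p_new_page)
```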

# A/B test

I’m going to say that the null hypothesis is that Pnew ≤ Pold, and the alternative hypothesis is that Pnew > Pold. Assuming Pnew = Pold, I came up with the following:
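A minimal sketch of what that setup could look like, assuming the shared rate under the null is taken to be the pooled conversion rate across both pages. That choice, the `p_new` and `p_old` names, and the toy data are assumptions of this example.

```python
import pandas as pd

# Toy stand-in for the cleaned data; column names are assumed.
df2 = pd.DataFrame({
    "landing_page": ["new_page"] * 5 + ["old_page"] * 5,
    "converted": [1, 0, 0, 0, 0, 1, 1, 0, 0, 0],
})

# Under the null, both pages share one conversion rate.
# Using the pooled rate for it is an assumption of this sketch.
p_new = p_old = df2["converted"].mean()

# Number of users who saw each page.
n_new = int((df2["landing_page"] == "new_page").sum())
n_old = int((df2["landing_page"] == "old_page").sum())

print(p_new, p_old, n_new, n_old)
```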

I then simulated a sample for `new_page_converted` and `old_page_converted` based on the `n_new` and `n_old` that I got above.
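One simulated sample under the null can be drawn with NumPy's binomial sampler. The rate and sample sizes below are hypothetical placeholders, not the values from this dataset.

```python
import numpy as np

rng = np.random.default_rng(42)  # seeded so the sketch is repeatable

# Hypothetical placeholder values, not the project's numbers.
p_null = 0.12
n_new, n_old = 1000, 1000

# Each user is one Bernoulli trial: 1 = converted, 0 = did not convert.
new_page_converted = rng.binomial(1, p_null, size=n_new)
old_page_converted = rng.binomial(1, p_null, size=n_old)

# Difference in conversion rates for this single simulated sample.
sim_diff = new_page_converted.mean() - old_page_converted.mean()
print(sim_diff)
```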

After that, I did the following:
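One common way to continue from here, shown as a sketch rather than the exact code used in this project, is to repeat the simulation many times and record the difference in conversion rates each time. The 10,000 repetitions, the `p_diffs` name, and the rate and sample sizes are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical placeholder values, not the project's numbers.
p_null = 0.12
n_new, n_old = 1000, 1000
n_sims = 10_000

# Drawing the number of conversions per simulated sample directly is
# equivalent to summing n individual 0/1 trials, and much faster.
new_rates = rng.binomial(n_new, p_null, size=n_sims) / n_new
old_rates = rng.binomial(n_old, p_null, size=n_sims) / n_old

# Sampling distribution of the difference under the null.
p_diffs = new_rates - old_rates
print(p_diffs.mean(), p_diffs.std())
```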

Then I created a histogram from that data.
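To connect the histogram to a p-value, one option is to count how many simulated differences are larger than the observed one. This sketch bins the values with `np.histogram` so it runs without a plotting library. The observed difference `obs_diff` and the other inputs are hypothetical placeholders, not results from this project.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical placeholder values, not the project's numbers.
p_null, n_new, n_old, n_sims = 0.12, 1000, 1000, 10_000
obs_diff = -0.001  # hypothetical observed (new - old) difference

# Simulated differences in conversion rate under the null.
p_diffs = (rng.binomial(n_new, p_null, size=n_sims) / n_new
           - rng.binomial(n_old, p_null, size=n_sims) / n_old)

# Bin the simulated differences; a plotting library such as matplotlib
# could draw the same data with plt.hist(p_diffs).
counts, bin_edges = np.histogram(p_diffs, bins=20)

# One-sided p-value for the alternative Pnew > Pold: the share of
# simulated differences larger than the observed one.
p_value = (p_diffs > obs_diff).mean()
print(p_value)
```

A p-value above 0.05 here would mean failing to reject the null at the 5% level.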

Through all this, I’ve found that the p-value is above the 5% Type I error rate. In this case, we fail to reject the null hypothesis and keep the old version of the website. There is no statistically significant evidence that changing the site will result in more conversions.