Ask
Business task
You are a junior data analyst working in the marketing analyst team at Cyclistic, a bike-share company in Chicago. The director of marketing believes the company’s future success depends on maximizing the number of annual memberships. Therefore, your team wants to understand how casual riders and annual members use Cyclistic bikes differently. From these insights, your team will design a new marketing strategy to convert casual riders into annual members. But first, Cyclistic executives must approve your recommendations, so they must be backed up with compelling data insights and professional data visualizations.
Prepare
Data sources
We’ll be using Cyclistic’s historical bike trip data from the last 12 months, January - December 2021, which is publicly available here: https://divvy-tripdata.s3.amazonaws.com/index.html. There are 12 .csv files total.
It is organized in rows and columns. Each rows represents one trip, and each trip has a unique field that identifies it: ride_id. Each trip is anonymized and includes the following fields:
Process
Cleaning or manipulation of data
Our task requires three libraries — Pandas, Glob, and the OS module and PostgreSQL as our database.
Merge datasets
Merge all 12 months of datasets using a simple python script:
Code example
|
|
View datasets
After merging the 12 datasets, import to Postgres database. The dataset has a total of 5,595,063 rows. Use the SQL statement below to view all rows.
|
|
Use the SQL statement below to view 1,006,761 rows with null values.
Code example
|
|
Delete NULL
Use the SQL statement below to delete 1,006,761 rows with null values. After deletion, 4,588,302 rows left.
Code example
|
|
We are going to delete also rows with zero or less than zero values. To delete the rows, use the SQL statement below:
|
|
116 and 82 records deleted respectively. The total remaining rows that needs to be explored are 4,588,104.
Add new columns
Add new columns and populate with new data.
day_of_week
Use the SQL statement below to add day_of_week and populate it with new data by extracting ‘Day’ from the timestamp.
|
|
Month
Use the SQL statement below to add month and populate it with new data by extracting ‘Month’ from the timestamp.
|
|
Analyze
Summary of analysis
AVG Ride Length
AVG ride length
Use the SQL statement below to compute the average ride length or trip duration (member/casual).
Code example
|
|
The average ride length or trip duration for casual riders is 32:30 minutes while for the members is 13:10 minutes.
MAX ride length
MAX Ride Length
Use the SQL statement below to compute the maximum ride length or trip duration (member/casual).
Code example
|
|
The maximum ride length or trip duration for casual riders is 38 days 20:24:09 hours while for the members is 1 day 55:38 minutes.
Total trips
Total trips
We’ll calculate for annual members and casual riders. We’ll also calculate percentages of overall total for both types:
|
|
Median ride length or trip duration
Median ride length or trip duration
|
|
Busiest day for rides
Busiest day for rides
See which day has the most rides for annual members and casual riders.
Code example
|
|
Median ride length per day
Median ride length per day
|
|
|
|
Total rides per day
Total rides per day
Let’s look at total rides per day. We’ll calculate for overall total, annual members and casual riders:
|
|
Average ride_length for users by day_of_week
Average ride_length for users by day_of_week
|
|
Number of rides for users by day_of_week
Number of rides for users by day_of_week
|
|
Share
Visualizations and key findings
Types of rides
Too many casual riders. Classic bike is commonly used.
Rides for each week
Riders love weekends.
Rides per month
Monthly average
Weekly average
Three recommendations based on analysis
Casual riders ride longer than members. They also have more rides on weekends than members. Show casual riders why membership is better.
Show casual riders how they can save money by purchasing memberships as they ride longer
Run marketing campaigns targeting casual riders during weekends
Run marketing campaigns targeting casual riders during summer months