• Kelly Adams
  • Oct 18, 2021
  • 11 min read

Google Data Analytics Capstone Project

Updated: Jul 5, 2023

I worked on the Google Data Analytics Capstone Project, Track 1, Case Study 1. I will be diving into the background, my full process of cleaning, analyzing and visualizing the data, along with my final suggestions and summary of the data.

Quick Links :

Tableau Dashboard | Github R Code for Analysis | Github R Code for Tableau Visualization | LinkedIn Post

Below is a table of contents in case you want to go to a specific section.

Table of Contents:

Microsoft excel.

Finished Project

Summary of Data

Business Suggestions

What I Learned

Cyclistic is a bike sharing program which features more than 5,800 bikes and 600 docking stations. It offers reclining bikes, hand tricycles, and cargo bikes, making it more inclusive to people with disabilities and riders who can't use a standard two-wheeled bike. It was founded in 2016 and has grown tremendously into a fleet of bicycles that are geotracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the system anytime.

Previously, Cyclistic's marketing strategy tried to build the general awareness and appeal to broad consumers. It has flexible pricing plans: single-ride passes, full-day passes, and annual memberships. Those who purchase single-ride or full-day passes are referred to as casual riders while those who purchase annual memberships are Cyclistic members .

My Role : In this scenario I am a junior data analyst at Cyclistic and my team has been tasked with the overall goal (see below) of designing marketing strategies

Overall Goal : Design marketing strategies aimed at converting casual riders into annual members.

Business Question : "How do annual members and casual riders use Cyclistic bikes differently?"

Below I will describe step-by-step the process I used to for this project. If you want to skip ahead to the business suggestions move onto the section "Insights".

Overview : I first analyzed the data separately (each month) in Excel, then used R to analyze the data as a whole (one year). Finally I created a dashboard in Tableau and used Figma to support the design elements.

I initially wanted to gather and analyze my data in Excel because it was the tool I was most familiar with and I could get a general understanding of the data quicker. I did not combine all of the spreadsheets into one because that would've taken more processing power than my computer had.

I began downloading the data from divvy-tripdata , and turning the .csv files into excel spreadsheets. I downloaded the most recent year of data which was at the time of starting my project:

August 2020

September 2020

October 2020

November 2020

December 2020

January 2021

February 2021

Added two columns to all of the months:

ride_length calculated the total ride length for each trip using the start_at column which was: ending time minus starting time.

day_of_week calculated the day of the week for each trip using the start_at column date.

Went over the business task and the information I had at hand and how that could be used to figure out how members and casual riders use the bike service differently

Came up with metrics to look at such as :

total number of rides per hour, per day of the month, per season, per day of the week, and for different bike types

Average ride length between members and casual

For every month in Excel created pivot tables and charts to go with the analysis on (this took the longest):

Total Rides per Weekday - calculated the total rides for members and casual and separated it by day of the week; used a cluster column chart

Average Ride Length - calculated the average ride length for members and casual and separated it by day of the week; used a cluster column chart

Total Rides per Hour - calculated the total rides for members and casual separated by the time of the day (24hr); used a line comparison chart

Total Rides per Day - calculated the total rides for members and casual separated by the day of the month; used a line comparison chart

Total Rides per Bike Type - calculated the total rides for members and casual separated by Bike type; used stacked column chart

I also created a Google docs Notes list where I wrote down the exact steps for each month (had a checklist) and included my insights for each month

Time Spent:

535 minutes or just under 9 hours to complete.

I originally wanted to use SQL but the files were too big to upload and I couldn't figure out how to utilize Google Cloud Platform. Instead I used R to analyze the data because it could handle all of the information quicker than Excel, and I wanted to work on my R skills. Below is my general process in R, I didn't include my mistakes/missteps or errors for the sake of brevity.

View my full code on my Github for this capstone project here .

Load all of the libraries I used: tidyverse, lubridate, hms, data.table

Uploaded all of the original data from the data source divytrip into R using read_csv function to upload all individual csv files and save them in separate data frames. For august 2020 data I saved it into aug08_df, september 2020 to sep09_df and so on.

Merged the 12 months of data together using rbind to create a one year view

Created a new data frame called cyclistic_date that would contain all of my new columns

Created new columns for:

Ride Length - did this by subtracting end_at time from start_at time

Day of the Week

Time - convert the time to HH:MM:SS format

Season - Spring, Summer, Winter or Fall

Time of Day - Night, Morning, Afternoon or Evening

Cleaned the data by:

Removing duplicate rows

Remove rows with NA values (blank rows)

Remove where ride_length is 0 or negative (ride_length should be a positive number)

Remove unnecessary columns: ride_id, start_station_id, end_station_id, start_lat, start_long, end_lat, end_lng

Calculated Total Rides for:

Total number of rides which was just the row count = 4,152,139

Member type - casual riders vs. annual members

Type of Bike - classic vs docked vs electric; separated by member type and total rides for each bike type

Hour - separated by member type and total rides for each hour in a day

Time of Day - separated by member type and total rides for each time of day (morning, afternoon, evening, night)

Day of the Week - separated by member type and total rides for each day of the week

Day of the Month - separated by member type and total rides for each day of the month

Month - separated by member type and total rides for each month

Season - separated by member type and total rides for each season (spring, summer, fall, winter)

Calculated Average Ride Length for:

Total average ride length

Type of Bike - separated by member type and average ride length for each bike type

Hour - separated by member type and average ride length for each hour in a day

Time of Day - separated by member type and average ride length for each time of day (morning, afternoon, evening, night)

Day of the Week - separated by member type and average ride length for each day of the week

Day of the Month - separated by member type and average ride length for each day of the month

Month - separated by member type and average ride length for each month

Season - separated by member type and average ride lengths for each season (spring, summer, fall, winter)

Then using all of this data I created my own summary in my case notes and took note of the: total rides for each variable, average ride lengths for each variable, and the difference between members versus casual riders. I originally wanted to create a report using R Markdown as well but for the sake of time (I had already spent over 20 hours on the project so far), I decided to skip this step, and write this article instead.

1045 minutes or about 17 and a half hours to complete.

While I learned the basics of Tableau in the Google Course I wanted more practice with visualizing data and creating dashboards.

To view my completed dashboard click here .

I created a separate R code (you can view it here on Github) that made some changes for specifically the Tableau portion.

For ride length I rounded the digits by 1, meaning my numbers were 29.8 or 12.5.

Revised how I created my "month" column. I used mutate() to create a column that had the month in ___ format and not number format. So instead of 01 it would say "January"

Cleaned the data: removed rows with NA values, removed duplicate rows, removed where ride_length was 0 or negative and removed unnecessary columns like: ride_id, start_station_id, end_station_id, start_lat, start_long, end_lat, end_lng

Created a new dataframe with this information so I could test the difference between the original data frame (cyclistic_date) that I used for my analysis and the data frame I would use for Tableau (cyclistic_tableau).

In this new data frame I removed more columns to make calculations quicker in Tableau. I removed: start_station_name, end_station_name, time, started_at, ended_at

Downloaded this data frame into a .csv file which I uploaded to Tableau

Created graphs similar to those I created in Excel but added a few:

Total Rides by Bike Type

Ride Length by Weekday

Total Rides by Weekday

Total rides by hour, total rides by month.

Then I created a basic dashboard with all of that information, a prototype for me to view while I was creating the final dashboard ( Figure 1 below).

Created a prototype mockup in Figma

Created a final version of the mockup in Figma

Edited Dashboard in Tableau to reflect design in Figma

Edited graphs in Tableau

Made bar graphs round

Added annotations

Highlights to specific important notes

Got rid of labels for visual purposes

Combined Figma and Tableau (used dashboard created in Figma as the background for my Tableau Dashboard) to create a final prototype ( Figure 2 below)

Made minor edits to design elements and created final dashboard ( Figure 3 - Cyclistic Dashboard V1 )

On April 24, 2023 I decided to update my dashboard (See Finished Project , image Final Dashboard - Cyclistic Dashboard V2 ). All of the analysis is the same. The only changes have been to the dashboard. Which include:

Adding horizontal grid lines to a few of the charts

Updating the tool tips.

Making all of the top metric values (e.g. Total Rides, Average Ride Length, etc.) interactive in Tableau instead of in Figma.

765 minutes or almost 13 hours to complete.

Tableau Prototype

Below was my first draft of the dashboard only using Tableau.

Prototype of my dashboard for my google capstone project

Prototype using Figma Background

Combined Figma and Tableau (used dashboard created in Figma as the background for my Tableau Dashboard) to create a final prototype.

Dashboard Prototype with Figma background

Final Dashboard V1

Made minor edits to design elements and created final dashboard. This was the original final dashboard.

capstone project google data analytics

I am including the other tools I used.

Figma to create my background and help develop the dashboard aesthetics.

Google Docs helped me keep track of all of my documents for this project like:

Date Log - I wrote down what I did that day related to my project

Resources - A list of resources I frequently used

Case Notes - Notes for the case study including the final insights, what I was looking for, and anything else having to do with the case

Evernote to draft this article before I uploaded it here.

FINISHED PROJECT

Here is my finished project: Google Capstone Project (V2) . You can view the links to my R code on Github used for analysis here and the code for Tableau here .

Note: This is V2 with a few minor changes to the dashboard. Including:

Final dashboard for capstone project

SUMMARY OF DATA

Those who purchase single-ride or full-day passes are referred to as casual riders while those who purchase annual memberships are Cyclistic members .

Total Rides by User Type

Average Ride Length per User Type

Average Ride per Weekday

Members had more rides with 2,328,763 total rides or 56% and casual riders had 1,823,376 total rides or 43%.

Total Rides by Rider Type Pie chart

Total Rides per Bike Type

Both casual riders and members used the classic bike the most with 1,777,593 rides or 43% of total rides, followed by docked bikes with 1,545,936 rides or 37% of total rides, and lastly with electric bikes at 828,610 rides or 20% of total rides.

Total Rides per Bike Type - bar chart

Average Ride Length by User Type

The total average ride length was 24 minutes. For casual riders it was longer at 27 minutes while members was 14 minutes.

Average ride length by rider type

Average Ride Length per Weekday

For the average ride length per weekday both casual riders and members had an increase in the average ride length on the weekends. For both Sunday was the longest at 31 minutes.

average ride length per weekday - bar chart

Saturday was the most popular weekday combining casual riders and member rides with 784,239 rides or 19% of total rides. But for member rides only Wednesday was the most popular day with 356,060 rides, 5,407 rides more than Saturday.

Total rides by weekday - bar chart

5PM or 17:00 was the busiest hour for both members and casual riders with 426,685 rides or 10% of the total rides. Typically rides began increasing in the morning at 6AM and rose until 5PM then dropped afterwards. The afternoon was the busiest for both rider types with 1,905,797 rides or 45% of total rides. 4AM was the least popular hour.

Total rides by hour

July was the busiest month combining casual riders and member rides at 691,476 rides or 16% of total rides. While summer was the most popular season for both at 1,903,446 rides or 46% of total rides. Looking at just members August is actually the busiest month with 323,140 rides, 816 rides more than July. Winter is the least popular season and February is the least popular month.

Total bike rides per month - bar chart

Final Summary

The most popular bike among with riders was the classic.

Busiest time was afternoon and the peak time was at 5PM for both casual riders and members.

Busiest weekday was Saturday, casual riders used the service the most on the weekends.

Busiest season was Summer for both types of riders.

Most rides by User Type was members but casual riders weren't far behind.

The average ride length was 24 minutes but casual riders on average rode 23 minutes longer than members.

BUSINESS SUGGESTIONS

This was the hardest part for me for the whole project. I have never provided suggestions for a business nor worked in marketing. Any feedback here would be appreciated.

These are my suggestions for the marketing team to convert casual riders to annual members:

Personalize discounts and show perks in the membership program based on their preferences and riding habits.

Emphasize the benefits of memberships, including discounts during busy times of the year like during Summer, or on the weekends.

Have existing members to share their stories about how using Cyclistic's system has changed their life, to create a sense of community, offer a discount if they do so this will help encourage new riders to join the program.

WHAT I LEARNED

Below is what I learned/practiced from over 40 hours spent on this project:

Pivot Tables in Microsoft Excel

Practice using R for data analysis and cleaning specifically using the tidyverse package for data analysis

Graphs in Tableau, edited visual elements along with creating different charts and filters.

Design elements of an effective dashboard

Combining the design feature of Figma with the functionality of Tableau

R portion of my project I found Itamar's case study on Kaggle using R as well, a helpful resource.

Tableau portion I used Navneet Singh's Tableau Dashboard as inspiration.

  • Data Analytics
  • Portfolio Projects

Recent Posts

Data Analytics Resources

Google Data Analytics Course Review

Navigation Menu

Search code, repositories, users, issues, pull requests..., provide feedback.

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly.

To see all available qualifiers, see our documentation .

  • Notifications You must be signed in to change notification settings

Exploratory Data Analysis on Bellabeat fitness tracker app using Python. Capstone project from Google Data Analytics Professional Certification.

katiehuangx/Google-Data-Analytics-Capstone

Folders and files.

NameName
16 Commits

Repository files navigation

Google data analytics: capstone, bellabeat fitness tracking app analysis.

Bellabeat

This analysis is an optional Capstone project from the Google Data Analytics Professional Certificate on Coursera.

Background:

Bellabeat is a high-tech manufacturer of beautifully-designed health-focused smart products for women since 2013. Inspiring and empowering women with knowledge about their own health and habits, Bellabeat has grown rapidly and quickly positioned itself as a tech-driven wellness company for females.

The co-founder and Chief Creative Officer, Urška Sršen is confident that an analysis of non-Bellebeat consumer data (ie. FitBit fitness tracker usage data) would reveal more opportunities for growth.

Business Task:

Analyze FitBit Fitness Tracker App data to gain insights into how consumers are using the FitBit app and discover trends and insights for Bellabeat marketing team.

Business Objectives:

  • What are the trends identified?
  • How could these trends apply to Bellabeat customers?
  • How could these trends help influence Bellabeat marketing strategy?

Python for Data Cleaning, Data Transformation, Data Visualisation and Data Analysis

The data set is publicly available on Kaggle .

Medium: Google Capstone Project: How Can Bellabeat, A Wellness Technology Company Play It Smart?

Kaggle: Bellabeat Case Study

  • Jupyter Notebook 100.0%

IMAGES

  1. Google Data Analytics Capstone Project: A Comprehensive Guide

    capstone project google data analytics

  2. Google Data Analytics Capstone Project

    capstone project google data analytics

  3. Google Data Analytics Capstone Project

    capstone project google data analytics

  4. Google Data Analytics Capstone: Complete a Case Study ~ Computer

    capstone project google data analytics

  5. Google Data Analytics Capstone Project: A Comprehensive Guide

    capstone project google data analytics

  6. Google Data Analytics Course Capstone Project Presentation

    capstone project google data analytics

VIDEO

  1. Editing 00 End Course Capstone Project md Google Chrome 2024 06 25 19 40 33

  2. Data analytics capstone project

  3. A Data Analysis Presentation My Capstone Project on a Bike Share Company

  4. Capstone Project Data Analytics -RevoU

  5. Data Analysis, Diet, and Alzheimer's

  6. Keystones For Creating A Successful Gen AI Adoption Plan

COMMENTS

  1. Google Data Analytics Capstone Project

    I worked on the Google Data Analytics Capstone Project, Track 1, Case Study 1. I will be diving into the background, my full process of cleaning, analyzing and visualizing the data, along with my final suggestions and summary of the data. Below is a table of contents in case you want to go to a specific section.

  2. Google Data Analytics Capstone: Complete a Case Study

    There are 4 modules in this course. This course is the eighth and final course in the Google Data Analytics Certificate. You'll have the opportunity to complete a case study, which will help prepare you for your data analytics job hunt. Case studies are commonly used by employers to assess analytical skills. For your case study, you'll ...

  3. Google Data Analytics Capstone Project: Cyclistic Case Study

    Introduction. For the past few months, I have been completing the Google Data Analytics Professional Certificate offered on Coursera.I have been learning about the six stages of the data analysis ...

  4. How I created my first Data Analytics Capstone Project

    It is well known that the Google Data Analytics Professional Course have a Capstone project as an 8th last end course for the completion of Professional Data Analytics Certificate ; which gives ...

  5. Google Data Analytics Capstone For Cyclistic

    My final capstone project for the Google Data Analytics Certification. I tackled a sample set of data from a fictional bike sharing and went through all of the steps of the data analysis process with it: ask, prepare, process, analyze, share, and act.

  6. google-data-analytics-capstone-project · GitHub Topics · GitHub

    Discover valuable insights from the Google Capstone Project by analyzing Cyclistic bike-sharing data using SQL, Power BI, and Python. Leverage data-driven insights to enhance membership strategies and unlock the full potential of bike-sharing data. The analysis now includes a comprehensive Python notebook utilizing pandas, matplotlib, and seaborn.

  7. Google Data Analytics Capstone: Cyclistic Case Study

    A bike-share program that features more than 5,800 bicycles and 600 docking stations. Cyclistic sets itself apart by also offering reclining bikes, hand tricycles, and cargo bikes, making bike-share more inclusive to people with disabilities and riders who can't use a standard two-wheeled bike. The majority of riders opt for traditional bikes ...

  8. Google Advanced Data Analytics Capstone

    This is the seventh and final course of the Google Advanced Data Analytics Certificate. In this course, you have the opportunity to complete an optional capstone project that includes key concepts from each of the six preceding courses. During this capstone project, you'll use your new skills and knowledge to develop data-driven insights for a ...

  9. Google Data Analytics Capstone Project: Bellabeat Case study

    This is a project documentation for Bellabeat case study from the Google Data Analytics Course. The analysis follows the 6 steps of Data Analysis : Ask, Prepare, Process, Analyze, Share and Act…

  10. Google Data Analytics Capstone: Complete a Case Study

    Module 1 • 2 hours to complete. A capstone is a crowning achievement. In this part of the course, you'll be introduced to capstone projects, case studies, and portfolios, and will learn how they help employers better understand your skills and capabilities. You'll also have an opportunity to explore the online portfolios of real data ...

  11. Google Data Analytics: Capstone Project

    This repository contains the capstone project developed during the final course of the Google Data Analytics Professional Certificate. Which is a specialization, with seven courses in total and a final project (the capstone), available through the Coursera platform. The objective of this capstone ...

  12. Google Capstone Project: How Can Bellabeat, A Wellness ...

    This is an optional capstone project from the Google Data Analytics Course no: Capstone Project which is posted on GitHub and Kaggle. The analysis follows the 6 steps of Data Analysis taught in ...

  13. Google Data Analytics Capstone: Cyclistic

    Something went wrong and this page crashed! If the issue persists, it's likely a problem on our side. Unexpected token < in JSON at position 4. keyboard_arrow_up. content_copy. SyntaxError: Unexpected token < in JSON at position 4. Refresh. Explore and run machine learning code with Kaggle Notebooks | Using data from Cyclistic Dataset Case ...

  14. Google Data Analytics Capstone Project: Cyclistic bike-share ...

    You are a junior data analyst working in the marketing analyst team at Cyclistic, a bike-share company in Chicago. The director of marketing believes the company's future success depends on ...

  15. akorez/Google-Data-Analytics-CapStone-Project

    This exploratory analysis case study is towards Capstome project requirement for Google Data Analytics Professional Certificate. The case study involves a bikeshare company's data of its customer's trip details over a 12 month period (November 2020 - October 2021). The data has been made available by Motivate International Inc. under this license.

  16. Google Data Analytics Capstone

    If the issue persists, it's likely a problem on our side. Unexpected token < in JSON at position 4. keyboard_arrow_up. content_copy. SyntaxError: Unexpected token < in JSON at position 4. Refresh. Explore and run machine learning code with Kaggle Notebooks | Using data from Cyclistic.

  17. katiehuangx/Google-Data-Analytics-Capstone

    This analysis is an optional Capstone project from the Google Data Analytics Professional Certificate on Coursera. Background: Bellabeat is a high-tech manufacturer of beautifully-designed health-focused smart products for women since 2013.

  18. Google Data Analytics Capstone Project: Bike Share Analysis

    Sep 17, 2021. Photo by sabina fratila on Unsplash. This is my version of the Google Data Analyst Capstone Project: Case Study 1. The full document of the case study can be found in Google Data ...

  19. Google Data Analytics Capstone Project

    If the issue persists, it's likely a problem on our side. Unexpected token < in JSON at position 4. keyboard_arrow_up. content_copy. SyntaxError: Unexpected token < in JSON at position 4. Refresh. Explore and run machine learning code with Kaggle Notebooks | Using data from Divvy Chicago Bike-Sharing Data.