0% found this document useful (0 votes)
107 views

Create ML Models With BigQuery ML Challenge Lab

.........

Uploaded by

vieirajuh959
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
107 views

Create ML Models With BigQuery ML Challenge Lab

.........

Uploaded by

vieirajuh959
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 3

Create ML Models with BigQuery ML:

Challenge Lab
Task 1. Create a dataset to store your machine learning
models
1. To create a dataset, click on the View actions icon next to your project ID and select
Create dataset.

2. Next, name your Dataset ID QUICK_GCP_LAB and click Create dataset.

Task 2. Create a forecasting BigQuery machine learning


model
CREATE OR REPLACE MODEL QUICK_GCP_LAB.location_model
OPTIONS
(model_type='linear_reg', labels=['duration_minutes']) AS
SELECT
start_station_name,
EXTRACT(HOUR FROM start_time) AS start_hour,
EXTRACT(DAYOFWEEK FROM start_time) AS day_of_week,
duration_minutes,
address as location
FROM
`bigquery-public-data.austin_bikeshare.bikeshare_trips` AS trips
JOIN
`bigquery-public-data.austin_bikeshare.bikeshare_stations` AS stations
ON
trips.start_station_name = stations.name
WHERE
EXTRACT(YEAR FROM start_time) = [TRAINING YEAR]
AND duration_minutes > 0

Task 3. Create the second machine learning model


CREATE OR REPLACE MODEL QUICK_GCP_LAB.subscriber_model
OPTIONS
(model_type='linear_reg', labels=['duration_minutes']) AS
SELECT
start_station_name,
EXTRACT(HOUR FROM start_time) AS start_hour,
subscriber_type,
duration_minutes
FROM `bigquery-public-data.austin_bikeshare.bikeshare_trips` AS trips
WHERE EXTRACT(YEAR FROM start_time) = [TRAINING YEAR]
Task 4. Evaluate the two machine learning models
Evaluation metrics for QUICK_GCP_LAB_model

SELECT
SQRT(mean_squared_error) AS rmse,
mean_absolute_error
FROM
ML.EVALUATE(MODEL QUICK_GCP_LAB.location_model, (
SELECT
start_station_name,
EXTRACT(HOUR FROM start_time) AS start_hour,
EXTRACT(DAYOFWEEK FROM start_time) AS day_of_week,
duration_minutes,
address as location
FROM
`bigquery-public-data.austin_bikeshare.bikeshare_trips` AS trips
JOIN
`bigquery-public-data.austin_bikeshare.bikeshare_stations` AS stations
ON
trips.start_station_name = stations.name
WHERE EXTRACT(YEAR FROM start_time) = [EVALUATION YEAR] )
)

Evaluation metrics for QUICK_GCP_LAB_subscriber_model

SELECT
SQRT(mean_squared_error) AS rmse,
mean_absolute_error
FROM
ML.EVALUATE(MODEL `QUICK_GCP_LAB.subscriber_model`, (
SELECT
start_station_name,
EXTRACT(HOUR FROM start_time) AS start_hour,
subscriber_type,
duration_minutes
FROM
`bigquery-public-data.austin_bikeshare.bikeshare_trips` AS trips
WHERE
EXTRACT(YEAR FROM start_time) = [EVALUATION YEAR] )
)
Task 5. Use the subscriber type machine learning
model to predict average trip durations
SELECT
start_station_name,
COUNT(*) AS trips
FROM
`bigquery-public-data.austin_bikeshare.bikeshare_trips`
WHERE
EXTRACT(YEAR FROM start_time) = [EVALUATION YEAR]
GROUP BY
start_station_name
ORDER BY
trips DESC

SELECT AVG(predicted_duration_minutes) AS average_predicted_trip_length


FROM ML.predict(MODEL QUICK_GCP_LAB.subscriber_model, (
SELECT
start_station_name,
EXTRACT(HOUR FROM start_time) AS start_hour,
subscriber_type,
duration_minutes
FROM
`bigquery-public-data.austin_bikeshare.bikeshare_trips`
WHERE
EXTRACT(YEAR FROM start_time) = [EVALUATION YEAR]
AND subscriber_type = 'Single Trip'
AND start_station_name = '21st & Speedway @PCL'))

You might also like