Search Video Evaluation Guidelines
Introduction
In this document, we explain relevance rating guidelines for video search on Apple TV.
If you are not familiar with the Apple TV app, please refer to https://www.apple.com/apple-tv-app/ for an
overview and basic information about this app.
The data we receive from you in the form of high quality relevance judgments will be used to build and improve
artificial intelligence systems such as search algorithms and machine learned rankers that power the user
experience for Apple TV users.
Our ultimate goal is to surprise and delight our customers by improving search quality and enhancing customer
satisfaction, and you play an important role in this.
Ask yourself “Why is this particular result returned for this query? Is this relevant?” Stay curious and do
thorough research to understand the relationship between a query and the result.
Your attention to detail, research and language skills as well as your cultural knowledge of the market are all
critical to the success of our projects.
Please keep in mind that your tasks will be spot-checked for quality, and measured against those of your peers.
If your accuracy rate is consistently high enough, you may be qualified to work as an auditor who gives
feedback to your peers.
https://baseline.apple.com/training/evaluations/6/guidelines 1/15
22/07/2021 Guidelines for Search - Video Training — BaseLine
Relevance Rating
In relevance rating, we start with a query and score the relevance of each returned result for that query. There are
two steps:
1. Determine the primary intent(s) of the query.
2. Use the rating guidelines to help rate each piece of content for the given query.
Note: The most likely intent of the query is predetermined for you using the query type field in the metadata
section.
If during your research you determine a different intent, please use that intent as the primary intent.
Note: It is very important to also consider secondary intent(s). If the content satisfies only a secondary intent,
it should receive a lower rating, e.g., a rating of Good or Acceptable and not a high score like Excellent.
Query classification
Each query is assigned a query type.
On the left hand side in BaseLine, inside the input metadata section you will find the classification of the query.
The classification is already set. You do not have to classify the query yourself.
Examples:
The query "Big" is classified as Ambiguous. The intent is not clear.
You should rate the content relevancy based on the classification of the query: how relevant is the content in
relation to the query and the query type?
Examples for each query type, and how to rate results, are included in this document.
Query research
We also expect you to use search engines and other supporting information and sites to help you understand
the intent of each query.
For your convenience, links to web entities based on the query are added on the left hand side of the page in
the input metadata section.
A click on the Google link will take you to a web search page: https://www.google.com/?#q=boomerang
A click on the Google Play link will take you to https://play.google.com/store/search?q=boomerang&c=movies&gl=us&hl=en
which is a search page within the Google Play store.
A click on the YouTube link will take you to a search page on YouTube.
A click on the IMDB link will take you to an IMDB search page where you may find more information about the content
related to the query.
In addition to query-based links, each result will also have links based on the title of the content presented.
Rating Guidelines
These are the options for rating a content result for a given query:
Perfect
Excellent
Good
Acceptable
Unacceptable: Off Topic
Problem: Other - A problem or technical issue with the task in BaseLine that made it impossible to judge
the result.
Note: Problem: Other should only be used if there is a problem or technical issue with BaseLine and the task
cannot be completed.
The Problem: Other rating should not be used if the content is unavailable in Apple TV but it is still possible to
determine relevance between input and output based on the information provided in BaseLine, and by
completing a side search on Google and/or on other platforms such as Amazon Prime or Netflix. Also note that
content/queries in all languages need to be evaluated based on relevance to the user’s intent and should not
be rated Problem: Other.
1. TV Navigational
2. Movie Navigational
query: [pixels]
Query type : Movie Navigational
result: Pixels https://tv.apple.com/us/movie/pixels/umc.cmc.2gkxzhbgkhr4r8wa58i3963ih
rating: → Perfect
reasoning: This is the intended movie
query: [elle]
Query type : Movie Navigational
result: Ellen DeGeneres: Here and Now https://tv.apple.com/us/movie/ellen-degeneres-here-and-now/umc.cmc.4f6jof371bhinq64ggq8nx55k
rating: → Good
reasoning: Content is not the intended movie. It is related to the query, is popular, and can serve as a secondary
intent.
query: [elle]
Query type : Movie Navigational
result: For Ellen https://itunes.apple.com/us/movie/for-ellen/id555244137
rating: → Acceptable
reasoning: Content is not the intended movie. It is related to the query but is dated.
query: [1917]
Query type : Movie Navigational
result: The Great War: 1917 - The Breaking of Armies. https://tv.apple.com/gb/movie/the-great-war-1917-the-breaking-of-armies/umc.cmc.5aj55fjymylpaommondrrdr9h
rating: → Good
reasoning: The primary intent is the recent movie 1917. However, in this case the content may satisfy a
secondary intent and may help the user discover new content that is related to the query.
Movie bundle
You may come across a movie bundle result.
If the bundle includes the primary intent movie, rate it as Excellent.
Note: There are quite a few cases where a query can be both Movie Navigational and TV Navigational.
For example: 'Tom and Jerry' can be a TV show https://itunes.apple.com/us/tv-season/tom-and-jerry-vol-1/id417551536
or a movie https://itunes.apple.com/us/movie/tom-and-jerry-the-movie/id545054826
We can choose only one classification. However, if you encounter results that are from the other
classification, the relevancy to the query is still very high and the rating should be Excellent.
3. Ambiguous
A query that is likely incomplete or is too generic to determine user intent.
A single primary intent is difficult to discern.
Please note that, due to the way users interact with our product, there are many cases where you will
encounter single- or double-character queries that are ambiguous.
Examples: [a], [fr], [n]
General guidelines:
On IMDB you can refer to the main section to get information about popularity of the content.
Here are a few examples of rating for highly ambiguous queries and reasoning for choosing the rating.
query: [a]
Query type : Ambiguous
result: For All Mankind. https://tv.apple.com/us/show/ncis/umc.cmc.3en7wd5upm1hx2sdbbr949kb
rating: → Acceptable
reasoning: There is low relevancy to the query for the token 'all', and maybe some users will not use the first
token 'for'. The content is recent and very popular on Apple TV.
If the content was not recent or not popular, the rating would be 'Unacceptable: Off-Topic'.
query: [n]
Query type : Ambiguous
result: NCIS. https://tv.apple.com/us/show/ncis/umc.cmc.3en7wd5upm1hx2sdbbr949kbj
rating: → Excellent
reasoning: Content title starts with n, which is relevant to the query. The content is popular and has a good
chance of satisfying primary user intent.
query: [n]
Query type : Ambiguous
result: Jumanji: The Next Level. https://tv.apple.com/us/movie/jumanji-the-next-level/umc.cmc.5s2gntehgm0y74ryjwlgyyji9
rating: → Good
reasoning: Even though the title does not start with n, the title contains the tokens 'The Next Level'. In this case
the content may be known by this name and can satisfy user intent.
Note: When analyzing the title tokens, you can ignore the word 'the'. Users may not use it when
interacting with search.
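The token reasoning used in these [n] examples can be sketched in a few lines of Python. This is only an illustrative sketch, not part of BaseLine: the function name `match_position` and its return labels are hypothetical, and the result is just one signal to weigh alongside popularity and recency.

```python
import re

def tokens(text):
    # Lowercase the text and split it into simple word tokens.
    return re.findall(r"[a-z0-9']+", text.lower())

def match_position(query, title):
    """Report where a (possibly partial) query matches a title.

    Returns "title_start" if the query is a prefix of the title,
    with or without the word 'the'; "later_token" if it only
    matches a token further into the title; otherwise "none".
    """
    q = " ".join(tokens(query))
    full = tokens(title)
    no_the = [t for t in full if t != "the"]  # 'the' may be ignored
    for toks in (full, no_the):
        if " ".join(toks).startswith(q):
            return "title_start"
    for toks in (full, no_the):
        for i in range(1, len(toks)):
            if " ".join(toks[i:]).startswith(q):
                return "later_token"
    return "none"

print(match_position("n", "NCIS"))
print(match_position("n", "The Nightmare Before Christmas"))
print(match_position("n", "Jumanji: The Next Level"))
```

A match at the start of the title points toward a higher rating and a match on a later token toward a lower one, but the final rating still depends on the rater's judgment of popularity and recency.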
query: [n]
Query type : Ambiguous
result: The Nightmare Before Christmas. https://tv.apple.com/us/movie/the-nightmare-before-christmas/umc.cmc.15sv1obrzlxhuf2xhjftbr6ab
rating: → Good
reasoning: By ignoring 'the', we can judge that the content may be relevant to the query. The content dates
to 1994 and is highly seasonal. However, this is a classic movie and could satisfy user intent or help with
discovery of content.
query: [n]
Query type : Ambiguous
result: Not Going Out. https://tv.apple.com/gb/show/not-going-out/umc.cmc.3kigc6dtw7wuakf16oy10qc1w
rating: → Good
reasoning: The title is relevant to the query. From a recency standpoint the show dates to 2016, but if you
check IMDB https://www.imdb.com/title/tt0862614/?ref_=fn_al_tt_1 you can see that the show is fairly
popular.
query: [the g]
Query type : Ambiguous
result: The Good Place. https://tv.apple.com/US/show/umc.cmc.361pp6dpt0jsmj9sxywuiw665
rating: → Excellent
reasoning: The title is relevant to the query. Content is recent and very popular, and can satisfy intent for a lot of
users.
query: [the g]
Query type : Ambiguous
result: The Game https://tv.apple.com/US/show/umc.cmc.3ug950auo9z3dnk7vf9fpxrry
rating: → Acceptable
reasoning: Relevant to the query. However, this is a reality show that ended in 2015 and is no longer popular.
query: [the]
Query type : Ambiguous
result: Pirates of the Caribbean https://tv.apple.com/us/movie/pirates-of-the-caribbean-on-stranger-tides/umc.cmc.129m5yvtfnfxv2syud88styqv
rating: → Unacceptable: Off-Topic
reasoning: Relevancy to the query is low. It matches a token in the middle of the title. You would expect users to
use the token 'pirate' to look for this movie. Even though the content is very popular, the low relevancy merits a
rating of Unacceptable: Off-Topic.
query: [Big]
Query type : Ambiguous
result: Big Little Lies https://tv.apple.com/us/show/big-little-lies/umc.cmc.5kua2qrt76zvlhbyfzwkfhdkv
rating: → Excellent
reasoning: Content is recent and very popular, and can satisfy intent for a lot of users.
query: [Big]
Query type : Ambiguous
result: Big https://tv.apple.com/us/movie/big/umc.cmc.63shspjw8f8vnl45r7zw9i3jx
rating: → Excellent
reasoning: Exact match with the query, well-known movie, can satisfy intent for a lot of users.
query: [Big]
Query type : Ambiguous
result: The Big Lebowski https://tv.apple.com/us/movie/the-big-lebowski/umc.cmc.3ev2yyf8c4chboxiibx165ysv
rating: → Good
reasoning: Content is popular, even though it’s old. Can satisfy some users’ primary or secondary intent.
query: [age]
query: [grey]
Query type : Ambiguous
result: Grey's Anatomy: Cast at PaleyFest https://tv.apple.com/US/movie/umc.cmc.2v4p3povgbisoch70y5nlfeng
rating: → Acceptable
reasoning: There is high relevancy to the query. However, this content shows cast members of Grey’s Anatomy gathering for
a celebration of their show and is not the TV show itself. It may satisfy query intent and help with discovery.
4. Genre
Examples
query: [kids]
result: Frozen https://tv.apple.com/us/movie/frozen/umc.cmc.4b17gber8k76h90rzlulvrbcl
rating: → Excellent
reasoning: This is a popular kids movie, very likely to satisfy the user intent
query: [kids]
result: Maleficent: Mistress of Evil https://tv.apple.com/us/movie/maleficent-mistress-of-evil/umc.cmc.2j3ekl3vvh8ihkbzgbdrfkwxn
rating: → Good
reasoning: Recent content that fits the genre
query: [kids]
result: JESSIE https://tv.apple.com/us/show/jessie/umc.cmc.4o2zgz5oz6cb6g24o04q4t0d2
rating: → Acceptable
reasoning: Kids related but dated content.
General guideline:
The person has to be associated with the movie or TV show based on the query type.
If this is the case the rating should be Excellent.
If the person is not associated with the movie or TV show, it should be rated as Unacceptable.
If the person is associated with the movie but in a different function than the query type, rate it as
Acceptable.
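The three rules above can be expressed as a small decision function. This is a minimal sketch under stated assumptions: the function name `rate_person_query`, the role labels, and the idea of passing the person's roles as a set are all hypothetical and are not something BaseLine provides.

```python
def rate_person_query(query_role, roles_in_content):
    """Apply the person-association rules.

    query_role: the function named by the query type, e.g. "actor".
    roles_in_content: set of functions the person has in the result
        (an empty set if the person is not associated with it).
    """
    if not roles_in_content:
        return "Unacceptable"  # person not associated with the content
    if query_role in roles_in_content:
        return "Excellent"     # associated in the function the query asks for
    return "Acceptable"        # associated, but in a different function

# Hypothetical inputs, for illustration only.
print(rate_person_query("actor", {"actor"}))
print(rate_person_query("actor", {"director"}))
print(rate_person_query("actor", set()))
```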
query: [leonardo dicaprio]
Query type : Actor/Actress
result: Catch Me If You Can https://tv.apple.com/ee/movie/catch-me-if-you-can/umc.cmc.4p199eelws0o5vx6uwrsendqv
rating: → Excellent
reasoning: The leading actor in the movie
6. Channel
General guideline:
The Movie / TV show has to be associated with the channel
If the content is produced by the channel the rating should be Excellent or Good, based on recency and
popularity of the content.
If the content is not produced by the channel but is available to watch on the channel the rating should be
Acceptable.
Use popularity and recency signals to determine how well the content may satisfy user’s intent.
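As a rough illustration, the channel rules can be written as a decision function. This is a sketch under assumptions: the function name and boolean inputs are hypothetical, the guideline does not say exactly how recency and popularity split Excellent from Good (the sketch assumes both are needed for Excellent), and it returns `None` for content with no association because this section does not name a rating for that case.

```python
def rate_channel_result(produced_by_channel, available_on_channel,
                        recent, popular):
    """Sketch of the channel rules; all inputs are rater judgments."""
    if produced_by_channel:
        # Assumption: Excellent needs both signals, otherwise Good.
        return "Excellent" if (recent and popular) else "Good"
    if available_on_channel:
        return "Acceptable"
    # Not covered explicitly by the channel rules above.
    return None

print(rate_channel_result(True, True, recent=True, popular=True))
print(rate_channel_result(False, True, recent=True, popular=True))
```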
query: [hbo]
Query type : Channel
result: My Brilliant Friend https://tv.apple.com/us/show/my-brilliant-friend/umc.cmc.5j7epmrl5koccd8wdvirhzf0x
rating: → Excellent
reasoning: A popular and recent TV series produced by HBO.
query: [hulu]
Query type : Channel
result: Family Guy https://tv.apple.com/US/show/umc.cmc.19tw0rz87wtuskutsuzcqo8z9
rating: → Acceptable
reasoning: Family Guy is not produced by Hulu but is available to watch on Hulu.
7. Studio
Studio: A query that refers to the studio where the movie was made.
Examples: [Ghibli], [Warner Bros. Pictures], [Columbia Pictures]
If the content is associated with the studio, the rating should be Excellent.
If the content is not associated with the studio, it should be rated as Unacceptable: Off-Topic.
Use popularity and recency to determine how well the content may satisfy user’s intent.
8. Year/decade
Year/decade: A query that refers to a year or decade when the content was produced.
Examples: [1980s movies], [2000s TV shows]
If the content was produced in the time frame referred to by the query, rate it as Excellent.
If the content was not produced in the time frame referred to by the query, rate it as Unacceptable: Off-Topic.
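The baseline time-frame check for decade queries can be sketched as follows. This is an illustrative sketch only: the helper names are hypothetical, it handles only decade forms such as "1980s" or "90s", it assumes a two-digit decade means the 1900s, and it returns `None` for anything else (including a bare year) so that the rater decides. The examples in this section show that title matches and popularity can still move the final rating.

```python
import re

def decade_range(query):
    """Return (first_year, last_year) for a decade query, else None."""
    m = re.search(r"\b(\d{4}|\d{2})s\b", query.lower())
    if not m:
        return None
    n = int(m.group(1))
    if n < 100:
        n += 1900  # assumption: "90s" means the 1990s
    start = n - n % 10
    return start, start + 9

def rate_by_decade(query, release_year):
    """Baseline rating from the time-frame rule; None if undecided."""
    span = decade_range(query)
    if span is None:
        return None
    start, end = span
    if start <= release_year <= end:
        return "Excellent"
    return "Unacceptable: Off-Topic"

print(rate_by_decade("90s", 1993))
print(rate_by_decade("1980s movies", 1993))
```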
query: [1990]
Query type : Year/Decade
result: The 1990s: The Deadliest Decade https://tv.apple.com/us/show/the-1990s-the-deadliest-decade/umc.cmc.b7riszeu5t0aticjafdjgbq4
rating: → Good
reasoning: Content is related to the 90’s
query: [1990]
Query type : Year/Decade
result: The Nightmare Before Christmas https://tv.apple.com/us/movie/the-nightmare-before-christmas/umc.cmc.15sv1obrzlxhuf2xhjftbr6ab
rating: → Good
reasoning: Content was produced in the 90's. Content is not popular but seasonal. (Can be rated as Good or
Excellent.)
query: [1990]
Query type : Year/Decade
result: 1990: The Bronx Warriors https://itunes.apple.com/us/movie/1990-the-bronx-warriors/id1278365874
rating: → Acceptable
reasoning: Content was released in 1983. It could satisfy a secondary intent.
Query: [90s]
Query Type: Year/Decade
Result: Hocus Pocus https://tv.apple.com/US/movie/umc.cmc.bse02yxtv1oix30bbl0nn5h6
Rating: Excellent
Reason: Content was released in 1993 and is highly popular, clearly satisfying the intent.
9. Country of origin
Country of origin: A query that refers to where the content was made.
Examples: [Italian movies],[British movies], [Indian tv shows]
If the content was produced in the country referred to by the query, rate it as Excellent.
If the content was not produced in the country referred to by the query, rate it as Unacceptable: Off-Topic.
query: [Hindi]
Query type : Country of origin
result: Thappad https://tv.apple.com/us/movie/thappad/umc.cmc.3u29p663cbldrffgxs5zb0niv
rating: → Excellent
reasoning: Content is associated with India as country of origin.
Note: Please bear in mind that some scenarios might not be that clear-cut. When in doubt, use your best
judgment to determine the best rating. For example:
query: [irish]
Query type : Country of origin
result: The Irish Mob https://tv.apple.com/us/show/the-irish-mob/umc.cmc.3omp0krrqahiy0s8uirwml9rs
rating: → Excellent
reasoning: Content did not originate in Ireland, but there is strong relevancy to the query and it will likely satisfy
user intent.
10. Awards
If the content is associated with the award, either by winning or by nomination, rate it as Excellent.
If the content is not associated with the award, rate it as Unacceptable: Off-Topic.
11. Topic
Do not use the Perfect rating for this type of query. The highest rating for this query type is Excellent.
It does not matter whether the result is a movie or a TV show.
Use popularity and recency to determine how well the content may satisfy the user’s intent.
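The rating ceiling can be made concrete with an ordered scale and a cap. This is a minimal sketch, not a BaseLine feature: the `Rating` enum, `CAPPED_QUERY_TYPES`, and `cap_rating` are hypothetical names, and the set lists only the query types for which these guidelines explicitly name Excellent as the highest rating.

```python
from enum import IntEnum

class Rating(IntEnum):
    # Ordered from lowest to highest relevance.
    UNACCEPTABLE_OFF_TOPIC = 0
    ACCEPTABLE = 1
    GOOD = 2
    EXCELLENT = 3
    PERFECT = 4

# Query types whose highest allowed rating is Excellent.
CAPPED_QUERY_TYPES = {"Topic", "Character", "Song/Soundtrack"}

def cap_rating(query_type, proposed):
    """Lower a proposed rating to Excellent when the query type forbids Perfect."""
    if query_type in CAPPED_QUERY_TYPES:
        return min(proposed, Rating.EXCELLENT)
    return proposed

print(cap_rating("Topic", Rating.PERFECT).name)
print(cap_rating("Movie Navigational", Rating.PERFECT).name)
```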
query: [yoga workout]
Query type : Topic
result: Total body yoga https://tv.apple.com/us/show/total-body-yoga-workouts-for-weight-loss--strength/umc.cmc.k2hzp2q9fcjkchnbwfbw6eeg
rating: → Excellent
reasoning: Content is related to Yoga
query: [food]
Query type : Topic
result: Diners, Drive-Ins, and Dives https://tv.apple.com/us/show/diners-drive-ins-and-dives-triple-d-nation/umc.cmc.3rlmqzdt5ekp47bsi67at6o25
rating: → Excellent
reasoning: Content is related to food and is popular
query: [food]
Query type : Topic
result: The Pioneer Woman. https://tv.apple.com/us/show/the-pioneer-woman/umc.cmc.6rk3jf42qkvfd7iulxceenggd
rating: → Good
reasoning: Content is related to food but is a bit dated
query: [food]
Query type : Topic
result: Food Wars! https://tv.apple.com/us/show/food-wars/umc.cmc.2j9k6bad98x5qu1igbxb6p4gh
rating: → Acceptable
reasoning: It is not likely that this content is related to the topic food. It is associated mainly through the query term but
most likely will not satisfy user intent.
12. Character
Do not use the Perfect rating for this type of query. The highest rating for this query type is Excellent.
It does not matter whether the result is a movie or a TV show.
Use popularity and recency to determine how well the content may satisfy the user’s intent.
query: [superman]
Query type : Character
result: Superman: The Movie https://tv.apple.com/gr/movie/superman/umc.cmc.5qkqo6kcwxnno7lex3a2wmlal
rating: → Excellent
reasoning: Highly relevant to the character
13. Song/Soundtrack
Do not use the Perfect rating for this type of query. The highest rating for this query type is Excellent.
A query that refers to a broad set of movies or TV shows and does not fit in one of the categories.
Examples: [Free shows], [Popular movies], [Comedy movies from 2005], [funny shows], [popular tv shows]
Do not use the Perfect rating for this type of query. The highest rating for this query type is Excellent.
Use popularity and recency to determine how well the content may satisfy the user’s intent.
Person
Cast & Crew member that is related to the query.
The content is labeled as 'person'.
This type of content can appear for any query type.
Rating should reflect relevancy to the query and the popularity of the person.
rating: → Good
reasoning: Relevant to the query, fairly recent. Not so popular.
Brand
Channel or media publisher that is related to the query.
The content is labeled as 'Brand'.
This type of content can appear for any query type.
Sporting events
Sporting event that is related to the query.
The content is labeled as 'Sporting event'.
This type of content can appear for any query type.
Note: Since sporting events come and go, you may encounter tasks where the content is no longer
available.
For those cases, rate it as if the content is available and use the relevancy of the information that is presented
(title and other metadata).
Examples:
query: [ma]
Query type : Ambiguous
result1: Orlando Magic at San Antonio Spurs
or
result2: Miami Marlins at Philadelphia Phillies
rating: → Acceptable
reasoning: For highly ambiguous queries, when there is a partial match on the team name, either the city or
the name, the rating should be Acceptable.
query: [two]
Query type : Ambiguous
result: LA Galaxy II vs. San Diego Loyal SC
rating: → Unacceptable: Off-Topic
reasoning: There is no relevancy between the result and the query.
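The partial-match reasoning in these two sporting-event examples can be sketched as a simple prefix check over the words of the event title. This is an illustrative sketch for highly ambiguous queries only: the function name and the `CONNECTORS` set are hypothetical, and skipping the words "at" and "vs" is an assumption so that only city and team words are compared.

```python
import re

# Words that join the two teams and are not part of a team name (assumption).
CONNECTORS = {"at", "vs"}

def rate_sporting_event(query, title):
    """Acceptable when the query is a prefix of a city or team word."""
    q = query.strip().lower()
    words = [w for w in re.findall(r"[a-z0-9]+", title.lower())
             if w not in CONNECTORS]
    if q and any(w.startswith(q) for w in words):
        return "Acceptable"
    return "Unacceptable: Off-Topic"

print(rate_sporting_event("ma", "Orlando Magic at San Antonio Spurs"))
print(rate_sporting_event("ma", "Miami Marlins at Philadelphia Phillies"))
```

Here [ma] matches "Magic" and "Marlins", while a query such as [two] matches no word in a title like the LA Galaxy example and falls through to Unacceptable: Off-Topic.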
Adult Content
We assume that all adult or pornographic content is Unacceptable: Off-Topic for non-adult and non-porn
queries. As an example, any adult or porn content is Off-Topic for the query [drama] because the user has not
expressly indicated that they want content of this type.
On the other hand, if the query is [erotica] or [porn] any adult or porn content is relevant.
When rating results for such cases, the translated query should be used as the query, and relevance should be
attributed to both the query and the translated query.
In some cases you will find the language information in the metadata section of the query.
Query: rapido y
Query Type: Movie Navigational
Language: Spanish
Translation: fast and
Result: Furious 7 https://tv.apple.com/us/movie/furious-7/umc.cmc.2qd0f0y7hlhotepuixrs5d3hf
Rating: Excellent
Reason: Query is in Spanish ('fast and'). Intent is a movie in the 'Fast and Furious' series.
For example:
Query: 卧虎藏龙
Query Type: Movie Navigational
Language: Chinese
Translation: Crouching Tiger, Hidden Dragon
Result: Crouching Tiger, Hidden Dragon https://tv.apple.com/us/movie/crouching-tiger-hidden-
dragon/umc.cmc.1qyopok7zvm57lvbxa2oz4ibq
Rating: Perfect
Reason: The movie is the primary intent.