YouTube API

In this note, we practice extracting data with the YouTube API. Many tutorials on the Internet explain how to use the YouTube API, some with simple worked examples, and it is strongly suggested that you search for them to learn more. According to these tutorials, you need an API key to use the YouTube API. To get one, you need to create (or select) a project on Google Cloud Platform. Log in with your Google account to reach the Google Cloud Platform console page.

Click the My Project icon and you will see the dialog below. You can select an existing project or create a new one.

knitr::include_graphics("MyProject.png")

Once you have selected or created a project, you will be taken to the page below.

knitr::include_graphics("api1.png")

Click ENABLE APIS AND SERVICES, select YouTube Data API v3, and enable it. Finally, create a credential to obtain the API key. Once you have your API key, you can save it in a file for future use.

knitr::include_graphics("youtubeapiv3.png")
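Saving the key in a file and reading it back can be sketched as follows. This is only a minimal illustration: the file name youtube_api_key.txt, the helper read_api_key(), and the placeholder key string are assumptions made for the example, and the temporary directory is used only so that the sketch runs anywhere.

```r
# Sketch: keep the API key in a plain-text file and read it back when needed.
# File name, helper name, and key string are placeholders for illustration.
key_file <- file.path(tempdir(), "youtube_api_key.txt")

# Save the key once (replace the placeholder with your real key).
writeLines("YOUR_API_KEY", key_file)

# Read the key in a later session; trimws() drops stray spaces or newlines.
read_api_key <- function(path) {
  trimws(readLines(path, n = 1, warn = FALSE))
}

APIkey <- read_api_key(key_file)
```

In practice the file would live somewhere permanent that is not shared or committed to version control, so that the key stays private.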

Collect Data via YouTube API

All YouTube API requests start with the base URL https://www.googleapis.com/youtube/v3/. To collect the data of a single video, we call https://www.googleapis.com/youtube/v3/videos?id=[videoId]&key=[your_YouTube_API_key]&part=snippet. The videoId is the id of the video. For example, the id of one video on the channel MASA’s Cooking ABC (https://www.youtube.com/watch?v=Gm4STFk62iU) is Gm4STFk62iU. The key is your API key. The argument part is set to snippet, which returns some basic information about the video, such as the title, the posted time, the description, and so on. The following code gets the snippet information of a video on MASA’s Cooking ABC.

url<-"https://www.googleapis.com/youtube/v3/videos?id=Gm4STFk62iU&key=[your YouTube API key]&part=snippet"
text<-suppressWarnings(readLines(url))

The collected data are shown below.

{
  "kind": "youtube#videoListResponse",
  "etag": "6Ay1WlbUTPLhdn_wH8dCLzgrMz4",
  "items": [
    {
      "kind": "youtube#video",
      "etag": "v7YxLcR8L68C4NQ_9NhNbAUHuGI",
      "id": "Gm4STFk62iU",
      "snippet": {
        "publishedAt": "2021-11-26T12:30:06Z",
        "channelId": "UCr90FXGOO8nAE9B6FAUeTNA",
        "title": "Presented by 3M百利-辣味噌拌炒豬肉/Stir fried Spicy Miso Pork |MASAの料理ABC",
        "description": "◆MASA頻道訂閱↓ http://www.youtube.com/channel/UCr90F...  \n◆食譜↓  \n做料理的過程不是只有切菜加熱食材  \n還要包括料理中使用的所有東西都  \n清乾淨才完成料理程序  \n尤其是洗碗會影響到鍋具跟吃得健康!  \n好好了解怎麼洗東西效率才會好  \n也可以保護自己跟鍋具  \n這次跟大家介紹非常好用的  \n3M百利防刮細緻黃色菜瓜布 \n獨特纖維粒子能溫柔的把怕刮的平底鍋、玻璃杯 \n有效洗乾淨又不刮傷器具 \n讓我們一起來看看吧! (๑˃̵ᴗ˂̵)و \n\n辣味噌拌炒豬肉/豚のピリ辛味噌炒め \n [1人份]  \n五花肉 Sliced pork—150g  \n高麗菜 Cabbage—80g  \n洋蔥 Onion—1/4個  \n生辣椒 Chili pepper—1支  \n青椒 Green pepper—1/2個  \n紅蘿蔔 Carrot—20g  \n蒜泥 Garlic paste—1個分  \n薑泥 Ginger paste—1小匙  \n\n清酒 Sake—2大匙  \n味淋 Mirin—1大匙  \n砂糖 Sugar—1小匙  \n醬油 Soy sauce—2小匙  \n味噌 Miso—1大匙  \n鹽巴&黑胡椒 Salt&Pepper  \n\n*料理の名前&作り方はあくまでも自己流なのでご了承くださいw  \n\n*料理名稱&做法不一定正式or傳統, 是從自己的想法&經驗來分享的 請各位事先諒解。m( _ _ )m  \n*歡迎合作/Contact:bigway1688@gmail.com  \n*Instagram:https://www.instagram.com/masa_cookin...  \n*MASA Facebook:https://www.facebook.com/masa.abc \n*食譜書:https://www.masa.tw/masas-book  \n\n●沙拉油:泰山均衡369健康調合油 TAISUN 369 Blend Oil  \n●鍋具:THERMOS 膳魔師厚鑄耐摩不沾鍋20cm, 24cm  \n●鍋具:THERMOS 蘋果原味鍋單柄湯鍋18cm  \n●BGM:  \n甘茶の音楽工房 \nMusMus  \nOtoLogic  \nポケットサウンド  \nPremiumBeat:\n-Studio Le Bus\n-Joe Sacco   \n-Kyon   \n-Smithereens \n\n#3M \n#百利菜瓜布\n#防刮細緻黃色菜瓜布\n#清潔\n#呵護最愛不留傷害",
        "thumbnails": {
          "default": {
            "url": "https://i.ytimg.com/vi/Gm4STFk62iU/default.jpg",
            "width": 120,
            "height": 90
          },
          "medium": {
            "url": "https://i.ytimg.com/vi/Gm4STFk62iU/mqdefault.jpg",
            "width": 320,
            "height": 180
          },
          "high": {
            "url": "https://i.ytimg.com/vi/Gm4STFk62iU/hqdefault.jpg",
            "width": 480,
            "height": 360
          },
          "standard": {
            "url": "https://i.ytimg.com/vi/Gm4STFk62iU/sddefault.jpg",
            "width": 640,
            "height": 480
          },
          "maxres": {
            "url": "https://i.ytimg.com/vi/Gm4STFk62iU/maxresdefault.jpg",
            "width": 1280,
            "height": 720
          }
        },
        "channelTitle": "MASAの料理ABC",
        "categoryId": "26",
        "liveBroadcastContent": "none",
        "defaultLanguage": "zh-TW",
        "localized": {
          "title": "Presented by 3M百利-辣味噌拌炒豬肉/Stir fried Spicy Miso Pork |MASAの料理ABC",
          "description": "◆MASA頻道訂閱↓ http://www.youtube.com/channel/UCr90F...  \n◆食譜↓  \n做料理的過程不是只有切菜加熱食材  \n還要包括料理中使用的所有東西都  \n清乾淨才完成料理程序  \n尤其是洗碗會影響到鍋具跟吃得健康!  \n好好了解怎麼洗東西效率才會好  \n也可以保護自己跟鍋具  \n這次跟大家介紹非常好用的  \n3M百利防刮細緻黃色菜瓜布 \n獨特纖維粒子能溫柔的把怕刮的平底鍋、玻璃杯 \n有效洗乾淨又不刮傷器具 \n讓我們一起來看看吧! (๑˃̵ᴗ˂̵)و \n\n辣味噌拌炒豬肉/豚のピリ辛味噌炒め \n [1人份]  \n五花肉 Sliced pork—150g  \n高麗菜 Cabbage—80g  \n洋蔥 Onion—1/4個  \n生辣椒 Chili pepper—1支  \n青椒 Green pepper—1/2個  \n紅蘿蔔 Carrot—20g  \n蒜泥 Garlic paste—1個分  \n薑泥 Ginger paste—1小匙  \n\n清酒 Sake—2大匙  \n味淋 Mirin—1大匙  \n砂糖 Sugar—1小匙  \n醬油 Soy sauce—2小匙  \n味噌 Miso—1大匙  \n鹽巴&黑胡椒 Salt&Pepper  \n\n*料理の名前&作り方はあくまでも自己流なのでご了承くださいw  \n\n*料理名稱&做法不一定正式or傳統, 是從自己的想法&經驗來分享的 請各位事先諒解。m( _ _ )m  \n*歡迎合作/Contact:bigway1688@gmail.com  \n*Instagram:https://www.instagram.com/masa_cookin...  \n*MASA Facebook:https://www.facebook.com/masa.abc \n*食譜書:https://www.masa.tw/masas-book  \n\n●沙拉油:泰山均衡369健康調合油 TAISUN 369 Blend Oil  \n●鍋具:THERMOS 膳魔師厚鑄耐摩不沾鍋20cm, 24cm  \n●鍋具:THERMOS 蘋果原味鍋單柄湯鍋18cm  \n●BGM:  \n甘茶の音楽工房 \nMusMus  \nOtoLogic  \nポケットサウンド  \nPremiumBeat:\n-Studio Le Bus\n-Joe Sacco   \n-Kyon   \n-Smithereens \n\n#3M \n#百利菜瓜布\n#防刮細緻黃色菜瓜布\n#清潔\n#呵護最愛不留傷害"
        },
        "defaultAudioLanguage": "zh-Hant"
      }
    }
  ],
  "pageInfo": {
    "totalResults": 1,
    "resultsPerPage": 1
  }
}

The returned data are a character vector with one element per line of the response, in which the description of this video is under the node description. Similarly, you can easily find the title of this video and the published time. However, the snippet contains no counts of likes, dislikes, or views. If we want these pieces of information, we need to set the argument part=statistics.

url<-"https://www.googleapis.com/youtube/v3/videos?id=Gm4STFk62iU&key=[your YouTube API key]&part=statistics"
text<-suppressWarnings(readLines(url))

The returned data are again a character vector. The viewCount, likeCount, dislikeCount, and commentCount can be easily retrieved from the node “items”.

 [1] "{"                                               
 [2] "  \"kind\": \"youtube#videoListResponse\","      
 [3] "  \"etag\": \"-I-aoVp12s55g6YGMh8b9uwftH4\","    
 [4] "  \"items\": ["                                  
 [5] "    {"                                           
 [6] "      \"kind\": \"youtube#video\","              
 [7] "      \"etag\": \"iq1ExK6MuMpgWw53DNWEb4p_6lc\","
 [8] "      \"id\": \"Gm4STFk62iU\","                  
 [9] "      \"statistics\": {"                         
[10] "        \"viewCount\": \"26017\","               
[11] "        \"likeCount\": \"1102\","                
[12] "        \"dislikeCount\": \"9\","                
[13] "        \"favoriteCount\": \"0\","               
[14] "        \"commentCount\": \"37\""                
[15] "      }"                                         
[16] "    }"                                           
[17] "  ],"                                            
[18] "  \"pageInfo\": {"                               
[19] "    \"totalResults\": 1,"                        
[20] "    \"resultsPerPage\": 1"                       
[21] "  }"                                             
[22] "}"
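Retrieving the counts from such a character vector can be sketched as below. The helper get_field() is a hypothetical name introduced for this example, and the sample lines are copied from the statistics part of the response above so that the sketch runs without a network call. Looking a field up by name rather than by line position is a design choice: a field may be missing from a response (YouTube stopped returning dislikeCount publicly in December 2021), in which case the helper returns NA instead of picking up the wrong line.

```r
# Lines copied from the response shown above; they stand in for readLines(url).
text <- c(
  '      "statistics": {',
  '        "viewCount": "26017",',
  '        "likeCount": "1102",',
  '        "dislikeCount": "9",',
  '        "favoriteCount": "0",',
  '        "commentCount": "37"',
  '      }'
)

# Hypothetical helper: return the value of a named field, or NA if absent.
get_field <- function(lines, name) {
  # Match the quoted key followed by a colon, e.g. "viewCount":
  hit <- grep(paste0('"', name, '":'), lines, fixed = TRUE, value = TRUE)
  if (length(hit) == 0) return(NA_character_)
  # Splitting the line on double quotes leaves the value in the 4th piece.
  strsplit(hit[1], '"', fixed = TRUE)[[1]][4]
}

fields <- c("viewCount", "likeCount", "dislikeCount", "commentCount")
counts <- as.numeric(sapply(fields, get_field, lines = text))
names(counts) <- fields
counts
```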

What if we want to analyze the videos on the playlist of a channel? First, we need the id of the channel from which we want to collect data. To this end, the request changes to the format https://www.googleapis.com/youtube/v3/channels?key=[your_YouTube_API_key]&forUsername=[the_name_of_owner_of_the_channel]&part=id.

url<-"https://www.googleapis.com/youtube/v3/channels?key=[your_YouTube_API_key]&forUsername=[the_name_of_owner_of_the_channel]&part=id"
text<-suppressWarnings(readLines(url))

The returned contents are shown below. The id of this channel can be found under the node id.

 [1] "{"                                               
 [2] "  \"kind\": \"youtube#channelListResponse\","    
 [3] "  \"etag\": \"cmDV0YZ5NiFOFK-4EBp-uef8e5w\","    
 [4] "  \"pageInfo\": {"                               
 [5] "    \"totalResults\": 1,"                        
 [6] "    \"resultsPerPage\": 5"                       
 [7] "  },"                                            
 [8] "  \"items\": ["                                  
 [9] "    {"                                           
[10] "      \"kind\": \"youtube#channel\","            
[11] "      \"etag\": \"OCqStZdithPF3nxtwpEGGY2C2_Q\","
[12] "      \"id\": \"UCr90FXGOO8nAE9B6FAUeTNA\""      
[13] "    }"                                           
[14] "  ]"                                             
[15] "}"
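The requests so far differ only in the endpoint (videos, channels) and in the query parameters, so the URL can be assembled by a small helper. The following is a sketch, not part of the YouTube API itself: the function name yt_url() is an assumption made for this example. Values are passed through utils::URLencode() with reserved = TRUE, which percent-encodes characters such as "&" and ","; the server should decode these as usual.

```r
# Hypothetical helper: build a YouTube Data API v3 request URL
# from an endpoint name and a named list of query parameters.
yt_url <- function(endpoint, params) {
  base <- "https://www.googleapis.com/youtube/v3/"
  # Percent-encode each value so that special characters cannot break the query.
  values <- vapply(
    params,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1)
  )
  query <- paste0(names(params), "=", values, collapse = "&")
  paste0(base, endpoint, "?", query)
}

# The key below is a placeholder, not a working API key.
video_url <- yt_url("videos",
                    list(id = "Gm4STFk62iU", key = "YOUR_API_KEY", part = "snippet"))
channel_url <- yt_url("channels",
                      list(key = "YOUR_API_KEY", id = "UCr90FXGOO8nAE9B6FAUeTNA", part = "id"))
```

Either URL can then be passed to readLines() exactly as in the examples above.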

With the channel id, we can find the id of the channel’s uploads playlist, which is under the node uploads.

url<-paste0("https://www.googleapis.com/youtube/v3/channels?key=[your_APIkey]",
            "&part=contentDetails&id=UCr90FXGOO8nAE9B6FAUeTNA")
text<-suppressWarnings(readLines(url))

The results are shown below. The id of the uploads playlist is UUr90FXGOO8nAE9B6FAUeTNA.

 [1] "{"                                                  
 [2] "  \"kind\": \"youtube#channelListResponse\","       
 [3] "  \"etag\": \"m4wCO8qKhe4QA5wuG8szgPkX9EY\","       
 [4] "  \"pageInfo\": {"                                  
 [5] "    \"totalResults\": 1,"                           
 [6] "    \"resultsPerPage\": 5"                          
 [7] "  },"                                               
 [8] "  \"items\": ["                                     
 [9] "    {"                                              
[10] "      \"kind\": \"youtube#channel\","               
[11] "      \"etag\": \"mMEXLcRbSxsOCaZZDZC7Bzq1suE\","   
[12] "      \"id\": \"UCr90FXGOO8nAE9B6FAUeTNA\","        
[13] "      \"contentDetails\": {"                        
[14] "        \"relatedPlaylists\": {"                    
[15] "          \"likes\": \"\","                         
[16] "          \"uploads\": \"UUr90FXGOO8nAE9B6FAUeTNA\""
[17] "        }"                                          
[18] "      }"                                            
[19] "    }"                                              
[20] "  ]"                                                
[21] "}" 

Now we can get the videos on the playlist. The request URL is built by the following code, and the returned information is saved in the variable text.v. A single request returns at most 50 items; maxResults cannot exceed this limit, and here we ask for 20.

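# PlaylistID and APIkey are assumed to be defined beforehand, e.g.
# PlaylistID<-"UUr90FXGOO8nAE9B6FAUeTNA"; APIkey<-"[your YouTube API key]"
urlv<-"https://www.googleapis.com/youtube/v3/playlistItems?"
urlv<-paste0(urlv,"part=snippet,contentDetails,status&playlistId=",PlaylistID,
             "&key=",APIkey,"&maxResults=",20)
text.v<-suppressWarnings(readLines(urlv))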
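When a playlist holds more items than one request can return, the response carries a nextPageToken, and passing it back as the pageToken parameter fetches the next page. The loop below is a minimal sketch of that idea. The function fetch_all_pages(), its max_pages guard, and the stub fake_fetch() are assumptions introduced for illustration; the stub replaces the real network call so that the example runs offline.

```r
# Hypothetical helper: follow nextPageToken until no further page is announced.
# `fetch` takes a URL and returns the response as a character vector of lines.
fetch_all_pages <- function(base_url,
                            fetch = function(u) suppressWarnings(readLines(u)),
                            max_pages = 10) {
  pages <- list()
  token <- NA_character_
  for (i in seq_len(max_pages)) {
    # The first request has no token; later ones append &pageToken=...
    url <- if (is.na(token)) base_url else paste0(base_url, "&pageToken=", token)
    lines <- fetch(url)
    pages[[i]] <- lines
    hit <- grep('"nextPageToken":', lines, fixed = TRUE, value = TRUE)
    if (length(hit) == 0) break  # last page reached
    token <- strsplit(hit[1], '"', fixed = TRUE)[[1]][4]
  }
  pages
}

# Offline stand-in for the API: the first page announces a token "P2",
# and the page requested with that token is the last one.
fake_fetch <- function(u) {
  if (grepl("pageToken=P2", u, fixed = TRUE)) {
    c("{", '  "items": []', "}")
  } else {
    c("{", '  "nextPageToken": "P2",', '  "items": []', "}")
  }
}

pages <- fetch_all_pages("https://example.invalid/playlistItems?part=snippet",
                         fetch = fake_fetch)
length(pages)
```

With the real API, the default fetch argument would be used and base_url would be the urlv string built above; the lines of all pages can then be combined with unlist(pages).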

First, we retrieve the titles of the 20 videos, along with the published time, the description, and the id of each video. Second, we get the statistics of each video.

# Match the quoted key so that other text containing the same word is ignored
titles<-text.v[grepl('"title":',text.v,fixed=TRUE)]
postimes<-text.v[grepl('"publishedAt":',text.v,fixed=TRUE)]
postimes<-sapply(postimes,function(x)unlist(strsplit(x,'\"'))[4])
names(postimes)<-NULL
desps<-text.v[grepl('"description":',text.v,fixed=TRUE)]
# Each item lists videoId twice (under resourceId and contentDetails); keep the first
ids<-text.v[grepl('"videoId":',text.v,fixed=TRUE)]
ids<-ids[seq(1,2*length(titles),2)]
ids<-sapply(ids,function(x)unlist(strsplit(x,'\"'))[4])
names(ids)<-NULL
# One request per video for its statistics
stats<-lapply(ids,function(id){
  url<-paste0("https://www.googleapis.com/youtube/v3/videos?",
              "id=",id,"&key=",APIkey,"&part=statistics")
  suppressWarnings(readLines(url))
})
# Look fields up by name rather than by line number, since a field
# (e.g. dislikeCount) may be missing from the response; missing fields give NA
getcount<-function(x,field){
  hit<-x[grepl(paste0('"',field,'":'),x,fixed=TRUE)]
  if(length(hit)==0) return(NA_character_)
  unlist(strsplit(hit[1],'\"'))[4]
}
v.stats<-sapply(stats,function(x){
  c(getcount(x,"viewCount"),getcount(x,"likeCount"),
    getcount(x,"dislikeCount"),getcount(x,"commentCount"))
})
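The statistics step above issues one request per video. The id parameter of the videos endpoint also accepts a comma-separated list of ids, so several videos can be requested at once. The sketch below only builds the request URLs; the function name batch_stats_urls() is hypothetical, and the batch size of 50 reflects my understanding of the per-request limit and should be treated as an assumption.

```r
# Hypothetical helper: group video ids into batches and build one
# statistics request URL per batch.
batch_stats_urls <- function(ids, api_key, batch_size = 50) {
  # Assign each id to a batch number: 1,1,...,2,2,...
  groups <- split(ids, ceiling(seq_along(ids) / batch_size))
  vapply(groups, function(g) {
    paste0("https://www.googleapis.com/youtube/v3/videos?part=statistics",
           "&id=", paste(g, collapse = ","),
           "&key=", api_key)
  }, character(1), USE.NAMES = FALSE)
}

# Placeholder ids and key, with a small batch size to show the grouping.
urls <- batch_stats_urls(c("idA", "idB", "idC"), "YOUR_API_KEY", batch_size = 2)
urls
```

A batched response contains one statistics node per video, so its fields have to be matched to videos by the accompanying id lines rather than by fixed line positions.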

We can now analyze these 20 videos. We create a data frame containing the view, like, dislike, and comment counts together with the published times. Because the published times arrive as character strings, we convert them to Date objects with as.Date( ) so that the videos can be ordered and plotted along a time axis.

v.stats<-t(v.stats)
v.stats<-data.frame(v.stats,stringsAsFactors=FALSE)
names(v.stats)<-c("view","like","dislike","comment")
v.stats$view<-as.numeric(v.stats$view)
v.stats$like<-as.numeric(v.stats$like)
v.stats$dislike<-as.numeric(v.stats$dislike)
v.stats$comment<-as.numeric(v.stats$comment)
# The first 10 characters of each time stamp are the date (YYYY-MM-DD)
v.stats$publishedAt<-as.Date(substr(postimes,1,10))
v.stats<-v.stats[order(v.stats$publishedAt,decreasing=TRUE),]
# time: days elapsed since the oldest of the 20 videos
v.stats$time<-v.stats$publishedAt-v.stats$publishedAt[nrow(v.stats)]
# time2: publication date, used as the x-axis of the plot
v.stats$time2<-v.stats$publishedAt
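The publishedAt values are time stamps such as 2021-11-26T12:30:06Z, where the trailing Z marks UTC. If the time of day matters as well as the date, the string can be parsed with as.POSIXct(). The snippet below is a small sketch using the time stamp from the response shown earlier; the reference date 2021-12-01 is arbitrary and chosen only to illustrate date arithmetic.

```r
# Time stamp taken from the snippet response shown earlier.
stamp <- "2021-11-26T12:30:06Z"

# Parse date and time; "T" and "Z" are matched literally, and tz = "UTC"
# states the time zone that the trailing Z stands for.
posted <- as.POSIXct(stamp, format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")

posted_date <- as.Date(posted)            # date part only
posted_hour <- format(posted, "%H:%M")    # time of day, in UTC

# Days between the posting date and an arbitrary reference date.
days_since <- as.numeric(as.Date("2021-12-01") - posted_date)
```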

We can plot the like counts of these 20 videos. The like counts appear to decrease gradually over time.

library(ggplot2)
ggplot(v.stats,aes(time2,like))+
  geom_line(color="tomato")+geom_point()+
  scale_x_date(date_labels="%b %d")