API References

YoutubeAPI

class youtube_scraping_api.YoutubeAPI(debug_level='ERROR')[source]

Core developer interface for Youtube Scraping API

Parameters

debug_level (str, optional) – Minimum severity a log message must reach in order to be printed

Methods:

channel([channel_id, username])

Parse channel metadata and all its videos

playlist([playlist_id, continuation_token])

Parse playlist metadata and videos

query_suggestions([query, language, country])

Return a list of query suggestions for given query string

search([query, continuation_token, raw, filter])

Parse YouTube search results of specific query or continuation token

video(video_id)

Parse video metadata, captions, download link, etc.

channel(channel_id=None, username=None)[source]

Parse channel metadata and all its videos

Parameters
  • channel_id (str, optional) – ID of channel

  • username (str, optional) – Username of channel owner

Returns

Channel Object

Return type

Channel

playlist(playlist_id=None, continuation_token=None)[source]

Parse playlist metadata and videos

Parameters
  • playlist_id (str, optional) – ID of playlist

  • continuation_token (str, optional) – A token generated by YouTube to fetch more playlist videos

Returns

Playlist Object

Return type

Playlist

query_suggestions(query=None, language='en', country='gb')[source]

Return a list of query suggestions for given query string

Parameters
  • query (str, optional) – Query string to get suggestions for. Defaults to None

  • language (str, optional) – Language of results. Defaults to ‘en’

  • country (str, optional) – Country code for more accurate suggestions. Defaults to ‘gb’

Returns

A list of query suggestions

Return type

list

search(query=None, continuation_token=None, raw=False, filter=None)[source]

Parse YouTube search results of specific query or continuation token

Parameters
  • query (str, optional) – A query string to search on YouTube

  • continuation_token (str, optional) – A token generated by YouTube to fetch more search results

  • raw (bool, optional) – Whether to return search results in raw format. Defaults to False

  • filter (SearchFilter, optional) – Filter for search results

Returns

Search results

Return type

SearchResult
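The continuation_token parameter suggests that results are fetched page by page. A minimal sketch of such a loop follows. The reference above does not say where SearchResult exposes the next token, so get_token is a hypothetical, caller-supplied accessor, and iter_pages itself is an illustration rather than part of the library.

```python
from typing import Any, Callable, Iterator, Optional


def iter_pages(
    fetch: Callable[[Optional[str]], Any],
    get_token: Callable[[Any], Optional[str]],
    max_pages: int = 5,
) -> Iterator[Any]:
    """Yield successive result pages until no token is left or max_pages is hit.

    fetch(token) is called with None for the first page and with the
    previous page's continuation token afterwards.
    get_token(page) is a hypothetical accessor supplied by the caller.
    """
    token: Optional[str] = None
    for _ in range(max_pages):
        page = fetch(token)
        yield page
        token = get_token(page)
        if not token:
            break
```

With a YoutubeAPI instance named api, fetch could be a function that calls api.search(query=...) when the token is None and api.search(continuation_token=token) otherwise.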

video(video_id: str) → youtube_scraping_api.parser.video.Video[source]

Parse video metadata, captions, download link, etc.

Parameters

video_id (str) – ID of Youtube video

Returns

Video object

Return type

Video
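As a usage sketch, the helper below collects a few of the documented Video attributes (title, length, tags) into a plain dictionary. The helper name summarize_video and the dictionary keys are illustrative choices, not part of the library. It accepts any object that behaves like YoutubeAPI, so it can be tried without network access.

```python
from typing import Any, Dict


def summarize_video(api: Any, video_id: str) -> Dict[str, Any]:
    """Collect a few documented Video attributes into a plain dict.

    `api` is expected to behave like YoutubeAPI: api.video(video_id)
    returns an object exposing title, length and tags.
    """
    video = api.video(video_id)
    return {
        "title": video.title,
        "length_seconds": video.length,
        # Guard against a missing tag list so callers always get a list.
        "tags": list(video.tags or []),
    }
```

With the real package installed, this would presumably be called as summarize_video(YoutubeAPI(), video_id) after importing YoutubeAPI from youtube_scraping_api.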

Filter

Classes:

AvailableSearchFilter([duration, …])

An object containing all available search filters

SearchFilter([type, features, sort_by, …])

Filter for search results

Functions:

get_filtered_url(session, base_url, filter)

Generate valid search url that includes query string filter

class youtube_scraping_api.filter.AvailableSearchFilter(duration=None, upload_date=None, type=None, features=None, sort_by=None)[source]

An object containing all available search filters

class youtube_scraping_api.filter.SearchFilter(type=None, features=None, sort_by=None, upload_date=None, duration=None)[source]

Filter for search results

Parameters
  • type (str or None) – (optional) Type of search result

  • features (list or None) – (optional) Features of videos

  • sort_by (str or None) – (optional) Criteria of sorting search results

  • upload_date (str or None) – (optional) Upload date of videos

  • duration – (optional) Expected duration of videos

Return type

Object[SearchFilter]

Methods:

get_all_filters()

Get all available filters that can be used when querying search results

classmethod get_all_filters()[source]

Get all available filters that can be used when querying search results

youtube_scraping_api.filter.get_filtered_url(session, base_url, filter)[source]

Generate valid search url that includes query string filter

Parameters
  • session (Session) – Requests session

  • base_url (str) – Base search url that includes only query string

  • filter (SearchFilter) – Search filter defined by the user

Returns

Valid search url

Return type

str

Video

class youtube_scraping_api.parser.video.Video(video_id: str, builtin_called: bool = False, **kwargs)[source]

Container for video data

Attributes:

author

Extract the content creator who uploaded the video

captions

Give you a list of available captions for the video

description

Extract full description of video

download_data

Parse download links and metadata of the video

length

Extract the length of the video in seconds

publish_time

Extract the time when the video is published

raw

Return a dictionary containing all data of video

supertitle

Extract supertitle (custom tags) from video

tags

Extract descriptive keywords which content creators can add to their video to help viewers find their content

title

Extract video title

type

Determine type of video

Methods:

download([itag, path, log_progress, …])

Download video from YouTube into local storage

get_comment_count()

Fetch the amount of comments of the video

get_file_size(url)

Get the size of video stream

get_signature_url(url)

Get decrypted download link for the video

parse_data()

Fetch HTML source code and extract JSON data from it

stream(url[, chunk_size, range_size])

Request and yield chunks of video stream

property author

Extract the content creator who uploaded the video

Returns

Video author

Return type

Channel

property captions

Give you a list of available captions for the video

Returns

List of available captions

Return type

CaptionQuery

property description

Extract full description of video

Returns

Description of video

Return type

str

download(itag: Optional[int] = None, path: str = '.', log_progress: bool = True, chunk_size: int = 4096, callback_func: Optional[Callable[[Any], None]] = None, name: Optional[str] = None) → None[source]

Download video from YouTube into local storage

Parameters
  • itag (int, optional) – Itag of the stream to download. If None, the best available quality is downloaded. Defaults to None

  • path (str, optional) – Relative or absolute path to save the video

  • log_progress (bool, optional) – Whether to show a download progress bar. Defaults to True

  • chunk_size (int, optional) – Size of chunk per request. Defaults to 4096

  • callback_func (Callable, optional) – Callback function to be called while downloading the video. Defaults to None

  • name (str, optional) – Filename of video. Use video title if not set

Returns

None; the video is downloaded and saved to disk

Return type

None

property download_data

Parse download links and metadata of the video

Returns

List of dictionaries containing download links and metadata

Return type

list

get_comment_count() → int[source]

Fetch the amount of comments of the video

Returns

Number of comments

Return type

int

get_file_size(url: str) → int[source]

Get the size of video stream

Parameters

url (str) – Download link of the video

Returns

size of video stream in bytes

Return type

int

get_signature_url(url: str) → str[source]

Get decrypted download link for the video

Parameters

url (str) – Encrypted download link of the video

Returns

Usable download link of video

Return type

str

Note

This function was not developed by me, since I did not have enough time to dive that deep into JavaScript. Credit to PyTube.

property length

Extract the length of the video in seconds

Returns

Length of video

Return type

int

parse_data() → None[source]

Fetch HTML source code and extract JSON data from it

Returns

Nothing; the extracted data is stored internally

Return type

None

property publish_time

Extract the time when the video is published

Returns

Time when video is published

Return type

str

Todo

Convert output string to datetime object
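The conversion named in the Todo above could be sketched as follows. The reference does not state which format the publish_time string uses, so the default format below (YYYY-MM-DD) is purely an assumption, and parse_publish_time is a hypothetical helper that takes the format as a parameter.

```python
from datetime import datetime
from typing import Optional


def parse_publish_time(value: str, fmt: str = "%Y-%m-%d") -> Optional[datetime]:
    """Parse a publish-time string; return None if it does not match fmt.

    The default format is an assumption, not something the reference states.
    """
    try:
        return datetime.strptime(value.strip(), fmt)
    except ValueError:
        # Leave unrecognised strings to the caller instead of raising.
        return None
```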

property raw

Return a dictionary containing all data of video

Returns

Raw data of video

Return type

dict

stream(url, chunk_size: int = 8192, range_size: int = 10000000000) → Generator[source]

Request and yield chunks of video stream

Parameters
  • url (str) – Download link of video stream

  • chunk_size (int) – Size of chunk per request

  • range_size (int, optional) – Default size to download, can be overridden

Returns

video stream chunk

Return type

bytes
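Since stream yields chunks of bytes, its output can be written to disk by any consumer of a bytes iterable. A minimal sketch, with save_chunks as a hypothetical helper that is independent of the library:

```python
from pathlib import Path
from typing import Iterable


def save_chunks(chunks: Iterable[bytes], destination: str) -> int:
    """Write byte chunks to destination and return the number of bytes written."""
    written = 0
    with Path(destination).open("wb") as handle:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                written += handle.write(chunk)
    return written
```

For a Video instance this might be called as save_chunks(video.stream(url), "video.mp4"), where url is a usable download link for the stream.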

property supertitle

Extract supertitle (custom tags) from video

Returns

List of supertitles of video

Return type

list

property tags

Extract descriptive keywords which content creators can add to their video to help viewers find their content

Returns

List of all tags of video

Return type

list

property title

Extract video title

Returns

Video title

Return type

str

property type

Determine type of video

Returns

Video type

Return type

str

Caption

Classes:

Caption(language_code, name, url, …)

Caption object of video

CaptionQuery(data, default)

Container of available captions of the video

TransLangQuery(data)

Container of language available for translation of a caption

TranslatedCaption(language, url, …)

Object of translated version of caption

TranslationLang(raw)

Translation language for translatable caption

class youtube_scraping_api.caption.Caption(language_code: str, name: str, url, translatable: str = False, translate_langs: Optional[List] = None)[source]

Caption object of video

Attributes:

available_translations

Return all available translations of the caption

dict

Gives you a dictionary containing the caption's content and metadata

xml

Raw XML format of the caption

Methods:

get_text([delimiter])

Extract text from XML string

translate_to(language_code)

Translate the caption to the given language if the caption is translatable

property available_translations: youtube_scraping_api.caption.TransLangQuery

Return all available translations of the caption

Returns

List of all available translations

Return type

List[TransLangQuery]

property dict: Dict[str, Union[str, float]]

Gives you a dictionary containing the caption's content and metadata

Returns

Caption text and its metadata in dictionary format

Return type

dict

get_text(delimiter='\n') → str[source]

Extract text from XML string

Parameters

delimiter (str, optional) – Delimiter for splitting sentences, defaults to ‘\n’

Returns

Caption in pure text form

Return type

str
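A rough sketch of this kind of extraction is shown below. It assumes the caption XML consists of text elements under a single root, which is an assumption about the document shape rather than something this reference specifies. caption_xml_to_text is a hypothetical stand-in, not the library's implementation.

```python
import html
import xml.etree.ElementTree as ET


def caption_xml_to_text(xml_string: str, delimiter: str = "\n") -> str:
    """Join the text of every <text> element, unescaping HTML entities."""
    root = ET.fromstring(xml_string)
    lines = []
    for node in root.iter("text"):
        # Entities may be escaped twice in the XML, so unescape once more.
        content = html.unescape("".join(node.itertext())).strip()
        if content:
            lines.append(content)
    return delimiter.join(lines)
```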

translate_to(language_code: str) → Optional[youtube_scraping_api.caption.TranslatedCaption][source]

Translate the caption to the given language if the caption is translatable

Parameters

language_code (str) – Language code of targeted language for translation

Returns

Translated caption

Return type

TranslatedCaption

property xml: str

Raw XML format of the caption

Returns

Caption in XML format

Return type

str

class youtube_scraping_api.caption.CaptionQuery(data: list, default: int = 0)[source]

Container of available captions of the video

Methods:

get_caption([language_code])

Get caption by language code

get_caption(language_code: Optional[str] = None) → Optional[youtube_scraping_api.caption.Caption][source]

Get caption by language code

Parameters

language_code (str, optional) – Language code of the caption, defaults set to None

Returns

Caption that corresponds to the language code. Return caption with default language if language_code is not specified. Return None if caption with the language code is not found.

Return type

Caption or None
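Because get_caption may return None, callers need a guard before using the result. A small usage sketch, where caption_text is a hypothetical helper and video is any object that behaves like Video:

```python
from typing import Any, Optional


def caption_text(video: Any, language_code: Optional[str] = None) -> Optional[str]:
    """Return caption text for language_code, or None if no such caption exists.

    `video` is expected to behave like Video: video.captions.get_caption(...)
    returns a Caption-like object or None.
    """
    caption = video.captions.get_caption(language_code)
    if caption is None:
        return None
    return caption.get_text()
```

Passing no language code relies on the documented fallback to the default-language caption.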

class youtube_scraping_api.caption.TransLangQuery(data: list)[source]

Container of language available for translation of a caption

Methods:

get_language(language_code)

Get language by language code

get_language_code()

Return language code of all available languages

get_name()

Return name of all available languages

get_language(language_code: str) → Optional[youtube_scraping_api.caption.TranslationLang][source]

Get language by language code

Parameters

language_code (str) – Language code of the language

Returns

Language that corresponds to the language code. Return None if no language with the language code is found.

Return type

TranslationLang or None

get_language_code() → list[source]

Return language code of all available languages

Returns

Language code of all available languages

Return type

list

get_name() → list[source]

Return name of all available languages

Returns

Name of all available languages

Return type

list

class youtube_scraping_api.caption.TranslatedCaption(language: TranslateLang, url: str, original_lang_code: str)[source]

Object of translated version of caption

class youtube_scraping_api.caption.TranslationLang(raw: dict)[source]

Translation language for translatable caption

Playlist

class youtube_scraping_api.parser.playlist.Playlist(response, builtin_called=False)[source]

A container of playlist metadata and its videos

Attributes:

description

Extract description of the playlist if available

last_updated

Return the time when the playlist is last updated

owner

Return the name of playlist creator

title

Extract name of the playlist

video_count

Count how many videos are in the playlist

view_count

Count the total views of all videos in the playlist

property description

Extract description of the playlist if available

Returns

Playlist description

Return type

str or None

property last_updated

Return the time when the playlist is last updated

Returns

Playlist last updated time

Return type

str

property owner

Return the name of playlist creator

Returns

Playlist creator name

Return type

str

property title

Extract name of the playlist

Returns

Playlist name

Return type

str

property video_count

Count how many videos are in the playlist

Returns

Number of videos in the playlist

Return type

int

property view_count

Count the total views of all videos in the playlist

Returns

Total views of videos in the playlist

Return type

int

class youtube_scraping_api.parser.playlist.PlaylistVideo(data)[source]

A container for a playlist video: it functions exactly like a Video object and additionally exposes the video's index in the playlist

Attributes:

author

Extract the content creator who uploaded the video

captions

Give you a list of available captions for the video

description

Extract full description of video

download_data

Parse download links and metadata of the video

index

Index of the video in playlist

length

Extract the length of the video in seconds

publish_time

Extract the time when the video is published

raw

Return a dictionary containing all data of the playlist video

supertitle

Extract supertitle (custom tags) from video

tags

Extract descriptive keywords which content creators can add to their video to help viewers find their content

title

Extract video title

type

Determine type of video

Methods:

download([itag, path, log_progress, …])

Download video from YouTube into local storage

get_comment_count()

Fetch the amount of comments of the video

get_file_size(url)

Get the size of video stream

get_signature_url(url)

Get decrypted download link for the video

parse_data()

Fetch HTML source code and extract JSON data from it

stream(url[, chunk_size, range_size])

Request and yield chunks of video stream

property author

Extract the content creator who uploaded the video

Returns

Video author

Return type

Channel

property captions

Give you a list of available captions for the video

Returns

List of available captions

Return type

CaptionQuery

property description

Extract full description of video

Returns

Description of video

Return type

str

download(itag: Optional[int] = None, path: str = '.', log_progress: bool = True, chunk_size: int = 4096, callback_func: Optional[Callable[[Any], None]] = None, name: Optional[str] = None) → None

Download video from YouTube into local storage

Parameters
  • itag (int, optional) – Itag of the stream to download. If None, the best available quality is downloaded. Defaults to None

  • path (str, optional) – Relative or absolute path to save the video

  • log_progress (bool, optional) – Whether to show a download progress bar. Defaults to True

  • chunk_size (int, optional) – Size of chunk per request. Defaults to 4096

  • callback_func (Callable, optional) – Callback function to be called while downloading the video. Defaults to None

  • name (str, optional) – Filename of video. Use video title if not set

Returns

None; the video is downloaded and saved to disk

Return type

None

property download_data

Parse download links and metadata of the video

Returns

List of dictionaries containing download links and metadata

Return type

list

get_comment_count() → int

Fetch the amount of comments of the video

Returns

Number of comments

Return type

int

get_file_size(url: str) → int

Get the size of video stream

Parameters

url (str) – Download link of the video

Returns

size of video stream in bytes

Return type

int

get_signature_url(url: str) → str

Get decrypted download link for the video

Parameters

url (str) – Encrypted download link of the video

Returns

Usable download link of video

Return type

str

Note

This function was not developed by me, since I did not have enough time to dive that deep into JavaScript. Credit to PyTube.

index

Index of the video in playlist

Returns

Video index in playlist

Return type

int

property length

Extract the length of the video in seconds

Returns

Length of video

Return type

int

parse_data() → None

Fetch HTML source code and extract JSON data from it

Returns

Nothing; the extracted data is stored internally

Return type

None

property publish_time

Extract the time when the video is published

Returns

Time when video is published

Return type

str

Todo

Convert output string to datetime object

property raw

Return a dictionary containing all data of the playlist video

Returns

Raw data of video

Return type

dict

stream(url, chunk_size: int = 8192, range_size: int = 10000000000) → Generator

Request and yield chunks of video stream

Parameters
  • url (str) – Download link of video stream

  • chunk_size (int) – Size of chunk per request

  • range_size (int, optional) – Default size to download, can be overridden

Returns

video stream chunk

Return type

bytes

property supertitle

Extract supertitle (custom tags) from video

Returns

List of supertitles of video

Return type

list

property tags

Extract descriptive keywords which content creators can add to their video to help viewers find their content

Returns

List of all tags of video

Return type

list

property title

Extract video title

Returns

Video title

Return type

str

property type

Determine type of video

Returns

Video type

Return type

str

Channel

class youtube_scraping_api.parser.channel.Channel(channel_id=None, username=None, builtin_called=False, **kwargs)[source]

Container of channel data

Attributes:

avatar

Extract avatar thumbnail url of the channel

banner

Return a list of banner urls in different resolutions if available

description

Extract description of the channel

facebook_profile_id

Extract facebook profile id of channel owner if available

header_links

Return a list of social media links that content creator put on their channel header if available

is_verified

Check if the channel is verified by YouTube

keywords

Extract descriptive keywords which content creators can add to their channel to help viewers find them

name

Extract name of the channel

raw

Returns all available channel metadata

subscriber_count

Extract number of subscribers of the channel

url

Return url of the channel

vanity_url

Return a human readable custom url created by the channel owner if available

Methods:

parse_data()

Fetch HTML source code and extract JSON data from it

property avatar

Extract avatar thumbnail url of the channel

Returns

Avatar thumbnail url of the channel

Return type

str

property banner

Return a list of banner urls in different resolutions if available

Returns

A list of banner urls

Return type

list or None

property description

Extract description of the channel

Returns

Channel description

Return type

str

property facebook_profile_id

Extract facebook profile id of channel owner if available

Returns

Facebook profile id of the channel owner

Return type

str or None

property header_links

Return a list of social media links that the content creator put on their channel header if available

Returns

A list of social media links

Return type

list or None

property is_verified

Check if the channel is verified by YouTube

Returns

A boolean that states whether the channel is verified

Return type

bool

property keywords

Extract descriptive keywords which content creators can add to their channel to help viewers find them

Returns

List of all keywords of the channel

Return type

list

property name

Extract name of the channel

Returns

Channel name

Return type

str

parse_data()[source]

Fetch HTML source code and extract JSON data from it

Returns

Nothing; the extracted data is stored internally

Return type

None

property raw

Returns all available channel metadata

Returns

A dictionary with metadata values

Return type

dict

property subscriber_count

Extract number of subscribers of the channel

Returns

Number of subscribers of the channel

Return type

int or str or None

property url

Return url of the channel

Returns

Channel url

Return type

str

property vanity_url

Return a human readable custom url created by the channel owner if available

Returns

Custom channel url

Return type

str
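Several Channel properties are documented as possibly returning None (banner, header_links, subscriber_count). A usage sketch that flattens a channel into a dictionary with empty defaults follows. channel_summary and its keys are hypothetical, and channel is any object that behaves like Channel.

```python
from typing import Any, Dict


def channel_summary(channel: Any) -> Dict[str, Any]:
    """Flatten a Channel-like object, replacing None lists with empty lists."""
    return {
        "name": channel.name,
        "url": channel.url,
        "verified": bool(channel.is_verified),
        # Documented as int, str or None, so it is passed through unchanged.
        "subscribers": channel.subscriber_count,
        "banners": list(channel.banner or []),
        "links": list(channel.header_links or []),
    }
```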

Utilities

Functions:

convert_valid_filename(string)

Remove characters that are invalid in file names from a string

find_snippet(text, start, end[, skip])

Find snippet in text

get_initial_data(html)

Extract primary JSON data from raw HTML source code

get_initial_player_response(html)

Extract JSON data where video download links are located

get_thumbnail(video_id)

Get url for thumbnails of video

parse_continuation_token(data)

Extract continuation token from raw JSON data

reveal_redirect_url(url)

Get real url from redirect url

search_dict(partial, key)

Recursive search in dictionary

youtube_scraping_api.utils.convert_valid_filename(string)[source]

Remove characters that are invalid in file names from a string

Parameters

string (str) – String to be converted into valid filename

Returns

String with invalid characters removed

Return type

str
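One way such a clean-up could work is sketched below. The set of characters treated as invalid is an assumption based on what common filesystems reject, and strip_invalid_filename_chars is a hypothetical stand-in, not the library's code.

```python
import re

# Characters rejected by common filesystems (notably Windows), plus control chars.
_INVALID = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def strip_invalid_filename_chars(name: str) -> str:
    """Drop characters that are commonly invalid in file names."""
    return _INVALID.sub("", name).strip()
```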

youtube_scraping_api.utils.find_snippet(text, start, end, skip=(0, 0))[source]

Find snippet in text

Parameters
  • text (str) – Text to search in

  • start (str) – Where to start grabbing text

  • end (str) – Where to stop grabbing text and return

  • skip (tuple) – Number of characters to trim from the front and the back of the grabbed text

Returns

Snippet found in the text

Return type

str
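The parameters above can be illustrated with a small sketch. It assumes the markers themselves are excluded from the result and that a missing marker yields None. Neither behaviour is stated in this reference, and snippet_between is a hypothetical name.

```python
from typing import Optional, Tuple


def snippet_between(text: str, start: str, end: str,
                    skip: Tuple[int, int] = (0, 0)) -> Optional[str]:
    """Return the text between the first `start` and the next `end`.

    skip=(front, back) trims that many characters from each side of the
    result. Returns None when either marker is missing.
    """
    begin = text.find(start)
    if begin == -1:
        return None
    begin += len(start)
    stop = text.find(end, begin)
    if stop == -1:
        return None
    return text[begin + skip[0]:stop - skip[1]]
```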

youtube_scraping_api.utils.get_initial_data(html)[source]

Extract primary JSON data from raw HTML source code

Parameters

html (str) – Raw HTML source code

Returns

JSON data in form of dictionary

Return type

dict
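A sketch of how JSON embedded in a page can be pulled out is given below. It assumes the data follows a marker such as "ytInitialData = " inside a script tag. Both the marker and the helper name extract_json_after are assumptions, and the library may locate the data differently.

```python
import json
from typing import Any, Dict, Optional


def extract_json_after(html: str, marker: str = "ytInitialData = ") -> Optional[Dict[str, Any]]:
    """Decode the JSON object that starts right after `marker`, if present."""
    position = html.find(marker)
    if position == -1:
        return None
    start = html.find("{", position + len(marker))
    if start == -1:
        return None
    # raw_decode parses one JSON value and ignores whatever follows it,
    # so the trailing ";</script>" does not need to be located first.
    data, _ = json.JSONDecoder().raw_decode(html, start)
    return data
```

Malformed JSON raises json.JSONDecodeError, which callers may want to catch.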

youtube_scraping_api.utils.get_initial_player_response(html)[source]

Extract JSON data where video download links are located

Parameters

html (str) – Raw HTML source code

Returns

JSON data in form of dictionary

Return type

dict

youtube_scraping_api.utils.get_thumbnail(video_id)[source]

Get url for thumbnails of video

Parameters

video_id (str) – Youtube ID of the video

Returns

A dictionary of thumbnail urls

Return type

dict

Todo

Check thumbnail urls availability
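For illustration, thumbnail URLs can be derived from a video ID alone. The sketch below uses the i.ytimg.com host and size names that YouTube is commonly seen to serve. The dictionary keys and the function name thumbnail_urls are assumptions and may not match what get_thumbnail returns.

```python
from typing import Dict

# File names commonly served by YouTube's image host; not every size
# exists for every video, which is what the Todo above refers to.
_SIZES = ("default", "mqdefault", "hqdefault", "sddefault", "maxresdefault")


def thumbnail_urls(video_id: str) -> Dict[str, str]:
    """Build candidate thumbnail URLs keyed by size name."""
    base = f"https://i.ytimg.com/vi/{video_id}"
    return {size: f"{base}/{size}.jpg" for size in _SIZES}
```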

youtube_scraping_api.utils.parse_continuation_token(data)[source]

Extract continuation token from raw JSON data

Parameters

data (dict) – Raw JSON data

Returns

Continuation token

Return type

str

youtube_scraping_api.utils.reveal_redirect_url(url)[source]

Get real url from redirect url

Parameters

url (str) – Redirect url

Returns

Real url

Return type

str
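One way to resolve a redirect link without a network request is to read the target from its query string. The sketch below assumes the real URL is carried percent-encoded in a q parameter. That parameter name and the helper target_of_redirect are assumptions, and the library may instead follow the redirect over HTTP.

```python
from urllib.parse import parse_qs, urlparse


def target_of_redirect(url: str, param: str = "q") -> str:
    """Return the URL carried in the `param` query parameter, else url itself."""
    values = parse_qs(urlparse(url).query).get(param)
    return values[0] if values else url
```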

youtube_scraping_api.utils.search_dict(partial, key)[source]

Recursive search in dictionary

Parameters
  • partial (dict) – Dictionary to search in

  • key (str) – Key that you want to search in dictionary

Returns

Value in dictionary of targeted key

Return type

Any
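A recursive search of this kind can be sketched as a generator over nested dictionaries and lists. Whether the library returns the first match or all matches is not stated here. The hypothetical find_key below yields every match, and the first one can be taken with next().

```python
from typing import Any, Iterator


def find_key(node: Any, key: str) -> Iterator[Any]:
    """Yield every value stored under `key`, depth-first, in dicts and lists."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield value
            # Keep descending: the same key may appear deeper in the value.
            yield from find_key(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)
```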