[iwara] Add support #7785
Merged
Changes from 10 commits
33 commits
cf1594d [iwara] Add initial support (NecRaul)
bcc7534 [iwara] Add search support (NecRaul)
1f129db [iwara] Code cleanup (NecRaul)
24516e0 [iwara] Small fixes and additions (NecRaul)
5040290 [iwara] Add tag support (NecRaul)
09f4b69 [iwara] Add mime-type to metadata (NecRaul)
74b3bbb [iwara] Refactor patterns/matching using urllib (NecRaul)
9c428cd [iwara] Add unit tests (NecRaul)
4508056 [iwara] Update docs (NecRaul)
2145452 [iwara] Fix linting on older Python versions (NecRaul)
6da9363 [iwara] update 'IwaraAPI' interface class (mikf)
655c2bd [iwara] split and rename 'profile' extractor (mikf)
8e4ad59 [iwara] simplify '_user_params()' usage (mikf)
45f567e [iwara] update 'video' extractor (mikf)
3e72957 [iwara] update 'image' extractor (mikf)
ca363c8 [iwara] update 'playlist' extractor (mikf)
a3762b2 [iwara] update 'search' extractor (mikf)
1089311 [iwara] update 'tag' extractor (mikf)
92dc4e4 [iwara] simplify 'yield_image' usage (mikf)
3aef4d3 [iwara] add video "image" test (mikf)
71740e5 [iwara] provide 'date' metadata (mikf)
44d085b [iwara] simplify 'source()' (mikf)
ac0c0e3 [iwara] small optimizations (mikf)
afacbaa [iwara] add missing 'keyarg=1' to profile() memcache decorator (mikf)
d9d74fe [tests/iwara] update results (mikf)
261cfe2 [iwara] call 'IwaraAPI.authenticate()' only once (mikf)
51a6179 [iwara] extract more 'user' metadata (mikf)
f0ff28b [iwara] update default format strings (mikf)
61e5c4e Merge remote-tracking branch 'origin/master' into NecRaul-iwara (mikf)
d31dd8b [iwara] restructure image/video handling (mikf)
8ab3326 [iwara] fix login and token handling (mikf)
01083cd [iwara] add 'favorite' extractor (mikf)
d02a9d3 [iwara] add 'following' and 'followers' extractors (mikf)
@@ -93,6 +93,7 @@
    "issuu",
    "itaku",
    "itchio",
    "iwara",
    "jschan",
    "kabeuchi",
    "keenspot",
@@ -0,0 +1,357 @@
# -*- coding: utf-8 -*-

# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License version 2 as
# published by the Free Software Foundation.

"""Extractors for https://www.iwara.tv/"""

from .common import Extractor, Message
from .. import text
from urllib.parse import unquote, urlparse, parse_qs
import hashlib

BASE_PATTERN = r"(?:https?://)?(?:www\.)?iwara\.tv"

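As a rough illustration of how `BASE_PATTERN` combines with a per-extractor suffix, the standalone sketch below checks a few URLs against the 'profile' pattern from this diff. The `matches` helper is hypothetical, and applying the pattern with `re.match` is an assumption about how the framework uses it.

```python
import re

BASE_PATTERN = r"(?:https?://)?(?:www\.)?iwara\.tv"

# Suffix copied from the 'profile' extractor's pattern in this diff.
PROFILE_PATTERN = BASE_PATTERN + r"/profile(?:/|$)"


def matches(pattern, url):
    """Hypothetical helper: True if 'pattern' matches at the start of 'url'."""
    return re.match(pattern, url) is not None


# Scheme and "www." are both optional in BASE_PATTERN.
print(matches(PROFILE_PATTERN, "https://www.iwara.tv/profile/username"))
print(matches(PROFILE_PATTERN, "iwara.tv/profile"))
# "/profiles" is rejected because "/profile" must be followed by "/" or the end.
print(matches(PROFILE_PATTERN, "https://www.iwara.tv/profiles"))
```

The `(?:/|$)` tail is what keeps `/profile` from also matching longer path names that merely start with the same letters.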
class IwaraExtractor(Extractor):
    """Base class for iwara.tv extractors"""
    category = "iwara"
    root = "https://www.iwara.tv"
    directory_fmt = ("{category}", "{username}")
    filename_fmt = "{id} {title} {filename}.{extension}"
    archive_fmt = "{type} {username} {id} {filename}"

    def _init(self):
        self.api = IwaraAPI(self)

    def extract_user_info(self, profile):
        user = profile.get("user", {})
        return {
            "user_id": user.get("id"),
            "username": user.get("username"),
            # 'name' may be missing or null; do not call strip() on None
            "display_name": (user.get("name") or "").strip(),
        }

    def extract_media_info(self, item, key, include_file_info):
        data = {
            "id": item.get("id"),
            "title": (item.get("title") or "").strip(),
        }

        if include_file_info:
            file_info = item if key is None else item.get(key, {})
            # split "name.ext" at the last dot; tolerate a missing dot
            name = file_info.get("name") or ""
            filename, dot, extension = name.rpartition(".")
            if not dot:
                filename, extension = name, ""
            created_at = file_info.get("createdAt")
            dt = text.parse_datetime(created_at, "%Y-%m-%dT%H:%M:%S.%fZ")
            date_string = dt.strftime(f"%a, %b {dt.day}, %Y %H:%M:%S")
            data.update({
                "file_id": file_info.get("id"),
                "filename": filename,
                "extension": extension,
                "mime": file_info.get("mime"),
                "size": file_info.get("size"),
                "width": file_info.get("width"),
                "height": file_info.get("height"),
                "duration": file_info.get("duration"),
                "datetime": date_string,
                "type": file_info.get("type"),
            })
        return data

    def get_metadata(self, user_info, media_info):
        return {
            **user_info,
            **media_info
        }

    def yield_video(self, user_info, video):
        video_info = self.extract_media_info(video, "file", True)
        video_id = video_info.get("id")
        file_id = video_info.get("file_id")
        video = self.api.item(f"/video/{video_id}")
        file_url = video.get("fileUrl")
        sources = self.api.source(file_id, file_url)
        source = next((r for r in sources if r.get("name") == "Source"), None)
        # skip videos without a downloadable "Source" entry
        download_url = source.get("src", {}).get("download") if source else None
        if not download_url:
            return
        url = f"https:{download_url}"
        metadata = self.get_metadata(user_info, video_info)
        yield Message.Directory, metadata
        yield Message.Url, url, metadata

    def yield_image(self, user_info, image_group):
        image_group_info = self.extract_media_info(image_group, "file", False)
        for image_file in image_group.get("files", {}):
            image_file_info = self.extract_media_info(image_file, None, True)
            image_info = {**image_file_info, **image_group_info}
            file_id = image_info.get("file_id")
            extension = image_info.get("extension")
            url = (
                f"https://i.iwara.tv/image/original/"
                f"{file_id}/{file_id}.{extension}"
            )
            metadata = self.get_metadata(user_info, image_info)
            yield Message.Directory, metadata
            yield Message.Url, url, metadata

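The file metadata handling in `extract_media_info()` boils down to two steps: splitting a file name at its last dot and parsing the `createdAt` timestamp. The sketch below reproduces both with the standard library only. It uses `datetime.strptime` in place of the project's `text.parse_datetime`, the helper names are hypothetical, and the sample values are made up for illustration.

```python
from datetime import datetime


def split_name(name):
    """Split 'name' into (stem, extension) at the last dot."""
    stem, dot, ext = name.rpartition(".")
    if not dot:
        # No dot at all: treat the whole name as the stem.
        return name, ""
    return stem, ext


def parse_created_at(value):
    """Parse a timestamp with the format string used in the diff."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")


def format_date(dt):
    """Format like the diff does: the day of month is inserted unpadded."""
    return dt.strftime(f"%a, %b {dt.day}, %Y %H:%M:%S")


# Sample values are assumptions, not real API output.
print(split_name("clip.final.mp4"))
print(format_date(parse_created_at("2024-01-05T12:34:56.000Z")))
```

Splitting at the last dot keeps names such as `clip.final.mp4` intact up to the real extension.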
class IwaraProfileExtractor(IwaraExtractor):
    """Extractor for iwara.tv profile pages"""
    subcategory = "profile"
    pattern = BASE_PATTERN + r"/profile(?:/|$)"
    example = "https://www.iwara.tv/profile/username"

    def __init__(self, match):
        IwaraExtractor.__init__(self, match)
        parsed = urlparse(self.url)
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] == "profile":
            self.profile = parts[1]
        else:
            # no username in the URL; items() yields nothing
            self.profile = None

    def items(self):
        if not self.profile:
            return
        profile = self.api.profile(f"/profile/{self.profile}")
        if not profile:
            return
        user_info = self.extract_user_info(profile)
        user_id = user_info.get("user_id")
        videos = self.api.collection("/videos", user_id)
        for video in videos:
            yield from self.yield_video(user_info, video)

        playlists = self.api.collection("/playlists", user_id)
        for playlist in playlists:
            videos = self.api.collection(f"/playlist/{playlist.get('id')}")
            for video in videos:
                # use a separate name so the profile owner's 'user_info'
                # is still intact for the image loop below
                video_user_info = self.extract_user_info(video)
                yield from self.yield_video(video_user_info, video)

        image_groups = self.api.collection("/images", user_id)
        for image_group in image_groups:
            image_group_id = image_group.get("id")
            images = self.api.item(f"/image/{image_group_id}")
            yield from self.yield_image(user_info, images)

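The `__init__` methods in this diff all recover an identifier the same way: parse the URL, split its path on "/", and take the segment after a fixed prefix. A minimal standalone sketch of that step follows; the `path_id` helper is hypothetical and not part of the PR.

```python
from urllib.parse import urlparse


def path_id(url, prefix):
    """Return the path segment following 'prefix', or None if absent."""
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) >= 2 and parts[0] == prefix:
        return parts[1]
    return None


print(path_id("https://www.iwara.tv/profile/username", "profile"))
print(path_id("https://www.iwara.tv/video/abc123/some-slug", "video"))
# Without a scheme, urlparse() puts the host into the path, so the
# prefix check fails even though BASE_PATTERN accepts such URLs.
print(path_id("www.iwara.tv/profile/username", "profile"))
```

One thing this makes visible: `BASE_PATTERN` treats the scheme as optional, but `urlparse()` only separates host and path when a scheme (or a leading `//`) is present, so scheme-less input would probably need to be normalized before parsing.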
class IwaraVideoExtractor(IwaraExtractor):
    """Extractor for individual iwara.tv videos"""
    subcategory = "video"
    pattern = BASE_PATTERN + r"/video(?:/|$)"
    example = "https://www.iwara.tv/video/video-id/slug"

    def __init__(self, match):
        IwaraExtractor.__init__(self, match)
        parsed = urlparse(self.url)
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] == "video":
            self.video_id = parts[1]
        else:
            # no video ID in the URL; items() yields nothing
            self.video_id = None

    def items(self):
        if not self.video_id:
            return
        video = self.api.item(f"/video/{self.video_id}")
        if not video:
            return
        user_info = self.extract_user_info(video)
        yield from self.yield_video(user_info, video)

class IwaraImageExtractor(IwaraExtractor):
    """Extractor for individual iwara.tv image pages"""
    subcategory = "image"
    pattern = BASE_PATTERN + r"/image(?:/|$)"
    example = "https://www.iwara.tv/image/image-id/slug"

    def __init__(self, match):
        IwaraExtractor.__init__(self, match)
        parsed = urlparse(self.url)
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] == "image":
            self.image_id = parts[1]
        else:
            # no image ID in the URL; items() yields nothing
            self.image_id = None

    def items(self):
        if not self.image_id:
            return
        image_group = self.api.item(f"/image/{self.image_id}")
        if not image_group:
            return
        user_info = self.extract_user_info(image_group)
        yield from self.yield_image(user_info, image_group)

class IwaraPlaylistExtractor(IwaraExtractor):
    """Extractor for individual iwara.tv playlist pages"""
    subcategory = "playlist"
    pattern = BASE_PATTERN + r"/playlist(?:/|$)"
    example = "https://www.iwara.tv/playlist/playlist-id"

    def __init__(self, match):
        IwaraExtractor.__init__(self, match)
        parsed = urlparse(self.url)
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] == "playlist":
            self.playlist_id = parts[1]
        else:
            # no playlist ID in the URL; items() yields nothing
            self.playlist_id = None

    def items(self):
        if not self.playlist_id:
            return
        videos = self.api.collection(f"/playlist/{self.playlist_id}")
        if not videos:
            return
        for video in videos:
            video = self.api.item(f"/video/{video.get('id')}")
            user_info = self.extract_user_info(video)
            yield from self.yield_video(user_info, video)

class IwaraSearchExtractor(IwaraExtractor):
    """Extractor for iwara.tv search pages"""
    subcategory = "search"
    pattern = BASE_PATTERN + r"/search"
    example = "https://www.iwara.tv/search?query=example&type=search_type"

    def __init__(self, match):
        IwaraExtractor.__init__(self, match)
        # defaults, so both attributes exist even if the path check fails
        self.query = ""
        self.type = None
        parsed = urlparse(self.url)
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 1 and parts[0] == "search":
            query_dict = parse_qs(parsed.query)
            self.query = query_dict.get("query", [""])[0]
            self.type = query_dict.get("type", [None])[0]

    def items(self):
        collection = self.api.collection("/search", self.query)
        if self.type == "video":
            for video in collection:
                video = self.api.item(f"/video/{video.get('id')}")
                user_info = self.extract_user_info(video)
                yield from self.yield_video(user_info, video)
        elif self.type == "image":
            for image_group in collection:
                image_group = self.api.item(f"/image/{image_group.get('id')}")
                if not image_group:
                    # skip this result instead of aborting the whole search
                    continue
                user_info = self.extract_user_info(image_group)
                yield from self.yield_image(user_info, image_group)

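For reference, this is roughly how the search extractor's query handling behaves on a sample URL. The `search_params` helper is hypothetical and the URLs are made up; only `urlparse` and `parse_qs` from the standard library are used.

```python
from urllib.parse import urlparse, parse_qs


def search_params(url):
    """Return (query, type) from a search URL, with the diff's defaults."""
    query = parse_qs(urlparse(url).query)
    # parse_qs maps each key to a list of values; take the first one.
    return query.get("query", [""])[0], query.get("type", [None])[0]


print(search_params("https://www.iwara.tv/search?query=hello%20world&type=video"))
print(search_params("https://www.iwara.tv/search"))
```

Note that `parse_qs` already percent-decodes values, so the additional `unquote()` applied in `IwaraAPI.collection()` decodes a second time. That is harmless for most input but would alter a query containing a literal `%` sequence.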
class IwaraTagExtractor(IwaraExtractor):
    """Extractor for iwara.tv tag search"""
    subcategory = "tag"
    pattern = BASE_PATTERN + r"/(videos|images)(?:\?.*)?"
    example = "https://www.iwara.tv/videos?tags=example"

    def __init__(self, match):
        IwaraExtractor.__init__(self, match)
        # defaults, so both attributes exist even if the path check fails
        self.type = None
        self.tags = ""
        parsed = urlparse(self.url)
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 1 and parts[0] in ("videos", "images"):
            query_dict = parse_qs(parsed.query)
            self.type = parts[0]
            self.tags = query_dict.get("tags", [""])[0]

    def items(self):
        if not self.type:
            return
        collection = self.api.collection(f"/{self.type}", self.tags)
        if self.type == "videos":
            for video in collection:
                video = self.api.item(f"/video/{video.get('id')}")
                user_info = self.extract_user_info(video)
                yield from self.yield_video(user_info, video)
        elif self.type == "images":
            for image_group in collection:
                image_group = self.api.item(f"/image/{image_group.get('id')}")
                if not image_group:
                    # skip this result instead of aborting the whole listing
                    continue
                user_info = self.extract_user_info(image_group)
                yield from self.yield_image(user_info, image_group)

class IwaraAPI():
    """Interface for the Iwara API"""
    root = "https://api.iwara.tv"

    def __init__(self, extractor):
        self.extractor = extractor
        self.token = None
        self.headers = {}
        self.login("/user/login")

    def login(self, endpoint):
        url = self.root + endpoint
        email, password = self.extractor._get_auth_info()
        if not email or not password:
            return
        payload = {
            "email": email,
            "password": password
        }
        response = self.extractor.request(url, method="POST", json=payload)
        token = response.json().get("token")
        if token:
            self.token = token
            self.headers = {"Authorization": f"Bearer {token}"}
            self.extractor.log.info(f"Logged in as {email}.")

    def profile(self, endpoint):
        url = self.root + endpoint
        return self.extractor.request_json(url, headers=self.headers)

    def collection(self, endpoint, search=None):
        url = self.root + endpoint
        page = 0
        limit = 50
        # the name of the filter parameter depends on the calling extractor
        if self.extractor.subcategory == "search":
            params = {
                "query": unquote(search) if search else "",
                "page": page,
                "limit": limit,
                "type": self.extractor.type,
            }
        elif self.extractor.subcategory == "tag":
            params = {
                "tags": unquote(search) if search else "",
                "page": page,
                "limit": limit,
            }
        else:
            params = {
                "user": unquote(search) if search else "",
                "page": page,
                "limit": limit,
            }
        collection = []
        # request pages until one is empty or shorter than 'limit'
        while True:
            data = self.extractor.request_json(
                url,
                headers=self.headers,
                params=params
            )
            results = data.get("results", [])
            if not results:
                break
            collection.extend(results)
            if len(results) < limit:
                break
            params["page"] += 1
        return collection

    def item(self, endpoint):
        url = self.root + endpoint
        return self.extractor.request_json(url, headers=self.headers)

    def source(self, file_id, url):
        expiration = parse_qs(urlparse(url).query).get("expires", [None])[0]
        if not expiration:
            return []
        # X-Version is the SHA-1 of "<file_id>_<expires>_<postfix>"
        sha_postfix = "5nFp9kmbNnHdAFhaqMvt"
        sha_key = f"{file_id}_{expiration}_{sha_postfix}"
        digest = hashlib.sha1(sha_key.encode("utf-8")).hexdigest()
        headers = {"X-Version": digest, **self.headers}
        return self.extractor.request_json(url, headers=headers)
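Two pieces of `IwaraAPI` can be tried in isolation: the page loop in `collection()` and the `X-Version` digest in `source()`. The sketch below is a simplified stand-in under a few assumptions: `fetch_page` is a hypothetical callable replacing the HTTP request, the generator form is a design variation on the list-building loop in the diff, and the postfix and URL are placeholder values.

```python
import hashlib
from urllib.parse import urlparse, parse_qs


def paginate(fetch_page, limit=50):
    """Yield results page by page until a short or empty page arrives.

    'fetch_page' is a hypothetical callable that takes a page number and
    returns a dict with a "results" list, standing in for the API request.
    """
    page = 0
    while True:
        results = fetch_page(page).get("results", [])
        yield from results
        if len(results) < limit:
            return
        page += 1


def x_version(file_id, file_url, postfix):
    """SHA-1 hex digest of '<file_id>_<expires>_<postfix>', or None."""
    expires = parse_qs(urlparse(file_url).query).get("expires", [None])[0]
    if not expires:
        return None
    key = f"{file_id}_{expires}_{postfix}"
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


# Fake backend with 120 items, served 50 per page (illustrative data only).
ITEMS = list(range(120))
CALLS = []


def fake_fetch(page, limit=50):
    CALLS.append(page)
    return {"results": ITEMS[page * limit:(page + 1) * limit]}


print(len(list(paginate(fake_fetch))), CALLS)
print(x_version("abc", "https://example.org/file?expires=123", "salt"))
```

Yielding items as they arrive avoids holding a whole collection in memory before the first download starts, at the cost of not knowing the total count up front.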