Christoph Califice
2025-10-09 20:05:31 -03:00
parent ed22ef22bc
commit 0a5f88d75a
1442 changed files with 101562 additions and 0 deletions

@@ -0,0 +1,32 @@
# Ruby Common Files
The files in this folder are a shared codebase for any scrapers written in Ruby. If you use any Ruby scrapers, we suggest downloading this whole folder and placing it alongside your scrapers, in the same relative location it has in this repo.
## Ruby and Gems
Before you use files from this folder or any Ruby script, make sure that Ruby and the Faraday gem are installed in your instance. If you are using Docker, open a console/shell for the container and run `apk add --no-cache ruby && gem install faraday`.
In a future version of Stash, Ruby and Faraday will be included and won't require a separate install.
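If you are unsure whether a gem is installed, a quick check from Ruby itself can help. The following is a minimal sketch; `gem_available?` is a hypothetical helper name and is not part of this folder.

```Ruby
# Hypothetical helper (not part of rb_common): returns true if the named
# library can be required, false if Ruby raises a LoadError for it.
def gem_available?(name)
  require name
  true
rescue LoadError
  false
end

if gem_available?("faraday")
  puts "Faraday is installed"
else
  puts "Faraday is missing, install it with 'gem install faraday'"
end
```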
## Configs
Navigate to the `/configs` folder to set any api_keys and endpoints that you may need to configure for your scrapers to work. There is a readme in that folder as well with more details.
## GraphQL
Navigate to the `/graphql` folder for more details on the shared GraphQL interfaces that your Ruby scrapers can leverage.
## Logger
The `logger.rb` file defines a shared class that Ruby scrapers can leverage to output logs at the correct log level instead of everything coming through as an error log.
Once it is required, consider assigning the logger to a variable so that you can call it in short form:
```Ruby
logger = Stash::Logger
logger.info("This log will be output as an 'INFO' level log")
```
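For context, Stash reads log lines from stderr and expects each one to start with the SOH control character, a one-letter level code, and the STX control character. The sketch below illustrates that encoding on its own; `LEVEL_CODES` and `encode_log_line` are hypothetical names used only for this example, and the real logic lives in `Stash::Logger`.

```Ruby
# Hypothetical illustration of the log line encoding; not part of logger.rb.
# Maps each level name to the one-letter code described in logger.rb.
LEVEL_CODES = { trace: "t", debug: "d", info: "i", warning: "w", error: "e" }.freeze

def encode_log_line(level, message)
  code = LEVEL_CODES.fetch(level)
  # SOH (\x01) + level code + STX (\x02), followed by the message itself
  "\x01#{code}\x02#{message}"
end

# Stash picks the line up from stderr and files it under the INFO level.
$stderr.puts(encode_log_line(:info, "Scrape finished"))
```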
## Simple Mass Requirement
Your scraper can require all of the common interface files at once with something like `require_relative "rb_common/rb_common"`.

@@ -0,0 +1,42 @@
# Common Configs
## Adding your settings to an existing config
Various Ruby scrapers use these configs so that each one does not need to ask for your API keys and endpoints separately. For example, to configure your Stash instance, open `stash_config.rb` in any text editor and add your details to `USER_CONFIG`. You might change:
```Ruby
USER_CONFIG = {
endpoint: "http://localhost:9999",
api_key: ""
}
```
to (with a made-up endpoint and key for illustration):
```Ruby
USER_CONFIG = {
endpoint: "http://192.168.0.99:6969",
api_key: "thisIsAFakeAPIKeyTheRealOneIsMuchLonger"
}
```
## Calling these configs in your scraper
The config values are defined as class methods, so to use them in your scraper you only need to require the config file, either directly with something like `require_relative "rb_common/configs/stash_config"` or by requiring all the common interfaces with something like `require_relative "rb_common/rb_common"`.
Once required, your script can access them via calls like `Config::Stash.api_key` and `Config::Stash.endpoint`.
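A minimal sketch of reading these values follows. To keep it runnable on its own, it defines stand-in `ConfigBase` and `Config::Stash` classes inline instead of requiring the real files; in a scraper you would use `require_relative` as described above. The `graphql_url` variable is a hypothetical name.

```Ruby
# Stand-in for configs/config_base.rb so that this sketch runs on its own.
class ConfigBase
  class << self
    def endpoint
      self::USER_CONFIG[:endpoint]
    end

    def api_key
      self::USER_CONFIG[:api_key]
    end
  end
end

# Stand-in for configs/stash_config.rb with the default values.
module Config
  class Stash < ConfigBase
    USER_CONFIG = { endpoint: "http://localhost:9999", api_key: "" }.freeze
  end
end

# Build the GraphQL URL from the configured endpoint.
graphql_url = "#{Config::Stash.endpoint}/graphql"
puts graphql_url
puts "No API key has been configured" if Config::Stash.api_key.empty?
```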
## Creating a new config
If you want to create a new common configuration for use with your scrapers, here are the important things to know about the implementation.
You will first need to require and inherit from the base config. The existing configs are namespaced under `Config::`; this is not mandatory, but it is suggested for consistency. You can see this in the other configs:
```Ruby
require_relative "config_base"
module Config
class Stash < ConfigBase
```
From there, the base class expects you to define a `USER_CONFIG` hash. The base class contains `endpoint` and `api_key` methods that look for keys of the same name in `USER_CONFIG`. If a key is not present, the corresponding method returns nil. You are free to add more than the default two keys to your `USER_CONFIG`, but you will need to define their accessor methods on your child class.
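As a sketch of what such a child class might look like, the example below adds one extra key. `Config::ExampleSite`, its endpoint, and the `timeout` key are all hypothetical, and `ConfigBase` is redefined inline only so that the example runs on its own.

```Ruby
# Stand-in for configs/config_base.rb so that this sketch runs on its own.
class ConfigBase
  class << self
    def endpoint
      self::USER_CONFIG[:endpoint]
    end

    def api_key
      self::USER_CONFIG[:api_key]
    end
  end
end

module Config
  # Hypothetical config for an imaginary service.
  class ExampleSite < ConfigBase
    # No api_key entry here, so the inherited api_key method returns nil.
    USER_CONFIG = {
      endpoint: "https://example.com/api",
      timeout: 30
    }.freeze

    # Extra keys need their own accessor, because ConfigBase only
    # provides endpoint and api_key.
    def self.timeout
      USER_CONFIG[:timeout]
    end
  end
end

puts Config::ExampleSite.endpoint
puts Config::ExampleSite.timeout
puts Config::ExampleSite.api_key.inspect
```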

@@ -0,0 +1,15 @@
# frozen_string_literal: true
class ConfigBase
class << self
def endpoint
return nil unless self::USER_CONFIG[:endpoint]
self::USER_CONFIG[:endpoint]
end
def api_key
return nil unless self::USER_CONFIG[:api_key]
self::USER_CONFIG[:api_key]
end
end
end

@@ -0,0 +1,14 @@
# frozen_string_literal: true
require_relative "config_base"
module Config
class Stash < ConfigBase
# Tweak user settings below. An API Key can be generated in Stash's setting page
# ( Settings > Security > Authentication )
USER_CONFIG = {
endpoint: "http://localhost:9999",
api_key: ""
}
end
end

@@ -0,0 +1,13 @@
# frozen_string_literal: true
require_relative "config_base"
module Config
class StashDB < ConfigBase
# Tweak user settings below. An API Key can be generated in StashDB's user page
USER_CONFIG = {
endpoint: "https://stashdb.org/graphql",
api_key: ""
}
end
end

@@ -0,0 +1,53 @@
# Common GraphQL
## Base Class
The base class is currently very basic. It requires the `faraday` gem and the logger for you, and defines a `query` method that takes up to two arguments. The first argument is your GraphQL query; the second, which defaults to nil, is any variables you would like to pass with your query.
It defines a private `logger` method that is shorthand for the shared Stash logger class `Stash::Logger`.
It defines a private `standard_api_headers` method that returns:
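The way the two arguments end up in the request can be sketched without Faraday. `build_body` below is a hypothetical helper that mirrors how `query` assembles the JSON body; it is not a method of the base class.

```Ruby
require "json"

# Hypothetical helper mirroring how GraphQLBase#query builds its POST body.
def build_body(query, variables = nil)
  body = { "query" => query }
  # The "variables" key is only added when variables were supplied.
  body["variables"] = variables if variables
  body.to_json
end

puts build_body("query { version { version } }")
puts build_body("query FindGallery($id: ID!) { findGallery(id: $id) { path } }", { "id": 7 })
```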
```Ruby
{
"Content-Type": "application/json",
"Accept": "application/json",
"DNT": "1",
}
```
Child classes are expected to define an `@extra_headers` variable on initialization with any ApiKey or similar headers that may be required, along with the `@url` that `query` posts to.
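A minimal sketch of such a child class is shown below. `GraphQL::Example`, its URL, and the public `request_headers` helper are hypothetical and exist only so that the merged headers can be inspected; the stand-in `GraphQLBase` omits the HTTP request that the real base class performs.

```Ruby
# Stand-in base class so that this sketch runs without Faraday.
class GraphQLBase
  # Hypothetical public helper: exposes the headers that query would send.
  def request_headers
    standard_api_headers.merge(@extra_headers)
  end

  private

  def standard_api_headers
    { "Content-Type": "application/json", "Accept": "application/json", "DNT": "1" }
  end
end

module GraphQL
  # Hypothetical interface for an imaginary GraphQL service.
  class Example < GraphQLBase
    def initialize(api_key)
      @url = "https://example.com/graphql"
      @extra_headers = { "ApiKey": api_key }
    end
  end
end

headers = GraphQL::Example.new("fake-key").request_headers
puts headers
```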
## Stash Interface
The Stash Interface is designed so that raw queries are written in HEREDOC format as private methods, for example:
```Ruby
def gallery_path_query
<<-'GRAPHQL'
query FindGallery($id: ID!) {
findGallery(id: $id) {
path
}
}
GRAPHQL
end
```
Those raw queries can then be called by any public method, for example:
```Ruby
def get_gallery_path(gallery_id)
query(gallery_path_query, id_variables(gallery_id))["findGallery"]
end
```
As seen in the above example, there is also a private helper method that serves as shorthand for when the only variable passed to the query is an ID:
```Ruby
def id_variables(id)
variables = {
"id": id
}
end
```
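The three pieces above fit together as in the following sketch. `FakeStash` is a hypothetical stand-in whose `query` returns canned data instead of contacting a Stash instance, and the gallery path it returns is made up.

```Ruby
# Hypothetical stand-in for the Stash interface, with query stubbed out.
class FakeStash
  # Public method: combines the raw query with the ID variables.
  def get_gallery_path(gallery_id)
    query(gallery_path_query, id_variables(gallery_id))["findGallery"]
  end

  private

  # Stub: the real query method posts to the GraphQL endpoint.
  def query(_query, variables = nil)
    { "findGallery" => { "path" => "/data/galleries/#{variables[:id]}.zip" } }
  end

  # Note that the `"id": id` syntax creates the symbol key :id.
  def id_variables(id)
    { "id": id }
  end

  def gallery_path_query
    <<-'GRAPHQL'
      query FindGallery($id: ID!) {
        findGallery(id: $id) {
          path
        }
      }
    GRAPHQL
  end
end

gallery = FakeStash.new.get_gallery_path(42)
puts gallery["path"]
```

Note that `get_gallery_path` returns the whole `findGallery` hash, so the caller still reads `"path"` from it.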

@@ -0,0 +1,71 @@
# frozen_string_literal: true
# Loading dependencies in a begin block so that we can give nice errors if they are missing
begin
# Logger should always be the first dependency we try to load so that we know
# if we can depend on it when logging other errors.
require_relative "../logger"
# JSON is needed for body.to_json and JSON.parse in the query method
require 'json'
require 'faraday'
rescue LoadError => error
if error.message.match?(/logger$/)
# If the logger isn't present, manually insert the level character when
# logging (this is what the logger class from the file would have done for us)
error_level_char = "\x01e\x02"
STDERR.puts(error_level_char + "[GraphQL] Missing 'logger.rb' file in the rb_common folder.")
exit
end
logger = Stash::Logger
if error.message.match?(/faraday$/)
logger.error("[GraphQL] Faraday gem is not installed, please install it with 'gem install faraday'")
else
logger.error("[GraphQL] Unexpected error #{error.class} encountered: #{error.message}")
end
exit
end
class GraphQLBase
def query(query, variables = nil)
headers = standard_api_headers.merge(@extra_headers)
connection = Faraday.new(url: @url, headers: headers)
response = connection.post do |request|
body = { "query" => query }
body["variables"] = variables if variables
request.body = body.to_json
end
case response.status
when 200
result = JSON.parse(response.body)
# GraphQL responses report failures in a top-level "errors" array
if result["errors"]
result["errors"].each do |error|
logger.error("GraphQL error: #{error}")
end
exit!
else
result["data"]
end
when 401
logger.error("[GraphQL] HTTP Error 401, Unauthorized. Make sure you have added an API Key to the relevant config file (for example 'stash_config.rb') in the 'rb_common/configs' folder")
return nil
else
logger.error("[GraphQL] Query failed: #{response.status} - #{response.body}")
return nil
end
end
private
def logger
Stash::Logger
end
def standard_api_headers
{
"Content-Type": "application/json",
"Accept": "application/json",
"DNT": "1",
}
end
end

@@ -0,0 +1,684 @@
# frozen_string_literal: true
# Loading dependencies in a begin block so that we can give nice errors if they are missing
begin
require_relative "graphql_base"
require_relative "../configs/stash_config"
rescue LoadError => error
logger = Stash::Logger
if error.message.match?(/graphql_base$/)
logger.error("[GraphQL] Missing 'graphql/graphql_base.rb' file in the rb_common folder.")
elsif error.message.match?(/configs\/stash_config$/)
logger.error("[GraphQL] Missing 'configs/stash_config.rb' file in the rb_common folder.")
else
logger.error("[GraphQL] Unexpected error #{error.class} encountered: #{error.message}")
end
exit
end
module GraphQL
class Stash < GraphQLBase
def initialize(referer: nil)
@api_key = Config::Stash.api_key
@url = Config::Stash.endpoint + "/graphql"
@extra_headers = { "ApiKey": @api_key }
@extra_headers["Referer"] = referer if referer
end
def configuration
query(configuration_query)["configuration"]
end
def get_scene(scene_id)
query(find_scene_query, id_variables(scene_id))["findScene"]
end
def get_gallery(gallery_id)
query(find_gallery_query, id_variables(gallery_id))["findGallery"]
end
def get_gallery_path(gallery_id)
query(gallery_path_query, id_variables(gallery_id))["findGallery"]
end
private
def id_variables(id)
variables = {
"id": id
}
end
def configuration_query
<<-'GRAPHQL'
query Configuration {
configuration {
...ConfigData
}
}
fragment ConfigData on ConfigResult {
general {
...ConfigGeneralData
}
interface {
...ConfigInterfaceData
}
dlna {
...ConfigDLNAData
}
scraping {
...ConfigScrapingData
}
defaults {
...ConfigDefaultSettingsData
}
}
fragment ConfigGeneralData on ConfigGeneralResult {
stashes {
path
excludeVideo
excludeImage
}
databasePath
generatedPath
metadataPath
cachePath
calculateMD5
videoFileNamingAlgorithm
parallelTasks
previewAudio
previewSegments
previewSegmentDuration
previewExcludeStart
previewExcludeEnd
previewPreset
maxTranscodeSize
maxStreamingTranscodeSize
writeImageThumbnails
apiKey
username
password
maxSessionAge
trustedProxies
logFile
logOut
logLevel
logAccess
createGalleriesFromFolders
videoExtensions
imageExtensions
galleryExtensions
excludes
imageExcludes
customPerformerImageLocation
scraperUserAgent
scraperCertCheck
scraperCDPPath
stashBoxes {
name
endpoint
api_key
}
}
fragment ConfigInterfaceData on ConfigInterfaceResult {
menuItems
soundOnPreview
wallShowTitle
wallPlayback
maximumLoopDuration
noBrowser
autostartVideo
autostartVideoOnPlaySelected
continuePlaylistDefault
showStudioAsText
css
cssEnabled
language
slideshowDelay
disabledDropdownCreate {
performer
tag
studio
}
handyKey
funscriptOffset
}
fragment ConfigDLNAData on ConfigDLNAResult {
serverName
enabled
whitelistedIPs
interfaces
}
fragment ConfigScrapingData on ConfigScrapingResult {
scraperUserAgent
scraperCertCheck
scraperCDPPath
excludeTagPatterns
}
fragment ConfigDefaultSettingsData on ConfigDefaultSettingsResult {
scan {
useFileMetadata
stripFileExtension
scanGeneratePreviews
scanGenerateImagePreviews
scanGenerateSprites
scanGeneratePhashes
scanGenerateThumbnails
}
identify {
sources {
source {
...ScraperSourceData
}
options {
...IdentifyMetadataOptionsData
}
}
options {
...IdentifyMetadataOptionsData
}
}
autoTag {
performers
studios
tags
__typename
}
generate {
sprites
previews
imagePreviews
previewOptions {
previewSegments
previewSegmentDuration
previewExcludeStart
previewExcludeEnd
previewPreset
}
markers
markerImagePreviews
markerScreenshots
transcodes
phashes
}
deleteFile
deleteGenerated
}
fragment ScraperSourceData on ScraperSource {
stash_box_index
stash_box_endpoint
scraper_id
}
fragment IdentifyMetadataOptionsData on IdentifyMetadataOptions {
fieldOptions {
...IdentifyFieldOptionsData
}
setCoverImage
setOrganized
includeMalePerformers
}
fragment IdentifyFieldOptionsData on IdentifyFieldOptions {
field
strategy
createMissing
}
GRAPHQL
end
def find_scene_query
<<-'GRAPHQL'
query FindScene($id: ID!, $checksum: String) {
findScene(id: $id, checksum: $checksum) {
...SceneData
}
}
fragment SceneData on Scene {
id
checksum
oshash
title
details
url
date
rating
o_counter
organized
path
phash
interactive
file {
size
duration
video_codec
audio_codec
width
height
framerate
bitrate
}
paths {
screenshot
preview
stream
webp
vtt
chapters_vtt
sprite
funscript
}
scene_markers {
...SceneMarkerData
}
galleries {
...SlimGalleryData
}
studio {
...SlimStudioData
}
movies {
movie {
...MovieData
}
scene_index
}
tags {
...SlimTagData
}
performers {
...PerformerData
}
stash_ids {
endpoint
stash_id
}
}
fragment SceneMarkerData on SceneMarker {
id
title
seconds
stream
preview
screenshot
scene {
id
}
primary_tag {
id
name
aliases
}
tags {
id
name
aliases
}
}
fragment SlimGalleryData on Gallery {
id
checksum
path
title
date
url
details
rating
organized
image_count
cover {
file {
size
width
height
}
paths {
thumbnail
}
}
studio {
id
name
image_path
}
tags {
id
name
}
performers {
id
name
gender
favorite
image_path
}
scenes {
id
title
path
}
}
fragment SlimStudioData on Studio {
id
name
image_path
stash_ids {
endpoint
stash_id
}
parent_studio {
id
}
details
rating
aliases
}
fragment MovieData on Movie {
id
checksum
name
aliases
duration
date
rating
director
studio {
...SlimStudioData
}
synopsis
url
front_image_path
back_image_path
scene_count
scenes {
id
title
path
}
}
fragment SlimTagData on Tag {
id
name
aliases
image_path
}
fragment PerformerData on Performer {
id
checksum
name
url
gender
twitter
instagram
birthdate
ethnicity
country
eye_color
height
measurements
fake_tits
career_length
tattoos
piercings
aliases
favorite
image_path
scene_count
image_count
gallery_count
movie_count
tags {
...SlimTagData
}
stash_ids {
stash_id
endpoint
}
rating
details
death_date
hair_color
weight
}
GRAPHQL
end
def find_gallery_query
<<-'GRAPHQL'
query FindGallery($id: ID!) {
findGallery(id: $id) {
...GalleryData
}
}
fragment GalleryData on Gallery {
id
checksum
path
created_at
updated_at
title
date
url
details
rating
organized
images {
...SlimImageData
}
cover {
...SlimImageData
}
studio {
...SlimStudioData
}
tags {
...SlimTagData
}
performers {
...PerformerData
}
scenes {
...SlimSceneData
}
}
fragment SlimImageData on Image {
id
checksum
title
rating
organized
o_counter
path
file {
size
width
height
}
paths {
thumbnail
image
}
galleries {
id
path
title
}
studio {
id
name
image_path
}
tags {
id
name
}
performers {
id
name
gender
favorite
image_path
}
}
fragment SlimStudioData on Studio {
id
name
image_path
stash_ids {
endpoint
stash_id
}
parent_studio {
id
}
details
rating
aliases
}
fragment SlimTagData on Tag {
id
name
aliases
image_path
}
fragment PerformerData on Performer {
id
checksum
name
url
gender
twitter
instagram
birthdate
ethnicity
country
eye_color
height
measurements
fake_tits
career_length
tattoos
piercings
aliases
favorite
image_path
scene_count
image_count
gallery_count
movie_count
tags {
...SlimTagData
}
stash_ids {
stash_id
endpoint
}
rating
details
death_date
hair_color
weight
}
fragment SlimSceneData on Scene {
id
checksum
oshash
title
details
url
date
rating
o_counter
organized
path
phash
interactive
file {
size
duration
video_codec
audio_codec
width
height
framerate
bitrate
}
paths {
screenshot
preview
stream
webp
vtt
chapters_vtt
sprite
funscript
}
scene_markers {
id
title
seconds
}
galleries {
id
path
title
}
studio {
id
name
image_path
}
movies {
movie {
id
name
front_image_path
}
scene_index
}
tags {
id
name
}
performers {
id
name
gender
favorite
image_path
}
stash_ids {
endpoint
stash_id
}
}
GRAPHQL
end
def gallery_path_query
<<-'GRAPHQL'
query FindGallery($id: ID!) {
findGallery(id: $id) {
path
}
}
GRAPHQL
end
end
end

@@ -0,0 +1,49 @@
# frozen_string_literal: true
module Stash
class Logger
class << self
# Log messages sent from a script scraper instance are transmitted via stderr
# and are encoded with a prefix consisting of the special character SOH, then
# the log level (one of t, d, i, w, or e - corresponding to trace, debug, info,
# warning and error levels respectively), then the special character STX.
def trace(text)
log("t", text)
end
def debug(text)
log("d", text)
end
def info(text)
log("i", text)
end
def warning(text)
log("w", text)
end
def error(text)
log("e", text)
end
private
def log(level, text)
level_char = control_wrap(level)
# Replace inline base64 image data with a short placeholder so that it does
# not flood the logs. The regex was carried over from the py_common version.
text = text.to_s.gsub(/data:image.+?;base64(.+?')/, "[...]")
text.split("\n").each { |message| STDERR.puts(level_char + message) }
end
def control_wrap(level)
# Wraps the string between the SOH and STX control characters
"\x01#{level}\x02"
end
end
end
end

@@ -0,0 +1,18 @@
id: rb_common
name: rb_common
metadata: {}
version: 39ff3d5
date: "2023-11-22 00:08:53"
requires: []
source_repository: https://stashapp.github.io/CommunityScrapers/stable/index.yml
files:
- configs/README.md
- configs/stashdb_config.rb
- configs/stash_config.rb
- configs/config_base.rb
- README.md
- graphql/README.md
- graphql/graphql_base.rb
- graphql/stash.rb
- logger.rb
- rb_common.rb

@@ -0,0 +1,9 @@
# frozen_string_literal: true
# Require this file to require all current common interfaces at once
require_relative "configs/config_base"
require_relative "configs/stash_config"
require_relative "configs/stashdb_config"
require_relative "graphql/graphql_base"
require_relative "graphql/stash"
require_relative "logger"