Commit 14a8c9b

Add docs for process_response? and parse.
1 parent c6c332c commit 14a8c9b

1 file changed

+24
-0
lines changed
docs/scraper-reference.md

Lines changed: 24 additions & 0 deletions
@@ -187,6 +187,30 @@ More information about how filters work is available on the [Filter Reference](.

_Note: this filter is disabled by default._

### Processing responses before filters

These methods run before the filter stacks and can process responses directly.

* `process_response?(response)`

  Determines whether a response should be processed. The response is dropped if this method returns `false`.

  This is useful for filtering out pages based on their content, such as empty, invalid, or redirecting pages.

  Example: [lib/docs/scrapers/kotlin.rb](../lib/docs/scrapers/kotlin.rb)

* `parse(response)`

  Parses an HTTP or file response. By default, it converts the response to a Nokogiri document.

  Override this method if you want to modify the HTML source before Nokogiri parses it.
  This is useful for preserving whitespace in code segments within non-`pre` blocks, because Nokogiri may remove it.

  Example: [lib/docs/scrapers/go.rb](../lib/docs/scrapers/go.rb)
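
The two hooks above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the project's actual code: `Response`, `BaseScraper`, `ExampleScraper`, and `handle` are hypothetical stand-ins, the base `parse` returns the raw body where the real default would build a Nokogiri document, and the redirect check and `textarea` rewrite are only assumed examples of what an override might do.

```ruby
# Stand-in for the response object a scraper receives (hypothetical; a real
# response carries more than a body).
Response = Struct.new(:body)

# Hypothetical base class mimicking the two hooks. It is NOT the project's
# scraper class: `parse` returns the raw body instead of a Nokogiri document
# so that the sketch has no dependencies.
class BaseScraper
  # Assumed default: process any response with a non-blank body.
  def process_response?(response)
    !response.body.to_s.strip.empty?
  end

  def parse(response)
    response.body
  end

  # Runs the hooks in the order described: decide first, then parse.
  # Returns nil when the response is dropped.
  def handle(response)
    return nil unless process_response?(response)
    parse(response)
  end
end

class ExampleScraper < BaseScraper
  # Drop pages that only redirect through a meta refresh tag.
  def process_response?(response)
    return false unless super
    response.body !~ /http-equiv="refresh"/i
  end

  # Rewrite <textarea class="code"> blocks to <pre> before parsing, so that
  # whitespace inside them is kept.
  def parse(response)
    body = response.body.gsub(
      %r{<textarea class="code"[^>]*>(.*?)</textarea>}m,
      '<pre class="code">\1</pre>'
    )
    super(Response.new(body))
  end
end
```

Calling `super` in both overrides keeps the base behavior intact: the override only adds a condition or adjusts the source, and the default logic still does the rest.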

## Keeping scrapers up-to-date

To keep scrapers up-to-date, override the `get_latest_version(opts)` method. If `self.release` is defined, it should return the latest version of the documentation. If `self.release` is not defined, it should return the Epoch time when the documentation was last modified. If the documentation will never change, simply return `1.0.0`. The result of this method is periodically reported in a "Documentation versions report" issue, which helps maintainers keep track of outdated documentation.
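
The three return conventions described above might look like the sketch below. The class names and the `opts[:tags]` and `opts[:last_modified]` keys are hypothetical and exist only to keep the example self-contained; a real scraper would obtain this information from the upstream project, and the actual contents of `opts` are not specified here.

```ruby
require 'time'

# Case 1: `release` is defined, so return the latest version string.
class VersionedDocs
  def self.release
    '2.3.0'
  end

  def get_latest_version(opts)
    # opts[:tags] is an assumed list of upstream version strings.
    # Gem::Version compares numerically, so 2.10.0 sorts above 2.9.1.
    opts[:tags].max_by { |tag| Gem::Version.new(tag) }
  end
end

# Case 2: no `release`, so return the Epoch time of the last modification.
class UnversionedDocs
  def get_latest_version(opts)
    # opts[:last_modified] is an assumed ISO 8601 timestamp.
    Time.parse(opts[:last_modified]).to_i
  end
end

# Case 3: the documentation never changes.
class FrozenDocs
  def get_latest_version(opts)
    '1.0.0'
  end
end
```

Comparing versions with `Gem::Version` instead of plain string comparison avoids ranking `2.9.1` above `2.10.0`.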
