|
| 1 | + |
| 2 | +geckordp |
| 3 | += |
| 4 | + |
| 5 | +This is a client implementation of the Firefox remote debug protocol in Python. |
| 6 | + |
| 7 | +The protocol is used by the Firefox DevTools in remote debug mode. It essentially exposes the raw API for interacting with the debug server and has some similarities with a common webdriver. See also the [documentation](https://geckordp.readthedocs.io/en/latest). |
| 8 | + |
| 9 | +***What's possible with geckordp?*** |
| 10 | + |
| 11 | +Geckordp is meant to be used as a low-level library to build tools on top of. |
| 12 | +With a few helpers such as the WebExtension API and a proxy server, it can be feature-rich enough for: |
| 13 | + |
| 14 | ++ web ui-testing |
| 15 | ++ extension testing |
| 16 | ++ browser test tools |
| 17 | ++ webdriver |
| 18 | ++ data scraping |
| 19 | ++ https recording |
| 20 | ++ network traffic analysis |
| 21 | ++ remote controller for browser |
| 22 | ++ ...and possibly more |
| 23 | + |
| 24 | + |
| 25 | +## Getting Started |
| 26 | +# <!-- REMOVE --> |
| 27 | + |
| 28 | +To use Geckordp, install it with: |
| 29 | + |
| 30 | +```bash |
| 31 | +pip install geckordp |
| 32 | +# python -m pip install geckordp |
| 33 | +# python -m pip install geckordp[develop] |
| 34 | +``` |
| 35 | +Documentation can be generated with: |
| 36 | +``` |
| 37 | +sphinx-build -a -c docs/src -b html docs/build docs |
| 38 | +``` |
| 39 | + |
| 40 | +## Usage |
| 41 | +# <!-- REMOVE --> |
| 42 | + |
| 43 | +```python |
| 44 | +import json |
| 45 | +from geckordp.rdp_client import RDPClient |
| 46 | +from geckordp.actors.root import RootActor |
| 47 | +from geckordp.profile import ProfileManager |
| 48 | +from geckordp.firefox import Firefox |
| 49 | + |
| 50 | + |
| 51 | +""" Uncomment to enable debug output |
| 52 | +""" |
| 53 | +#from geckordp.settings import GECKORDP |
| 54 | +#GECKORDP.DEBUG = 1 |
| 55 | +#GECKORDP.DEBUG_REQUEST = 1 |
| 56 | +#GECKORDP.DEBUG_RESPONSE = 1 |
| 57 | + |
| 58 | + |
| 59 | +def main(): |
| 60 | + # clone default profile to 'geckordp' |
| 61 | + pm = ProfileManager() |
| 62 | + profile_name = "geckordp" |
| 63 | + port = 6000 |
| 64 | + pm.clone("default-release", profile_name) |
| 65 | + profile = pm.get_profile_by_name(profile_name) |
| 66 | + profile.set_required_configs() |
| 67 | + |
| 68 | + # start firefox with specified profile |
| 69 | + Firefox.start("https://example.com/", |
| 70 | + port, |
| 71 | + profile_name, |
| 72 | + ["-headless"]) |
| 73 | + |
| 74 | + # create client and connect to firefox |
| 75 | + client = RDPClient() |
| 76 | + client.connect("localhost", port) |
| 77 | + |
| 78 | + # initialize root |
| 79 | + root = RootActor(client) |
| 80 | + |
| 81 | + # get a list of tabs |
| 82 | + tabs = root.list_tabs() |
| 83 | + print(json.dumps(tabs, indent=2)) |
| 84 | + |
| 85 | + input() |
| 86 | + |
| 87 | +if __name__ == "__main__": |
| 88 | + main() |
| 89 | +``` |
| 90 | +For more examples see [here](https://github.com/reapler/geckordp/issues). |
| 91 | + |
| 92 | + |
| 93 | +## Tested Platforms |
| 94 | +# <!-- REMOVE --> |
| 95 | + |
| 96 | +| Tested Platform | Working | Firefox-Version | |
| 97 | +| -------------------------------------------| ------------------------| ------------------------| |
| 98 | +| Windows (x64) | yes | 89.0 | |
| 99 | +| Ubuntu 20.04 | yes | 89.0 | |
| 100 | +| macOS 12 | ? | 89.0 | |
| 101 | + |
| 102 | +Geckordp requires at least Python 3.7 and the latest Firefox build. Older versions of Firefox may also work as long as the API changes are not too drastic. If in doubt, run the tests with: |
| 103 | +```bash |
| 104 | +pytest tests/ |
| 105 | +``` |
| 106 | + |
| 107 | + |
| 108 | +## Contribute |
| 109 | +# <!-- REMOVE --> |
| 110 | + |
| 111 | +Any help in the form of issues or pull requests is very much appreciated. If you would like to improve the project, there are a few things to keep in mind: |
| 112 | + |
| 113 | +For submitted code: |
| 114 | +* formatting |
| 115 | +* tests required (if new) |
| 116 | +* should basically reflect the geckodriver api (if possible) |
| 117 | + |
| 118 | +For issues or improvements see [here](https://github.com/reapler/geckordp/issues/new). |
| 119 | + |
| 120 | +For feature requests, I suggest simply asking on the issue tracker. |
| 121 | + |
| 122 | + |
| 123 | +## Develop |
| 124 | +# <!-- REMOVE --> |
| 125 | + |
| 126 | +To get started, here is a rough list of some notable objectives: |
| 127 | + |
| 128 | +* add remaining actors from geckodriver |
| 129 | +* add documentation for *all* actors and their functions (even the official repository has none) |
| 130 | + |
| 131 | + |
| 132 | +For example, let's say you would like to implement an actor from [here](https://github.com/mozilla/gecko-dev/tree/master/devtools/shared/specs). |
| 133 | + |
| 134 | +Since there is no documentation, you have no idea what exactly it does or where the actor ID is derived from. |
| 135 | +To bridge this knowledge gap without fully understanding the geckodriver source code, I suggest capturing the packets you need (in my opinion this is simpler and more straightforward). |
| 136 | + |
| 137 | +[Here](https://github.com/reapler/geckordp/blob/master/dev) you can view the pcap-dumps that are already available. In addition to the dumps, the converter script makes it easy to follow and understand the traffic between client and server. |
| 138 | +To convert a pcap-dump, run this command in the 'dev/' folder: |
| 139 | +``` |
| 140 | +python converter.py -i connect-navigate.pcapng |
| 141 | +``` |
| 142 | + |
| 143 | +# <!-- REMOVE --> |
| 144 | + |
| 145 | +You can also record the traffic yourself. |
| 146 | + |
| 147 | +For this example I will use [Wireshark](https://www.wireshark.org/), but any other packet capture software should work. |
| 148 | + |
| 149 | +First, one Firefox instance needs to be started like this: |
| 150 | + |
| 151 | +``` |
| 152 | +firefox -new-instance -no-remote -new-window http://example.com/ -p geckordp --start-debugger-server 6000 |
| 153 | +``` |
| 154 | +The other instance can be started with: |
| 155 | + |
| 156 | +``` |
| 157 | +firefox -new-instance -no-remote -new-window about:debugging#/setup |
| 158 | +``` |
| 159 | +Once both Firefox instances are running, start Wireshark and record the loopback interface. |
| 160 | + |
| 161 | +In the second Firefox instance, under the "Network Location" section, add **localhost:6000** and connect to the first instance. |
| 162 | + |
| 163 | +Now you can use the DevTools as you wish. Adding a comment to the packets before each action should make the capture easier to analyze later. |
| 164 | + |
| 165 | + |
| 166 | +To get a better overview, it is a good idea to filter out the unnecessary packets. |
| 167 | + |
| 168 | +These filters should do a great job: |
| 169 | +``` |
| 170 | +# filter by port |
| 171 | +tcp.port == 6000 |
| 172 | + |
| 173 | +# with actual json payload |
| 174 | +tcp.port == 6000 && tcp.payload |
| 175 | + |
| 176 | +# with payload and comments |
| 177 | +tcp.port == 6000 && tcp.payload || frame.comment |
| 178 | +``` |
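
When reading the captured payloads, it may help to know that Firefox frames each JSON packet on the wire as `length:JSON`, where the length is the decimal byte count of the JSON text. The following is a minimal sketch of splitting such a byte stream into Python dictionaries. The helpers `frame` and `split_rdp_packets` and the sample packets are made up for illustration and are not part of Geckordp; bulk data packets are not handled.

```python
import json


def frame(packet: dict) -> bytes:
    """Encode a dict as b"<length>:<json>" (hypothetical helper)."""
    body = json.dumps(packet).encode("utf-8")
    return str(len(body)).encode("ascii") + b":" + body


def split_rdp_packets(stream: bytes) -> list:
    """Split a raw byte stream into decoded JSON packets.

    Assumes every packet is framed as b"<length>:<json>", where <length>
    is the byte count of the JSON text. Incomplete trailing data is
    ignored. Hypothetical helper, not part of Geckordp.
    """
    packets = []
    pos = 0
    while pos < len(stream):
        sep = stream.find(b":", pos)
        if sep == -1:
            break  # no complete length prefix left
        length = int(stream[pos:sep])
        start, end = sep + 1, sep + 1 + length
        if end > len(stream):
            break  # packet body is not complete yet
        packets.append(json.loads(stream[start:end].decode("utf-8")))
        pos = end
    return packets


# sample stream: one complete packet followed by a truncated one
sample = frame({"to": "root", "type": "listTabs"}) + frame({"from": "root"})[:8]
print(split_rdp_packets(sample))
```

Because the length counts bytes rather than characters, the sketch slices the raw bytes first and decodes afterwards.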
| 179 | + |
| 180 | + |
| 181 | +You can also always ask the Mozilla developers on their [Matrix](https://chat.mozilla.org) [channels](https://wiki.mozilla.org/Matrix#Software_Development) or here in the [issues](https://github.com/reapler/geckordp/issues) section for help. |
| 182 | + |
| 183 | + |
| 184 | +## Technical Details |
| 185 | +# <!-- REMOVE --> |
| 186 | + |
| 187 | +To be able to communicate with the server, a pre-configured profile is required. |
| 188 | + |
| 189 | +Geckordp provides helper functions for this through the [ProfileManager](https://github.com/reapler/geckordp/blob/master/geckordp/profile.py). |
| 190 | + |
| 191 | +The following preferences are changed when the profile is configured: |
| 192 | + |
| 193 | + ### disable crash-recover after 'ungraceful' process termination |
| 194 | + ("browser.sessionstore.resume_from_crash", False) |
| 195 | + |
| 196 | + ### disable safe-mode after 'ungraceful' process termination |
| 197 | + ("browser.sessionstore.max_resumed_crashes", 0) |
| 198 | + ("toolkit.startup.max_resumed_crashes", -1) |
| 199 | + ("browser.sessionstore.restore_on_demand", False) |
| 200 | + ("browser.sessionstore.restore_tabs_lazily", False) |
| 201 | + |
| 202 | + ### set download folder (not set by firefox) |
| 203 | + ("browser.download.dir", str(Path.home())) |
| 204 | + |
| 205 | + ### enable compatibility |
| 206 | + ("devtools.chrome.enabled", True) |
| 207 | + |
| 208 | + ### don't open dialog to accept connections from client |
| 209 | + ("devtools.debugger.prompt-connection", False) |
| 210 | + |
| 211 | + ### enable remote debugging |
| 212 | + ("devtools.debugger.remote-enabled", True) |
| 213 | + |
| 214 | + ### allow tab isolation (for e.g. separate cookie-jar) |
| 215 | + ("privacy.userContext.enabled", True) |
| 216 | + |
| 217 | + ### misc |
| 218 | + ("devtools.cache.disabled", True) |
| 219 | + ("browser.aboutConfig.showWarning", False) |
| 220 | + ("browser.tabs.warnOnClose", False) |
| 221 | + ("browser.tabs.warnOnCloseOtherTabs", False) |
| 222 | + ("browser.shell.skipDefaultBrowserCheckOnFirstRun", True) |
| 223 | + ("pdfjs.firstRun", True) |
| 224 | + ("doh-rollout.doneFirstRun", True) |
| 225 | + ("browser.startup.firstrunSkipsHomepage", True) |
| 226 | + ("browser.tabs.warnOnOpen", False) |
| 227 | + ("browser.warnOnQuit", False) |
| 228 | + ("toolkit.telemetry.reportingpolicy.firstRun", False) |
| 229 | + ("trailhead.firstrun.didSeeAboutWelcome", True) |
| 230 | + |
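
For reference, Firefox reads such preferences from `user_pref(...)` lines in a profile's `prefs.js` or `user.js`. The sketch below renders a few of the (name, value) pairs listed above in that syntax. The function `render_user_pref` is hypothetical, and how `ProfileManager` writes the values internally is not shown here.

```python
import json


def render_user_pref(name: str, value) -> str:
    """Render one preference in Firefox's user_pref() syntax."""
    # json.dumps yields true/false for bools, bare numbers for ints
    # and double-quoted, escaped text for strings
    return f"user_pref({json.dumps(name)}, {json.dumps(value)});"


prefs = [
    ("devtools.debugger.remote-enabled", True),
    ("devtools.debugger.prompt-connection", False),
    ("toolkit.startup.max_resumed_crashes", -1),
]
for name, value in prefs:
    print(render_user_pref(name, value))
```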
| 231 | +# <!-- REMOVE --> |
| 232 | +Once the new profile has been created, Firefox can be started with it. |
| 233 | +However, the actors need to be initialized first. |
| 234 | + |
| 235 | +Some actors need additional function calls to be initialized on the server side. |
| 236 | +This is not always necessary and depends on what is actually needed. |
| 237 | +According to the [pcap-dumps](https://github.com/reapler/geckordp/blob/master/dev), these actors and their required functions are initialized and used in the following order: |
| 238 | + |
| 239 | + |
| 240 | +| Browser initialization: |
| 241 | + |
| 242 | + RDPClient() -> .connect() |
| 243 | + v |
| 244 | + RootActor() -> .get_root() |
| 245 | + v |
| 246 | + DeviceActor() -> .get_description() |
| 247 | + v |
| 248 | + ProcessActor() -> .get_target() |
| 249 | + v |
| 250 | + WebConsoleActor() -> .start_listeners([]) |
| 251 | + v |
| 252 | + ContentProcessActor() -> .list_workers() |
| 253 | + |
| 254 | +| Tab initialization: |
| 255 | + |
| 256 | + TabActor() -> .get_target()* |
| 257 | + v |
| 258 | + BrowsingContextActor() -> .attach()* |
| 259 | + v |
| 260 | + WebConsoleActor() -> .start_listeners([])* |
| 261 | + v |
| 262 | + ThreadActor() -> .attach()* |
| 263 | + v |
| 264 | + WatcherActor() -> .watch_resources(...)* |
| 265 | + v |
| 266 | + TargetConfigurationActor() |
| 267 | + |
| 268 | + |
| 269 | +\**required if this actor will be used or events are wanted* |
| 270 | + |
| 271 | +# <!-- REMOVE --> |
| 272 | +The following hierarchy diagram shows the dependencies between the actors and how to initialize individual actors: |
| 273 | + |
| 274 | +<img src="actor-hierarchy.png"> |
| 275 | + |
| 276 | +For debugging purposes, Geckordp can be configured to print requests and responses, which helps to understand the structure of the JSON packets. |
| 277 | +To enable this, use: |
| 278 | + |
| 279 | +```python |
| 280 | +from geckordp.settings import GECKORDP |
| 281 | +GECKORDP.DEBUG = 1 |
| 282 | +GECKORDP.DEBUG_REQUEST = 1 |
| 283 | +GECKORDP.DEBUG_RESPONSE = 1 |
| 284 | +# environment variables can also be used, e.g. |
| 285 | +# GECKORDP_DEBUG_RESPONSE=1 |
| 286 | +``` |
| 287 | + |
| 288 | +# <!-- REMOVE --> |
| 289 | + |
| 290 | +Other noteworthy general hints, issues or experiences: |
| 291 | + |
| 292 | +* actors initialized on new blank tabs (including related calls such as attach, watch or listen) may get detached after visiting a new URL and must then be reinitialized (this can be avoided if the page has an HTML header & body) |
| 293 | +* received messages are plain Python dictionaries, and most of the time they have consistent fields that can be accessed directly |
| 294 | +* failed requests will return 'None' |
| 295 | +* actors can have multiple contexts, which means different actor IDs can share the same actor model (e.g. WebConsoleActor for a process or a tab) |
| 296 | +* functions called within manually registered **async** handlers on RDPClient cannot call functions that call 'RDPClient.request_response()' later in their execution path (use non-async handlers in this case instead) |
| 297 | +* after a Firefox update, a few events may no longer be caught by the RDPClient handler, or requests may receive a wrong response; unfortunately, a few event/response packets don't follow the same pattern, so events must be specified manually in Geckordp, which can have these side effects |
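
The hints about plain dictionaries and 'None' results can be handled with a small guard before accessing fields. The sketch below is independent of Geckordp; the helper `field_or_default` and the sample response are hypothetical and only illustrate the pattern.

```python
from typing import Any, Optional


def field_or_default(response: Optional[dict], *keys: str, default: Any = None) -> Any:
    """Walk nested keys of a response that may be None (failed request)."""
    current: Any = response
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# made-up response shape, for illustration only
response = {"selected": 0, "meta": {"title": "Example"}}
print(field_or_default(response, "meta", "title"))
print(field_or_default(None, "meta", "title", default="<failed>"))
```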
| 298 | + |
| 299 | + |
| 300 | +## License |
| 301 | +# <!-- REMOVE --> |
| 302 | +``` |
| 303 | +MIT License |
| 304 | + |
| 305 | +Copyright (c) 2021 reapler |
| 306 | + |
| 307 | +Permission is hereby granted, free of charge, to any person obtaining |
| 308 | +a copy of this software and associated documentation files (the |
| 309 | +"Software"), to deal in the Software without restriction, including |
| 310 | +without limitation the rights to use, copy, modify, merge, publish, |
| 311 | +distribute, sublicense, and/or sell copies of the Software, and to |
| 312 | +permit persons to whom the Software is furnished to do so, subject to |
| 313 | +the following conditions: |
| 314 | + |
| 315 | +The above copyright notice and this permission notice shall be |
| 316 | +included in all copies or substantial portions of the Software. |
| 317 | + |
| 318 | +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, |
| 319 | +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF |
| 320 | +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND |
| 321 | +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE |
| 322 | +LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION |
| 323 | +OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION |
| 324 | +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
| 325 | +``` |
| 326 | + |
| 327 | +<!-- CLASS_INDEX --> |