Inconsistent HttpParser fields #1437

Open
@JJ-Author

Description

Describe the bug
The field values of the request object for a GET request differ between plain http and intercepted https in an inconsistent and potentially incorrect way. This hinders a correct implementation of a custom archive redirection plugin.

  • protocol is always None. Is this field supposed to carry the scheme? If not, how can a plugin tell whether the request is https?
  • _url is incomplete (scheme and host are missing) for https requests
  • host is None for the https request
  • port is incorrectly reported as 80 for the https request (which is actually sent to 443)
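One possible workaround for the scheme question is sketched below. It does not use proxy.py itself: `TunnelTracker` is a hypothetical helper that a plugin could keep per client connection, fed with the method, host and port values shown in the log further down. It assumes that every request following a CONNECT on the same connection is an intercepted https request, and that one helper instance is kept per client connection. Neither assumption is confirmed here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TunnelTracker:
    """Remembers the CONNECT target seen on one client connection (hypothetical helper)."""

    connect_host: Optional[bytes] = None
    connect_port: Optional[int] = None

    def observe(self, method: bytes, host: Optional[bytes], port: Optional[int]) -> None:
        # Only the CONNECT request carries the real upstream host and port.
        if method == b"CONNECT":
            self.connect_host = host
            self.connect_port = port

    def scheme(self) -> str:
        # Assumption: a request that follows a CONNECT on this connection
        # is an intercepted https request.
        return "https" if self.connect_host is not None else "http"


tracker = TunnelTracker()
tracker.observe(b"CONNECT", b"www.w3id.org", 443)  # first request in the log
tracker.observe(b"GET", None, 80)                  # intercepted GET: host None, port 80
print(tracker.scheme(), tracker.connect_host, tracker.connect_port)
```

With the values from the log, the tracker still reports https, www.w3id.org and 443 after the intercepted GET, even though that request's own host and port fields say None and 80.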

To Reproduce
The issue can be reproduced as follows:

  1. Run poetry install, then poetry shell, then python proxy/request_proxy.py in the root dir of https://github.com/kuefmz/https-interception-proxypy/
  2. curl -x http://0.0.0.0:8899/ --cacert ca-cert.pem https://www.w3id.org/simulation/ontology/
  3. curl -x http://0.0.0.0:8899/ --cacert ca-cert.pem http://www.w3id.org/simulation/ontology/

Expected behavior

  • protocol should not be None but http or https (if it is meant to carry the scheme)
  • _url should be the full request URL (including scheme and host) for intercepted https requests
  • host should be the FQDN for https requests, as it is for http requests
  • port should be reported correctly (443 for the intercepted https request)
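To make the expected _url concrete, here is a minimal sketch of how the full URL could be rebuilt from values that are present in both cases: the Host header and the request path. `full_url` and `DEFAULT_PORTS` are illustrative names, not part of proxy.py, and the scheme has to be supplied by the caller because the request object does not provide it.

```python
from typing import Optional

# Default ports used to decide whether an explicit port can be dropped.
DEFAULT_PORTS = {"http": 80, "https": 443}


def full_url(scheme: str, host_header: bytes, path: Optional[bytes]) -> str:
    """Rebuild an absolute URL from the Host header and the request path."""
    host = host_header.decode("ascii")
    # Drop an explicit port only when it is the default for the scheme.
    name, sep, port = host.rpartition(":")
    if sep and port.isdigit() and int(port) == DEFAULT_PORTS.get(scheme):
        host = name
    # A missing path (as in the CONNECT request) is treated as "/".
    return f"{scheme}://{host}{(path or b'/').decode('ascii')}"


print(full_url("https", b"www.w3id.org", b"/simulation/ontology/"))
print(full_url("http", b"www.w3id.org", b"/simulation/ontology/"))
```

For the two curl commands above, this yields the https and http variants of the same URL, which is what the report expects _url to contain.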

Version information

  • OS: [ubuntu 20.04]
  • Browser [curl]
  • Device: [amd64]
  • proxy.py Version [2.4.4]

Additional context

Log from the custom proxy, showing a selection of fields and the full properties of the request object for each request:

2024-08-02 15:51:26,940 - pid:209673 [I] plugins.load:89 - Loaded plugin proxy.http.proxy.HttpProxyPlugin
2024-08-02 15:51:26,940 - pid:209673 [I] plugins.load:89 - Loaded plugin __main__.RequestPlugin
Request _url: www.w3id.org:443
Request.method: b'CONNECT'
Request protocol: None
Request host: b'www.w3id.org'
Request path: None
Request properties: {'state': 6, 'type': 1, 'protocol': None, 'host': b'www.w3id.org', 'port': 443, 'path': None, 'method': b'CONNECT', 'code': None, 'reason': None, 'version': b'HTTP/1.1', 'total_size': 116, 'buffer': None, 'headers': {b'host': (b'Host', b'www.w3id.org:443'), b'user-agent': (b'User-Agent', b'curl/7.81.0'), b'proxy-connection': (b'Proxy-Connection', b'Keep-Alive')}, 'body': None, 'chunk': None, '_url': , '_is_chunked_encoded': False, '_content_expected': False, '_is_https_tunnel': True}

Request _url: /simulation/ontology/
Request.method: b'GET'
Request protocol: None
Request host: None
Request path: b'/simulation/ontology/'
Request properties: {'state': 6, 'type': 1, 'protocol': None, 'host': None, 'port': 80, 'path': b'/simulation/ontology/', 'method': b'GET', 'code': None, 'reason': None, 'version': b'HTTP/1.1', 'total_size': 96, 'buffer': None, 'headers': {b'host': (b'Host', b'www.w3id.org'), b'user-agent': (b'User-Agent', b'curl/7.81.0'), b'accept': (b'Accept', b'*/*')}, 'body': None, 'chunk': None, '_url': <proxy.http.url.Url object at 0x7f50b472bc10>, '_is_chunked_encoded': False, '_content_expected': False, '_is_https_tunnel': False}
2024-08-02 15:51:35,421 - pid:209683 [I] server.access_log:388 - 127.0.0.1:59292 - CONNECT www.w3id.org:443 - 683 bytes - 680.55ms

Request _url: http://www.w3id.org/simulation/ontology/
Request.method: b'GET'
Request protocol: None
Request host: b'www.w3id.org'
Request path: b'/simulation/ontology/'
Request properties: {'state': 6, 'type': 1, 'protocol': None, 'host': b'www.w3id.org', 'port': 80, 'path': b'/simulation/ontology/', 'method': b'GET', 'code': None, 'reason': None, 'version': b'HTTP/1.1', 'total_size': 145, 'buffer': None, 'headers': {b'host': (b'Host', b'www.w3id.org'), b'user-agent': (b'User-Agent', b'curl/7.81.0'), b'accept': (b'Accept', b'*/*'), b'proxy-connection': (b'Proxy-Connection', b'Keep-Alive')}, 'body': None, 'chunk': None, '_url': <proxy.http.url.Url object at 0x7f50b4520610>, '_is_chunked_encoded': False, '_content_expected': False, '_is_https_tunnel': False}
2024-08-02 15:51:44,094 - pid:209688 [I] server.access_log:388 - 127.0.0.1:49428 - GET www.w3id.org:80/simulation/ontology/ - 301 Moved Permanently - 573 bytes - 479.09ms
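The three requests in this log use three different HTTP/1.1 request-target forms, which may explain why host and _url differ: the CONNECT carries only host:port, the intercepted GET carries only a path, and the plain http GET through the proxy carries a full URL. The sketch below classifies the targets with the standard library only. `target_form` is an illustrative name, and whether proxy.py derives its fields from these forms in exactly this way is an assumption that merely fits the output above.

```python
from urllib.parse import urlsplit


def target_form(method: bytes, target: bytes) -> str:
    """Name the request-target form of an HTTP/1.1 request line."""
    if method == b"CONNECT":
        return "authority-form"   # host:port, no scheme or path
    if target.startswith(b"/"):
        return "origin-form"      # path only; the host is in the Host header
    if target == b"*":
        return "asterisk-form"    # server-wide OPTIONS
    if urlsplit(target).scheme:
        return "absolute-form"    # full URL, as sent to a forward proxy
    return "unknown"


# The three request targets seen in the log, in order.
for method, target in [
    (b"CONNECT", b"www.w3id.org:443"),
    (b"GET", b"/simulation/ontology/"),
    (b"GET", b"http://www.w3id.org/simulation/ontology/"),
]:
    print(method.decode(), target.decode(), "->", target_form(method, target))
```

Only the absolute-form target contains both scheme and host, which matches the one request in the log whose _url and host are complete.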

Labels: Question (Questions related to proxy server)

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions