'charmap' codec can't decode byte 0x81 in position 1980: character maps to <undefined> #1026
Replies: 4 comments 1 reply
-
I have the same situation.
-
Same here.
-
Me too.
-
Yes, same problem here. I managed to isolate it to somewhere in the data pipeline within LiteLLM; it has something to do with the UTF-8 encoding process. I fixed it by downgrading LiteLLM, so it seems to be a version-specific bug. The downgrade caused other bugs, so I switched to non-LLM scraping, but downgrading might help you?
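For anyone trying to narrow this down: the message in the title is what Python raises when UTF-8 bytes are decoded with a Windows code page such as cp1252, where byte 0x81 has no mapping. The sketch below reproduces that message in isolation. It does not touch crawl4ai or LiteLLM, and whether this is the exact cause inside LiteLLM is an assumption; the helper name `try_decode` and the sample string are made up for illustration.

```python
from typing import Optional, Tuple


def try_decode(raw: bytes, encoding: str) -> Tuple[Optional[str], Optional[str]]:
    """Decode raw bytes; return (text, None) on success or (None, error message)."""
    try:
        return raw.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, str(exc)


# "Ł" (U+0141) is encoded in UTF-8 as the two bytes C5 81,
# so the encoded text contains the byte 0x81 from the error message.
text = "price: \u0141 100"
raw = text.encode("utf-8")

# Decoding with cp1252 (a common Windows default) fails on 0x81.
decoded, error = try_decode(raw, "cp1252")
print(error)

# Decoding with the encoding the bytes were written in succeeds.
decoded_utf8, error_utf8 = try_decode(raw, "utf-8")
print(decoded_utf8 == text)
```

If the same thing is happening inside a library that opens a file without an explicit encoding, running Python in UTF-8 mode (setting the environment variable `PYTHONUTF8=1`) is one thing worth trying. I have not confirmed that it fixes this particular issue.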
-
Hi, I'm trying to scrape cyber-event-related sites using crawl4ai and DeepSeek, but I keep encountering this error. I tried following the example implementations from the site and from videos, but they all produce the same error. Any help? Output and code below.
Extracted Items: [{'index': 0, 'error': True, 'tags': ['error'], 'content': "'charmap' codec can't decode byte 0x81 in position 1980: character maps to <undefined>"}]
import asyncio
import json
import os
from typing import List

from crawl4ai import AsyncWebCrawler, BrowserConfig, CacheMode, CrawlerRunConfig, LLMConfig
from crawl4ai.extraction_strategy import LLMExtractionStrategy
from pydantic import BaseModel, Field

URL_TO_SCRAPE = "https://www.bleepingcomputer.com/news/security/toyota-confirms-third-party-data-breach-impacting-customers/"

INSTRUCTION_TO_LLM = (
    "From the source, answer the following with one word and if it can't be determined answer with Undetermined: "
    "Threat actor type (Criminal, Hobbyist, Hacktivist, State Sponsored, etc), Industry, "
    "Motive (Financial, Political, Protest, Espionage, Sabotage, etc), Event Type (Exploitive, Disruptive, Mixed), Country, State, County. "
)


# Schema for the fields the LLM is asked to extract
class ThreatIntel(BaseModel):
    threat_actor_type: str = Field(..., alias="Threat actor type")
    industry: str
    motive: str
    event_type: str
    country: str
    state: str
    county: str


async def main():
    ...  # function body was not included in the post


if __name__ == "__main__":
    asyncio.run(main())
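As a side note, the failure only shows up as an entry with `'error': True` in the extracted items, so it is easy to miss. Below is a minimal sketch of separating failed entries from usable ones before processing them. The helper name `split_errors` is hypothetical, and the item shape is assumed from the `Extracted Items` output above rather than taken from the crawl4ai documentation.

```python
from typing import Any, Dict, List, Tuple

Item = Dict[str, Any]


def split_errors(items: List[Item]) -> Tuple[List[Item], List[Item]]:
    """Split extracted items into (usable, failed) based on the 'error' flag."""
    usable: List[Item] = []
    failed: List[Item] = []
    for item in items:
        # Items without an 'error' key are treated as usable.
        if item.get("error"):
            failed.append(item)
        else:
            usable.append(item)
    return usable, failed


# Sample input copied from the output in the post.
items = [
    {
        "index": 0,
        "error": True,
        "tags": ["error"],
        "content": "'charmap' codec can't decode byte 0x81 in position 1980: "
        "character maps to <undefined>",
    }
]

usable, failed = split_errors(items)
for item in failed:
    print(f"block {item['index']} failed: {item['content']}")
```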