Horizontal scaling - Tasks in the memory #761
-
Hi @teehamaral, let's start with the second part of your message. I looked at your messages from last week, so I assume you were using the previous Docker image. Please try the new one that is now available. I will discuss it in more detail next week, because the new version uses entirely new memory management tools and is quite different. We designed it to be extremely efficient in memory usage. This version will undergo testing in Google Cloud Functions, which is not an easy environment, but it works great.

Regarding the first point, building a decentralized model is on the roadmap. For the decentralized approach, I am creating an engine from the ground up, which typically does not use the standard Docker setup. It distributes all the crawl tasks among clusters, with each cluster containing nodes that can collect and combine data before bringing it back. It employs classic algorithms from computer networks, which we will implement in the second quarter.

Right now, my most important focus has been efficiency and memory usage. So please try the new Docker image and report any issues; we will fix them to make sure it works effectively. Thank you so much.
-
It looks like new tasks are kept in memory, so how do you run it on multiple servers to scale horizontally? As it stands, querying a task ID to retrieve the results will be inconsistent whenever the request is routed to a server other than the one where the task was created.
Also, when creating tasks via the `/crawl` endpoint with multiple URLs (about 10), it consumes a good amount of memory; I saw peaks of 99%. Has anyone else run into this problem?
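
To make the task-ID inconsistency concrete, here is a minimal sketch in Python. It is not the project's actual code: `TaskStore`, `Server`, `create_task`, and `get_task` are hypothetical names used only for illustration, and the shared store is simulated with a plain in-process object. In a real deployment it would presumably be an external service (a database or a key-value store), which is an assumption on my part.

```python
import uuid
from typing import Dict, List, Optional


class TaskStore:
    """Minimal key-value store for task records (stand-in for an external store)."""

    def __init__(self) -> None:
        self._tasks: Dict[str, dict] = {}

    def put(self, task_id: str, record: dict) -> None:
        self._tasks[task_id] = record

    def get(self, task_id: str) -> Optional[dict]:
        return self._tasks.get(task_id)


class Server:
    """Hypothetical API server; illustrative only, not the real implementation."""

    def __init__(self, name: str, store: TaskStore) -> None:
        self.name = name
        self.store = store

    def create_task(self, urls: List[str]) -> str:
        # Generate an ID and record the task in whatever store this server uses.
        task_id = str(uuid.uuid4())
        self.store.put(
            task_id,
            {"urls": urls, "status": "pending", "created_on": self.name},
        )
        return task_id

    def get_task(self, task_id: str) -> Optional[dict]:
        return self.store.get(task_id)


# Case 1: every server keeps tasks in its own memory.
server_a = Server("a", TaskStore())
server_b = Server("b", TaskStore())
task_id = server_a.create_task(["https://example.com"])
lookup_private = server_b.get_task(task_id)  # server_b never saw this task

# Case 2: both servers read and write the same store.
shared = TaskStore()
server_c = Server("c", shared)
server_d = Server("d", shared)
task_id_shared = server_c.create_task(["https://example.com"])
lookup_shared = server_d.get_task(task_id_shared)  # visible from any server

print("private stores:", lookup_private)
print("shared store:", lookup_shared)
```

With private stores the lookup on the second server finds nothing, which is the behaviour I am worried about behind a load balancer. With a shared store, any server can answer for any task ID. Sticky routing (always sending a task's requests to the server that created it) would be another way around it, at the cost of tying each task to one machine.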