Replies: 2 comments 1 reply
-
I ended up using Redis SUBSCRIBE to act as a buffered stream for the URL input, with the Go binary acting as the consumer of that stream. There is probably a simpler way to do this, but I'm happy that it works, and the memory footprint seems smaller than when I used Puppeteer before.
-
Yes, one goroutine per URL. There is no need for Redis on a single machine; use Redis or another database for a distributed system. The question is too general to answer in detail here; you can find the specifics by googling.
-
I'm new to concurrency. Say I want to implement a page crawler with the page pool, the same as the one in the example at https://github.com/go-rod/rod/blob/master/examples_test.go#L532, but modified to accept an array of URLs and do the crawling job. Is it best practice to just create a goroutine for every URL I want to crawl, or do I need to implement another pool for this? For example:
Watching the process in `htop`, it doesn't actually create a hundred threads as I thought it would, and it looks like Go uses all the cores of the current CPU. Is there a way to limit how many cores the process uses? Thanks 🙏