This could be really useful. Edge uses the Chromium engine, so hopefully you could use it there as well.
Article scanning tool for newspapers
-
I am using BAS to develop a tool for scanning new articles on over 200 newspaper pages. I fetch articles from one of three sources:
Articles listed in the sitemap.
Articles available in the RSS feed.
Articles displayed on the homepage.
However, yesterday I encountered a case where an article did not appear on the homepage or in the sitemap. Instead, it was located in a subcategory. Is there a more comprehensive approach to scanning articles so that none are missed?
-
Your task shouldn't be very difficult to implement, but BAS is not the right tool for it. Instead, consider using other tools. If you have some experience with Python, take a look at https://scrapy.org.
Scrapy, for example, ships with a CrawlSpider class that follows links across a site according to rules you define, so it can reach articles that are linked only from subcategory pages. That might be exactly what you need.
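To make the idea concrete, here is a rough standard-library sketch of the link-following approach: start at the homepage, walk into category and subcategory pages on the same host, and collect anything that looks like an article. This is not Scrapy code, only a small stand-in for what a crawling spider does. The `is_article` rule (URLs ending in `.html`) and the `max_depth` limit are assumptions for illustration; each newspaper will need its own rule.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def fetch(url):
    """Download a page as text (network call; swap in your own fetcher)."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def is_article(url):
    """Hypothetical rule: article URLs end in .html. Adjust per site."""
    return urlparse(url).path.endswith(".html")


def crawl(start_url, fetch=fetch, is_article=is_article, max_depth=2):
    """Breadth-first crawl of one site; returns the set of article URLs found."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    articles = set()
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        parser = LinkParser()
        parser.feed(fetch(url))
        for href in parser.hrefs:
            # Resolve relative links and drop #fragments before comparing.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc != host or link in seen:
                continue  # skip external links and pages already handled
            seen.add(link)
            if is_article(link):
                articles.add(link)
            elif depth < max_depth:
                # Treat non-article pages as (sub)category listings to visit.
                queue.append((link, depth + 1))
    return articles
```

Calling `crawl("https://news.example/")` (a placeholder address) would return the article URLs reachable within two levels of category pages. You can then merge that set with what you already get from the sitemap and RSS feed and only process the URLs you have not seen before.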