HCMUS-Scraper

HCMUS-News Scraper: a web scraper that crawls news from my university's websites

Inspired by this GitHub repo

Visit the Results page

Websites that I scrape:

Technology:

At first I used Scrapy, but one of the pages I wanted to crawl loads its content dynamically with JavaScript, so I switched to Selenium.
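To illustrate the difference, here is a minimal sketch (not the actual code of this project) of loading a JavaScript-rendered page with Selenium and then extracting links from the resulting HTML. The function names `fetch_rendered_html` and `extract_links`, the `wait_css` selector, and the use of headless Chrome are all assumptions made for the example.

```python
from html.parser import HTMLParser


class LinkTextParser(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only collect text while inside an <a> tag that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, text) tuples found in the HTML string."""
    parser = LinkTextParser()
    parser.feed(html)
    parser.close()
    return parser.links


def fetch_rendered_html(url, wait_css, timeout=10):
    """Open url in headless Chrome and return the HTML once wait_css appears.

    wait_css is a hypothetical CSS selector for an element that only exists
    after the page's JavaScript has finished loading the news list.
    """
    # Imported lazily so extract_links works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, wait_css))
        )
        return driver.page_source
    finally:
        driver.quit()
```

A static page can be passed straight to `extract_links`; only the pages that build their content in the browser need the slower `fetch_rendered_html` step first.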

What I have learned:

Working with JSON, basic GitHub CI/CD, scraping static and dynamic content, and getting around websites' anti-scraping blocks.
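As an example of the JSON part, the sketch below shows one possible way to merge freshly scraped items into a JSON file without storing duplicates. The `merge_news` helper and the `url` and `title` field names are hypothetical and may not match the format this project actually uses.

```python
import json
from pathlib import Path


def merge_news(path, new_items):
    """Append items whose 'url' is not yet stored in the JSON file at path.

    Returns the list of items that were actually added.
    """
    path = Path(path)
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    seen = {item["url"] for item in existing}

    added = []
    for item in new_items:
        if item["url"] not in seen:
            seen.add(item["url"])
            added.append(item)

    # ensure_ascii=False keeps Vietnamese titles readable in the file.
    path.write_text(
        json.dumps(existing + added, ensure_ascii=False, indent=2),
        encoding="utf-8",
    )
    return added
```

Running a helper like this on every scheduled CI run would keep the stored file growing only when new articles appear.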