How to easily crawl article content
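A simple way to crawl an article from the web is the Python library newspaper (installed from PyPI as newspaper3k), which provides the Article class.
You pass it a URL and get back the parsed content: the title, publication date, and top image, plus a summary and keywords after the NLP step. The library's documentation describes the full API.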
Code sample
from newspaper import Article

url = "https://fivethirtyeight.com/features/what-were-watching-in-the-nhls-playoff-races/"
article = Article(url)

# Download html
article.download()

# Get information
article.parse()
article.nlp()

print("title", article.title, "\n")
print("publish_date", article.publish_date, "\n")
print("top_image", article.top_image, "\n")
print("summary", article.summary, "\n")
print("keywords", article.keywords, "\n")
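If you want to store the result rather than print it, the same steps can be wrapped in a small helper. The sketch below is one possible way to do that, not part of the library: the function names article_to_dict and fetch_article are hypothetical, and it assumes publish_date is either a datetime or empty when no date was found.

```python
def article_to_dict(article):
    """Collect the fields printed in the sample into a plain dict.

    `article` is assumed to be a parsed newspaper Article (or any object
    exposing the same attributes). Hypothetical helper, not a library API.
    """
    publish_date = article.publish_date
    return {
        "title": article.title,
        # Assumption: publish_date is a datetime, or empty/None if missing.
        "publish_date": publish_date.isoformat()
        if hasattr(publish_date, "isoformat")
        else None,
        "top_image": article.top_image,
        "summary": article.summary,
        "keywords": list(article.keywords),
    }


def fetch_article(url):
    """Download, parse, and summarize one article, returning a dict."""
    # Imported here so article_to_dict can be used without newspaper installed.
    from newspaper import Article

    article = Article(url)
    article.download()  # fetch the HTML
    article.parse()     # extract title, date, image, text
    article.nlp()       # compute summary and keywords
    return article_to_dict(article)
```

Calling fetch_article(url) with the URL from the sample needs network access and the newspaper package; the returned dict can then be written out with json.dumps or appended to a list when crawling several links.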