Downloader

The news/downloader.py module ships the two manager classes that the news-fetching Celery tasks delegate to. Both share a BaseManager that supplies a URL blacklist check (against settings.NEWS_BLACKLIST) and a Yahoo-video URL detector.

  • ShareIndexEtfManager – RSS-based fetcher that reads Yahoo Finance’s RSS feed for a share / ETF / index symbol, follows the redirect to each article, parses metadata via metadata_parser (falling back to the RSS-embedded data on failure), and saves a News row per article.

  • CoinManager – Karpet-based fetcher; pulls a coin’s recent news feed via Karpet.fetch_news and saves one News row per item.

Both managers skip URLs already present in the database and URLs on the blacklist; both fetch() methods return the count of new news rows created.
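The skip-and-count behaviour described above can be illustrated with a minimal, framework-free sketch. Everything in it is an assumption made for illustration: `save_new_items`, the in-memory `existing_urls` set, and the `saved` list stand in for the real database lookup and News row creation, which are not shown in this reference.

```python
from typing import Callable, Iterable, List, Set


def save_new_items(
    urls: Iterable[str],
    existing_urls: Set[str],
    is_blocked: Callable[[str], bool],
    saved: List[str],
) -> int:
    """Store unseen, non-blacklisted URLs and return how many were added.

    Hypothetical helper: the real managers query the database and create
    News rows instead of using an in-memory set and list.
    """
    created = 0
    for url in urls:
        # Skip anything already stored or rejected by the blacklist check.
        if url in existing_urls or is_blocked(url):
            continue
        saved.append(url)
        existing_urls.add(url)  # also guards against repeats within one feed
        created += 1
    return created
```

Returning the count rather than the saved objects mirrors the documented contract of both fetch() methods.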

BaseManager

class richy.news.downloader.BaseManager

Bases: object

BaseManager.is_on_blacklist(url)

Checks whether the URL is on the blacklist (NEWS_BLACKLIST). The check is based on the server name, with a leading “www.” stripped if present.

Parameters:

url – URL to be checked.

Returns:

Boolean: True if the URL is on the list, False otherwise.
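A server-name check of this kind might look like the sketch below. It is not the module's actual code: the standalone function, the `blacklist` parameter (assumed to be a collection of bare host names, passed in rather than read from settings), and the exact-match comparison are all assumptions.

```python
from urllib.parse import urlparse


def is_on_blacklist(url, blacklist):
    """Return True if the URL's host name is in `blacklist`.

    Assumption: `blacklist` holds bare host names such as "example.com".
    """
    # hostname is None for strings that do not parse as absolute URLs.
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]
    return host in blacklist
```

For example, `is_on_blacklist("https://www.example.com/story", {"example.com"})` returns True, while a URL on `news.example.org` returns False against the same set.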

BaseManager.is_video(url)

ShareIndexEtfManager

class richy.news.downloader.ShareIndexEtfManager(item)

Bases: BaseManager

ShareIndexEtfManager.fetch()

Fetches news for self.item. Uses Yahoo’s RSS feed (NEWS_FEED) and, for each item in the feed, tries to download metadata inside self.create_news().

Also applies the blacklist from settings (NEWS_BLACKLIST).

Returns:

Number of news items downloaded.

ShareIndexEtfManager.parse_target(url)

If the given URL is a Yahoo redirect URL, tries to fetch the target URL from the script that Yahoo returns for the original URL.

Otherwise, returns the original URL.

Parameters:

url – RSS feed URL.

Returns:

Target URL.

ShareIndexEtfManager.create_news(url, e)

Tries to parse the article’s metadata and save it to the news item. If no metadata is found or an exception occurs, the RSS data is used instead.

Parameters:
  • url – News URL.

  • e – RSS feed item data.

Returns:

News model instance: the newly created news item.
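The fallback from parsed metadata to RSS data can be sketched as follows. This is an illustration under stated assumptions, not the method's implementation: `build_news_fields` is a hypothetical helper, `parse_metadata` is a stand-in callable for whatever metadata_parser call the module makes, and the `title` and `description` keys are assumed field names. The real method saves a News model instance rather than returning a dict.

```python
from typing import Callable, Dict


def build_news_fields(
    url: str,
    rss_entry: Dict[str, str],
    parse_metadata: Callable[[str], Dict[str, str]],
) -> Dict[str, str]:
    """Prefer parsed page metadata; fall back to the RSS entry."""
    try:
        meta = parse_metadata(url)
    except Exception:
        # Any parsing or network failure triggers the RSS fallback.
        meta = None
    # An empty result counts as "no metadata" as well.
    source = meta if meta else rss_entry
    return {
        "url": url,
        "title": source.get("title", ""),
        "description": source.get("description", ""),
    }
```

Treating an empty result the same as an exception keeps a single fallback path, which matches the description that RSS data is used in both cases.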

CoinManager

class richy.news.downloader.CoinManager(item)

Bases: BaseManager

CoinManager.fetch()

Fetches news with the karpet library. Also applies the blacklist from settings (NEWS_BLACKLIST).

Returns:

Number of news items downloaded.