Downloader
The news/downloader.py module ships the two manager classes
that the news-fetching Celery tasks delegate to. Both share a
BaseManager that supplies a URL
blacklist check (against settings.NEWS_BLACKLIST) and a
Yahoo-video URL detector.
- ShareIndexEtfManager – RSS-based fetcher that reads Yahoo Finance’s RSS feed for a share / ETF / index symbol, follows the redirect to each article, parses metadata via metadata_parser (falling back to the RSS-embedded data on failure), and saves a News row per article.
- CoinManager – Karpet-based fetcher; pulls a coin’s recent news feed via Karpet.fetch_news and saves one News row per item.
Both managers skip URLs already present in the database and
URLs on the blacklist; both fetch() methods return the
count of new news rows created.
BaseManager
-
class richy.news.downloader.BaseManager
Bases: object
-
BaseManager.is_on_blacklist(url)
Checks whether the URL is on the blacklist (NEWS_BLACKLIST).
The check is based on the server name (a leading “www.” is stripped if present).
- Parameters:
url – URL to be checked.
- Returns:
Boolean: True if the URL is on the list, False otherwise.
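The server-name check described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the standalone function, its `blacklist` parameter, and the sample entries standing in for settings.NEWS_BLACKLIST are all assumptions.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for settings.NEWS_BLACKLIST; the real entries are not documented here.
NEWS_BLACKLIST = ["example.com", "spam.example.org"]


def is_on_blacklist(url, blacklist=NEWS_BLACKLIST):
    """Return True if the URL's server name is on the blacklist."""
    # hostname is None for strings that do not parse as absolute URLs.
    host = (urlparse(url).hostname or "").lower()
    # Strip a leading "www." so "www.example.com" matches "example.com".
    if host.startswith("www."):
        host = host[len("www."):]
    return host in blacklist
```

Note that this sketch compares the whole server name, so a subdomain such as news.example.com is not matched by an example.com entry; whether the real method behaves the same way is not stated in the documentation.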
-
BaseManager.is_video(url)
ShareIndexEtfManager
-
class richy.news.downloader.ShareIndexEtfManager(item)
Bases: BaseManager
-
ShareIndexEtfManager.fetch()
Fetches news for self.item.
Uses Yahoo’s RSS feed (NEWS_FEED) and, for each
item in the feed, tries to download metadata via
self.create_news().
URLs on the blacklist from settings (NEWS_BLACKLIST) are skipped.
- Returns:
Number of new news rows created.
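The skip-and-count behaviour of fetch() can be illustrated with a small, framework-free sketch. The function name and its parameters are hypothetical: a plain set stands in for the database lookup, and callables stand in for the blacklist check and create_news(). The order of the checks in the real method is not documented and is assumed here.

```python
def fetch_new(feed_urls, known_urls, is_blacklisted, create):
    """Create news for unseen, non-blacklisted URLs and return how many were created."""
    created = 0
    for url in feed_urls:
        # Skip URLs already stored (assumed check) and URLs on the blacklist.
        if url in known_urls or is_blacklisted(url):
            continue
        create(url)
        # Remember the URL so a duplicate later in the same feed is skipped.
        known_urls.add(url)
        created += 1
    return created
```

Returning the count of created rows rather than the number of feed items lets a caller, such as a Celery task, see how much genuinely new content arrived.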
-
ShareIndexEtfManager.parse_target(url)
If the given URL is a Yahoo redirect URL, tries to extract
the target URL from the script that Yahoo returns for the
original URL. Otherwise, returns the original URL unchanged.
- Parameters:
url – RSS feed URL.
- Returns:
Target URL.
-
ShareIndexEtfManager.create_news(url, e)
Tries to parse out the article’s metadata and save it as a news item.
If no metadata is found or an exception occurs, the RSS data is used instead.
- Parameters:
url – News URL.
e – RSS feed item data.
- Returns:
News model instance: the newly created news item.
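The metadata-first, RSS-fallback logic can be sketched as below. This is only an illustration under assumptions: `build_news_fields`, the `parse_metadata` callable, and the field names (`title`, `description`, `summary`) are hypothetical and do not reflect the actual News model or the metadata_parser API.

```python
def build_news_fields(url, rss_entry, parse_metadata):
    """Prefer parsed page metadata; fall back to the RSS entry on failure."""
    try:
        meta = parse_metadata(url)
    except Exception:
        # Any parsing or download error triggers the RSS fallback.
        meta = None
    if not meta:
        # No metadata found (or an exception occurred): use RSS-embedded data.
        meta = {
            "title": rss_entry.get("title", ""),
            "description": rss_entry.get("summary", ""),
        }
    return {
        "url": url,
        "title": meta.get("title", ""),
        "description": meta.get("description", ""),
    }
```

Treating "no metadata" and "exception" identically keeps a single fallback path, which matches the documented behaviour that RSS data is used in both cases.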
CoinManager
-
class richy.news.downloader.CoinManager(item)
Bases: BaseManager
-
CoinManager.fetch()
Fetches news using the Karpet library.
URLs on the blacklist from settings (NEWS_BLACKLIST) are skipped.
- Returns:
Number of new news rows created.