RSS feeds - The WWW

ckester
Nixers
Yeah, rawdog is still based on Python 2.7 and there doesn't seem to be any work on migrating it to 3. That's why I have had moving to sfeed in mind as a backburner project for quite a while now. Maybe I should get to work on that.

sfeed lends itself to a pipes-and-filters approach that I find congenial. I'm not sure I like the way sfeed_update "merges" feeds, but I haven't looked too deeply at that yet. I might end up using only the XML parser from sfeed and writing my own code to build the river-of-news pages.
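To make the pipes-and-filters idea concrete, here is a rough sketch of a river-of-news filter. It assumes the parser emits one item per line as TAB-separated fields, starting with a UNIX timestamp, then the title, then the link; that layout is an assumption to check against the sfeed documentation. The function names are made up for illustration.

```python
import html
import sys
from datetime import datetime, timezone


def parse_items(lines):
    """Parse TAB-separated lines into (timestamp, title, link) tuples."""
    items = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Assumed layout: timestamp, title, link, then any further fields.
        if len(fields) < 3 or not fields[0].isdigit():
            continue  # skip lines that do not match the assumed layout
        items.append((int(fields[0]), fields[1], fields[2]))
    return items


def render_river(items):
    """Render items newest-first as a minimal HTML list."""
    rows = []
    for timestamp, title, link in sorted(items, reverse=True):
        day = datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%Y-%m-%d")
        rows.append(
            '<li>%s <a href="%s">%s</a></li>'
            % (day, html.escape(link, quote=True), html.escape(title))
        )
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


def main():
    """Act as a filter: parsed items on stdin, HTML on stdout."""
    print(render_river(parse_items(sys.stdin)))
```

Saved as a script that calls main() at the bottom (say, a hypothetical river.py), it would sit at the end of a pipeline that feeds it the parser's output, so merging several feeds is just a matter of concatenating their lines before this step.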

A more serious concern with sfeed_update is that it doesn't seem to check whether a feed has actually been updated since the last time it was fetched. It looks like it uses curl(1) to download the whole feed each and every time. Not cool, generating unnecessary traffic like that. As I mentioned before, some sites include the entire article content in their feed.

(I need to go back and look at how rawdog is dealing with unchanged feeds: is it just a matter of making a conditional GET with If-Modified-Since? I seem to recall that the pertinent date info is stored in the rawdog db for each feed...and I don't see anything like that in the sfeedrc or generated files. But this is just a first impression and I could be wrong.)

UPDATE: yes, looking at the fetch() function in rawdog.py and the parse.args there, I see that rawdog uses feedparser's support for the ETag and/or Last-Modified headers to save bandwidth if and when the publisher supports them. With some changes to the sfeedrc file or the creation of a db similar to rawdog's, it shouldn't be too hard to add the same feature to sfeed_update's use of curl(1). So there's item #1 on the TODO list. ;)
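For reference, the conditional GET itself is simple. The sketch below is not rawdog's or sfeed's code, just a minimal illustration with Python's standard library; the function names and the shape of the `saved` dict (keys "etag" and "last_modified") are my own assumptions.

```python
import urllib.error
import urllib.request


def conditional_headers(saved):
    """Build request headers from validators saved on the previous fetch."""
    headers = {}
    if saved.get("etag"):
        headers["If-None-Match"] = saved["etag"]
    if saved.get("last_modified"):
        headers["If-Modified-Since"] = saved["last_modified"]
    return headers


def remember_validators(saved, response_headers):
    """Return a copy of `saved` updated with the validators the server sent."""
    updated = dict(saved)
    etag = response_headers.get("ETag")
    last_modified = response_headers.get("Last-Modified")
    if etag:
        updated["etag"] = etag
    if last_modified:
        updated["last_modified"] = last_modified
    return updated


def fetch_if_changed(url, saved):
    """Return (body, new_saved); body is None when the server answers 304."""
    request = urllib.request.Request(url, headers=conditional_headers(saved))
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.read(), remember_validators(saved, response.headers)
    except urllib.error.HTTPError as err:
        if err.code == 304:  # Not Modified: nothing new to download
            return None, saved
        raise
```

The `saved` dict would have to be persisted per feed between runs, which is the role the rawdog db plays. On the curl(1) side, the -z/--time-cond option and, in newer releases, --etag-save and --etag-compare cover the same ground, so a shell-only version should be possible too.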

