About News Feeds


The following describes technical details about RSS and Atom feeds that are not required knowledge for normal use of multiFEED. Feel free to skip this topic unless you are encountering problems with specific feeds or are considering the use of some of the more exotic features of multiFEED (or are just curious).

If you have used Google or other web readers to access your news feeds, or if you just hit the Subscribe button on a site with available feeds, you may not know much about what feeds really are, or what limitations they might have. Although other news formats exist, by far the two most popular are RSS 2 and Atom. RSS was the first of the two, with Atom feeds following some years later to address some of the shortcomings of the earlier format. Both formats package their news into XML documents, but internally the information is stored differently, and some feed data is unique to each format, making it difficult to write a single reader application that displays feed content consistently.

RSS is the original news feed XML document standard, and consequently common (erroneous) usage refers to all feed formats as RSS, even though the name properly refers only to a specific XML structure. Since RSS began almost spontaneously as a way for content providers to package articles for consumers, it was initially very loosely defined. Other than the absolute basics, much of the useful information about the feed may be found in varying, or sometimes almost random, places in the XML document, or may not be included at all. Even if the data you want is where you look, it may not appear as you expect, since data formatting is left almost entirely to the feed publisher's imagination. Over time, some de facto conventions have been adopted by the publishing community, but many big content publishers ignore or modify the usual RSS XML structure to suit their own needs, and many of the smaller ones just don't know what they are doing.

There are actually multiple RSS feed formats, the most popular by far being RSS 2. As of this writing the other versions are 0.9x, 1.0, and 1.1. RSS 2 descends from the 0.91 and 0.92 formats, but it has almost nothing in common with the RDF-based 1.0 and 1.1 formats, which were developed by unaffiliated and competing individuals or groups and are largely incompatible with the rest. Of the older formats, RSS 1.0 is the most established and remained in use after the advent of RSS 2; because it is built around namespaced extension modules, in some ways RSS 1.0 is actually more advanced than RSS 2. RSS 1.0 feeds are also commonly called RDF feeds, since in RSS 1.0 the acronym stands for RDF Site Summary (in RSS 2 it stands for Really Simple Syndication).
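As a rough illustration of how a reader might tell these formats apart, the following Python sketch looks only at the root element of the XML document. It is a minimal example written for this topic, not a description of how multiFEED works internally; the function name detect_feed_format and the returned labels are made up for illustration, and real feeds may need more forgiving handling.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by RDF-based RSS and by Atom 1.0.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ATOM_NS = "http://www.w3.org/2005/Atom"


def detect_feed_format(xml_text):
    """Guess the feed family from the root element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    if root.tag == "rss":
        # RSS 0.91, 0.92 and 2.0 share an <rss> root with a version attribute.
        return "RSS " + root.get("version", "unknown")
    if root.tag == "{%s}RDF" % RDF_NS:
        # RSS 1.0 wraps the whole feed in an <rdf:RDF> element.
        return "RDF-based RSS (such as RSS 1.0)"
    if root.tag == "{%s}feed" % ATOM_NS:
        return "Atom"
    return "unknown"
```

For example, passing the text of a typical RSS 2 feed should return "RSS 2.0", while a document whose root is an Atom feed element should return "Atom".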

The Atom format is a newer standard designed to solve many of the perceived deficiencies of RSS. Although with Atom the publishers were still given free rein to add anything they wished to the document contents, the core data found in nearly all feeds was assigned to clearly defined locations, and publisher-specific information was supposed to be constrained within XML namespaces so as not to confuse general-purpose feed readers. Certain important and ubiquitous feed data was decreed to be mandatory, and other slightly less popular information was deemed optional, but still clearly defined if present. Date formats in particular were tightened up, a very welcome improvement over RSS 2 dates, which referenced an Internet RFC (RFC 822) but didn't enforce it, resulting in a chaotic abundance of proprietary and conflicting dates in RSS feeds. RSS 1.0 feeds, which usually carry their dates in the Dublin Core dc:date element, use an ISO 8601 style date format very similar to Atom's.
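To make the difference between the two date styles concrete, here is a short Python sketch that parses a well-formed RSS 2 date (RFC 822 style) and a well-formed Atom date (RFC 3339 style) using only the standard library. The function names parse_rss_date and parse_atom_date are invented for this example, and the sketch assumes the publisher followed the respective format; the nonstandard dates described above would need extra handling.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime


def parse_rss_date(text):
    """Parse an RFC 822 style date, e.g. 'Sat, 13 Dec 2003 18:30:02 GMT'."""
    return parsedate_to_datetime(text.strip())


def parse_atom_date(text):
    """Parse an RFC 3339 style date, e.g. '2003-12-13T18:30:02Z'."""
    text = text.strip()
    # Older Python versions do not accept a trailing 'Z' in fromisoformat,
    # so rewrite it as an explicit UTC offset first.
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)
```

Both functions return timezone-aware datetime values for the sample inputs shown in the docstrings, so dates from either kind of feed can be compared directly once parsed.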

Although Atom is an undeniably better specification for news dissemination than RSS, the lead RSS had in the marketplace and its early adoption by big content publishers has slowed the popular usage of Atom. All the big players (CNN, BBC, Fox News, etc.) provide RSS 2 feeds, but most of the available Atom feeds are published by smaller, more “techie” industries and individuals. You often have to go specifically looking for Atom feeds, whereas RSS feeds are thrust at you on nearly every web site you visit.

Although they are all packaged in an XML document, and contain some similar entities, such as articles, the internal document structure of the various RSS and Atom formats is different, and different nomenclature is also used for the same concepts. For instance, each feed can have multiple articles, but for RSS we call this an “item” whereas the same thing in an Atom feed is an “entry”. Although you don't have to know anything about feed XML internals to use multiFEED effectively, some advanced features, such as feed timezone overrides, may require delving into the actual XML format of the feeds you are trying to tweak.
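The "item" versus "entry" difference can be seen in a small Python sketch that lists article titles from either an RSS 2 or an Atom document. This is an illustrative example only, assuming well-formed feeds with the usual layout; the helper name article_titles is hypothetical and is not part of multiFEED.

```python
import xml.etree.ElementTree as ET

# Atom elements live in this namespace; plain RSS 2 elements have none.
ATOM = "{http://www.w3.org/2005/Atom}"


def article_titles(xml_text):
    """Return the article titles from an RSS 2 or Atom document."""
    root = ET.fromstring(xml_text)
    if root.tag == "rss":
        # RSS 2: articles are <item> elements inside <channel>.
        articles = root.findall("./channel/item")
        return [a.findtext("title", default="") for a in articles]
    if root.tag == ATOM + "feed":
        # Atom: articles are <entry> elements directly under <feed>.
        articles = root.findall(ATOM + "entry")
        return [a.findtext(ATOM + "title", default="") for a in articles]
    raise ValueError("not an RSS 2 or Atom document")
```

The same article therefore has to be looked up under a different element name, and in Atom's case a different XML namespace, depending on the format of the feed.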