Graphical Dynamics user community

Get tech support, suggest new features ... let's help each other get the most out of AutoIntern.

All times are UTC - 8 hours [ DST ]




 Post subject: Feed reader
PostPosted: October 2nd, 2007, 5:35 pm 
Site Admin
If you've surfed over to a blog or a news website within the last couple of years, no doubt you've noticed the little graphic buttons (usually labeled "RSS" or "XML") somewhere on the page.

These buttons indicate that you can download the latest headlines automatically from that site. If it's a blog, this means you can be notified of each new entry that the blogger posts. If it's a news site, it means you can be notified of each new article. This process is called "syndication", and the data format it uses is RSS, which stands for "Really Simple Syndication". (RSS is based on XML, which is why the button sometimes says "XML" instead of "RSS".) The blog or website is "feeding" you new content, and you scan your favorite sites for new content with a program called a "feedreader".

This sample event acts like a feedreader. You can schedule it to run every morning, and every morning you'll have a custom-created HTML page waiting for you with a list of all the latest articles from all the websites & blogsites you want to keep track of!

How it works

Like all our sample WIL scripts, the RSS Viewer (rssviewer.wil) is in the events\sample subdirectory under your AutoIntern directory. It's launched from the Sample.ain schedule file, which you can load via File|Open. You'll find another file, myfeeds.txt, in the directory. Open this up in Notepad, and you'll notice a few entries in it, which are addresses (URLs) of XML files on several servers. These point to some popular blogs & news sites to get you started.

When you come to a blog or news site you want to monitor, click on its "RSS" or "XML" button. You'll see a page full of raw XML entries, and if you look closely you'll see that they contain the information on the most recent posts or articles from that site. What you really want is the URL that shows up in your browser's Address field. Copy & paste this URL into your myfeeds.txt file, on a line of its own.

When the event occurs, it will go through myfeeds.txt line by line, download each URL it finds, and parse the XML files it receives using our WxDOM extender. It will build an HTML file with all the articles that were posted within the last 2 days.
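
To make the flow above concrete, here is a rough sketch in Python (not the actual WIL script, which uses the WxDOM extender). It reads a myfeeds.txt-style list, parses an RSS 2.0 document that has already been downloaded, and keeps only items newer than the cutoff. The function names, the constant N_WITHIN_HOURS, and the assumption that every pubDate carries a timezone are illustrative, not taken from rssviewer.wil.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

N_WITHIN_HOURS = 48  # stands in for the script's nWithinHours default


def read_feed_urls(text):
    """Return the non-blank lines of a myfeeds.txt-style file, one URL each."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def recent_items(rss_xml, now, within_hours=N_WITHIN_HOURS):
    """Return (date, title, link) tuples for RSS 2.0 items newer than the cutoff.

    Assumes each pubDate includes a timezone, so it compares cleanly with `now`.
    """
    cutoff = now - timedelta(hours=within_hours)
    found = []
    for item in ET.fromstring(rss_xml).iter("item"):
        pub = item.findtext("pubDate")
        if not pub:
            continue  # an undated item cannot be placed in a date-sorted list
        when = parsedate_to_datetime(pub)
        if when >= cutoff:
            found.append((when, item.findtext("title", ""), item.findtext("link", "")))
    return sorted(found, reverse=True)  # newest first
```

Downloading each URL (for example with urllib.request.urlopen) and writing the HTML page are left out so the sketch stays short; call recent_items with now set to datetime.now(timezone.utc).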

NOTE: If you want to track a blog from Blogspot/Blogger, there is no separate "RSS" or "XML" button. In this case you simply add the blog's homepage to myfeeds.txt, because that page actually contains a link to the real XML file in its <head> section. The script will recognize this and go download the real file.
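
For the curious, this kind of lookup can be sketched in Python as below. It relies on the common convention that a page advertises its feed with a <link rel="alternate"> tag whose type is application/rss+xml or application/atom+xml; whether rssviewer.wil checks exactly these attributes is an assumption, and the class and function names are made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect href values of <link rel="alternate"> tags that advertise a feed."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rel = (a.get("rel") or "").lower().split()
        kind = (a.get("type") or "").lower()
        if "alternate" in rel and kind in FEED_TYPES and a.get("href"):
            self.feeds.append(a["href"])


def discover_feed(page_html, page_url):
    """Return the absolute URL of the first advertised feed, or None."""
    finder = FeedLinkFinder()
    finder.feed(page_html)
    return urljoin(page_url, finder.feeds[0]) if finder.feeds else None
```

The urljoin call matters because the href in the page is often relative to the blog's homepage.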

Parameters

There are several parameters you can change. Open up the rssviewer.wil file, and you'll notice that the first lines of code past the initial comment block are where we set some global variables.
  • To track different lengths of time: Change nWithinHours from its default of 48 hours.
  • To strip HTML from the descriptions: Some feeds put HTML in their article summaries instead of plain text like most feeds do. I haven't been impressed by the result, so I strip the HTML tags out. If you'd rather see these summaries in HTML, change bStripHTML to @FALSE.
  • To cut off long descriptions: Some feeds put the whole, long article in the summary. That's just rude! This script cuts them off after 500 characters. To change this, change nMaxChars.
  • To group by source instead of date: The default is to sort them all by posting date/time regardless of where the articles came from. If you want to group by website as they are found in the myfeeds.txt file, change bSortByDate to @FALSE. (See Known bugs/limitations below)
  • To track more feeds: Change nMaxFeed from its default of 20.
  • To track more articles: Change nMaxItem from its default of 1000.
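
As an illustration of what the bStripHTML and nMaxChars settings do to a summary, here is a small Python sketch. It is not the WIL code: the regular expression is a simple tag stripper, and the trailing "..." after a cut is an assumption about how the cut is marked.

```python
import html
import re

B_STRIP_HTML = True  # stands in for bStripHTML
N_MAX_CHARS = 500    # stands in for nMaxChars


def clean_summary(text, strip_html=B_STRIP_HTML, max_chars=N_MAX_CHARS):
    """Optionally strip HTML tags, then cut the summary to max_chars characters."""
    if strip_html:
        text = re.sub(r"<[^>]+>", "", text)  # drop anything that looks like a tag
        text = html.unescape(text)           # turn &amp; and friends back into characters
        text = " ".join(text.split())        # collapse leftover whitespace
    if len(text) > max_chars:
        text = text[:max_chars].rstrip() + "..."  # assumed marker for a cut summary
    return text
```

Note that if you keep the HTML and still cut at a fixed length, the cut can land in the middle of a tag, which is one more reason stripping is a sensible default.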

Useful functions you can use:

  • ConvertSpecialChars - Some HTML special characters aren't rendered correctly by some browsers. This converts them to a form that all browsers can recognize.
  • ArraySortString - This takes a 2-dimensional array and sorts the rows, based on the values in a specified column.
  • TimeFromInternet - This converts a date & time from one of several standard Internet formats to a WIL datetime string.
  • TimeDisplay - This converts a WIL datetime to a more display-friendly format. (More flexible than WIL's TimeDate.)
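
To give a feel for what two of these helpers do, here is a rough Python equivalent of TimeFromInternet and ArraySortString. These are sketches of the idea, not ports of the WIL functions: the names, the choice to normalize everything to UTC, and the colon-separated output format (meant to resemble a WIL datetime string) are assumptions.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def time_from_internet(stamp):
    """Parse an RFC 822 (RSS) or ISO 8601 (Atom) timestamp into a UTC datetime."""
    stamp = stamp.strip()
    iso = stamp[:-1] + "+00:00" if stamp.endswith("Z") else stamp
    try:
        when = datetime.fromisoformat(iso)      # e.g. 2007-10-03T01:35:00Z
    except ValueError:
        when = parsedate_to_datetime(stamp)     # e.g. Tue, 02 Oct 2007 17:35:00 -0800
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)  # assumption: no zone means UTC
    return when.astimezone(timezone.utc)


def to_ymdhms(when):
    """Format as YYYY:MM:DD:HH:MM:SS (assumed to match the WIL datetime layout)."""
    return when.strftime("%Y:%m:%d:%H:%M:%S")


def sort_rows(rows, column):
    """Sort the rows of a 2-dimensional list by the string in one column."""
    return sorted(rows, key=lambda row: row[column])
```

Because the colon-separated format puts the year first and pads every field, sorting those strings alphabetically also sorts them by time, which is what makes a string sort usable for ordering articles by date.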

Known bugs/limitations:

For some reason, some feeds don't include a date/time in their articles. When the script is in sort-by-date mode, it doesn't know where to put those articles in the list, so it ignores them, and you simply won't see any articles from that source. You can work around this limitation by setting bSortByDate to @FALSE.

The script handles feeds in the popular RSS 0.91, RSS 1.0, RSS 2.0, RDF, Atom, and FeedBurner formats. It does not handle any of the more obscure formats out there.
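
If you're not sure which format a feed uses, the root element of the XML usually gives it away. Here is a minimal Python sketch of that check; it is an illustration only, not how rssviewer.wil decides, and the function name and return strings are made up for the example.

```python
import xml.etree.ElementTree as ET


def feed_format(xml_text):
    """Guess the feed family from the root element of the document."""
    root = ET.fromstring(xml_text)
    local = root.tag.rsplit("}", 1)[-1]  # drop any {namespace} prefix
    if local == "rss":
        return "RSS " + root.get("version", "?")  # e.g. 0.91 or 2.0
    if local == "RDF":
        return "RSS 1.0 / RDF"
    if local == "feed":
        return "Atom"
    return "unknown"
```

A FeedBurner feed is normally delivered as RSS or Atom, so it would show up here under one of those names.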

