HTML filtering / sanitizing in article bodies

An RSS/Atom newsreader with features comparable to commercial newsreaders.

Posted by lmorchard » Wed Jan 03, 2007 6:41 pm

I'm just starting to dig into the Vienna source, so I'm not entirely informed yet, but it looks like there's no filtering or sanitizing of HTML content from articles being done anywhere.

I've seen this in lots of other aggregators, and usually the first obvious issue is that the feed-supplied HTML occasionally breaks or uglifies "river of news" displays (like the Unified layout) with an unclosed tag or some such. But, eventually someone decides to make a point, and you end up with all your news looking like platypuses.

There's been some thought by various folks on how to consume RSS safely: Sam Ruby's Venus aggregator has lots of work toward this end, and I've got some ideas myself.

Vienna might be able to do something interesting with HTML Tidy, say, to repair broken markup, and possibly some XSLT to exclude all but "safe" content. If I get some more free time, I might try lashing something together, but I figured this was an issue worth raising if it hasn't been already.
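
To make the "exclude all but safe content" idea a bit more concrete, here is a rough sketch of an allowlist filter. It is written in Python with the standard library's html.parser, not with HTML Tidy or XSLT, and it is not taken from the Vienna source. The tag and attribute lists and the names Sanitizer and sanitize are made up for illustration; a real reader would need to tune them. It drops any tag or attribute that isn't on the list, throws away script and style content entirely, and closes whatever tags the feed left open so one article can't bleed into the next.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlists, chosen only for this sketch.
ALLOWED_TAGS = {"a", "b", "blockquote", "br", "code", "em", "i",
                "li", "ol", "p", "pre", "strong", "ul"}
VOID_TAGS = {"br"}                      # tags that never get a closing tag
ALLOWED_ATTRS = {"a": {"href"}}         # per-tag attribute allowlist
SAFE_SCHEMES = ("http://", "https://", "mailto:")
DROP_CONTENT = {"script", "style"}      # drop these tags and their contents


class Sanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []          # pieces of sanitized output
        self.open_tags = []    # stack of allowed tags still open
        self.skip_depth = 0    # > 0 while inside script/style

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tag: drop the tag, keep any text inside it
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, ()):
                continue
            # Only keep links with an explicitly allowed scheme.
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in self.open_tags:
            return  # stray or disallowed closing tag
        # Close anything left open inside this element, then the element.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))

    def result(self):
        # Close tags the feed never closed, innermost first.
        while self.open_tags:
            self.out.append(f"</{self.open_tags.pop()}>")
        return "".join(self.out)


def sanitize(html_fragment):
    parser = Sanitizer()
    parser.feed(html_fragment)
    parser.close()
    return parser.result()


print(sanitize("<p>Hi <b>there<script>alert(1)</script>"))
# prints: <p>Hi <b>there</b></p>
```

An allowlist seems safer than trying to enumerate bad markup, since anything unrecognized is dropped by default. The cost is that legitimate content such as images disappears until someone decides how to allow it safely.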
