The image problem I encountered in Part 1 seemed like a weird thing to have happen, because I’d heard that Anthologize would allow me to gather content from other blogs and turn it into an (e)book. If remote images caused Anthologize to fail, this function would be of limited use.
I really enjoy Nicholas Nova’s blog “Pasta and Vinegar”, so I decided to use it as a test. I entered the RSS feed into Anthologize and it brought back a list of all of Nova’s posts. And I mean all of them. Checked. Ready for importing. I really didn’t want all of his posts, just a few. A desirable feature here would be a checkbox that flips the import status of all the posts at once, so that you could uncheck everything and then select just a few.
After clicking “Import” I was taken back to the main My Projects screen. I was a little befuddled about where all the imported posts had gone. I clicked on the last project I had opened and realized that the things I imported were now included in the list of posts (alongside my local blog posts). At first I’d hoped that the filters could also be applied to the imported posts, since this would make selecting from a large number of imported posts easier. Unfortunately, the tags and categories are based on the ones from my blog, not the ones imported via the Pasta and Vinegar RSS. I can’t say this is an Anthologize problem: it looks like the Feedburner feed from P&V doesn’t include any of that information. A quick peek shows that my WP feed does, so I will have to try this again with a different blog to see if Anthologize recognizes it.
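That quick peek at whether a feed carries category information can be scripted. The following is only a rough sketch, assuming a plain RSS 2.0 feed in which each `<item>` lists its categories as `<category>` elements. The function name `item_categories` and the sample feed are made up for illustration; they are not part of Anthologize or of any of the blogs mentioned here.

```python
import xml.etree.ElementTree as ET


def item_categories(rss_xml):
    """Map each RSS item title to the list of its <category> values.

    Hypothetical helper for illustration; assumes RSS 2.0 structure.
    """
    root = ET.fromstring(rss_xml)
    result = {}
    for item in root.iter("item"):
        title = item.findtext("title", default="(untitled)")
        # Collect the text of every non-empty <category> child of this item.
        result[title] = [c.text for c in item.findall("category") if c.text]
    return result


# Invented sample feed: one item with categories, one without.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item>
      <title>Post with categories</title>
      <category>maps</category>
      <category>design</category>
    </item>
    <item>
      <title>Post without categories</title>
    </item>
  </channel>
</rss>"""

if __name__ == "__main__":
    for title, cats in item_categories(SAMPLE_FEED).items():
        print(title, "->", cats if cats else "no categories in feed")
```

An item that comes back with an empty list is one whose feed entry simply has no category data, which would point to the feed rather than to Anthologize as the reason the filters have nothing to work with.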
I was able to add the imported posts to a new project and it exported without any troubles: no image errors like I’d seen on my blog. Although Nova’s posts do include embedded images from Flickr, they use a standard img src tag, not the Flickr wrapper. So maybe my earlier theory about why my images didn’t work isn’t correct, and the extra code for remote embedding may be causing the problem instead. This is causing me to rethink how I post Flickr photos to my blog and consider what some of the longer-term preservation impacts may be.
To test the import feature a little further, I went back and imported the feed from Sowing Culture, IMLS DCC’s blog. This feed does include categories, but they were not recognized by Anthologize. Even though the image references did point back to Sowing Culture, the PDF export went off without a hitch. On examining the PDF, I found that the pages were breaking in the wrong place (after the post title, which left the titles dangling on the preceding page). I went back and looked at the other PDFs and found they exhibit the same behavior. Finding page breaks in a straight run of HTML is a tough nut to crack, but since these are breaks between separate Anthologize library items, I would think it would be possible. I would only expect the posts to run together as one continuous text if I’d “Appended” one post to another.
Fortunately, I’m not stuck with just the default PDF output. Anthologize also allows you to export your project in ePub, TEI (with HTML) and RTF. In Part 3, I’ll take a closer look at each of these to see what’s going on under the hood.