
Guidelines for Magazine Website Publishers

Please note that we are currently NOT adding new publications to our index. You may still submit your publication, and we will keep track of it in case we resume adding publications in the future.

Following the guidelines below will make your magazine site more useful to your readers and easier for us and other search engines to index.
  1. Don't rename your web pages. When you change the names of your HTML files (i.e. URLs) you break the readers' bookmarks and you break the links to your site from search engines.
  2. Put dates on your articles. This helps readers determine if the article is still relevant, and it convinces them that you update your site regularly. Readers may come to the article directly through a search engine, so putting dates on the table of contents is not sufficient. We can also cite your article more precisely if you provide dates (please use 4-digit years).
  3. Be careful with frames. Frames allow you to keep one part of the display fixed while another part can be varied (by loading different text or scrolling independently). However, bad use of frames can make it hard for a reader to bookmark an article (the bookmark sends the reader to a page which is different from what they were reading when they created the bookmark). When moving from article to article on your site, does the URL listed in the "Address" or "Location" box on your browser stay the same? If so, you are using frames badly. Consider changing this or providing a "no frames" version of your articles. We do not index sites where we cannot get a unique URL for each article (using a different frameset for each article is okay).
  4. Make links work when the "current" issue changes. Be careful about how you set up the table of contents for your current issue. The article links should not change when that issue is no longer current, or you will break the readers' bookmarks to the articles (i.e. "/current/myarticle.html" should not suddenly change to "/19991201/myarticle.html" when the January issue comes out). For example:
    Bad: Articles in /19991201 with hyperlinks like ./myarticle.html and soft link /current to /19991201.
    Good: Articles in /19991201 with hyperlinks like ../19991201/myarticle.html and soft link /current to /19991201.
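
To see why the first layout breaks bookmarks, it can help to resolve both kinds of link the way a browser would. The sketch below uses Python's standard urljoin; the host www.example.com and the page name index.html are assumptions made only for illustration.

```python
from urllib.parse import urljoin

# Hypothetical address of the table of contents, reached through the /current soft link.
TOC_URL = "http://www.example.com/current/index.html"

# Bad: the resolved link stays under /current, which points at another issue next month.
bad = urljoin(TOC_URL, "./myarticle.html")

# Good: the resolved link names the dated directory, so it survives the change of issue.
good = urljoin(TOC_URL, "../19991201/myarticle.html")

print(bad)   # http://www.example.com/current/myarticle.html
print(good)  # http://www.example.com/19991201/myarticle.html
```

A reader who bookmarks the second URL still reaches the same article after /current is pointed at the January issue.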
  5. Provide a single complete page. Some people advocate slicing articles into small pieces with lots of links so that readers can jump easily to only the parts they want. While this may be good for general browsing, there are some drawbacks:
    • The user must hunt for the "next page" link and wait for the selected page to download, which interrupts their reading (pushing the "page down" key while reading a single long page is easier).
    • It is hard for the user to print the article.
    • It can be hard for us to find all of the pieces if the "next page" links are not of a simple, consistent format. This can cause our search engine to miss parts of your article.
    If you want to cut the article into many pieces, consider also posting a "single page" version.
  6. Use descriptive article titles. Readers can browse your table of contents more efficiently if you use article titles that are simple and clear rather than cute. This will also improve the usefulness of your citation in our index.
  7. Don't replace body text with images. Our search engine, like many others, often can't detect when you have replaced the first letter or word of a paragraph with a graphic representation of that letter/word. Also, you may cause problems for software that generates audible speech from your pages (e.g. used by the blind).
  8. Don't merge disparate information. For example, if you have a set of product reviews, put each review in its own file and give it a separate link in your table of contents. If you merge them into a single "Reviews" page you make it harder for the readers to find what they are looking for, and many search engines will give that page a poor ranking because only a small percentage of the page is devoted to what the person is looking for.
  9. Tell us if you change your site! If you change the way you name your HTML files, or the way you structure your documents, please let us know. Keep in mind that renaming your HTML files will break your readers' bookmarks.
  10. We don't index abstracts or partial articles. We only index full articles, and we appreciate it if you clearly distinguish full articles from abstracts by putting the word "abstract" in the appropriate URLs.
  11. We don't index newspapers.
  12. We currently only index articles that are in English.
  13. We currently cannot index PDF files.
  14. Keep usability in mind. We recommend the book Designing Web Usability: The Practice of Simplicity by Jakob Nielsen (disclaimer: we get a commission if you purchase through this link).
  15. Use consistent formatting. Keeping the author, publication date, etc. in the same place and format on all articles makes it much easier for us to avoid errors in the article citations. Including a proper <TITLE> and meta description tag as shown below can also improve your ranking and the way your page is cited in various search engines. If you really want us to love you, use a clearly delineated format like this:
    <TITLE>article title</TITLE>
    <META NAME="description" CONTENT="article (not magazine) description">
    <META NAME="author" CONTENT="article's author">
    <META NAME="issue_date" CONTENT="publication date with 4-digit year">
    Advertisements, navigation buttons, etc.
    <!-- article body begin -->
    Actual text of the article.  No author biographies, links
    to other articles, advertisements, etc. here - don't pollute
    the search engine.
    <!-- article body end -->
    Copyright notices, author biographies, navigation links, etc.
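
As a rough illustration of why the delineated format helps an indexer, the following Python sketch pulls the title, the meta fields, and the marked article body out of a page. It is not our actual indexing code; the class name ArticleExtractor and the sample page are hypothetical, and only the tag names and comment markers come from the format above.

```python
from html.parser import HTMLParser


class ArticleExtractor(HTMLParser):
    """Collects the title, meta fields and marked body text of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.body_parts = []
        self._in_title = False
        self._in_body = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            fields = dict(attrs)
            name = fields.get("name")
            if name:
                self.meta[name.lower()] = fields.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_comment(self, data):
        # The begin/end comments switch body collection on and off.
        marker = data.strip().lower()
        if marker == "article body begin":
            self._in_body = True
        elif marker == "article body end":
            self._in_body = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_body:
            self.body_parts.append(data)

    @property
    def body(self):
        # Join the collected text and collapse runs of whitespace.
        return " ".join("".join(self.body_parts).split())


# Hypothetical sample page following the recommended layout.
SAMPLE = """<TITLE>Testing Widgets</TITLE>
<META NAME="description" CONTENT="How to test widgets">
<META NAME="author" CONTENT="Jane Doe">
<META NAME="issue_date" CONTENT="December 1, 1999">
<p>Advertisements, navigation buttons</p>
<!-- article body begin -->
<p>Actual text of the article.</p>
<!-- article body end -->
<p>Copyright notice</p>
"""

extractor = ArticleExtractor()
extractor.feed(SAMPLE)
extractor.close()
print(extractor.title)           # Testing Widgets
print(extractor.meta["author"])  # Jane Doe
print(extractor.body)            # Actual text of the article.
```

Because the markers are fixed strings, the navigation and copyright text outside them never reaches the extracted body.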
  16. Keep old "table of contents" pages online and easily accessible. While many users may hunt for old articles by using your site's search engine, others may just want to browse articles, so you shouldn't make your search engine the only way to access old articles. Also, spiders from external Internet search engines will not be able to find pages that are only accessible through your site's search engine.
  17. Provide a robots.txt. The robots.txt file specifies which parts of your site search engine spiders are allowed to visit. For details on the file format, see the Standard for Robot Exclusion. If you want to leave everything open to the spiders, you should create an empty robots.txt file. While omitting the file completely is, in theory, equivalent to providing an empty file, this does not work well on some servers. In particular, if you specify a default page in Microsoft IIS that should be returned when a requested page does not exist, apparently it will redirect the spider to the default page without indicating that there was ever any problem (i.e. the spider gets an "HTTP/1.1 200 OK"). This can be very confusing to the spider, which ends up with a page that is not in the expected robots.txt format.
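
The behaviour described in the last guideline can be sketched with Python's standard urllib.robotparser module. The helper names allowed and looks_like_html, the example URL, and the HTML check itself are assumptions made for illustration; real spiders may handle a misdelivered robots.txt differently.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def looks_like_html(body: str) -> bool:
    """Crude, hypothetical check for a web page served in place of robots.txt."""
    head = body.lstrip()[:200].lower()
    return head.startswith("<!doctype") or head.startswith("<html")


ARTICLE = "http://www.example.com/19991201/myarticle.html"

# An empty robots.txt leaves the whole site open to spiders.
print(allowed("", ARTICLE))  # True

# A Disallow rule closes the matching part of the site.
print(allowed("User-agent: *\nDisallow: /19991201/\n", ARTICLE))  # False

# A default page returned with "200 OK" is not in robots.txt format at all.
print(looks_like_html("<html><body>Welcome!</body></html>"))  # True
```

Serving a real, empty robots.txt avoids the third case, because the spider then receives a file in the format it expects.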