From a search engine optimization (SEO) perspective, HTML is the better choice whenever possible. Crawler-based search engines index HTML web pages easily. As a result, an HTML page has a natural ranking advantage over a PDF.
Does Google index PDFs?
Yes – Google can crawl and index a number of non-HTML file formats, including PDFs. A PDF will appear in a search engine results page (SERP) when it is relevant to the user’s query. The web searcher can choose “View as HTML” to read the contents of the PDF without having the related application installed. This also lets viewers avoid viruses that are sometimes carried in this file format.
Since posting a PDF on your website is sometimes unavoidable (e.g. white papers), make sure that you optimize the document for search.
1. Use text format
Search engines read text. When creating your document in Word, for example, use text only where possible, and then convert it to PDF. Make sure that your PDF converter produces a text-based PDF rather than an image capture of the Word file.
2. Use keywords / phrases in titles
Google chooses the snippets it displays on SERPs by relevance, drawing on the information that you provide. In other words, if you want your PDF to be found for a keyword phrase (e.g. your company name), make sure that the document's file name includes that phrase. Don’t stop there – also add the phrase to the title field in the PDF document settings.
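As a rough illustration of the file-name advice, here is a minimal Python sketch that turns a keyword phrase into a file name. The function name `keyword_file_name` and the sample phrase are hypothetical, and separating words with hyphens is an assumption about a sensible naming convention rather than a rule stated above.

```python
import re


def keyword_file_name(phrase: str, extension: str = "pdf") -> str:
    """Turn a keyword phrase into a lowercase, hyphen-separated file name.

    Hypothetical helper for illustration; the naming convention is an assumption.
    """
    # Collapse every run of non-alphanumeric characters into a single hyphen,
    # then trim any hyphen left at either end.
    slug = re.sub(r"[^a-z0-9]+", "-", phrase.lower()).strip("-")
    return f"{slug}.{extension}"


print(keyword_file_name("Acme Widgets White Paper"))  # acme-widgets-white-paper.pdf
```

The same phrase can then be reused in the PDF title field so that the file name and the title agree.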
3. Use “white hat” strategies
If you have both HTML and PDF versions of the same information on your website, ensure that you disallow the PDF version in your robots.txt file to avoid duplicate content issues.
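To sanity-check a rule like this before publishing it, you could run it through Python's standard-library robots.txt parser. The sketch below assumes the PDF duplicates live in a hypothetical /pdf/ directory on a placeholder domain; the parser used here matches rules by path prefix, so the rule is written as a directory rather than as a file-extension pattern.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: /pdf/ is an assumed location for the PDF duplicates.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdf/
"""


def is_crawlable(url: str, user_agent: str = "*") -> bool:
    """Return True if the rules above let the given user agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# The PDF copy is blocked, while the HTML version stays crawlable.
print(is_crawlable("https://www.example.com/pdf/white-paper.pdf"))  # False
print(is_crawlable("https://www.example.com/white-paper.html"))     # True
```

Keeping the PDFs in one directory means a single Disallow line covers every duplicate, which is easier to maintain than listing files one by one.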
Are your PDFs search engine friendly?