Software Testing

Monday, April 11, 2016

How to download PDF files automatically in Firefox using Selenium WebDriver

 

This post explains how to download PDF files automatically in Firefox using Selenium WebDriver.

I believe that learning through examples is the right and easiest way, so let's look at a complete example of how to save PDF files automatically in Firefox using Selenium WebDriver.

    # Environment Tested:
    # Windows 7, Ruby 2.0.0p451, Selenium 2.41.0, Firefox 29.0.1
    require 'selenium-webdriver'
    profile = Selenium::WebDriver::Firefox::Profile.new
    profile["browser.download.folderList"] = 2
    profile["browser.download.dir"] = 'C:\\'
    profile["browser.helperApps.neverAsk.saveToDisk"] = 'application/pdf'
    # disable Firefox's built-in PDF viewer
    profile["pdfjs.disabled"] = true
    # disable Adobe Acrobat PDF preview plugin
    profile["plugin.scan.plid.all"] = false
    profile["plugin.scan.Acrobat"] = "99.0"
    driver = Selenium::WebDriver.for :firefox, :profile => profile
    driver.get('http://static.mozilla.com/moco/en-US/pdf/mozilla_privacypolicy.pdf')
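
The script above returns as soon as the download has been triggered, so a test that needs the file afterwards has to wait for it. The following is a minimal sketch of one way to do that by polling the download folder. The method name wait_for_download and its parameters are illustrative, not part of Selenium, and the sketch assumes Firefox keeps a temporary "<name>.part" file next to the target while the download is still in progress.

```ruby
# Hypothetical helper (not part of selenium-webdriver).
# Polls `dir` until `filename` exists and the temporary "<filename>.part"
# file is gone, or until `timeout` seconds have passed.
# Returns true if the file looks complete, false on timeout.
def wait_for_download(dir, filename, timeout = 30, interval = 0.5)
  target = File.join(dir, filename)
  deadline = Time.now + timeout
  until File.exist?(target) && !File.exist?("#{target}.part")
    return false if Time.now > deadline
    sleep interval
  end
  true
end
```

With the example above, it could be called as wait_for_download('C:\\', 'mozilla_privacypolicy.pdf') right after driver.get.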

A walk-through

Prevent Firefox from popping up "Save file" dialog

As you may know, Selenium itself doesn't interact with system-level dialogs. To download PDFs as part of browser automation, you need either an additional third-party framework to handle the dialog or, better, a browser configuration that downloads files automatically without showing it.


 
Firefox's download manager preferences are controlled by properties defined in the about:config page, which can be set programmatically when instantiating FirefoxDriver with Selenium WebDriver.
  • browser.download.folderList controls the default folder to download a file to. 0 indicates the Desktop; 1 indicates the system's default downloads location; 2 indicates a custom folder.
  • browser.download.dir holds the custom destination folder for downloads. It takes effect only if browser.download.folderList has been set to 2.
  • browser.helperApps.neverAsk.saveToDisk stores a comma-separated list of MIME types to save to disk without asking which application to open the file with.
    profile = Selenium::WebDriver::Firefox::Profile.new
    profile["browser.download.folderList"] = 2 # use the custom folder defined in "browser.download.dir" below
    profile["browser.download.dir"] = 'C:\\'
    profile["browser.helperApps.neverAsk.saveToDisk"] = 'application/pdf'
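
When more than one file type should be saved silently, the three preferences can be assembled by a small helper. This is only a sketch: the method name download_prefs and its parameters are illustrative and not part of Selenium. It builds a plain Hash and joins the MIME types with commas, as browser.helperApps.neverAsk.saveToDisk expects.

```ruby
# Hypothetical helper (not part of selenium-webdriver): builds the
# download-related about:config preferences as a plain Hash.
def download_prefs(dir, mime_types = ['application/pdf'])
  {
    'browser.download.folderList' => 2, # 2 = use the custom folder below
    'browser.download.dir' => dir,
    # comma-separated list of MIME types to save without asking
    'browser.helperApps.neverAsk.saveToDisk' => mime_types.join(',')
  }
end

prefs = download_prefs('C:\\', ['application/pdf', 'application/octet-stream'])
prefs.each { |key, value| puts "#{key} = #{value}" }
```

Each pair can then be copied onto the profile with prefs.each { |key, value| profile[key] = value } before the driver is started.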

Note that the MIME type defined here is application/pdf, which most PDF files use. However, if the target PDF is served with a non-standard MIME type, the "Save file" dialog might still show up. To fix this, add the actual MIME type to the browser.helperApps.neverAsk.saveToDisk property. The actual MIME type can be found using either of the following approaches:
  • Upload the file to an online tool such as What MIME?
  • Download the file and inspect the MIME type in Chrome's developer tools or in a web debugging proxy such as Fiddler or Charles.
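
The MIME type can also be read from a script. The sketch below uses Ruby's standard Net::HTTP library to send a HEAD request and read the Content-Type response header. The method names are illustrative, and it assumes the server answers HEAD requests, does not redirect, and reports the same type it would send for a normal download.

```ruby
require 'net/http'
require 'uri'

# Strips parameters such as "; charset=binary" from a Content-Type value.
def mime_type_from_header(content_type)
  content_type.to_s.split(';').first.to_s.strip.downcase
end

# Sends a HEAD request and returns the MIME type reported by the server.
# Assumption: the server supports HEAD and does not redirect.
def remote_mime_type(url)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port, :use_ssl => uri.scheme == 'https') do |http|
    http.head(uri.request_uri)
  end
  mime_type_from_header(response['content-type'])
end

# Example call (needs network access):
# puts remote_mime_type('http://static.mozilla.com/moco/en-US/pdf/mozilla_privacypolicy.pdf')
```

If the value returned is something other than application/pdf, that value is the one to add to browser.helperApps.neverAsk.saveToDisk.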

Prevent Firefox from previewing PDFs

For built-in PDF.js viewer

Since the release of Firefox 19.0, PDF.js has been integrated into Firefox to display PDF files inside the browser. It parses and renders PDFs as HTML5, which in theory can be automated using Selenium WebDriver. However, to download PDFs instead of previewing them in Firefox, another about:config entry needs to be changed to disable PDF.js.
    profile["pdfjs.disabled"] = true

For third party PDF viewers

Besides Firefox's built-in PDF viewer, third-party plugins may also prevent Firefox from downloading PDFs automatically. If a machine has Adobe Reader installed, the default PDF viewing setting in Firefox might have been switched to Adobe Acrobat without notice.
To avoid previewing PDFs with those plugins, two more about:config entries need to be set when starting the WebDriver instance.
  • plugin.scan.plid.all needs to be false, so that Firefox won't scan and load plugins.
  • plugin.scan.Acrobat holds the minimum version of Adobe Acrobat that Firefox is allowed to launch. Setting it to a number larger than the currently installed Adobe Acrobat version should do the trick.
    profile["plugin.scan.plid.all"] = false
    profile["plugin.scan.Acrobat"] = "99.0"
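
The viewer-related settings from this section can also be kept in one place and applied in a loop. The sketch below is one possible arrangement, not the only one: the names VIEWER_PREFS and apply_prefs are illustrative, and a plain Hash stands in for the Firefox profile so that the snippet runs without a browser.

```ruby
# Illustrative grouping of the viewer-related preferences from this section.
VIEWER_PREFS = {
  'pdfjs.disabled' => true,        # turn off the built-in PDF.js viewer
  'plugin.scan.plid.all' => false, # do not scan for and load plugins
  'plugin.scan.Acrobat' => '99.0'  # higher than any installed Acrobat version
}.freeze

# Copies each preference onto `profile`, which may be any object that
# supports profile[key] = value (a Firefox profile or, as here, a Hash).
def apply_prefs(profile, prefs)
  prefs.each { |key, value| profile[key] = value }
  profile
end

stand_in = apply_prefs({}, VIEWER_PREFS) # Hash used so the sketch runs without Firefox
puts stand_in.inspect
```

In a real script, the first argument would be the Selenium::WebDriver::Firefox::Profile instance created earlier.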
