This post explains how to download PDF files automatically in Firefox using Selenium WebDriver.
# Environment Tested:
# Windows 7, Ruby 2.0.0p451, Selenium 2.41.0, Firefox 29.0.1
require 'selenium-webdriver'
profile = Selenium::WebDriver::Firefox::Profile.new
profile["browser.download.folderList"] = 2
profile["browser.download.dir"] = 'C:\\'
profile["browser.helperApps.neverAsk.saveToDisk"] = 'application/pdf'
# disable Firefox's built-in PDF viewer
profile["pdfjs.disabled"] = true
# disable Adobe Acrobat PDF preview plugin
profile["plugin.scan.plid.all"] = false
profile["plugin.scan.Acrobat"] = "99.0"
driver = Selenium::WebDriver.for :firefox, :profile => profile
driver.get('http://static.mozilla.com/moco/en-US/pdf/mozilla_privacypolicy.pdf')
A walk-through
Prevent Firefox from popping up the "Save file" dialog
Selenium itself cannot interact with system-level dialogs. To download PDFs as part of browser automation, you therefore need either an additional third-party framework to handle the dialog or an approach that makes the download happen automatically.
Firefox's download manager preferences are controlled by properties defined on the about:config page, which can be set programmatically when instantiating FirefoxDriver using Selenium WebDriver.
- browser.download.folderList controls the default folder to download a file to. 0 indicates the Desktop; 1 indicates the system's default downloads location; 2 indicates a custom folder.
- browser.download.dir holds the custom destination folder for downloads. It is activated if browser.download.folderList has been set to 2.
- browser.helperApps.neverAsk.saveToDisk stores a comma-separated list of MIME types to save to disk without asking what to use to open the file.
profile = Selenium::WebDriver::Firefox::Profile.new
profile["browser.download.folderList"] = 2 # use the custom folder defined in "browser.download.dir" below
profile["browser.download.dir"] = 'C:\\'
profile["browser.helperApps.neverAsk.saveToDisk"] = 'application/pdf'
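When more than one MIME type has to be listed, the three download preferences can be generated from a folder and a list of types. The following is only a minimal sketch: `download_prefs` is a hypothetical helper name (not part of selenium-webdriver), and `application/x-pdf` is just an example of a non-standard type a server might send.

```ruby
# Hypothetical helper (not part of selenium-webdriver): builds the three
# download-related preferences as a plain Hash.
def download_prefs(dir, mime_types)
  {
    'browser.download.folderList' => 2, # 2 = use the custom folder below
    'browser.download.dir' => dir,
    # Firefox expects one comma-separated string of MIME types
    'browser.helperApps.neverAsk.saveToDisk' => mime_types.uniq.join(',')
  }
end

prefs = download_prefs('C:\\', ['application/pdf', 'application/x-pdf'])

# Copying the values onto a Firefox profile would then look like:
#   prefs.each { |key, value| profile[key] = value }
```

Keeping the values in a Hash first makes them easy to inspect or log before the browser is started.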
Note that the MIME type defined here is application/pdf, which is the type most PDF files use. However, if the target PDF file is served with a non-standard MIME type, the "Save file" dialog might still show up. To fix this, add the actual MIME type to the browser.helperApps.neverAsk.saveToDisk property. The actual type can be found using either of the following approaches:
- Upload the file to an online tool like What MIME?
- Download the file and monitor the MIME type in Chrome's developer tools or a web debugging proxy like Fiddler, Charles, etc.
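As a possible alternative to the tools listed above, the Content-Type header can also be read from Ruby with the standard library. This is a rough sketch rather than a tested recipe: `mime_type_from` and `remote_mime_type` are hypothetical helper names, and the code assumes the server answers HEAD requests and does not redirect.

```ruby
require 'net/http'
require 'uri'

# Strip parameters such as "; charset=binary" from a Content-Type value.
def mime_type_from(content_type)
  content_type.to_s.split(';').first.to_s.strip.downcase
end

# Send a HEAD request and return the MIME type reported by the server.
# Assumption: the server supports HEAD and responds without a redirect.
def remote_mime_type(url)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.head(uri.request_uri)
  end
  mime_type_from(response['content-type'])
end

# Example call (requires network access):
#   remote_mime_type('http://static.mozilla.com/moco/en-US/pdf/mozilla_privacypolicy.pdf')
```

Whatever value comes back is what would need to appear in browser.helperApps.neverAsk.saveToDisk.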
Prevent Firefox from previewing PDFs
For built-in PDF.js viewer
Since the release of Firefox 19.0, PDF.js has been integrated into Firefox to display PDF files inside the browser. It parses and renders PDFs into HTML5, which in theory can be automated using Selenium WebDriver. However, to download PDFs instead of previewing them in Firefox, another about:config entry needs to be changed to disable PDF.js.
profile["pdfjs.disabled"] = true
For third party PDF viewers
Besides Firefox's built-in PDF viewer, third-party plugins might also prevent Firefox from downloading PDFs automatically. If a machine has Adobe Reader installed, the default PDF viewing setting in Firefox might have been set to Adobe Acrobat without notice.
To avoid previewing PDFs with those plugins, two more about:config entries need to be configured when starting the WebDriver instance.
- plugin.scan.plid.all needs to be false, so that Firefox won't scan and load plugins.
- plugin.scan.Acrobat holds the minimum version of Adobe Acrobat that is allowed to launch. Setting it to a number larger than the currently installed Adobe Acrobat version should do the trick.
profile["plugin.scan.plid.all"] = false
profile["plugin.scan.Acrobat"] = "99.0"
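The viewer-related settings from this section can be grouped into one method so they are applied together. This is a sketch under assumptions: `disable_pdf_viewers` is a hypothetical name, and a plain Hash stands in for the Firefox profile here, since both accept `profile[key] = value`.

```ruby
# Hypothetical helper: applies the viewer-related preferences to any
# object that supports profile[key] = value.
def disable_pdf_viewers(profile, acrobat_min_version = '99.0')
  profile['pdfjs.disabled'] = true          # turn off the built-in PDF.js viewer
  profile['plugin.scan.plid.all'] = false   # do not scan for and load plugins
  # minimum Acrobat version allowed to launch; set above the installed one
  profile['plugin.scan.Acrobat'] = acrobat_min_version
  profile
end

# A plain Hash stands in for Selenium::WebDriver::Firefox::Profile here.
settings = disable_pdf_viewers({})
```

With a real profile the call would be `disable_pdf_viewers(profile)` before the profile is passed to `Selenium::WebDriver.for`.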