Ubuntu - Repair corrupted or broken PDF

If you use Linux day to day, you have probably come across web sites whose PDF generation software is broken: the files they produce can't be displayed properly by open-source viewers like Evince.

The latest site where I faced this problem was EasyJet. I needed to print my e-ticket, but all the important data was totally unreadable. Here is what Evince displayed:

[Screenshot: the broken PDF as displayed by Evince]

While googling for a reader able to handle these broken PDF files, I realised that the problem is quite common and that tools like gs (Ghostscript) or mutool (MuPDF) may be able to repair them.

This article explains how to set up your Linux desktop to repair corrupted PDF files (like EasyJet e-tickets). It also explains how to integrate the repair tool as a custom action in your favorite file manager (Nautilus or PCManFM), available with a simple right click on the PDF file.

It has been tested on Ubuntu 16.04 LTS and Ubuntu Gnome 16.04 LTS, but it should be applicable to any distribution using Nautilus or PCManFM.

With this setup, you'll be able to repair your corrupted PDF files so that they display properly in Evince.

[Screenshot: the repaired PDF displayed properly in Evince]

1. Main Principles

A PDF repair tool should be usable in 2 ways:

  • as a classic standalone application, where you select the file through a dialog box
  • from a file manager custom action, with a right click on the PDF file

Evince, ePDFView, Xpdf and KPDF share the same PDF rendering engine. So the main idea behind a PDF repair tool is to find a robust PDF converter that uses a different rendering engine.

This is where gs and mutool come in:

  • gs uses the Ghostscript rendering engine, which is well known for handling files that give other renderers trouble.
  • mutool includes a PDF structure repair function.
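
Both approaches can be tried by hand on a single file before installing anything. The sketch below wraps the same gs and mutool invocations that the main script in section 3 uses. The function names and the `looks_like_pdf` sanity check are illustrative additions and are not part of the original script.

```shell
#!/bin/bash
# Rewrite a PDF through Ghostscript's pdfwrite device
# (same options as in the main script).
repair_with_gs() {
    gs -q -dNOPAUSE -dBATCH -dSAFER -sDEVICE=pdfwrite -sOutputFile="$2" "$1"
}

# Rebuild the PDF structure with MuPDF's "clean" command.
repair_with_mutool() {
    mutool clean "$1" "$2"
}

# Hypothetical sanity check: the file is non-empty and starts
# with the "%PDF-" signature.
looks_like_pdf() {
    [ -s "$1" ] && [ "$(head -c 5 "$1")" = "%PDF-" ]
}
```

For example, with a hypothetical ticket.pdf: `repair_with_gs ticket.pdf ticket-gs.pdf && looks_like_pdf ticket-gs.pdf && echo "Ghostscript output looks sane"`. The check only proves that something PDF-shaped was written, so the result still needs to be opened in a viewer.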

As the repair is not guaranteed to succeed, the tool in charge of it should never replace the original file. It should keep the original and generate a new, repaired file in the same folder.

As we have 2 different tools available, the repair script will generate 2 different files:

  • myfile (GhostScript repaired).pdf
  • myfile (MuPDF repaired).pdf
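
The naming rule can be sketched with plain parameter expansion. This is a variant of what the main script does with sed, not a copy of it: `repaired_name` is a hypothetical helper, and it falls back to a pdf extension when the file name has none.

```shell
#!/bin/bash
# Build "<base> (<label> repaired).<ext>" from a local path or a URI.
repaired_name() {
    local doc="$1" label="$2"
    local name="${doc##*/}"      # last path component
    local base ext
    case "$name" in
        *.*) base="${doc%.*}"; ext="${doc##*.}" ;;
        *)   base="$doc";      ext="pdf" ;;   # no extension: assume pdf
    esac
    printf '%s (%s repaired).%s\n' "$base" "$label" "$ext"
}
```

Calling `repaired_name "/home/me/ticket.pdf" GhostScript` gives a path in the same folder as the original, which is what keeps the original file untouched.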

As most modern file managers can handle files directly on a remote share (through ftp, smb, ssh, ...), the repair script should handle these remote files transparently, whether it is given a URI or a local path.
To do so, we will use gvfs tools to pull the file locally and to push it back to the remote share once it has been repaired.
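
The pull, repair and push sequence can be sketched as a small helper. The names `repair_via_local_copy` and `COPY_CMD` are assumptions made for this example. gvfs-copy is the tool used in this article, and on more recent releases the equivalent command is `gio copy`.

```shell
#!/bin/bash
# Copy command used for both directions (assumption: gvfs-copy by default;
# set COPY_CMD="gio copy" where gvfs-copy is no longer shipped).
COPY_CMD="${COPY_CMD:-gvfs-copy}"

# Usage: repair_via_local_copy SOURCE TARGET COMMAND [ARGS...]
# COMMAND is run as: COMMAND [ARGS...] <local input> <local output>
repair_via_local_copy() {
    local src="$1" target="$2"
    shift 2
    local workdir status=1
    workdir=$(mktemp -d) || return 1

    # pull the file, process it locally, push the result back
    if $COPY_CMD "$src" "$workdir/original.pdf" \
        && "$@" "$workdir/original.pdf" "$workdir/repaired.pdf" \
        && $COPY_CMD "$workdir/repaired.pdf" "$target"; then
        status=0
    fi

    rm -rf "$workdir"
    return "$status"
}
```

A call could look like `repair_via_local_copy "smb://server/share/ticket.pdf" "smb://server/share/ticket (MuPDF repaired).pdf" mutool clean`, where the share and file names are placeholders. Working in a private temporary directory means the two local file names never exist before they are written.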

Finally, as the repair job may take some time on big PDF files, a notification should inform you that it is over and that your newly repaired file is available.
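
A minimal sketch of such a notification is shown here. `display_name` is a hypothetical helper that decodes %XX escapes with sed and bash's printf. The main script relies on `urlencode -d` for this step, so treat this as an alternative rather than as the original method.

```shell
#!/bin/bash
# Turn a URI or path into a readable file name: keep the last
# component and decode its %XX escapes.
display_name() {
    local name="${1##*/}"
    # rewrite each %XX escape as \xXX, then let printf expand it
    printf '%b\n' "$(printf '%s' "$name" | sed 's/%\([0-9A-Fa-f]\{2\}\)/\\x\1/g')"
}

# Send the desktop notification, silently skipping it when
# notify-send is not installed.
notify_repaired() {
    command -v notify-send >/dev/null 2>&1 || return 0
    notify-send -i pdf-repair "$(display_name "$1") repaired"
}
```

With this helper, `notify_repaired "smb://server/share/my%20ticket.pdf"` would announce "my ticket.pdf repaired" (the URI is only an example).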

If you are not interested in step-by-step explanations and just want to install the PDF repair script, you can jump to Complete Installation Procedure.

2. Needed packages

The first step is to install all the tools used by the PDF repair script:

  • gvfs-copy to copy remote files
  • notify-send to display desktop notifications
  • gs and mutool to handle the repair work
  • urlencode to decode the URI filename for display in the notification

Some of these tools should be installed by default.
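
One way to find out what is still missing is sketched below. The tool-to-package mapping is the one used by the checks in the main script, with zenity added because the script uses it for its dialogs. `missing_packages` and `PDF_REPAIR_TOOLS` are names invented for this example.

```shell
#!/bin/bash
# Print the package of every "tool:package" pair whose tool
# is not found on the PATH.
missing_packages() {
    local pair
    for pair in "$@"; do
        command -v "${pair%%:*}" >/dev/null 2>&1 || printf '%s\n' "${pair#*:}"
    done
}

# Tools needed by the repair script and the Ubuntu packages providing them.
PDF_REPAIR_TOOLS="gvfs-copy:gvfs-bin notify-send:libnotify-bin gs:ghostscript mutool:mupdf-tools urlencode:gridsite-clients zenity:zenity"
```

The missing packages can then be installed in one go with `sudo apt install $(missing_packages $PDF_REPAIR_TOOLS)`. Both expansions are left unquoted on purpose so that each entry becomes a separate word.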

3. Main Script

It's now time to install the main script in charge of the repair job and to declare it as a desktop application.

/usr/local/bin/pdf-repair
#!/bin/bash
# ---------------------------------------------------
# Repair broken PDF file using gs and mutool
#
# Depends on :
#   * gvfs-bin
#   * libnotify-bin
#   * zenity
#   * ghostscript
#   * mupdf-tools
#   * gridsite-clients
#
# Parameter :
#   $1 - URI of original PDF
#
# Revision history :
#   08/11/2014, V1.0 - Creation by N. Bernaerts
#   20/11/2014, V1.1 - Add file selection dialog box
#   24/01/2015, V1.2 - Check tools availability
#   24/11/2017, V2.0 - Add MuTool repair method (thanks to Willie Wildgrube's idea)
# ---------------------------------------------------

# check tools availability
command -v gvfs-copy >/dev/null 2>&1 || { zenity --error --text="Please install gvfs-copy [gvfs-bin]"; exit 1; }
command -v notify-send >/dev/null 2>&1 || { zenity --error --text="Please install notify-send [libnotify-bin]"; exit 1; }
command -v gs >/dev/null 2>&1 || { zenity --error --text="Please install gs [ghostscript]"; exit 1; }
command -v mutool >/dev/null 2>&1 || { zenity --error --text="Please install mutool [mupdf-tools]"; exit 1; }
command -v urlencode >/dev/null 2>&1 || { zenity --error --text="Please install urlencode [gridsite-clients]"; exit 1; }

# check if parameter is given, otherwise open dialog box selection
[ "$1" != "" ] && DOC_URI="$1" || DOC_URI=$(zenity --file-selection --title="Select PDF file to repair")

# if no file selected, exit
[ "$DOC_URI" = "" ] && exit 1

# extract document name and extension
DOC_BASE=$(echo "${DOC_URI}" | sed 's/^\(.*\)\.[a-zA-Z0-9]*$/\1/')
DOC_EXT=$(echo "${DOC_URI}" | sed 's/^.*\.\([a-zA-Z0-9]*\)$/\1/')

# set PDF extension for files without extension
[ "${DOC_BASE}" = "${DOC_EXT}" ] && DOC_EXT="pdf"

# generate temporary local filename
TMP_ORIGINAL=$(mktemp -t XXXXXXXX.pdf) && rm "${TMP_ORIGINAL}"
TMP_REPAIRED=$(mktemp -t XXXXXXXX.pdf) && rm "${TMP_REPAIRED}"

# copy input file to temporary local file
gvfs-copy "${DOC_URI}" "${TMP_ORIGINAL}"

# -----------------------
# Repair with GhostScript
# -----------------------

# generate repaired PDF
gs -q -dNOPAUSE -dBATCH -dSAFER -sDEVICE=pdfwrite -sOutputFile="${TMP_REPAIRED}" "${TMP_ORIGINAL}"

# place the repaired file next to the original
gvfs-copy "${TMP_REPAIRED}" "${DOC_BASE} (GhostScript repaired).${DOC_EXT}"

# remove temporary file
rm -f "${TMP_REPAIRED}" 

# ------------------
# Repair with MuTool
# ------------------

# generate repaired PDF
mutool clean "${TMP_ORIGINAL}" "${TMP_REPAIRED}"

# place the repaired file next to the original
gvfs-copy "${TMP_REPAIRED}" "${DOC_BASE} (MuPDF repaired).${DOC_EXT}"

# ------------
# Notification
# ------------

# get document name and convert URI format
DOC_NAME=$(basename "${DOC_URI}")
DOC_DISPLAY=$(urlencode -d "${DOC_NAME}")

# send desktop notification
notify-send -i pdf-repair "${DOC_DISPLAY} repaired"

# remove temporary files
rm -f "${TMP_ORIGINAL}" "${TMP_REPAIRED}"

/usr/share/applications/pdf-repair.desktop
[Desktop Entry]
Type=Application
Exec=pdf-repair
Hidden=false
NoDisplay=false
Icon=pdf-repair
Keywords=pdf;repair;broken;corrupted;easyjet;
X-GNOME-Autostart-enabled=true
Name[en_US]=Repair corrupted PDF
Name[en]=Repair corrupted PDF
Name[C]=Repair corrupted PDF
Name[fr_FR]=Réparer PDF corrompu
Comment=Tool to repair corrupted PDF files with Ghostscript. It works well on boarding passes issued by the EasyJet site.
Comment[en_US]=Tool to repair corrupted PDF files with Ghostscript. It works well on boarding passes issued by the EasyJet site.
Comment[fr_FR]=Outil de reparation de fichiers PDF corrompus. Fonctionne sur les cartes d'enregistrement du site EasyJet.
MimeType=application/pdf;application/x-pdf;application/x-bzpdf;application/x-gzpdf;
Categories=GNOME;GTK;Viewer;Graphics;Utility;

Both files are available from my GitHub account.
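
Once both files have been saved locally, putting them in place could look like the sketch below. `install_pdf_repair` is a hypothetical helper, and `DESTDIR` is an optional staging prefix added for this example so the copy can be tried without root rights.

```shell
#!/bin/bash
# Copy the script and its desktop entry to their final locations.
# With DESTDIR empty the real system paths are used (root rights needed).
install_pdf_repair() {
    local script="$1" desktop="$2" root="${DESTDIR:-}"
    mkdir -p "$root/usr/local/bin" "$root/usr/share/applications" || return 1
    cp "$script" "$root/usr/local/bin/pdf-repair" || return 1
    chmod 755 "$root/usr/local/bin/pdf-repair" || return 1      # must be executable
    cp "$desktop" "$root/usr/share/applications/pdf-repair.desktop" || return 1
    chmod 644 "$root/usr/share/applications/pdf-repair.desktop"
}
```

For a one-off installation the same result can be obtained directly with `sudo install -m 755 pdf-repair /usr/local/bin/` and `sudo install -m 644 pdf-repair.desktop /usr/share/applications/`.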

4. File Manager Integration

For full desktop integration, the repair tool should be available as a custom action in your file manager's context menu.

This menu entry should be displayed for any file with a PDF MIME type.

With the latest Extension for Menus and Actions of the freedesktop.org Desktop Entry Specification (DES-EMA), this integration has become quite easy.

You just need to declare the new custom action in a .desktop file placed under $HOME/.local/share/file-manager/actions.

~/.local/share/file-manager/actions/pdf-repair-action.desktop
[Desktop Entry]
Type=Action
Icon=pdf-repair
Name[en_US]=Repair broken PDF
Name[en]=Repair broken PDF
Name[C]=Repair broken PDF
Name[fr_FR]=Réparer PDF corrompu
Tooltip[en_US]=Use Ghostscript to rebuild a faulty PDF file
Tooltip[en]=Use Ghostscript to rebuild a faulty PDF file
Tooltip[C]=Use Ghostscript to rebuild a faulty PDF file
Tooltip[fr_FR]=Utilise ghostscript pour réparer un fichier PDF mal construit
Profiles=repair_pdf;

[X-Action-Profile repair_pdf]
Exec=pdf-repair %u
MimeTypes=application/pdf;application/x-pdf;application/x-bzpdf;application/x-gzpdf;
Name[en_US]=Default profile
Name[en]=Default profile
Name[C]=Default profile

5. Nautilus specific

If you are using the Nautilus file manager, you need one extra step to get this right-click menu.

Nautilus implements the DES-EMA specification through an extra nautilus-actions package, which needs to be installed:

Terminal
# sudo apt install nautilus-actions

Once it is installed, launch the Nautilus Actions application and, in its preferences, unselect "Create a root Nautilus-Actions menu":

[Screenshot: Nautilus Actions preferences]

As Nautilus does not display menu icons by default, you also need to enable this feature.

Terminal
# gsettings set org.gnome.desktop.interface menus-have-icons true

6. Complete Installation Procedure

A complete installation script is available from my GitHub account.

This script handles package installation and downloads the icon and the scripts.

You just need to download and run it:

Terminal
# wget https://raw.githubusercontent.com/NicolasBernaerts/ubuntu-scripts/master/pdf/pdf-repair-install.sh
# chmod +x pdf-repair-install.sh
# ./pdf-repair-install.sh

After following all these steps, you should get:

  • a new Repair corrupted PDF application
  • a new Repair broken PDF right-click menu entry on PDF files

[Screenshot: the Repair broken PDF entry in the right-click menu]

Hope it helps.

This article is published "as is", without any warranty that it will work for your specific need.
If you think this article needs some complement, or simply if you think it saved you lots of time & trouble,
just let me know. Cheers!
