PDF from multiple PDFs or Images! Reduce PDF size through image compression! (Mac OS X 10.7.3)

I had 3 goals at the outset:

  1. Create a PDF from an image (JPEG, TIFF, PNG, etc.)
  2. Combine PDFs into one larger PDF
  3. Reduce the image quality of the PDF to a manageable size, while preserving readability. This process is not recommended for expert resizing of pictures.
I accomplished all three, but the process wasn’t what I expected. I learned how to do the above, and the approach below turned out to be simpler:
  1. If you are working with only images, it is useful to highlight all images and “Open With…” then select “Preview” and then Print to PDF. It joined all my images into a single PDF.
  2. If you are working with only PDFs, you can go straight to the section I’ve written below on Multiple PDFs to PDF.
  3. Reduce the image quality of the PDF to a manageable size, while preserving readability.

There is a trade-off with the method in step 1: if you Print to PDF in Preview, it will create a white border around all of the images in your new PDF. The alternative is to create a Service in Automator that will go through all your images and convert them to PDFs, then condense all PDFs to one PDF. I haven’t fleshed out this process yet, as I’m satisfied (i.e. unwilling to complain) with the white border.

I’ve been encumbered by the inability to create PDFs from multiple PDFs or multiple images for some time. Today I decided it would be an essential skill so I learned how to do it, and now you’re going to learn. It was another beautiful day where I felt justified in purchasing my Macbook Air. Note that the majority of this instruction was compiled from about 6-8 different websites (so you don’t have to go crazy over tiny misconceptions and errors), so I thought I would throw it all into this one!

Some helpful sites (but note that my write-up is a condensed version of all of these sites to save you some time):

Create PDFs from images
Multiple Page PDF from PDFs
Combine PDF files Service
Reduce file size through quality
Resolving Quartz Filter Issues

Multiple PDFs to PDF

  • We’re going to use the built-in tool “Automator” to get things done. You can access Automator by going to your Launchpad and then clicking on Automator.
  • “Choose a type for your document:” immediately pops up, so you’re going to want to choose “Service.”
  • “Drag actions or files here to build your workflow” is the location where you will drag commands that will execute sequentially (top to bottom). Those commands can be accessed from the left-hand side where you see the library for Actions and Variables. The easiest way to find your desired action is by searching. There is a search bar next to “Actions” “Variables” in the top left of your Automator screen.
  • There is a section above the workflow area that says “Service receives selected” and has a drop-down menu. From the drop-down, select “PDF Files.” You’ll notice that to the right, “in any application” is selected, and should be fine.
  • First, we’ll search for Combine PDF Pages and add it to our workflow. Make sure the “Appending Pages” radio button is selected if desired. Note that there are “Options” you can choose from. Explore all the options just so you feel comfortable with the workflow.
  • Then search and add to workflow “Copy Finder Items.” I usually send the Finder items to Desktop. I do not check “Replace existing files.”
  • Then search and add to workflow “Rename Finder Items.” You can change the “Add Date or Time” pulldown to whatever naming convention customization you’d like. You should look at options and select “Show this action when the workflow runs.”
  • Finally, search and add to workflow “Move Finder Items.” I usually default moving the items to my Desktop.

After you’ve completed these steps, go to File and then Save! I saved mine as “Combine PDFs.”

These are the workflow steps within Automator to Combine PDFs.

Once you’ve closed Automator, test out your new Service. Highlight multiple PDF files and then Control + Click them to bring up several Finder options, and then scroll down to Services, where you should see Combine PDFs as one of the options.

Select your PDFs and ctrl+click, select Services, then select Combine PDFs and you’re done!

Let me know if it didn’t work for you in the comments and we’ll sort it out! Make sure you execute all the steps correctly before you accuse me of shenanigans!

Custom Reduction of Image Quality Using ColorSync Filters

I needed to reduce a PDF that I combined from something outrageous (somewhere between 200 MB and 600 MB) to something manageable (like 2-6 MB). How can we shrink the file by two orders of magnitude and still have recognizable quality? I had the same question. It turns out for practical purposes, if someone scanned a page of writing in very high TIFF quality (each image ~15 MB), they’ve overdone it. So let’s get it back to something more manageable.
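To put a number on that reduction, here is a minimal Python sketch of the arithmetic. The sizes are the rough figures quoted above, not measurements, and the function name is made up for illustration.

```python
import math

def orders_of_magnitude(before_mb, after_mb):
    """Return how many powers of ten separate two file sizes."""
    return math.log10(before_mb / after_mb)

# Rough figures from the text: 200 MB -> 2 MB and 600 MB -> 6 MB.
for before, after in [(200, 2), (600, 6)]:
    ratio = before / after
    print(f"{before} MB -> {after} MB: {ratio:.0f}x smaller, "
          f"{orders_of_magnitude(before, after):.1f} orders of magnitude")
```

Either way the target is a file roughly 100 times smaller, which is why a fairly aggressive JPEG quality setting is needed.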

I created a custom ColorSync filter, first. Later, this filter will be applied to my really long PDF as a postprocessing step.

  • Open ColorSync by going to your “Launchpad” –> “Utilities” –> “ColorSync Utility.”
  • When you open ColorSync, navigate to Filters on your new window.
  • You’ll notice a little “+” button at the bottom left of your new window. Click it to add a custom filter, then give it a name.
  • Use the small drop down arrow to the right of your new Filter and select “Add Image Effects Component” and then select “Image Compression.”
  • Navigate to your new filter and expand it. Change the Mode of Image Compression to “JPEG” from the drop-down. I’ve made three filters reducing file sizes from high quality to above-average quality, middle quality, and below-average quality. As far as I know, these filters are written (saved) as you make the adjustments. I have found that the “Reduce File Size High to Below Average” filter (with Quality placed at the 1/4 mark between Min and Max) yields the best results.
  • Close the Filters window, but stay in ColorSync Utility. Create a backup of your desired PDF just in case. Open your desired PDF using ColorSync Utility (go to File –> Open). If you do this, at the bottom you’ll notice there is a Filter option. Change the Filter to your desired new created filter, and then hit Apply!
  • File –> Save a copy of your new file! Now hopefully your file is easier to deal with.
For the compression type:

Choose your type of Filter for the PDF.

And for the Filters:

Create your Filters by using ColorSync Utility.

Good luck!

Installing gcc and gfortran for Mac OS X (10.7.3)

Things you’ll need:

  • Knowledge of how to use the terminal
  • An internet connection
  • A Mac developer account (you can get this as we go along)
  • Copy of Xcode (free)
  • About an hour of your time (30 minutes downloading, 15-30 minutes doing things)

Basic steps:

  1. Download and install Xcode
  2. Download command line tools
  3. Download and install gfortran from other source
Note that if you attempt to only download and install gfortran without gcc you might get the following error!

error trying to exec `as': execvp: No such file or directory

Also note that I performed this installation on a Macbook Air.

gcc

Download and install Xcode by clicking this link, or by searching for it in the Apple App Store, where it can be downloaded for free (see image).

Xcode contains gcc

After you’ve downloaded Xcode, you’ll want to open it and agree to their terms of service. Then, you’ll want to navigate to the menu Xcode –> Preferences –> Downloads. Here you’ll see an option to download Command Line Tools (see image). Note that you’ll need a developer account at this stage, and I was redirected to their developer page where I had to fill out a form and create my account (using my existing Apple ID, where a lot of the form was already auto-filled).

CL Tools

Download Command Line Tools from Preferences --> Downloads

After you have successfully installed the command line tools, open your terminal and type something like:

$ which gcc

which should return the path of your gcc (typically /usr/bin/gcc when it comes from the Xcode Command Line Tools). All of this should have been taken care of automatically.

gfortran

I mentioned at the beginning that I got an error when attempting to use gfortran on my machine before I’d even installed gcc. I found that gcc must be installed in order to use gfortran. But my gfortran installation went smoothly because it’s very straightforward.

Download gfortran from this link.

After considering my hardware, I chose the option:

Mac OS Lion (10.7) on Intel 64-bit processors (gfortran 4.6.2): download (released on 2011-10-20)

The installation has a walkthrough that comes with the package, like many Mac installations. Straightforward and it should also work automatically. Then, open your terminal and type

$ which gfortran

and it should reveal that it was successfully installed in /usr/local/bin.
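If you would rather check both tools in one go, here is a small sketch in Python 3 that does the same lookup as `which`. It only reports what is on your PATH; the helper name `locate` is my own and not part of any installer.

```python
import shutil

def locate(tools):
    """Map each tool name to its full path on PATH, or None if missing."""
    return {tool: shutil.which(tool) for tool in tools}

for tool, path in locate(["gcc", "gfortran"]).items():
    print(f"{tool}: {path if path else 'not found on PATH'}")
```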

Happy programming!

Profiling a simple Fortran code with gprof

I finished working through Chapman’s Introduction to Fortran 90/95, and it was a very interesting (helpful) read. My next step is to work through Chapman’s (no relation?) Using OpenMP, but there are some performance considerations I must first address.

Therefore, I looked into gprof, which is the GNU profiling tool. It will give me an understanding of how quickly my code runs, and which tasks in the workflow are taking up the most resources. Here is what the ifort man pages say about the gprof compiler flag (note that I have a 32-bit processor for this test!):

-p
Compiles and links for function profiling with gprof(1).
Architectures: IA-32, Intel® 64 architectures
Arguments: None
Default: OFF
Files are compiled and linked without profiling.
Description: This option compiles and links for function profiling with gprof(1).
Alternate Options:
Linux and Mac OS X: -pg (only available on systems using IA-32 architecture or Intel® 64 architecture), -qp (this is a deprecated option)

That’s interesting, sure! Eventually I want to apply this to a large code where debugging is a pain. For now, I’m going to focus on a much simpler test case (that I’m taking from Chapman’s Fortran 90/95 book, Example 6-10, pg. 340).

gprof Example with Fortran Code

The example I consider has a function called “ave_value” which calculates the average value of a function between two points first_value and last_value. The function being averaged, “my_function,” is declared as external in the test driver program “test_ave_value” and passed to “ave_value,” which calls it. It’s a very simple program with three .f90 files.
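For readers who don’t have the book handy, here is a rough Python analog of the idea, not the Fortran from Chapman. It assumes the average is taken over n evenly spaced sample points, and the body of my_function is a stand-in I picked for illustration.

```python
def ave_value(func, first_value, last_value, n):
    """Average func over n evenly spaced points in [first_value, last_value]."""
    if n < 2:
        raise ValueError("n must be at least 2")
    delta = (last_value - first_value) / (n - 1)
    total = sum(func(first_value + i * delta) for i in range(n))
    return total / n

def my_function(x):
    # Hypothetical test function, chosen only for illustration.
    return 3.0 * x

# The "driver": pass the function itself as an argument.
print(ave_value(my_function, 0.0, 1.0, 101))
```

The point of the structure is that ave_value knows nothing about my_function until the driver hands it over, which gives the profiler three distinct routines to report on.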

I wrote these functions based on the example given in Chapman, and then  I compiled them with the following command:

$ ifort -p ave_value.f90 my_function.f90 test_ave_value.f90 -o test_ave_value

As a reminder, the -p flag turns on profiling for gprof, and the -o flag lets me name the executable.

Now that you have your executable, you can simply run it, as I did:

$ ./test_ave_value

And you’ll notice that it has generated a “gmon.out” file that can be interpreted by gprof to show you your statistics! Writing gmon.out will overwrite any previous versions that you had in the folder, so use caution. Now, run gprof to interpret the gmon.out file.

$ gprof test_ave_value > tav.output

The tav.output was my re-naming of the gprof output. Now we can view the results of gprof in tav.output, in any competent text editor.

Looking at the Numbers

There is sufficient documentation for understanding gprof numbers on their website, but I’ll hit some critical points. The outputs are separated into the Flat Profile and the Call Graph. The Flat Profile conveys how much time your program has spent executing each function. The Call Graph conveys how much time was spent in the function and its children. You can read more here.
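To make the difference between the two views concrete, here is a toy sketch in Python. The timings are invented for illustration (they are not gprof output), and real gprof attributes child time per caller, which this simplified version ignores.

```python
# Hypothetical self times in seconds; these are NOT measured results.
SELF_TIME = {"test_ave_value": 0.01, "ave_value": 0.04, "my_function": 0.15}
# Who calls whom in this toy program.
CHILDREN = {
    "test_ave_value": ["ave_value"],
    "ave_value": ["my_function"],
    "my_function": [],
}

def total_time(name):
    """Self time plus the time spent in everything called from name."""
    return SELF_TIME[name] + sum(total_time(child) for child in CHILDREN[name])

print("flat profile (self time only):")
for name, seconds in sorted(SELF_TIME.items(), key=lambda item: -item[1]):
    print(f"  {name:15s} {seconds:.2f}")

print("call graph (self + children):")
for name in CHILDREN:
    print(f"  {name:15s} {total_time(name):.2f}")
```

In the flat view the driver looks cheap, while in the call-graph view it accounts for the whole run, because everything else happens underneath it.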

Visualization of gprof results

A quick way to put a visualization together (per the documentation of gprof2dot):

gprof path/to/your/executable | gprof2dot.py | dot -Tpng -o output.png

Here, gprof does not run your program again; it reads the gmon.out file written when you ran the executable (which you’ve already compiled and linked with the appropriate flag!) and prints the profile. That output is piped to a program called gprof2dot, whose output is in turn piped to dot (part of Graphviz), which writes an image file that you can view in any competent image display tool!

Note that if you download gprof2dot, you’ll need to change the permissions to make it executable (for example with chmod +x gprof2dot.py). I tried to run the non-executable version with

$ ./gprof2dot.py

but it would not execute because the file permissions were not set to executable.
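Here is a minimal Python sketch of checking and setting that permission, roughly what `chmod u+x` does. The helper names are my own, and the script is assumed to sit in the current directory.

```python
import os
import stat

def is_executable(path):
    """True if the current user may execute the file at path."""
    return os.access(path, os.X_OK)

def make_executable(path):
    """Add the user-execute bit, similar in spirit to `chmod u+x path`."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR)

if __name__ == "__main__":
    script = "gprof2dot.py"  # assumed to sit in the current directory
    if os.path.exists(script) and not is_executable(script):
        make_executable(script)
```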

Now that I learned this, I’m going to try it on a bigger code. Happy profiling!

Installing Python 2.7.2 in Ubuntu 11.10 – UNRESOLVED?

The bottom line: I have a working python installation because I installed it LOCALLY, but when I attempt a global (or system-wide) installation (using sudo), I run into an error I can’t seem to crack.

HISTORY

My goal is to install Python 2.7.2 so I can integrate it with my parallel workflow. It’s not a very ambitious goal. I installed the Intel C compiler (no sweat), MPICH2, and now I’m at this necessary step before I install mpi4py, a wonderful tool developed here.

SYSTEM INFO

During my installation process, I covered two versions of Ubuntu (because I updated midstream). The two versions were the notoriously friendly 10.04, and now 11.10. These are home editions, not server editions. I’ve done a lot of experimental stuff regarding the graphics on my 10.04, so now a lot of stuff is broken, and I thought upgrading to 11.10 would fix a lot of the harm I caused (it did!). I’m using a 64 bit Intel architecture, Corei7.

PACKAGE INFO

I downloaded the Python-2.7.2.tgz package. I unpacked it somewhere friendly. Then I built somewhere else. I usually do this and it has worked out pretty well so far.

COMMAND HISTORY

Here is a list of the commands that I issue. They should work and install Python, in theory.

Command 1
./configure --prefix=$INSTALL_DESTINATION CC=$INTEL_C_COMPILER 2>&1 | tee c.txt

Note that my $INSTALL_DESTINATION was only root accessible, meaning I needed to specify sudo when making any changes to that directory. I do the fancy tee because it is absolutely necessary to keep me from going mad. Printing a history of what I just did and when I did it is great bookkeeping. Keep in mind that I fiddled with my Intel compiler. I tried to use the 32 bit compilers but it wouldn’t configure. I wasn’t sure if that would help, and now I know it doesn’t.
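If the `2>&1 | tee` idiom is new to you, this Python 3 sketch shows the same idea: merge the error stream into the normal output, show each line on screen, and keep a copy in a file. The function name run_and_tee is hypothetical, just for this illustration.

```python
import subprocess
import sys

def run_and_tee(command, log_path):
    """Run command, echo its stdout+stderr to the screen, and save a copy."""
    with open(log_path, "w") as log:
        process = subprocess.Popen(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # same idea as 2>&1
            text=True,
        )
        for line in process.stdout:
            sys.stdout.write(line)  # the "screen" half of tee
            log.write(line)         # the "file" half of tee
        return process.wait()
```

Calling run_and_tee(["make"], "m.txt") would mirror what the shell pipeline does for the next command, with the return value being the exit code of make.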

Command 2
make 2>&1 | tee m.txt

Again, this is a simple command that will send its output to a text file. At this stage, I got some warnings. I did not format the warning for your reading enjoyment.

compilation aborted for /home/benjamin/Documents/installs/Python-2.7.2/Modules/_ctypes/libffi/src/x86/ffi64.c (code 2)

Python build finished, but the necessary bits to build these modules were not found:
_bsddb _sqlite3 _tkinter
bsddb185 bz2 dbm
dl gdbm imageop
readline sunaudiodev
To find the necessary bits, look in setup.py in detect_modules() for the module’s name.
Failed to build these modules:
_bisect _codecs_cn _codecs_hk
_codecs_iso2022 _codecs_jp _codecs_kr
_codecs_tw _collections _csv
_ctypes _ctypes_test _curses
_curses_panel _elementtree _functools
_hashlib _heapq _hotshot
_io _json _locale
_lsprof _multibytecodec _multiprocessing
_random _socket _ssl
_struct _testcapi array
audioop binascii cmath
cPickle crypt cStringIO
datetime fcntl future_builtins
grp itertools linuxaudiodev
math mmap nis
operator ossaudiodev parser
pyexpat resource select
spwd strop syslog
termios time unicodedata
zlib

At the very least, we presume that our make install may not go as expected. Considering that I was about to install in a root directory, I issued the command

Command 3
$ sudo make install 2>&1 | tee mi.txt

which gave me the error

Traceback (most recent call last):
  File "/opt/Python-2.7/lib/python2.7/compileall.py", line 17, in <module>
    import struct
  File "/opt/Python-2.7/lib/python2.7/struct.py", line 1, in <module>
    from _struct import *
ImportError: No module named _struct
make: *** [libinstall] Error 1

From here, it’s been all suffering and confusion.

A LITTLE PROGRESS

I came across a forum post describing an error very similar to mine (though probably not identical, because their solution didn’t solve my problem). They recommended an upgrade of the make utility. Before the 2010 release, make had last been updated in 2006. Upgrading reduced my “Failed to build these modules” list from above to this:

Failed to build these modules:

_ctypes

My recommendation here is to update your installation of make. Yet, with all of these wonderful improvements, my installation still failed with the same error. To check your existing version of make, go into your terminal window and type

$ make -v

To automatically download version 3.82 of make, click here.

So with make upgraded, I was still running into issues with _struct. I did attempt the solution found in the link I provided to the forum posting. It did not work. But I think it’s a good start. I did not find the system-wide python installation absolutely necessary, so installing it locally was a breeze. I may come back to this later to resolve, but I’ll take any comments below to try them out.

Note: If you keep attempting the installation from source, make sure you run

$ make clean

after every time your

$ make install

fails, before you run make again.

Installing Numpy in Ubuntu 11.04 From Source

Download

In order to obtain the Numpy package, you’ll want to go to their website. Once you’re there you can download the source files. It’s also important to note that there are detailed Numpy installation instructions here. I’m just describing my experience.

Prerequisites

Because Numpy is a tool that requires Python…you’re going to need Python. If you’re using Ubuntu 11.04 and have not yet installed Python (or want to re-install a different version), I have a guide for that here.

Unpacking

I moved my source file to a place where I wanted to unpack it. In my case, I’ve made a local installation of Python, and I plan on making a local installation of Numpy. So let’s say I’m unpacking in:

$ mv numpy-1.6.1rc1.tgz ~/opt
$ cd ~/opt
$ tar -xzvf numpy-1.6.1rc1.tgz
$ rm numpy-1.6.1rc1.tgz

What this will do is move my Numpy source tarball into my desired source directory, unpackage it, and then remove the undesired source tarball. I usually keep a back-up of my source for a while. You can always download the source again later if you want, I just like having it in case I can’t access the internet for some reason, or if I’m installing on a machine that I’m accessing through ssh.

Building

Here is where your installation of Python will serve you. Python executes the setup.py files that you’ll typically see in source directories. In the case of Numpy, you’ll issue this command from the source directory.

$ python setup.py build

Now here’s where the fun starts. Numpy has a lot of options. I’m only going to address a select few and make you aware of others. If optimization is your interest, you’ll want to check out the options in the site.cfg file so that you can configure ATLAS or BLAS. They’re not necessary and I have not tested their effectiveness. But they are widely used in the field of scientific computing and I highly recommend them!

I’m also going to be using the Intel compilers. That’s a link for instruction provided by the SciPy folks themselves! I chose these options when configuring my build:

$ python setup.py build --fcompiler=intel --compiler=intel build_clib --compiler=intel build_ext

I specified both the Intel C and Fortran compilers. There are two steps to this process. The first is the build, the second is the install.

Installation

Once you have completed your Numpy build, it’s time to install. This is the command I used.

$ python setup.py install --prefix=/home/ben/opt/Python-2.7.2/

This is curious for several reasons. As a new user, when I’d typically run configure (from the world of make files), I felt that setting prefixes (which set the installation directory) would go into the configure step. But python does things a little differently. The ‘configure’ step doesn’t really have a true analog with Python build and install. We specify the installation directory during install!
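As a rough sketch of what the prefix controls, the following Python shows where a Unix `setup.py install --prefix=...` typically places things: scripts under bin and packages under lib/pythonX.Y/site-packages. The function name is made up, and the layout is the usual default rather than something read from your build.

```python
import posixpath

def install_layout(prefix, python_version="2.7"):
    """Where a --prefix install typically puts scripts and packages on Unix."""
    return {
        "scripts": posixpath.join(prefix, "bin"),
        "packages": posixpath.join(
            prefix, "lib", "python" + python_version, "site-packages"
        ),
    }

for kind, path in install_layout("/home/ben/opt/Python-2.7.2").items():
    print(f"{kind}: {path}")
```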

Notice that I also installed Numpy in the location of my Python installation. The binaries from the installation end up joining my Python binaries. Now, if you followed this tutorial, you’ll notice that you can just type

$ which f2py

from the command line after installation and see that it is in your path! This is because I installed Numpy where Python is installed. Similar to Python, the Numpy executable ‘f2py’ is in $INSTALL_DIR/bin. If you didn’t use the tutorial, no worries! You would adjust your PATH variable similarly as I described in the tutorial, regardless of your installation directory. There is a lot of documentation out there on modifying your PATH if you still have questions. But I highly recommend using the same location as your Python installation. You may have to make some path adjustments if they are different!

Now that f2py is installed and your PATH is configured, try running Python.

$ python
>>> import numpy

If you can successfully import Numpy, you’re almost in business! Start looking for examples to run and get started! If you can pass tests, you can start the really exciting stuff!
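If you want the check in script form, here is a small sketch. It is written for Python 3 (the f-strings would not run on the 2.7 interpreter installed above), and module_version is a helper name I invented.

```python
import importlib
import importlib.util

def module_version(name):
    """Return the module's __version__ string, or None if it is not installed."""
    if importlib.util.find_spec(name) is None:
        return None
    module = importlib.import_module(name)
    return getattr(module, "__version__", "unknown")

version = module_version("numpy")
print("numpy is not importable" if version is None else f"numpy {version}")
```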

Installing Python 2.7.2 in Ubuntu 11.04

Preface: In this example, $ is the prompt. So you don’t type the dollar sign. It’s just a sign that precedes “this is what I typed into my computer terminal”. These instructions are for beginners. Experts are welcome to chime in.

What’s nice about this guide is that because I specify instructions for a local installation, it can be applied to many types of machines (like compute clusters where you do not have root access).

Beginning the Installation

I just installed the Python 2.7.2 software package locally on my desktop. I’m running Ubuntu Natty Narwhal 11.04. I say “locally” because I’m installing it in my home directory, not system-wide. This is useful when you’re not root on your machine or do not have root authority.

Because Python is already installed on my machine, when I issue the command:

$ which python

to find the directory where Python is executed from, I get something like

/usr/bin/python

But I want to install Python in a new, local directory. So I created a directory.

$ mkdir ~/opt

Then I extracted my Python tar.gz file into this directory.

$ tar -xzvf Python-2.7.2.tgz -C ~/opt

I went into this directory to poke around.

$ cd ~/opt/Python-2.7.2

Customizing Your Installation

I read the INSTALL.txt file for information about how to customize my installation. I found the options I want. In order to install Python locally, I specify my “prefix” during the configure step. Prefix is basically the location of the installation directory. Right now you’re in the source directory because that’s where the source code is located. I also want to specify my custom C compiler. I use the Intel compiler because it is pretty awesome. Its location on my machine is:

/opt/intel/composerxe-2011.4.191/bin/ia32/icc

Let’s not confuse the two different locations on my machine. Ubuntu has already set up an /opt directory. But I later created a /home/ben/opt directory because that’s what I wanted. You’ll notice I like to use the ‘~’ symbol to represent $HOME. You can find your value of $HOME by typing:

$ echo $HOME
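A quick sketch of how ‘~’ maps to $HOME, using Python’s os.path.expanduser as a stand-in for the shell. One caveat worth knowing, as far as I understand bash: a ‘~’ is only expanded at the start of a word, so it is left alone inside an option such as --prefix=~/opt, and spelling out $HOME is safer there.

```python
import os

def expand(path):
    """Expand a leading ~ to the home directory; leave anything else alone."""
    return os.path.expanduser(path)

print(expand("~/opt"))           # home directory followed by /opt
print(expand("--prefix=~/opt"))  # unchanged: the ~ is not at the start
```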

Configure Python

Anyway, it’s time to configure Python. I had two goals: 1) Install it locally and 2) Specify my C compiler (so it doesn’t default to gcc). Experts know there are several ways of doing this without what I’m about to show you. This tutorial is not for experts. In your terminal you’ll need to be in the Python source directory where you unpacked your files to issue this command.

$ ./configure --prefix=$HOME/opt/Python-2.7.2 CC=/opt/intel/composerxe-2011.4.191/bin/ia32/icc

Now your machine should configure Python. This may take up to a few minutes. The next step is exciting. Your version of Python will be compiled when you issue the next steps.

Python Make

$ make && make install

Now be aware that ‘make’ and ‘make install’ are two separate commands. I just used ‘&&’ to issue them on one line for convenience, so they execute back to back without my further instruction (because ‘make’ and ‘make install’ can sometimes take a while).

*NOTE* If you previously attempted this step and screwed up, or wrote something you didn’t want to in the configure step, you’ll have to reissue your ‘configure’ with the appropriate options and then:

$ make clean

before you continue with ‘make’ and ‘make install’.

Assigning Your PATH to Find Python

After you finish ‘make install’, Python should finish its compilation. Now I want to do a little organizing. I’d like to be able to execute Python by typing:

$ python

from a fresh terminal window. Right now, if I try to execute Python that way, it launches from /usr/bin as I mentioned earlier. This is not what I desire! We can change this by adjusting our path. We’ll go into the .bashrc file and edit with vim. Of course, you’re free to edit with whatever editor you like.

$ cd ~
$ vim .bashrc

Now once we’re viewing the file, get in insert mode and scroll down to assign your path. My path line looks like this after my edits:

PATH="/home/ben/opt/Python-2.7.2/bin:$PATH"

I put my installation directory that I specified during prefix into my path. It’s represented by /home/ben/opt/Python-2.7.2/. But what’s ‘bin’ have to do with this (very few of you ask…but some may)?

Two important things to note. If I put $PATH at the beginning of my quoted statement, it wouldn’t have found my desired Python. That’s because your shell uses the first match it finds in your path. That would have been the one in /usr/bin (or whatever else you previously specified). So I had to drag $PATH to the rear of my line. Also, Python is not executed from the directory

~/opt/Python-2.7.2

but from

~/opt/Python-2.7.2/bin

The $INSTALL_DIR/bin is where the Python binary is located. That’s the typical location for binaries.
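To see the “first match wins” rule in action without touching your real PATH, here is a sketch that builds two throwaway directories, each holding a fake python, and looks the name up in both orders. It assumes a Unix-like system, and the helper names are hypothetical.

```python
import os
import shutil
import stat
import tempfile

def fake_tool(directory, name):
    """Create a tiny executable stand-in called name inside directory."""
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write("#!/bin/sh\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path

def first_match(name, directories):
    """Look name up in directories, in order, as the shell does with PATH."""
    return shutil.which(name, path=os.pathsep.join(directories))

with tempfile.TemporaryDirectory() as system_bin, \
        tempfile.TemporaryDirectory() as local_bin:
    fake_tool(system_bin, "python")
    fake_tool(local_bin, "python")
    # Local directory first: the local copy wins.
    print(first_match("python", [local_bin, system_bin]))
    # System directory first: the system copy wins.
    print(first_match("python", [system_bin, local_bin]))
```

This is the same reason the new bin directory goes in front of $PATH in the .bashrc line rather than behind it.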

Final Steps

Now save your .bashrc file. Close your terminal window and open a new one. Type:

$ which python

It should be the location of the Python you just installed.

$ python -V

This will tell you the version of Python you’re running. Now just type ‘python’ to get started!

$ python