Working around problems with xargs

Having access to sites that provide books as one PDF per chapter, I often just want to download all of the PDF files and concatenate them with pdftk. All the sites I’ve come across have had rather obscure naming schemes for the PDF files, so you cannot just run pdftk *.pdf cat output ready.pdf.

So I wrote myself a BHOL (bash-huge-one-liner):

cat <(ls prejunk.pdf) \
    <(ls -t1 *chap*|sed -Ee 's/extract chapter number/\1\t&/'|sort -n|cut -f2) \
    <(ls postjunk.pdf) \
        | xargs echo pdftk {} cat output output.pdf

Well, of course that does not work: xargs does not handle {} by default; it has to be enabled with the -I argument. Having done that, xargs outputs 2 + chapter_count would-be pdftk invocations, one per input line, which of course is wrong. According to the man page, -I (like -i and --replace) implies -L 1, which makes it impossible to build a single command containing all the file names.
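The difference is easy to see with a few made-up file names. This is only a sketch: a.pdf, b.pdf and c.pdf are hypothetical stand-ins for the real chapter list, and echo keeps pdftk from actually running.

```shell
# Three hypothetical file names, one per line, stand in for the chapter list.
names=$(printf '%s\n' a.pdf b.pdf c.pdf)

# With -I, xargs runs the command once per input line, substituting {} each time.
per_line=$(printf '%s\n' "$names" | xargs -I{} echo pdftk {} cat output output.pdf)

# Without -I, xargs appends all the items to the end of a single command.
appended=$(printf '%s\n' "$names" | xargs echo pdftk)

printf '%s\n' "$per_line"
printf '%s\n' "$appended"
```

The first variable holds three separate would-be invocations, the second a single one with every name at the end. That is why the workaround has to add cat output output.pdf outside of xargs.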

This of course sounds insane. I didn’t want to get distracted (and go through the source code looking for the reasoning behind -L 1), so I came up with a workaround:

echo $(cat <(ls prejunk.pdf) \
    <(ls -t1 *chap*|sed -Ee 's/extract chapter number/\1\t&/'|sort -n|cut -f2) \
    <(ls postjunk.pdf) \
        | xargs echo pdftk) cat output output.pdf

Now the final command looks OK. Of course, if the file list grew past your system’s limit on command-line length, you’d have a problem right there, but let’s assume it doesn’t.
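If you want a rough idea of how far away that limit is, something like the following might do. It is a sketch only: the three file names are hypothetical, and the comparison ignores the environment variables, which count against the same limit.

```shell
# Upper bound, in bytes, on the arguments plus environment of a new process.
arg_max=$(getconf ARG_MAX)

# Size of a (hypothetical) file list; $(( )) strips any padding wc may add.
list_bytes=$(( $(printf '%s\n' a.pdf b.pdf c.pdf | wc -c) ))

echo "limit: $arg_max bytes, list: $list_bytes bytes"
```

When the list is too long for one command line, xargs quietly splits it over several invocations, so here a second "pdftk" would show up in the middle of the file list.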

So how do we execute this? I was already reaching for the mouse for some quick-and-dirty copy-pasting when I remembered that, more often than not, bash has tried to execute my assignments and tests as if they were meant to be executables. So the answer is: just remove the first echo, and the whole line will be executed.
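A tiny illustration of the trick, with made-up file names and an inner echo standing in for pdftk so that nothing really gets run:

```shell
# Captured as plain text, the substitution is just a string.
built=$(echo echo pdftk a.pdf b.pdf)

# Placed at the start of a command, its output is split into words
# and the first word ("echo") is run as the command.
ran=$($(echo echo pdftk a.pdf b.pdf))

printf '%s\n' "$built"
printf '%s\n' "$ran"
```

Only word splitting and globbing are applied to the substituted text. Quotes, pipes and redirections inside it are not interpreted, so this works for a plain list of file names but not for anything fancier.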

Hmph, I just noticed that there was no real reason to use cat or xargs: I could have just wrapped the chapter-listing part in $() inline, as in pdftk first.pdf $(...) last.pdf cat output output.pdf. Well, maybe next time.
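For the record, a sketch of what that simpler version might look like. The x-chap<N>.pdf naming pattern is purely an assumption for the sake of the example (the real sites each had their own), the files are empty placeholders in a temporary directory, and echo is left in front so the command is only printed.

```shell
# Hypothetical layout: empty placeholder files named like x-chap<N>.pdf.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch prejunk.pdf postjunk.pdf x-chap10.pdf x-chap2.pdf x-chap1.pdf

# Prefix each name with its chapter number, sort numerically, drop the prefix.
chapters=$(ls *chap* | sed -E 's/.*chap([0-9]+).*/\1 &/' | sort -n | cut -d' ' -f2-)

# $chapters is left unquoted on purpose so that it splits into one word per file.
# Remove the echo to actually run pdftk.
cmd=$(echo pdftk prejunk.pdf $chapters postjunk.pdf cat output output.pdf)
printf '%s\n' "$cmd"
```

As with the xargs version, file names containing whitespace would break this, since the list is split on whitespace.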
