What is the value of the --high-quality=yes setting with scanimage?

September 4th, 2009 by matthias

On the Agfa snapscan e20:

  • Time scanning an A4 page in 600dpi gray, without quality scan: 94s.
  • Time scanning an A4 page in 600dpi gray, with quality scan: 94s.
  • Also, no differences were visible to the eye in the scanned test image.

A speed or quality difference might still show up when scanning at a resolution lower than the maximum.

Posted in AGFA Snapscan e20, Graphikbearbeitung, SANE, Sprache: Englisch, alle Artikel | No Comments »

How to interpolate a grayscale scan into a b/w image with higher resolution, i.e. how to remove anti-aliasing so as to obtain the resolution that the anti-aliasing tries to simulate?

September 4th, 2009 by matthias

Scale the image to a bigger size. The type of interpolation used (none, linear, cubic, Lanczos3) might affect the result, but the effect is not strong and has yet to be determined more exactly.

Apply maximum contrast and adjust the brightness to your needs. If diagonal borders now have many “stairs” of 1px height, this is what you want.
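
The two steps above could be sketched with ImageMagick as follows. This is a minimal sketch, not the exact procedure used here: the function name gray_to_bw, the default scale factor of 400% and the default threshold of 50% are assumptions for illustration, and -threshold stands in for “maximum contrast plus brightness adjustment”.

```shell
# Upscale a grayscale image and reduce it to pure black and white.
# Arguments: input file, output file, scale in percent (assumed default 400),
#            threshold in percent (assumed default 50).
gray_to_bw() {
  local in="$1" out="$2" factor="${3:-400}" threshold="${4:-50}"
  # Lanczos is one of the interpolation types mentioned above;
  # pixels brighter than the threshold become white, all others black.
  convert "$in" -filter Lanczos -resize "${factor}%" \
    -threshold "${threshold}%" "$out"
}
```

A call such as `gray_to_bw page-gray.png page-bw.png 400 45` (hypothetical file names) would scale by a factor of 4; the threshold percentage plays the role of the brightness adjustment.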

This procedure shows that a low-resolution grayscale image contains more information than a higher-resolution black-and-white image, as the latter can be constructed from the former. Therefore, master files for archiving documents should preferably be grayscale images.

Calculation: one pixel with 256 shades contains the brightness information of a 16×16 pixel array of black and white pixels (as each black one can be thought of as making the array’s shade one step darker). If borders can be determined well enough by interpolation, a grayscale image therefore contains the information of a black-and-white image with 16 times the dpi resolution. The grayscale pixel takes 8 bits, while the equivalent black-and-white image takes 256 bits for the same area (256 black-and-white pixels), and so has 32 times the file size.
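
The arithmetic of this calculation can be reproduced with plain bash integer math. A small sketch; the function name bw_equivalent is made up for illustration, and file compression is ignored, as in the calculation above.

```shell
# For a gray pixel with the given number of shades and bits per pixel, print
# the side length of the equivalent black-and-white pixel block, the number
# of bits that block takes, and the resulting size ratio.
bw_equivalent() {
  local levels="$1" gray_bits="$2" side=1
  # Smallest square block that has at least "levels" pixels.
  while (( side * side < levels )); do
    side=$(( side + 1 ))
  done
  local bw_bits=$(( side * side ))
  echo "side=${side} bw_bits=${bw_bits} size_ratio=$(( bw_bits / gray_bits ))"
}

bw_equivalent 256 8
```

For 256 shades at 8 bits per pixel this prints a side length of 16, 256 bits and a size ratio of 32.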

Posted in Graphikbearbeitung, Sprache: Englisch, alle Artikel | No Comments »

How to do high-quality digital facsimiles, with minimum file size and the option to print out books in near-original quality?

September 4th, 2009 by matthias

The goal: the closer the image comes to a 300dpi rendering of an imagined PDF file of the scanned book, the better. That is: white is just pure white in every pixel, the same goes for single-color areas, fonts should have anti-aliasing, there should be no JPG compression of letters, etc. Whatever brings the file nearer to that goal goes into the “master files”; all other steps that lead to information loss relative to that goal go into additional files.

You don’t have to record how you created the master files, and creating them can also include manual, file-specific steps. But it’s a good idea to have all work based on the master files (like scaling, JPG conversion, PDF creation) done by scripts, as this work will be repeated with various settings, while the master files won’t be recreated. This interface definition also allows you to re-create derived works after you’ve done some further fine-tuning of the master files. This procedure makes it possible to start with a quick-and-dirty solution (relatively rough master files) and to improve it incrementally, as time permits.

  1. Print and cut out two strips of thick paper and tape them to the scanner, to use as horizontal and vertical end stops for designating the origin (0,0) on the scanner glass.
  2. Define the scanner window to be the exact page size, or double page size for A5 books, which makes automated page extraction from the files possible.
    1. DIN A5 page at 600 dpi: 3507×4960px
    2. DIN A4 page or 2 DIN A5 pages at 600dpi: 4960×7015px, 210×297mm
  3. Find the device name: scanimage -L
  4. scan:
    scanimage \
      --device-name=snapscan:libusb:001:012 \
      --format=tiff \
      --high-quality=yes \
      --resolution 600dpi \
      --mode Color \
      -t 0 -l 0 -x 210 -y 297 \
      --batch=%02d.tif --batch-prompt --batch-start 1

    (This command applies to both A4 and A5; for A5 we’ll partition the images later on.)

  5. Remove halftone rasters, by applying Gaussian blur, but only to the parts that actually consist of images, and saving the images as PNG (compression level 2).
  6. Automatic page extraction from the files. This is necessary if you scanned two A5 pages at once on an A4 sheet. (However, the scanimage command could also be changed for this.)
    1. separating pages into files:
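      for file in *.png ; do
        # Geometry matches 300dpi A4 scans (2480x3508px); for 600dpi scans
        # use 4960x3507+0+0 and 4960x3507+0+3507 instead.
        # +repage drops the canvas offset that -crop otherwise keeps.
        convert "$file" -crop 2480x1754+0+0 +repage "${file/.png/.1.png}";
        convert "$file" -crop 2480x1754+0+1754 +repage "${file/.png/.2.png}";
      done;

      Note that without +repage, the GIMP preview and view for the PNG images created with the second command were empty, although kuickshow showed the images correctly; the crop offset retained in the file is the likely cause.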

    2. rotating the pages (90° right, if in landscape format):
      for file in *.png ; do
        mv "$file" "${file/.png/.orig.png}";
        # "90>" rotates only images that are wider than they are high
        convert "${file/.png/.orig.png}" -rotate "90>" "$file";
      done;
  7. adjust brightness and contrast so that white and black actually become pure white and pure black (this will reduce file size dramatically)
  8. scale to 300dpi
  9. save as PNGs with maximum compression level (9 in GIMP, -quality 105 for ImageMagick)
  10. the above PNG files are the master files you archive (therefore we use lossless compression); you can now add further optimizations to copies of them, for specific presentation variants (e.g. PDF for screen or print usage)
  11. optimization for small file size, near-optimal printing and optimal screen reading
    1. in areas with text and few-color drawings, get the colors reduced by image processing, to reduce file size
    2. save as PNG with 16 or 32 colors (optimized palette) for pages where you have few colors (i.e. just the text and drawings, no photos)
    3. delete fuzzy borders:
      convert -page A4 -density 28x28 \
        -border 100x100 -bordercolor white page.*.sw.png buch.pdf
    4. for low-quality originals (e.g. inkjet printouts), screen readability is better in grayscale than in black and white, with a lower resolution to compensate for the bigger file size. Printability is, of course, better in black-and-white mode at a higher resolution, but this can be constructed from the grayscale version by scaling and applying a threshold. Therefore, the master images should be grayscale in these cases.
    5. PNG file size of grayscale scans can be reduced by 90% by applying higher contrast, so that letters become mostly black and paper mostly white. Contrast should be around 80-90 (of 128), so that gray pixels still preserve the anti-aliasing information.
    6. PNG File size for 300dpi grayscale A4 text page scans (without images) is approx. 4.5 MiB before all optimizations, and 0.9 MiB after applying high contrast. This is acceptable for master files, and gives a very good basis for optimizing in various directions, as much information is contained.
    7. So the best settings to scan low-quality black and white originals are: 300dpi grayscale, no high-quality scanning. This is both time-efficient and includes all the information that is there to be captured.
  12. conversion to PDF
    1. for file in *.png; do convert "$file" "${file/.png/.pdf}"; done;
    2. pdftk page.*.pdf cat output book.pdf;
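
The pixel sizes in step 2 and the crop geometries in step 6 can be derived from the paper size and the scan resolution. A minimal sketch, assuming awk is available; the function names mm_to_px and crop_geometries are made up for illustration, and values are truncated to whole pixels to match the figures given in step 2.

```shell
# Convert a length in mm to whole pixels at a given dpi (truncated).
mm_to_px() {
  awk -v mm="$1" -v dpi="$2" 'BEGIN { printf "%d\n", mm / 25.4 * dpi }'
}

# Print the two ImageMagick crop geometries that split a scan of
# width_mm x height_mm (at the given dpi) into an upper and a lower half.
crop_geometries() {
  local w h half
  w=$(mm_to_px "$1" "$3")
  h=$(mm_to_px "$2" "$3")
  half=$(( h / 2 ))
  echo "${w}x${half}+0+0"
  echo "${w}x${half}+0+${half}"
}

# A4 scan window at 600dpi, as used in the scanimage command above
crop_geometries 210 297 600
```

For a 600dpi A4 scan this prints 4960x3507+0+0 and 4960x3507+0+3507. At 300dpi the truncation gives a half height of 1753, one pixel less than the 1754 used in step 6.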

Posted in Gimp, Graphikbearbeitung, ImageMagick, Sprache: Englisch, alle Artikel | No Comments »

How to convert to PDF with correct page sizes if convert does not interpret the images’ dpi settings?

September 4th, 2009 by matthias

Enter the dpi setting as an option to convert, like this:

convert -density 97x97 *.jpg file.pdf
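
Which value to pass to -density follows from the pixel width of the images and the paper width they should fill. A small sketch, assuming awk is available; the function name density_for and the example image width of 802px are hypothetical and not taken from the files used here.

```shell
# dpi needed so that an image of the given pixel width fills the given
# paper width in mm (rounded to a whole number).
density_for() {
  awk -v px="$1" -v mm="$2" 'BEGIN { printf "%.0f\n", px / (mm / 25.4) }'
}

# Hypothetical example: an 802px wide image on A4 width (210mm)
density_for 802 210
```

With these example numbers the result is 97, which would then be passed as -density 97x97.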

Posted in Graphikbearbeitung, ImageMagick, Sprache: Englisch, alle Artikel | No Comments »

How to correct the dpi setting of a bunch of JPG images?

September 4th, 2009 by matthias

# Note: convert decodes and re-encodes each JPEG, so this is not lossless.
for file in Seite\ *;
  do convert -density 97x97 "$file" "${file/.jpg/.97x97.jpg}";
done;

Posted in Graphikbearbeitung, ImageMagick, Sprache: Englisch, alle Artikel | No Comments »

How to print “real copies” (without re-transmitting the data) with the page laser printer Epson EPL-7100, using Linux?

September 7th, 2008 by matthias

If you want to produce a larger number of copies, avoiding the retransmission of data is of course desirable for performance reasons (with this printer, 6 pages/min instead of 0.5 pages/min, where the latter value is due to the extremely slow transmission).

The printer offers two general possibilities for that: setting this via the printer driver, or setting this in the menu on the printer itself. The first approach seems to be impossible on Linux (I vaguely remember that it was possible once). The second approach is possible, but not with the default (Laserjet2) printer driver. Instead:

  1. Enter the desired number of copies on the printer itself.
  2. Switch to the system menu of the printer itself, save the current configuration as “Macro 1” and then load “Macro 1”.
  3. Use Gutenprint to print your file to the printer. Only then are the copy settings stored in the printer used; the Laserjet2 driver overwrites them. To use Gutenprint, print from GIMP. The easiest way to print any document in GIMP is to convert it to PDF, open the PDF in GIMP using the printer’s maximum resolution (here 300dpi), and then print it with GIMP.

Posted in Gimp, Linux, Sprache: Englisch, alle Artikel | No Comments »

How to re-encode a MP3 audiobook to fit on a MP3 player?

August 12th, 2008 by matthias

Say you have a large MP3 audio book (or song file collection; in this case, the Bible) and want to put it on an MP3 player, but it does not fit. Then you can re-encode it, using lame on Linux. While a 128kbps MP3 file (normal audio book quality) takes approx. 1MiByte/min, a “lame -V9” variable-bitrate, maximum-compression MP3 takes just 0.4MiByte/min. When applied to spoken language, the loss in quality is small.
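
The relation between bitrate and file size per minute can be checked with a short calculation. A sketch, assuming awk is available, 1 kbit = 1000 bits and 1 MiByte = 1048576 bytes; the function names are made up for illustration.

```shell
# MiByte per minute for a given constant bitrate in kbit/s.
mib_per_min() {
  awk -v kbps="$1" 'BEGIN { printf "%.2f\n", kbps * 1000 / 8 * 60 / 1048576 }'
}

# Average bitrate in kbit/s that corresponds to a given MiByte per minute.
kbps_for_mib_per_min() {
  awk -v mib="$1" 'BEGIN { printf "%.0f\n", mib * 1048576 * 8 / 60 / 1000 }'
}

mib_per_min 128           # 128kbps file
kbps_for_mib_per_min 0.4  # the size observed with lame -V9
```

This gives 0.92 MiByte/min for 128kbps, and an average of about 56kbps for the 0.4MiByte/min observed above.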

To convert your audio book (stored in dirs and files below them), use the following:

for dir in *; do \
  for file in "$dir"/*; do \
    lame -V 9 "$file" "${file/.mp3/_V9.mp3}"; \
  done; \
done;

Then, to sort out the newly generated files, and rename them to short names, do the following:

mkdir AudioBookV9;
for dir in AudioBook/*; do \
  mkdir "${dir/AudioBook/AudioBookV9}"; \
  mv "$dir"/*_V9.mp3 "${dir/AudioBook/AudioBookV9}"; \
done;
for file in AudioBookV9/*/*; do \
  mv "$file" "${file/_V9.mp3/.mp3}"; \
done;

You now have all relevant files in the directory AudioBookV9, and if the size fits (check with “du -h AudioBookV9”), you will want to transfer it to your MP3 player. So mount your MP3 player as a mass storage device (here, to /media/misc/), and do this:

for dir in AudioBookV9/*; do \
  sudo mkdir "/media/misc/${dir/AudioBookV9/}"; \
done;
for file in AudioBookV9/*/*; do \
  sudo cp "$file" "/media/misc/${file/AudioBookV9/}"; \
done;

Note that we don’t simply copy the whole AudioBookV9/* tree recursively. That will work in most cases, but for the KINGZON MP3 Player M3ZH it did not. With this player, the order in which files are written to the device is important, as “normal playing mode” means that the device plays files in this order, not in alphabetical order. Copying whole directories causes the files to be written in the order in which they appear in the directory tables, so you’ll have all your chapters mixed up on the device. So use the workaround, or you’ll hear the end of your fascinating novel way too early ;)
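
The workaround can also be wrapped in one function. A minimal sketch: the function name copy_in_order is made up, it relies on bash expanding globs in sorted order, and it omits the sudo used above, assuming the target directory is writable.

```shell
# Copy all files from the subdirectories of src to dst, one file at a time,
# in the sorted order in which bash expands the globs.
copy_in_order() {
  local src="$1" dst="$2" dir file name
  for dir in "$src"/*/; do
    dir="${dir%/}"          # strip the trailing slash
    name="${dir##*/}"       # last path component
    mkdir -p "$dst/$name"
    for file in "$dir"/*; do
      [ -e "$file" ] || continue   # skip empty directories
      cp "$file" "$dst/$name/"
    done
  done
}
```

With the directories used above, the call would be `copy_in_order AudioBookV9 /media/misc`.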

Posted in Audiobearbeitung, Sprache: Englisch, alle Artikel | No Comments »

How to change the margins of PDF files while keeping the page size (“scale the content”)?

August 2nd, 2008 by matthias

This describes how to achieve that on Linux. There are multiple alternatives, but none of them has proved perfect yet. Let’s start.

Alternative 1: using pstops

The task can be performed with pstops (of which psnup is a simplified frontend).

  1. Convert your PDF file to a PS file, by printing to a file in Adobe Reader, or by using pdf2ps (Ghostscript-based) or pdftops (xpdf-based).
  2. Use pstops to adjust the margins:
    pstops -p a4 "L@.9(1cm,1cm)" in.ps out.ps

    On mounting pages: here, the task is to mount two A4 pages in A5 format on one A4 page, guaranteeing page margins of 3cm at left and right and 2cm at top and bottom. We need a width of 150mm and a height of 257mm. To scale 297mm (A4 height) to 150mm, use factor 0.505. Such n-up mounting together with freestyle adjustment of margins is not possible with psnup, which has a simpler user interface.

    pstops -p a4 "2:0L@.505(18cm,2cm)+1L@.505(18cm,14.85cm)" in.ps out.ps

  3. Convert the PS file back to PDF by using ps2pdf (pdftops only converts in the other direction, from PDF to PS).
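
The scale factor and the three conversion steps could be wrapped as follows. This is a sketch under assumptions: the function names scale_factor and rescale_pdf are hypothetical, pdftops is used for the first step, ps2pdf for the final PS-to-PDF conversion, and the paper size is written in the attached form -pa4.

```shell
# Scale factor needed to fit a source length into a target length.
scale_factor() {
  awk -v target="$1" -v source="$2" 'BEGIN { printf "%.3f\n", target / source }'
}

# Convert in.pdf to PS, apply a pstops page specification, convert back.
# Arguments: input PDF, output PDF, pstops page specification.
rescale_pdf() {
  local in="$1" out="$2" spec="$3" tmp status
  tmp=$(mktemp -d) || return 1
  pdftops "$in" "$tmp/in.ps" &&
    pstops -pa4 "$spec" "$tmp/in.ps" "$tmp/out.ps" &&
    ps2pdf "$tmp/out.ps" "$out"
  status=$?
  rm -rf "$tmp"
  return "$status"
}

scale_factor 150 297   # A4 height scaled to 150mm
```

The last line prints 0.505, the factor used in the n-up example; a call such as `rescale_pdf in.pdf out.pdf "L@.9(1cm,1cm)"` would then run the whole chain.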

The problem with pstops (from PSUtils Release 1 Patchlevel 17), and also with its frontend psnup, is that it converts fonts to bitmap fonts (Adobe Type 3). This shows up as rasterized fonts when viewing the PDF file with Adobe Reader. The print quality is somewhat lower, but still acceptable. What is not acceptable (with respect to file size) is that pstops converts the whole file to an image if it has no idea how to treat it.

To debug pstops and psnup output, you can use the -b option, which will mark out the original pages’ borders.

Alternative 2: Using Adobe Reader and printer margins

This has not yet been worked out, but might be possible. When printing (to a file or otherwise) Adobe Reader will fit the pages into the printable area of the selected printer. Now the idea would be to choose the special printer “Custom …”. This allows you to enter a lp print command. Per the lp documentation, CUPS lp understands options to set the margins. For 2cm at top and bottom and 3cm at left and right, use this command:

/usr/bin/lp -o page-top=57 -o page-bottom=57 -o page-left=85 -o page-right=85

However, this currently does not work out, for an unknown reason. It does not change anything, i.e. the document is printed as if you had selected “Page scaling: none” instead of “Page scaling: fit to printable area”. If this command is not possible, another alternative would be to set up a printer definition with exactly the margins you desire, in a way that lets you change these margins easily.
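
Independent of whether lp honors them, the point values in the lp command above follow from 72 points per inch and 2.54cm per inch. A small sketch, assuming awk is available; the function names cm_to_pt and lp_margin_options are made up for illustration.

```shell
# Convert centimeters to PostScript points (rounded to whole points).
cm_to_pt() {
  awk -v cm="$1" 'BEGIN { printf "%.0f\n", cm / 2.54 * 72 }'
}

# Build the margin options for lp from a top/bottom and a left/right
# margin, both given in cm.
lp_margin_options() {
  local v h
  v=$(cm_to_pt "$1")
  h=$(cm_to_pt "$2")
  echo "-o page-top=${v} -o page-bottom=${v} -o page-left=${h} -o page-right=${h}"
}

lp_margin_options 2 3
```

For 2cm and 3cm this prints the same options as used above (57 and 85 points).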

If you just need “larger” margins around your page (without exact measures), you can do the following:

  1. Print the file with Adobe Reader to a PS file. Use the option “Page scaling: fit to printable area”. Try several different printers, including the “Custom …” special printer, to find one that adds margins of the same size all around the page. This step will generate a PS file with larger margins than the original PDF file had, even though this is not correctly shown in the preview of Adobe Reader 8.1.1.
  2. Convert the file to a PDF file by using ps2pdf.
  3. Repeat from step 1 with your new PDF file until the margins are large enough for you.

This method has the obvious disadvantage that the margins cannot be specified exactly, but at least the file retains vector fonts and graphics (unlike when using pstops, see above).

If you need the margin adjustments in combination with Adobe Reader’s n-up printing (e.g. 2 pages per sheet), and if you can adjust the margins in your source file (before generating the initial PDF), you can do the following: adjust the margins in the original file so that, after the n-up scaling, these margins together with the selected printer’s margins result in the margins you desire. When printing two A4 pages on one A4 sheet with the special printer “Custom …” in Adobe Reader 8.1.1, the following margins are used (measured in the output A4 page):

  • left 5.25mm
  • right 13.71mm
  • top 6.59mm
  • bottom 6.59mm (probably)

Alternative 3: Using Adobe Reader and printing the “current view”

Another alternative:

  1. Use Adobe Reader to print your page to a large sheet without scaling.
  2. Use ps2pdf to re-distill the PS file to PDF.
  3. Open the new PDF file in Adobe Reader and choose an appropriate view. Use the “print current view” option in Adobe Reader to print exactly that view.
  4. Re-distill the PS output to PDF.
  5. That way, you can even achieve n-up mounting: say you generated two pages, each with content in a different place and the rest white space. You can overlay these pages on top of each other using pdftk with the background option.

Alternative 4: Using ghostscript

The following would constitute an elegant solution: freely adjustable margins, vector fonts and images, and no GUI hassles. Sadly, it does not work yet. The idea comes from this thread.

gs \
  -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dSAFER \
  -dCompatibilityLevel="1.3" -dPDFSETTINGS="/printer" \
  -dSubsetFonts=true -dEmbedAllFonts=true \
  -sPAPERSIZE=a4 -sOutputFile="out.pdf" \
  -c "<</BeginPage{0.5 0.5 scale -90 rotate -2384 0 translate}>> setpagedevice" \
  -f in.pdf

The command works, apart from the important -c option, which is meant to add a PostScript command. The command would also work with PDF files.

Posted in Acrobat Reader, DTP, Sprache: Englisch, alle Artikel | No Comments »

How can I do arbitrary vector-oriented modifications to a PDF file on Linux, though all the editors are still buggy?

August 1st, 2008 by matthias

If you have a PDF which you want to modify (and can’t get hold of the source format file), you might have tried scribus to open and change it. This works well for some simple cases, but there are many import errors for more complex files. Also, the same might happen with Inkscape if you have managed to convert the file to SVG format, e.g. using CorelDraw on a Windows virtual machine.

What’s always possible is to import the file in GIMP (“File -> Open”), using a high resolution such as 300dpi. But you might want to avoid the big file sizes and loss in quality, and want to do it vector-oriented.

Therefore, the following solution should work out:

  1. Create a low-resolution raster image version of the PDF file, e.g. by opening it in GIMP (with 150dpi) and saving it in JPG format.
  2. Create an OpenOffice.org Draw drawing with the same page size as your PDF file, and place the raster image file into the background.
  3. Now that you have the placement of things, change the appearance by adding other elements on top of your background. You might want to get some vector-oriented elements out of the PDF file by importing the file in Scribus, saving it as an SVG or EPS file, and importing that in OOo Draw.
  4. Export the OOo file to PDF, temporarily removing your background for that. The good thing is that all OOo Draw elements become high-quality vector-oriented PDF elements.
  5. Add your new PDF file on top of the original one, by saying something like:
    pdftk in.pdf stamp overlay.pdf output out.pdf;

As the original vector elements are never converted from PDF to another format and back, no import / export bugs can get in the way!

Posted in DTP, Graphikbearbeitung, OpenOffice, Sprache: Englisch, alle Artikel | No Comments »

How can I combine two PDF files so that one is overlaid on the other?

July 21st, 2008 by matthias

This might become necessary if you want to add content to an already existing PDF file, e.g. when doing pre-press work for flyers etc. For example, you might resize one PDF file to gain additional space on the page, and put other content into that free space.

Get yourself two PDF files to overlay on each other, both of the same page size (else automatic resizing is employed), and both one-sided.

Then use this command to generate the combined PDF file with overlays:

pdftk file1.pdf background file2.pdf output out.pdf

Posted in Acrobat Reader, DTP, Sprache: Englisch, alle Artikel | No Comments »