Saturday, August 10, 2013


I'm a reasonably technical guy, but this really surprised me when I first read about it: a broad range of Xerox brand copiers can change the numbers in a document being copied.  Xerox has owned up to the problem, and appears to be approaching the resolution of it in a constructive way.  It may not be just Xerox copiers, either.  As David Kriesel (the researcher who first reported the issue) points out, this can have consequences that are costly (e.g., an invoice with the wrong numbers on it) or even deadly (e.g., a bridge design with the wrong specifications).

My expectation of copiers is that they will produce copies that are perfect duplicates of the original, within the resolution constraints of the device.  They might have optical distortion, color distortion, streaks, etc. – but they shouldn't just move shit around or make shit up!

But truthfully, I should have known better.  After all, I have some idea how a modern copier actually works.  They do not work the way many people imagine, which is kind of like a camera.  Not at all.  Instead, every modern copier is actually a small, special-purpose computer with (at minimum) a scanner and a printer attached.  Fancy copiers may also have a whole bunch of paper-handling mechanisms attached, and possibly multiple scanners and printers, too.  But for this discussion, let's stick with a simple copier like you might have in your home office.  We have one such copier, which also works as a scanner and a printer – a multifunction device.

These copiers work like this when you press the “Copy” button:
  1. The original document is scanned into the computer's memory as an image.
  2. The image is compressed and stored on the computer's hard disk.
  3. The image is decompressed and printed.
The second step is where the problem lies.  Actually two problems.

The first one has been long recognized, and isn't the main subject of this post: many (probably most, actually) copiers store copies of the documents scanned on the hard disk indefinitely.  The hard disks can easily store millions of pages of documents, so there's not even much danger of filling one up.  That's a security problem for just about anybody, even homeowners, as someone could steal that hard disk and retrieve copies of anything you've ever copied.

The other problem is the new one, and it's related to the image compression step.  “Compression” is a method for reducing the size of the image files the copier saves to disk.  The more compression, the more pages that can be stored.  The Xerox copiers (and many others as well) allow users to configure exactly what kind of compression is used.  The most important choice is between lossless and lossy compression.  Lossless compression will make a perfect copy of the document, but without as much compression.  Lossy compression will make an imperfect copy of the document, but with (much) more compression.  Lossy compression methods (algorithms) achieve their higher compression by compromising on some aspect of the copy.  There are many different ways to do this, with different consequences for the quality of the resulting image copy.
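To make the lossless/lossy distinction concrete, here's a minimal sketch in Python.  It is not what any copier actually runs: the "scan" is made-up pixel data, zlib stands in for the lossless method, and the "lossy" method is simple quantization (rounding each pixel down to a multiple of 64) that I picked purely for illustration.

```python
import random
import zlib

# Made-up 8-bit grayscale "scan": dark background (about 12) and light
# stripes (about 200), each pixel with a little simulated sensor noise.
rng = random.Random(0)
pixels = bytes(
    (200 if (i // 20) % 2 else 12) + rng.randrange(8)
    for i in range(6000)
)

def lossless_roundtrip(data):
    """Compress with zlib, then decompress; return (restored, packed size)."""
    packed = zlib.compress(data, 9)
    return zlib.decompress(packed), len(packed)

def lossy_roundtrip(data, step=64):
    """Throw away detail first (round down to a multiple of step), then compress."""
    quantized = bytes((p // step) * step for p in data)
    packed = zlib.compress(quantized, 9)
    return zlib.decompress(packed), len(packed)

restored_exact, size_exact = lossless_roundtrip(pixels)
restored_lossy, size_lossy = lossy_roundtrip(pixels)

print("lossless copy identical:", restored_exact == pixels)
print("lossy copy identical:   ", restored_lossy == pixels)
print("compressed sizes (lossless, lossy):", size_exact, size_lossy)
```

The lossless copy comes back bit-for-bit identical; the lossy copy does not, but it packs into far fewer bytes because the noise it discarded was the hard part to compress.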

One of the lossy compression choices Xerox provides is an algorithm called JBIG2.  The preceding link has a good general discussion of how JBIG2 works, but here's the relevant bit for this issue: for text (including numerical data), JBIG2 recognizes characters (much like OCR, if you're familiar with that) on the page.  This recognition process is far from perfect, and can easily result in one character being mistaken for another.  The JBIG2 algorithm keeps track of the context of what it's compressing – if it sees a column of numbers, it's much more likely to match another number than it is to match a letter.  So the mistakes it makes in recognizing characters to match can, in fact, result in choosing the wrong number.

So what?  Well, let's say the original document has a “5”, but JBIG2 matches a “2” instead.  The copy that gets printed will have a “2” where the “5” should have been.  Oops.

The actual process is much more complicated than I just described.  There's one element of that complexity that matters for this discussion: JBIG2 can be configured for different degrees of “looseness” on the character matching.  The problem I described really only matters when this configuration is set for high levels of compression (which use loose matches).
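Here's a toy sketch of how loose matching can swap one digit for another.  To be clear, this is not the JBIG2 algorithm or Xerox's implementation: the 3 x 5 glyph bitmaps, the `max_diff` threshold, and all the function names are hypothetical, chosen only to show the idea of storing one representative shape and reusing it for anything that looks "close enough".

```python
# Hypothetical low-resolution glyphs, 3 pixels wide by 5 tall.
GLYPHS = {
    "1": ("..#", "..#", "..#", "..#", "..#"),
    "6": ("###", "#..", "###", "#.#", "###"),
    "8": ("###", "#.#", "###", "#.#", "###"),
}

def distance(a, b):
    """Count the pixels that differ between two same-sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def compress(page, max_diff):
    """Return (symbol dictionary, references into it) for a list of bitmaps."""
    symbols, refs = [], []
    for bitmap in page:
        for idx, known in enumerate(symbols):
            if distance(bitmap, known) <= max_diff:
                refs.append(idx)          # close enough: reuse a stored shape
                break
        else:
            symbols.append(bitmap)        # nothing close: store a new shape
            refs.append(len(symbols) - 1)
    return symbols, refs

def decompress(symbols, refs):
    """Rebuild the page by pasting the stored shape for each reference."""
    return [symbols[i] for i in refs]

def read_back(page):
    """Turn bitmaps back into characters, for display only."""
    lookup = {bitmap: ch for ch, bitmap in GLYPHS.items()}
    return "".join(lookup[b] for b in page)

original = [GLYPHS[ch] for ch in "1681"]
strict = read_back(decompress(*compress(original, max_diff=0)))
loose = read_back(decompress(*compress(original, max_diff=2)))

print(strict)  # 1681
print(loose)   # 1661
```

In this toy the "6" and the "8" differ by a single pixel, so with the loose setting the "8" gets replaced by the already-stored "6".  The loose setting also stores fewer shapes, which is exactly why it compresses better.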

Xerox has a firmware patch that fixes the problem by the simple expedient of disabling JBIG2 compression.  As Mr. Kriesel points out, getting that patch to all of Xerox's copiers in the field is not simple.  Inevitably there will be customers who never get the message – and if they've configured their copier(s) to use JBIG2 at high compression, they could be in for some very rude – and possibly costly or deadly – surprises...

A big oops!

There's a Message Here...

Just read in the news that the rate of Americans renouncing their citizenship has jumped sixfold – and over 100,000 people (not all Americans, though) have signed up for a one-way trip to Mars.

Used to be that people saw America as the golden land of opportunity.  Now they're willing to go to great lengths to get out of here.

The drums of doom.  I hear them beating...

Quote of the Day...

Nobody beats Mark Steyn for provoking laughter, anger, and tears in the same piece, and sometimes even in the same sentence.  His current piece (Know Thine Enemy) at NRO is a great example of his art, this time on the trial of Major Hasan.  The quote:
“ can’t get much more diverse than letting your military personnel pick which side of the war they want to be on.”
Do read the whole thing...

What the Hell is Happening to My Country, Part 49,925...

Repeal the Bill of Rights?  You betcha!

Words fail me...

Filthy Filner: Just About Out of Support...

Weeks after the public revelations started, the last Democrats willing to defend Filthy Filner are turning on him.  The last city council members have called for his resignation, as have California's two Democratic Senators and nearly all of his former Democratic colleagues in the U.S. House of Representatives.  Even the national lamestream media is starting to take notice, though (of course!) they rarely mention Filner's party association.

Michelle Malkin has a good piece on the Filthy Filner coverup at NRO, and San Diego's 10 News has a straight up piece on the latest twist in the Filthy Filner Saga: allegedly he started his intensive therapy a week early, and so finished a week early – yesterday.

Will Filthy resign?  Will he fight on?  For locals, it's an entertainment smorgasbord, with every choice quite tasty...

Traffic Jam Prank...

Via my mom:

4k TV...

Debbie and I were down the hill yesterday, doing various and sundry errands.  Along the way we had a lovely meal of eastern scallops, definitely the best part of the day!  We also ran into completely unreasonable traffic, reminding us once again of why we really don't like coming down off the hill very much :(

But one of the things we did was kind of fun – we wandered into Fry's Electronics to see what a 4k TV (also called Ultra High Definition TV, or UHD TV) actually looked like.  The model we looked at was one of Sony's latest.  They're quite expensive now, but certain to come down in price very quickly as more manufacturers get into the game and the competition heats up.

We weren't there to buy one, just to look and see what the fuss was about.  We own an HD TV now, so that was the benchmark for our comparison.  Both of us were very impressed; in my case, much more so than I was expecting to be.  What got my attention was this: from four feet away from the screen, if I stood perfectly still while watching an outdoor scene, I could almost imagine that I was looking at a window and not a TV.  The result was better than any slide projector I've ever seen – so immediately I thought about what a great way to show off still photography this would make.  The movies were impressive, too, but I rarely watch either TV or movies, so that's not too compelling for me.  Debbie, on the other hand, is a big movie fan, and watches quite a bit of TV, too.  Her reaction was to comment on the three dimensionality of the sample movies, and the resolution.  When content is available (and the price comes down!), she wants one.

Reader and friend Cliff F. passed along this article explaining about 4k TV; lots of good information in it.

From a technical guy's perspective, there's a lot of marketing spin at work here.  First of all, the horizontal resolution isn't actually 4k (4,096), it's 3,840, yielding an image size of 3,840 x 2,160 pixels (for a 16:9 aspect ratio).  But even this is misleading, as (most of) the cameras used to capture this content don't actually have 3,840 RGB-sensing pixels on a line (which would require 3,840 x 3 = 11,520 sensor cells per line).  Instead, they use sensors with 3,840 sensor cells per line and a technology called “Bayer filters” that provides 1/4 the areal resolution you think you're getting – then uses a “de-mosaicing” algorithm to computationally estimate what an actual RGB sensor would have seen.  This is, of course, very different from the 3x CMOS cameras, which are apparently impractical for 4k TV; these actually deliver the resolution specified.
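A quick way to see where the resolution goes is to count how many sensor cells see each color.  This sketch assumes the common RGGB Bayer layout (a repeating 2 x 2 tile of one red, two green, and one blue cell); an actual camera's layout may differ, and `bayer_color` is just a name I made up for the illustration.

```python
from collections import Counter

WIDTH, HEIGHT = 3840, 2160  # UHD frame size, one sensor cell per pixel

def bayer_color(x, y):
    """Color seen by the cell at (x, y), assuming an RGGB mosaic."""
    if y % 2 == 0:
        return "R" if x % 2 == 0 else "G"
    return "G" if x % 2 == 0 else "B"

# The pattern repeats every 2 x 2 cells, so count one tile and scale up.
tile = Counter(bayer_color(x, y) for y in range(2) for x in range(2))
tiles = (WIDTH // 2) * (HEIGHT // 2)
cells = {color: n * tiles for color, n in tile.items()}
total = WIDTH * HEIGHT

for color in "RGB":
    print(color, cells[color], cells[color] / total)
```

Under that assumption, red and blue are each sampled at only a quarter of the cells and green at half of them; everything else in the final RGB image is estimated by the de-mosaicing step.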

Being an engineer, this kind of misleading specification really ticks me off.  If someone told me I was buying a “4k” TV camera with a 16:9 aspect ratio, I'd expect to have a 4,096 x 2,304 pixel camera.  That would imply (to me!) a sensor with 4,096 x 2,304 x 3 (for RGB) = 28,311,552 individual sensors on it.  What I actually get is 3,840 x 2,160 = 8,294,400 individual sensors.  That's less than 30% of what I expected from the marketing claim.  Annoying!
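For anyone who wants to check my arithmetic, here it is as a few lines of Python.  The variable names are mine; the numbers are the ones above.

```python
# What "4k at 16:9 with full RGB sensing" would imply: three cells per pixel.
expected_cells = 4096 * 2304 * 3

# What a 3,840 x 2,160 sensor with one cell per pixel actually has.
actual_cells = 3840 * 2160

ratio = actual_cells / expected_cells

print(expected_cells)  # 28311552
print(actual_cells)    # 8294400
print(f"{ratio:.1%}")  # 29.3%
```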

I don't know if the displays for UHD TV are specified in the same misleading way, but I'd bet they are.

In any case, that's all an engineer's rant.  From a user's perspective, the 4k TVs are beautiful, but expensive.  But then, there are a lot of things like that :)

Remember When...

Passed along by reader, friend, former colleague, and Idaho real estate mogul Doug S:

Yes, as a matter of fact, I do remember...