This is one of my journal's many "channels."
This channel only shows items related to "Programming."

Wednesday, November 21, 2007

An Entirely Other Day - Wide vs. Deep

So here’s my theory: Managers must work shallow and wide, while programmers must work narrow and deep. People who are naturally tuned to one particular method of work will not only enjoy their jobs a lot more, but be better at them. I’m a deep guy, I should be doing deep work.

This article and his theory remind me of something (er, someone) which seems to be completely unrelated: Michael Jordan. When he retired from the Bulls for the first time (shortly after his Dad died) to see if he could play Major League Baseball, he found it very difficult to hit those legendary pitches.

What's the connection? The pitching coach (of either the White Sox or the AA team where he played... the Barons?) said his problem was one of focus. When you play basketball, you have to be aware of everything going on around you all the time. Peripheral vision is key. When you're trying to hit a 90-mile-per-hour baseball, you need absolute tunnel vision, total focus on that one task.

That's the difference between managing and programming.

Like the author of "Wide vs. Deep," I've done both and I prefer programming. (Managing my crew at Macrobyte during its heyday was fine, but I'm referring to my time at RR Donnelley in the mid-'90s.)

(Thanks to DF for the original link.)

Monday, November 12, 2007

Faster Code (Converting HTML to Text)

Over the years I've written this code in a few different languages: take some HTML input, process it according to some of the basic rules a browser would use, and spit out plain text (no tags or HTML entities). By "basic rules a browser would use," I mean that, e.g., a series of <p> tags should not result in a long blank gap, but a series of <br /> tags should. Line breaks (\r or \n) don't matter except within a pre-formatted section. Etc., etc.

My first attempt, I think, was in straight UserTalk. Then I rewrote it a couple of times with regular expressions (still UserTalk) to make it faster. Then a client needed it in a language that could be used on any Mac OS X box, so I rewrote it in Perl, which was faster still (and was much better about converting the HTML entities to Unicode).

The Perl script uses lots of regular expressions, and so makes many passes over the input, changing the text in place. It worked well enough for most HTML, but long documents with a very high ratio of tags-to-text (that is, very tag heavy) would process very slowly. Unfortunately the script was run automatically in the background by a "regular" GUI application, and so the app would seem to freeze up for a little while as it processed one of these pathological cases.

Over the last week I rewrote it again, this time in pure C++. It's a command line tool with the same basic interface that the Perl script had: you can pass it arguments to specify the input and output files. Omitting either one causes it to use standard input or standard output instead.
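For illustration, that kind of interface can be sketched in a few lines of Python. This is not the actual tool (which is C++): the function name `open_streams`, the positional argument order, and the use of "-" as a placeholder for "use the standard stream" are all assumptions made for the example.

```python
import sys


def open_streams(args):
    """Return (input, output) streams for zero, one, or two path arguments.

    Hypothetical convention: args[0] is the input path, args[1] the output
    path, and a missing argument or "-" means the standard stream.
    """
    in_path = args[0] if len(args) > 0 and args[0] != "-" else None
    out_path = args[1] if len(args) > 1 and args[1] != "-" else None
    src = open(in_path, encoding="utf-8") if in_path else sys.stdin
    dst = open(out_path, "w", encoding="utf-8") if out_path else sys.stdout
    return src, dst
```

A caller would typically pass `sys.argv[1:]`, read everything from the first stream, and write the converted text to the second.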

The new tool makes a single pass through the text, doesn't use any regular expressions, and generates slightly better output. Actually, it's more honest to say that it makes three passes through the text: first it converts UTF-8 to UTF-16 (but that's an OS API service), then it processes the UTF-16, then it converts back to UTF-8 (again, just done by the OS) for output.
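To make the single-pass idea concrete, here is a minimal sketch in Python. It is not the C++ tool's code, just one way to apply the rules described earlier (runs of <p> collapse to one blank line, every <br /> produces a line break, source line breaks only matter inside <pre>) in one left-to-right scan without regular expressions. The function name `html_to_text` is made up for the example, and it ignores plenty of real-world cases such as <script> and <style> contents, comments, and ">" inside attribute values.

```python
from html import unescape


def html_to_text(html: str) -> str:
    """Convert HTML to plain text in one left-to-right scan (no regexes)."""
    out = []                # pieces of output text
    para_pending = False    # a blank line is owed before the next visible text
    space_pending = False   # collapsed whitespace is owed before the next text
    in_pre = False          # inside <pre>, whitespace is kept verbatim
    i, n = 0, len(html)

    def emit(text):
        nonlocal para_pending, space_pending
        if out:
            if para_pending:
                out.append("\n\n")      # any number of <p> tags: one gap
            elif space_pending and not out[-1].endswith("\n"):
                out.append(" ")
        para_pending = space_pending = False
        out.append(text)

    while i < n:
        ch = html[i]
        if ch == "<":
            end = html.find(">", i)
            if end == -1:
                break                   # unterminated tag: drop the rest
            parts = html[i + 1:end].rstrip("/ ").split()
            name = parts[0].lower() if parts else ""
            if name in ("p", "/p"):
                para_pending = True
            elif name == "br":
                out.append("\n")        # every <br> counts
                space_pending = False
            elif name in ("pre", "/pre"):
                in_pre = (name == "pre")
                para_pending = True
            i = end + 1
        elif ch == "&":
            end = html.find(";", i, i + 12)
            if end == -1:
                emit("&")               # a bare ampersand, not an entity
                i += 1
            else:
                emit(unescape(html[i:end + 1]))
                i = end + 1
        elif ch in " \t\r\n" and not in_pre:
            space_pending = True        # collapse runs of whitespace
            i += 1
        else:
            emit(ch)
            i += 1
    return "".join(out)
```

For example, `html_to_text("<p>One</p>\n<p></p><p>Two &amp; three</p>")` gives two paragraphs separated by a single blank line, while `html_to_text("a<br /><br/><br>b")` keeps all three line breaks.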

The timing results speak for themselves. These tests use the worst, most pathological example we had. It's a 200 KB file that's about 90% tags (specifically, it's a long email exchange where everybody top-posted and quoted everything else, and everyone used HTML messages).

$ time striphtmltags.pl - < ./striphtmltags.input.html > ./striphtmltags.output.txt # old one
 
real    0m20.201s
user    0m19.774s
sys     0m0.352s
 
$ time newstriphtml < ./striphtmltags.input.html > ./striphtmltags.output.txt # new one
 
real    0m0.048s
user    0m0.039s
sys     0m0.010s

They wanted it faster. For this worst-case scenario, it's 420 times faster. Zoom.
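The figure comes straight from the two "real" times in the transcript above. A quick check in Python (the variable names are just for the example):

```python
old_real = 20.201   # seconds, "real" time reported for striphtmltags.pl
new_real = 0.048    # seconds, "real" time reported for newstriphtml

speedup = old_real / new_real
print(f"{speedup:.1f}x faster")
```

The ratio works out to a little over 420.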

Thursday, September 27, 2007

My First Core Image Filter (for Acorn)

While playing with Acorn, I was trying to figure out how to make a scan of some black and white line art appear as black-and-transparent, so it would look as though it was "drawn" directly onto whatever background it was placed over.

Not *quite* as easy as it sounds, for a couple of reasons. The biggest problem is that you can't just give all white pixels an opacity of 0% (or 100% transparency). Black and white line art "scans" are not just black and white, even after you clean them up: the edges of the black have a sort of anti-aliasing: light gray pixels that help to smooth out the image. The second problem is that Acorn doesn't support channel-based editing. You can change opacity and color all you want to, but you can't just view and edit the image's alpha channel.

I asked Gus if he thought I should do an Acorn plugin (and described what I was trying to accomplish). See, years ago I wrote a little plugin for a little company called Adobe, for their little app called Photoshop (v.4, I think). The money was good, and we still have the B&W G3 and 17" display that they sent me along with the check. Anyway, I figured if I could write a Photoshop plugin for Adobe, I could write an Acorn plugin for my own use, right? Right!?

Gus said I would be better served writing a Core Image Filter. It's like a filter/plugin, but it's for the OS instead of a specific app.

I didn't make time to work on it until today. It took a couple of hours to find the right documentation and actually write the filter. Gus had sent me the heart of the filter in the form of a single, very short function that just manipulated the alpha value of a pixel based on the pixel's combined r/g/b values.

Apple's docs for this stuff are surprisingly bad. I found one tutorial that helped a bit, and then in the end I just had to try a few things to make it work.

Three Images, Three Backgrounds

Plain black and white image, no transparency.
White converted to 100% transparency. The 'halo' comes from the light gray pixels that smooth out the black lines when the image is over a white background.
Image after applying the 'White to Alpha' filter in Acorn.

So what does it do, exactly? Well, the idea was to display black-and-white line art so that it would look "good" on any background (not just white). In Photoshop, the easiest way to do this is to copy the artwork to the alpha channel, invert the channel (100% -> 0%, 0% -> 100%, 25% -> 75%, etc.), go back to the main image (either the RGB channels or the Gray channel) and bring the brightness way down and the contrast way up so that any pixels which were anything other than pure white are now pure black.

(Why? Because their "gray" levels will now be opacity levels, so they will blend against any background. What was light gray is now solid black but mostly transparent.)
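The per-pixel idea can be sketched in Python. This is not the filter's actual kernel (that is a Core Image kernel and isn't reproduced here); the function names are made up, channel values are assumed to be floats from 0.0 to 1.0, and using a plain average for the "combined" r/g/b value is an assumption.

```python
def white_to_alpha(r, g, b):
    """Turn a gray-ish pixel into pure black whose opacity equals its darkness."""
    brightness = (r + g + b) / 3.0   # assumption: plain average of the channels
    alpha = 1.0 - brightness         # white -> 0.0 (transparent), black -> 1.0
    return (0.0, 0.0, 0.0, alpha)    # the color itself is always black


def composite_over(pixel, background):
    """Blend a non-premultiplied RGBA pixel over an opaque RGB background."""
    r, g, b, a = pixel
    return tuple(a * c + (1.0 - a) * bg for c, bg in zip((r, g, b), background))
```

A light gray edge pixel such as (0.75, 0.75, 0.75) becomes black at 25% opacity. Composited over white it comes back as the same light gray, and composited over red it becomes a slightly darkened red instead of a gray halo.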

That's how I'd do it in Photoshop. Now I can do it in Acorn with a single click. :-D

If you want it, feel free to download a copy of the filter. Copy it to /Library/Graphics/Image Units/ or ~/Library/Graphics/Image Units/. I've used it in Acorn, Core Image Fun House, and Quartz Composer… so I can assure you that it works on my machine. ;-)

Update: If you install it, you'll find it in Acorn under the Filters->Stylize menu. Maybe not the best place for it, but I'm not sure where it should be.

Tuesday, August 28, 2007

PMC Software Auction 8: Web Developer's Paradise

The eighth auction is running, and is called “Mac Software Bundle: Web Developer's Paradise.”

This is the most ‘focused’ auction to date. These are tools for web (site or app) developers, and include high-end apps like BBEdit for source editing, Coda and Sandvox for designing, SQLGrinder for databases, Interarchy for FTP, and Screen Mimic for screencasts. (I do this work for a living, and seriously wish I had all of these apps!) Also included are Daylite and Billings, to help on the business side of being a web developer.

Yeah, ok, that last sentence sounded like a lame attempt at marketing, but it wasn't. I'm serious. I don't have Coda, Sandvox, SQLGrinder, or Screen Mimic, and could certainly use them. (So maybe it's not a good idea for me to sell a sweet bundle like this to my competition!)

Two last details: there's no reserve this time, and it's a seven day auction instead of five.

Saturday, August 18, 2007

GarageSale and the PMC Software Auctions

I'm taking a minute to post a big "Thank You" to the folks at iwascoding, the makers of GarageSale.

They didn't "just" donate five licenses of GarageSale to the auction. They also gave me a license of my own to use for the auctions, AND... no, wait, this deserves its own paragraph.

AND... they added a feature to GarageSale specifically because I needed it!

Here's the deal: if you're doing charity auctions on eBay, you need to have an account with MissionFish, who guarantees that the charity receives the money. When you create the auction, you need to provide your MissionFish info, choose your charity (out of a list of thousands), and specify what percentage of the final sale will go to the charity.

It's a serious pain in the neck, just like everything else related to setting up an auction on eBay (IMO).

With GarageSale, though, it's easy. They remember my MissionFish account in the prefs. They remember my choice of charities. I can create new auctions almost instantly, based on previous auction templates. Plus, the software has this practically endless list of auction styles!

This project will always be a huge amount of work, but I'm whittling away at the process and it's growing easier all the time. GarageSale, with its charity-friendliness, is my number-one time saver, though. (Next year, my Rails app for managing all the data for this project will be the number-one time saver. I'm already using it this year, and it's saving me time, but I also wrote it this year so the real savings won't start until next year.)

Um... "P.S." Last time I did this, I got free copies of Knox, VoodooPad, FlySketch and FlyGesture, but I forgot to thank Marko and Gus. Thanks guys!



TruerWords
is Seth Dillingham's
personal web site.
More than the sum of my parts.