Saturday, June 13, 2009

fundamentals of better programming


Just a quick note. This site, SourceMaking, is probably the best single collection of information on three key areas of software development: Design Patterns, Refactoring and UML.

If you aren't actively practicing the methods described here then you should at least be reading about them and hoping to put them into practice soon. I'm sure you are already a pretty good programmer, so why not become a better one?

P.S.
The clickable geek picture will bring you to the 56 Geeks Project where you can find even more geeks you may recognize.

Sunday, June 07, 2009

PDF bookmarks


This isn't a programming post but if you are a programmer, you've probably had to read a PDF-based ebook or two. Everyone has had to open a PDF file at one time or another, right? You click on a link on a web page which turns out to be a PDF and if you're lucky, Adobe's Acrobat Reader pops up and in a few short moments, you are reading the material you were after. Far too often however you are stuck waiting for a multi-megabyte file to download and then when it finally finishes, Acrobat throws up a baffling dialog with an obscure list of things that it thinks are vital for you to download immediately. It really throws off your rhythm.

It's even worse if you don't have Acrobat installed. It's an ever-changing gauntlet to finally get to a page that actually has a link to download the software you need to read what you just clicked. And Acrobat just keeps growing! It's up to about 41 MB now and seems to want to dig its grimy little fingers into every part of your system. If it ever finishes installing you will probably be told to restart your browser. Eventually you'll have your browser running again and you'll be staring at the Google home page saying to yourself "Where the hell was I and what was I trying to read?!" and "What the hell are all these new icons on my desktop?"

It's experiences like this that really make me hate PDF files. But if you want to read many of the ebooks that are out there, you don't have much choice. I've found myself reading more and more PDF based books so I can't avoid them anymore but I've made a couple of startling discoveries recently that I felt like sharing.

First, Adobe Acrobat Reader isn't the only free reader available for Windows. What? Yes! It's true. There are probably more than these and I haven't rigorously tested them but here they are in order from big, feature loaded to light and fast:
Now the first two are commercial and allow you to get extra features in exchange for cash but the free versions are more than adequate. You may encounter some of the same kind of nagging you get from Acrobat but it should be much less. Sumatra is free and open source.

After spending an hour or more scouring the internet for these Acrobat alternatives I realized that I should have started at this Wikipedia page of everything PDF related. Oh well.

Where was I? Oh yes, the second realization. None of these 'book' readers can do the one basic, fundamental thing that you can do with any crappy, worn out, dead-tree book on the planet: stick some reasonably flat object between a couple of pages and walk away, knowing that when you get back, you can resume reading right where you left off. PDF readers have no bookmark feature!

When I realized I didn't want to read my 243 page book in one sitting, I went searching through the menus in Acrobat looking for the bookmark command. When I finally did find something with the word 'bookmark' on it, it turned out to be hardwired into the PDF file and only if the authors decided to provide some. Hmm, a bunch of bookmarks laid out in sections that mark all the key sections of the book... like, oh I don't know.. a table of contents! Ok, so Adobe thinks a table of contents should be called bookmarks. Well, in a world of internet browsers that's not confusing at all. Geez.

The other menu items in Acrobat seem to hint at the ability to edit or mark up a PDF file but that must be some of the functionality they want you to pay for. This is essentially what started me looking for an alternative to Acrobat. Sumatra looked nice and light but it is only a reader pure and simple, no help there. Foxit however looked much more promising.

I wasn't really keen on editing the PDF itself but that seemed like the only option. It turns out there are a lot of ways to edit a PDF that all sound slightly similar. I'm sure this is great for publishers but it's way too much for somebody just looking for a way to mark their position in a book.

One thing that looked promising was the ability to add what looked like little post-it notes. You could place them anywhere on a page and they could hold all kinds of text. Unfortunately, the only way to navigate to a note was to page through the PDF file one page at a time until you found the note. There was no way to navigate directly to a note! Unbelievable... back to Google.

After some time, I stumbled on a link to a book called PDF Hacks. There are a lot of tips and tricks there, most of which you need to buy the book to see, but down the list at #15 was Bookmark PDF Pages in Reader. It seems that Acrobat can run JavaScript and this guy wrote some JavaScript that will add the ability to bookmark a page. Best of all, it doesn't modify the PDF file to do it. The bookmarks are stored with your application settings for Acrobat instead. The next time you open that PDF file, you can go to any of the bookmarks you created for it. Excellent!

This is what I've been looking for. Unfortunately I have to go back to Adobe Acrobat Reader to use it but I can live with that. If you can live with it too, here's how you do it.
  1. Go to this page and download the zip file that is linked there.
  2. Unzip that file. You should end up with a file named "bookmark_page.js".
  3. Copy that file to the "Javascripts" directory for your copy of Acrobat. You will typically find that directory at "C:\Program Files\Adobe\Reader 9.0\Reader\Javascripts", or something like that.
  4. If Acrobat is running, close it and then run it again and open your favorite PDF file. You should now have some bookmark commands under the "view" menu as shown in the following image.

That should do it. Enjoy!

Update:
It would seem that the JavaScript hack above was written before Adobe made some changes to security settings that affect JavaScript. This may manifest as an error dialog stating there was an internal error. I don't know the Acrobat API or JavaScript well enough to troubleshoot the problem but a workaround is to go to the 'Edit' menu in Acrobat and open 'Preferences'. Find the 'Javascript' settings and uncheck 'Enable global object security policy'. I don't know exactly what implications that change brings, so use with caution.

Sunday, May 17, 2009

java time


I've dedicated myself for the last 2 years to the task of completely mastering C++. I had used C++ on and off for a long time but never made a real attempt to know it, so I read, researched, practiced and read again.

It didn't take long to realize that if you want to use C++ effectively, you had better establish, or learn, some best practices... a lot of them! Every time I peeled back a layer of functionality, I found 5 more underneath. After 2 years, I was starting to wonder... how long will I have to do this? How many gotchas can one language have? Why do people use this fricken language anyway?

One of the more valuable resources I found while searching for good sources of C++ information was this one: the C++ FAQ LITE. If you are a C++ programmer, you should be doing the things that are described here.

This FAQ has gems of information like this:

[18.5] What's the difference between "const Fred* p", "Fred* const p" and "const Fred* const p"?

You have to read pointer declarations right-to-left.

  • const Fred* p means "p points to a Fred that is const" — that is, the Fred object can't be changed via p.
  • Fred* const p means "p is a const pointer to a Fred" — that is, you can change the Fred object via p, but you can't change the pointer p itself.
  • const Fred* const p means "p is a const pointer to a const Fred" — that is, you can't change the pointer p itself, nor can you change the Fred object via p.

This was all well and good until one day I found the C++ FQA LITE (frequently questioned answers). The floodgates were opened. All the things that have been bugging me for years about C++ were laid out in a very well considered manner. If you are a C++ programmer that has had enough, you should read this. Heck, you should read it even if you aren't a C++ programmer, it's very entertaining. You'll find gems such as:

The idea to overload "bitwise exclusive or" to mean "power" is just stupid. I wonder where they get these ideas. It's as if someone decided to overload "bitwise left shift" to mean "print to file". Wait a minute - they did that, too... Oh well.

I had been fooling myself for years that C++ was the way it was for a good reason and that if I just kept at it, it would all make sense and I would be master of all that I survey... NOT. It was a great relief however to know that I didn't have to keep slugging it out. I could give up and not feel ashamed. But now what?

I still have to use C++ at work which is fine but what about my own development projects?

I decided to stick to what I had already considered to be the more important requirements for a language. It had to be well thought out, clean, powerful, backed up by a large selection of libraries and be cross platform. Oh, and I shouldn't have to be a masochist to want to use it everyday.

After some consideration Java and Python were the front runners and after a little more consideration Java seemed the clear winner. Python still looks interesting and I may yet give it some time in the future but for right now, Java is it.

Java has come a long way from when it first burst onto the scene back in 1996 (when JDK 1.0 was released). I have to admit that I held a lot of biased opinions of it over the years based purely on what was known about it in that first year or two of its existence. When I finally got around to giving it serious consideration, I found a robust development framework that I should have adopted years ago.

So.. It's Java time!

Sunday, November 09, 2008

to XMP or not to XMP (part II)

(Goto Part I)

Why XMP?


Ok, so I'm supposed to talk about XMP. Here's how it is. I think the best way to organize pictures is the same way that you should organize mp3s; metadata. Sure, you can create a database and store all of your info about each picture so that you can do nifty queries and stuff. Maybe even store the pictures themselves in the database. I know that certainly crossed my mind. That's one of the reasons why I went down the road with Gallery at the start. But it won't be long before you've copied a few pictures around and realized that all of your data telling people about when the picture was taken and who's in it is sitting back in your database where nobody can reach it.

Afterward I started thinking about all the reasons I liked ID3Tags for my mp3s. I won't go into all the pros and cons here, I'll just summarize that I think metadata tags are the best option. Why can't I have ID3Tags for images? It's something I never really paid much attention to before. I knew there was some sort of tagging going on because some image viewers could show me various tidbits of information about my pictures that went beyond the normal information I thought would be stored in a .jpg. Things like ISO speed and focal length.

Some quick research on Wikipedia had the explanation (you'll find that I link to Wikipedia quite a bit). There are essentially 3 metadata tag formats for image files: EXIF, IPTC and XMP. EXIF is an earlier standard specified in 1998 and is generally supported by all digital cameras. It has many shortcomings however that include the fact that it is no longer maintained by a governing standards body. Also, many camera manufacturers use their own modifications of the structure and the data can easily be corrupted by other programs that don't fully parse and recode the data when modifying it. The biggest drawback though is that it is only supported for JPG and TIFF images. In any case, the odds are pretty good that all of your digital pictures have an EXIF tag.

Next is IPTC. This was actually developed as a standard for specifying metadata in news items and any media in general. What we commonly see in photos is a subset of this standard. Just like EXIF, it is really only supported in JPG and TIFF images.

This brings us to XMP. XMP was developed by Adobe in 2001. It is an XML-based format that actually incorporates the IPTC definitions. The best part though is that it isn't limited to just JPG and TIFF. It is also supported for these file types: JPEG2000, GIF, PNG, HTML, PostScript, PDF, SVG, Adobe Illustrator, and DNG.
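
Because an XMP packet is stored as plain XML text inside the host file, one rough way to peek at it is to scan the file's bytes for the packet wrapper and hand the result to an XML parser. What follows is only a minimal Python sketch and not Adobe's toolkit: the function names are made up for illustration, and it assumes the packet is UTF-8, uses the usual x:xmpmeta wrapper, and keeps its keywords in dc:subject.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by RDF and Dublin Core inside XMP packets.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def extract_xmp_packet(data):
    """Return the raw <x:xmpmeta> element found in data, or None."""
    start = data.find(b"<x:xmpmeta")
    if start == -1:
        return None
    end_marker = b"</x:xmpmeta>"
    end = data.find(end_marker, start)
    if end == -1:
        return None
    return data[start:end + len(end_marker)]


def read_keywords(data):
    """Return the dc:subject keywords from the first XMP packet in data."""
    packet = extract_xmp_packet(data)
    if packet is None:
        return []
    root = ET.fromstring(packet)
    items = root.findall(".//dc:subject/rdf:Bag/rdf:li", NS)
    return [item.text for item in items if item.text]
```

You would call it with the bytes of a picture, for example read_keywords(open("photo.jpg", "rb").read()), where the file name is just a placeholder. A byte scan like this is fine for a quick look, but for anything that writes tags back a real XMP library is the safer choice.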

I was a little hesitant when I found out that XMP is essentially under the control of Adobe but I felt a bit better when I discovered that they released a software toolkit for it under a BSD license. I'm still reading the documentation for the specification and the toolkit so I'll post some more info when I'm done.

Friday, November 07, 2008

to XMP or not to XMP (part I)

Ok, so I've got my mp3s organized. What's next? Well, I'm really overdue getting my pictures in order and that's what I decided to tackle last week.

Here are my main objectives:
  • Private storage. The various web services are pretty good and feature rich but I really don't want to trust my pictures to somebody else.
  • Archive originals. I want to keep my originals completely intact. Edits will always be done to copies.
  • Detect duplicates. Pictures have a habit of multiplying onto every writable device you have.
  • Tagging. I want to be able to search through my collection based on people, things, locations, events and anything else notable.
  • Editing. I don't need anything complicated, just the basics: crop, resize, color adjustment, red eye removal and rotate.
  • Sharing. Even though I don't want to trust online services for preserving my memories, I do eventually want others to see them.
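
The duplicate detection objective is the easiest one to sketch in code. Below is a minimal Python example that groups files by a hash of their contents. It is only an illustration under simple assumptions: the function names are hypothetical, and it only catches byte-for-byte copies, not pictures that have been resized or re-saved.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group files under root by content; return only groups of 2 or more."""
    by_digest = defaultdict(list)
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            by_digest[file_digest(path)].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Each list returned by find_duplicates holds the paths of files with identical contents, so one can be kept in the archive and the rest reviewed by hand.
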
If I had been thinking clearly, I would have thought about each of these points and started at the beginning by answering the question "How do I get my pictures from my camera into an archive?" Instead, I jumped right in and tried installing Gallery 2.

Gallery 2 is just a single piece of a much larger "web server" puzzle. To run Gallery 2 you need a web server (Apache), a database (MySQL or PostgreSQL), PHP, and some graphic libraries (ImageMagick or NetPBM). You don't have to use these specific components but they are probably your best choices. I'm sure if I had read the Gallery documentation a little more carefully I would have realized it didn't quite cover everything I wanted but who does that, right?

I downloaded Apache and installed it on my server computer, easy. Next I downloaded PHP and installed that... easy again. I had used MySQL before and knew that would be easy so I decided to try Postgres because I had read good things and wanted to try it... and what do you know? More easy! This was going pretty well.

The next step was the big one, installing Gallery. To their credit, they are one of the best documented open source projects I've encountered in a while. They have a very good help file that gives you detailed step-by-step instructions on how to install it. It got a little confusing in parts when they try to explain the different steps depending on which components you use but it is light years ahead of most walkthroughs you will find. There is a lot to read though and you must read everything or you will have problems.

Finally, I got to the 'setup' part where you actually run a Gallery .php script through your web browser, served up by your Apache server (or whatever you are using as a web server), and it takes you through all the steps and checks to get Gallery configured and running. This was all going well too until I got to the part that asked for the DB info. No matter what I did, it could not talk to my Postgres server. I spent many hours trying to find information on the web but in the end, I had to bite the bullet and install MySQL. Once I did that, everything went fairly smoothly.

Gallery runs very well and I recommend it to anyone that wants to serve up their own web albums. You can customize it quite a bit and set up finely controlled permissions for users. Be prepared to dedicate the better part of a day installing it and getting it running. As good as the documentation is, you will need to know what you are doing.

I immediately logged in and started uploading pictures after it was all working. It didn't take long to figure out that it wasn't quite what I was looking for. I was hoping I could get it all set up and then go to my wife and say "Look hun! It's easy. You just upload the pictures and then add some tags for each one. You can click here to fix the red-eyes and you are done." Oh well, back to Facebook for now.

I'll get to the XMP stuff in Part II, promise.

Sunday, November 02, 2008

photo stitching

One thing I like to do sometimes when taking pictures with my digital camera is panoramas. You take a bunch of overlapping pictures from the same spot and later you load them up in your favorite image editor and put them together into one wide panoramic view. That's how it's usually done anyways.

If you are lucky, you may have a camera with panorama stitching capabilities built in. I've never tried such a thing but I'm sure it gives satisfactory results for most (feel free to comment on this with your own experiences). Most of the time though, you will use an image editor and most of the time it will give you decent results. When it doesn't, you usually have the option of 'helping' out and sliding the images around to get a better fit. This is usually the point you discover that the camera wasn't quite as level as it should have been for each shot and no matter how much you play around, you can't get a perfect stitch. And that's just for plain horizontal panoramas. What if you wanted to do some arbitrary stitching of a complete mosaic? You know, take a bunch of pictures horizontally and vertically to capture some great scene in far more detail than you ever could with a single picture. Good luck trying to get them all stitched together.

At least, that's what I would be saying before I found Hugin. This program is certainly the king of all photo stitchers. In fact, it is a cleverly unified collection of a few different photo processing technologies. It is open source and is available for many platforms including Windows. Finding an actual Windows binary can be a bit tricky so I've provided a link to the current installer here.

This is a powerful program so if you are really interested in using its full potential, you'll have to poke around in the documentation a bit. In the meantime, it has a wizard-based option that should give you pretty good results. The only part that confused me a bit was the 'align' stage. The program actually did all the aligning automatically and presented the results in a viewer. It wasn't immediately obvious what I was supposed to do next. I eventually figured out that I could click around the image to set a center point and then adjust the sliders at the edge to trim off the rough edges. There is no 'proceed' or 'next' button here, you just have to close the window and then click on the button for the 3rd and final stage that actually outputs the final image. This will come out as a .tiff file so you will need to load that into your favorite image editor to convert it to something more portable and usable. I suggest saving it as a .png and archiving it away somewhere. You can then use this master to create a .jpg or resize it to fit your needs.

I've only scratched the surface on this program so please, download it and try it out for yourself. I can't wait to try some more complicated stitches.

Saturday, November 01, 2008

getting organized

Lately I've been trying to organize my various collections of media, starting with my mp3 files. It had been many years since I even thought about them so I figured I had better start from scratch looking for software to organize and manage them. Surely things had advanced greatly in that area by now.

One thing that had not changed in my mind is that the best way to manage mp3 files is to ensure that each one has a sufficiently filled out ID3Tag. If each file has a good tag, it is completely trivial to get software to read all those tags in and give you access to your music in incredibly flexible ways. If you are ripping CDs to mp3 then whatever program you are using to rip will most likely fetch track info from CDDB or FreeDB and you will probably end up with properly filled out tags. Sometimes however you may find that you have some tracks with missing or incomplete tags and these can be a pain to fix.

I had a number of tracks with incomplete or bad tags and it was going to take me a very long time to find the correct info and type it in. Fortunately, finding the correct info is a lot easier with a site like AMG's allmusic.com and getting it into a tag was even easier with a program I bought many years ago. It was called Tag&Rename. I looked at a lot of programs at the time and this one was clearly the best. It had all the functionality you needed for mass tagging and renaming and above all, it could scrape track info from AMG's site! I didn't have to type anything.

Alas, that functionality had to be removed as it was against AMG's TOS. When I updated the software I found that it now used Amazon as its source of data. Amazon is a good source for track info but it is not nearly as extensive as what AMG has. Tag&Rename also provides access to tracktype.org which goes a long way to filling in the gaps. Where Amazon's data falls short is when you start trying to tag tracks earlier than 1990. Sure, you may find the album/track info but you will probably have to intervene to get the original release dates correct as they are more likely to list dates for re-releases or CD versions of earlier LPs.

If you want quick and dirty automatic tag filling you may want to try Winamp. I should mention however that I find much of Winamp completely confusing, and it should generally be considered a perfect example of how not to design a user interface. Once you figure out how to get it to fill tags though, it can do a pretty decent job. The show stopper for me unfortunately was that I couldn't figure out how to tell it to save cover art to each track. I really hate the concept of putting a .jpg in a directory and hoping that it manages to stay associated with the tracks it is referenced by when I move them around and transfer them between computers and mp3 players. Some people will tell you that it's a waste of space to store the same image in 12 or 15 tags but the size is small enough to ignore and the savings in peace of mind is much greater.

Ultimately, I had to give Winamp a miss and replaced it with Media Monkey. I found this to be a more than capable music player with a reasonably sensible interface. I especially liked the fact that I could tell it to recognize tracks by a checksum based only on the music data. This means it will not easily get confused when I need to fix a tag, which more often than not means that I need to fix the file name as well. By having a checksum based on the data only, Media Monkey won't think it is a new track or think that the previous version has disappeared. My play lists will still be able to find and play the track when they get to it.
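
To make the "checksum of the music data only" idea concrete, here is a minimal Python sketch. I don't know how Media Monkey actually computes its checksum, so treat this as an assumption-laden illustration: the function names are made up, and it only skips an ID3v2 tag at the start of the file and an ID3v1 tag at the end, ignoring other tag formats.

```python
import hashlib


def strip_id3(data):
    """Return data with a leading ID3v2 tag and trailing ID3v1 tag removed."""
    start, end = 0, len(data)
    # ID3v2: 10-byte header starting with "ID3"; bytes 6-9 hold the tag
    # size as a 28-bit "synchsafe" integer (7 useful bits per byte).
    if len(data) >= 10 and data[:3] == b"ID3":
        size = 0
        for byte in data[6:10]:
            size = (size << 7) | (byte & 0x7F)
        start = 10 + size
        if data[5] & 0x10:  # footer flag: a 10-byte footer follows the tag
            start += 10
    # ID3v1: fixed 128-byte block at the end of the file starting with "TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128
    return data[start:end]


def audio_checksum(data):
    """Return a SHA-256 hex digest of the bytes left after tag removal."""
    return hashlib.sha256(strip_id3(data)).hexdigest()
```

Two copies of a track with different tags or file names produce the same audio_checksum, which is what lets a player keep a play list entry pointing at the right file after the tags have been fixed.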

File renaming is the second thing that Tag&Rename does extremely well. It's another reason why you want to have properly filled out tags. With a single click, you can tell this software (and many others) to rename all of your files using a specific pattern based on almost any tag attribute. It will also create any level of sub directories based on tag info.
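
Pattern-based renaming is simple to picture in code. This is a minimal Python sketch of the idea, not how Tag&Rename works internally: the function names, the {field} placeholder syntax and the sample tag values are all assumptions for illustration, and it only builds the new path rather than moving any files.

```python
import os
import re


def safe_part(value):
    """Replace characters that are not allowed in Windows file names."""
    return re.sub(r'[\\/:*?"<>|]', "_", str(value)).strip()


def build_path(tags, pattern):
    """Expand {field} placeholders in pattern using cleaned tag values.

    A "/" in the pattern separates directory levels. Integer tag values
    are left as numbers so the pattern can zero-pad them.
    """
    cleaned = {
        key: value if isinstance(value, int) else safe_part(value)
        for key, value in tags.items()
    }
    parts = [part.format(**cleaned) for part in pattern.split("/")]
    return os.path.join(*parts)
```

For example, build_path({"artist": "AC/DC", "album": "Back in Black", "track": 6, "title": "Back in Black"}, "{artist}/{album}/{track:02d} - {title}.mp3") gives an artist directory, an album directory and a numbered file name. The pattern is split into directory levels before the tag values are filled in, so a slash inside a tag value cannot create an extra directory.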

One final consideration for managing your collection effectively is being able to take it offline. If your collection is large, you probably want it backed up to removable media like recordable DVDs. Media Monkey makes some mention of this but I haven't played with it yet. I will mention that I was very pleased with a program called Music Library. It hasn't been updated since 2006 but I think this is because it's one of those few examples of software that picks one thing to do and does it right. Basically you would back up all of your music onto multiple CDs or DVDs and then feed them into this program one at a time. It will slurp up all the tag info and create a database of it. Once this is done, you can search and create play lists at will. What was really cool though is that if you created a play list of songs that spanned 8 or 9 different discs and wanted to play them from your hard drive, it would prompt you for each disc in succession and copy all of the tracks effortlessly.

I used these programs to tag and organize not only all my mp3s but my wife's and kid's music too. They are now all neatly available from a network drive and are safely stored on a RAID 2 array with read-only access. I took a little time to show them how to create play lists and use them to sync to their mp3 players. It's a lot easier for everyone. They don't have to worry about copying files around by hand and I don't have to worry about losing files :)

There are many other programs that are pretty good for these tasks as well, far too many to list or talk about here. If you know of any that deserve mention, let me know in a comment.