Convert from CHM to HTML for free
Tuesday, September 29th, 2009
First we need to extract the contents of the CHM file. For this I use Chmlib on Linux.
Install Chmlib in Ubuntu
sudo apt-get install libchm-bin
Convert .chm files into HTML files by running the following command:
extract_chmLib book.chm outdir
Once this is finished you should be left with a lot of files in your new directory. Two of these still need a bit of love: book.hhc (Table of Contents) and book.hhk (Index). You can open these files in an editor and see the links and menu structure, but it’s not true HTML. I created a PHP file, based on one written by Darren James Harkness, which sifts through the hhc file, converts it to good semantic HTML and adds the JavaScript we need to make a usable menu.
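To give an idea of what this conversion involves, here is a minimal Python sketch. It is not the convertCHM.php script itself, and it assumes the usual hhc layout: nested UL lists whose entries are OBJECT elements of type "text/sitemap" with "Name" and "Local" params. The class and function names (HhcToList, hhc_to_html) are made up for illustration, and the menu JavaScript is left out.

```python
from html import escape
from html.parser import HTMLParser


class HhcToList(HTMLParser):
    """Turn the sitemap entries of a .hhc file into a nested <ul> list."""

    def __init__(self):
        super().__init__()
        self.out = []       # HTML fragments, joined at the end
        self.open_li = []   # one flag per open <ul>: is an <li> still open?
        self.entry = None   # params of the sitemap <object> being read

    def _close_li(self):
        if self.open_li and self.open_li[-1]:
            self.out.append("</li>")
            self.open_li[-1] = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "ul":
            # A <ul> after an entry is nested inside that entry's open <li>.
            self.out.append("<ul>")
            self.open_li.append(False)
        elif tag == "object" and attrs.get("type") == "text/sitemap" and self.open_li:
            self.entry = {}
        elif tag == "param" and self.entry is not None:
            key = (attrs.get("name") or "").lower()
            self.entry[key] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "ul" and self.open_li:
            self._close_li()
            self.out.append("</ul>")
            self.open_li.pop()
        elif tag == "object" and self.entry is not None:
            self._close_li()  # close the previous sibling, if any
            name = escape(self.entry.get("name", ""))
            local = self.entry.get("local")
            if local:
                self.out.append('<li><a href="%s">%s</a>' % (escape(local), name))
            else:
                self.out.append("<li>%s" % name)
            self.open_li[-1] = True
            self.entry = None


def hhc_to_html(text):
    """Return the table of contents in `text` as a nested HTML list."""
    parser = HhcToList()
    parser.feed(text)
    parser.close()
    return "".join(parser.out)
```

Calling hhc_to_html() on the contents of book.hhc returns a single string of nested ul/li/a elements, which could then be wrapped in a page that loads the menu script and stylesheet.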
Download convertCHM.tgz. This includes the PHP conversion file, the JavaScript to run the menu and the CSS file.
Convert the hhc file with the following command:
php5-cgi convertCHM.php chmtitle=GeoBase
You will also need to copy docTree.js and docStyle.css to the same location as your webtoc.html file.
Also copy the images in the images directory to your images folder.
One last thing to note: if your images are not showing correctly, it’s likely a case-sensitivity problem. This isn’t an issue on Windows, since it doesn’t care about case. On Linux, however, the file and folder names need to match the case used in the HTML.
A quick way to detect which images (and links) are broken throughout your HTML documentation is to run Xenu Link Sleuth on it.
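For the case-sensitivity problem specifically, a small script can run a similar check directly on the extracted directory. The sketch below is one possible approach, not part of the download above: it scans every .htm/.html file for src and href attributes with a simple regular expression and reports references that only match a file on disk when case is ignored. The function name find_case_mismatches is made up for illustration, and the regex only handles quoted attribute values.

```python
import os
import posixpath
import re
from urllib.parse import unquote, urlsplit

# Quoted src="..." / href="..." values; unquoted attributes are not handled.
REF_RE = re.compile(r'''(?:src|href)\s*=\s*["']([^"']+)["']''', re.IGNORECASE)


def find_case_mismatches(root):
    """Return (html_file, reference, actual_path) for links off only by case."""
    exact = set()    # every file under root, as a relative path with "/"
    by_lower = {}    # lowercased relative path -> real relative path
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            rel = rel.replace(os.sep, "/")
            exact.add(rel)
            by_lower.setdefault(rel.lower(), rel)

    problems = []
    for rel in sorted(exact):
        if not rel.lower().endswith((".htm", ".html")):
            continue
        path = os.path.join(root, rel)
        with open(path, encoding="utf-8", errors="replace") as fh:
            text = fh.read()
        for ref in REF_RE.findall(text):
            parts = urlsplit(ref)
            if parts.scheme or parts.netloc or not parts.path:
                continue  # external link, or a bare #fragment
            # Resolve the reference relative to the HTML file's own folder.
            target = posixpath.normpath(
                posixpath.join(posixpath.dirname(rel), unquote(parts.path)))
            if target in exact:
                continue  # exact match, nothing to fix
            found = by_lower.get(target.lower())
            if found is not None:
                problems.append((rel, ref, found))
    return problems
```

Running find_case_mismatches("outdir") returns a list of tuples naming the HTML file, the reference as written, and the file that actually exists, so you can rename the file or fix the link.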
Note: the hhk (Index) file is also converted in the process, but I did not use it as I could not find a decent JavaScript file to run it nicely.