Saturday, November 7, 2009

defrag interface: one year later

In my first post to this blog last year, I showed the difference between the Windows Disk Defragmenter on Windows XP and the one on Windows Vista. The bottom line was that the Vista version lost the graphical interface, leaving the user with no idea of what is going on in the background.

With the release of Windows 7, one would hope that this annoyance had been taken care of. Sadly, this is not the case. The interface does give some information on what it is doing, but the user gets no sense of how much progress has been made, and the graphical display we came to love in XP is long gone.


Fortunately, the good folks at Piriform have created a defragmentation utility called Defraggler that shows the defragmentation operations on the hard disk in real time. This is even better than XP's defrag utility. The visual display is, of course, only one of its great features. I only gave it a quick try (and loved the Quick Defrag option), but the interface alone was enough to make up for the sorrows caused by Vista and Windows 7.

Wednesday, October 14, 2009

Using wget

With Geocities going down in less than 2 weeks' time, I found myself needing to archive a number of websites hosted there that would otherwise disappear. For this purpose, one can go through the frustrating experience of saving a webpage's files one by one, but that would be stupid when there exist tools that automate the whole process.

The tool for the job is GNU Wget. While I've used this tool before for similar purposes, Geocities has a few annoying quirks that forced me to learn the tool a bit better.

For starters, this is how to use the tool:

wget http://www.geocities.com/mitzenos/

Great, that downloaded index.html. But we want to download the whole site! So we use the -r option to make the download recursive. This means that wget will follow references to the files used by each webpage, using attributes such as href and src. While this recursion could potentially go on forever, it is limited by the (default) recursion depth (i.e. such references are only followed down to a certain level) and by the fact that wget will, by default, not follow links that span different hosts (i.e. jump to different domains). Here's how the recursion works:

wget -r http://www.geocities.com/mitzenos/

OK, that downloads an entire site. In the case of Geocities, which hosts many different accounts, wget may end up downloading other sites on Geocities as well. If /mitzenos/ links to /xenerkes/, for example, both accounts are technically on the same host, so wget will happily download them both. We can solve this problem with the -np (no parent) switch [ref1] [ref2], which stops wget from ascending above the directory we started from. Note that combining -r and -np as -rnp does not work (at least on Windows it doesn't).

wget -r -np http://www.geocities.com/mitzenos/
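
As a side note, the recursion depth mentioned earlier can also be set explicitly with the -l option, in case the default of 5 levels is not what you want:

wget -r -np -l 5 http://www.geocities.com/mitzenos/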

So that solved most of the problems. But when we try downloading /xenerkes/ separately, Geocities ends up taking the site down for an hour because of bandwidth restrictions, and you see a lot of 503 Service Temporarily Unavailable errors in the wget output. This is because Geocities imposes a 4.2MB hourly bandwidth limit (bastards). Since the webspace limit for a Geocities account is 15MB, this makes it difficult to download any site between 4.2MB and 15MB in size.

The solution is to make wget fetch the files at a slower rate, so that if the site is, say, 5MB, the downloads are spread over more than one hour. This is done with the -w switch [ref: download options], which makes wget wait between retrievals; the argument is in seconds by default (you can also specify minutes, hours or days). For Geocities, 40-60 seconds per file should be enough, as long as the files aren't very large. Back when Geocities was popular, it wasn't really normal to have very large files on the web, so that isn't really an issue. This is the line that solves it:

wget -r -np -w 60 http://www.geocities.com/mitzenos/

This command will obviously take several hours to download a site if there are a lot of files, so choose the download interval wisely. If you're exceeding the bandwidth limit then use a large interval (around 60 seconds); if there are lots of files and the download is too damn slow, then use a smaller interval (30-40 seconds).

Saturday, April 25, 2009

Early Sierra games playable online

Some interesting stuff that surfaced between yesterday and today:

Friday, April 24, 2009

Oracle buys Sun; Geocities to die soon; Ubuntu 9.04 released

A lot of stuff has happened this week, and keeping up to date with Slashdot is a good idea. Some highlights:

Thursday, April 16, 2009

Google Android SDK 1.5 Early Look

A few days ago, a pre-release of the Google Android SDK 1.5 was released.

Google Android is an operating system for mobile phones. I had to write a program for it in April 2008 (as one of my University Assigned Practical Tasks), back when no mobile phone supported it yet, and when the SDK was still so alpha or beta that it didn't even have a version number; it was identified by a milestone number and a release number.

Today, the SDK appears to have matured a lot, and so have the tools that come with it, including the emulator. Out of curiosity, I re-installed the Android SDK to see how the emulator changed over the past year. Below are a couple of screenshots.




Anyone wishing to install this pre-release version should follow the instructions on the pre-release page, since there are a few differences from the procedure described in the current SDK documentation. Also, running the emulator has become slightly more complex because of the extra step of having to create an AVD (Android Virtual Device). This tiny complication is for the better, however, as it allows you to keep several different emulator configurations.
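
If I remember the new tools correctly, those two extra steps boil down to something like the following, where my_avd is just an example name and the target ID comes from running android list targets:

android create avd -n my_avd -t 2
emulator -avd my_avd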

Sunday, April 5, 2009

A quick look at Ubuntu 9.04 Beta

Ubuntu 9.04 (Jaunty Jackalope) is currently in beta, and is due for a stable release on 23rd April 2009.

I'm mainly a Windows user, but for some tasks (especially programming) I like using Linux. I'm not extremely technical, so this little review is more about covering what the average user expects to get from an operating system, rather than exciting new features like the ext4 filesystem.

For about a year I've been using Kubuntu 7.04, and although support for it has long since stopped, I preferred not to upgrade. One of the main reasons was that I simply did not like the latest versions of KDE. I got my first taste of KDE when I tried Knoppix, and immediately loved it. When the time came to install Linux rather than using a live CD, Kubuntu was an obvious choice over Ubuntu. But today, this is not so obvious any more. Even Linus agrees that KDE 4 is a mess.

Since it is about time I upgraded my Linux distribution, I thought I'd try Ubuntu 9.04. I never liked GNOME (mostly for aesthetic reasons, but I also never felt comfortable with the menu bar on top), but since KDE has become far worse, I decided to give it a second chance. Ubuntu 9.04 comes with GNOME 2.26 [ref: Ars Technica article]. Now this version number means very little to me, but this Ubuntu feels more like Windows than Kubuntu ever did... at least the left mouse button simply selects items rather than trying to run them, and dragging an item always moves it rather than opening up a silly context menu every time.


Ubuntu 9.04 comes with a number of good pieces of software pre-installed. Among these are Firefox 3.0.7 (on Kubuntu I am still stuck with Firefox 2 because I never managed to install Firefox 3), OpenOffice.org 3.0.1, and Pidgin, which to me looks very much like Gaim but has a fresher look, is easier to configure, and has far less grotesque conversation windows.


One of the features I really liked on Kubuntu was how screenshots are saved. You press Print Screen, and a dialogue comes up prompting you to save the screenshot, without you even having to paste it into an image editor first. This feature is still there in Ubuntu 9.04.


For those people like me who work with multiple computers running different operating systems, it is important to be able to transfer files from one PC to another over the network. In Kubuntu I used to go to "Remote Places" and then to "Samba Shares", and proceed from there. It worked great, but was painful because I had to navigate through several virtual network folders every time I wanted to locate my shared folder.

In Ubuntu 9.04, there is something similar. You go to "Places" > "Network" and then find your network and host and shared folder. Ubuntu is nicer because it actually mounts the shared folder, so you can easily access it from your desktop next time.


Listening to music on Ubuntu 9.04, unfortunately, is not such a pleasant experience. Both pre-installed media players, "Movie Player" and "Rhythmbox Music Player", are incapable of playing MP3s without a plugin. Also, I was unable to find my two usual Linux media players (VLC and XMMS) using either apt-get or the Add/Remove Applications program, and Amarok and JuK failed to install. I am still lost as to how to play MP3s on this version of Ubuntu... something I had no problem doing on good old Kubuntu.

A couple of things I never managed to do on Linux are printing and watching videos on YouTube, since the Flash player plugin for Firefox is not compatible with x64 architectures. The Flash issue can't be blamed on Ubuntu, but I think anyone would expect to be able to print without much hassle on any decent operating system. With this new version of Ubuntu, I still had no luck in either area.

Other minor things I don't like include the fact that access to the Terminal could be easier ("Applications" > "Accessories" > "Terminal"), and that the shutdown options are tucked away in a counter-intuitive "Live session user" menu in the top-right corner.

On the whole, Ubuntu 9.04 seems to be very promising, and assuming that some issues get fixed, I may seriously consider using it as my next Linux operating system once my thesis is finished.

Saturday, March 7, 2009

HTTP Communication: A Closer Look

About four months ago, I wrote a very simple HTTP server in Python, since my thesis has a Python artifact and I wanted to integrate it with a server. I've known the basics of HTTP for a year and a half now, but actually writing a server is obviously another story.

For those who aren't familiar with HTTP (I mean the actual protocol itself... everyone knows what it does, but not everyone knows what it looks like), or who know the basics but need to see a few examples, "HTTP Made Really Easy" is a great place to start.

My very simple HTTP server worked nicely, but I soon ran into a problem. If the server was hosted on Windows and I accessed it from a browser on Linux, I wasn't getting the payload of POST requests (I wasn't getting all of the header either, although I didn't notice that at first). I got the payload for all other Windows/Linux combinations (server on Linux, browser on Windows; and both server and browser on the same system).

Until today, I had no clue what could be causing Linux (as I assumed it was Linux's fault) to send requests without the payload. Then I decided to use Wireshark to find out what exactly was happening to the packets. Wireshark is a great tool that lets you see the actual data in the packets you send and receive.

Using Wireshark, I noticed that the HTTP requests were being split into multiple packets when sent from Linux, while Windows would send the request as one whole packet. This means the payload would arrive in a second or third packet, and since I had only one recv() call, I would never receive it.

The solution is to keep a buffer associated with each connection (identified by client IP and port), and append each incoming packet to it. You know when you've reached the end from the Content-Length field in the HTTP header, which tells you the size of the payload. The payload starts after the first "\r\n\r\n" (the blank line that terminates the header), so you can start counting from there.
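
Here is a minimal sketch of that idea in Python. This isn't the actual code from my server; it just illustrates the technique, assuming conn is an accepted, blocking socket for a single connection:

# Accumulate packets until the complete request (header + payload) is in.
def receive_request(conn):
    buf = b""
    # Keep reading until the header terminator has arrived.
    while b"\r\n\r\n" not in buf:
        chunk = conn.recv(4096)
        if not chunk:          # client closed the connection early
            return buf
        buf += chunk
    header, _, body = buf.partition(b"\r\n\r\n")
    # Work out the payload size; GET requests usually have no payload.
    length = 0
    for line in header.split(b"\r\n"):
        if line.lower().startswith(b"content-length:"):
            length = int(line.split(b":", 1)[1].strip())
    # Keep appending packets until the whole payload has been received.
    while len(body) < length:
        chunk = conn.recv(4096)
        if not chunk:
            break
        body += chunk
    return header + b"\r\n\r\n" + body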

Now, regarding connections... here's another thing I learned by poking around in Wireshark. I used to think that a browser keeps a single connection open for each website until the browser is closed, so that it can reuse it (for efficiency) rather than opening new connections all the time. Well, that's not the case.

Apparently, each HTTP request starts a new connection, so if you're watching requests coming in from the server side, you'll see the client's port number increase by one each time. Connections are reused only to send multiple packets associated with the same request (as above). In other words, a Linux client would open a connection, split the request into a number of packets, send them via the same connection, and close the connection. Well, almost.

The browser actually doesn't close the connection. If your server just sends the data, the browser will keep waiting for data to arrive. So your server must close the connection immediately after sending the response.
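
In socket terms, that boils down to something like this (again just a sketch, where response is assumed to already hold the complete HTTP response as a byte string):

# Send the whole response, then close the connection so that the browser
# stops waiting for more data.
conn.sendall(response)
conn.close()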

Friday, March 6, 2009

DOSBox for Dummies


What is DOSBox?


DOSBox is a program that emulates DOS, allowing you to run most old games that might not run on modern operating systems.

This is what it looks like (as soon as you run it):


This is what it looks like after running a game:


How do I use DOSBox?

DOSBox can be used like any command line interface. The commands are pretty standard: cd changes directory; typing the name of an executable runs that executable, etc.

Before you can access your files, however, you first need to mount a drive. In DOSBox you start at drive Z:, which is virtual, so you need to map a drive in DOSBox (e.g. C:) to a particular drive or folder on your hard disk. You can mount drive C: as follows:

mount c c:\

The DOSBox documentation recommends not mapping a DOSBox drive directly to a root directory, so you should point it at some folder rather than C:\ itself.

Note that if a folder name is longer than 8 characters, you should use the tilde (8.3) version of that name (e.g. administrator -> admini~1, i.e. take the first 6 characters and add ~1). If a folder name contains spaces, you should do something similar using its first word, but I believe DOSBox still has problems with folder names containing spaces.

Writing Batch Files for DOSBox

If you use DOSBox often and don't want to do some of the repetitive tasks (e.g. mounting a drive, or running your favourite game) every time, you can write a batch file to automate the process.

A batch file (on Windows) is a text file with a .bat extension (e.g. u5.bat). In the batch file you write the commands you would otherwise type at the command line, one per line, and they are executed in order.

The following is an example of a batch file I wrote to run Ultima 5 right away:

cd\
cd tools
cd dosbox
dosbox -c "mount c C:\docume~1\admini~1\Desktop" -c "C:" -c "cd U5" -c "ultima"

Each line is run as a separate command. In the first 3 lines, I'm going to the DOSBox directory, and in the fourth I'm running DOSBox.

Now DOSBox is nice because you can give it certain parameters, one of which is "-c". "-c" means that the following parameter is a command to be run inside DOSBox. This way, you can make DOSBox run several commands as soon as it starts, without you having to type them. Line 4 shows four such commands: first I mount the C drive, then I switch to it, then I go to the U5 directory, and finally I run Ultima 5.


Speed

Some games may run too fast or too slow. Hit Ctrl+F11 to slow DOSBox, or Ctrl+F12 to speed it up.

Wednesday, February 18, 2009

Google Earth 5 beta

It's been a long time since I last used Google Earth. I first discovered it two or three years ago, and loved it. You could go anywhere on Earth, and see all kinds of landscapes. Occasionally I would look for a famous monument, or simply trek through the countryside in some far away country.

Eventually I stopped using it, and uninstalled it. Today I suddenly felt like visiting the beauty of New Zealand (after watching The Lord of the Rings: The Return of the King yesterday - it was filmed there), so I downloaded Google Earth again and installed it.

I was impressed with the new features available. Admittedly, I haven't used Google Earth for a while, so some features might not be exactly new. What is definitely new is the ability to explore the ocean floor.


The feature I love most is the Sky feature. It allows you to see the stars and other heavenly bodies visible in the night sky. I've always loved stuff about constellations (Japanese cartoons are full of them, and I'm currently back to watching I Cavalieri dello Zodiaco), the zodiac, astronomy, and so on.


Another thing I really like is the Mars feature. As if Google Mars wasn't brilliant enough already, you can now fly over the surface of Mars in Google Earth.


Yet another feature that is new to me is the Sun feature. This shows the sunlight and shade on the globe, and you can drag a time slider to actually see the shade moving across the globe.


Finally, this is an old feature, but still worth mentioning. It's always nice that Google Earth lets you view buildings in certain cities in 3D. Being able to see any place on Earth is already great, but seeing tall buildings in all their majesty is a definite plus over seeing them on a flat photo surface, their height being hinted at only by the amount of shadow they cast on their surroundings.


Well, that's it. This wasn't exactly a review of Google Earth; it's more like a 5-minute account of the new features I noticed while quickly revisiting this masterpiece of a program.

Monday, January 26, 2009

Microsoft 'Songsmith'

An article on Slashdot has a few links about 'Songsmith', a program from Microsoft Research.

A shame for any software company.