Archive for February, 2012

Perl, UTF-8, å ä ö and HTML

A common thing I do is scrape a Web page, run it through some Perl magic and marvel at the result. A frequent source of trouble in this process is getting å, ä and ö handled correctly by Perl and various terminals. Here is a write-up of a simple example.

The web page is UTF-8 encoded. I save it to disk using “Save as…” in my browser, and the resulting file on disk is also UTF-8 encoded.

In this example the file is reasonably small, so I use File::Slurp to read the whole file into a scalar:


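use File::Slurp;                 # Provides read_file

my $text = read_file($filename); # Slurp the file; $filename stands for the path to the saved page
utf8::decode($text);             # Decode the file contents from UTF-8 bytes into characters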

I can now match å, ä and ö in my Perl code like this:


my ($address) = ($text =~ m{title="Visa alla bilder för ([^"]+)"}sm);
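
To illustrate why the decode step matters, here is a minimal sketch that runs the same kind of match against both the raw bytes and the decoded text. The sample string and the address in it are made up, and ö is written as an escape (\x{F6} in the pattern, \xC3\xB6 in the bytes) so the sketch does not depend on how the script file itself is encoded.

```perl
use strict;
use warnings;

# Raw UTF-8 bytes as they would arrive from disk; "\xC3\xB6" is the
# two-byte UTF-8 encoding of ö. The sample string is made up for this sketch.
my $bytes = "title=\"Visa alla bilder f\xC3\xB6r Storgatan 1\"";

# Decode a copy so the raw bytes stay available for comparison.
my $text = $bytes;
utf8::decode($text);

# The pattern uses \x{F6} (the character ö) instead of a literal ö.
my $pattern = qr{title="Visa alla bilder f\x{F6}r ([^"]+)"};

my ($raw_match)     = ($bytes =~ $pattern);   # no match: the bytes hold C3 B6, not F6
my ($decoded_match) = ($text  =~ $pattern);   # matches: the text holds the character ö

printf "bytes: %d, characters: %d\n", length($bytes), length($text);
print "match on raw bytes: ", (defined $raw_match ? $raw_match : "none"), "\n";
print "match on decoded text: ", (defined $decoded_match ? $decoded_match : "none"), "\n";
```

The decoded string is one character shorter than the byte string, because the two bytes of ö have become a single character.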

Later, when I have finished my text processing and want to print the result in my terminal (Cygwin in this case), I do:


my $output = "";

$output .= "Adress: " . $house->{address} . "\n" if defined($house->{address});
$output .= "Område: " . $house->{area} . "\n" if defined($house->{area});

...

utf8::encode($output); # Encode the text as UTF-8 which is correctly displayed by Cygwin
print $output;
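
As an alternative to encoding by hand, Perl can also encode on the way out through an I/O layer. The sketch below is not what the script above does; it only compares the two approaches. It assumes a made-up output string and uses an in-memory filehandle as a stand-in for STDOUT so the resulting bytes can be inspected.

```perl
use strict;
use warnings;

# Character string with å written as \x{E5}, so the sketch is independent
# of the encoding the script file is saved in. The value is made up.
my $output = "Omr\x{E5}de: Centrum\n";

# Option 1 (as in the post): encode a copy by hand, then print the bytes.
my $bytes = $output;
utf8::encode($bytes);

# Option 2: let an I/O layer do the encoding. An in-memory filehandle
# stands in for STDOUT here so the written bytes can be compared.
my $buffer = '';
open(my $fh, '>:encoding(UTF-8)', \$buffer) or die "cannot open buffer: $!";
print {$fh} $output;
close($fh);

print "both approaches produce the same bytes\n" if $bytes eq $buffer;
```

For a real terminal the equivalent would be a single binmode(STDOUT, ':encoding(UTF-8)'); near the top of the script, after which character strings can be printed without calling utf8::encode first.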

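Note: Do not put “use utf8;” in this Perl script unless the script file itself is saved as UTF-8. The pragma only tells Perl that the source code is UTF-8 encoded; it does not affect the data you read or print. The literal å, ä and ö in the code above presumably work without it because the script is saved in a single-byte encoding such as Latin-1.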


Mining of Massive Datasets

Found a little gem called “Mining of Massive Datasets”, co-authored by Jeffrey D. Ullman of “Dragon Book” fame.
The PDF version of the book can be downloaded from:

http://infolab.stanford.edu/~ullman/mmds.html

I find it to be a very good read!

Some Comments on the Turtle Beach PX5 wireless headphones

Background:

I wanted a set of wireless headphones for work that were Bluetooth capable (to connect to my iPhone), so I could listen to music while working in an open-plan office, where I find it distracting when people are talking all around me.

The Good:

The microphone lets me handle calls without having to switch headsets.

Quite light, with a nice build feel to them even though they are a bit “plastic”.

Good Bluetooth performance, no problems connecting/reconnecting to the iPhone.

The not so good:

The built-in equalizer presets do not apply when listening to music over Bluetooth.

No Audio/Video Remote Control Profile (AVRCP) support, which means you have to fiddle with the phone to change tracks or pause/play instead of using controls on the headset.

Audio leakage: I find that they leak quite a lot, at the risk of disturbing your neighbours in the office.

Sometimes there are some initial audio drop-outs, which seem to be the Bluetooth connection “stabilizing”, but after a few seconds it is up and running and there are no further problems.

No rechargeable batteries.

Summary:

The PX5 is clearly marketed as a “gamer headset”, a use for which I think a lot of my issues are non-issues. In retrospect, the PX5 was not perfectly suited to my use case.
