Java 8 Compiler is More Strict

I was trying to build Apache Hadoop 2.0.3-alpha with the ARM Java 8 Preview (to test out hard-float build support, which needs a Hadoop build fix), and I hit a compile error in InputSampler.java in the Map/Reduce code. It seems that the Java 8 compiler won’t allow a raw type to be used in a context where previous versions did. Stripping that file down to the essentials:

public class InputSampler<K,V> {
  public interface InputFormat<K,V> {
  }
  public interface Sampler<K,V> {
    K[] getSample(InputFormat<K,V> inf);
  }
  public static <K,V> void writePartitionFile(Sampler<K,V> sampler) {
    final InputFormat inf = null; // Yuck, a raw type!
    K[] samples = sampler.getSample(inf);
  }
}

This code fails to compile with Java 8 (regardless of any -source setting):

$ java -version
java version "1.8.0-ea"
Java(TM) SE Runtime Environment (build 1.8.0-ea-b36e)
Java HotSpot(TM) Server VM (build 25.0-b04, mixed mode)
$ javac InputSampler.java
InputSampler.java:9: error: incompatible types: Object[] cannot be converted to K[]
    K[] samples = sampler.getSample(inf);
                                   ^
  where K,V are type-variables:
    K extends Object declared in method <K,V>writePartitionFile(Sampler<K,V>)
    V extends Object declared in method <K,V>writePartitionFile(Sampler<K,V>)
Note: InputSampler.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
1 error

Compilation succeeds with Java 7 (with a warning):

$ java -version
java version "1.7.0_11"
Java(TM) SE Runtime Environment (build 1.7.0_11-b21)
Java HotSpot(TM) 64-Bit Server VM (build 23.6-b04, mixed mode)
$ javac InputSampler.java
Note: InputSampler.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

The code above is definitely invalid according to JLS7 15.12.2.6, which demands that methods applicable via unchecked conversion have their return type erased. Passing the raw InputFormat requires an unchecked conversion, so getSample returns Object[] here rather than K[]. Even so, making breaking strictness changes in the compiler is likely to slow the adoption of Java 8.
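One way to make the stripped-down example compile on both Java 7 and Java 8 is to give the local variable a parameterized type, so that no unchecked conversion is needed and the return type stays K[]. The following is only a minimal sketch: the class name InputSamplerFixed, the added return value, and the main method are illustrative assumptions, and this is not necessarily how Hadoop itself resolved the problem.

```java
class InputSamplerFixed<K, V> {
  interface InputFormat<K, V> {
  }

  interface Sampler<K, V> {
    K[] getSample(InputFormat<K, V> inf);
  }

  // Same shape as the original method, but it returns the samples so the
  // result can be observed (an addition made for this sketch).
  static <K, V> K[] writePartitionFile(Sampler<K, V> sampler) {
    final InputFormat<K, V> inf = null; // parameterized, so no raw type
    K[] samples = sampler.getSample(inf);
    return samples;
  }

  public static void main(String[] args) {
    // Anonymous class rather than a lambda, so this also compiles on Java 7.
    String[] samples = writePartitionFile(new Sampler<String, Integer>() {
      @Override
      public String[] getSample(InputFormat<String, Integer> inf) {
        return new String[] {"a", "b"};
      }
    });
    System.out.println(samples.length); // prints 2
  }
}
```

If the raw type has to stay for some reason, an explicit (K[]) cast on the call result should also satisfy the compiler, at the cost of another unchecked warning.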

Syncing large file libraries over the Internet

With the demise of FolderShare, I’ve found it rather difficult to keep my music library synchronized between home and the office. While there is software you can buy like SuperSync, you can pretty easily achieve a more general solution using free software, with the added benefit of universal OS support (e.g. Windows, Mac OS X, and Linux).

First, for connectivity between private LANs across the Internet, Hamachi works very well (and is free for non-commercial use). It’s supported on Windows and OS X, and in beta for Linux and Windows Mobile. Just install the client on each machine, and you’ll have a virtual IP address that you can use to securely tunnel over the Internet to other machines in your mesh using any application. I created a LogMeIn account to use Managed mode which (obviously) makes management easier, but Unmanaged mode works as well.

Next, you can use rsync to efficiently copy and synchronize files through the Hamachi tunnel. It’s usually already installed on Linux and OS X, and can be easily installed on Windows using the Cygwin installer.

While Hamachi makes it easy to browse Windows shares (assuming you’re running Windows) via the tunnel, or even rsync them using a UNC path, the SMB protocol is glacially slow. Therefore, you’ll want to set up the rsync daemon on the machine containing your authoritative version. (While rsync can easily provide two-way synchronization, it is much less risky and less mentally taxing to treat one copy as authoritative and access it using a read-only rsync share.) To configure the daemon, you’ll need to create a file called /etc/rsyncd.conf that specifies global and per-library options. On Windows, to share a music library in M:\Music, the file would look like this:

use chroot = false
strict modes = false

[Music]
path = /cygdrive/M/Music
read only = true

Then you can fire up the rsync daemon simply by running rsync --daemon from a shell, command prompt, Run dialog (Windows+R), etc. rsync won’t output anything, but you can verify in your process/task manager that it’s running. If you want the rsync daemon to run automatically as a Windows service, that can be done using cygrunsrv, but you’ll have to refer elsewhere for instructions.

Finally, to actually sync the library to a client, you’ll need to enter a slightly gnarly rsync command on the client (the difficulty of remembering it is what inspired this blog post):

rsync -rtOh --chmod=ugo=rwX --ignore-existing --delete --progress rsync://<server IP from Hamachi client>/Music/ /cygdrive/c/Music

This command will copy everything in the Music module (defined above as M:\Music) to C:\Music, deleting anything in C:\Music that doesn’t exist in M:\Music. Refer to the rsync man page for details on these options and others, but here’s a brief explanation:

  • -r, --recursive: recurse into directories
  • -t, --times: preserve modification times (to aid future synchronization)
  • -O, --omit-dir-times: omit directories from --times
  • -h, --human-readable: output numbers in a human-readable format (we’re humans, right?)
  • --chmod=ugo=rwX: give new files the destination-default permissions (otherwise they’ll end up with screwy permissions)
  • --ignore-existing: skip updating files that already exist on the destination (unless you want things like MP3 tag changes to trigger lots of delta transfers)
  • --delete: delete extraneous files from the receiving side (to avoid accumulating duplicates when files are renamed on the sender)
  • --progress: print information showing the progress of the transfer (since it will probably take a long time to sync your whole media library)

Note: For your first attempt, you’ll probably want to also include the -n option, which executes a dry-run, so you don’t accidentally end up deleting files you didn’t intend to.

Easily centralize a local Git repository

If you’re like me (and condolences if you are), you’ve often started a local Git repository for a new project, and then later wanted to create a shared upstream repository on another computer. (GitHub walks you through this, but let’s assume you want to manage the shared repository yourself.) Before I knew Git very well, I did this clumsily by copying the .git directory and tweaking its config file and permissions (which was especially difficult when moving from a Windows desktop to a Linux server). But that is very un-Git; here’s a much better method:

On the upstream server, go to where you want the repository to be located and create a bare, shared repository:

$ cd /var/git
$ git init --bare --shared myrepo.git

Then on your local machine, simply add a remote for the new shared repository and push the local repository up to it with the -u option:

$ git remote add origin ssh://myuser@myserver/var/git/myrepo.git
$ git push -u origin master

The -u option is short for --set-upstream. From the git-push man page:

For every branch that is up to date or successfully pushed, add upstream (tracking) reference, used by argument-less git-pull(1) and other commands.

Now with just two Git commands, your local repository acts just like you’d cloned it from your new shared repository.

Note: If you have a group of users that you want to be able to modify the repository, ensure that it is owned by that group and is group writable. For example:

$ chgrp -R git /var/git
$ chmod -R g+rwx /var/git
$ ls -ld /var/git
drwxrwx--- 22 git git 4.0K Jul 16 12:00 /var/git

Displaying CPU cache hierarchy on Linux

This is a note to myself as much as anything, since it took a minute to get the command right, but maybe someone else will find it useful. The following command will dump the cache hierarchy for CPU 0 in a fairly readable format (though I truncated the shared_cpu_map entries by hand for purposes of this blog post).

$ find /sys/devices/system/cpu/cpu0/cache/ -type f -printf '%P: ' -exec cat '{}' \;
index0/type: Data
index0/level: 1
index0/coherency_line_size: 64
index0/physical_line_partition: 1
index0/ways_of_associativity: 2
index0/number_of_sets: 512
index0/size: 64K
index0/shared_cpu_map: 00000001
index0/shared_cpu_list: 0
index1/type: Instruction
index1/level: 1
index1/coherency_line_size: 64
index1/physical_line_partition: 1
index1/ways_of_associativity: 2
index1/number_of_sets: 512
index1/size: 64K
index1/shared_cpu_map: 00000001
index1/shared_cpu_list: 0
index2/type: Unified
index2/level: 2
index2/coherency_line_size: 64
index2/physical_line_partition: 1
index2/ways_of_associativity: 16
index2/number_of_sets: 512
index2/size: 512K
index2/shared_cpu_map: 00000001
index2/shared_cpu_list: 0
index3/type: Unified
index3/level: 3
index3/coherency_line_size: 64
index3/physical_line_partition: 1
index3/ways_of_associativity: 48
index3/number_of_sets: 1706
index3/size: 5118K
index3/shared_cpu_map: 0000000f
index3/shared_cpu_list: 0-3
index3/cache_disable_0: FREE
index3/cache_disable_1: FREE
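As a sanity check on the listing above, each reported size should equal ways_of_associativity × number_of_sets × coherency_line_size × physical_line_partition. The following Java sketch just redoes that arithmetic with the values copied from the output; the class and method names are made up for illustration.

```java
class CacheSize {
    // Total cache size in KiB: ways * sets * line size * line partitions / 1024.
    static long sizeKiB(int ways, int sets, int lineSize, int partitions) {
        return (long) ways * sets * lineSize * partitions / 1024;
    }

    public static void main(String[] args) {
        // Values taken from the index0..index3 entries above.
        System.out.println("index0 (L1 data):        " + sizeKiB(2, 512, 64, 1) + "K");   // 64K
        System.out.println("index1 (L1 instruction): " + sizeKiB(2, 512, 64, 1) + "K");   // 64K
        System.out.println("index2 (L2 unified):     " + sizeKiB(16, 512, 64, 1) + "K");  // 512K
        System.out.println("index3 (L3 unified):     " + sizeKiB(48, 1706, 64, 1) + "K"); // 5118K
    }
}
```

The results agree with the size entries, including the odd-looking 5118K for the L3 cache (48 × 1706 × 64 = 5,240,832 bytes).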

E*trade Sucks (and Ameritrade too)

The Sanmina-SCI stock I foolishly purchased through their Employee Stock Purchase Program is currently trading at a third of its purchase price. Now, to add insult to injury, I see that E*trade (the broker Sanmina used for the ESPP) charged me $20 for the 6:1 reverse split last August. They call it a “mandatory reorganization fee” and claim that it’s standard practice to charge this fee. Yet, Fidelity, Scottrade, and USAA, to name a few, don’t seem to agree, as they do not charge this fee. E*trade customer service, both email and phone, refuses to reverse the charge, so it is now my moral imperative to liquidate and close the account. I’ll never use E*trade again, and hope that others will heed this warning as well. Avoid Ameritrade too, which also charges this fee.

E*trade Sucks

According to Kiplinger’s November 2008 report, Fidelity wins as best overall online brokerage, combining tools, research, large selection of funds and bonds, and relatively low fees. Most importantly, they don’t charge for mandatory reorganizations.

Anointing the best broker is tricky because so much depends on the needs and wants of customers. Investors who feel they need a lot of hand-holding may gravitate toward Fidelity and Charles Schwab, which run neck and neck in the race to provide customers with all the advantages of a full-service broker at a discounter’s price. Investors who are willing to settle for fewer bells and whistles will appreciate Muriel Siebert, a small firm that stands out for its selection of mutual funds and third-party research. Price-conscious customers might favor TradeKing and newcomer Options-House, which charge low commissions — $4.95 per stock trade, regardless of the account size — and provide good customer service.

Putting UTF-8 into C/C++ Source Code

After much googling, I could not find any tools for converting a UTF-8 string into an escaped C/C++ string literal suitable for pasting into an ASCII source file. Therefore I produced this Perl script which seems to provide a fairly readable escaped string:

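use strict;
# Run with perl -n: $_ holds one input line, treated as raw bytes by default.
chomp;
print '"';
my $prev_esc = 0;  # true if the previous byte was emitted as a \x escape
print map
    {
        if (ord $_ > 0x7f) {
            # Non-ASCII byte: emit a hex escape.
            $prev_esc = 1;
            sprintf('\\x%lx', ord $_);
        } else {
            # A hex digit right after an escape would be absorbed into it,
            # so close and reopen the string literal first.
            my $need_break = $prev_esc && /[0-9A-Fa-f]/;
            $prev_esc = 0;
            ($need_break ? '" "' : '') . $_;
        }
    }
    split('', $_);
print '"' . "\n";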

Run it with the Perl -n option, and it will output an escaped string literal for each line input:

$ perl -n utf8esc.pl
Grüße aus Bärenhöfe
"Gr\xc3\xbc\xc3\x9f" "e aus B\xc3\xa4renh\xc3\xb6" "fe"

Hit Ctrl-D on a blank line to exit.

Unfortunately, C/C++ seems to have the strange rule that all hex characters following a “\x” apply to that escape sequence, even though the maximum value allowed is 0xff. Therefore it is necessary to break the string into separate segments.
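For comparison, here is a rough Java sketch of the same approach: walk the UTF-8 bytes, emit a \x escape for anything above 0x7f, and split the literal whenever a hex digit directly follows an escape. The class name Utf8Escaper and the escape method are hypothetical names chosen for this example. Like the Perl script, it makes no attempt to escape quotes or backslashes in the input.

```java
import java.nio.charset.StandardCharsets;

class Utf8Escaper {
    // Builds a C/C++ string literal (possibly split into adjacent segments)
    // from the UTF-8 encoding of the given text.
    static String escape(String text) {
        StringBuilder out = new StringBuilder("\"");
        boolean prevEsc = false; // was the previous byte written as \xNN?
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xff;
            if (v > 0x7f) {
                out.append(String.format("\\x%x", v));
                prevEsc = true;
            } else {
                char c = (char) v;
                // A hex digit after an escape would be read as part of it,
                // so close the literal and open a new one.
                if (prevEsc && Character.digit(c, 16) >= 0) {
                    out.append("\" \"");
                }
                out.append(c);
                prevEsc = false;
            }
        }
        return out.append('"').toString();
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file ASCII-only as well.
        System.out.println(escape("Gr\u00fc\u00dfe aus B\u00e4renh\u00f6fe"));
    }
}
```

Run on the same input as above, this should print the same three-segment literal as the Perl script.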

Misinformation about health care reform from Senator Hutchison?

I just read the Senator’s health care column from yesterday. The 4th and 5th paragraphs

The Administration’s proposal contains tax penalties and fees on small businesses that are not able offer health insurance. To pay these added costs, many small businesses could be forced to decrease workers wages, hire fewer employees, implement layoffs, or cut into other benefits. And some employers will have to pay a tax even if they already provide health insurance! A Kaiser Family Foundation survey found that roughly 3 in 5 small businesses will be hit with new taxes under the Democrats’ proposal.

Imposing new taxes on small businesses promises to wreak havoc on our economy. By raising taxes on some small businesses as high as 45 percent, they will be paying 10 percent more than what major corporations pay – and the U.S. corporate rate is among the highest in the world.

seem to be in direct opposition to the White House report on small businesses.

Unfortunately, the Senator does not cite any specific legislation or documentation. However, the Kaiser Family Foundation she cites has a report on the President(-Elect)’s proposal that indicates no health care requirement for small business, and even provides a tax credit for them:

Require large employers to offer “meaningful” coverage or contribute a percentage of payroll toward the costs of the public plan; small businesses will be exempt from this requirement.

Provide small businesses with a refundable tax credit of up to 50 percent of premiums paid on behalf of their employees if the employer pays a “meaningful share” of the cost of a “quality health plan”.

I’ve written the Senator to find out what facts this column is based on.

Build Boost 1.39 for 64-bit Windows

Building Boost for 64-bit Windows turns out not to be completely straightforward. These instructions assume you’re using Visual Studio 2008 and running from a VS 2008 x64 Command Prompt.

First, running bootstrap.bat appears to fail:

Building Boost.Jam build engine

Failed to build Boost.Jam build engine.
Please consult bjam.log for furter diagnostics.

You can try to obtain a prebuilt binary from

http://sf.net/project/showfiles.php?group_id=7586&package_id=72941

Also, you can file an issue at http://svn.boost.org
Please attach bjam.log in that case.

Actually, it isn’t failing to build; it’s just trying to copy the executable from the wrong path. All you need to do is copy tools\jam\src\bin.ntx86_64\bjam.exe to the top-level Boost directory.

Next, you need to install any of the optional libraries you want to support. For Python support, you can install the Windows AMD64 MSI Installer, and then add the install directory (C:\Python26 by default) to your PATH. For Graphviz parsing support, download and unzip Expat 2.0.1 and my VS 2008 x64 project files, then use Batch Build in VS 2008 to build the x64 targets. You’ll also need to rename the Expat library reference in Boost’s libs\graph\build\Jamfile.v2 from “expat” to “libexpat”.

Finally, run the following commands to build Debug and Release DLLs, setting the Expat paths appropriately and the -j option to the number of processor cores you have:

set EXPAT_INCLUDE=C:\expat-2.0.1\lib

set EXPAT_LIBPATH=C:\expat-2.0.1\win64\bin\Debug
bjam -j3 --stagedir=stage64 --without-mpi toolset=msvc address-model=64 variant=debug link=shared threading=multi runtime-link=shared stage > build-shared-multi-debug64.log

set EXPAT_LIBPATH=C:\expat-2.0.1\win64\bin\Release
bjam -j3 --stagedir=stage64 --without-mpi toolset=msvc address-model=64 variant=release link=shared threading=multi runtime-link=shared stage > build-shared-multi-release64.log

With any luck, in about 5-15 minutes, you’ll have a set of DLLs and import libraries in stage64/lib. The build generates lots of output and warnings, which is the reason for redirecting the output to a file, but you’ll want to check the beginning and end of each log to make sure it built everything you expected.

Expat 2.0.1 for 64-bit Windows

I couldn’t find any instructions for building Expat for 64-bit Windows (x64), so I updated the 32-bit project files myself: expat-vs2008-x64.zip. See my projects page for brief usage instructions.

Update: With the March 2012 release of Expat 2.1.0, Expat now supports building with CMake, and these build files are no longer necessary. After installing CMake and downloading and extracting the Expat tarball, just do the following to build solution and project files for Visual Studio 11 (aka 2012) Win64 from the Developer Command Prompt for VS2012:

C:\dev\expat-2.1.0>md build

C:\dev\expat-2.1.0>cd build

C:\dev\expat-2.1.0\build>cmake .. -G "Visual Studio 11 Win64"
-- The C compiler identification is MSVC 17.0.50727.1
-- The CXX compiler identification is MSVC 17.0.50727.1
-- Check for working C compiler using: Visual Studio 11 Win64
-- Check for working C compiler using: Visual Studio 11 Win64 -- works
...
-- Configuring done
-- Generating done
-- Build files have been written to: C:/dev/expat-2.1.0/build

C:\dev\expat-2.1.0\build>msbuild expat.sln /p:Configuration=Release
Microsoft (R) Build Engine version 4.0.30319.17929
[Microsoft .NET Framework, version 4.0.30319.17929]
Copyright (C) Microsoft Corporation. All rights reserved.

Building the projects in this solution one at a time. To enable parallel build, please add the "/m" switch.
Build started 2/20/2013 10:42:23 AM.
Project "C:\dev\expat-2.1.0\build\expat.sln" on node 1 (default targets).
ValidateSolutionConfiguration:
Building solution configuration "Release|x64".
...
66 Warning(s)
0 Error(s)

Time Elapsed 00:00:03.38

expat.dll will be in the Release directory. CMake supports generating project files for versions as old as Visual Studio 6, so this approach completely replaces the need for building project files by hand.

XMLSpy 2009 Disappointment

I’ve always thought XMLSpy was a great product. Unfortunately, today I discovered the caveat that you have to spend lots of $$$ on it to be impressed.

I just bought a copy of XMLSpy 2009 Standard ($129) after evaluating 2009 Enterprise ($999), since that’s the version that the free trial links go to. Anyway, I’m highly disappointed about the read-only grid view and read-only schema view in the Standard edition. The schema editor is pretty cool and unique, so I can understand it being Pro-level ($499), but it’s pretty lame to have the grid read-only. I work at a small company, so I couldn’t justify $370 for grid view anyway, but I probably would have stuck with free XML Notepad had I known.