What's That Noise?! [Ian Kallen's Weblog]


20060222 Wednesday February 22, 2006

Making Velocity Be Quiet About Resource Loading

I'd never dug into where velocity's annoying messages were coming from but I decided enough is enough already. These tiresome messages from velocity were showing up on every page load:

2006-02-22 12:08:02 StandardContext[/webapp] Velocity   [info] ResourceManager : found /path/to/resource.vm with loader org.apache.velocity.tools.view.servlet.WebappLoader
Such messages might be good for debugging your setup, but once you're up and running they're just obnoxious. They definitely weren't coming from the log4j.properties in the webapp, so I took a look at velocity's defaults. The logging properties that velocity ships with in velocity.properties concern the display of stacktraces, but the setting behind the constant chatter in Tomcat's logs wasn't in there either. So I unwrapped the velocity source and found it in org.apache.velocity.runtime.RuntimeConstants -- all I had to do was add this to velocity.properties and there was peace:
resource.manager.logwhenfound = false
Ah, much better!

They shoulda named that property resource.manager.cmon.feel.the.noise, seriously.
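For setups that configure velocity from code rather than from a velocity.properties file, the same key can be carried in a java.util.Properties object. This is only a sketch under that assumption: the class name QuietVelocityProps and the helper quietProperties() are made up for illustration, and the actual hand-off to velocity is left as a comment so the example runs without velocity on the classpath.

```java
import java.util.Properties;

class QuietVelocityProps {
    // Property name exactly as it appears in the post.
    static final String LOG_WHEN_FOUND = "resource.manager.logwhenfound";

    // Hypothetical helper: builds the same setting the post adds to velocity.properties.
    static Properties quietProperties() {
        Properties props = new Properties();
        props.setProperty(LOG_WHEN_FOUND, "false");
        return props;
    }

    public static void main(String[] args) {
        Properties props = quietProperties();
        // With velocity available, this is where the properties would be handed
        // over, e.g. via org.apache.velocity.app.Velocity.init(props).
        // That call is omitted here to keep the sketch dependency-free.
        System.out.println(LOG_WHEN_FOUND + " = " + props.getProperty(LOG_WHEN_FOUND));
    }
}
```

In a webapp where the view servlet reads velocity.properties itself, editing the file as described above remains the simpler route.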

( Feb 22 2006, 12:40:20 PM PST ) Permalink


20060213 Monday February 13, 2006

URL.hashCode() Busted?

I did a double take on this:

        HashSet set = new HashSet();
        set.add(new URL("http://postsecret.blogspot.com"));
        set.add(new URL("http://dorion.blogspot.com"));
        for (Iterator it = set.iterator(); it.hasNext();) {
            System.out.println(it.next());
        }
I was expecting to get output like
http://postsecret.blogspot.com
http://dorion.blogspot.com
or
http://dorion.blogspot.com
http://postsecret.blogspot.com
But all that I got was
http://postsecret.blogspot.com

Hmmm....

The java.net.URL javadoc for hashCode() says what I'd expect: "Creates an integer suitable for hash table indexing." So I tried this:

        URL url1 = new URL("http://postsecret.blogspot.com");
        URL url2 = new URL("http://dorion.blogspot.com");
        System.out.println(url1.hashCode() + " " + url1);
        System.out.println(url2.hashCode() + " " + url2); 
and got this
1117198397 http://postsecret.blogspot.com
1117198397 http://dorion.blogspot.com
I was expecting different hash codes. Either java.net.URL is busted or I'm blowing it and my understanding of the contract with java.lang.Object and its hashCode() method is busted.
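A likely explanation is in the javadoc for URL.equals(): two hosts are considered equivalent if both host names can be resolved into the same IP addresses, and hashCode() is kept consistent with that. That would fit the output above if both blogspot hostnames resolved to the same address (an assumption, not something verified here). java.net.URI compares its string components without any DNS lookup, so a set of URIs keeps both entries. A minimal sketch, with the class name UriSetSketch and the helper distinctHosts() invented for illustration:

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;

class UriSetSketch {
    // Collects the given addresses as URIs; equality is based on the
    // parsed string components, so no name resolution is involved.
    static Set<URI> distinctHosts(String... addresses) {
        Set<URI> set = new LinkedHashSet<>();
        for (String address : addresses) {
            set.add(URI.create(address));
        }
        return set;
    }

    public static void main(String[] args) {
        Set<URI> set = distinctHosts(
                "http://postsecret.blogspot.com",
                "http://dorion.blogspot.com");
        for (URI uri : set) {
            System.out.println(uri);
        }
    }
}
```

When an actual connection is needed later, uri.toURL() converts a single entry back to a java.net.URL.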

( Feb 13 2006, 07:37:29 PM PST ) Permalink


20060208 Wednesday February 08, 2006

BerkeleyDB's "Tied Hash" for Java

One of the really wonderful and evil things about Perl is the tie interface. You get a persistent hash without writing a boat load of code. With Sleepycat's BerkeleyDB Java Edition you can do something very similar.

Here's a quick re-cap: I've mentioned fiddling with BerkeleyDB-JE before with a crude "hello world" app. You can use the native code version with Perl with obscene simplicity, too. In years past, I enjoyed excellent performance with older versions of BerkeleyDB that used a class called "DB_File" -- today, the thing to use is the "BerkeleyDB" library off of CPAN (note, you need db4.x+ something for this to work). Here's a sample that writes to a BDB:

#!/usr/bin/perl

use BerkeleyDB;
use Time::HiRes qw(gettimeofday);
use strict;

my $filename = '/var/tmp/bdbtest';
my %hash = ();
tie(%hash, 'BerkeleyDB::Hash', { -Filename => $filename, -Flags => DB_CREATE });
$hash{'539'} = "d\t" . join('',@{[gettimeofday]}) . "\tu\thttp://www.sifry.com/alerts";
$hash{'540'} = "d\t" . join('',@{[gettimeofday]}) . "\tu\thttp://epeus.blogspot.com";
$hash{'541'} = "d\t" . join('',@{[gettimeofday]}) . "\tu\thttp://http://joi.ito.com";
untie(%hash);
Yes, I'm intentionally using plain old strings, not Storable, FreezeThaw or any of that stuff.
To prove that our hash was really persistent, we might do this:
#!/usr/bin/perl

use BerkeleyDB;
use strict;

my $filename = '/var/tmp/bdbtest';
my %hash = ();
tie(%hash, 'BerkeleyDB::Hash', { -Filename => $filename, -Flags => DB_RDONLY });
for my $bid (keys %hash) {
    my %blog = split(/\t/,$hash{$bid});
    print "$bid:\n";
    while(my($k,$v) = each(%blog)) {
        print "\t$k => $v\n";
    }
    
}
untie(%hash); 
Which would render output like this:
541:
        u => http://http://joi.ito.com
        d => 1139388034903283
539:
        u => http://www.sifry.com/alerts
        d => 1139388034902888
540:
        u => http://epeus.blogspot.com
        d => 1139388034903227 

Java has no tie operator (that's probably a good thing). But Sleepycat has incorporated a Collections framework that's pretty cool and gets you pretty close to tied hash functionality. Note however that it's not entirely compatible with the interfaces in the Java Collections Framework but if you know those APIs, you'll immediately know the Sleepycat APIs.

com.sleepycat.collections.StoredMap implements java.util.Map with the following caveats:

  1. It doesn't know how big it is, so don't call the size() method unless you want to see an UnsupportedOperationException
  2. You can't just abandon java.util.Iterators that have been working on a StoredMap; you have to use com.sleepycat.collections.StoredIterator's .close(Iterator) method to tidy up.
But that's no big deal.
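Both caveats can be handled in one place: count by walking an iterator, and close that iterator in a finally block. The sketch below is a hypothetical helper (IteratorCount and countAndClose are invented names) that takes the cleanup step as a callback so it runs with only the JDK; with Sleepycat on the classpath, the callback would presumably be StoredIterator::close.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.function.Consumer;

class IteratorCount {
    // Counts the remaining elements and always hands the iterator to
    // the closer afterwards, even if iteration throws.
    static int countAndClose(Iterator<?> it, Consumer<Iterator<?>> closer) {
        int n = 0;
        try {
            while (it.hasNext()) {
                it.next();
                n++;
            }
        } finally {
            closer.accept(it);
        }
        return n;
    }

    public static void main(String[] args) {
        Map<Integer, String> blogs = new HashMap<>();
        blogs.put(539, "http://www.sifry.com/alerts");
        blogs.put(540, "http://epeus.blogspot.com");
        // A plain HashMap iterator needs no cleanup, so the closer is a no-op here.
        int n = countAndClose(blogs.keySet().iterator(), it -> { });
        System.out.println(n);
    }
}
```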

So what does the code look like? Well, let's say you wanted to store a bunch of these vanilla beans in the database:

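import java.io.Serializable;

public final class ImmutableBlog implements Serializable {

    private static final long serialVersionUID = -7882532723565612191L;
    // final fields: there are no setters, so make the immutability explicit
    private final long lastmodified;
    private final String url;
    private final int id;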
    
    public ImmutableBlog(final int id, final long lastmodified, final String url) {
        this.id = id;
        this.lastmodified = lastmodified;
        this.url = url;
    }
    public int getId() {
        return id;
    }
    public long getLastmodified() {
        return lastmodified;
    }
    public String getUrl() {
        return url;
    }
    
    public boolean equals(Object o) {
        if (!(o instanceof ImmutableBlog))
            return false;
        if (o == this)
            return true;
        ImmutableBlog other = (ImmutableBlog)o;
        return other.getId() == this.getId() &&
            other.getLastmodified() == this.getLastmodified() &&
            other.getUrl().equals(this.getUrl());
    }
    
    public int hashCode() {
        return (int) (id * 51 + url.hashCode() * 17 + lastmodified * 29);
    }
    
    public String toString() {
        StringBuffer sb = new StringBuffer(this.getClass().getName());
        sb.append("[id=")
        .append(id)
        .append(",lastmodified=")
        .append(lastmodified)
        .append(",url=")
        .append(url)
        .append("]");
        return sb.toString();
    }
}
Note that it implements java.io.Serializable.
This is a class that knows how to persist ImmutableBlogs and provides a method to fetch the Map:
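import java.io.File;
import java.util.Map;

import com.sleepycat.bind.serial.SerialBinding;
import com.sleepycat.bind.serial.StoredClassCatalog;
import com.sleepycat.bind.tuple.IntegerBinding;
import com.sleepycat.collections.StoredMap;
import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseConfig;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;

public class StoredBlogMap  {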
    
    private StoredMap blogMap;
    public StoredBlogMap() throws Exception {
        init();
    }
    
    protected void init() throws Exception {
        File dir = new File(System.getProperty("java.io.tmpdir") +
                File.separator + "StoredBlogMap");
        dir.mkdirs();
        EnvironmentConfig envConfig = new EnvironmentConfig();
        envConfig.setAllowCreate(true);
        Environment env = new Environment(dir, envConfig);
        DatabaseConfig dbConfig = new DatabaseConfig();
        dbConfig.setAllowCreate(true);
        Database blogsdb = env.openDatabase(null, "blogsdb", dbConfig);
        Database classdb = env.openDatabase(null, "classes", dbConfig);
        StoredClassCatalog catalog = new StoredClassCatalog(classdb);
        blogMap = new StoredMap(blogsdb,
                new IntegerBinding(), new SerialBinding(catalog, 
                        ImmutableBlog.class), true);
    }
    
    public Map getBlogMap() {
        return blogMap;
    }
}
The majority of the code is just plumbing for setting up the underlying database and typing the keys and values.
Here's a unit test:
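import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import com.sleepycat.collections.StoredIterator;
import junit.framework.TestCase;

public class StoredBlogMapTest extends TestCase {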

    private static Map testMap;
    static {
        testMap = new HashMap();
        testMap.put(new Integer(539), 
                new ImmutableBlog(539, System.currentTimeMillis(), 
                        "http://www.sifry.com/alerts"));
        testMap.put(new Integer(540), 
                new ImmutableBlog(540, System.currentTimeMillis(), 
                        "http://epeus.blogspot.com"));
        testMap.put(new Integer(541), 
                new ImmutableBlog(541, System.currentTimeMillis(), 
                        "http://www.arachna.com/roller/page/spidaman"));        
    };
    private StoredBlogMap blogMap;
    
    protected void setUp() throws Exception {
        super.setUp();
        blogMap = new StoredBlogMap();
    }
    
    public void testWriteBlogs() throws Exception {
        Map blogs = blogMap.getBlogMap();
        for (Iterator iter = testMap.entrySet().iterator(); iter.hasNext();) {
            Map.Entry ent = (Map.Entry) iter.next();
            blogs.put((Integer)ent.getKey(), (ImmutableBlog)ent.getValue());
        }
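        // size() is unsupported on a StoredMap, so count by iterating
        int i = 0;
        Iterator keys = blogs.keySet().iterator();
        try {
            while (keys.hasNext()) {
                keys.next();
                i++;
            }
        } finally {
            // iterators over a StoredMap must be closed explicitly
            StoredIterator.close(keys);
        }
        assertEquals(testMap.size(), i);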
    }
    
    public void testReadBlogs() throws Exception {
        Map blogs = blogMap.getBlogMap();
        Iterator iter = blogs.entrySet().iterator();
        while (iter.hasNext()) {
            Map.Entry ent = (Map.Entry) iter.next();
            ImmutableBlog test = (ImmutableBlog) testMap.get(ent.getKey());
            ImmutableBlog stored = (ImmutableBlog) ent.getValue();
            assertEquals(test, stored);
        }
        StoredIterator.close(iter);
    }

    public static void main(String[] args) {
        junit.textui.TestRunner.run(StoredBlogMapTest.class);
    }
}
These assertions all succeed, so assigning to and fetching from a persistent Map works! One notable thing about the BDB library is that it will allocate generous portions of the heap if you let it. The upside is that you get very high performance from the BDB cache. The downside is... using up heap that other things want. This is tunable: in StoredBlogMap's init() method, add this before the Environment is created:
// cache size is the number of bytes to allow Sleepycat to nail up
envConfig.setCacheSize(cacheSize);
// ... now setup the Environment
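Where cacheSize comes from is up to you. One option, sketched here as an assumption rather than a recommendation from Sleepycat, is to derive it as a fixed percentage of the JVM's maximum heap. The class name CacheSizing, the helper cacheBytes(), and the 10% figure are all made up for illustration.

```java
class CacheSizing {
    // Returns the given percentage of the heap size, in bytes.
    // Dividing before multiplying avoids overflow on very large heaps.
    static long cacheBytes(long maxHeapBytes, int percent) {
        if (percent < 1 || percent > 100) {
            throw new IllegalArgumentException("percent must be between 1 and 100");
        }
        return maxHeapBytes / 100 * percent;
    }

    public static void main(String[] args) {
        long maxHeap = Runtime.getRuntime().maxMemory();
        long cacheSize = cacheBytes(maxHeap, 10);
        // The result is what would be passed to envConfig.setCacheSize(cacheSize).
        System.out.println("max heap bytes: " + maxHeap);
        System.out.println("cache bytes:    " + cacheSize);
    }
}
```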

The basic stuff here functions very well; however, I haven't run any production code that uses Sleepycat's Collections yet. My last project with BDB needed to run an asynchronous database entry remover, so I wanted to remove as much "padding" as possible.

( Feb 08 2006, 12:22:21 AM PST ) Permalink


20060201 Wednesday February 01, 2006

Large Heap Sizes and OOM

Note to self: if you're getting OutOfMemoryErrors, bumping up the heap size may actually make the problem worse. Usually, OOM means you've exceeded the JVM's capacity... so you set -Xms and -Xmx to a higher stratum of memory allocation. Well, at least I thought that was the conventional wisdom. Having cranked it up to 1850M to open very large data structures, OOMs were still bringing down the house. OK, spread the work around in smaller chunks across multiple JVMs. But it still bombs out. It turns out that you have to be very particular about giving the JVM a lot of heap up front. This set of posts seems to peg it. I'd figured that nailing a big heap allocation was how I'd prevent OOM'ing. Looks like it's time for me to bone up on JVM tuning. I should probably dig into 64-bit Java while I'm at it.
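A cheap first step before any tuning is to have the JVM report what heap it actually got from the -Xms and -Xmx flags. This is just a sketch using java.lang.Runtime; the class name HeapReport and its helpers are invented for illustration, and the numbers only describe the Java heap, not native memory the process may also be using.

```java
class HeapReport {
    // Upper bound the JVM will try to use for the heap (roughly -Xmx).
    static long maxBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap currently reserved by the JVM minus the free part of it.
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("max heap (MB):  " + maxBytes() / mb);
        System.out.println("used heap (MB): " + usedBytes() / mb);
    }
}
```

Running it with the same flags as the failing process (for example -Xms1850M -Xmx1850M) shows whether the JVM really received the heap that was asked for.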

( Feb 01 2006, 11:57:15 PM PST ) Permalink