One of the really wonderful and evil things about Perl is the tie interface: you get a persistent hash without writing a boatload of code. With Sleepycat's BerkeleyDB Java Edition you can do something very similar.
Here's a quick recap: I've mentioned fiddling with BerkeleyDB-JE before with a crude "hello world" app. You can use the native-code version from Perl with obscene simplicity, too. In years past, I enjoyed excellent performance with older versions of BerkeleyDB through the "DB_File" module; today, the thing to use is the "BerkeleyDB" module from CPAN (note: you need db4.x-something or later for this to work). Here's a sample that writes to a BDB:
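    #!/usr/bin/perl
    use strict;
    use BerkeleyDB;
    use Time::HiRes qw(gettimeofday);

    my $filename = '/var/tmp/bdbtest';
    my %hash = ();

    # Tie the hash to a BerkeleyDB hash file, creating the file if necessary.
    tie(%hash, 'BerkeleyDB::Hash', { -Filename => $filename, -Flags => DB_CREATE })
        or die "Cannot open $filename: $! $BerkeleyDB::Error\n";

    # Each value is a tab-delimited list of key/value pairs:
    # d => timestamp (seconds and microseconds run together), u => URL.
    $hash{'539'} = "d\t" . join('',@{[gettimeofday]}) . "\tu\thttp://www.sifry.com/alerts";
    $hash{'540'} = "d\t" . join('',@{[gettimeofday]}) . "\tu\thttp://epeus.blogspot.com";
    $hash{'541'} = "d\t" . join('',@{[gettimeofday]}) . "\tu\thttp://http://joi.ito.com";

    # Untying flushes and closes the database.
    untie(%hash);

Yes, I'm intentionally using plain old strings, not Storable, FreezeThaw or any of that stuff.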
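And here's a sample that reads it back:

    #!/usr/bin/perl
    use strict;
    use BerkeleyDB;

    my $filename = '/var/tmp/bdbtest';
    my %hash = ();

    # Open the same file read-only this time.
    tie(%hash, 'BerkeleyDB::Hash', { -Filename => $filename, -Flags => DB_RDONLY })
        or die "Cannot open $filename: $! $BerkeleyDB::Error\n";

    for my $bid (keys %hash) {
        # Splitting on tabs yields alternating keys and values,
        # which drop straight into a hash.
        my %blog = split(/\t/,$hash{$bid});
        print "$bid:\n";
        while(my($k,$v) = each(%blog)) {
            print "\t$k => $v\n";
        }
    }

    untie(%hash);

Which would render output like this: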
    541:
        u => http://http://joi.ito.com
        d => 1139388034903283
    539:
        u => http://www.sifry.com/alerts
        d => 1139388034902888
    540:
        u => http://epeus.blogspot.com
        d => 1139388034903227
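Since the rest of this post is about Java, here is a rough sketch of the same tab-delimited decoding on the Java side. It only mirrors what the Perl split() above does; the class name TabRecord and its parse method are made up for illustration and are not part of any Sleepycat API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of any Sleepycat API.
final class TabRecord {

    // Turns "k1\tv1\tk2\tv2" into an ordered map, much like
    // Perl's  my %blog = split(/\t/, $record);
    static Map<String, String> parse(String record) {
        String[] parts = record.split("\t");
        if (parts.length % 2 != 0) {
            throw new IllegalArgumentException("odd number of fields: " + parts.length);
        }
        Map<String, String> fields = new LinkedHashMap<String, String>();
        for (int i = 0; i < parts.length; i += 2) {
            fields.put(parts[i], parts[i + 1]);
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> blog =
                parse("d\t1139388034902888\tu\thttp://www.sifry.com/alerts");
        // prints {d=1139388034902888, u=http://www.sifry.com/alerts}
        System.out.println(blog);
    }
}
```

A LinkedHashMap keeps the fields in the order they were written, which a Perl hash does not promise (note the u-before-d ordering in the output above).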
Java has no tie operator (that's probably a good thing). But Sleepycat has incorporated a Collections framework that's pretty cool and gets you pretty close to tied-hash functionality. Note, however, that it's not entirely compatible with the interfaces in the Java Collections Framework, but if you know those APIs, you'll immediately know the Sleepycat APIs.
com.sleepycat.collections.StoredMap implements java.util.Map with the following caveats:

When you're done with a java.util.Iterator that has been working on a StoredMap, you have to use com.sleepycat.collections.StoredIterator's .close(Iterator) method to tidy up.

So what does the code look like? Well, let's say you wanted to store a bunch of these vanilla beans in the database:
    import java.io.Serializable;

    public final class ImmutableBlog implements Serializable {

        private static final long serialVersionUID = -7882532723565612191L;

        // Set once in the constructor; there are no setters.
        private final long lastmodified;
        private final String url;
        private final int id;

        public ImmutableBlog(final int id, final long lastmodified, final String url) {
            this.id = id;
            this.lastmodified = lastmodified;
            this.url = url;
        }

        public int getId() {
            return id;
        }

        public long getLastmodified() {
            return lastmodified;
        }

        public String getUrl() {
            return url;
        }

        public boolean equals(Object o) {
            if (!(o instanceof ImmutableBlog))
                return false;
            if (o == this)
                return true;
            ImmutableBlog other = (ImmutableBlog)o;
            return other.getId() == this.getId() &&
                other.getLastmodified() == this.getLastmodified() &&
                other.getUrl().equals(this.getUrl());
        }

        public int hashCode() {
            return (int) (id * 51 + url.hashCode() * 17 + lastmodified * 29);
        }

        public String toString() {
            StringBuffer sb = new StringBuffer(this.getClass().getName());
            sb.append("[id=")
                .append(id)
                .append(",lastmodified=")
                .append(lastmodified)
                .append(",url=")
                .append(url)
                .append("]");
            return sb.toString();
        }
    }

Note that it implements java.io.Serializable.
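Why Serializable matters: as far as I understand it, the SerialBinding used further down relies on standard Java serialization to turn each value into bytes and back. The sketch below shows that round trip with nothing but java.io, no Sleepycat classes involved. SerialRoundTrip and its nested Blog class are hypothetical stand-ins (Blog is a trimmed-down ImmutableBlog), so treat this as an illustration of the idea rather than what the library does internally.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical illustration; none of these names come from Sleepycat.
final class SerialRoundTrip {

    // Trimmed-down stand-in for ImmutableBlog.
    static final class Blog implements Serializable {
        private static final long serialVersionUID = 1L;
        final int id;
        final String url;

        Blog(int id, String url) {
            this.id = id;
            this.url = url;
        }
    }

    // Serialize any Serializable value into a byte array.
    static byte[] toBytes(Serializable value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    // Rebuild an object from bytes produced by toBytes().
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] stored = toBytes(new Blog(539, "http://www.sifry.com/alerts"));
        Blog copy = (Blog) fromBytes(stored);
        // prints 539 http://www.sifry.com/alerts
        System.out.println(copy.id + " " + copy.url);
    }
}
```

The copy that comes back is a different object from the one written, which is why ImmutableBlog needs a real equals() if you want to compare stored values with the originals.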
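Here's the class that sets up the StoredMap:

    import java.io.File;
    import java.util.Map;

    import com.sleepycat.bind.serial.SerialBinding;
    import com.sleepycat.bind.serial.StoredClassCatalog;
    import com.sleepycat.bind.tuple.IntegerBinding;
    import com.sleepycat.collections.StoredMap;
    import com.sleepycat.je.Database;
    import com.sleepycat.je.DatabaseConfig;
    import com.sleepycat.je.Environment;
    import com.sleepycat.je.EnvironmentConfig;

    public class StoredBlogMap {

        private StoredMap blogMap;

        public StoredBlogMap() throws Exception {
            init();
        }

        protected void init() throws Exception {
            // The environment lives in a directory under java.io.tmpdir.
            File dir = new File(System.getProperty("java.io.tmpdir") +
                    File.separator + "StoredBlogMap");
            dir.mkdirs();

            EnvironmentConfig envConfig = new EnvironmentConfig();
            envConfig.setAllowCreate(true);
            Environment env = new Environment(dir, envConfig);

            // One database for the blogs, one for the serialized class descriptions.
            DatabaseConfig dbConfig = new DatabaseConfig();
            dbConfig.setAllowCreate(true);
            Database blogsdb = env.openDatabase(null, "blogsdb", dbConfig);
            Database classdb = env.openDatabase(null, "classes", dbConfig);
            StoredClassCatalog catalog = new StoredClassCatalog(classdb);

            // Integer keys, serialized ImmutableBlog values, writes allowed.
            blogMap = new StoredMap(blogsdb,
                    new IntegerBinding(),
                    new SerialBinding(catalog, ImmutableBlog.class),
                    true);
        }

        public Map getBlogMap() {
            return blogMap;
        }
    }

The majority of the code is just plumbing for setting up the underlying database and typing the keys and values.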
Here's a JUnit test that exercises it:

    import java.util.HashMap;
    import java.util.Iterator;
    import java.util.Map;

    import junit.framework.TestCase;

    import com.sleepycat.collections.StoredIterator;

    public class StoredBlogMapTest extends TestCase {

        private static Map testMap;
        static {
            testMap = new HashMap();
            testMap.put(new Integer(539),
                    new ImmutableBlog(539, System.currentTimeMillis(),
                            "http://www.sifry.com/alerts"));
            testMap.put(new Integer(540),
                    new ImmutableBlog(540, System.currentTimeMillis(),
                            "http://epeus.blogspot.com"));
            testMap.put(new Integer(541),
                    new ImmutableBlog(541, System.currentTimeMillis(),
                            "http://www.arachna.com/roller/page/spidaman"));
        }

        private StoredBlogMap blogMap;

        protected void setUp() throws Exception {
            super.setUp();
            blogMap = new StoredBlogMap();
        }

        public void testWriteBlogs() throws Exception {
            Map blogs = blogMap.getBlogMap();
            for (Iterator iter = testMap.entrySet().iterator(); iter.hasNext();) {
                Map.Entry ent = (Map.Entry) iter.next();
                blogs.put((Integer)ent.getKey(), (ImmutableBlog)ent.getValue());
            }

            // Count the stored keys by walking the key set.
            int i = 0;
            Iterator keys = blogs.keySet().iterator();
            try {
                while (keys.hasNext()) {
                    keys.next();
                    i++;
                }
            } finally {
                // Iterators over a StoredMap have to be closed explicitly.
                StoredIterator.close(keys);
            }
            assertEquals(testMap.size(), i);
        }

        public void testReadBlogs() throws Exception {
            Map blogs = blogMap.getBlogMap();
            Iterator iter = blogs.entrySet().iterator();
            try {
                while (iter.hasNext()) {
                    Map.Entry ent = (Map.Entry) iter.next();
                    ImmutableBlog test = (ImmutableBlog) testMap.get(ent.getKey());
                    ImmutableBlog stored = (ImmutableBlog) ent.getValue();
                    assertEquals(test, stored);
                }
            } finally {
                // Close even if an assertion fails.
                StoredIterator.close(iter);
            }
        }

        public static void main(String[] args) {
            junit.textui.TestRunner.run(StoredBlogMapTest.class);
        }
    }

These assertions all succeed, so assigning to and fetching from a persistent Map works! One notable thing about the BDB library: it will allocate generous portions of the heap if you let it. The upside is that you get very high performance from the BDB cache. The downside is... using up heap that other things want. This is tunable: in StoredBlogMap's init() method (called from the ctor), add this before creating the Environment:
    // cache size is the number of bytes to allow Sleepycat to nail up
    envConfig.setCacheSize(cacheSize);
    // ... now set up the Environment
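Where does cacheSize come from? One option, sketched here, is to budget a fixed percentage of the JVM's maximum heap. The CacheSizing class, the bytesForHeapPercent method, the 25 percent figure and the 1..90 sanity range are all assumptions of mine for illustration; pick whatever fits the other tenants of your heap.

```java
// Hypothetical helper, not part of any Sleepycat API.
final class CacheSizing {

    // Returns `percent` of the given heap size, in bytes.
    // The 1..90 range is an arbitrary sanity check, not a library limit.
    static long bytesForHeapPercent(long maxHeapBytes, int percent) {
        if (percent < 1 || percent > 90) {
            throw new IllegalArgumentException("percent must be 1..90, got " + percent);
        }
        // Divide first so a huge heap value cannot overflow the multiplication.
        return maxHeapBytes / 100 * percent;
    }

    public static void main(String[] args) {
        long maxHeap = Runtime.getRuntime().maxMemory();
        long cacheSize = bytesForHeapPercent(maxHeap, 25);
        System.out.println("max heap = " + maxHeap
                + " bytes, cache budget = " + cacheSize + " bytes");
        // In StoredBlogMap.init() this value would then be handed to:
        //   envConfig.setCacheSize(cacheSize);
    }
}
```

Computing the number at startup means the cache budget follows whatever -Xmx the JVM was launched with, rather than being hard-coded.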
The basic stuff here functions very well; however, I haven't run any production code that uses Sleepycat's Collections yet. My last project with BDB needed to run an asynchronous database entry remover, so I wanted to remove as much "padding" as possible.
( Feb 08 2006, 12:22:21 AM PST )