What's That Noise?! [Ian Kallen's Weblog]

Main | Next month (Jan 2005) »

Sunday December 12, 2004

l10n development practices I've used Java's handy-dandy ResourceBundles to do some proof-of-concept localizations. But my past proofs were limited; they only used other European languages that utilized ISO-8859-1 characters. I don't recall having to do anything special with the property files in those cases.

Working on a recent Japanese localization project was an eye opening experience. It turns out the java.util.Properties expects ISO-8859-1 characters. I guess that's the downside of having a super-simple file format. I got the localized display boostrapped by using native2ascii to get the UTF-8 localization text rendered as escaped unicode. On a one-off basis, that's easy enough. But collaborative development always begs the tools question, how do folks typically manage this?

Give the localization team instructions on how to use native2ascii on a command line basis?
Put it in the build system with the native2ascii ant task?
Use a light weight GUI tool like resourcebundleeditor (or a heavy weight GUI tool ...)?

What about input encoding? If there's an HTML form on a page and the input has multibyte characters in the query string (or POST data), are characters escaped to ISO-8859-1? My recollection was that HTTP headers must be ISO-8859-1.... but looking at the docs for PHP's mbstring and the encoding_translation parameter, it looks like server-side handling of the request needs to account for other character set encodings. Do browsers honor charset specification as a form attribute, like

<form action=... method=... accept-charset="UTF-8">

(looks like Struts supports this) or is it presumed that the browser always escapes unicode? Or perhaps they simply URL encode the characters so it's a non-issue? On the server side the must the request handling do this

request.setCharacterEncoding("UTF-8");
String raw = request.getParameter("foo");
String clean = new String(raw.getBytes("ISO-8859-1"), "UTF-8");

or is it all supposed to transparently just work (obviating String cleansing) if request.setCharacterEncoding("UTF-8") is used? ...for all of the hand-waving in the docs for ResourceBundle, etc establishing a clear practice for input String handling in a webapp remains murky.

As far as sending responses, is it safe to always just send UTF-8 and include "charset=UTF-8" in the Content-type header? Is it standard practice to presume that the client will send a request header Accept-Charset (which indicates what an acceptable response is)? If they send it and UTF-8 isn't on the list, must the server go through a big String re-writing exercise to encode response to the browser's preference or is UTF-8 presumed to be implicitly acceptable at all times?

So many questions... I'm still digging for anwers.

( Dec 12 2004, 11:51:01 PM PST ) Permalink

Tuesday December 07, 2004

Runtime inseration of struts tiles in Velocity I've been impressed with TilesTool, it comes in the Velocity Tools package. It runs Velocity views through the struts MVC machine for processing reusable "subviews". However, there's no support for runtime insertion of components!

You can do this in tiles-defs.xml

   <definition name=".dog" extends=".animal.layout">
     <put name="body"    value=".dog.display" /> 
     <put name="head"    value=".dog.head" /> 
   </definition>
   <definition name=".cosmos.head" extends=".head">
     <put name="titleKey" value="dog.title" />
   </definition>
   <definition name=".dog.display" 
     controllerUrl="/dog.do"
     path="/tile/dog.vm"
     />

and so forth. Declaritive tile composition works just fine. But what about programmatic composition at runtime?

With JSTL and struts, I can do this:

<c:forEach var="bit" items="${kibble}">
  <tiles:insert page="/tile/bark.jsp">
    <tiles:put name="bit" beanName="bit" />
  </tiles:insert>
</c:forEach>

I would imagine that the Velocity equivalent would look like this:

<ol>
#foreach ($bit in $kibble)
 $tiles.put("/tile/bark.vm", { "bit" : $bit })
#end
</ol>

but alas, it's not implemented by TilesTool. I can work around this by moving "bark.vm" to its own velocimacro but that it fugly as hell. I would prefer parameterized components.

( Dec 07 2004, 06:53:07 AM PST ) Permalink

Monday December 06, 2004

Servlet container forward from inside Velocity I ported some JSP UI code to Velocity, it's been fun learning the Velocity paradigm (being able to cleanly process template components outside the container rocks). One of the things in the JSP UI handled forwarding requests that bypassed the struts controller back through struts.

In JSP with struts tags, it looks like this (assume web.xml has "struts-logic" mapped):

<%@ taglib uri="struts-logic" prefix="logic" %>
<logic:redirect forward="home"/>

But what about Velocity? Well, it turns out that the VelocityViewServlet stuffs the basic servlet container things into the Velocity context, much like JSTL does in JSPville. Ergo, the $request object itself can be invoked like this:

$request.getRequestDispatcher("/home.do").forward($request,$response)

Seems kinda grotty to not be able to use struts symbolic name, but so far that's where my read of the Velocity docs has taken me. As I unpeel the onion, I may be inspired to subclass the VelocityViewServlet as a StrutsViewServlet... it seems like however you're invoking the rendering, you should be able to access, if present, other runtime services such as struts, spring, etc. ( Dec 06 2004, 10:05:35 AM PST ) Permalink

Sunday December 05, 2004

memcached in a service oriented functionality ecosystem Among my efforts over recent months have been those focused on decoupling. Technorati has a very high update rate as it taps the ping streams, fetches update contents, analyzes links and keyword indexes the substance of posts in the blogsphere. Such a system doesn't work well when components are closely coupled; the availability of the whole system is subject to the whim of the system's weakest links. Often, weaknesses are combinatorial; the weakness of the whole is greater than the weakness of the parts. That's what I'm focused on undoing. Fixing weaknesses in the components is important but decoupling them first is more so.

When folks say "service oriented architecture" it still cannotes monolithicism to me. An architecture implies a level of structure definition that sounds rigid; can you re-pour that foundation to adapt redrawn plans? Software development agility and loose coupling should reinforce each other. I prefer to think of architectures and ecosystems. A service oriented functionality ecosystem supplies application functionality as a suite of services. Supporting requirements (as opposed to the core business requirements) such security, logging, persistence, redundancy and caching are each handled independently; they in turn may be provisioned as services that higher level services rely on. This is part of the evolution under way at Technorati; some of the changes are evident in Dave's recent posts but some are just revisions that we're quietly rolling out.

Queues and distributed memory caches are natural elements of a such an environment. In the December issue of Linux Journal, Technorati's use of open source building blocks such as memcached is discussed by Doc Searls.

This is the game:
A memcached server (or a set of servers) can be accessed over the network to store things in a table kept in RAM. When storing things, you can specify a maximum age for the cache entry -- if you go back to fetch it and the elapsed time since it was stored exceeds that age, it gets treated as a cache miss.

Storing things in memcached with the timeout parameter and invalidating cache entries works as long as you have consistent mechanism for calculating the key. If internally you're managing "stories" and each one has an "id" attribute that is unique (a primary key), that's a good candidate to store them with. So for instance putting memcache inside a content management system (CMS) "content service" seems natural. In babytalk code:

  public Story fetchStory(int storyId) {
      Story story = memc.get(storyId);
      if (story == null) // perhaps more rigorous validation of the fetched object
          return story;
      story = StoryDB.findById(storyId);
      memc.put(storyId, story, AGE);
      return(story);
  }

If it's difficult to determine whether something is new or an update because it doesn't have an id and uniqueness is determined by some combination of attributes, then the lookup cycle can be helped by caching with composite keys. It gets a little more complicated:

  public Story fetchStory(Map atts) {
      // encapulate whatever attributes uniquely identify a thing
      CacheKey key = new CacheKey(attrs); 
      Story story = memc.get(key);
      if (story == null) 
          return story;
      story = StoryDB.findByAttrs(attrs);
      memc.put(key, story, AGE);
      return(story);
  }

We're in the process of evolving Technorati's infrastructure to one that is loosely coupled, redundant and robust. Our use of memcached is one of the enabling technologies of that evolution.

( Dec 05 2004, 09:22:23 AM PST ) Permalink

Sun	Mon	Tue	Wed	Thu	Fri	Sat
« December 2004 »
			1	2	3	4
5	6	7	8	9	10	11
12	13	14	15	16	17	18
19	20	21	22	23	24	25
26	27	28	29	30	31

Today