What's That Noise?! [Ian Kallen's Weblog]


Wednesday April 14, 2004

PHP versus J2EE

I don't like language wars; different languages and frameworks usually have a mix of strengths and weaknesses. However, I'm presently looking at some fixes (mostly resiliency issues) that have to be applied to a web tier built on PHP, and I'm thinking about what's over the horizon for future scaling requirements.

I first used PHP in 1996 (it was called PHP/FI at the time) -- it offered a very innovative alternative to Perl and CGI-generated content by making the execution flow out of logic embedded in the UI. I liked it. The language was easy; very Perl-ish. But over the years, other frameworks have emerged that, despite the many improvements PHP has enjoyed, easily eclipse PHP.

PHP's primary strength is that it enables rapid development of database-backed web applications. For prototyping and providing "functional mockups", that's all well and good. But what I'm seeing is that as soon as you want to scale the application along some axis (runtime traffic, i18n, collaborative development, multiple presentation formats), it's difficult to justify sticking with PHP. From what I reckon, a J2EE web tier (servlet container, JSP w/JSTL taglibs, an MVC framework such as Struts, etc.) offers a lot of high-level infrastructure and, when contrasted with PHP, is a clear winner:

Runtime scaling
  • There are opcode caching extensions for PHP, but opcode caching is not a core component of the PHP runtime environment.
  • PHP is not thread-safe. You pretty much have to run PHP on a pre-forked Apache server. That's not too bad, but it means that if you have a large Apache process pool, each process has to have its own connection to a database. If you have multiple persistent connections and a few hundred Apache processes, you're taxing the databases (because of the large number of open connections) and taxing the OS (because of the large number of open file descriptors and the memory consumption). And every forked process is an island of process space; sharing data between them is a PITA (you have to manage shared memory) and sharing resources is impossible (e.g. forget about pooling your persistent database connections).
  • Servlet container hosted runtime engines (such as JSP or Velocity) usually have code caching out-of-the-box. Higher level caches such as OSCache promise to reduce the computational burden on the web tier even further.
  • Applications in a servlet container easily take advantage of resources shared between threads, such as database connection pools, external service interfaces (e.g. remote and/or local EJB factories; the home interfaces) and the container itself.
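To make the thread-sharing point concrete, here's a minimal sketch in modern Java of many worker threads sharing a handful of connections. Everything in it is made up for illustration: FakeConnection stands in for a real database connection, and SimplePool is a toy, not the pooling API of any particular container.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a real database connection.
class FakeConnection {
    final int id;
    FakeConnection(int id) { this.id = id; }
}

// A toy fixed-size pool: threads borrow a connection and must give it back.
class SimplePool {
    private final BlockingQueue<FakeConnection> idle;
    final AtomicInteger created = new AtomicInteger();

    SimplePool(int size) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(new FakeConnection(created.incrementAndGet()));
        }
    }

    FakeConnection borrow() {
        try {
            return idle.take(); // blocks until some thread returns a connection
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    void giveBack(FakeConnection c) { idle.add(c); }

    int idleCount() { return idle.size(); }
}

class PoolDemo {
    // Runs `requests` simulated requests on `threads` worker threads that all
    // share one pool, and returns how many requests completed.
    static int serve(SimplePool pool, int threads, int requests) {
        ExecutorService workers = Executors.newFixedThreadPool(threads);
        AtomicInteger served = new AtomicInteger();
        for (int i = 0; i < requests; i++) {
            workers.execute(() -> {
                FakeConnection c = pool.borrow();
                try {
                    served.incrementAndGet(); // a real handler would run a query here
                } finally {
                    pool.giveBack(c);
                }
            });
        }
        workers.shutdown();
        try {
            if (!workers.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return served.get();
    }

    public static void main(String[] args) {
        SimplePool pool = new SimplePool(4);
        int served = serve(pool, 50, 1000);
        System.out.println(served + " requests served over "
                + pool.created.get() + " connections");
    }
}
```

Fifty threads get by on four connections because they live in one process. With a few hundred pre-forked processes, each one would need its own.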
Content format scaling
If you want to co-brand a PHP site, plan on having conditional logic scattered and tangled all over the UI code. This quickly degrades into a maintenance nightmare. A J2EE MVC framework such as Jakarta Struts (with Tiles) provides a centralized mechanism to declaratively and programmatically control how content is assembled for presentation.
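As a rough illustration of what "centralized" buys you, here's a plain-Java sketch of the idea: one default page definition, with per-brand overrides kept in one place rather than in conditionals sprinkled through the UI code. This is not the Struts Tiles API; the PageLayouts class, the region names and the "partner" brand are all invented for the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical central registry of page regions and per-brand overrides.
class PageLayouts {
    // Region name -> default fragment, kept in display order.
    private final Map<String, String> defaults = new LinkedHashMap<>();
    // Brand -> (region name -> replacement fragment).
    private final Map<String, Map<String, String>> overrides = new HashMap<>();

    PageLayouts region(String name, String fragment) {
        defaults.put(name, fragment);
        return this;
    }

    PageLayouts override(String brand, String region, String fragment) {
        overrides.computeIfAbsent(brand, b -> new HashMap<>()).put(region, fragment);
        return this;
    }

    // Assembles the page for a brand; unknown brands get the default layout.
    String assemble(String brand) {
        Map<String, String> forBrand =
                overrides.getOrDefault(brand, Collections.<String, String>emptyMap());
        StringBuilder page = new StringBuilder();
        for (Map.Entry<String, String> region : defaults.entrySet()) {
            page.append(forBrand.getOrDefault(region.getKey(), region.getValue()))
                .append('\n');
        }
        return page.toString();
    }

    public static void main(String[] args) {
        PageLayouts layouts = new PageLayouts()
                .region("header", "[site header]")
                .region("body", "[article]")
                .region("footer", "[site footer]")
                .override("partner", "header", "[partner header]");
        System.out.print(layouts.assemble("partner"));
    }
}
```

Adding another co-brand means adding overrides to the registry; the page fragments themselves don't grow new if/else branches.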
Locale scaling
PHP allows you to extend the runtime with the GNU gettext framework. IMO, this is a difficult system to use; you have to manage PO files that have a peculiar file format. The standard Java library has ResourceBundle support built in with a simple file format (properties files) and, again, web tier infrastructure such as the servlet container, JSTL and Struts provides easy-to-use tools to access ResourceBundles.
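For a sense of how little there is to the properties format, here's a small sketch using the standard library's PropertyResourceBundle. To keep it self-contained, the bundle contents are inline strings; in a real application they would live in files such as Messages.properties and Messages_fr.properties (hypothetical names) and be looked up with ResourceBundle.getBundle. The keys and the Greetings class are invented for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class Greetings {
    // What the properties files would contain: plain key=value lines.
    static final String DEFAULT_PROPS = "greeting=Hello\nfarewell=Goodbye\n";
    static final String FRENCH_PROPS = "greeting=Bonjour\nfarewell=Au revoir\n";

    // Parses properties-format text into a ResourceBundle.
    static ResourceBundle load(String props) {
        try {
            return new PropertyResourceBundle(new StringReader(props));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The calling code only knows the key, never the language.
    static String greet(ResourceBundle bundle, String name) {
        return bundle.getString("greeting") + ", " + name;
    }

    public static void main(String[] args) {
        System.out.println(greet(load(DEFAULT_PROPS), "Ian"));
        System.out.println(greet(load(FRENCH_PROPS), "Ian"));
    }
}
```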
Functional scaling
  • Extending PHP often requires recompiling the PHP runtime to statically link in different extensions. This results in closely binding your runtime requirements and your infrastructure compilation requirements.
  • Maintaining persistence flexibility is difficult with PHP; while the PEAR abstraction layer provides a baseline of RDBMS independence, the language is still bloated with a separate set of database function calls for every backend you use. Furthermore, as outlined above, changing backends requires platform-dependent code, perhaps recompiling PHP itself, and then having client libraries installed wherever the runtime is executing.
  • Most Java libraries are available as "Pure Java", i.e. JNI support and deploying OS-native libraries aren't an issue. You never have to recompile Java to extend it.
  • JDBC provides a standard interface to different database backends. While this won't help with things like SQL implementation differences, higher level tools like Hibernate's object-relational framework potentially will. Furthermore, most database drivers are "Pure Java", i.e. there's no need to recompile anything and no need to, say, make sure LD_LIBRARY_PATH is set properly to enable your database drivers.
Collaborative scaling
In PHP, the distinction between UI code and business logic is fuzzy, and it takes lots of developer discipline to keep them cleanly separated; there's little support in the framework itself. This pretty much guarantees closely mingling formatting and display code with lots of logic. Yuck. While it's certainly possible to write horrible applications with J2EE web infrastructure (e.g. the fact that you can embed Java code, er, scriptlets, directly into a JSP is a terrible fact of life), the wealth of framework support (JSTL, Struts, etc.) makes it easy to follow practices that keep the separation of concerns clean.
A clean separation of concerns is important. Collaborative web application development usually has different roles (the Java programmer providing business APIs and beans, the UI layout coder developing the HTML structures, the UI logic programmer writing tag libraries and/or integrating them with the HTML structures) -- even if you're one person coding all of these, having the different work elements clearly defined and cleanly separated will make your job easier. When you're working in a collaborative environment, the separation of concerns is essential.

There are other little things among the pros and cons. PHP can support URL-rewriting session tracking without doing anything special in the markup code -- nice that it can do that unobtrusively, but icky on other counts.

I'd rather use a taglib in a servlet container to keep track of the URL rewriting needed, if any.

All of these points of contrast lie atop the basic structural differences between the PHP and Java languages: Java has Object Oriented Programming (OOP) as a core part of its design, PHP has OOP as an odd afterthought. Java has real exception handling, PHP awkwardly provides function calls to register error handlers and a function to trigger an error. OO is a core element of code modularization, reuse and extension.
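Here's a minimal sketch of what "real exception handling" means on the Java side. The Inventory class and InsufficientStockException are hypothetical; the point is only that the failure is a typed object, declared in the method signature and handled at the call site, rather than something routed through a globally registered handler.

```java
// Hypothetical checked exception for a business-rule failure.
class InsufficientStockException extends Exception {
    InsufficientStockException(String message) {
        super(message);
    }
}

class Inventory {
    private int stock;

    Inventory(int stock) {
        this.stock = stock;
    }

    // The checked exception is part of the signature, so the compiler
    // forces every caller to handle it or declare it.
    void remove(int quantity) throws InsufficientStockException {
        if (quantity > stock) {
            throw new InsufficientStockException(
                    "wanted " + quantity + ", have " + stock);
        }
        stock -= quantity;
    }

    int stock() {
        return stock;
    }
}

class OrderDemo {
    // Handles the failure right where the call is made.
    static String placeOrder(Inventory inventory, int quantity) {
        try {
            inventory.remove(quantity);
            return "ok";
        } catch (InsufficientStockException e) {
            return "rejected: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        Inventory inventory = new Inventory(5);
        System.out.println(placeOrder(inventory, 3));
        System.out.println(placeOrder(inventory, 3));
    }
}
```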

Additionally, while I have my misgivings about EJBs and their misuse, EJBs are an established framework for separating the business and persistence tiers from the UI. Service oriented architecture (SOA) patterns are also well established amongst the J2EE development community -- this also better enables collaborative development and clean separations of concern, ergo, long term maintainability.

A final point of comparison and contrast is how closely bound PHP is to the web server in a typical architecture. It's nice to have the ability to scale the HTTP interface independently from the application layer. With PHP, the only option there is to run another webserver: a reverse proxy to offload all of the HTTP servicing. With a servlet container there are various options to connect the HTTP interface to the Java engine via a connector. For example, with Apache and Tomcat you can use mod_jk to connect them and, just to sweeten the deal a bit, mod_jk provides a bit of scaffolding to support load balancing.

It's not my intention to sweepingly indict PHP. For the simple stuff, I like it. And I know it's extremely popular ("50 Million Elvis Fans Can't Be Wrong!"). But when you get beyond needing "Server Side Includes On Steroids" and have to deploy a scaled-up, database-backed application, PHP's weaknesses come to the foreground. Yea, yea... I know that Yahoo! uses PHP and I bet there's a lot to learn from their experiences with it -- I expect PHP creator Rasmus Lerdorf's employment with Yahoo! to drive a lot of innovation in PHP's future. But at this point in time, for the web application requirements I'm looking at, I gotta give it to J2EE.


( Apr 14 2004, 12:12:22 PM PDT )