COP-2808 Java EE Lecture Notes (Draft)

Lecture 14 — Java EE (J2EE) Overview

Most real-world applications need access to central resources, such as up-to-date inventory, available seats on a plane, or the highest bid to date on an auction item. Often these resources are stored in a database, but other resources are accessed centrally too: current stock prices, credit-card processing, etc.

It is possible to write a Java application that accesses such resources on the local computer. But in the real world such central resources are on a different host. We can write code that uses networking APIs to contact these central resources. What’s wrong with that?

· Every change to the central resources will require every copy of the application to be updated. Customers don’t like frequent updates, and if they don’t update and the application fails, they will go to your competitor.

· By allowing remote applications to access your precious resources, all sorts of security problems can arise, including unauthorized use and denial of service.

· A remote application will need to read the central resources every time, from the network. It is usually not feasible to cache this on each and every client. So frequently accessed information, such as a catalog, will be slow to access. If you do cache information you risk it going stale (out of date), unless you check for changes frequently. No matter how you look at it, your central resources will have lots of network traffic to contend with. Network bandwidth is expensive.

· With remote applications doing all the work, logging and auditing become impossible: If something doesn’t work right you won’t know until the customer complains. You will not have a complete audit trail for legal and financial purposes. (Imagine a customer buys something, the credit card service is contacted directly and the customer billed, but the network connection back to your database fails and the order isn’t completed!)

One simple way around this is to keep the application with the central resources, and only have a remote user interface. This is called client-server computing and it works very well. This scheme has been in use for over 20 years. Since some of the application is in one place and other parts are elsewhere, this is also called distributed computing, enterprise computing, or other terms depending on who you ask.

Definition of Terms for Enterprise Computing (from Sun’s Java Enterprise Cert)

· Scalability – ability to support the required quality of service as the load increases

· Reliability – the assurance of the integrity and consistency of the application and all of its transactions. Reliability spans from the OS to the application, to the service provided.

· Availability – the assurance that a service/resource is always accessible

· Maintainability – the ability to correct flaws in the existing functionality without impacting other components/systems

· Extensibility – the ability to add/modify additional functionality without impacting existing functionality

· Manageability – the ability to manage the system in order to ensure the continued health of a system with respect to scalability, reliability, availability, performance, and security.

Client server computing

Client server systems come in different configurations:

· Thick (fat) client – A standard (or Web Start) Java application or applet that communicates across a network to the server. The client is the user interface (the presentation layer or tier). Such systems can use different approaches to networking: define a custom protocol for the application on top of TCP/IP; use HTTP GET and POST requests to pass messages and data; use SOAP to pass objects back and forth (see also SOA, service-oriented architecture; WSDL, Web Services Description Language; UDDI; and Apache Axis for Java SOAP); and others too.

With a fat client, most if not all of the work is done at the client. This design requires large executables to be downloaded. Updates require another download. The server (if any) only provides a central storage place with very little business logic.

· Chubby client – all the grunt work (SQL etc.) is done on the server along with all the business logic. However, the client has a lot of (JavaScript) code which runs to provide the user with a “rich experience”. This code is reloaded every time it changes with no interaction by the user. AJAX (and Web 2.0) falls into this category.

· Thin client – A web browser talks to a web server, which presents a form typically using JSP or servlet technology. (Of course, PHP and ASP pages could be used too, but in this course we will stick to learning only one language, Java, to keep the number of languages a developer needs to master to a minimum.) Note the client in this case is the JSP/servlet web server code. This can get confusing (the client of the server is on a web server). This is how most web pages currently work.

The thick client has the advantage of a rich and fast user interface that can include extra features such as caching. Also, a thick client can operate even if the Internet connection is unavailable (common for mobile users), and in any case appears much more responsive (no network latency for updates). However, the thick client has many of the same problems (especially deployment and security) that plague stand-alone (non-client-server) applications. Java Web Start is an attempt to mitigate this problem but is not widely used today.

The thin client is limited to what a web browser will show. This can be more difficult, as a good-looking web page requires use of HTML/XML, JavaScript, and CSS. Worse, there are browser compatibility issues, so a page may not look good in an older, newer, or simply different browser or platform unless you settle for a dull, boring user interface. Still, this approach does keep everything in one place, so no deployment and few security issues arise. Because the processing is done on a server, the user experience may be slower than with a thick client. If the Internet is unavailable the thin client fails to work at all. The thin client also needs to worry about issues such as the back button and multiple browser windows open at once.

Today the thin client is the preference of many developers especially with AJAX which provides some thick client benefits. However this changes over time as more powerful laptops with more RAM become popular, more different web browsers become incompatible, and networks become more or less reliable and secure.

Either way the client must communicate with the server. Often this is with RMI or other means such as CORBA or even very low-level sockets. (CORBA was supposed to do this but ended up too complex, and although Java supports it, it is not often used. This situation may be changing though. Sockets are too low-level to develop significant distributed applications with.)

Multi-Layer or Multi-tier Design

The client talks to your application on the server, which accesses the central resources. There are different types of resources, and even for the same type (such as a database) there are different vendors. Often you need to access mainframes (yes, these are used in the real world!) or legacy systems running proprietary software that can’t be easily (or at all) updated. (Even if you don’t need that now, your application may need to support this someday!)

Such resources are known as Enterprise Information Systems or EIS. It has been found to be a generally useful thing to split your application into several parts:

· Presentation layer (or tier) - Separating the client code from the rest means you can have different sorts of clients for the same application: a web interface, a standalone (or Web Start) application, or even an applet. Also you can have designers make functional and attractive UIs, and developers (with little artistic talent) work on the logic of the application. With Java EE 5 a framework called JavaServer Faces (JSF) is used to develop the UI for a Java EE application. Often a web interface is developed on JSF that uses another standard technology, servlets and JavaServer Pages (JSPs).

· Business logic layer (or tier) - This is the meat of your application. The client (presentation layer or tier) code talks to this code. In Java terms, the client will create objects of these classes and invoke their methods; the results are then formatted and displayed to the user.

· Data access and integration layer (or tier) - This is the code that talks to the EIS components. This code hides the messy details from your nice simple business layer code. The code in the business layer will invoke methods of objects in this layer. This layer can also be known as the EIS layer or tier.

Here’s an example: The user clicks a link on a web page to view their current balance (before they get to this page assume they logged in somehow). The link (which may be a form’s submit button or just a link) sends the request to the client, typically a servlet. The servlet in turn invokes the business logic code to get the client’s payment information, then requests the amount of outstanding orders, and adds all that up and returns the data to the presentation layer, which formats it as HTML. The business logic methods don’t invoke JDBC or networking API directly, rather they invoke methods in the data access and integration layer. This code contains the JDBC and SQL or other code to get the information from the EIS (say the Visa interface to the bank, PayPal, or we track it ourselves in a database).
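The balance example above can be sketched in plain Java as three small pieces, one per tier. This is only an illustration under assumed names: AccountDao, InMemoryAccountDao, BalanceService, and BalancePage are hypothetical, the rule "orders minus payments" is an assumption, and an in-memory map stands in for the EIS so the sketch runs without a database or servlet container.

```java
import java.math.BigDecimal;
import java.util.List;
import java.util.Map;

// Data access and integration tier: hides how the EIS is reached.
// A real implementation would hold the JDBC/SQL code; this is an in-memory stand-in.
interface AccountDao {
    BigDecimal paymentsFor(String customerId);
    List<BigDecimal> outstandingOrdersFor(String customerId);
}

class InMemoryAccountDao implements AccountDao {
    // Hypothetical sample data standing in for the EIS.
    private final Map<String, BigDecimal> payments =
            Map.of("alice", new BigDecimal("20.00"));
    private final Map<String, List<BigDecimal>> orders =
            Map.of("alice", List.of(new BigDecimal("30.00"), new BigDecimal("45.50")));

    public BigDecimal paymentsFor(String customerId) {
        return payments.getOrDefault(customerId, BigDecimal.ZERO);
    }

    public List<BigDecimal> outstandingOrdersFor(String customerId) {
        return orders.getOrDefault(customerId, List.of());
    }
}

// Business logic tier: knows the rules, never touches JDBC or sockets directly.
class BalanceService {
    private final AccountDao dao;

    BalanceService(AccountDao dao) {
        this.dao = dao;
    }

    // Assumed rule for this sketch: balance owed = outstanding orders minus payments.
    BigDecimal balanceOwed(String customerId) {
        BigDecimal total = BigDecimal.ZERO;
        for (BigDecimal order : dao.outstandingOrdersFor(customerId)) {
            total = total.add(order);
        }
        return total.subtract(dao.paymentsFor(customerId));
    }
}

// Presentation tier: formats the result as HTML (the role a servlet or JSP would play).
// A real page would also HTML-escape customerId before printing it.
class BalancePage {
    static String render(String customerId, BalanceService service) {
        return "<p>Balance for " + customerId + ": "
                + service.balanceOwed(customerId) + "</p>";
    }
}
```

Calling BalancePage.render("alice", new BalanceService(new InMemoryAccountDao())) produces the HTML fragment for one customer. Because BalanceService depends only on the AccountDao interface, the in-memory class could later be swapped for a JDBC-backed one without touching the other two tiers.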

There are many different ways to organize this code. The common way is to use Enterprise JavaBeans (EJBs) in both these layers, but of course it may be better to use a different technology or organization in any given case. Knowing when to use which technology needs a lot of background knowledge and experience, which is why enterprise computing is so hard to learn!

Scalability, Reliability, and Availability

You won’t get a lot of business done with a single client talking to a single server. In some large-scale systems, it is hard to predict the number and behavior of end users.

Scalability refers to a system’s ability to support rapidly increasing numbers of users. [From Wikipedia:] Scalability is a desirable property of a system, a network, or a process, which indicates its ability to either handle growing amounts of work in a graceful manner, or to be readily enlarged. For example, it can refer to the capability of a system to increase total throughput under an increased load when resources (typically hardware) are added.

The intuitive way to scale up the number of concurrent sessions handled by a server is to add resources (memory, CPU or hard disk) to it. To scale vertically (or scale up) means to add resources to a single node in a system. This typically involves the addition of CPUs or memory to a single computer. Such vertical scaling of existing systems also enables them to leverage virtualization technology more effectively, as it provides more resources for the hosted set of operating systems and application modules to share.

Vertical scaling has limits. You typically run one JVM per server, with many threads (one per request). However this design only scales up to one to four thousand concurrent requests, no matter how many CPUs are available. Also, 32-bit systems cannot use more than 2 GB of RAM per process (per JVM). Tasks that use I/O heavily can only use so much CPU power; the bottleneck becomes the I/O. Running multiple JVMs per server is possible to make use of additional RAM but leads to other problems, such as resource sharing, singletons, synchronization, etc.

To scale horizontally (or scale out) means to add more nodes to a system, such as adding a new computer to a distributed software application. An example might be scaling out from one web server system to three. As computer prices drop and performance continues to increase, low cost commodity systems can be used for high performance computing applications such as seismic analysis and biotechnology workloads that could in the past only be handled by supercomputers. Hundreds of small computers may be configured in a cluster to obtain aggregate computing power which often exceeds that of traditional RISC processor based scientific computers.

Clustering or grid computing are horizontal scaling alternatives that allow a group of servers to share tasks and operate as a single server logically, enabling the system to serve multiple clients at the same time.

Note you may have a cluster of web servers and a single application server, a single web server and a cluster of application servers, or (independent) clusters of each.

This approach requires middleware, software that can direct incoming requests from users to available thin clients, and route thin client requests to available servers. This is called load balancing. A good load balancer/middleware is very complex and most people buy these instead of writing their own. A server running middleware, which in turn invokes your application code, is called an application server.

A typical picture of an enterprise application cluster looks like this: [figure not reproduced]

Clustering is a popular technology that provides highly available and scalable services, with fault tolerance. The lack of details in the Java EE specification means different vendors implement clustering differently. Some popular products include the BIG-IP™ load balancer from F5.com. Tangosol’s Coherence™ provides a distributed data management platform which can be embedded into most popular Java EE containers to provide a clustering environment. Coherence also provides a distributed cache system which can share Java objects among different JVM instances effectively. See www.tangosol.com/. Other solutions include Terracotta DSO and JavaSpaces.

Additionally a single-server solution (add memory and CPU) for scalability is not robust due to a single point of failure. Mission-critical applications such as banking, billing, medical systems, etc., cannot tolerate any service outages. These services must always be accessible with reasonable and predictable response times. Clustering is a solution to achieve this kind of high availability by providing redundant servers in the cluster in case one server fails to provide service.

Clustering in the Web tier is the most important and fundamental type of clustering. Web clustering techniques include web server load balancing and HTTP session failover.

“JavaGroups” is currently the communication layer of JBoss and Tomcat clustering, and provides core features such as “group membership protocols” and “message multicast”, which are very useful in making clustering work. For more information about JavaGroups see www.jgroups.org/javagroupsnew/docs/.

Another common scalability problem is the database management system. This is the most common bottleneck for most enterprise and Web 2.0 applications, because the database is normally shared by all the JVM threads. So the efficiency of database access and the isolation levels between database transactions will affect scalability significantly. We have seen a lot of projects where most of the business logic resides in the database as stored procedures, while the Web tier is kept very lightweight, performing only simple data filtering and invoking the stored procedures in the database. This architecture causes a lot of scalability issues as the number of requests grows.

Question: What issues arise with a clustered design that you don’t have with a simple application? [Adapted in part from “Mastering Enterprise JavaBeans” by Ed Roman et al., (c) by Wiley.]

· Remote Method Invocation (RMI), so client code can call server code.

· Load balancing. This is one of the key technologies behind clustering, which obtains high availability and performance by dispatching incoming requests to different servers. The middleware must be able to know if some application server is busy and direct the client request onto a different application server.

A load balancer could be a hardware product with SSL accelerators such as the F5 Load Balancer, or just another web server with load balancing plug-ins. Even a simple Linux box with iptables (packet filter firewall) can load balance very well.

Dispatching can be implemented with round-robin, random, source-IP based, or work load based algorithms.

In addition to dispatching requests, a load balancer should perform some other important tasks, such as “session stickiness” to have a user session live entirely on one server and a “health check” (or heartbeat) to prevent dispatching requests to a failing server. Sometimes the load balancer will participate in the fail-over process (see below).
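As a rough illustration of the dispatching ideas above, here is a toy balancer that combines round-robin dispatch, session stickiness, and a health flag per server. The class name RoundRobinBalancer and the server names in the usage note are made up for this sketch; a real load balancer would drive markDown and markUp from a periodic heartbeat rather than from explicit calls.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy dispatcher: round-robin over healthy servers, with sticky sessions.
class RoundRobinBalancer {
    private final List<String> servers;
    private final Set<String> down = new HashSet<>();           // servers failing the health check
    private final Map<String, String> sticky = new HashMap<>(); // session ID -> assigned server
    private int next = 0;                                       // index of the next round-robin candidate

    RoundRobinBalancer(List<String> servers) {
        this.servers = new ArrayList<>(servers);
    }

    // In a real balancer these would be called by the heartbeat monitor.
    synchronized void markDown(String server) { down.add(server); }
    synchronized void markUp(String server)   { down.remove(server); }

    // Picks a server for this session; returns null if every server is down.
    synchronized String route(String sessionId) {
        String assigned = sticky.get(sessionId);
        if (assigned != null && !down.contains(assigned)) {
            return assigned;                        // session stickiness
        }
        for (int i = 0; i < servers.size(); i++) {  // round-robin, skipping unhealthy servers
            String candidate = servers.get(next);
            next = (next + 1) % servers.size();
            if (!down.contains(candidate)) {
                sticky.put(sessionId, candidate);   // new session, or fail-over of an old one
                return candidate;
            }
        }
        return null;
    }
}
```

With servers app1, app2, and app3, the first two sessions land on app1 and app2, and a repeat request for the first session returns to app1. If app1 is then marked down, that session is reassigned to the next healthy server. Note that this sketch only moves the routing; the session data itself still has to be available on the new server.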

· What if a web or application server crashes? The middleware must be able to restart (or at least back out of) any transactions and user sessions with a different server. This is known as transparent fail-over. There is usually a time delay associated with this, measured in seconds (it could be minutes if this feature isn’t carefully designed). Note that data corruption must be avoided even if a server fails in the middle of a session; this is called fault tolerance.

· Transaction processing. If you use files or lightweight databases, you need to worry that two or more sessions might simultaneously access the same piece of data. Even with powerful back-end databases you need to be careful. This is because DBs use connections to separate transactions, and as these are slow to set up, you normally use a technique known as resource pooling to reuse connections. Also, the user session might start on one application server (using one DB connection) and finish on another (using a different DB connection).

· Application server state. Ideally all persistent information is stored in the EIS so nothing is lost if the session moves to a different server (for instance when one server crashes). However this design is slow and causes a lot of network traffic to the EIS resources. So, if you store some information on the application server, does it need to be copied (replicated) on all servers? When a server restarts, how is that information loaded/initialized?

Just about every web-based application has some session state, such as remembering whether you are logged in or the contents of your shopping cart. Because the HTTP protocol is itself stateless, session state needs to be stored somewhere and associated with your browsing session in a way that can be easily retrieved the next time you request a page from the same web application. This can be done in several ways; the best is to store a session “cookie” in the browser. This cookie is sent to the server with every HTTP request. A session cookie expires when the browser session terminates (that is, when the browser closes). Also the servlet container (web server) can associate a timeout with sessions.

When a browser visits a stateful web application this application creates a session object in memory to store information for later use. It sends the browser a globally unique HTTPSession ID cookie which can identify this session object in the cluster. There are methods to add and access data in a session object.

Other options include using complex and long URLs that contain session information, or hidden form fields.

Note the problem of load balancing and fail-over with session data; if the session can be dispatched to a different server next, where is the session data kept? The middleware must make one or more copies of the session objects on different servers, and keep them synchronized as changes occur. (Using the EIS DB is simple but very slow.) Tomcat 5 copies all session objects to all servers on all updates. JBoss and other servlet containers follow more efficient approaches of paired servers or paired sessions, so each session object is backed up onto one other server. IBM’s WebSphere uses a special high-speed central DB for sessions.

No matter what the choice you may need to configure (“tune”) the frequency of session replication, and the granularity (whole session or just the modified bits).

· Deployment. How can you cleanly update your application code on all the servers without stopping all of them at once?

· Clean shutdown of application server. When a server must be stopped for updates or other maintenance, you don’t want to abort the currently running sessions.

· Logging and auditing. You need a central log for determining if problems exist (and if so, what went wrong), and for auditing (information about what was done, by whom, and when). Auditing is required (compliance) by various legal, accounting, and regulatory bodies (FBI, SEC, FTC, ...). Be careful about logging user identification data: don’t log it if possible, else use data blinding (substitute unique random strings for user IDs: you can still relate all messages to that user but don’t know which user it is). Use encryption to protect logs from modification (a digital signature per entry, plus a manager signature: dual controls) and from unauthorized access.
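The data blinding idea can be sketched in a few lines. LogBlinder is a hypothetical helper invented for this illustration: it hands out one stable random pseudonym per user ID, so all log entries for a user can still be related to each other without the log revealing who the user is.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Replaces real user IDs with stable random pseudonyms before they reach the log.
class LogBlinder {
    // Real ID -> pseudonym. This mapping is itself sensitive and must be kept out of the log.
    private final Map<String, String> pseudonyms = new ConcurrentHashMap<>();

    // Returns the same pseudonym every time for the same user ID.
    String pseudonymFor(String userId) {
        // UUID.randomUUID() draws from a cryptographically strong random source.
        return pseudonyms.computeIfAbsent(userId, id -> "u-" + UUID.randomUUID());
    }

    // Builds a log line that carries the pseudonym instead of the real ID.
    String logLine(String userId, String event) {
        return "user=" + pseudonymFor(userId) + " event=" + event;
    }
}
```

In a cluster the mapping would have to be shared (or derived with a keyed hash) so that every server produces the same pseudonym for the same user; this sketch keeps it in one JVM's memory only.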

· Systems management (IT). Who gets paged (and how) when a problem occurs, or in the event of a disaster (power loss, fire, ...)?

· Distributed applications. One object’s methods may invoke another object’s methods. With distributed applications, the second object may be loaded in the same JVM, on the same server but loaded by a different JVM, or loaded on a different server (and thus a different JVM). So, how do you make sure to not load the same class twice? What happens if two copies of some class are loaded (and you change static properties), or you create too many copies of some object (e.g., a DB connection to your Oracle server)? Given the many different vendors of application servers, where are your classes put so that the JVM classloader can find them? (This is why you often use a custom ClassLoader.)

· Multi-threading issues. Since each web server will likely have to handle several user sessions, as will each application server, you need to support multi-processing or multi-threading.

· Support message oriented middleware. Some EIS are web services that you talk to, not by RMI or by some proprietary network protocol, but by passing (simple) messages back and forth.

· Resource pooling. The obvious candidate is database connection pooling, but other resources that take time to set up and initialize, and to tear down when done, can often benefit from having a bunch of such resources pre-created in a pool (a collection of some kind), ready to go. When you need (for instance) to send a SQL query to the database, you grab a pre-built connection from the pool, pass it the SQL code, and run it. After the connection returns the results, you clean up the connection and return it to the pool.
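A minimal sketch of the pooling idea follows. SimplePool is a hypothetical class written for these notes, not a real connection pool; production code would normally use the pool supplied by the application server. It shows the three essentials: pre-create the resources once, borrow one with a timeout, and hand it back when done.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Fixed-size pool of pre-created resources (for example, database connections).
class SimplePool<T> {
    private final BlockingQueue<T> idle;

    SimplePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());   // pay the setup cost once, up front
        }
    }

    // Borrows a resource, waiting up to timeoutMillis; returns null if none becomes free.
    T acquire(long timeoutMillis) {
        try {
            return idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return null;
        }
    }

    // Returns a borrowed resource so another caller can reuse it.
    void release(T resource) {
        idle.offer(resource);
    }

    // Number of resources currently idle in the pool.
    int available() {
        return idle.size();
    }
}
```

Callers should release in a finally block so a resource is returned even when the work throws. The sketch does not validate or clean up a resource on release; a real pool would check that a connection is still usable before handing it out again.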

· Security. You must validate all user input and sanitize it (SQL/XML injection, XSS): be careful of HTML constructs such as “<”, “&”, “>”, other illegal characters, length restrictions, numerical limits, etc. User data used for filenames has extra restrictions: “..”, “/” and “\”, etc. Validate everything, not just user input fields but cookies and all data. After a user has been validated, issue a new session ID. Invalidate session IDs at logout (or after timeout). Use a quality source of random numbers (i.e., have a large entropy pool). Avoid native methods (no security). Store sensitive data (credentials) securely encoded, even in RAM. Hide all low-level error messages from the user (show only a generic message; it is OK to log the full message). Mark all scaffolding/testing code with an annotation, to easily find and remove it before production release. Use BigInteger/BigDecimal to convert string input data to numbers, then check limits before converting to int/float so there are no over/underflow errors.
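The last point above (convert with BigInteger, check limits, then narrow) might look like the following. SafeParse and parseIntInRange are hypothetical names, and the 20-character length cap is an arbitrary choice for this sketch.

```java
import java.math.BigInteger;

class SafeParse {
    // Parses text as an int in [min, max]; throws IllegalArgumentException otherwise.
    static int parseIntInRange(String text, int min, int max) {
        if (text == null || text.length() > 20) {   // length restriction first
            throw new IllegalArgumentException("missing or too long");
        }
        BigInteger value;
        try {
            value = new BigInteger(text.trim());    // cannot overflow, whatever the input
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("not a whole number");
        }
        if (value.compareTo(BigInteger.valueOf(min)) < 0
                || value.compareTo(BigInteger.valueOf(max)) > 0) {
            throw new IllegalArgumentException("out of range");
        }
        return value.intValue();   // safe: the range check guarantees it fits in an int
    }
}
```

For example, parseIntInRange("42", 0, 100) returns 42, while "99999999999" and "12abc" are both rejected with an exception instead of silently wrapping around or being partly parsed. The error messages here are deliberately generic, in line with the advice about hiding low-level details from the user.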

· Caching. All sorts of data are frequently needed by applications. It makes sense to have copies on each server for quick access. But these caches need to be updated when the data in the EIS changes.

· Business tier object management (lifecycle). These objects need to be created when the client makes a request and/or a session is started. They need to be destroyed when no longer needed.

· Persistence of objects. An object representing some user’s shopping cart needs to persist for a long time (minutes). What happens if the application is restarted on a different server, or if the server crashes? Such objects need to be persistent. You could of course write your business logic so that every change by the user is immediately sent to the database, but such a design causes way too many database connections! If too many users try to use your system at once you will overwhelm your DB and/or network, and it may cause data loss, dropped sessions, or crashed DB or servers.

Another problem is exactly how to store objects into relational database tables (known as O/R Mapping, or object/relational mapping). With Java EE 5 and newer, JPA (Java Persistence API) does this. You only need to define objects, then run an SQL-like query language to populate the fields of these objects or to store any changes made back to the DB. (Of course someplace you need to define the DB, username, password, and other connection information.)

A common question is why bother using all this ORM stuff if you already know SQL? One reason is that SQL isn’t portable; it isn’t really standardized except for the most basic uses. Using some ORM also gives you:
          - caching (first level & second level)
          - lazy loading of related objects
          - transaction support (ejb3, transaction context propagation)
          - better/cleaner query language
          - fetching of related objects in a join, good for performance

But, many expensive DBMSs offer useful features not available if using ORM, such as Fine Grained Access Control, hierarchical queries, triggers, stored procedures, and Analytic Functions. However most ORMs allow one to mix in SQL when needed.

On any project you must weigh factors such as portability, need for advanced features, local expertise, and ease of development.

A number of technologies for this are available: JDBC (no mapping, straight SQL), JDO (mostly a dead project now), PJO (?), Entity EJB (either bean-managed persistence, BMP, or container-managed persistence, CMP; this EJB 2.x standard has been replaced with JPA in EJB 3), Hibernate, Oracle TopLink, and others. Each has unique strengths and weaknesses, so there is no one best solution.

Example of JPA code:

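import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.Id;
import javax.persistence.PersistenceContext;

@Entity
public class Customer {
    @Id private Long id;
    private String firstname; private String lastname;
    private String telephone; private String email;
    private Integer age;
    // constructors, getters, setters
}

//...

// Injected by the container; "JPAModel" names a persistence unit.
@PersistenceContext(unitName="JPAModel")
EntityManager em;

// Create and store a new entity (inside a transaction).
Customer cust = new Customer(...);
cust.setLastname("Miller");
em.persist(cust);
// ...
// find() looks an entity up by its primary key (the @Id field);
// customerId stands for a key value obtained elsewhere.
Customer found = em.find(Customer.class, customerId);
if (found != null) {
    found.setAge(21); // managed entity: the DB is updated automatically at commit
}
// ...
if (found != null) em.remove(found);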

The @PersistenceContext annotation takes a parameter (unitName) that refers to a specific persistence unit (“JPAModel” in the code above). Persistence units are declared in the persistence.xml file and contain information such as the database to use, the JDBC driver, and any implementation-specific properties. Note the table and column names are automatically derived from the Customer class (but may easily be overridden if desired).

· Naming services. In a clustered environment, how do different objects find each other? How does the client “know” where to find the server? How do the various business logic objects “know” which EIS to use and how to contact it? How do you refer to some EJB or other object, when that object may not be loaded in the local JVM but rather some other server’s JVM? An analogy is “how does a web browser know where some website such as www.wpollock.com is?”

The obvious answer is to hard-code that information everywhere. Then when your DB changes its IP address you have to rewrite and then deploy all new versions of your application. Not a good idea! The answer is a name-to-resource database (just like a browser uses DNS to translate a name such as “wpollock.com” to an IP address). Your objects can “look up” some resource by name, and then use the returned information to access that resource. This is the JNDI API. Note some servers don’t use a global (cluster-wide) JNDI DB but rather one per server (Sun and IBM). However these use special agent software to achieve the same effect.
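To make the name-to-resource idea concrete, here is a toy registry. It is not the JNDI API (with real JNDI you would call lookup on a javax.naming.InitialContext supplied by the application server); NameRegistry and the names and addresses in the usage note are invented for this sketch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy name-to-resource registry illustrating the idea behind a naming service.
class NameRegistry {
    private final Map<String, Object> bindings = new ConcurrentHashMap<>();

    // Associates a logical name with a resource, replacing any earlier binding.
    void rebind(String name, Object resource) {
        bindings.put(name, resource);
    }

    // Looks up a resource by name and checks its type; fails loudly if the name is unknown.
    <T> T lookup(String name, Class<T> type) {
        Object found = bindings.get(name);
        if (found == null) {
            throw new IllegalStateException("nothing bound to " + name);
        }
        return type.cast(found);
    }
}
```

Business code asks only for a logical name such as "jdbc/OrdersDB". When the database moves, an administrator rebinds that one name to the new address and no application code has to be rewritten or redeployed.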

Enterprise applications are often quite large: either a large number of users (so performance matters), a large geographical area, a large amount of data to organize, or some combination of these factors. Such applications end up as very large applications (in terms of code size). Because of the size it is quite common to purchase some parts “off the shelf” from vendors such as Sun, IBM, BEA, Oracle, etc. This is true of the middleware, load balancing, web servers, backend databases, and especially common application services (transactions, persistence, logging, creation/destruction of business objects, ...). Other parts (the presentation web pages/forms/servlets/JSPs, business logic, and sometimes EIS tier code) will be developed by independent (teams of) programmers.

Many of the issues above (and many others) are common enough that a market was created for “application servers”, systems that include middleware and support for persistence of objects, transactions, load balancing, systems management, etc. To date there have been about 50 such products you can purchase. In the past every one of these application servers had proprietary interfaces. So your business logic classes (and other code) had to be written for one specific brand of application server, or you had to write all that code yourself. If you picked the wrong company or product, you probably would need to start the application over nearly from scratch.

The answer to the need for off-the-shelf services that don’t have proprietary specifications and interfaces is J2EE. J2EE is a specification, not a product. So if the database product you bought yesterday is J2EE compliant you can be confident it will work with the J2EE application server you buy tomorrow. And if you develop J2EE business logic components (sets of classes), you know they can be deployed without change in any J2EE compliant application server.

Sun has changed the names of the Java platforms. There will be no more “J2EE”. The full name of the platform is now “Java Platform, Enterprise Edition”, abbreviated “Java EE”. The next release will be “Java EE 5”. Everyone still calls it “J2EE”.

Some of the most common J2EE specifications include EJBs (Enterprise JavaBeans), servlets, JSPs (JavaServer Pages), JDBC, JavaMail, JMS (Java Message Service), etc. Often the data is passed around using XML. Security technologies such as user authentication and data encryption (and IPsec for network security) are common.

Containers are standardized runtime environments that provide specific services to components. Components can expect these services to be available on any J2EE platform from any vendor. Thus we talk about servlet (or web) containers, or full J2EE containers (usually referred to as EJB containers).

EJBs come in three different types: session beans, entity beans (now replaced with JPA), and message-driven beans. Note an EJB is different from a JavaBean. Not all applications use all types of EJBs but session beans are common.

Because the business logic deployed in an application server is so often a set of EJBs, an application server compliant with J2EE is also called an EJB container. (Similarly, the web server part of the client code is often deployed in a Web application container or servlet container, which is just a web server that knows about servlets, JSPs, and WARs.) The client computer where application code runs (especially for thick clients) is sometimes referred to as an application container.

A typical enterprise application is used this way: a user connects to the client part of an enterprise application, starting a session. The client is often implemented as part of a web server with a servlet or JSP page. Once the session is established the client and application exchange messages or even objects (which are serialized and sent as regular messages) using XML and HTTP or other network protocols. The application often gets data from other servers, perhaps a central database server. Updates (orders, reservations, ...) are handled with transaction management. When finished the session is terminated (or times out).

To fully understand enterprise applications (well enough to design them properly) you need to master many technologies, including web server technology, database technology, networking technology, security and encryption technology, multithreading/clustering (for high volume and good performance), and others too. You don’t have to be able to write code for all these technologies, but you need to know enough to decide which technologies you need for a given application, how to buy a product with the required features, and how to write a bit of “glue” code to tie the different pieces together.

Sun also provides a series of classes that can be used to implement parts of a J2EE enterprise application (such as JDBC to talk to legacy databases using SQL, messaging interfaces, etc.), as well as other useful bits such as deployment tools.

A competing technology is the .NET series of standards from Microsoft. But these are not being widely adopted (at least not yet outside of MS).

J2EE Development Issues

One problem today is that the Internet is a hostile place. The data must be encrypted when sent, and the sender (and recipient) must be verified using special security methods. Since many parts of the Internet are protected by firewalls today, many enterprise applications with thick clients use port 80 (the standard web server port) to go through the firewalls. This creates additional security problems.

Another problem is that not all parts of an enterprise application are standardized through J2EE. A common example is the configuration of the application server or web server. Newer editions of the J2EE standard provide standards, tools, and Java classes to help solve these problems.

In spite of all the care that makes J2EE applications work on nearly any server, some of the APIs used are “low-level”: full of messy details that are difficult to make correct, efficient, or both, or are just plain repetitive and boring for most programmers. To help with this a number of technologies were created, not really part of the J2EE standard but built on top of these APIs and specifications, that are at a “higher” level and hopefully easier to learn, use, and reuse. These are called frameworks. Examples include Struts, JavaServer Faces, and Apache Tapestry, to simplify web/presentation layer development for user interfaces (so you can avoid servlets and non-standard custom JSP tag libraries); Hibernate, JDO, and others to manage persistent data (so you can avoid JDBC; however there is now JPA, the Java Persistence API, so these frameworks are less needed); Spring and Seam to simplify overall J2EE development (especially of CRUD apps, that is, create, read, update, delete); and others such as AspectJ (see Wikipedia for this). Other frameworks include JBI (servicemix.org) and Rails (for Ruby). (Rails doesn’t scale to clusters though; all state is in a single DB.)

Design patterns such as “Singleton” use static variables to share state among many objects, for example to track the number of users. Using static variables works well on a single server but fails in a cluster. Each instance in the cluster maintains its own copy of the static variable in its own JVM, thereby breaking the mechanism of the singleton pattern. A preferable way in a cluster is to store all state data in a database.
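
As a minimal sketch of the problem (the class name UserCounter and its methods are made up for illustration), here is the kind of singleton that works on one server. Nothing in the code is wrong; the trouble is only that the static field lives in one JVM, so every node in a cluster would report its own, different count.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Classic singleton: one instance per JVM (strictly, per class loader). */
final class UserCounter {
    // The static field makes the state "global", but only within this JVM.
    private static final UserCounter INSTANCE = new UserCounter();

    private final AtomicInteger users = new AtomicInteger();

    private UserCounter() { }

    static UserCounter getInstance() { return INSTANCE; }

    /** Records a login and returns the count as seen by THIS JVM only. */
    int userLoggedIn() { return users.incrementAndGet(); }

    int currentUsers() { return users.get(); }
}
```

With three server instances in a cluster there would be three UserCounter objects, each seeing only the logins routed to its own node, which is why the shared count belongs in the database (or some other shared store) instead.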

Caching state is a big problem. Putting all state in the backend DB works but can be very slow. Storing some state in the presentation or application tier can greatly enhance performance, but stateful session beans cause a lot of trouble with clustering, failover, and performance (the non-functional requirements, or NFRs). One technique is to use only stateless session beans, and hold any session state in the servlet containers (the presentation tier). Another is to replicate the state on one or more (or all) application and/or web app servers. This requires more sophisticated middleware/app servers.

Similar problems arise in a cluster if you store any data in files. Besides not using files, consider a SAN (storage area network) to provide a central file service all servers share.

Remember that although there are a lot of APIs and technologies as part of J2EE, all are optional! You only use what your application needs. Usually there is more than one way to design the application too. Knowing the best (or at least a good) design takes experience.

The specific APIs mandated for Java EE 5 are:

· Enterprise JavaBeans (EJB) 3.0

· Servlet 2.5

· JavaServer Pages (JSP) 2.1

· Java Message Service (JMS) 1.1

· Java Transaction API (JTA) 1.1

· JavaMail 1.4

· JavaBeans Activation Framework 1.1

· J2EE Connector Architecture 1.5

· Web Services for J2EE 1.1

· Java API for XML-based RPC (JAX-RPC) 1.1

· Java API for XML Web Services (JAX-WS) 2.0

· Java Architecture for XML Binding (JAXB) 2.0

· SOAP with Attachments API for Java (SAAJ) 1.3

· Java API for XML Registries (JAXR) 1.0

· Java 2 Platform, Enterprise Edition Management API 1.0

· Java 2 Platform, Enterprise Edition Deployment API 1.1

· Java Authorization Service Provider Contract for Containers 1.0

· Debugging Support for Other Languages (JSR-45)

· Standard Tag Library for JavaServer Pages (JSTL) 1.2

· Web Services Metadata for the Java Platform 1.0

· JavaServer Faces 1.2

· Common Annotations for the Java Platform 1.0

· Streaming API for XML (StAX) 1.0

· Java Persistence API 1.0

Issues for selecting J2EE application servers

J2EE branding, pluggable JRE, conversion tools (old J2EE to new J2EE), 3rd party JDBC driver support, lazy-loading (load beans on demand), deferred DB writes (don’t update the DB until commit), data cache (some servers use pass-through so every access results in a slow DB operation), presentation-EJB integration (one server with one JVM doing both), scalability references, high availability, security (JAAS support), IDE, UML editor integration, intelligent load balancing, transparent fail-over (for stateless sessions), clustering with transparent fail-over, support for optional J2EE features: Java Management Extensions, J2EE Connector Architecture (integration of EIS components), hot deployment (no reboot needed to re-deploy), EJB instance pooling and automatic generation, clean shutdown, some useful EJB components included, ...

Almost every J2EE project uses object caching to improve performance, and all popular application servers provide some type of caching. Caching is not currently part of the Java EE specs, so every vendor does this differently. Note these caches are typically designed for a standalone environment and can only work within one JVM instance.

Caching is needed because some objects are so complex that creating a new one is expensive (e.g., it requires DB lookups across the Internet). So the server maintains a pool and reuses the object instances rather than creating new ones.

You gain performance only if maintaining the cache is cheaper than creating the objects. In a clustered environment each JVM instance will maintain its own copy of the cache, which must be synchronized with the others to provide a consistent state across all server instances. Sometimes this kind of synchronization results in worse performance than not caching at all!
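
Since caching is vendor-specific, the following is only a sketch of the general idea, not how any particular application server does it. It uses the standard LinkedHashMap in access order to keep a bounded, least-recently-used cache; the class name LruCache is an assumption for this example, and the class is not thread-safe as written.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** A tiny least-recently-used cache; lives entirely inside one JVM. */
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int capacity;

    LruCache(int capacity) {
        // accessOrder = true: iteration order runs from least to most recently used
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    /** Called by LinkedHashMap after each insertion; true evicts the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

For example, with a capacity of 2, putting "a" and "b", reading "a", and then putting "c" evicts "b". In a cluster every JVM would hold its own LruCache, so an update on one node is invisible to the others unless the caches are synchronized somehow, which is exactly the cost discussed above.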

JBoss began as an open source EJB container project six years ago and has become a fully certified J2EE 1.4 application server with the largest market share, competitive with proprietary Java application servers in features and quality. JBoss isn’t just a J2EE server. It has a dynamic architecture which allows one to alter the services to make J2EE work the way you want, or even throw J2EE away completely. Sun’s J2EE SDK contains PE Server, another J2EE application container. Others include Resin, GlassFish (the open source version of Sun’s PE server), Geronimo, CarbonAS, Oracle’s WebLogic, and WebSphere. For learning purposes I recommend Sun’s free GlassFish v2. (JBoss v5 is very popular.)

Oracle announced its purchase of Sun Microsystems in 4/2009, so the future of GlassFish is uncertain. (As is that of MySQL, another Sun technology.)

Perhaps the most important aspect of clustering is memory replication. In memory replication, information about the state of user sessions with deployed applications in a server instance is replicated to a peer instance in the cluster. With the glassfish v2 J2EE container, each instance in a cluster sends session-state information to the next instance in the cluster, its replication partner. The order of instances in a cluster is based on the names of the instances. As the session-state information for a deployed application is updated in any instance, it is replicated around the cluster.

Glassfish also has a load balancer plug-in. The load balancer is responsible for distributing workload among multiple server instances. It also takes part in rerouting a session when an instance fails. If a GlassFish v2 instance in a cluster fails, the load balancer reroutes the sessions interacting with that instance to another instance in the cluster.

Java Management Extensions

[ From: http://www.theserverside.com/articles/article.tss?l=JMXWebApps ]

The Java Management Extensions (JMX) specification defines a standard way to manage applications, as well as application servers and other infrastructure software. The point of this is to improve productivity of your support (and development) staff, and to improve efficiency (reducing downtime of servers, and other efficiency measures). JMX makes it possible to manage and monitor applications using a choice of management systems and consoles, including SNMP (simple network management protocol, a common standard) consoles. It also simplifies the task of making applications manageable, i.e. management instrumentation, and enables improved management of deployed applications in the enterprise. JMX provides a lot of management value with little effort.

While management tasks should be part of the design requirements, it isn’t always possible to anticipate these needs. Many only become apparent once the application is in production, and by then it is hard to change the application software. This is where business-specific application management can be very useful.

The first step in managing applications is monitoring availability and performance. In order to minimize downtime and avoid disruption to business operations, you must monitor and measure the availability of applications (and their servers). Also the performance of the application needs to be tracked. This ensures the application performs to the service levels needed, and enables a quick response if the application is under-performing. Performance is usually measured by counting key business method invocations or user transactions, in a given interval of time (say per 5 minutes).

Beyond monitoring availability and performance, managing business-specific aspects of the application can be helpful in maximizing productivity. For example, a banking application may require the ability to monitor cash withdrawals to ensure sufficient cash at a required ATM or branch office. Another example is the need to monitor inventory levels for an on-line store. While some of these management needs may be anticipated and designed into the application, many of these needs arise from production use of the application.

Common web applications may include financial applications like banking, insurance, or brokerage applications, as well as a number of other intranet applications like HR, call-center, and other applications. All of these can benefit from management, to some degree or another. Work-flows and click-trails can be studied to make the application’s interface easier to navigate, marketing and other data can be collected to base business decisions on, inventory levels can be monitored, etc.

Here’s a simple example: an application that allows users to transfer money between accounts using a set of JSP pages. To complete the transaction, the user navigates through four pages: a start page, a select transaction details page, a confirm transaction page, and a transaction complete page.

In this example, as in many web applications, the user navigates a series of pages to accomplish a task or transaction. Depending on the application, it may be useful to know whether users are having difficulty completing these tasks. For example, if a number of users start a transaction but fail to complete it, it may indicate some work needs to be done on making navigation easier. You can use the session tracking capabilities available with JSP to track the data we need on how users are doing.

In addition to tracking user success in completing their transactions, for this specific application it may be useful to track aspects of the money transfer. For example, we may wish to send a management notification when the transfer amount exceeds a threshold.

The JMX architecture is based on a manager and agent model. The agent typically resides with the application server and provides management instrumentation, i.e. application data and control. The management console collects and presents management information from one or more agents on the network.

JMX management instrumentation is a set of Java components called MBeans. The MBean components are Java classes that fit design patterns based on the type of MBean. The defined MBean types are Standard, Dynamic, Open, and Model MBeans. JMX agent MBeans are all registered with a common MBeanServer interface, which remote management consoles access to get management information from the agent. (A bit like RMI.)

Standard MBeans provide a defined interface that follows a naming convention. The interface class is named by adding “MBean” to the name of the standard MBean class. For example if you call a standard MBean MyJspStats, then the MBean interface will need to be named MyJspStatsMBean.
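
Here is a minimal sketch of that naming convention using the JMX classes that ship with the JDK. The attribute names, the wrapper class JspStatsDemo, and the object name "example.webapp:type=MyJspStats" are made up for illustration; the two types are nested only so that everything fits in one source file, and the MBean interface is declared public because current JDKs require that.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class JspStatsDemo {
    /** Management interface: its name is the MBean class name plus "MBean". */
    public interface MyJspStatsMBean {
        long getStartedCount();     // read-only attribute "StartedCount"
        long getCompletedCount();   // read-only attribute "CompletedCount"
        void reset();               // management operation
    }

    /** The standard MBean itself; the JSP pages would call the two record methods. */
    public static class MyJspStats implements MyJspStatsMBean {
        private final AtomicLong started = new AtomicLong();
        private final AtomicLong completed = new AtomicLong();

        public void transactionStarted() { started.incrementAndGet(); }
        public void transactionCompleted() { completed.incrementAndGet(); }

        @Override public long getStartedCount() { return started.get(); }
        @Override public long getCompletedCount() { return completed.get(); }
        @Override public void reset() { started.set(0); completed.set(0); }
    }

    /** Registers the MBean with the JVM's platform MBeanServer. */
    static ObjectName register(MyJspStats stats) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("example.webapp:type=MyJspStats");
            server.registerMBean(stats, name);
            return name;
        } catch (JMException e) {
            throw new IllegalStateException("could not register MBean", e);
        }
    }

    /** Reads an attribute the way a management console would: by name, via the server. */
    static Object read(ObjectName name, String attribute) {
        try {
            return ManagementFactory.getPlatformMBeanServer().getAttribute(name, attribute);
        } catch (JMException e) {
            throw new IllegalStateException("could not read " + attribute, e);
        }
    }
}
```

Once registered, the two counters show up in any JMX console attached to the JVM (jconsole, for instance), and comparing StartedCount with CompletedCount gives a rough measure of how many users abandon the money-transfer pages.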

Determining the size of an application (determining resources needed such as number of servers in a cluster)

You can simulate a system using queuing theory. A number of tools are available (none for free AFAIK) that allow you to adjust various parameters, such as the number of customers, time to process each session, etc., and graph the results in terms of response time and max load versus cluster size. Or you could hire a math magician to implement the queuing models and provide you with the numbers. (But playing with a GUI Java applet with sliders and textfields for input is more fun!)

Those numbers can be learned either by experience or by observing a similar system (e.g., the current website) and collecting statistics. (An interesting project is to write a program to parse a month’s worth of server logs, to determine the number of customers, number of sessions, number of page hits per session, and average length of time for each session and each page load.)
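
As a starting point for such a project, here is a small sketch that counts page hits per client. It assumes each log line begins with the client host or IP address followed by whitespace (as in the common access-log layout many web servers use); the class and method names are made up, and sessions and timing are left out entirely.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class AccessLogStats {
    /** Returns hits per client, keyed by the first whitespace-separated field of each line. */
    static Map<String, Integer> hitsPerClient(List<String> lines) {
        Map<String, Integer> hits = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;                           // skip blank lines
            }
            String client = trimmed.split("\\s+", 2)[0];
            hits.merge(client, 1, Integer::sum);    // add one hit for this client
        }
        return hits;
    }
}
```

The number of distinct keys in the result approximates the number of customers, although several users behind one proxy or NAT address will be counted as one.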

What constitutes acceptable response time varies per organization, but there are publicly available marketing results that show a page must be rendered in under X seconds or customers will go elsewhere. X is usually quoted as either 4.something or 8.something seconds, as far as I remember.
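
To make the relationship between load, cluster size, and response time concrete, here is a back-of-the-envelope sketch. It assumes the simplest textbook queuing model (each server is an independent M/M/1 queue and the load balancer splits traffic evenly), which is far cruder than what the simulation tools do; the class and method names are hypothetical.

```java
final class QueueEstimate {
    /**
     * Mean response time in seconds for one M/M/1 server:
     * 1 / (serviceRate - arrivalRate), both rates in requests per second.
     */
    static double meanResponseTime(double arrivalRate, double serviceRate) {
        if (arrivalRate >= serviceRate) {
            return Double.POSITIVE_INFINITY;    // saturated: the queue grows without bound
        }
        return 1.0 / (serviceRate - arrivalRate);
    }

    /** Smallest cluster size meeting the target, or -1 if none up to maxServers does. */
    static int serversNeeded(double totalArrivalRate, double serviceRate,
                             double targetSeconds, int maxServers) {
        for (int n = 1; n <= maxServers; n++) {
            // Assumption: traffic is divided evenly among the n servers.
            if (meanResponseTime(totalArrivalRate / n, serviceRate) <= targetSeconds) {
                return n;
            }
        }
        return -1;
    }
}
```

For instance, at 90 requests per second with servers that can each handle 50 per second, one server is saturated, two give a mean response time of 0.2 seconds, and three give 0.05 seconds. Note that adding servers never pushes the response time below a single server’s own service time (1/serviceRate), so some targets cannot be met by clustering alone.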

Now, if you can get acceptable response time on a single server, then you can avoid cluster issues and the project is either small or medium. If a cluster is needed, it is large. (Some experienced developers might say medium or large, but if you are new to this, it is large!)

Most useful websites require persistent storage and the common solution is to use a database system. If you can get acceptable performance with a single host providing both the web service and DB service, the project is small.

Usually a separate server will be needed for each, in which case you need to deal with networking and possibly encryption (if you use credit card numbers for instance). To me, these extra dimensions make the project medium or large.

Until you gain enough experience to estimate a project’s scope on your own, one option is to hire a "mentor" consultant who has the experience and who can get you started on your project with a good estimate of the resources needed. Often this is cheaper than hiring a consultant to do the project for you.

Depending on local expertise available, a small project need not be done in Java. However medium and larger projects are often easier if done using Java EE technology, which provides a (sometimes) simpler API for the common tasks needed. Usually a servlet container will do for small and medium projects, and an EJB container will make life easier if you have a large one. Also the extra frameworks available and common (Seam, Struts, Spring, Hibernate, ...) handle these tasks at an even higher level, and in theory that makes the project more manageable.

My experience is that learning too many frameworks for your first big (medium or large) project is unlikely to save time on that project, though it may save time on subsequent ones. Indeed, practicing re-doing an existing website using a different framework is a good way to learn the frameworks. This may be true even (or especially) for a small site that doesn’t really need the framework.

Synchronization Issues Affecting Scalability

[http://www.theserverside.com/tt/articles/article.tss?track=NL-461&ad=664753USCA&l=ScalingYourJavaEEApplications&asrc=EM_NLN_4588276&uid=3138114]

Let’s take a real world case as an example: an ERP system for manufacturing. When we tested its performance on one of our latest CMT servers (2 CPUs, 16 cores, 128 strands), we found the CPU usage was more than 90%. This was a big surprise, because few applications can scale so well on this type of machine. Our excitement lasted just 5 minutes before we discovered that the average response time was very high and the throughput was unbelievably low. What were these CPUs doing? Weren’t they busy? What were they busy with? Through the tracing tools in the OS, we found almost all the CPUs were doing the same thing, “HashMap.get()”, and it seemed that all CPUs were in infinite loops. Then we tested this application on diverse servers with different numbers of CPUs. The result was that the more CPUs the server had, the more likely this infinite loop was to happen.

The root cause of the infinite loop was an unprotected shared variable, a “HashMap” data structure. After adding the “synchronized” marker to all the access methods, everything was normal. (Sun engineers didn’t consider this a bug, but rather suggested the use of “ConcurrentHashMap”. So take this into consideration when building a scalable system.)
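
A small sketch of the suggested fix follows. It is not the ERP system’s code; the class SharedHitCounter and the page name are invented for illustration. Several threads update one shared map, and because the map is a ConcurrentHashMap and each update goes through its atomic merge method, no additional “synchronized” is needed and no updates are lost.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class SharedHitCounter {
    // Safe for concurrent use; a plain HashMap here could be corrupted by racing writers.
    private final Map<String, Integer> hits = new ConcurrentHashMap<>();

    void record(String page) {
        hits.merge(page, 1, Integer::sum);   // atomic read-modify-write for this key
    }

    int count(String page) {
        return hits.getOrDefault(page, 0);
    }

    /** Runs the given number of threads, each recording hitsPerThread hits; returns the total. */
    static int hammer(int threads, int hitsPerThread) {
        SharedHitCounter counter = new SharedHitCounter();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < hitsPerThread; i++) {
                    counter.record("/index.jsp");
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.count("/index.jsp");
    }
}
```

Compared with marking every access method “synchronized”, this lets threads working on different keys proceed in parallel, which matters on a machine with 128 hardware strands.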

Lecture 15 — Enterprise JavaBeans (EJBs)

[Adapted from "Mastering Enterprise JavaBeans" by Ed Roman et al., Wiley pub.]

A software component is some code (typically one or more classes, often in their own package) that implements a set of well-defined interfaces. It is sometimes possible to purchase software components that fulfill some need in your application, rather than write the software yourself. If you do end up writing the software yourself, you may be able to reuse it in another project someday. And even if not, writing your code to a set of interfaces is in general a very good idea.

An Enterprise JavaBean is a server-side component that can be deployed in a distributed multi-user environment.

Consider the pricing of goods and services you sell. There are base prices, quantity discounts, bundling discounts (buy both and save), preferred customer discounts, location discounts, taxes, and shipping and handling charges. If every part of your application (or of several applications) needs to compute prices, it makes a lot of sense to have a pricing component that can be used over and over.

When a client wants to use an EJB, it cannot invoke the EJB directly. This is because the EJB may not be running on that server but instead some JVM on another server. The client can only invoke a local object called a stub which acts as a proxy to the remote object and has the same (remote) interface. The stub is responsible for accepting method calls locally and delegating those method calls to the remote EJBs across the network. Stubs are running within the client JVM, and know how to look over the network for the real object through RMI/IIOP. The stubs can be created manually (and were in old Java EE versions), but a simple JNDI lookup can find EJBs and create the stubs.

A smart stub (such as found in JBoss) contains the list of target instances it can access, can detect any failure of the target instances, and also contains complex load-balancing and fail-over logic to dispatch requests to the targets. Also if the cluster topology changes (for example: new instances are added or removed), the smart stub can update its target list to reflect the new topology without manual reconfiguration.

If the web/servlet container and the EJB container are on the same server, they can share a single JVM. This means the stub can avoid any network access and is thus noticeably more efficient. If your cluster uses this collocation design (rather than the standard distributed design) you can use an EJB’s local interface rather than its remote interface.

Deployment Descriptors

To use software components you (usually) include them in an archive file, place the archive in the location the application server searches for components, and then configure them. Configuration is usually done through a deployment descriptor, which is an XML file that tells the server what the component is, what middleware services it needs, and how and when to deploy it. The deployment descriptor is a key piece of J2EE. Prior to this each middleware vendor provided a proprietary way to pass this information for each component.

In some cases an additional step is needed. You may need to run a middleware-vendor-supplied tool to generate client request interceptors. These are similar to the stubs of RMI. The request interceptor takes the client request, performs any middleware actions to provide required security, transaction, and persistence support, and forwards the request to the component.

Today there is no real market for enterprise software components (but look up Crystal Reports on the Internet).
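
The interceptor idea can be imitated in plain Java with a dynamic proxy. The sketch below is not how an EJB container is implemented, and every name in it (Pricing, PricingBean, Interceptors) is invented for illustration. The client holds an object that implements the component’s interface; each call first passes through “middleware” code (here a trivial authorization check and an audit log) and only then reaches the real component.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

/** The business interface the client programs against. */
interface Pricing {
    long priceInCents(String item, int quantity);
}

/** The component itself: pure business logic, no middleware code. */
final class PricingBean implements Pricing {
    @Override
    public long priceInCents(String item, int quantity) {
        return 999L * quantity;     // flat price, just for the example
    }
}

final class Interceptors {
    static final List<String> AUDIT_LOG = new ArrayList<>();

    /** Wraps target so every call is checked and logged before being forwarded. */
    static <T> T intercept(Class<T> iface, T target, String caller, boolean authorized) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (!authorized) {                              // "security" service
                throw new SecurityException(caller + " may not call " + method.getName());
            }
            AUDIT_LOG.add(caller + " -> " + method.getName());  // "auditing" service
            try {
                return method.invoke(target, args);         // forward to the real component
            } catch (InvocationTargetException e) {
                throw e.getCause();                         // rethrow the component's own exception
            }
        };
        return iface.cast(Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] { iface }, handler));
    }
}
```

A client would write something like Pricing p = Interceptors.intercept(Pricing.class, new PricingBean(), "alice", true); and then call p.priceInCents("widget", 3) without knowing an interceptor is involved. A real container adds transactions, persistence, and remote access at the same point.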

Enterprise Java Beans

The Enterprise JavaBean (or EJB or Enterprise Bean) standard provides a component architecture for deployable server-side components. An application server can import, load, and communicate with EJBs using this standard. The EJB standard is a 500+ page PDF you can obtain from java.sun.com. It also includes a set of interfaces that components and application servers must adhere to.

EJBs are designed to communicate with other software: presentation layer components, other EJBs, Web services, and EIS layer components.

For some projects this is overkill. If your application is just a GUI to a database, or is very simple, or you need the initial system built rapidly (say as a prototype), then the three-tiered approach is not appropriate. Instead just use a two layer approach, having JSPs use simple Java classes to talk to the backend DB. However if you need to support transactions, sessions, banner ads, special services, communications with outside EIS or web services, or anticipate growth so scalability becomes very important, then EJB can be the way to go.

Types of EJBs

There are different types of beans, each used for a specific purpose. Not all types must be used in any given application. The most common are session beans. These beans model business processes and workflows (such as a user session while shopping on-line). Session beans typically do something; when your description uses a verb, think session bean. Such beans might be either stateful or stateless.

Entity beans model data, such as bank accounts, items, bids, purchase orders, etc. These components perform the object/relational mapping and form the bridge between the business-logic layer and the EIS components. They are often referred to as the persistence components.

Entity beans have been less commonly used because of poor performance and other problems. Many newer J2EE servers provide better support for the newer EJB standard, which is supposed to address these problems. However many replacement technologies (not part of J2EE, but supported by many application servers) have sprung up to solve this problem: Java Data Objects (JDOs), Hibernate, etc.

Entity beans may use either bean-managed or container-managed persistence.

Message-driven beans are similar to session beans in that they perform some task, but the interface is via messages passed back and forth according to some protocol.

EJB Interfaces

The EJB specification defines standard interfaces that your bean classes can implement, depending on the type of bean and the services it requires/provides. All EJBs, regardless of type, must implement the javax.ejb.EnterpriseBean interface. This marker interface (it doesn’t require any methods to be implemented) makes the class serializable.

Different types of beans have more specialized interfaces, that extend this one. Session beans must implement javax.ejb.SessionBean, entity beans implement javax.ejb.EntityBean, and message-driven beans implement javax.ejb.MessageDrivenBean.

The clients access your bean’s methods through the request interceptor. But how does the request interceptor know what methods of your bean classes to intercept? And how do clients know what bean methods are available? The answer is to define an interface listing the methods your bean wants to expose. This is called the remote interface. This interface must extend the javax.ejb.EJBObject interface. The methods declared in EJBObject itself are automatically implemented by the application server (a.k.a. the EJB container). You only need to implement, in your bean class, the additional methods you add to your remote interface.

javax.ejb.EJBObject (and hence your remote interface) extends the java.rmi.Remote interface, making all your beans accessible via RMI (actually RMI-IIOP). All the normal RMI restrictions apply: pass-by-copy, must throw RemoteException, ...

Unlike RMI there is no central registry of EJBs. To get a reference to an EJB object, a client must obtain one from an EJB Object Factory, a part of the EJB container responsible for creating, locating, and destroying EJB objects as needed. The EJB spec calls this factory a home object.

Home objects are specific to each vendor’s EJB container, because they do container-specific tasks (such as load balancing). Home objects are automatically generated by the EJB container. However the home object needs information about your bean, such as the constructors to use and their arguments. You provide this information by providing an interface, called the bean’s home interface, that extends javax.ejb.EJBHome.

Although accessing beans via the remote and home interfaces is logically simple (relatively anyway), because such accesses must use RMI they are potentially very slow. In version 2.0 of the EJB standard, an alternative was provided for access on a single server. Clients can use a local interface (javax.ejb.EJBLocalObject) and the EJB container can use a local home interface (javax.ejb.EJBLocalHome). These interfaces cut out the RMI part and so are much faster. Note that defining local interfaces is optional: a bean may provide a remote view, a local view, or both.

EJB Jar File

To deploy your bean you bundle the bean classes, home and remote (and possibly local) interfaces, the deployment descriptor, and possibly vendor-specific extensions, all in a jar file. Some IDEs will do this for you.

Lecture 16 — Servlets and JSPs, WARs and EARs

JSP — Java Server Pages

An HTML page may contain various special (large) tags, which contain Java code. Each chunk of code is called a scriptlet. When the JSP is first loaded, the page is parsed, the scriptlets are collected into a .java file, the .java file is compiled into a servlet .class, and thereafter using the JSP URL actually returns the output of the compiled servlet. It is possible to configure Tomcat to pre-load JSPs, to make this process seem faster to the user. Usually files that contain JSP use the extension “.jsp” rather than “.htm” or “.html”.

JSPs often use JavaBeans to actually do something, such as fetch data from a database. This is handy if some task is too complex to be mixed in with the HTML directly. You just create a JavaBean that has methods that do what you want, and invoke them from the JSP. A special XML tag is mixed in with the HTML to indicate the use of a JavaBean.

Note that the Java code and special tags in a JSP page are considered part of the middle (business logic) tier, and not the presentation layer (tier). However some could argue the other way around.

JSP supports a variety of other special XML tags that perform specific tasks. Soon JSP provided a way for a user to define and use their own XML tags, which would do custom tasks. A collection of such tags is known as a taglib.

Eventually the set of standard tags was greatly expanded, and organized into 4-5 standard taglibs. This collection of taglibs is called the JSP Standard Tag Library, or JSTL. This will be discussed below.

In addition to the JSTL, the 2.0 version of JSP is now available with many new features that make using JSP a lot simpler.

The basic HTML page “basic.htm” looks like this:

<html>

<head>

<title> Sample HTML Basic Page </title>

</head>

<body>

<h1> Sample HTML Basic Page </h1>

<p> This is a dull web page!

I have nothing more to say.

</p>

</body>

</html>

(Discuss HTML and HTTP GET and POST). Such a page contains only static data. It is possible to jazz it up with animated images, Java applets, CSS, and JavaScript. However these technologies are limited in what they can do, and all the processing is done on the client workstation (by the web browser).

To include dynamic content you need the file to include special tags. The web server doesn’t send the page directly to the web browser in this case. Instead it reads the file itself, ignoring HTML but looking for the special tags. The server then does something for each such tag; usually this involves running some servlet code and replacing the special tag with the output of that code. When the whole document has been processed it contains only HTML (and possibly CSS, JavaScript, ...). It is then sent to the client.

The server uses the file extension to determine how to process the file. It might use PHP, ASP, JSP, or other special processors depending on the type of file. These all work the same way, looking for special non-HTML tags and replacing them with some other HTML. (Some documents today are not HTML at all, but the related XHTML or even XML. Still the special processor doesn’t care.)

The simplest JSP tag contains a legal Java expression. The tag will be replaced with the result of the expression evaluation. Here are a couple of examples:

A simple JSP page “basic1.jsp” looks like this:

<html>

<head>

<title> Sample JSP Basic Page </title>

</head>

<body>

<h1> Sample HTML Basic Page </h1>

<p> This is a dynamic web page!

The current time

is <%= new java.util.Date() %>. Hit

refresh/reload to see the time update.

</p>

</body>

</html>

The “<%= expression %>” tag can be used for any valid Java expression, such as “2+2”, “StockQuote.getPrice("IBM")”, etc. The result is always converted to a string and inserted in place of the tag.

The special tags don’t have to fit all on one line. They can appear anywhere, even inside of HTML tag attributes or inside quotes.

To deploy basic1.jsp you must be using a JSP-aware web server, sometimes called a “JSP container”, “servlet container”, or “web application container”. Apache Tomcat is often used for this. Your web content (including non-JSP files) must be placed in a directory named for the web application, within the server's magic location (“server‑install‑directory/webapps/MyApp/basic1.jsp”). Then you start the server and use the proper URL to get there. For a locally installed Tomcat server this would be “http://localhost:8080/MyApp/basic1.jsp”.

Note a web application requires special files and sub-directories too. The whole set of files is often put into a single jar file called a web archive or WAR.

Arbitrary Java code (a “scriptlet”) can be placed in a similar tag. Here the result of running the Java code is not used to replace the tag. Instead, you produce the output to be inserted by writing to a pre-created JspWriter (which works much like a PrintWriter) called “out”. Here’s the same example using this type of tag:

The JSP page “basic2.jsp” looks like this:

<html>

<head>

<title> Sample JSP Basic Page </title>

</head>

<body>

<% java.util.Date d = new java.util.Date();

int sum = 2 + 2; %>

<h1> Sample HTML Basic Page </h1>

<p> This is a dynamic web page!

The current time

is <% out.print( d ); %>.

The sum of 2 and 2 is <%= sum %>

</p>

</body>

</html>

Notice the sole difference: a missing "=". Also note how you can define variables in one scriptlet and use them in a later one. This works because all the scriptlets are collected into a single servlet, in the order they appear. This technique is used to generate HTML tables from Java arrays:

The JSP page “table.jsp” looks like this:

<html>

<head>

<title> Sample JSP table </title>

</head>

<body>

<h1> Sample JSP table </h1>

<table>

<% for ( int i=1; i<=10; ++i) { %>

<tr> <td> <%= i %> </td> </tr>

<% } %>

</table>

</body>

</html>

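When using control-flow statements in scriptlets, always use curly braces: the HTML between the scriptlets may be turned into several output statements, and without braces only the first of them would be controlled by the if or loop.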
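To see why the braces matter, here is a hand-written approximation of the Java a container might generate for a loop like the one in table.jsp. It is a sketch only: the class and method names are made up, the one-cell row is an assumed loop body, and a plain PrintWriter stands in for the real “out” object.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical approximation of generated servlet code; real containers differ in detail.
class TableSketch {
    static void render(PrintWriter out, int rows) {
        out.print("<table>");                  // template text becomes an output statement
        for (int i = 1; i <= rows; ++i) {      // from the scriptlet <% for (...) { %>
            out.print("<tr><td>");             // template text inside the loop
            out.print(i);                      // from the expression tag <%= i %>
            out.print("</td></tr>");           // more template text inside the loop
        }                                      // from the scriptlet <% } %>
        out.print("</table>");
    }

    // Convenience wrapper that captures the output in a String.
    static String renderToString(int rows) {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        render(out, rows);
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(renderToString(2));
    }
}
```

The loop body is three separate statements. Without the braces, only the first print would repeat and the other two would run once, after the loop ends.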

There are other special JSP tags too. A JSP comment looks like this:

<%-- The client never sees this --%>

(Note you can create a comment with “<% /* comment */ %>” too.) A regular HTML comment tag will be passed to the client with the page but not a JSP comment.

Although the ability to mix arbitrary Java code in with your HTML is useful, too much code spoils the readability. Remember that JSPs are primarily web pages, and will likely need to be understood by web designers, not Java programmers.

To invoke methods of your JavaBeans you use a set of special JSP tags:

<jsp:useBean id="user" class="PersonBean" />

which invokes the default constructor as if you had used:

PersonBean user = new PersonBean();

You can add the attribute “scope” to control the life of the bean object. Use a bean like this:

<jsp:useBean id="cart" class="shoppingCartBean"
scope="session" />

to have the cart object created the first time a user visits the page. That same object will be available on each subsequent visit by the same user in the same session. The default is page scope, which creates a new instance each time the page is used. You can also specify request or application scope.
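The “create it the first time, reuse it afterwards” behavior of a scoped bean can be sketched in plain Java. This is an approximation under stated assumptions: a Map stands in for the session, and the names UseBeanSketch, Cart, and useBean are hypothetical, not part of any JSP API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

class UseBeanSketch {
    /** Hypothetical stand-in for a shopping cart bean. */
    static class Cart {
        final List<String> items = new ArrayList<>();
    }

    // Returns the bean stored under id in the scope, creating and storing it on first use.
    static <T> T useBean(Map<String, Object> scope, String id,
                         Class<T> type, Supplier<T> constructor) {
        Object existing = scope.get(id);
        if (existing == null) {
            T created = constructor.get();   // like calling the default constructor
            scope.put(id, created);
            return created;
        }
        return type.cast(existing);          // fails if another type is stored under id
    }

    public static void main(String[] args) {
        Map<String, Object> session = new HashMap<>();   // stands in for one user's session
        Cart first = useBean(session, "cart", Cart.class, Cart::new);
        first.items.add("book");
        Cart second = useBean(session, "cart", Cart.class, Cart::new);
        System.out.println(first == second);   // same object on the second "visit"
        System.out.println(second.items);
    }
}
```

With page scope the map would be thrown away after each page, so every visit would start with a fresh bean.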

To set and get properties of your bean you use these special tags:

<jsp:setProperty name="user"
property="name" value="Joe" />

<jsp:getProperty name="user" property="age" />

which invokes the getter method. The result is converted to a String and replaces the tag, much like the “<%= expression %>” tag. To make this work you must compile your JavaBeans and install the class files in the directory the server requires. In Tomcat this is “WEB-INF/classes/”, a subdirectory of the web application’s own directory (for example “webapps/MyApp/WEB-INF/classes/”).
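For reference, a minimal bean that the tags above could work with might look like the following. The class name PersonBean and the properties “name” and “age” come from the examples; the field types and everything else are assumptions. In a real deployment the class would normally also live in a named package.

```java
import java.io.Serializable;

// Minimal JavaBean sketch: public class, public no-argument constructor,
// and a getter/setter pair for each property.
public class PersonBean implements Serializable {
    private static final long serialVersionUID = 1L;

    private String name;   // backs the "name" property
    private int age;       // backs the "age" property (type assumed)

    public PersonBean() { }   // used by <jsp:useBean>

    public String getName() { return name; }                  // used by <jsp:getProperty>
    public void setName(String name) { this.name = name; }    // used by <jsp:setProperty>

    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }
}
```

The property name in the tag maps to the method name: property="age" leads to getAge() and setAge(...). If a method is misnamed, the class still compiles, but the tag will not find the property.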

You can use a special tag to abandon processing the current page, and cause the JSP server to instead load a different page:

<jsp:forward page="URL" />

Use within if-statements:

<% if (...) { %>
<jsp:forward ... />
<% } %>

Pre-defined JSP Objects

In addition to out there are a number of pre-created objects you can use:

request is an HttpServletRequest, used to access the data in the request. This can include useful information such as locale, timezone, browser type, user name, client workstation IP address, URL information (such as the query string), etc. If the user clicked a “submit” button on some form to get to this page, you can also access the form field values with request.getParameter("name").

response is an HttpServletResponse, used to obtain the output stream and to set the response headers (especially the content type).

out is a JspWriter (similar to a PrintWriter) that writes to the response.

session is an HttpSession associated with the request, used to store persistent data for this session only.

application is a ServletContext, which refers to the web application as a whole and allows persistent data to be stored and retrieved by any client from any session.

config is a ServletConfig, used by the server to initialize servlets.

pageContext is a PageContext, that provides a few useful methods such as getException().

page refers to the Servlet object, and is the same as using this.
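To make the request object’s form-field access less mysterious, here is a rough sketch of how a query string such as “name=Joe+Smith&age=42” can be turned into named values. The container does this for you; the class QueryStringSketch and its parse method are hypothetical, and the sketch ignores details such as POST bodies and character-set negotiation.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class QueryStringSketch {
    // Splits "a=1&b=2" into name/value pairs and URL-decodes each part.
    static Map<String, String> parse(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String rawName = eq < 0 ? pair : pair.substring(0, eq);
            String rawValue = eq < 0 ? "" : pair.substring(eq + 1);
            String name = URLDecoder.decode(rawName, StandardCharsets.UTF_8);
            String value = URLDecoder.decode(rawValue, StandardCharsets.UTF_8);
            params.putIfAbsent(name, value);   // keep the first value for a repeated name
        }
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> params = parse("name=Joe+Smith&age=42");
        System.out.println(params.get("name"));
        System.out.println(params.get("age"));
    }
}
```

In a JSP you would simply write request.getParameter("name") and let the container do the parsing.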

Gotchas:

HTML is case-insensitive but XML, JSP, and Java are not. As all JSP tags are really XML tags, the normal rules apply: All tags must end with “/>” or have a separate closing tag. All attribute values must be quoted with either single or double quotes. If you use double quotes then any Java included inside must escape any double quote with a backslash.

Don’t use a semicolon after an expression, but don’t forget them after statements!

Don’t confuse the “id” attribute (used with jsp:useBean) and the “name” attribute (used with jsp:getProperty and jsp:setProperty).

Don’t forget that your bean classes must really meet the JavaBean standard. If not, they will likely compile and run without errors or warnings. But they won’t work!

Servlets

Servlets are part of a web application; see the WAR example on the website and the Tomcat install directions (Tomcat is the reference implementation of the servlet API). Today you rarely need to write a servlet directly. Instead you use JSP. The Java code from the JSP page is collected and compiled automatically as a servlet. (So the first use of a JSP page may be slow.) One good use of a servlet is to control which page to display next. This would be confusing in a JSP. Here’s a small sample:

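import java.io.IOException;

import javax.servlet.*;

import javax.servlet.http.*;

public class PickPaymentMethod extends HttpServlet

{ protected void doPost ( HttpServletRequest req,

HttpServletResponse res )

throws ServletException, IOException

{ ...

// Choose the next page based on the payment method

String nextPage = "/process-" + paymentMethod + ".jsp";

ServletContext sc = getServletContext();

RequestDispatcher rd = sc.getRequestDispatcher( nextPage );

rd.forward( req, res );

}

}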

JSTL — JSP Standard Tag Library (and custom tags)

JSTL encapsulates, as simple XML tags, core functionality common to many JSP applications. For example, instead of iterating over lists with a scriptlet, or using different iteration tags from numerous vendors, you can use the standard <c:forEach> tag that JSTL defines, which works the same everywhere. Instead of writing JSP this way:

<% if ( ((User) session.getAttribute("user")).getName().equals("Joe") ) { %>

you can use the simpler (written with the expression language described below):

<c:if test="${user.name == 'Joe'}">

This standardization lets you learn a single tag and use it on multiple JSP containers. Also, when tags are standard, containers can recognize them and optimize their implementations.

The standard tag library contains 5 collections of tags. These are documented at java.sun.com/products/jsp/jstl/1.1/docs/tlddocs/index.html .

Each collection is called a taglib. The tags all look like:

<taglibName:tagName attr1="val1" attr2="val2" ...>
body (optional)
</taglibName:tagName>

If there is no body, just as with standard XML you can use the shortcut “<... />”. If the value is the result of some JavaBean method call, or (as in the example below) the value of some previously defined attribute, you use the standard JSP expression syntax to refer to it. Note the single quotes around the attribute value, needed because the Java inside uses double quotes:

<someTags:aTag attribute='<%= pageContext.getAttribute("aName") %>'>

In JSP 2.0 they added a special “expression language”, that allows a simpler way:

<someTags:aTag attribute="${aName}">

Besides fetching data you can use simple math and boolean expressions:

<c:if test="${user.name == 'Joe'}">
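When the expression language sees a bare name such as aName, it searches the page, request, session, and application scopes in that order and uses the first match. The sketch below imitates that search with ordinary maps; the class ScopeLookupSketch and its method are hypothetical stand-ins, not the container’s real code.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ScopeLookupSketch {
    // Scopes must be given in search order: page, request, session, application.
    static Object findAttribute(List<Map<String, Object>> scopes, String name) {
        for (Map<String, Object> scope : scopes) {
            Object value = scope.get(name);
            if (value != null) {
                return value;      // the narrowest scope that has the name wins
            }
        }
        return null;               // not found in any scope
    }

    public static void main(String[] args) {
        Map<String, Object> page = new HashMap<>();
        Map<String, Object> request = new HashMap<>();
        Map<String, Object> session = new HashMap<>();
        Map<String, Object> application = new HashMap<>();
        session.put("aName", "from session");
        application.put("aName", "from application");

        List<Map<String, Object>> scopes = List.of(page, request, session, application);
        System.out.println(findAttribute(scopes, "aName"));
    }
}
```

Here the session value hides the application value, which is why the same name in two scopes can be a source of confusion.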

In addition to the standard taglibs you can define custom taglibs. This is done by creating Java classes (called tag handlers) that implement the tags you define in your taglib. Then you create an XML file (a tag library descriptor) that tells the web container how to map tag names to those classes.

You declare that a JSP page will use tags defined in a tag library by including a taglib directive in the page before any custom tag is used:

<%@ taglib uri="/tlt" prefix="tlt" %>

For more information consult the Sun on-line tutorial at http://java.sun.com/products/jsp/tutorial/TagLibrariesTOC.html .

Deployment —Anatomy of a Java EE Enterprise Application Archive (EAR)

J2EE code today is mostly J2EE components assembled into an application. It is hoped that some components can be reused, and thus reduce the overall cost and time of development. A bundle of components of a similar type is called a module. The EJB module contains all the EJB components; the web module contains all the web server components (JSPs, servlets, and other web files). Additionally, some enterprise applications use thick clients and thus have an application module. Finally, some designs use standard interfaces to access EIS components. These interfaces (e.g., database-specific JDBC drivers) form a resource access module.

A given application typically has some but not all four modules. For example a very simple J2EE application may only have a few servlets and HTML files, that directly use JDBC to talk to some database. In this case your whole application might only consist of the web module. More commonly you will have both that module and the EJB module.

WARs — Web Archive Files

The web module may consist of one or more Web ARchive files, or WARs. Each WAR is a jar file that contains servlets, JSPs, and other web application components such as HTML files, graphics and other resource files. Sometimes your J2EE application is small and simple, and the entire application can consist of a single WAR. Larger applications may consist of one or more WARs as part of the EAR.

WARs can be deployed in any web application server (sometimes called a web container, a servlet container, or a JSP container), including the popular Tomcat, usually by simply dropping the WAR file into the appropriate place. (Depending on the server you may have to update its configuration, usually by editing an XML file.)
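Since a WAR is just a jar file with a particular layout, you can build and inspect one with the standard java.util.jar classes. The sketch below creates a tiny archive in memory and lists its entries. The file names and contents are made up for illustration, and a real WAR would normally be produced by the jar tool or a build tool rather than by hand-written code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;

class WarSketch {
    // Builds a minimal WAR-like archive in memory (contents are placeholders).
    static byte[] buildWar() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream war = new JarOutputStream(bytes)) {
            addEntry(war, "index.html", "<html><body>Hello</body></html>");
            addEntry(war, "basic1.jsp", "<%= new java.util.Date() %>");
            addEntry(war, "WEB-INF/web.xml", "<web-app></web-app>");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    private static void addEntry(JarOutputStream war, String name, String text)
            throws IOException {
        war.putNextEntry(new JarEntry(name));
        war.write(text.getBytes(StandardCharsets.UTF_8));
        war.closeEntry();
    }

    // Lists the entry names of an archive, in the order they were stored.
    static List<String> listEntries(byte[] archive) {
        List<String> names = new ArrayList<>();
        try (JarInputStream in = new JarInputStream(new ByteArrayInputStream(archive))) {
            for (JarEntry e = in.getNextJarEntry(); e != null; e = in.getNextJarEntry()) {
                names.add(e.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(listEntries(buildWar()));
    }
}
```

Web content sits at the top of the archive, while the WEB-INF directory holds the configuration and compiled classes that the container uses but does not serve directly to browsers.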

Working with EARs

Each module contains the classes (and interfaces) for the components, plus a deployment descriptor that tells the container what components are there, what container services are needed, and how everything hooks together. This description is an XML file. It and the classes are packaged in a jar file, which becomes the module. Note that the EJB and application modules use the regular .jar extension, but the web module jar file should be renamed “.war”, and the resource access module extension should be “.rar”.

The idea is you can list all JARs needed on the CLASSPATH in this file, plus any other options needed. Note that for enterprise applications you often need to do other tasks, such as initializing a database or defining what should be included in the web.xml or deploy.xml files. You may need to specify configuration data such as JNDI data, persistence.xml (needed to use a DB), logging server configuration, etc.

To play with the database, use the JMX console, click on the “database=localDB,service=Hypersonic” link, then click on the “Invoke” button below the “void startDatabaseManager()” MBean operation line. This starts up a database management console app.

The (up to four) modules are then packaged in a jar with the extension “.ear”, which forms the J2EE application that can be dropped into the application server (sometimes called an EJB container or J2EE container) for deployment.

EARs also include a deployment descriptor that tells a J2EE compliant application server which services are needed and which classes implement required interfaces. The application server runs code to deploy the EAR components in the right places.

There are few books or resources to help you design a J2EE enterprise application. Most books and on-line resources focus on one or more of the J2EE technologies. There are a couple of case studies though, such as the Pet Store example from Sun, as well as some books from Sun (one of which can be read on-line).

A typical development process for J2EE applications contains two steps:

1. Create JavaServer Pages (JSP), servlet, and other web components that handle the presentation logic of the application; and build Enterprise JavaBeans (EJB) components to manage the application's business logic.

2. Assemble those components into modules (with the appropriate deployment descriptors), and then combine those modules into an EAR that can be deployed to a J2EE platform-compliant application server, such as Sun’s Application Server PE 8.

The deployable unit can be either of the following:

· J2EE module — A collection of J2EE components that reside in the same container. The modules can be a Web archive (WAR), an EJB Java archive (JAR), a resource archive (RAR), or an application client JAR. Such stand-alone modules are used in tiny applications. (Note RAR is not stand-alone.)

· J2EE application — A collection of J2EE modules, along with application deployment descriptors, in the form of an EAR file.

JBoss Setup

JBoss is a free application server that supports all of J2EE 1.4 plus EJB 3.0. Visit www.JBoss.com or .org. (Red Hat acquired JBoss in 2006.) This product uses Tomcat for the web application container part, so you automatically get that too. To develop software and to run your WARs and EARs you use the JARs of your application server. You don’t need the Sun Java EE SDK, but it may pay to bookmark the Javadoc APIs for Java EE.

Download JBoss from labs.jboss.com/portal/jbossas/download

Just click on the “run installer” button. This uses “WebStart” to download and run the installer program. First pick “Eng” for English language installer. Click Next to see some version info and a link to the JBoss.com homepage. Click Next to see the Release notes. It might be a good idea to look those over.

Click Next to see the open source licenses used and to click the “I Agree...” button, then click Next.

Select the installation directory. The default of C:\Program Files\jboss‑4.0.5.GA is fine. Then click Next.

On the next screen you pick which component sets to install. To use the current (new) EJB 3.0, you need to select the “ejb3” choice. You will see a warning that this requires Java SE 5, and that this is not J2SE 1.4 compliant. That’s because the new EJB spec is part of Java EE 5 (originally called J2EE 1.5). Note the “all” choice includes cluster-middleware support, but not EJB3.

The next screen is a list of “packs” (really JARs implementing various parts of J2EE). By default all are checked and that is fine.

On the next screen you pick the name of the configuration to use. The default is “default”, and if you change that you will have to supply the name on each command line, so don’t change it.

The next screen allows you to select which database product to install and use. If you don’t use the default choice (“Hypersonic SQL”) you will need to install the various drivers and update the config files. I suggest you use the default.

The next screen “Enable deployment isolation/call by value” I don’t understand, so I left the default of unchecked.

The next screen lets you choose to use secure console access (https) and to set the administrator name and password. I left the default (use security) and “admin” username and “admin” password.

The final screen shows your choices. Next the install begins. It didn’t take long!

Now set some environment variables: JAVA_HOME to point to your jdk install directory (this may be set already) and JBOSS_HOME, to where you installed JBoss. On my system: JAVA_HOME=C:\Java\jdk and JBOSS_HOME=C:\Program Files\jboss-4.0.5.GA

Run the Application Server

To start the server run %JBOSS_HOME%\bin\run.bat.

Deploy EARs and WARs with JBoss

To deploy WARs and EARs, copy them into the directory %JBOSS_HOME%\server\default\deploy. (There are a bunch of files in there already.)

To run your web app, point your browser to http://localhost:8080/nameOfWarFile. For example: http://localhost:8080/myServletWAR. This should show the index.htm page. To run the servlet use the URL http://localhost:8080/myServletWAR/hello

(Note! The link in the index.htm file points to .../myServletWAR/servlet/hello, which doesn’t work! The JSP page works though.)

Compiling Java EE Code; using Apache Ant

To compile your code you need to include the various JAR files with the Java EE APIs for your application server, in this case all the JAR files in %JBOSS_HOME%\client\*.jar.

You could copy all these JARs to your JDK JRE’s “ext” (extensions) directory, or list them on the javac command line (with the “-cp” argument), but the common and simplest way is to use an Apache.org tool called “ant”. This is similar to the old Unix tool “make”. “Make” is more flexible because you write scripts to do various tasks. However, that ties you to a platform that supports that scripting language (shell). Ant uses an XML file with pre-defined tasks for you (you can define new ones), such as tasks to build a WAR or EAR file automatically. The tasks are actually Java classes. Because the build file is XML, a malformed build file is caught by the parser before any task runs. Ant is common and supported by Eclipse and other IDEs.

[Project: Download and install Ant (although it is built into Eclipse, you can tell the IDE to use another Ant). Read the Ant manual (chapter in text?) at ant.apache.org/manual/. Create a build.xml.]

Ant reads a file build.xml to determine exactly what to do for each target. You create a build.xml file (or find one to use as a model on the Internet), put it with your source, then edit your code and compile it with “ant target”. You define the target names, such as “compile”, “test”, “package”, “deploy”, “generate‑javadocs”, “clean”, etc., and list a default target to use if you just type “ant”. Targets can also depend on one another, so you can have “package” depend on “compile”. (See “AntBestPractices.htm” and other ant demos and resources.)
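The effect of target dependencies can be sketched in Java: before running a target, run everything it depends on, and run each target at most once. This is only an illustration of the idea, not Ant’s actual implementation; the class TargetOrderSketch and the sample dependency table are assumptions made for the sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class TargetOrderSketch {
    // Returns the targets that must run, in order, to build the requested target.
    static List<String> executionOrder(Map<String, List<String>> dependsOn, String target) {
        List<String> order = new ArrayList<>();
        visit(target, dependsOn, new HashSet<>(), new HashSet<>(), order);
        return order;
    }

    private static void visit(String target, Map<String, List<String>> dependsOn,
                              Set<String> done, Set<String> inProgress, List<String> order) {
        if (done.contains(target)) {
            return;                               // already scheduled; run only once
        }
        if (!inProgress.add(target)) {
            throw new IllegalStateException("Circular dependency at " + target);
        }
        for (String dependency : dependsOn.getOrDefault(target, List.of())) {
            visit(dependency, dependsOn, done, inProgress, order);   // dependencies first
        }
        inProgress.remove(target);
        done.add(target);
        order.add(target);
    }

    public static void main(String[] args) {
        Map<String, List<String>> dependsOn = new HashMap<>();
        dependsOn.put("package", List.of("compile"));
        dependsOn.put("test", List.of("compile"));
        dependsOn.put("deploy", List.of("package", "test"));
        System.out.println(executionOrder(dependsOn, "deploy"));
    }
}
```

With this sample table, asking for “deploy” schedules compile, package, test, and then deploy. “compile” appears only once even though two targets depend on it.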

Suppose a project puts all the source code under .../src, and the resulting class, jar, war, ear, etc files in .../build. The package name for classes is com.example.hello:

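<project name="HelloWorld" basedir="." default="jar">

  <property name="src.dir" value="src"/>
  <property name="build.dir" value="build"/>
  <property name="classes.dir"
            value="${build.dir}/classes"/>
  <property name="jar.dir" value="${build.dir}/jar"/>
  <property name="main-class"
            value="com.example.hello.HelloWorld"/>

  <target name="clean">
    <delete dir="${build.dir}"/>
  </target>

  <target name="compile">
    <mkdir dir="${classes.dir}"/>
    <javac srcdir="${src.dir}" destdir="${classes.dir}"/>
  </target>

  <target name="jar" depends="compile">
    <mkdir dir="${jar.dir}"/>
    <jar destfile="${jar.dir}/${ant.project.name}.jar"
         basedir="${classes.dir}">
      <manifest>
        <attribute name="Main-Class" value="${main-class}"/>
      </manifest>
    </jar>
  </target>

  <target name="run" depends="jar">
    <java jar="${jar.dir}/${ant.project.name}.jar"
          fork="true"/>
  </target>

  <target name="junit" depends="jar">
    <junit printsummary="yes">
      <classpath>
        <pathelement location="${jar.dir}/${ant.project.name}.jar"/>
      </classpath>
      <batchtest fork="yes">
        <fileset dir="${src.dir}" includes="**/*Test.java"/>
      </batchtest>
    </junit>
  </target>

</project>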

Now you can build with “ant”, run with “ant run” and test with “ant junit”! You can also use ant to build deployment descriptors and create WARs and EARs, and even deploy them.