Thursday, December 17, 2015

SOAP Web Services using CXF/JiBX [JAX-WS]

SOAP Web Services - Architecture

Fig. 1 SOAP Web Services Architecture (SOAP - WSDL - UDDI)



Advantages of Apache CXF
Apache CXF is currently one of the most widely used frameworks and is often preferred over Axis for the following reasons:

1. It separates the JAX-WS code from the core application code.
2. It integrates with the Spring Framework.
3. It is reported to perform better than other JAX-WS implementations.
4. It supports both JAX-WS and JAX-RS.
5. It is among the most popular JAX-WS frameworks and is steadily replacing other implementations.
6. It is relatively easy to use and generates less code, with Eclipse support available.



 [image publicly available at http://cxf.apache.org]

Fig. 2 CXF Architecture with all Components 
[Especially Note the 'Front-Ends' and 'Pluggable Data Bindings']



Advantages of JiBX
JiBX, unlike other binding frameworks, provides the following distinct features or advantages:

1. It generates an additional binding definition, apart from the regular schema types, and this 'separates' the Java code changes from the binding changes, which means both can be changed independently of each other.
2. It optimizes the generated bytecode and is claimed to be a faster data binding mechanism than other binding mechanisms by a factor of 2 or more.
3. It does this through distinct 'goals' or 'phases': Code Generation, the Binding Compiler and the Binding Runtime. The number of Java classes and the amount of configuration code generated by JiBX is far greater, and this is the 'acceptable tradeoff' for the performance gain.

 [image publicly available at http://jibx.sourceforge.net]


Fig. 3 JiBX Simple XML Binding


CXF/JiBX using Eclipse (Spring / Standalone Server)
You need the following tools and dependencies (with specific versions) to build this sample:
 
1. Maven Connector for Eclipse - m2e
2. CXF Codegen Plugin (Maven Eclipse Connector)
3. JiBX Maven Plugin  (Maven Eclipse Connector)
4. Maven Configuration (Spring Dependencies)



These are the steps to follow to create a sample "Calculator" web service using CXF and JiBX. We used the following as a reference: https://github.com/FrVaBe/cxf-soap-with-jibx#readme

There are very few (next to no) links that explain web service creation step by step. After reading up on Apache CXF and JiBX, I would recommend them for all JAX-WS/SOAP based web development on performance grounds. Though I have not done any benchmarking myself, I am impressed with the figures published for both CXF and JiBX. You can use the steps below to understand and deploy the sample, and then use it directly (as a reference implementation) in your project or product. The only drawback seems to be the learning curve: understanding both frameworks completely (read: in 'depth') is at times perplexing. Since there are many features that one never uses in a normal 'web services' project or product, you may suspect that you have missed something or that there is something more in there. But to start off, I give the detailed steps below. The rest I leave to the time you have and your curious mind. :-) Please do share, in a reply/comment to this entry, your own blog entries or code samples, or enhance this same sample!


1. Create a Maven Project in Eclipse

You can download the attached source code to further understand the Maven 'Project Object Model'. I am outlining the most important sections here:

 
The properties section holds the versions, for easy reference and for easy changes to the Maven build file. The versions above are the ones we used, but you may want to try JiBX 1.2.6+ and also Spring 4.0.0+. Also, you may want to try this on JDK 1.8.0; we used JDK 1.7.0.



The above dependencies are only for Apache CXF, and the version is read from the properties section. These contain 'all' the CXF-related dependencies required for building this project.



  
The above dependencies are only for JiBX, and the version is read from the properties section. These contain 'all' the JiBX-related dependencies required for building this project.




The above dependencies are only for Spring. Note that only 'Spring Context' is required for building and running this project. This is true even if you want to deploy it on a Web/App Server.



2. Contract First Web Services (Develop the WSDL/XSD in Eclipse)
I was never a fan of the 'Contract-First' approach. But after using Apache CXF/JiBX, I have come to believe it is the 'best' way of developing 'web services' progressively from the 'requirements' and 'design' phases. If you plan early, at the end you may only have to 'copy and paste' your specification to generate code using Maven (CXF/JiBX). Even otherwise, it will take only 'some work' to create the WSDL/XSD out of the 'service specifications' documented in a 'Software Requirements Specification' or 'System Design' document. You may go through the WSDL/XSD in the attached source code. (Note that the XSD in this sample names its types with plain 'verb names'. Common naming conventions recommend that the request/response 'message' names be suffixed with 'HttpIn', 'HttpOut', 'Request', 'Response', 'SoapIn' or 'SoapOut'.)



Fig. 4 XML Schema Definition (XSD)




You may go through the WSDL/XSD in the attached source code.



Fig. 5 Web Services Description Language (WSDL)


3. Run the Maven Build (mvn install)
a. generate-sources
The following is the extract from the pom.xml that configures the cxf-codegen-plugin for the generate-sources phase. Also shown is the declaration (inclusion) of the Maven Compiler, which is needed for running the Binding Compiler.



   
Fig. 6 Maven Code Extract for 'Generating Java Sources for Web Services'
(Note -nexclude Flag for Types)


b. generate-java-code-from-schema
This goal is used to generate the Java types using the JiBX Maven plugin. It also generates the binding.xml. All of the generated files will be available under 'target/generated-sources'.



 
Fig. 7 Maven Code Extract for 'Generating Java Type from Schema'
(Note: We Have Mentioned a Dummy/Mock 'Custom JiBX Code Generation' XML)


c. bind
The 'bind' goal binds the generated sources, the types and the binding. This phase is the 'Binding Compiler'; when a request is made at run time, the 'Binding Runtime' will 'kick in'.



Fig. 8 Maven Code Extract for 'Binding Compiler'
(Note Path to the 'Generated Sources' and 'Binding')


4. Identify the Generated Classes (Understand CXF/JiBX Better)
Spend some time to identify the generated classes and artefacts to better understand and appreciate the cxf-codegen-plugin and jibx-maven-plugin. 



Fig. 9 Generated Classes and Artefacts (CXF/JiBX)


5. Deploy and Run (Start with Standalone Java using Spring Context) 
You can run the Main class at 'de.frvabe.sample.calculator.Main'. It loads the web services configuration from the Spring configuration file 'calculator-app-context.xml' using a Spring ClassPathXmlApplicationContext, and in doing so deploys the web services Endpoint.




Fig. 10 Running the Main Class (Console)


6. Verify on the Browser (Check if WSDL Exists and is Accessible - Alternatively, use SoapUI)
The figures below show the request sent and the response returned, using SoapUI.

 Fig. 11 SOAP Request (Using SoapUI)



Fig. 12 SOAP Response (Using SoapUI)

[The function performed is a 'dummy add' of the supplied value and zero; the result is returned as the response]
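If you prefer to script this check instead of using SoapUI, the request can also be sent with plain JDK classes. The following is only a minimal sketch: the service namespace 'http://example.org/calculator' and the element names 'add' and 'value' are placeholders that I made up for illustration, and the endpoint URL is passed in by you. Take the real names from the WSDL/XSD in the attached source.

```java
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

/** Sketch only: the service namespace, element names and endpoint are placeholders. */
final class SoapAddRequestSketch {

    /** Standard SOAP 1.1 envelope namespace. */
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    /** Builds a SOAP 1.1 envelope whose body holds one request element with one integer child. */
    static String buildEnvelope(String serviceNs, String requestElement,
                                String valueElement, int value) {
        return "<soapenv:Envelope xmlns:soapenv=\"" + SOAP_NS + "\""
                + " xmlns:calc=\"" + serviceNs + "\">"
                + "<soapenv:Header/>"
                + "<soapenv:Body>"
                + "<calc:" + requestElement + ">"
                + "<calc:" + valueElement + ">" + value + "</calc:" + valueElement + ">"
                + "</calc:" + requestElement + ">"
                + "</soapenv:Body>"
                + "</soapenv:Envelope>";
    }

    /** Posts the envelope and returns the raw response body (a SOAP response or a SOAP fault). */
    static String post(String endpoint, String envelope) throws Exception {
        HttpURLConnection con =
                (HttpURLConnection) URI.create(endpoint).toURL().openConnection();
        con.setRequestMethod("POST");
        con.setDoOutput(true);
        con.setRequestProperty("Content-Type", "text/xml; charset=utf-8");
        con.setRequestProperty("SOAPAction", "\"\"");
        try (OutputStream out = con.getOutputStream()) {
            out.write(envelope.getBytes(StandardCharsets.UTF_8));
        }
        // SOAP faults come back with an HTTP error status, so read the error stream in that case.
        InputStream in = con.getResponseCode() < 400 ? con.getInputStream() : con.getErrorStream();
        if (in == null) {
            return "";
        }
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        try (InputStream response = in) {
            byte[] chunk = new byte[4096];
            int read;
            while ((read = response.read(chunk)) != -1) {
                body.write(chunk, 0, read);
            }
        }
        return new String(body.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        // Placeholder namespace and element names; replace them with the ones from the WSDL/XSD.
        String envelope = buildEnvelope("http://example.org/calculator", "add", "value", 5);
        System.out.println(envelope);
        if (args.length > 0) {
            // The request is only sent when an endpoint URL is given as the first argument.
            System.out.println(post(args[0], envelope));
        }
    }
}
```

Run without arguments, it only prints the envelope, so you can compare it with the request shown in SoapUI before sending anything to the running endpoint.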


You can download the archive from here. You may choose to run this example on Tomcat, WebLogic, WebSphere, JBoss or another server. Try adding functionality or exploring the other 'Sophisticated Code Generation' capabilities of JiBX, as per your project requirements.


[You may want to use other reference implementations of JAX-WS or use the default JAXB for developing web services. This article is specifically for SOAP using CXF/JiBX (Eclipse/Standalone/Tomcat). It may also help people who need an introduction to CXF/JiBX and are already well versed in building SOAP Web Services using any (or a combination) of JAX-WS, Axis, Spring, JAXB and XML.]
 
 

Wednesday, November 25, 2015

Hardware Sizing for Java/Java EE Products

When doing hardware sizing for Java/Java EE products, especially where indexing or search frameworks like Solr/Lucene are involved, many more attributes add to the final sizing tabulation. [This entry does not include anything on Database Sizing]



Before we get ahead with a sizing exercise, we need to understand that the following will impact the accuracy of the metrics.

1. Decision on the Exact Versions of the Runtime, Frameworks and Servers Used
2. Experience of the Senior Engineer/Architect doing the Sizing Exercise
3. Understanding the Functional Characteristics of the System being Built
4. Agreeing upon the Non-Functional Characteristics of System being Built
5. Appreciating Sizing Environments [Development, Testing, Production, UAT,...]
6. The Future Extensions or Possible Lifeline of the System Being Built


The following are the most important standard guidelines and criteria for hardware or server sizing. Please note that these guidelines are for the server that hosts the Application. It does not contain any Database Sizing guidelines.  
  •  Hardware Component should operate at no more than 80% Utilization 
  •  Processor and Memory Resources should be allocated for Maximum User Load 
  •  User Think times and Network Latency should be taken into Account 
  •  Number of Potential Users and Number of Concurrent Users 
  •  Service Time and Average Response Time of your Application
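To make the 80% guideline concrete, here is a minimal sketch based on the Utilization Law from queueing theory (utilization = throughput x service time, spread over the available cores). The peak throughput and the CPU time per request used in the example are assumed numbers, not measurements from any real system; substitute the values from your own load tests.

```java
/** Sketch only: input numbers are assumptions, not measurements. */
final class UtilizationSizingSketch {

    /** Utilization Law: fraction of CPU capacity kept busy by the given load. */
    static double utilization(double requestsPerSecond, double cpuSecondsPerRequest, int cores) {
        return requestsPerSecond * cpuSecondsPerRequest / cores;
    }

    /** Smallest number of cores that keeps utilization at or below the target (for example 0.80). */
    static int coresNeeded(double requestsPerSecond, double cpuSecondsPerRequest,
                           double maxUtilization) {
        return (int) Math.ceil(requestsPerSecond * cpuSecondsPerRequest / maxUtilization);
    }

    public static void main(String[] args) {
        double peakRequestsPerSecond = 40.0; // assumed throughput at Maximum User Load
        double cpuSecondsPerRequest = 0.05;  // assumed CPU service time per request
        double maxUtilization = 0.80;        // the 80% guideline

        int cores = coresNeeded(peakRequestsPerSecond, cpuSecondsPerRequest, maxUtilization);
        double busy = utilization(peakRequestsPerSecond, cpuSecondsPerRequest, cores);
        System.out.printf("Cores needed: %d, utilization at peak: %.0f%%%n", cores, busy * 100);
    }
}
```

The same check can be repeated per hardware component (disk, network) by replacing the CPU service time with the time a request spends on that component.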

If you are using Solr/Lucene or similar indexing or disk-based frameworks, it is important to estimate the total possible index size from the number of documents, the number of indexed fields, the number of stored fields and the average size of each document. With some buffer added, you can compute the disk space almost accurately. When computing the estimated memory requirements for Solr/Lucene, the number of unique terms per field also needs to be considered. In the references below, I have provided a sheet (made publicly available by 'Lucidworks'). It lists all the attributes that you can 'tune' for your memory sizing and disk space requirements for Solr/Lucene, especially with respect to 'Caching of Query Terms'.
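As a first approximation of the disk space, the inputs listed above can be combined as in the sketch below. Please treat the model itself as an assumption: it pretends all fields are of equal average size, and the two ratios (index bytes and stored bytes produced per byte of field content) are placeholders that you must calibrate by indexing a sample of your own documents. It is not the formula used by the Lucidworks sheet.

```java
/** Sketch only: a crude disk estimate; the ratios are placeholders to be calibrated. */
final class IndexDiskEstimateSketch {

    /**
     * Estimates index size in bytes.
     * indexRatio and storedRatio are the bytes written per byte of field content (assumed values);
     * buffer is the extra headroom, for example 0.25 for 25%.
     */
    static long estimateIndexBytes(long documents, long avgDocBytes, int totalFields,
                                   int indexedFields, int storedFields,
                                   double indexRatio, double storedRatio, double buffer) {
        // Assumption: every field carries an equal share of the document size.
        double avgFieldBytes = (double) avgDocBytes / totalFields;
        double indexedBytesPerDoc = indexedFields * avgFieldBytes * indexRatio;
        double storedBytesPerDoc = storedFields * avgFieldBytes * storedRatio;
        return Math.round(documents * (indexedBytesPerDoc + storedBytesPerDoc) * (1.0 + buffer));
    }

    public static void main(String[] args) {
        // All inputs below are made-up example values.
        long bytes = estimateIndexBytes(100_000L, 2_048L, 8, 4, 2, 0.5, 1.0, 0.25);
        System.out.printf("Estimated index size: %.1f MB%n", bytes / (1024.0 * 1024.0));
    }
}
```

Once you have a real sample index, divide its size on disk by the size of the source documents to replace the placeholder ratios with measured ones.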

While doing a sizing exercise, you can provide the result as a separate table for each of the possible environments. Alternatively, you may choose to present a single table (mentioning the environment for which you are providing the sizing) and note any additional constraints that apply across environments. It is important that a buffer be added to each of the computed attributes, keeping in mind the cost and future extensibility.

Most people come to the conclusion that "Hardware is Inexpensive these Days - We can Recommend something that is Beyond the Best Possible Maximum Load". Though this may work almost always, it does not give us a "Possible Minimum Estimate with Least Cost". Working out the estimates with the latter in mind equips us to better understand the future issues that various functional and non-functional aspects may cause, especially if we want to achieve maximum efficiency under the constraints for all possible 'Loads'. For example, if we were to aim for the 'latter' in the 'Development, Testing or User Acceptance Testing' environments, we may be able to point out a Memory Leak that manifests itself due to an Incorrect Development Practice or Deployment Strategy. If we go with the former approach, we may also end up giving an "Inflated Estimate" for an otherwise "Size-S System", with resources that may always lie unused.

Before I take you to the tabulation, a few terms need definition (from the textbook). They are very often assumed, and it is better to know the slight differences in their actual meanings.

User Think Time: The time the user is not engaged in actual use of the processor (the time between requests). This is used interchangeably with User Wait Time. In real life, however, it has a slightly different impact, as it involves the 'Time Required by a User for thinking about and performing the next action in the application, either due to the response or otherwise'.
Response Time: The response time measured at the client under load (averaged over time).
Concurrent Users: The number of users measured on the server, taken in snapshots from the Server Status or Server Console.
Service Time: The elapsed time to complete the operation, measured for a single user.
Maximum User Load: The maximum number of concurrent users that is expected or that the system is tested for.
User Wait Time: The time elapsed between actions or clicks for a given user. This is used interchangeably with User Think Time. In real life, however, it has a slightly different impact, as it involves the 'Time Required by a User for analyzing or reading the data received between requests, and also for performing other tasks such as reading email, using the telephone, chatting with a colleague or working on other Applications running simultaneously'. If we were to go deeper into Software Testing and Performance, both of these may be put to great use to improve user experience and/or performance.
CPU Utilization: Average of the Total CPU Utilization as a Percentage.
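These terms are tied together by Little's Law in its interactive form: Concurrent Users = Throughput x (Response Time + Think Time). The sketch below uses it to translate a user count into the request rate the server must sustain. The response time of 2 seconds and the think time of 8 seconds are assumed values chosen only for illustration.

```java
/** Sketch only: response and think times below are assumed example values. */
final class ThinkTimeLoadSketch {

    /** Request rate generated by the given number of concurrent users (Little's Law). */
    static double requestsPerSecond(int concurrentUsers, double responseSeconds,
                                    double thinkSeconds) {
        return concurrentUsers / (responseSeconds + thinkSeconds);
    }

    /** Number of concurrent users a given request rate can serve, for the same timings. */
    static double supportedUsers(double requestsPerSecond, double responseSeconds,
                                 double thinkSeconds) {
        return requestsPerSecond * (responseSeconds + thinkSeconds);
    }

    public static void main(String[] args) {
        int users = 500;              // concurrent users, including those who are thinking
        double responseSeconds = 2.0; // assumed average response time under load
        double thinkSeconds = 8.0;    // assumed average user think time

        double rate = requestsPerSecond(users, responseSeconds, thinkSeconds);
        System.out.printf("%d users generate about %.1f requests per second%n", users, rate);
    }
}
```

This is why a concurrency figure should always state whether think times are included: the same number of users with zero think time would generate several times the request rate.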


The final tabulated Hardware Sizing Recommendation for the Java/Java EE Product will look like the following. (One table is shown here for the 'Development' environment; considerations for the 'Production/UAT' environments are provided below.)

The Load Balancing, Data Clustering, Failover Strategy and Backup Strategy are not planned for, due to the nature of the System.

FIELD NAME | FIELD TYPE
Type of Environment | Development [/Testing]
Type of Machines | Physical [/Virtual]
Number of Servers | 1x
Operating System | Red Hat Enterprise Linux - Linux X.Y.ZZ-AAA.BB.C.eRR.xpp_bb OS
Application Server | Weblogic ??c (Weblogic ??.?.?*)
Load Balancing | [NONE]
Data Clustering | [NONE]
Failover Strategy | [NONE]
Database Connections | 10 [maxActive], 02 [maxIdle]
Backup Strategy | [NONE]
Processors | 4 Cores
Concurrency | ~500 Concurrent Users [Including Think Times]
Memory / RAM | 4GB
Garbage Collection | Generational Garbage Collector [-XX:+UseG1GC]
Disk Capacity [Reasons] | ~10GB SSD [/HDD] [Logs, Indexes, Dependencies, +Buffer]
   Lucene Indexing | ~300MB [Worst Case, +Buffer]
Java Heap Size | Dedicated Machine [-Xms??g -Xmx??.?g]
   Lucene | ~100MB [Worst Case, +Buffer]
   Second Level Caching | ~000MB [NONE]
 
This recommendation is for the Development Environment. It is best that the above is used or emulated for both Development and Testing. For Production or User Acceptance Testing environments, the following considerations (with our recommendations in brackets) should best match other Organizational or Hardware Tier Standards: Storage Capacity [500GB SSD], Storage Redundancy [RAID], Processor Cores [08+], Total Memory (RAM) [08GB+] and Application Failover Strategy [Active-Active with 4x Physical Servers].


I am giving you the Following Links, which can be used as Reference to Get the Best Results:


Happy Hardware Sizing for Java/Java EE Products!


[Note: I am a Software Development Architect, working for a US based Software Product Company and this write-up is based on the work done as part of Special Product Customization for a Big Logistics Customer, as well as for later use in the Product Itself].