FCI-Helwan blog

Just another FCI-H weblog

J2EE Design patterns Part 1

Any enterprise application is divided into tiers

1- Client presentation tier.

2- Server-side presentation tier.

3- Server-side business logic tier.

4- Server-side domain model.

5- Enterprise integration tier.


The first three tiers are defined in Java by Sun, but you can add the last two tiers according to your application's needs.

The client presentation tier is the user interface for the end user; the client can be either a thin or a fat client. This tier can be implemented using HTML and JavaScript.

As you noticed, the presentation tier is divided into two tiers: the server-side and the client presentation tier. The server-side presentation tier is responsible for providing the client presentation tier with the material to be shown to the end user. The server-side presentation tier can be implemented using Servlets and JSPs.

While considering the verbs of the system (e.g. purchase, remove, etc.), you are dealing with the server-side business logic tier. It handles the business logic of the system.

The extra two tiers are the server-side domain model, which holds all the models used by the business logic. In this tier we consider the nouns of the system (e.g. Order, Customer, etc.). The final tier is the enterprise integration tier, which handles connections between your system and other systems (e.g. web services, CORBA, etc.).

When evaluating the end product of any enterprise development project, we can score it on four factors: Extensibility, Scalability, Reliability, and Timeliness. Using design patterns increases the score of the project.

Patterns will be divided according to the mentioned tiers, starting with the presentation tier, where most of the changes occur. These patterns make the presentation tier more extensible.

A- Model-View-Controller Pattern.

The application is divided into JavaBeans, JSPs, and Servlets: JavaBeans carry the data from the business logic to the view, JSPs render the views, and Servlets handle navigation between views.

We have two models of the MVC pattern. The first model is making a controller (Servlet) for each view, but it's not the best practice: if you need to add a logging mechanism, it will cost you time and effort to handle it in all controllers. The second approach is making one controller that dispatches the requests to each service; this isn't the best practice either, because adding a new feature means you have to retest and redeploy the whole controller.

B- Front Controller pattern.

The front controller merges the two MVC models by using one front controller that performs the common functions and delegates the specific ones to a page-specific controller. The front controller is also responsible for redirecting to pages. The page-specific controllers can be Servlets or simple classes based on the GoF command pattern. A good example is the Actions in the Struts framework: classes that share a common parent class, where each class performs page-specific functionality. In the front controller you can add common functionality so that you won't have duplicate functions in the page-specific controllers. Adding common functionality in the front controller can be done using the GoF decorator pattern, so that you can extend functionality dynamically.
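To make this concrete, here is a minimal sketch of a front controller servlet with command-style page controllers; the class names and URL mappings (PageController, /login, and so on) are made up for illustration:

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class FrontControllerServlet extends HttpServlet {

    // GoF command pattern: one handler interface, one implementation per page.
    interface PageController {
        String handle(HttpServletRequest request);
    }

    private final Map<String, PageController> controllers =
            new HashMap<String, PageController>();

    public void init() {
        controllers.put("/login", new PageController() {
            public String handle(HttpServletRequest request) {
                return "/login.jsp"; // page-specific work would go here
            }
        });
    }

    protected void service(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        log("request: " + req.getPathInfo()); // a common function lives in one place
        PageController controller = controllers.get(req.getPathInfo());
        String view = (controller != null) ? controller.handle(req) : "/notFound.jsp";
        req.getRequestDispatcher(view).forward(req, resp); // redirect to the chosen page
    }
}

Cross-cutting features such as logging are written once here instead of being repeated in every page-specific controller.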


June 15, 2009 Posted by | Design Patterns, JAVA | 1 Comment

Optimizing Hibernate Usage

Well, getting straight to the point: using hibernate is totally different from using it effectively. You can set up and put hibernate in action very fast in your application and enjoy ORM features; however, as any project grows and expands with time, you may find the application acting slow and putting overhead on the DBMS. Most people – (mostly seniors and architects :D) – just blame hibernate for being inefficient and tend to prove that the ORM concept itself is wrong and that hibernate just saves some JDBC code at a great performance tradeoff. This article aims at helping you optimize your usage of hibernate to minimize the performance penalty. We assume you already have some experience with basic usage of hibernate, so if you are an absolute beginner this is not for you.

Before we proceed, there are two rules that we need to stress:
1- Using hibernate does not completely isolate the developer from the database; the relational model is still involved.
2- Hibernate is not some sort of magic: it cannot predict what you want, it will do what you configured it to do.

===== Basic Principles =====
99% of the problems faced with hibernate are due to misconfiguration or misunderstanding, the two common causes when dealing with any framework that abstracts details behind the scenes. In order to start using hibernate effectively, we need to talk about the following terms:

1- Session.
2- Object state.
3- Transaction.
4- Locking.
5- Caching.
6- Connection Pooling.

Session: A session object in hibernate is the place created to store and manage objects and their states in memory; it can be considered a cache. In other words, it is the heart of communication between your business model and the relational model represented by the DBMS. When you retrieve an object from the DB it is stored in the session; when you update that object and call session.save(object), the session manages to synchronize the data in the database with the object in the session. A session's life ends once the thread which created it is killed. So, when a web user requests a web page which needs to contact the DB, the thread that was initiated to handle the HTTP request creates the session to be used as long as the thread is alive; once the HTTP request is done and the thread is killed, the session object is killed too.
When you retrieve an object, hibernate checks if the object is in the session first; if not, it gets it from the DB and saves it in the session object for further requests.
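A minimal sketch of this life cycle, assuming a configured SessionFactory and a mapped Item class (both hypothetical):

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

Item item = (Item) session.get(Item.class, itemId);  // first call hits the DB
Item again = (Item) session.get(Item.class, itemId); // second call is served from the session

item.setPrice(42.0); // the session notices the change (dirty checking)...
tx.commit();         // ...and synchronizes it with the database
session.close();     // the session's life ends here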

Object State: The word “state” here means how hibernate sees an object from the session perspective; an object has one of 3 states:

1- Transient: The object has not been saved to the DB yet.
2- Persistent: The object has already been saved to the DB and a hibernate session manages it.
3- Detached: The object has an entry in the DB, but the session that saved or retrieved it has been closed, so the object now has no associated session to manage it.

A developer needs to keep the states of objects in mind, because this can lead to many problems if not handled correctly. For example: a user requested a web page which loaded an object from the DB; that object is to be updated by the client, but the client took too long to perform the update. After he submitted the page, the business logic simply calls session.update(object), but there is a problem: the hibernate session which retrieved the object in the first place has been closed, so the object is now detached and cannot be updated until it is re-attached to a hibernate session again. There are many solutions for this problem, which we might look at later.
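One common fix is re-attaching the detached object to a fresh session before updating it; a sketch (it assumes the customer object was kept around, e.g. in the HTTP session):

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

session.update(customer); // re-attaches the detached instance to this session
// alternatively: Customer managed = (Customer) session.merge(customer);

tx.commit();
session.close();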

Transaction: A transaction in hibernate terminology is any kind of operation taking place between the application and the database; do not mix this up with atomic database transactions. A transaction in hibernate is any data manipulation operation: when you need to reflect anything to the database, you need to put your code between beginTransaction and commit calls.
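The usual shape of such a unit of work looks like this sketch; anything that must reach the database goes between the two calls:

Session session = sessionFactory.openSession();
Transaction tx = null;
try {
    tx = session.beginTransaction();
    // ... data manipulation operations ...
    tx.commit();   // flushes the session and reflects the changes to the DB
} catch (RuntimeException e) {
    if (tx != null) tx.rollback();
    throw e;
} finally {
    session.close();
}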

Locking: When a DB record is accessed by many threads, we need to manage the way these threads deal with the DB. A typical situation is where a thread reads a certain value, and before it updates it, another thread has already put another value there; the information now being processed by thread one is invalid. Locking is the mechanism which decides the action taken by the DB while some records are being read. One solution for the previously mentioned problem is to tell the DBMS not to permit any updates to a certain record while a thread is reading it and intending to update it. There are different locking mechanisms for different situations. Hibernate manages locking for objects at the DB level, which means that hibernate never locks an object in memory – (in other words, in the session) – so always keep in mind that hibernate only supports a certain locking mechanism if the underlying DBMS supports it.

Caching: Caching is a way to save extra hits to the database to retrieve info. A cache saves information in memory for fast retrieval, and the cache is also responsible for keeping the stored data valid and synchronized with the DB. Caching in hibernate takes place at 2 levels:

1- First level cache: This is the session object itself. However, as we said before, a session is valid only as long as the thread that created it is running, so it is a per-thread caching layer. This caching cannot be disabled: there is no way to turn off caching in the hibernate session.

2- Second level cache: This can be any caching manager, e.g. ehcache. A second level cache is a cache for the whole application; it is not tied to a specific thread and is used as long as the application is running. If you use a second level cache and try to retrieve an object from the DB, hibernate will first check the first level cache (the session); if it does not find the object, or finds it invalid, the first level cache will check with the second level one; if that does not find it either, the second level cache will get a fresh copy from the DB and save it for further use.
Using a second level cache needs caution: objects in the second level cache may in some cases not be synchronized with the DB or the first level cache, and there are some types of applications that cannot use second level caching by any means, like financial applications.

Connection Pooling: Like any pool, this is a pool containing open connections to the DB. Pooling techniques serve performance by keeping it fast and reliable to talk to the DB without the overhead of always opening and closing connections; also, a connection pool may control the number of open connections to a DB so no overload occurs. Hibernate ships with a connection pooling provider, C3P0; however, you can use any other pooling mechanism.

===== Approaches to optimizing hibernate =====

We will now try to define a roadmap for how you can start the optimization process. Surely we cannot describe all problems and techniques, but we try to provide your first steps towards optimization.

Some advice:
1- Never try to start optimizing an application using hibernate from day zero; you will never be able to tune it effectively. Wait until you have a running application and then start to optimize.
2- Using a load testing tool along with a profiler may help a lot in determining bottlenecks and badly written code.
3- Tuning requires a lot of team work, effort, and time; it is not always hibernate that causes the bad behavior, it may be the DB, Java itself, the network, or server issues.
4- Review and fix your application's object model design, as it directly affects your relational model; many problems arise from bad object models, and the more you enhance your object model, the more trouble you save yourself in the optimization process.
5- Configure hibernate to write the generated queries to the standard output; this will help you very much in tracing problems.
6- The ultimate solution for your problems is not here: every problem has its specific solution, and you have to explore alternatives to decide what to use; there is nothing totally right or totally wrong.

===== Let's Go =====
1- It all starts from mapping:
As a hibernate developer, you must have good knowledge of the “.hbm” files even if you use a generation tool. Mapping is a critical section, as a wrongly or badly mapped object may result in unwanted heavy behavior. Make sure you define the relations between objects in the right way; the tags you write in the mappings define how hibernate will build the queries that retrieve and save your objects.

Hint 1: When mapping a one-to-one relation, it is very effective to use the constrained="true" attribute if you are sure that there is a record matching the foreign key association. This way, hibernate will not attempt a lookup before selecting the record; it will directly go and pick it up, as it knows in advance of its existence.
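As a rough sketch, in the .hbm file this looks like the following (the Person/Passport classes and columns are made-up examples):

<class name="Person" table="PERSON">
    <id name="id" column="PERSON_ID">
        <generator class="native"/>
    </id>
    <!-- constrained="true": a matching Passport row is guaranteed to exist -->
    <one-to-one name="passport" class="Passport" constrained="true"/>
</class>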

Hint 2: Revisit your fetching strategy for objects. The “fetch” attribute tells hibernate how to retrieve a certain relation, and it has many options: you may need to tell hibernate to use a certain type of join for a specific object, or a sub-select for another, depending on the situation.

2- Queries:
Your queries may be the source of the problem; you may override an association strategy defined in the mapping file and then wonder why it is going wrong! Using SQL, HQL, or the Criteria API are all valid options for object retrieval. The Criteria API just gets translated to HQL before execution, so using HQL directly is slightly better in terms of performance. Using SQL is needed in some cases where HQL lacks a certain feature of a specific DBMS.

Using query caching may be effective in cases where the application does not write too much to the DB, as the cached result set of a query becomes invalid whenever an update, delete, or insert takes place on the underlying table. You may enable query caching by setting
hibernate.cache.use_query_cache true
and calling setCacheable(true) on the query object before executing the query.
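A sketch of both steps together, assuming a mapped Product class (hypothetical):

// hibernate configuration: hibernate.cache.use_query_cache=true
Query query = session.createQuery("from Product p where p.category = :cat");
query.setParameter("cat", category);
query.setCacheable(true); // results are stored in / served from the query cache
List products = query.list();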

Hint 1: Suppose a user logs in to your website. Simply put, you will hit the database to get the user profile object in order to compare the password supplied by the user. Using a default hibernate get operation will retrieve the whole object graph, which may include references to other objects, and as a result a bad-looking join SQL statement may be produced; at the end we have a heavy object in memory only for getting a simple string value. In situations like this, you need a retrieval approach that gets only the information you need into the object that is built and returned. You need to see how to use hibernate to build objects out of only certain information.
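For example, an HQL projection can pull out just the single column instead of the whole object graph; the User class and its properties here are hypothetical:

String storedPassword = (String) session.createQuery(
        "select u.password from User u where u.login = :login")
        .setParameter("login", login)
        .uniqueResult(); // one string instead of a heavy object graph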

3- Locking:
Ok, you used a load testing tool and booom… a locking problem occurs: you find that your application messed up the records, so you decided to move to pessimistic locking, but, oops, the application is now having deadlocks. This problem mostly arises from the DBMS; as mentioned before, hibernate uses the locking mechanisms provided by the DBMS and does no memory locking, so a problem like this may need a DBA to configure the DBMS correctly to work smoothly with your application in locking situations. From the hibernate side, you need to revisit the queries which handle locking scenarios in your application; pay attention when writing such queries, as they may mess things up if not written with caution. (See how to use locking with hibernate.)

Hint 1: A problem might take place if you lock some object for update and then do nothing with the hibernate transaction that initiated the lock. This can be solved by making sure that under all circumstances the transaction will either commit or roll back, so the lock is released. Another solution is to set the locking timeout of the DBMS itself, so that a lock is released after some time whether or not the transaction issues any further action.
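A sketch of a pessimistic lock taken on retrieval and reliably released; the Account class is hypothetical, and LockMode.UPGRADE translates to SELECT ... FOR UPDATE on databases that support it:

Transaction tx = session.beginTransaction();
try {
    Account account = (Account) session.get(Account.class, accountId, LockMode.UPGRADE);
    account.setBalance(account.getBalance() - amount);
    tx.commit();   // commit (or the rollback below) releases the DB lock
} catch (RuntimeException e) {
    tx.rollback();
    throw e;
}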

4- Connection Pooling:
As mentioned before, using connection pooling is a very good practice if done properly. Besides configuring a pool manager, you need to pay your code a visit to make sure no thread is left somewhere holding a connection open; this situation is mostly caused by a missing or misplaced connection close call. Optimizing the usage of a connection pool may also need the advice of the DBA.

June 10, 2009 Posted by | Hibernate, JAVA | 1 Comment

Introduction to XML


SGML (Standard Generalized Markup Language): a standard in which one can define markup languages for documents.

HTML: Hypertext Markup Language.

XML: Extensible Markup Language, a markup language that you can use to create your own tags.

XML was created to overcome the limitations of HTML. Although HTML is a very successful markup language, it only displays data, without understanding the data or giving any ability to analyze it.

So the main advantages of XML are the ability to analyze the data and to search inside an XML document. XML is also used for data interchange: organizations can exchange data in XML and then convert this data to database records easily.

XML document rules

1- Root element:

An XML document must be contained in a single element. That single element is called the root element, and it contains all the text and any other elements in the document.

2- Elements cannot overlap:

If you start element <a> and open element <b> inside it, you must close <b> first and then close <a>.

3- End tags are required

Each element must have an end tag.

4- Elements are case sensitive

5- Attributes must have quoted values

• Attributes must have values.

• Those values must be enclosed within quotation marks.
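Putting the rules together, a minimal well-formed document looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<order id="42">                        <!-- single root element -->
    <customer>Ahmed</customer>         <!-- end tags required, no overlap -->
    <item quantity="2">Keyboard</item> <!-- attribute values quoted -->
</order>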

You can enforce a predefined structure using a document type definition (DTD).

A DTD defines the elements that can appear in an XML file and the order of the elements. Another approach to predefined structures is XML Schema.

XML Programming Interfaces

This section focuses on the programming interfaces used to deal with XML documents.

There are a lot of programming APIs available. Here we have the most popular ones: the Document Object Model (DOM), the Simple API for XML (SAX), JDOM, and the Java API for XML Parsing (JAXP).

  • Document Object Model (DOM):

Defines a set of interfaces to the parsed version of an XML document. The parser reads in the entire document and builds an in-memory tree, so your code can then use the DOM interfaces to manipulate the tree. You can move through the tree to see what the original document contained, you can delete sections of the tree, you can rearrange the tree, add new branches, and so on.

DOM has some issues: building the whole XML document in memory consumes time, especially with large documents. What if I need only a specific part of the document? It doesn't make sense to load the entire document.

  • Simple API for XML (SAX):

SAX handles a lot of DOM's issues. SAX is based on events: first you define which events matter to you and the type of data each event carries; the parser then goes through the document and throws events at the start and end of each element and of the document itself. If you don't save the data from an event, it is discarded. As you can see, SAX doesn't hold the entire document in memory, so it saves time. But one of SAX's issues is that it is stateless.

  • JDOM

JDOM is a set of Java classes developed to make it easier to use the DOM and SAX parsers. JDOM wraps the DOM and SAX interfaces and gives you high-level classes that reduce the amount of code; JDOM does most of the parsing work for you.

  • Java API for XML Parsing (JAXP).

Although DOM, SAX, and JDOM provide standard interfaces for most common tasks, there are still several things they don’t address. For example, the process of creating a DOMParser object in a Java program differs from one DOM parser to the next. To fix this problem, Sun has released JAXP, the Java API for XML Parsing. This API provides common interfaces for processing XML documents using DOM, SAX, and XSLT. JAXP provides interfaces such as the DocumentBuilderFactory and the DocumentBuilder that provide a standard interface to different parsers. There are also methods that allow you to control whether the underlying parser is namespace-aware and whether it uses a DTD or schema to validate the XML document.
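A minimal JAXP sketch: the factory hides which parser implementation is used (the input file name is made up):

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class JaxpExample {
    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // parser features are controlled here
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse("order.xml"); // hypothetical input file
        System.out.println(doc.getDocumentElement().getTagName());
    }
}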

June 9, 2009 Posted by | JAVA, Learning Materials, XML | Leave a comment

Handling Hibernate Session

Handling the hibernate session is very important, and the way you handle the session directly affects the performance of the application.

You can handle the hibernate session per service: with each request for a service you open a new session, and at the end of the service you close it. In this approach you have to be sure that you won't need the session again (for lazy fetching, for example).

P.S. Setting lazy fetching to false will affect performance.

Another approach to handling the session is session per view: when an HTTP request has to be handled, a new session and database transaction begin; right before the response is sent to the client, after all the work has been done, the transaction is committed and the session is closed. In this approach you can use the Filter from the servlet API. A variation of this approach, called session-per-conversation, keeps the session open across several requests and is used when implementing long conversations.
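A sketch of such a filter (it assumes getCurrentSession() is configured for thread-based context, and HibernateUtil is a hypothetical helper holding the SessionFactory):

import java.io.IOException;
import javax.servlet.*;
import org.hibernate.SessionFactory;

public class SessionPerViewFilter implements Filter {
    private SessionFactory sessionFactory;

    public void init(FilterConfig config) {
        sessionFactory = HibernateUtil.getSessionFactory(); // hypothetical helper
    }

    public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
            throws IOException, ServletException {
        try {
            sessionFactory.getCurrentSession().beginTransaction();
            chain.doFilter(req, resp); // servlets and JSPs run here
            sessionFactory.getCurrentSession().getTransaction().commit();
        } catch (RuntimeException e) {
            sessionFactory.getCurrentSession().getTransaction().rollback();
            throw e;
        }
    }

    public void destroy() {}
}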

The third technique, and the most recommended one, is using the Java Transaction API (JTA). Hibernate can automatically bind the “current” session to the current JTA transaction. This enables an easy implementation of the session-per-request strategy with the getCurrentSession() method on your SessionFactory:

UserTransaction tx = null; // declared outside try so the catch block can see it
try {
    tx = (UserTransaction) new InitialContext()
                               .lookup("java:comp/UserTransaction");

    tx.begin();
 
    // Do some work
    factory.getCurrentSession().load(...);
    factory.getCurrentSession().persist(...);
 
    tx.commit();
}
catch (RuntimeException e) {
    if (tx != null) tx.rollback();
    throw e; // or display error message
}

Finally, if you don't have JTA and don't want to deploy it along with your application, you will usually have to fall back to JDBC transactions. Instead of calling the JDBC API directly, you had better use Hibernate's Transaction and the built-in session-per-request functionality:

    factory.getCurrentSession().beginTransaction();
    factory.getCurrentSession().getTransaction().commit(); // after doing some work

June 8, 2009 Posted by | Hibernate, JAVA | 1 Comment

Intro to Caching, Caching Algorithms and Caching Frameworks Part 1

Introduction:

A lot of us have heard the word cache, and when you ask people about caching they give you a perfect answer, but they don't know how it is built, or on which criteria one should favor one caching framework over another, and so on. In this article we are going to talk about caching, caching algorithms, and caching frameworks, and which is better than the other.

The Interview:

“Caching is a temp location where I store data that I need frequently; as the original data is expensive to fetch, this way I can retrieve it faster.”

That is what programmer 1 answered in the interview (one month ago he had submitted his resume to a company that wanted a Java programmer with strong experience in caching and caching frameworks and extensive data manipulation).

Programmer 1 had made his own cache implementation using a hashtable, and that is all he knows about caching; his hashtable contains about 150 entries, which he considers extensive data (caching = hashtable: load the lookups into a hashtable and everything will be fine, nothing else). So let's see how the interview goes.

Interviewer: Nice. And based on what criteria do you choose your caching solution?

Programmer 1: huh, (thinking for 5 minutes), mmm, based on, on, on the data (coughing…)

Interviewer: Excuse me! Could you repeat what you just said?

Programmer 1: data?!

Interviewer: Oh, I see. Ok, list some caching algorithms and tell me which is used for what.

Programmer 1: (staring at the interviewer and making strange expressions with his face, expressions that no one knew a human face could make 😀)

Interviewer: Ok, let me ask it another way: how will a cache behave when it reaches its capacity?

Programmer 1: Capacity? Mmm (thinking… a hashtable is not limited by capacity, I can add what I want and it will extend itself) (that was in programmer 1's mind; he didn't say it)

The interviewer thanked programmer 1 (the interview only lasted 10 minutes); after that a woman came in and said: oh, thanks for your time, we will call you back, have a nice day.
This was the worst interview programmer 1 ever had (he hadn't read the part of the job description stating that the candidate should have a strong caching background; in fact he only saw the line talking about the excellent package 😉).

Talk the talk and then walk the walk

After programmer 1 left, he wanted to know what the interviewer had been talking about and what the answers to his questions were, so he started to surf the net. Programmer 1 didn't know anything about caching except: when I need a cache, I will use a hashtable.
After using his favorite search engine, he was able to find a nice caching article and started to read.

Why do we need cache?

A long time ago, before the caching age, users used to request an object, and this object was fetched from a storage place; as the object grew bigger and bigger, the user had to spend more time waiting for his request to be fulfilled. It really made the storage place suffer a lot, because it had to be working the whole time. This made both the user and the DB angry, and there was one of 2 possibilities:

1- The user would get upset, complain, and even stop using the application (that was always the case).

2- The storage place would pack its bags and leave your application, and that made big problems (no place to store data) (happened in rare situations).

Caching is a godsend:

A few years later, researchers at IBM (in the ’60s) introduced a new concept and named it “Cache”.

What is Cache?

Caching is a temp location where I store data that I need frequently; as the original data is expensive to fetch, this way I can retrieve it faster.

A cache is made up of a pool of entries; these entries are copies of real data stored elsewhere (in a database, for example), and each entry is tagged with a tag (a key identifier) used for retrieval.
Great, so programmer 1 already knows this, but what he doesn't know is the caching terminology, which is as follows:

Cache Hit:

When the client invokes a request (let's say he wants to view product information) and our application gets the request, it will need to access the product data in our storage (the database); it first checks the cache.

If an entry can be found with a tag matching that of the desired data (say the product id), the entry is used instead. This is known as a cache hit (the cache hit is the primary measurement of caching effectiveness; we will discuss that later on).
The percentage of accesses that result in cache hits is known as the hit rate or hit ratio of the cache.

Cache Miss:

On the contrary, when the tag isn't found in the cache (no match was found), this is known as a cache miss: a hit to the back storage is made, the data is fetched back and placed in the cache, so that future requests will find it and result in a cache hit.

If we encounter a cache miss, there are two possible scenarios:

First scenario: there is free space in the cache (the cache hasn't reached its limit), so the object that caused the cache miss will be retrieved from our storage and inserted into the cache.

Second scenario: there is no free space in the cache (the cache has reached its capacity), so the object that caused the cache miss will be fetched from storage, and then we have to decide which object to evict from the cache in order to place the newly retrieved one. This is done by the replacement policy (caching algorithms), which decides which entry to remove to make room, and which will be discussed below.

Storage Cost:

When a cache miss occurs, data will be fetched from the back storage, loaded, and placed in the cache. But how much space does the data we just fetched take in the cache memory? This is known as the storage cost.

Retrieval Cost:

And when we need to load the data, we need to know how much it takes to load it. This is known as the retrieval cost.

Invalidation:

When an object that resides in the cache is updated in the back storage, the cached copy needs to be updated too; keeping the cache up to date like this is known as invalidation.
The entry is invalidated in the cache and fetched again from the back storage to get an updated version.

Replacement Policy:

When a cache miss happens, the cache evicts some other entry in order to make room for the previously uncached data (in case we don't have enough room). The heuristic used to select the entry to evict is known as the replacement policy.

Optimal Replacement Policy:

The theoretically optimal page replacement algorithm (also known as OPT or Belady's optimal page replacement policy) tries to achieve the following: when a new object needs to be placed in the cache, replace the entry which will not be used for the longest period of time.

For example, a cache entry that is not going to be used for the next 10 seconds will be replaced by an entry that is going to be used within the next 2 seconds.

Thinking about the optimal replacement policy, we can say it is impossible to achieve, but some algorithms come close to it based on heuristics.
So everything is based on heuristics; what, then, makes one algorithm better than another? And what do they use for their heuristics?

Nightmare at Java Street:

While reading the article, programmer 1 fell asleep and had a nightmare (the scariest nightmare one can ever have).

Programmer 1: nihahha, I will invalidate you. (talking in a mad way)

Cached Object: no no please let me live, they still need me, I have children.

Programmer 1: all cached entries say that before they are invalidated, and since when do you have children? Never mind, now vanish forever.

Buhaaahaha, laughed programmer 1 in a scary way, and silence took over the place for a few minutes; then a police siren broke the silence. The police caught programmer 1, and he was accused of invalidating an entry that was still needed by a cache client, and he was sent to jail.

Programmer 1 woke up really scared; he looked around, realized it was just a dream, then continued reading about caching and tried to get rid of his fears.

Caching Algorithms:

No one can talk about caching algorithms better than the caching algorithms themselves

Least Frequently Used (LFU):

I am Least Frequently Used; I count how often an entry is needed by incrementing a counter associated with each entry.

I evict the entry with the least-frequently-used counter first. I am not that fast, and I am not that good at adaptive actions (that is, at keeping the entries which are really needed and discarding the ones that aren't, based on the access pattern, or in other words the request pattern).

Least Recently Used (LRU):

I am the Least Recently Used cache algorithm; I remove the least recently used items first: the one that wasn't used for the longest time.

I require keeping track of what was used when, which is expensive if one wants to make sure I always discard the least recently used item.
Web browsers use me for caching. New items are placed at the top of the cache; when the cache exceeds its size limit, I discard items from the bottom. The trick is that whenever an item is accessed, I place it at the top.

So items which are frequently accessed tend to stay in the cache. There are two ways to implement me: an array or a linked list (which has the least recently used entry at the back and the recently used ones at the front).

I am fast, and I am adaptive, in other words I can adapt to the data access pattern. I have a large family which completes me, and they are even better than me (I do feel jealous sometimes, but it is ok); some of my family members are LRU2 and 2Q (they were implemented in order to improve LRU caching).
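A minimal Java sketch of me, built on LinkedHashMap's access order (capacity handling only, no thread safety):

import java.util.LinkedHashMap;
import java.util.Map;

public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder=true: get() moves an entry to the "top"
        this.capacity = capacity;
    }

    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // discard from the "bottom" when the limit is exceeded
    }
}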

Least Recently Used 2 (LRU2):

I am Least Recently Used 2; some people call me Least Recently Used Twice, which I like more. I add entries to the cache the second time they are accessed (it takes two accesses to place an entry in the cache); when the cache is full, I evict the entry whose second most recent access is oldest. Because of the need to track the two most recent accesses, the access overhead increases with the cache size; if I am applied to a big cache, that can be a problem, which is a disadvantage. In addition, I have to keep track of some items that are not yet in the cache (they haven't been requested twice yet). I am better than LRU, and I am also adaptive to access patterns.

-Two Queues:

I am Two Queues; I add entries to an LRU cache as they are accessed. If an entry is accessed again, I move it to a second, larger LRU cache.

I remove entries so as to keep the first cache at about 1/3 the size of the second. I provide the advantages of LRU2 while keeping the cache access overhead constant, rather than having it increase with the cache size, which makes me better than LRU2; and like the rest of my family, I am adaptive to access patterns.

Adaptive Replacement Cache (ARC):

I am Adaptive Replacement Cache; some people say that I balance between LRU and LFU to improve the combined result. Well, that's not 100% true: actually I am made of 2 LRU lists. One list, say L1, contains entries that have been seen only once “recently”, while the other list, say L2, contains entries that have been seen at least twice “recently”.

The items that have been seen twice within a short time have a low inter-arrival rate and hence are thought of as “high-frequency”. So we think of L1 as capturing “recency” and L2 as capturing “frequency”; that is why most people think I am a balance between LRU and LFU, but that is ok, I am not angry about it.

I am considered one of the best-performing replacement algorithms: self-tuning and a low-overhead replacement cache. I also keep a history of entries equal to the size of the cache; this is to remember the entries that were removed, and it allows me to see whether a removed entry should have stayed and another one should have been chosen for removal (I really have a bad memory). And yes, I am fast and adaptive.

Most Recently Used (MRU):

I am Most Recently Used; in contrast to LRU, I remove the most recently used items first. You will surely ask me why; well, let me tell you something: when access is unpredictable, and determining the least recently used entry in the cache system is a high-time-complexity operation, I am the best choice, that's why.

I am quite common in database memory caches: whenever a cached record is used, I move it to the top of the stack; and when there is no room left, guess what? I replace the top-most entry with the new entry.

First in First out (FIFO):

I am First In First Out; I am a low-overhead algorithm that requires little effort to manage the cache entries. The idea is that I keep track of all the cache entries in a queue, with the most recent entry at the back and the earliest entry at the front. When there is no place left and an entry needs to be replaced, I remove the entry at the front of the queue (the oldest entry) and replace it with the newly fetched one. I am fast, but I am not adaptive.

-Second Chance:

Hello, I am Second Chance, a modified form of the FIFO replacement algorithm, better than FIFO at little extra cost. I work by looking at the front of the queue as FIFO does, but instead of immediately replacing the cache entry (the oldest one), I check whether its referenced bit is set (I use a bit that tells me whether this entry has been used or requested before or not). If it is not set, I replace the entry; otherwise, I clear the referenced bit and insert the entry at the back of the queue (as if it were a new entry), and keep repeating this process. You can think of this as a circular queue. The second time I encounter an entry whose bit I cleared before, I replace it, as its referenced bit is now clear. I am better than FIFO in speed.

-Clock:

I am Clock, and I am a more efficient version of FIFO than Second Chance, because I don't push the cached entries to the back of the list like Second Chance does; I perform the same general function as Second Chance though.

I keep a circular list of the cached entries in memory, with the “hand” (something like an iterator) pointing to the oldest entry in the list. When a cache miss occurs and no empty place exists, I consult the R (referenced) bit at the hand's location to decide what to do: if R is 0, I place the new entry at the hand's position; otherwise I clear the R bit, increment the hand (iterator), and repeat the process until an entry is replaced. I am faster even than Second Chance.
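A simplified Java sketch of my sweep (fixed capacity, no thread safety; a real implementation would need more care):

import java.util.HashMap;
import java.util.Map;

public class ClockCache<K, V> {
    private static class Slot<K, V> { K key; V value; boolean referenced; }

    private final Slot<K, V>[] slots;
    private final Map<K, Integer> index = new HashMap<K, Integer>();
    private int hand = 0; // points at the oldest entry

    @SuppressWarnings("unchecked")
    public ClockCache(int capacity) {
        slots = new Slot[capacity];
        for (int i = 0; i < capacity; i++) slots[i] = new Slot<K, V>();
    }

    public V get(K key) {
        Integer i = index.get(key);
        if (i == null) return null;  // cache miss: the caller fetches and put()s
        slots[i].referenced = true;  // set the R bit
        return slots[i].value;
    }

    public void put(K key, V value) {
        Integer existing = index.get(key);
        if (existing != null) {      // already cached: just refresh it
            slots[existing].value = value;
            slots[existing].referenced = true;
            return;
        }
        // Sweep: clear R bits until a slot with R == 0 (or an empty slot) is found.
        while (slots[hand].key != null && slots[hand].referenced) {
            slots[hand].referenced = false;
            hand = (hand + 1) % slots.length;
        }
        if (slots[hand].key != null) index.remove(slots[hand].key); // evict
        slots[hand].key = key;
        slots[hand].value = value;
        slots[hand].referenced = true;
        index.put(key, hand);
        hand = (hand + 1) % slots.length;
    }
}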

Simple time-based:

I am simple time-based caching; I invalidate entries in the cache based on absolute time periods: I add items to the cache, and they remain there for a specific amount of time. I am fast, but not adaptive to access patterns.

Extended time-based expiration:

I am extended time-based expiration caching; I invalidate the items in the cache based on relative points in time: I add items to the cache, and they remain there until I invalidate them at certain points in time, such as every five minutes, or each day at 12:00.

Sliding time-based expiration:

I am sliding time-based expiration; I invalidate entries in the cache by specifying the amount of time an item is allowed to be idle in the cache after its last access; after that time, I invalidate it. I am fast, but not adaptive to access patterns.

Ok, now that we have listened to some (famous) replacement algorithms talking about themselves, note that other replacement algorithms take additional criteria into consideration, such as:

Cost: if items have different costs, keep those items that are expensive to obtain, e.g. those that take a long time to get.

Size: If items have different sizes, the cache may want to discard a large item to store several smaller ones.

Time: Some caches keep information that expires (e.g. a news cache, a DNS cache, or a web browser cache). The computer may discard items because they are expired. Depending on the size of the cache no further caching algorithm to discard items may be necessary.

The E-mail!

After programmer 1 read the article, he thought for a while and decided to send a mail to its author. He felt like he had heard the author's name before, but he couldn't remember who this person was; anyway, he sent him a mail asking: what if he has a distributed environment? How will the cache behave?

The author of the caching article got his mail, and ironically it was the man who had interviewed programmer 1 😀. The author replied:

Distributed caching:

*The cached data can be stored in a memory area separate from the caching directory itself (which handles the caching entries and so on); it can be across the network or on disk, for example.

*Distributing the cache allows an increase in the cache size.

*In this case the retrieval cost will also increase, due to the network request time.

*This will also lead to a hit-ratio increase, due to the larger size of the cache.

But how will this work?

Let's assume that we have 3 servers: 2 of them will handle the distributed caching (hold the caching entries), and the 3rd one will handle all the incoming requests (which ask about cached entries):

Step 1: the application requests keys entry1, entry2, and entry3; after the hash values for these entries are resolved, it is decided, based on the hash value, to forward each request to the proper server.

Step 2: the main node sends parallel requests to all the relevant servers (those which hold the cache entries we are looking for).

Step 3: the servers send their responses to the main node (which sent the request in the 1st place, asking for the cached entries).

Step 4: the main node sends the responses on to the application (the cache client).

*In case a cache entry is not found, the hash value for the entry is still computed and the request is redirected to, say, server 1 or server 2; in this case our entry won't be found on server 1, so it will be fetched from the DB and added to server 1's caching list.
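A toy sketch of the routing decision in step 1; real systems use consistent hashing, and the server names here are made up:

String[] servers = { "cache-server-1", "cache-server-2" }; // hypothetical hosts
for (String key : new String[] { "entry1", "entry2", "entry3" }) {
    int owner = Math.abs(key.hashCode() % servers.length); // pick the owning node
    System.out.println(key + " -> " + servers[owner]);
}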


Measuring Cache:

Most caches can be evaluated by measuring the hit ratio and comparing it to the theoretical optimum; this is usually done by generating a list of cache keys with no real data. But the hit-ratio measurement assumes that all entries have the same retrieval cost, which is not true: in web caching, for example, the number of bytes the cache can serve is more important than the hit ratio (I can replace one big entry with 10 small entries, which is more effective on the web).

Conclusion:

We have seen some of the popular algorithms used in caching; some of them are based on time, some on cache object size, and some on frequency of usage. In the next part we are going to talk about caching frameworks and how they make use of these caching algorithms, so stay tuned 😉

January 5, 2009 Posted by | JAVA | 4 Comments

Scripting in JDK6 (JSR 223) Part 2

It's Lunch Time!

After programmer 2 showed programmer 1 how to deal with JSR 223, the scripting support in JDK 6 (part 1), programmer 2 went for lunch, as he hadn't eaten anything since yesterday, and left programmer 1 to meet his fate.

Programmer 1 began to apply what he had learned from programmer 2, and things went very badly; after 30 minutes programmer 2 came back.

Programmer 2: Man, how is everything going? (while he was just about to finish the last bite of his sandwich)

Programmer 1: man, the performance sucks, it is really, really slow and that won't be a good thing, it is a bottleneck now.

Programmer 2: Ok, let me see what you did. (swallowing his last bite with difficulty)

The Crime!:

View the whole article

December 24, 2008 Posted by | JAVA | Leave a comment

Scripting in JDK6 (JSR 223) Part 1

Introduction:

For sure most of us (mm, guess so) have heard about the scripting support provided in Java 6 (Mustang), a.k.a. JSR 223. The 1st time I saw it (just the title), I thought the Java guys would enable us to compile and run JavaScript scripts, and that's it; well, nope, that isn't what scripting in Java 6 is about. It is actually about enabling scripting languages to access the Java platform and get the advantages of using Java, and about enabling Java programmers to use the engines of such scripting languages and benefit from them (as we will see). After I got it, I was thinking: oh man, if the team who implemented the scripting engine had lived in the dark ages and come up with such an idea, they would have been accused of practicing witchcraft; but thank God we are open-minded now, so they will live, and the good thing is that they will continue to practice witchcraft 😀
So for now the Java guys will stay alive and continue to do their voodoo stuff, and we will be looking at this nice scripting engine.
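As a first taste of that voodoo, a minimal sketch: evaluate a JavaScript snippet from Java and share a variable with it (JDK 6 ships a Rhino-based "JavaScript" engine):

import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

public class HelloScripting {
    public static void main(String[] args) throws Exception {
        ScriptEngineManager manager = new ScriptEngineManager();
        ScriptEngine engine = manager.getEngineByName("JavaScript");
        engine.put("name", "programmer 1");     // expose a Java value to the script
        engine.eval("print('hello, ' + name)"); // run a script string
    }
}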

for more on this topic check my blog

December 13, 2008 Posted by | JAVA | Leave a comment

The new Servlets 3.0 Specifications

Well, the new Servlet 3.0 specification has been proposed.
The new 3.0 specs change the servlet technology upside down, the way EJB 3.0 did.

In the new release, servlets make heavy use of annotations to do away with XML metadata (hell) and to turn all their objects into POJOs.

Many changes have been added to the new specs, such as request suspension and allowing servlets to be added to the web app after deployment, none of which were allowed in the previous release, which held the number 2.5.

you can find more in this interesting article :

http://www.theserverside.com/tt/articles/article.tss?track=NL-461

December 4, 2008 Posted by | JAVA | Leave a comment

Enabling Remote Debugging for tomcat

Sometimes your web application works just fine on your local machine, and when you move it to the deployment environment you see a BOOM. If the deployment environment is a non-GUI machine (which is the common case), it is very hard to debug on that machine. Simply, all you need is a remote debugger that connects to the server machine so you can see what is wrong.

The concept is that you make the web server start the JVM in debug mode and listen on a certain port for remote debuggers to connect. In tomcat, all you need to do is edit catalina.sh: at the beginning of the file you will find a set of lines starting with “#” (they are commented out); uncomment the JAVA_OPTS line and set it as follows, with the value between double quotes:

JAVA_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=8000,server=y,suspend=n"

-Xdebug: tells the VM to start in debugging mode.
-Xrunjdwp: selects the debugging protocol and its options; address: the port to listen on; suspend=n: tells the server to start even if no debugger is connected, i.e. start and don't wait for a debugger to attach.

That’s it, tomcat is waiting for a debugger to connect.
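To attach from the command line, for example, jdb can use the JDI socket-attaching connector (the hostname here is a placeholder for your server):

jdb -connect com.sun.jdi.SocketAttach:hostname=deploy-server,port=8000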

November 26, 2008 Posted by | JAVA | 2 Comments

Learning Jakarta Struts Project


Struts

The Jakarta Struts Project

The Jakarta Struts project is an open-source project sponsored by the Apache Software Foundation. Struts was designed for creating Web applications that easily separate the presentation layer and allow it to be abstracted from the transaction/data layers.

Model View Controller

Model: Represents the data objects. The Model is what is being manipulated and presented to the user.

View: Serves as the screen representation of the Model. It is the object that presents the current state of the data objects.

Controller: Defines the interaction between the view and the model.

The MVC (Model-View-Controller) Model 2:

All the views connect to one controller (a Servlet), and this servlet works as a dispatcher: it handles all the requests and does whatever it needs in the business layer (the Model).

What is Struts?

Struts is a web application framework that forces the user to follow the MVC model. Struts doesn't cover the business layer (the Model) at all; it is concerned only with the controller. Struts also provides a tag library for the presentation layer.

Life Cycle:

Create a new class for each transaction (we will call it an action); this class extends the Action class (in the Struts API) and overrides execute() to handle the business logic this action will perform.
Then add struts-config.xml and define each action with its properties in an action element.
In web.xml we define the ActionServlet (provided by Struts).
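A minimal sketch of such an action class; the name LoginAction and the "success" forward are hypothetical and must match what is declared in struts-config.xml:

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.struts.action.Action;
import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionForward;
import org.apache.struts.action.ActionMapping;

public class LoginAction extends Action {
    public ActionForward execute(ActionMapping mapping, ActionForm form,
                                 HttpServletRequest request, HttpServletResponse response)
            throws Exception {
        // ... business logic for this transaction goes here ...
        return mapping.findForward("success"); // resolved to a URL via struts-config.xml
    }
}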

How Struts works

· When you browse http://……/login.do

· The server checks web.xml and finds the servlet (ActionServlet)

· Then it checks the actions in struts-config.xml and finds this action associated with a specific action class

· The action class performs the code in execute() and returns an ActionForward

· The ActionForward gives the resulting URL

Struts Main Components:

ActionServlet: Plays the role of the controller.

Action Mapping:

· Tells the ActionServlet which Action is associated with each request.

· In struts-config.xml you write the action element, and this is translated into a Java interface called ActionMapping.

Struts-Config.xml

· Three important elements in struts-config.xml:
{form-bean, action-mapping, global-forward}


To Be Continued….

October 12, 2008 Posted by | JAVA | Leave a comment