http://www.ibm.com/developerworks/java/library/j-5things4.html
Tuesday, November 30, 2010
5 things you didn't know about ... java.util.concurrent
ArrayBlockingQueue can give reader and writer threads first in, first out access (it would be more efficient to allow readers to run while other readers held the lock, but you'd risk a constant stream of reader threads keeping the writer from ever doing its job).
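As a minimal sketch of the fairness option described above: the two-argument ArrayBlockingQueue constructor takes a boolean that asks for blocked threads to be served in FIFO order. The class name FairQueueDemo, the capacity of 2 and the single producer are assumptions chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical demo class; not taken from the linked article.
final class FairQueueDemo {

    // Moves `count` integers from a producer thread to the caller through a small fair queue.
    static List<Integer> transfer(int count) {
        // The second constructor argument (true) requests FIFO ordering for waiting threads.
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(2, true);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < count; i++) {
                    queue.put(i); // blocks while the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        List<Integer> received = new ArrayList<>();
        try {
            for (int i = 0; i < count; i++) {
                received.add(queue.take()); // blocks while the queue is empty
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while consuming", e);
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(transfer(5));
    }
}
```

Fairness generally costs some throughput, so it is probably only worth enabling when starvation of waiting threads is a real concern.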
10 things you didn't know about - Java performance monitoring
- jstack will get a stack dump from any process
jmap can produce a dump of the heap, or a histogram of live classes showing how many instances there are and the space used by them.
jhat supports analysing heap dumps obtained from jmap or jconsole or HotSpotDiagnostic.dumpHeap
JConsole is a built-in Java performance profiler that works from the command-line and in a GUI shell. It's not perfect, but it's an adequate first line of defense
The most effective response to a performance problem is to use a profiler rather than reviewing the code or JVM garbage collector flags.
Monitor the class count - if the count steadily rises, then you can assume that either the app server or your code has a ClassLoader leak somewhere and will run out of PermGen space before long.
com.sun.management.HotSpotDiagnostic has a "dumpHeap" MBean operation that allows a dump to be created (remotely) which can be analysed later.
jstat can monitor garbage collection and JIT compiler statistics
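Two of the tips above (watching the class count and triggering dumpHeap) can also be reached from inside the JVM through the platform MXBeans. The following is a sketch under assumptions: the class name JvmProbe is made up, and the dump path is whatever the caller passes in. The target file must not already exist, and recent JDKs expect the name to end in .hprof.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper; the name and layout are illustrative only.
final class JvmProbe {

    // Number of classes currently loaded; a steadily rising value hints at a ClassLoader leak.
    static int loadedClassCount() {
        ClassLoadingMXBean classes = ManagementFactory.getClassLoadingMXBean();
        return classes.getLoadedClassCount();
    }

    // Writes a heap dump containing only live (reachable) objects to the given path.
    static void dumpHeap(String path) throws IOException {
        HotSpotDiagnosticMXBean diagnostics =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        diagnostics.dumpHeap(path, true);
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Loaded classes: " + loadedClassCount());
        if (args.length > 0) {
            dumpHeap(args[0]); // e.g. a path such as /tmp/app.hprof (assumed)
        }
    }
}
```

The resulting file can then be opened with jhat or another heap analyser, as described in the list above.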
Java Best Practices - High performance Serialization
- If you don't explicitly set a serialVersionUID class attribute the serialization mechanism has to compute it by going through all the fields and methods to generate a hash, which can be quite slow.
- With the default serialization mechanism, all the serializing class description information is included in the stream, including descriptions of the instance, the class and all the serializable superclasses.
- Externalization eliminates almost all the reflective calls used by Serialization mechanism and gives you complete control over the marshalling and demarshalling algorithms, resulting in dramatic performance improvements. However Externalization requires you to rewrite your marshalling and demarshalling code whenever you change your class definitions.
- Use simpler data representations to serialize objects where possible, e.g. just the timestamp instead of a Date object.
- You can eliminate serializing null values by serializing meta information about which fields are being serialized.
- Google protobuf is an alternative serialization mechanism with good size advantages when using compression.
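Several of the serialization tips above can be combined in one small class. This is only a sketch: LoginEvent and its fields are hypothetical, and the one-byte presence flag is just one simple way to record which fields are present so that a null is never written.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical value class used only to illustrate the tips.
class LoginEvent implements Externalizable {
    // Explicit UID, so the runtime does not have to compute one from the class structure.
    private static final long serialVersionUID = 1L;

    private String user;          // may be null
    private long timestampMillis; // plain timestamp instead of a Date object

    public LoginEvent() { }       // Externalizable requires a public no-arg constructor

    LoginEvent(String user, long timestampMillis) {
        this.user = user;
        this.timestampMillis = timestampMillis;
    }

    String getUser() { return user; }
    long getTimestampMillis() { return timestampMillis; }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeBoolean(user != null); // meta information: is the field present?
        if (user != null) {
            out.writeUTF(user);
        }
        out.writeLong(timestampMillis);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        user = in.readBoolean() ? in.readUTF() : null;
        timestampMillis = in.readLong();
    }

    // Serializes and deserializes in memory, returning the reconstructed copy.
    static LoginEvent roundTrip(LoginEvent event) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(event);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (LoginEvent) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The trade-off noted in the list applies here too: writeExternal and readExternal have to be kept in step by hand whenever the fields change.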
Thousands of Threads and Blocking I/O
- For an NIO based server, the server is notified when some I/O event is ready to be processed, and the event is then processed. Since all I/O is effectively multiplexed, the server has to keep track of where each client is within its I/O transaction, i.e. state must be maintained for all clients (unless a stateless protocol is used, e.g. all state is part of the request).
- NIO is not faster than IO, but it can be more scalable, though the scalability is an issue of how efficient the OS is at handling many threads.
- NIO transfer rates can be only 75% of those of a plain IO connection (several benchmark studies show this sort of comparative maximum rate).
- A multithreaded IO server tends to automatically take advantage of multiple cores, where an NIO server may explicitly need to hand processing off to a pool of worker threads (though that is the common design).
- On modern OSs, idle threads cost little, context switching is fairly efficient, and uncontended synchronization is cheap.
- Nonblocking datastructures scale well - ConcurrentLinkedQueue, ConcurrentHashMap, NonBlockingHashMap (&NonBlockingLongHashMap)
- A good architecture throttles incoming requests to the maximum rate the server can handle optimally; otherwise, if the server gets overloaded, overall request rates as well as individual request service times drop to unacceptable levels.
- Avoid Executors.newCachedThreadPool as an unbounded number of threads tends to be bad for applications (e.g. more threads get created just when you are already maxed out on CPU).
- If you do multiple sends per request, use a buffered stream. If one send per request, don't buffer (as you effectively already have).
- Try to keep everything in byte arrays if possible, rather than converting back and forth between bytes and strings.
- In a thread-per-request model, watch for socket timeouts.
- Multithreaded server coding is more intuitive than an event based server.
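The advice above about throttling and avoiding Executors.newCachedThreadPool can be sketched with a fixed-size ThreadPoolExecutor and a bounded queue. The pool size of 4, the queue capacity of 16 and the class name BoundedPoolDemo are arbitrary assumptions. CallerRunsPolicy is one possible throttle among several: when the queue is full, the submitting thread runs the task itself, which slows the rate at which new work is accepted.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo; sizes are placeholders, not tuning recommendations.
final class BoundedPoolDemo {

    // A pool that never grows beyond `threads` workers and never queues more than `queueCapacity` tasks.
    static ThreadPoolExecutor newBoundedPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.CallerRunsPolicy()); // back-pressure on the submitter
    }

    // Submits `taskCount` trivial tasks and returns how many completed.
    static int runTasks(int taskCount) {
        ThreadPoolExecutor pool = newBoundedPool(4, 16);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            pool.execute(completed::incrementAndGet);
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("Completed: " + runTasks(100));
    }
}
```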
Fast and Safe concurrency - Actor framework for java
How to get C like performance in java
High Performance Serialization
How much time out of your day does IBM waste
4 tips on exception handling
Using exceptions for execution control in unexceptional situations is not recommended as it makes reading the code very difficult.
Creating an exception is expensive because of the initialization in the fillInStackTrace() method.
Exceptions raised from the JVM itself (e.g. NullPointerException, ClassCastException, ArrayIndexOutOfBoundsException) can be extremely fast as after a while the (HotSpot) virtual machine just returns the same exception object with no stack trace. This would make debugging difficult if you get that.
You can use the -XX:-OmitStackTraceInFastThrow flag to stop the JVM from throwing these fast stackless exceptions.
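To make the fillInStackTrace() point concrete, here is a small illustration (not a recommendation) of an exception type that skips the stack capture. StacklessDemo and StacklessException are hypothetical names. The effect is similar to what the tip describes for HotSpot's fast-throw exceptions: creation is cheap, but there is no trace to debug with.

```java
// Hypothetical demo classes for illustration only.
final class StacklessDemo {

    // An exception that skips the expensive stack walk by overriding fillInStackTrace().
    static final class StacklessException extends RuntimeException {
        StacklessException(String message) {
            super(message);
        }

        @Override
        public synchronized Throwable fillInStackTrace() {
            return this; // do not capture the stack
        }
    }

    // Number of frames recorded by an ordinary exception created here.
    static int normalDepth() {
        return new RuntimeException("normal").getStackTrace().length;
    }

    // Number of frames recorded by the stackless variant (none are captured).
    static int stacklessDepth() {
        return new StacklessException("stackless").getStackTrace().length;
    }

    public static void main(String[] args) {
        System.out.println("normal frames:    " + normalDepth());
        System.out.println("stackless frames: " + stacklessDepth());
    }
}
```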
5 things you didn't know about ... Java Database Connectivity
JDBC ResultSets can be scrollable, which might be useful for efficiently finding specific rows if your query returns ordered results. But scrolling ResultSets usually requires an open network connection, so this may not be desirable.
JDBC ResultSets can be updatable; this might be more efficient than executing a separate update query (it would depend on the implementation). But updating ResultSets usually requires an open network connection, so this may not be desirable.
JDBC comes with four disconnected RowSets that allow you to manipulate the ResultSet data without maintaining a database connection: CachedRowSet (a disconnected RowSet); WebRowSet (a CachedRowSet with XML transforms); JoinRowSet (a WebRowSet that can JOIN while staying disconnected); FilteredRowSet (a WebRowSet that can filter).
Statement.executeBatch lets you execute more than one SQL statement within one network round-trip.
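A minimal sketch of the batching tip, using PreparedStatement.addBatch and executeBatch. The table and column names in the SQL string and the class name BatchInsert are assumptions made for illustration, and the Connection is expected to come from whatever driver or pool the application already uses.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical helper; table and column names are placeholders.
final class BatchInsert {

    static final String SQL = "INSERT INTO users (name) VALUES (?)";

    // Queues one insert per name and sends them to the database as a single batch.
    static int[] insertAll(Connection connection, List<String> names) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(SQL)) {
            for (String name : names) {
                statement.setString(1, name);
                statement.addBatch();      // queue locally, nothing is sent yet
            }
            return statement.executeBatch(); // one update count per queued statement
        }
    }
}
```

Whether the batch really travels in one round-trip depends on the driver, so it is worth measuring against the target database.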
Troubles with Sharding
Troubles with Sharding - What can we learn from the Foursquare Incident? (Page last updated October 2010, Added 2010-10-28, Author Todd Hoff, Publisher highscalability.com). Tips:
- [Although the article is about shards, the tuning suggestions are generic and really apply to many systems]
- Use more powerful servers - scaling-up is often the best solution.
- Spread your load over more cores and servers
- Design/Enable components to be movable (ideally dynamically) so that they can each separately move to a system with lower load.
- Can your data be moved to another node fast enough to fix overload problems?
- Monitor request queues and memory and fragmentation, and have limits that trigger a contingency plan when limits are exceeded.
- Prioritize requests so management traffic can be received by nodes even when the system is thrashing. Requests should be droppable, prioritizable, load balanceable, etc rather than being allowed to take down an entire system.
- Have the ability to turn off parts of system so load can be reduced enough that the system can recover.
- Consider whether a read-only or restricted backup version of the system could be usable for maintaining restricted operations after catastrophic failure while repairs are going on.
- Use an even distribution mechanism to better spread load.
- Replicate the data and load balance requests across replicas.
- Monitor resource usage so you can take preventative action. Look for spikes in disk operations per second, increased request queue sizes and request latency.
- Build in automated elasticity, sharding, and failure handling.
- Enable background reindexing and defragmentation.
- Continually capacity plan to continually adjust resources to fit projected needs.
- Test your system under realistic high load and overload conditions. Integrate testing into your build system.
- Use incremental algorithms that just need an event and a little state to calculate the next state, e.g. a running average algorithm rather than one that requires all values for every calculation. This reduces data flow requirements as data can be discarded from caches more quickly.
- Separate historical and real-time data.
- Use more compact data structures.
- If you are expecting that all your data will not fit in RAM, then make sure your IO system isn't the bottleneck.
- Single points of failure are not the end of the world, but you need to know about them and work them into your contingency planning.
- Figure out how you will handle downtime. Let people know what's happening and they will still love you.
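The incremental-algorithm tip in the list above can be sketched with a running average that keeps only a count and the current mean, so each new value can be discarded as soon as it has been folded in. The class name RunningAverage is an assumption for illustration.

```java
// Hypothetical class; keeps O(1) state regardless of how many values are seen.
final class RunningAverage {
    private long count;
    private double mean;

    // Folds one new value into the average without storing earlier values.
    void add(double value) {
        count++;
        mean += (value - mean) / count;
    }

    double mean() {
        return mean;
    }

    long count() {
        return count;
    }

    public static void main(String[] args) {
        RunningAverage average = new RunningAverage();
        for (double latencyMillis : new double[] {12.0, 18.0, 15.0}) {
            average.add(latencyMillis);
        }
        System.out.println(average.count() + " samples, mean " + average.mean());
    }
}
```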
Playfish's Social Gaming Architecture - 50 Million Monthly Users And Growing
JAX-WS service using Spring 3.0
web.xml
No entries required
In this case the end point has to extend SpringBeanAutowiringSupport
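import javax.jws.WebMethod;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.context.support.SpringBeanAutowiringSupport;

public class SayHelloServiceEndpoint extends SpringBeanAutowiringSupport {
    // Injected by Spring; the superclass autowires this instance when it is constructed
    @Autowired
    private SayHelloService sayHelloService;

    @WebMethod
    public String sayHello() {
        return sayHelloService.sayHello();
    }
}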
Service endpoint -
http://localhost:8080/
WSDL -
http://localhost:8080/
jax-ws - jdk 1.6 bundled in your web app
End Point class
@Service("
@WebService(serviceName="
public class SayHelloServiceEndpoint {
    // Injected by Spring, since the endpoint itself is a Spring bean here
    @Autowired
    private SayHelloService sayHelloService;

    @WebMethod
    public String sayHello() {
        return sayHelloService.sayHello();
    }
}
spring app context
web.xml
no specific entries required
Service URL -
http://localhost:9999/
WSDL -
Monday, November 29, 2010
MantisBT
MantisBT is a web-based issue tracking system originally designed for use by software development teams and their customers. However, users have been using MantisBT for issue tracking and project management in a variety of environments including help desks, project management, TODO list management and others. MantisBT is unique in that it finds the delicate balance between richness in features and simplicity of usage, deployment and customization.
Unveiling the java.lang.OutOfMemoryError And dissecting Java heap dumps
Unveiling the java.lang.OutOfMemoryError And dissecting Java heap dumps (Page last updated May 2010, Added 2010-11-29, Author Jinwoo Hwang, Publisher ). Tips:
- A java.lang.OutOfMemoryError (OOME) is thrown when the JVM cannot allocate an object due to memory constraints. There are six different types of memory areas in the JVM: 1. Program Counter Register; 2. Java Virtual Machine Stack; 3. Heap; 4. Method Area; 5. Runtime Constant Pool; 6. Native Method Stack. All but the first of these can cause an OOME.
- Different OutOfMemoryErrors are thrown depending on what caused the error and which memory space is unable to provide the required space. The error message usually identifies which memory space has caused the error.
- "OutOfMemoryError: PermGen space" implies the perm gen space (which stores class objects and interned strings) is exhausted; this may be caused by loading too many classes (usually class versions, or generated classes) or by too many interned strings, or the space may simply be too small (use -XX:MaxPermSize=... to set a larger value).
- "OutOfMemoryError: unable to create new native thread" typically happens when you are using very many threads and you have encountered the limitations of the process on that particular machine (possibly at that time). You might want to reduce the stack size of threads to work around this, or examine how further operating system memory resources can be made available to the JVM.
- "OutOfMemoryError: requested NNN bytes for MMMM. Out of swap space?" Typically occurs when the process has grown very big and you have encountered the limitations of the process on that particular machine (possibly at that time). You might want to reduce the number of objects in the system to work around this, or examine how further operating system memory resources can be made available to the JVM.
- "OutOfMemoryError: Java heap space" usually indicates you need to change the maximum size of the heap (-Xmx) or reduce the number of objects being held on to by the application. It can be accompanied by a heap dump if the JVM has been so configured (e.g. with -XX:+HeapDumpOnOutOfMemoryError set).
- To diagnose an OutOfMemoryError: first examine the error message to determine what caused the error and which memory space was unable to allocate more memory; then determine whether the error can be eliminated with a configuration change (e.g. larger heap or perm space setting or stack sizes); in some cases it may help to examine the garbage collection statistics to identify this. Finally, if this cannot be solved by a configuration change, then either the application memory usage needs to be examined (with the help of a heap dump if possible) or the application needs to be spread across multiple JVMs, possibly across multiple machines.
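As a companion to the diagnosis steps above, the JVM's own view of its memory pools can be listed at runtime, which may help show which space is filling up before an error is thrown. This is a sketch only: the class name MemoryPools and the output format are assumptions, and the pool names reported depend on the JVM version and garbage collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration.
final class MemoryPools {

    // Returns one line per memory pool with its type, used size and configured maximum.
    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // getMax() returns -1 when no maximum is defined for the pool
            String max = usage.getMax() < 0 ? "undefined" : (usage.getMax() / 1024) + " KB";
            lines.add(String.format("%s (%s): used=%d KB, max=%s",
                    pool.getName(), pool.getType(), usage.getUsed() / 1024, max));
        }
        return lines;
    }

    public static void main(String[] args) {
        describe().forEach(System.out::println);
    }
}
```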
Greasemonkey scripts
Are you IE ??
Download Video
Save video clips from YouTube, Google Video, Myspace, Metacafe, Break.com, Putfile, Dailymotion and Sevenload.
The script shows a small notice-bar at the top of supported video-pages. Upon clicking the notice-bar download links are shown on a small results page. Right-click to save the files.
Install this script here
Orkut scrap all addon
Orkut Script: It adds extra functionality to your Orkut account that lets you send a scrap to all your friends with ease!
download here
NewTube - Youtube Cleanup
Removes a lot of useless elements from youtube making it cleaner.
Keep watching this space for updates!
Download here
Folders4Gmail
You love Gmail, but you miss folders to sort your emails? Organize your labels in a folder-like hierarchy with the Folders4Gmail Userscript.
Suppose you have two labels:
Mum
Dad
And you want Mum and Dad to be sub-folders of Family.
All you have to do is:
1. Create a parent label
Create a label named Family
(use Edit labels and Create new label).
2. Rename your labels
Dad to Family\Dad
Mum to Family\Mum
(use Edit labels and rename).
3. Now you have three labels:
Family
Family\Dad
Family\Mum
Download here
Sunday, November 28, 2010
OpenNMS - Interview with Project Leaders
OpenNMS - Interview with Project Leaders (Tarus Balog/David Hustace/Alexander Finger/Craig Gallen)
OpenNMS is the world's first enterprise-grade network management application platform developed using the open source model.
To break that down:
OpenNMS was registered on SourceForge in March of 2000 as project 4141, about two months after NetSaint which later became Nagios. So it has been around for a while, almost longer than any other open source management tool.
It was designed from "day one" to be enterprise-grade, that is to manage tens of thousands, if not hundreds of thousands, of devices from a single instance. Ultimately it will be able to manage unlimited devices with a heavily distributed architecture.
While it works out of the box for many, it was designed less as an application than as an application platform. OpenNMS really shines when it is customized to fit a particular environment and integrated with other tools.
Finally, it is 100% free and open source software. There is no "enterprise" or commercial version - it is all open source. In fact, even commercially sponsored development on the project is freely available on the project's SourceForge git repository.
Why and how did you get started?
The OpenNMS project was started by Brian Weaver, Steve Giles and Luke Rindfuss in July of 1999. Their organization was purchased by a company called Atipa, which later became Oculan. Oculan focused on building a network management appliance based on OpenNMS.
However, there was still a lot of interest in OpenNMS itself, so in September of 2001 Tarus Balog joined Oculan to focus on building a services business around the project. In May of 2002, Oculan received new investment and decided to focus exclusively on their appliance. Tarus decided to remain focused on OpenNMS and took over administration of the project full time.
In 2004, he, along with David Hustace and Matt Brozowski, formed The OpenNMS Group, a commercial services company to support OpenNMS that currently has customers in 24 countries.
Who is the software's intended audience?
There are two main audiences. First, there are those organizations that currently use expensive management suites such as HP's OpenView or IBM's Tivoli. In many cases, OpenNMS is more flexible, powerful and scalable, and there is the rather large savings in licensing costs that OpenNMS provides.
The second audience is those resellers and consultants who form the ecosystem around such products as OpenView and Tivoli. This is the environment that the principals in OpenNMS came from, and it is refreshing to be able to have a tool that puts power in the hands of the integrator and user instead of the vendor.
What are a couple of notable examples of how people are using your software?
Where to start? Well, I guess we can focus on scale. At Papa Johns Pizza, they have been using OpenNMS for years to monitor a billion dollar internet product line. Currently, they are extending OpenNMS remote monitors into every one of their 2500 domestic stores.
Swisscom Hospitality Services provides internet access to hotels, conferences and other public places throughout Europe. They are monitoring 52,000 devices with a single instance of OpenNMS.
Rackspace Managed Hosting has over 70,000 customers in data centers from London, throughout the US, to Hong Kong. In each one OpenNMS is used to ensure that clients' network services are operational and responsive.
At New Edge Networks, they have integrated OpenNMS performance graphs directly into their customer portal. They are collecting 1.2 million data points concerning bandwidth, errors and other information -- every five minutes.
And these are just some notable commercial clients. The project has users worldwide from Vietnam to monitoring displays in the Paris subway.
What are the system requirements for your software, and what do people need to know about getting it set up and running?
That's a big "it depends". Those installations I mentioned above quite naturally run on powerful hardware, but I run a local instance of OpenNMS monitoring about 100 interfaces on a virtual machine. OpenNMS is written in Java, so it likes memory, and if you have to make the choice between more memory and more CPU choose the former. Some other sizing tips can be found on our wiki.
What gave you an indication that your project was becoming successful?
Hrm, I'm not sure. We haven't focused on being recognized as successful as much as delivering a quality product that is completely free and open, and I think that focus has paid off.
One of my earliest memories was an e-mail we received from Ho Trong Dat in 2002. Dat is from Vietnam, and was happy enough with OpenNMS to write and say so. Years later he is still using the product.
Then there were the awards, such as winning against much larger organizations at the LinuxWorld Expo in 2005, and in TechTarget's surveys.
Lately it seems like every day there is something to get excited about. During the disastrous Haiti earthquake we found out that Inveneo, a relief organization that provides bandwidth to NGOs during such crises, was using OpenNMS. We immediately saw it as a chance to give back and donated a free commercial support agreement.
What has been your biggest surprise?
When OpenNMS was started, we thought our users would be those small to medium size businesses that couldn't afford OpenView or Tivoli. What we found was that it was large enterprises and carriers that couldn't afford OpenView and Tivoli. Not necessarily from a licensing standpoint, but from the cost of long, expensive deployments that failed to meet their needs. By designing OpenNMS as a flexible platform, the tool can be made to fit the business's processes, and not the other way around.
What has been your biggest challenge?
The biggest challenge facing OpenNMS has been convincing people that a free and open source software solution can be as good as a commercial solution costing millions of dollars. There seems to be this myth that hidden, private software development is somehow superior to that done in a free and open manner.
But anyone looking at the OpenNMS code can tell that isn't the case. Under the leadership of Matt Brozowski, the code has become very well written. We realized early on that we didn't have the staff to fully test OpenNMS, and while we could rely on the large user community around the project it was best if we instrumented the code using junit tests as part of an agile development process.
I was talking to a CEO of another Java-based project, but one that was commercial, and he said they tried junit testing but it was just "too hard".
Why do you think your project has been so well received?
I think the main part lies in that the OpenNMS community is empowered outside of the commercial services OpenNMS Group. While those of us who are lucky enough to work on the project full time are able to contribute quite a bit to the project, the actual project is managed by a group called The Order of the Green Polo. The OpenNMS project has avoided the "open core" trap where a commercial company controls the project and there are both a "community" or open source licensed version and an "enterprise" or commercial version. In true open source communities this creates friction as various features are withheld from the "free" version. To build a successful community takes a lot of trust. Our slogan for the ten year anniversary was "Still open ... Still free" and I think over the years the project has earned it.
What advice would you give to a project that's just starting out?
Run out and buy a copy of "Rework" by Jason Fried and David Heinemeier Hansson. They are the founders of 37signals, and while not an open source company they have loads of great advice for software companies that want to start out without taking a more traditional venture-backed investment route. One key point they make is to not let anything get in the way of writing code. Don't worry about business plans, names, marketing, etc. at the beginning - write code and don't let anyone else tell you that you have to do all of those other things beforehand.
Where do you see your project going?
The goal of OpenNMS is nothing less than to become the de facto management platform of choice. Every discussion of a management solution should start out with "Have you tried OpenNMS?" We're not there yet, but we plan to be.
What's on your project wish list?
What isn't? (grin)
In the near term our focus is on what we are calling OpenNMS 2.0. One would think that in ten years we would have a 2.0 release, but almost since its inception we've had this idea for a fully distributed and scalable platform and that has represented 2.0 in our minds. We've adopted the OSGi framework moving forward, and we hope to have our initial release incorporating it out by the end of 2010.
What are you most proud of?
OpenNMS has always leveraged the idea that open source is a very powerful method for developing software, and that it is possible to build a successful services business around such a project without resorting to selling separate software under a commercial license.
For ten years we've been told that this isn't possible, and for ten years we have watched our community grow while other "open core" and commercial companies have either floundered or closed.
The fact that those of us who get paid to work on OpenNMS have been able to make a comfortable living without betraying the trust of our community is something of which we are very proud.
If you could change something about the project, what would it be?
I wish it were easier for people to become involved with OpenNMS. It is a huge project, and it takes a lot of time to understand the code, well written as it is.
One thing that is different with OpenNMS versus other popular open source projects is that the end users are not coders. Unlike, say, JBoss and Spring, our end users tend to be system and network admins, not Java programmers, so it has been a lot of work to both attract programmers to the project and educate those within it who wish to become coders on how to work with OpenNMS.
How do you coordinate the project?
Most of the project discussion occurs on our mailing lists that are hosted at SourceForge. The opennms-discuss list is the main community communication tool while opennms-devel is used for development discussion. We also have a private mailing list for Order of the Green Polo members but that is usually used to discuss administrative issues. There is a bugzilla instance that is used to track issues. Once a year we rent out a college dormitory and hold Dev-Jam, a weeklong coding fest. Everyone is welcome to attend (there is a cost associated with it to cover room and board) and most of the Order of the Green Polo members are able to come. The last one featured people from five different countries, one as far away as New Zealand.
How many hours a month do you and/or your team devote to the project?
Heh, those of us that work for The OpenNMS Group put in 50-60 hours a week. A typical contributor from the community can range from 2-3 hours up to 20+ for some of the Order of the Green Polo members.
The OpenNMS Group tends to hire straight out of the community, so not only do we get great people, we get people who do this because they love it, not just because it is a job.
What is your development environment like?
OpenNMS is written mainly in Java, and the development environment is optimized for using the Eclipse project's IDE. It runs on practically any operating system (yes, including Windows) and thus most developers use their own machines for coding.
OpenNMS uses test driven development, so unit tests play a large part in developing new features. Members of the Order of the Green Polo are often available to "pair program" with people new to the project.
Milestones:
OpenNMS uses the old kernel numbering scheme for releases. Any release where the number after the first decimal is "even" is considered a stable or production release. If the number is "odd" it is the unstable or development release. And, of course, nightly snapshots of trunk are available.
Currently, the development release is 1.9 and the latest stable release is 1.8. Point releases are done on roughly a monthly schedule, but the following was the official release date of the major stable releases:
Date | Milestone |
---|---|
October, 2008 | OpenNMS 1.6 |
May, 2005 | OpenNMS 1.2 |
May, 2002 | OpenNMS 1.0 |
WireShark - open source network protocol analyzer
Wireshark
Wireshark is a network protocol analyzer. It captures the packets flowing across a computer network and displays them in a human-readable form. It supports nearly every protocol in common use and can capture on a wide variety of interface types including Ethernet and 802.11. You can filter packets as they are being captured and apply display filters during analysis. Captures can be saved in a variety of formats so that you can send them to someone else or review them at a later date.
Wireshark helps you understand what's happening on your network at a fundamental level. As Laura Chappell says, "The packets never lie."
It is used just about everywhere. Most major network and software vendors point to Wireshark in their documentation.
Wireshark's original name was Ethereal.
What are a couple of notable examples of how people are using this software?
At Sharkfest (Wireshark's developer and user conference) in June, speakers from Google and Citigroup talked about how they use Wireshark for troubleshooting. Wireshark is also being used in the development of the Interplanetary Internet. According to the New York Times, Wireshark was used to track down the GhostNet surveillance network last year.
GPL Ghostscript
GPL Ghostscript
GPL Ghostscript is a complete set of page description language interpreters including PDF, PostScript, PCL5, PCLXL, and XPS along with the ability to convert to and from any of these languages. There is no other open source solution for page description languages that is this comprehensive.
Who is the software's intended audience?
Anyone who needs to view, print, archive, or convert PDF, PostScript, PCL, or XPS on any platform is a potential GPL Ghostscript user.
What are a couple of notable examples of how people are using this software?
* The Linux print architecture
* Ghostscript's PDF interpreter has been in every Kyocera printer made in the last eight years
* Ghostscript is the PDF engine in many document management solutions including Xerox, OpenText, and Xinet
* Ghostscript is the PDF engine in many PDF tools applications including PrimoPDF, NitroPDF, and activePDF
* Ghostscript is the PDF/PostScript engine in many host-based Raster Image Processors (RIPs) including Caldera, Aurelon, Ergosoft, and Devstudio
Where do you see your project going?
The need for independent implementations of page description languages is not going away. PDF is becoming a requirement in pocket computers (smart phones and tablets), and one of these technologies, "MuPDF", is optimized for that kind of application.
Current workload ?
1,700 hours per month.
JEDIT - lightweight powerful editor
jEdit
jEdit is a programmer's text editor written in Java. It can be configured as a rather powerful IDE through the use of its plugin architecture.
How are people using jEdit ?
- For developing jEdit
- As an IDE for various languages
- Partly embedded in other applications
- As a reverse engineering tool
- As a very powerful text editor
- As a portable application
- As an excellent XML editor
- As an editor for KML files for Google Earth
What's on your project wish list?
- More spare time to work on it
- Regular release schedule
- Port jEdit to an OSGi framework
- The ability to create plugins in languages other than Java