"Hibernate is one of most cited culprit for slower queries !", though I am kind of victim of this thought for sometime until I "understood" the true wisdom of lame excuses :)
Though sometimes this epic comic reflects my mental turbulence during performance-improvement katas, I have learned not to go by hunches but by real, hard facts. One interesting problem that annoyed me for days recently (actually months, on and off) was the slow query performance of the whole application I was working on. The problem itself is pretty basic, but the process of tuning and troubleshooting is what I thought was interesting.
To give some background, I was running a Hibernate/Java/Spring stack with SQL Server 2008 as my DB (not my choice, unquestionably), but I don't mind: I don't stick to one technology, so opportunities to try new and old things are always welcome.
So, giving Hibernate the benefit of the doubt this time, and not insanely switching the whole app stack to native queries or a different ORM framework (just to keep my sanity, and my application's), I thought I would venture on the journey toward true wisdom. Let me take you through the findings.
Suspect No. 1: Hibernate
Initially I tried a few common measures: checking the queries, adding proper indexes, and running load tests. Still, it was pretty annoying that with just 10 million records, a simple query without even a join took a laborious 187 ms, against a baseline of about 2 ms for the same query run directly against the server!
So I used SQL Profiler to log the queries being sent to the server, and to my surprise the queries were actually being sent as stored procedures.
The query issued by the application:

```sql
select c.col1, c.col2 from customer c where c.name like @param1 and c.country like @param2
```

The query the driver actually sent:

```sql
declare sp ... select c.col1, c.col2 from customer c where c.name like @param1 and c.country like @param2
```
Enabling statistics for Hibernate clearly printed the properly transformed HQL queries, so we are good there. Hibernate is out of the picture now.
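For reference, turning on the statistics and SQL logging looks roughly like this in the Hibernate configuration (these are standard Hibernate property names; where exactly you set them depends on your Spring wiring):

```properties
# hibernate.properties (or the equivalent Spring sessionFactory properties)
hibernate.show_sql=true
hibernate.format_sql=true
hibernate.generate_statistics=true
```

With `generate_statistics` on, the SessionFactory's statistics expose query counts and execution times, which is what let me compare the HQL-generated SQL against what the profiler saw on the wire.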
Suspect No. 2: The JDBC Driver
I was using the Microsoft JDBC Driver (Type 4).
OK, let's go back and see what is happening on the driver front. A little analysis of the documentation makes it pretty clear that SQL Server caches query plans efficiently for stored procedures, so the driver is doing the right thing! So who is the culprit here now?
SQL Server?!! I hate to say it, but it's not. SQL Server is being clever and caches the query plan, but with one caveat: "parameter sniffing". In short, when a parameterized query comes in, SQL Server compiles a plan using the first parameter values it sees and then reuses that plan for later executions, which can be a poor fit for different values. Since the queries were all being sent with @param1 and friends, there had to be a common problem across the board.
One option is to stop the driver from sending stored procedures; another is to adjust the queries or remodel the schema to take advantage of parameter sniffing (though that requires extensive analysis of the application running in prod, which is hard with the time crunch of existing releases, even though I like that route). So, in haste, I tried to find a way to stop the stored procedures from being sent out of the app. It seems the Microsoft driver offers no configuration for this, so shopping around I found jTDS, which claims to be the fastest SQL Server driver available. I wanted to try it first, as it has the prepareSQL parameter, which can disable the stored-procedure-style query preparation.
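For the record, a jTDS connection URL with preparation turned off looks roughly like this (host, port, and database name are placeholders; jTDS's `prepareSQL` property takes a numeric mode, and `0` means plain, unprepared SQL):

```properties
jdbc.url=jdbc:jtds:sqlserver://dbhost:1433/mydb;prepareSQL=0
```

The trade-off is that without preparation the server cannot reuse a cached plan across executions of the same statement, so this is something to measure rather than assume.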
This made a great improvement: from about 187 ms down to around 90 ms. Wait, what? Remember the 2 ms query time? That's the baseline to reach. So, under the hood, I saw queries being sent like:
```sql
select * from customer c where c.name like N'somename' and c.country = N'somevalue'
```
What is this cryptic N before the value? Oh yeah, nvarchar. Wait... what? Why nvarchar for my varchar column? In short, it's there to support Unicode characters. But my parameter is a userId, and it's a hashed string, so why would it need Unicode? Worse, when an nvarchar parameter is compared against a varchar column, SQL Server performs an implicit type conversion that can prevent it from using the index. There is a parameter to disable this: sendStringParametersAsUnicode=false. Pretty cool, huh? This did the trick: no type conversions were done, and the query ran right around the baseline we established before.
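As a quick sanity check that a hex-encoded hash never needs Unicode, here is a minimal stand-alone sketch (the class name `HashedIdCheck`, the helper `sha256Hex`, and the SHA-256 scheme are my illustrative assumptions; any hex-encoded hash behaves the same way):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class HashedIdCheck {

    // Hex-encode a SHA-256 digest, the way a hashed userId might be stored.
    static String sha256Hex(String input) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(input.getBytes(StandardCharsets.UTF_8));
        StringBuilder sb = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        String hashedUserId = sha256Hex("some-user-id");
        // Every character is 0-9 or a-f: plain single-byte ASCII,
        // so a varchar column (and a non-Unicode parameter) is enough.
        boolean asciiOnly = hashedUserId.chars().allMatch(c -> c < 128);
        System.out.println(hashedUserId.length() + " " + asciiOnly);
    }
}
```

Since every character of such an ID is plain ASCII, sending the parameter as varchar loses nothing, and it spares SQL Server the implicit nvarchar conversion on the column.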
So now the application was back in business. But wait: the app still wasn't holding up in the staging environment with even 1000 requests, under a load-balanced setup with two servers behind it. So what was happening? Initially the thread dumps pointed to the Apache-to-Tomcat connector (AJP) threads sitting in a waiting state. OK, that's not acceptable.
OK, digging again: oh yeah, a pretty lame bug in the JDBC URL. There was a parameter in the URL, instance=SQL2008. Why, for a SQL Server 2012 R2 box? Because of this setting, the driver was trying to negotiate connections as a SQL 2008 instance, and the 2012 server was very slow to establish them, so the initial connections took too long to warm up the connection pool. Threads were actually waiting for the next batch of pooled connections to support the extra load, with initialSize defaulted to 50. After changing and playing with the parameters, I finally fixed the problem just by removing the instance=SQL2008 parameter from the URL.
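For reference, the corrected setup looked roughly like this (the property keys here are illustrative; `initialSize` and `maxActive` are the standard Commons DBCP / Tomcat JDBC pool attribute names):

```properties
# corrected URL: the stray instance=SQL2008 parameter is gone
jdbc.url=jdbc:jtds:sqlserver://dbhost:1433/mydb;prepareSQL=0;sendStringParametersAsUnicode=false

# warm up enough connections up front so request threads don't wait on the pool
pool.initialSize=50
pool.maxActive=100
```

The point is that a pool only helps if its connections are established before the load arrives; a slow connect path turns the warm-up itself into the bottleneck.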
A sigh of relief now: the environment is able to scale up and handle larger loads, we have a proper basic setup with established baselines, and a pretty good application to work with.
One thing to note: there will be more of this journey in the future, but it should be interesting and rewarding too...
One Final Thought
@bpoetz: "programming is awesome if you like feeling dumb and then eventually feeling less dumb but then feeling dumb about something else pretty soon"
See you in the next post!