This article is written in English and Portuguese
Este artigo está escrito em Inglês e Português

English version:

Spending a lot of time with customers is great. It gives me time to go beyond the strict task execution that short projects allow. We actually have time to dive deep into what sometimes looks only as a curiosity. I'll describe one of this situations in this post. It started out as a normal performance issue. A query involving a join was causing an hash join and apparently there was an index that could be used. Some quick investigation showed that the datatypes didn't match on the join columns, and as such the index was not being used.
The situation was fixed (using an explicit cast since changing the datatypes would require more deep analysis) and the programmer(s) received an explanation about the issue so that future situations would be avoided. Simple right? Yes... But when you have inquiring minds and programmers with a few minutes to spare you may be surprised. And this was the case. A few weeks later the DBA team received a request to explain why we recommended that the columns used in joins should have the same datatype. A programmer had produced a testcase where the engine was able to convert the parameter sent and use the index. In other words, if the engine is smart enough why should we care?!

Although this could be considered a waste of time (using the same datatypes or explicit cast is a good practice, right?!) the question was interesting enough to make us spend some time with it. In fact I had seen situations in the past where apparently sometimes the engine was smart, and others not. I never thought too much about it, since I always recommended to follow the best practices (which clearly saves us some troubles). So, personally I also had this doubt, and together with the customer DBAs we started to do some tests. We came up with a very simple test case that we though would show the problem:

DATABASE stores;

DROP TABLE IF EXISTS tst_int;
DROP TABLE IF EXISTS tst_char;

CREATE TABLE tst_int
(
c1 INTEGER,
c2 INTEGER
);
CREATE TABLE tst_char
(
c1 CHAR(15),
c2 INTEGER
);

INSERT INTO tst_int
SELECT FIRST 2000 pg_pagenum, pg_partnum FROM sysmaster:syspaghdr;

INSERT INTO tst_char
SELECT FIRST 2000 pg_pagenum, pg_partnum FROM sysmaster:syspaghdr;

CREATE INDEX i_tst_int ON tst_int(c1);
CREATE INDEX i_tst_char ON tst_char(c1);

SET EXPLAIN ON;
SELECT "tst_int with char parameter:", * FROM tst_int WHERE c1 = '12345678';
SELECT "tst_char whth int parameter",* FROM tst_char WHERE c1 = 12345678;

--- demonstrates that each index key is being casted
INSERT INTO tst_char VALUES("a1", 12345678);
SELECT * FROM tst_char WHERE c1 = 12345678;

If we take a look at the query plan we see:

QUERY: (OPTIMIZATION TIMESTAMP: 12-10-2011 23:25:19)
------
SELECT "tst_int with char parameter:", * FROM tst_int WHERE c1 = '12345678'

Estimated Cost: 1
Estimated # of Rows Returned: 1

1) informix.tst_int: INDEX PATH

(1) Index Name: informix.i_tst_int
Index Keys: c1 (Serial, fragments: ALL)
Lower Index Filter: informix.tst_int.c1 = 12345678

So here the engine was "smart". Meaning it converted the CHAR to an INTEGER and that allowed it to use the index. Nice.


But here:

QUERY: (OPTIMIZATION TIMESTAMP: 12-10-2011 23:25:19)
------
SELECT "tst_char whth int parameter",* FROM tst_char WHERE c1 = 12345678

Estimated Cost: 84
Estimated # of Rows Returned: 1

1) informix.tst_char: SEQUENTIAL SCAN

Filters: informix.tst_char.c1 = 12345678

It looks as it's not that smart... Instead of converting the INTEGER parameter to a CHAR and use the index it decides to do the opposite: Converts all the CHARs in that column into INTEGERs and makes a SEQUENTIAL SCAN.
Since I didn't have a good explanation for thi...

Author: This e-mail address is being protected from spambots. You need JavaScript enabled to view it (Fernando Nunes)


{mosgoogle}

{mos_sb_discuss:4}


 
Newsflash
IBM intends to offer NoSQL database capabilities inside of DB2 beginning early next year, according to a keynote speaker at IBM Information On Demand 2011.


Read more...
 

Title: Best Practices for Embedded Data Management
Date: Thursday, May 24, 2012
Time: 10:30 AM - 12:00 PM (Central Time), 16:30h Germany
Presented By: Pradeep Natarajan - Software Developer (IBM)
Registration: Best Practices for Embedded Data Management

Description:

 
A small footprint, a silent i...
Read more...
 

Title: Utilizing Informix in the Apple Eco System
Date: Thursday, September the 27th, 2012
Time: 10:30 AM - 12:00 PM (Central Time), 16:30h Germany
Presented By: Ian Ilsley -Software Engineer (IBM)
Registration: Utilizing Informix in the Apple Eco System

Description:

 
The presentation will cover iOS support for Informix. 

Applications can be seamlessly written to Support Informix as the backend. The 
development environment used on iOS via open source, web services etc. will be 
detailed for application development. Discover creating Mac OSX application using 
Informix ESQL/C and Xcode iOS using Informix as the backend. 

Come and join us for this Event. 

Register for this event at:
https://events.webdialogs.com/register.php?id=6e99e45969&l=en-US 

If clicking a link above does not work, please copy the entire link 
and paste it into your Web browser. 
For technical questions, contact support at support@lotuslive.com.
For questions about this event, contact the host at: mckeithe@us.ibm.com. 
Read more...
 
Ads