It can be handy to index just part of a long character column. The strings might vary enough in just the first few characters to get the same cardinality out of a partial index as a full index would give, and the partial index takes up much less disk space and memory.Indexes in The ManualBut how long should that prefix be? Should it be the same cardinality as the original data? Within 10%?I dunno; somebody else can blog about that decision. I’m just interested in a query that finds the number which satisfies your requirements. I have a table called ‘numbers’ with an int field ‘num’ that just holds values 1 through several thousand.Then:SELECT num AS prefixLength, COUNT(DISTINCT LEFT(stringField, num)) AS prefixCardinality, dt.originalCardinalityFROM yourTable JOIN numbers JOIN (SELECT COUNT(DISTINCT stringField) AS originalCardinality FROM yourTable) AS dtGROUP BY prefixLengthHAVING prefixCardinality >= .90 * originalCardinalityLIMIT 1;Adjust the HAVING clause according to your needs.
More...


LinkBack URL
About LinkBacks




Reply With Quote