From ff06148679933705db013ca7dbec56296e9f062c Mon Sep 17 00:00:00 2001
From: Venkata Sidagam
Date: Sat, 11 Jan 2014 14:48:29 +0530
Subject: [PATCH] Bug #17760379 COLLATIONS WITH CONTRACTIONS BUFFER-OVERFLOW THEMSELVES IN THE FOOT

Description: A typo in create_tailoring() causes the "contraction_flags" to be
written into cs->contractions in the wrong place. This causes two problems:
(1) Anyone relying on `contraction_flags` to decide "could this character be
    part of a contraction" is 100% broken.
(2) Anyone relying on `contractions` to determine the weight of a contraction
    is mostly broken.

Analysis: When create_tailoring() prepares the contractions, it corrupts the
cs->contractions buffer, which is supposed to hold the weights (8k bytes)
followed by the contraction flags (256 bytes). The flags were instead stored
starting at the 4k byte offset, inside the weights area. The cause is a logic
flaw: cs->contractions is a uint16 pointer, but the offset 0x40*0x40 was added
after casting it to char*, so it was counted in bytes rather than in uint16
elements.

Fix: When creating the contractions, compute contraction_flags as
(char*) (cs->contractions + 0x40*0x40) instead of
((char*) cs->contractions) + 0x40*0x40. This places the pointer 8k bytes past
the start of cs->contractions, and the contraction flags are stored from
there. Similarly, the LIKE range code must read the flags from the 8k byte
offset onwards, which is done by changing the expression to
(const char*) (cs->contractions + 0x40*0x40). For ucs2 charsets,
my_cs_can_be_contraction_head() and my_cs_can_be_contraction_tail() are
changed to index from the 8k byte offset as well.
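The difference between the two pointer expressions can be checked in isolation. The following is a minimal sketch, not code from the tree: the names WEIGHT_COUNT, FLAG_COUNT, flags_offset_old(), flags_offset_new(), buffer_size() and check_offsets() are hypothetical, and it assumes sizeof(uint16_t) == 2, which holds on platforms with 8-bit bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants mirroring the layout described in the commit message. */
enum {
  WEIGHT_COUNT = 0x40 * 0x40,  /* 4096 uint16 weight slots (8k bytes) */
  FLAG_COUNT   = 256           /* one flag byte per (wc & 0xFF) value */
};

/* Old expression: cast to char* first, then add. The offset is in bytes. */
static size_t flags_offset_old(const uint16_t *contractions)
{
  const char *p = ((const char *) contractions) + WEIGHT_COUNT;
  return (size_t) (p - (const char *) contractions);
}

/* New expression: add in uint16 units first, then cast to char*. */
static size_t flags_offset_new(const uint16_t *contractions)
{
  const char *p = (const char *) (contractions + WEIGHT_COUNT);
  return (size_t) (p - (const char *) contractions);
}

/* Total buffer size in bytes: weights followed by the flag bytes. */
static size_t buffer_size(void)
{
  return WEIGHT_COUNT * sizeof(uint16_t) + FLAG_COUNT;
}

/* Sanity check of the arithmetic; aborts if an offset is unexpected. */
static void check_offsets(void)
{
  static uint16_t buf[WEIGHT_COUNT + FLAG_COUNT / 2];
  assert(flags_offset_old(buf) == 4096);   /* lands inside the weights */
  assert(flags_offset_new(buf) == 8192);   /* lands just past the weights */
  assert(buffer_size() == 8192 + 256);
}
```

Calling check_offsets() from a test program shows that the old expression points 4096 bytes into the buffer, in the middle of the weights, while the new one points at byte 8192, where the 256 flag bytes begin.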
---
 include/m_ctype.h   | 4 ++--
 strings/ctype-mb.c  | 2 +-
 strings/ctype-uca.c | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/m_ctype.h b/include/m_ctype.h
index f44fe1b10de..81096f60c78 100644
--- a/include/m_ctype.h
+++ b/include/m_ctype.h
@@ -372,13 +372,13 @@ my_cs_have_contractions(CHARSET_INFO *cs)
 static inline my_bool
 my_cs_can_be_contraction_head(CHARSET_INFO *cs, my_wc_t wc)
 {
-  return ((const char *)cs->contractions)[0x40*0x40 + (wc & 0xFF)];
+  return ((const char *) cs->contractions)[0x40 * 0x40 * 2 + (wc & 0xFF)];
 }
 
 static inline my_bool
 my_cs_can_be_contraction_tail(CHARSET_INFO *cs, my_wc_t wc)
 {
-  return ((const char *)cs->contractions)[0x40*0x40 + (wc & 0xFF)];
+  return ((const char *) cs->contractions)[0x40 * 0x40 * 2 + (wc & 0xFF)];
 }
 
 static inline uint16*
diff --git a/strings/ctype-mb.c b/strings/ctype-mb.c
index fddb8d2a16b..258613d3b05 100644
--- a/strings/ctype-mb.c
+++ b/strings/ctype-mb.c
@@ -697,7 +697,7 @@ my_bool my_like_range_mb(CHARSET_INFO *cs,
   char *max_end= max_str + res_length;
   size_t maxcharlen= res_length / cs->mbmaxlen;
   const char *contraction_flags= cs->contractions ?
-    ((const char*) cs->contractions) + 0x40*0x40 : NULL;
+    (const char *) (cs->contractions + 0x40*0x40) : NULL;
 
   for (; ptr != end && min_str != min_end && maxcharlen ; maxcharlen--)
   {
diff --git a/strings/ctype-uca.c b/strings/ctype-uca.c
index 8cd850b06df..7ec2e9851e1 100644
--- a/strings/ctype-uca.c
+++ b/strings/ctype-uca.c
@@ -8046,7 +8046,7 @@ static my_bool create_tailoring(CHARSET_INFO *cs, void *(*alloc)(size_t))
     if (!(cs->contractions= (uint16*) (*alloc)(size)))
       return 1;
     bzero((void*)cs->contractions, size);
-    contraction_flags= ((char*) cs->contractions) + 0x40*0x40;
+    contraction_flags= (char *) (cs->contractions + 0x40*0x40);
     for (i=0; i < rc; i++)
     {
       if (rule[i].curr[1])