(2) The introduction of the idea of sub-tables
Recent articles: 1) Architecture application of high concurrent data collection (Redis application)
2) Highly available data collection platform (how to play with 3 languages php+.net+aauto)
Teach you step by step how to do a keyword matching projectThis part has basically been completed. In-depth research is to analyze the performance of the system and must be done under the stimulation of some environments. Some changes.
Teach you step by step how to do keyword matching project: Teach you step by step how to do keyword matching project (search engine)----Day 1~Teach you step by step how to do keyword matching project (search engine)--- - Day 22 (22 articles in total)
In-depth research: The previous section talked about in-depth research on keyword matching projects - the introduction of filters.
Each article will be divided into causes of the problem, solutions and some necessary implementation plans.
This article officially begins.
Antecedents of the problem
With the explosive growth of automatically collected data, the capacity of the vocabulary database is booming, from a few W data to millions of data, Xiao Shuai Shuai looks at the queries in the database and feels more and more Nothing can be done.
In addition, what Xiao Ding Ding often says to Xiao Shuai Shuai the most is: When can I choose words faster? Every time I wait for a long time and there is no response, I am really anxious to death.
Little handsome is also more anxious, and my heart is haggard. I really feel that this is the challenge. Xiao Shuaishuai had no choice but to continue to find the boss, asking him to reward him with a clever trick.
Boss Yu patted Xiao Shuaishuai on the shoulder: Young man, you know how difficult the project is!
Xiao Shuaishuai replied: Don’t ridicule me, I have felt it deeply, and I think my heart may not be able to bear it anymore.
Boss Yu: If you can’t bear this, I guess there will be more for you in the future.
Little handsome: big brother, let alone these virtual behaviors, hurry up the solution.
Boss Yu: Why are you in a hurry? Things cannot be rushed. Come here and I will give you a clear path.
“Does each baby have a category attribute? How many words in these millions of data really belong to this category? Assuming that we only use the vocabulary of this category, can our project continue to be stable?”
Solution
According to certain business needs, we can split the data table vertically or horizontally, which can effectively optimize performance.
Vertical segmentation is also called column segmentation. It splits uncommon columns or long fields to ensure that entities are in a relatively applicable state. Common ones include one-to-one relationships.
Horizontal splitting, also called row splitting, splits data records according to a certain business and stores them in different tables. Common table splitting operations include date splitting.
This case uses horizontal segmentation to split the data into categories.
Implementation Plan
In order not to change the structure of the data table, we designed it this way. We distinguish which data table the project uses according to the table name. The changes resulting from this are relatively few. We only need to change the code slightly to solve it. This is a very frustrating thing.
Modify the Keyword code and add data sources.
<?<span>php </span><span>define</span>('DATABASE_HOST','127.0.0.1'<span>); </span><span>define</span>('DATABASE_USER','xiaoshuaishuai'<span>); </span><span>define</span>('DATABASE__PASSWORD','xiaoshuaishuai'<span>); </span><span>define</span>('DATABASE_CHARSET','utf-8'<span>); </span><span>class</span><span> Keyword { </span><span>public</span> <span>$word</span><span>; </span><span>public</span> <span>static</span> <span>$conn</span> = <span>null</span><span>; </span><span>public</span> <span>function</span><span> getDbConn(){ </span><span>if</span>(self::<span>$conn</span> == <span>null</span><span>){ self</span>::<span>$conn</span> = <span>mysql_connect</span>(DATABASE_HOST,DATABASE_USER,<span>DATABASE__PASSWORD); </span><span>mysql_query</span>("SET NAMES '".DATABASE_CHARSET."'",self::<span>$conn</span><span>); </span><span>mysql_select_db</span>("dict",self::<span>$conn</span><span>); </span><span>return</span> self::<span>$conn</span><span>; } </span><span>return</span> self::<span>$conn</span><span>; } </span><span>public</span> <span>function</span><span> save(){ </span><span>$sql</span> = "insert into keywords(word) values ('<span>$this</span>->word')"<span>; </span><span>return</span> <span>mysql_query</span>(<span>$sql</span>,<span>$this</span>-><span>getDbConn()); } </span><span>public</span> <span>static</span> <span>function</span> getWordsSource(<span>$cid</span>,<span>$limit</span>=0,<span>$offset</span>=40<span>){ </span><span>$sql</span> = "SELECT * FROM keywords_<span>$cid</span> LIMIT <span>$limit</span>,<span>$ffset</span>"<span>; </span><span>return</span> DB::MakeArray(<span>$sql</span><span>); } </span><span>public</span> <span>static</span> <span>function</span> getWordsCount(<span>$cid</span><span>){ </span><span>$sql</span> = "SELECT count(*) FROM keywords_<span>$cid</span>"<span>; </span><span>return</span> DB::QueryScalar(<span>$sql</span><span>); } }</span>
New QueryScalar is added to the DB class, which is used to calculate the total amount
<?<span>php </span><span>#</span><span>@author oShine</span> <span>define</span>('DATABASE_HOST','127.0.0.1'<span>); </span><span>define</span>('DATABASE_USER','xiaoshuaishuai'<span>); </span><span>define</span>('DATABASE__PASSWORD','xiaoshuaishuai'<span>); </span><span>define</span>('DATABASE_CHARSET','utf-8'<span>); </span><span>class</span><span> DB { </span><span>public</span> <span>static</span> <span>$conn</span> = <span>null</span><span>; </span><span>public</span> <span>static</span> <span>function</span><span> Connect(){ </span><span>if</span>(self::<span>$conn</span> == <span>null</span><span>){ self</span>::<span>$conn</span> = <span>mysql_connect</span>(DATABASE_HOST,DATABASE_USER,<span>DATABASE__PASSWORD); </span><span>mysql_query</span>("SET NAMES '".DATABASE_CHARSET."'",self::<span>$conn</span><span>); </span><span>mysql_select_db</span>("dict",self::<span>$conn</span><span>); </span><span>return</span> self::<span>$conn</span><span>; } </span><span>return</span> self::<span>$conn</span><span>; } </span><span>public</span> <span>static</span> <span>function</span> Query(<span>$sql</span><span>){ </span><span>return</span> <span>mysql_query</span>(<span>$sql</span>,self::<span>Connect()); } </span><span>public</span> <span>static</span> <span>function</span> makeArray(<span>$sql</span><span>){ </span><span>$rs</span> = self::Query(<span>$sql</span><span>); </span><span>$result</span> = <span>array</span><span>(); </span><span>while</span>(<span>$data</span> = <span>mysql_fetch_assoc</span>(<span>$rs</span><span>)){ </span><span>$result</span>[] = <span>$data</span><span>; } </span><span>return</span> <span>$result</span><span>; } </span><span>public</span> <span>static</span> <span>function</span> QueryScalar(<span>$sql</span><span>){ </span><span>$rs</span> = self::Query(<span>$sql</span><span>); </span><span>$data</span> = <span>mysql_fetch_array</span>(<span>$rs</span><span>); </span><span>if</span>(<span>$data</span> == <span>false</span> || <span>empty</span>(<span>$data</span>) || !<span>isset</span>(<span>$data</span>[1])) <span>return</span> 0<span>; </span><span>return</span> <span>$data</span>[1<span>]; } } </span>
Modify the Selector code for word selection:
<?<span>php </span><span>#</span><span>@Filename:selector/Selector.php</span><span> #</span><span>@Author:oshine</span> <span>require_once</span> <span>dirname</span>(<span>__FILE__</span>) . '/SelectorItem.php'<span>; </span><span>require_once</span> <span>dirname</span>(<span>__FILE__</span>) . '/charlist/CharList.php'<span>; </span><span>require_once</span> <span>dirname</span>(<span>__FILE__</span>) . '/charlist/CharlistHandle.php'<span>; </span><span>require_once</span> <span>dirname</span>(<span>dirname</span>(<span>__FILE__</span>)) . '/lib/Logger.php'<span>; </span><span>class</span><span> Selector { </span><span>private</span> <span>static</span> <span>$charListHandle</span> = <span>array</span><span>( </span>"黑名单" => "BacklistCharListHandle", "近义词" => "LinklistCharListHandle"<span> ); </span><span>public</span> <span>static</span> <span>function</span> select(<span>$num_iid</span><span>) { </span><span>$selectorItem</span> = SelectorItem::createFromApi(<span>$num_iid</span><span>); Logger</span>::trace(<span>$selectorItem</span>-><span>props_name); </span><span>$charlist</span> = <span>new</span><span> CharList(); </span><span>foreach</span> (self::<span>$charListHandle</span> <span>as</span> <span>$matchKey</span> => <span>$className</span><span>) { </span><span>$handle</span> = self::createCharListHandle(<span>$className</span>, <span>$charlist</span>, <span>$selectorItem</span><span>); </span><span>$handle</span>-><span>exec</span><span>(); } </span><span>$selectWords</span> = <span>array</span><span>(); </span><span>$wordsCount</span> = Keyword::getWordsCount(selectorItem-><span>cid); </span><span>$offset</span> = 40<span>; </span><span>$page</span> = <span>ceil</span>(<span>$wordsCount</span>/<span>$offset</span><span>); </span><span>for</span>(<span>$i</span>=0;<span>$i</span><=<span>$page</span>;<span>$i</span>++<span>){ </span><span>$limit</span> = <span>$i</span>*<span>$offset</span><span>; </span><span>$keywords</span> = Keyword::getWordsSource(selectorItem->cid,<span>$limit</span>,<span>$offset</span><span>); </span><span>foreach</span> (<span>$keywords</span> <span>as</span> <span>$val</span><span>) { </span><span>#</span><span> code...</span> <span>$keywordEntity</span> = SplitterApp::<span>split</span>(<span>$val</span>["word"<span>]); </span><span>#</span><span> code...</span> <span>if</span>(MacthExector::macth(<span>$keywordEntity</span>,<span>$charlist</span><span>)){ </span><span>$selectWords</span>[] = <span>$val</span>["word"<span>]; } } } </span><span>return</span> <span>$selectWords</span><span>; } </span><span>public</span> <span>static</span> <span>function</span> createCharListHandle(<span>$className</span>, <span>$charlist</span>, <span>$selectorItem</span><span>) { </span><span>if</span> (<span>class_exists</span>(<span>$className</span><span>)) { </span><span>return</span> <span>new</span> <span>$className</span>(<span>$charlist</span>, <span>$selectorItem</span><span>); } </span><span>throw</span> <span>new</span> <span>Exception</span>("class not exists", 0<span>); } }</span>
Summary
Xiao Shuai Shuai has learned new knowledge points. Is this a way to reward the boss? Do you also want to reward me? Please give me a thumbs up!