(二)分錶思想的引入
近期的文章: 1)高並發資料擷取的架構應用(Redis的應用)
手把手教你做關鍵字匹配專案這塊基本上已經完成,深入研究是對系統的性能作為分析,在一些環境的刺激下所必需要做的一些改變。
手把手教你做關鍵字配對專案: 手把手教你做關鍵字配對專案(搜尋引擎)---- 第一天~手把手教你做關鍵字配對專案(搜尋引擎)---- 第二十二天(共22篇)
深入研究:上節講到 關鍵字匹配項目深入研究-過濾器的引入。
每一篇會分為問題的前因、解決方案以及有些必要的實現方案。
本篇正文正式開始。
問題的前因
隨著自動採集資料的爆炸式的增長,詞庫的容量蒸蒸日上,一下從幾W數據猛增數百萬數據,小帥帥看著數據庫的查詢越來越感到無能為力。
再加上小丁丁常對小帥帥說的最多的一句:何時那麼選詞能快一點,每次我都等好久都莫有反應,真是急死我了。
小帥也較為焦急,心力憔悴,真正的感覺原來這就是挑戰。小帥帥無可奈何的繼續找到於老大,求老大賞賜高招。
於老闆拍拍小帥帥的肩膀:小伙子,知道專案的難度了吧!
小帥帥:別挖苦我了,我已深深的感受到了,我想我心臟估計快承受不了了。
於老大:就這點你就承受不了,那估計以後有的是給你受的。
小帥:大哥,別說這些虛的行不,趕緊的解決方案丫。
於老大:急啥,事情是急不來的,過來,哥給你指條明路。
「每個寶貝是不是有類別的屬性,那麼這數百萬資料真正屬於這個類別的字能夠有多少?假設我們只取這個類別的字庫我們的專案是否可以繼續穩定下來」。
解決方案
依照某種業務需要,我們可以將資料表實施分割,可縱向或橫向分割,可有效的進行效能最佳化。
縱向分割也稱列分割,並以不常用的欄位或長字段分割確保實體處於相對適用的狀態,常見的有一對一關聯。
橫向分割也稱行分割,依照某種業務分割資料的記錄來存放在不同的表,常見的有按日期分錶操作。
本案例是使用橫向分割,並依照類別的形式分割資料。
實現方案
我們為了不更改資料表的結構,這樣設計了,我們依照表名來區分項目使用那個資料表。這樣一來的改動相對是非常少的。我們只要稍微改動下程式碼就可以解決了,這很心塞的一件事情。
修改Keyword的程式碼,增加取得資料來源。
<?<span>php </span><span>define</span>('DATABASE_HOST','127.0.0.1'<span>); </span><span>define</span>('DATABASE_USER','xiaoshuaishuai'<span>); </span><span>define</span>('DATABASE__PASSWORD','xiaoshuaishuai'<span>); </span><span>define</span>('DATABASE_CHARSET','utf-8'<span>); </span><span>class</span><span> Keyword { </span><span>public</span> <span>$word</span><span>; </span><span>public</span> <span>static</span> <span>$conn</span> = <span>null</span><span>; </span><span>public</span> <span>function</span><span> getDbConn(){ </span><span>if</span>(self::<span>$conn</span> == <span>null</span><span>){ self</span>::<span>$conn</span> = <span>mysql_connect</span>(DATABASE_HOST,DATABASE_USER,<span>DATABASE__PASSWORD); </span><span>mysql_query</span>("SET NAMES '".DATABASE_CHARSET."'",self::<span>$conn</span><span>); </span><span>mysql_select_db</span>("dict",self::<span>$conn</span><span>); </span><span>return</span> self::<span>$conn</span><span>; } </span><span>return</span> self::<span>$conn</span><span>; } </span><span>public</span> <span>function</span><span> save(){ </span><span>$sql</span> = "insert into keywords(word) values ('<span>$this</span>->word')"<span>; </span><span>return</span> <span>mysql_query</span>(<span>$sql</span>,<span>$this</span>-><span>getDbConn()); } </span><span>public</span> <span>static</span> <span>function</span> getWordsSource(<span>$cid</span>,<span>$limit</span>=0,<span>$offset</span>=40<span>){ </span><span>$sql</span> = "SELECT * FROM keywords_<span>$cid</span> LIMIT <span>$limit</span>,<span>$ffset</span>"<span>; </span><span>return</span> DB::MakeArray(<span>$sql</span><span>); } </span><span>public</span> <span>static</span> <span>function</span> getWordsCount(<span>$cid</span><span>){ </span><span>$sql</span> = "SELECT count(*) FROM keywords_<span>$cid</span>"<span>; </span><span>return</span> DB::QueryScalar(<span>$sql</span><span>); } }</span>
DB類別新增QueryScalar,用來算總量
<?<span>php </span><span>#</span><span>@author oShine</span> <span>define</span>('DATABASE_HOST','127.0.0.1'<span>); </span><span>define</span>('DATABASE_USER','xiaoshuaishuai'<span>); </span><span>define</span>('DATABASE__PASSWORD','xiaoshuaishuai'<span>); </span><span>define</span>('DATABASE_CHARSET','utf-8'<span>); </span><span>class</span><span> DB { </span><span>public</span> <span>static</span> <span>$conn</span> = <span>null</span><span>; </span><span>public</span> <span>static</span> <span>function</span><span> Connect(){ </span><span>if</span>(self::<span>$conn</span> == <span>null</span><span>){ self</span>::<span>$conn</span> = <span>mysql_connect</span>(DATABASE_HOST,DATABASE_USER,<span>DATABASE__PASSWORD); </span><span>mysql_query</span>("SET NAMES '".DATABASE_CHARSET."'",self::<span>$conn</span><span>); </span><span>mysql_select_db</span>("dict",self::<span>$conn</span><span>); </span><span>return</span> self::<span>$conn</span><span>; } </span><span>return</span> self::<span>$conn</span><span>; } </span><span>public</span> <span>static</span> <span>function</span> Query(<span>$sql</span><span>){ </span><span>return</span> <span>mysql_query</span>(<span>$sql</span>,self::<span>Connect()); } </span><span>public</span> <span>static</span> <span>function</span> makeArray(<span>$sql</span><span>){ </span><span>$rs</span> = self::Query(<span>$sql</span><span>); </span><span>$result</span> = <span>array</span><span>(); </span><span>while</span>(<span>$data</span> = <span>mysql_fetch_assoc</span>(<span>$rs</span><span>)){ </span><span>$result</span>[] = <span>$data</span><span>; } </span><span>return</span> <span>$result</span><span>; } </span><span>public</span> <span>static</span> <span>function</span> QueryScalar(<span>$sql</span><span>){ </span><span>$rs</span> = self::Query(<span>$sql</span><span>); </span><span>$data</span> = <span>mysql_fetch_array</span>(<span>$rs</span><span>); </span><span>if</span>(<span>$data</span> == <span>false</span> || <span>empty</span>(<span>$data</span>) || !<span>isset</span>(<span>$data</span>[1])) <span>return</span> 0<span>; </span><span>return</span> <span>$data</span>[1<span>]; } } </span>
修改Selector的程式碼,用於選字:
<?<span>php </span><span>#</span><span>@Filename:selector/Selector.php</span><span> #</span><span>@Author:oshine</span> <span>require_once</span> <span>dirname</span>(<span>__FILE__</span>) . '/SelectorItem.php'<span>; </span><span>require_once</span> <span>dirname</span>(<span>__FILE__</span>) . '/charlist/CharList.php'<span>; </span><span>require_once</span> <span>dirname</span>(<span>__FILE__</span>) . '/charlist/CharlistHandle.php'<span>; </span><span>require_once</span> <span>dirname</span>(<span>dirname</span>(<span>__FILE__</span>)) . '/lib/Logger.php'<span>; </span><span>class</span><span> Selector { </span><span>private</span> <span>static</span> <span>$charListHandle</span> = <span>array</span><span>( </span>"黑名单" => "BacklistCharListHandle", "近义词" => "LinklistCharListHandle"<span> ); </span><span>public</span> <span>static</span> <span>function</span> select(<span>$num_iid</span><span>) { </span><span>$selectorItem</span> = SelectorItem::createFromApi(<span>$num_iid</span><span>); Logger</span>::trace(<span>$selectorItem</span>-><span>props_name); </span><span>$charlist</span> = <span>new</span><span> CharList(); </span><span>foreach</span> (self::<span>$charListHandle</span> <span>as</span> <span>$matchKey</span> => <span>$className</span><span>) { </span><span>$handle</span> = self::createCharListHandle(<span>$className</span>, <span>$charlist</span>, <span>$selectorItem</span><span>); </span><span>$handle</span>-><span>exec</span><span>(); } </span><span>$selectWords</span> = <span>array</span><span>(); </span><span>$wordsCount</span> = Keyword::getWordsCount(selectorItem-><span>cid); </span><span>$offset</span> = 40<span>; </span><span>$page</span> = <span>ceil</span>(<span>$wordsCount</span>/<span>$offset</span><span>); </span><span>for</span>(<span>$i</span>=0;<span>$i</span><=<span>$page</span>;<span>$i</span>++<span>){ </span><span>$limit</span> = <span>$i</span>*<span>$offset</span><span>; </span><span>$keywords</span> = Keyword::getWordsSource(selectorItem->cid,<span>$limit</span>,<span>$offset</span><span>); </span><span>foreach</span> (<span>$keywords</span> <span>as</span> <span>$val</span><span>) { </span><span>#</span><span> code...</span> <span>$keywordEntity</span> = SplitterApp::<span>split</span>(<span>$val</span>["word"<span>]); </span><span>#</span><span> code...</span> <span>if</span>(MacthExector::macth(<span>$keywordEntity</span>,<span>$charlist</span><span>)){ </span><span>$selectWords</span>[] = <span>$val</span>["word"<span>]; } } } </span><span>return</span> <span>$selectWords</span><span>; } </span><span>public</span> <span>static</span> <span>function</span> createCharListHandle(<span>$className</span>, <span>$charlist</span>, <span>$selectorItem</span><span>) { </span><span>if</span> (<span>class_exists</span>(<span>$className</span><span>)) { </span><span>return</span> <span>new</span> <span>$className</span>(<span>$charlist</span>, <span>$selectorItem</span><span>); } </span><span>throw</span> <span>new</span> <span>Exception</span>("class not exists", 0<span>); } }</span>
總結
小帥帥又學到了新的知識點,這是要犒勞於老闆的節奏嗎?你們是否也要犒賞我呢,求贊哈!