做web,经常会用到key-value的缓存,虽然快,但缓存的维护是个问题,就拿sgementfault举例来说,如果我来做,首页的问题分页就涉及到
1.问题总数缓存
2.每页问题清单缓存
现在管理员删除了一个问题,那么为满足实时性,肯定需要更新缓存
1.问题总数缓存-1
2.每页问题清单缓存如何更新?
当然可以查数据库算出来,是否有这个必要?
这只是个例子,为了说明问题,为了更新一个缓存而造成了另外的数据库查询开销。在实际过程中,我经常用查询条件的组合作为key,这使得我在更新缓存时无从下手(虽然我自己手工维护了一个缓存key的清单,但难免会有遗漏,终归不是好办法)。但如果把缓存的key都定义死,则缓存的使用不是那么灵活。
It is recommended to handle it like this:
1. Cache the total number of questions
Reference: http://www.php.net/manual/zh/memcache...
2. How to update the question list cache of each page?
I didn’t think about it carefully. If it were me, I would probably use an end_id as the end stamp, and then increment it forward when new questions come in. It will be refreshed every 100+ summaries. It mainly depends on the volume. It is difficult to judge without the volume. (If you start small, just refresh the db once every three minutes, no problem)
1. It is good to use memcache incr/decr provided by @jawa to cache the total number of questions
2. A list of questions on each page. For this, you don’t need to maintain the list but only cache the question content. The list itself can be checked from the database. Indeed, as @jawa said, it depends on the quantity. There are many solutions to the details. There are very coarse-grained methods. For example, the list itself mentioned above is directly checked from the database. There are also very fine-grained methods. For example, the content forwarding and number of comments on Sina Weibo are obtained twice and decoupled. Changes in content (less) and changes in count (frequently)
The main purpose of using cache is to reduce the pressure on the database. It is unreasonable to update the cache too frequently.
1. Give up real-time performance and update the cache when the data reaches a certain defined threshold.
2. Use real-time incremental indexing of search engines such as sphinx.