Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#1655 closed task (fixed)

Memcache for spam counting

Reported by: planetcruiser Owned by:
Priority: major Milestone: 0.8
Component: BW Database Keywords:
Cc: tobixen@…

Description (last modified by planetcruiser)

Issue:

  • According to deer:/var/log/mysql/mysql-slow.log this query is slow:
    # User@Host: bewelcome[bewelcome] @ localhost []
    # Query_time: 7  Lock_time: 0  Rows_sent: 1  Rows_examined: 77222
    
    SELECT COUNT(*) AS cnt
                    FROM messages, members AS mSender, members AS mReceiver
                    WHERE mSender.id=IdSender
                    AND messages.SpamInfo='SpamSayMember'
                    AND mReceiver.id=IdReceiver
                    AND (
                            mSender.Status='Active'
                        OR
                            mSender.Status='Pending'
                        );
    

Solution:

  1. Rewrite the section executing this query to use memcache (use a long expiry, maybe 24h)
  2. Make key expire when new spam is reported

Clues:

Related tickets:

Change History (7)

comment:1 Changed 7 years ago by planetcruiser

  • Description modified (diff)

comment:2 Changed 7 years ago by planetcruiser

  • Description modified (diff)

comment:3 Changed 7 years ago by planetcruiser

  • Cc tobixen@… added

tobias via bw-admin list:

Probably this query should run often to discover spammers fast and efficiently. Then memcache is the wrong answer.

i disagree. the query result won't change until a member reports new spam, so part 2 of the suggested solution will make sure the result is always up-to-date.

comment:4 Changed 7 years ago by planetcruiser

tobixen via bw-admin list:

There is only 426 messages with SpamInfo='SpamSayMember' ... so a
conditional index (if such exists in mysql?) or an index on SpamInfo
would probably speed up this query.  That can be done without dev
interaction.  I'll fix.

With some help from devs, we could eventually speed it further up by
having a new code in spaminfo for accounts that are closed and have
some cronscript update the table:

update message set spaminfo = 'SpamFromClosedAccount' where
spaminfo='SpamSayMember' and (... sender account is closed ...)

Also, the join to the receiver account seems to be a noop and should be removed.

comment:5 Changed 7 years ago by tobixen

  • Resolution set to fixed
  • Status changed from new to closed

I fixed the index, execution time has been slashed from many seconds to 0.05 seconds (unless there are some cache effect in place that I'm unaware of?) so I think this one can probably be marked as resolved.

There are two things that concerns me ...

  1. the current solution doesn't scale if the number of flagged spam messages increases a lot. I hinted about a solution for this (differentiated status flag for messages sent by a closed account so the query can ignore those out of the hat).
  1. the index that I made eats up more memory and disk space than what's needed - I should probably spend a bit more time doing research on conditional (aka partial aka filtered) indexes under mysql.

Probably no big deal ... I guess I'll just resolve this one.

comment:6 Changed 7 years ago by planetcruiser

this ticket wasn't actually fixed but invalid, because the query became fast after adding the index and no memcache is needed for now.

the solution by tobixen was this:

create index messages_by_spaminfo on messages(spaminfo);

the query does not show up in the slow.log any longer, so this is definitively fixed. science, you scary! :) if we run into performance problems with the index solution we can reopen.

i added the index to http://downloads.bewelcome.org/for_developers/rox_test_db/bewelcome.sql.gz

comment:7 Changed 7 years ago by planetcruiser

  • Milestone changed from unassigned to 0.8

..to make sure it goes to the 0.8 changelog

Note: See TracTickets for help on using tickets.