Opened 10 years ago

Closed 6 years ago

Last modified 5 years ago

#261 closed bug (fixed)

Abnormal delay for some php page to produce a result

Reported by: jeanyves
Owned by:
Priority: critical
Milestone: Legacy Resolved
Component: unknown
Keywords: delay server slow
Cc: tobixen, hkroger, friedrich

Description

About the delay problem, I confirm that the server is sometimes very slow to respond.
I have added a temporary measurement of the time between the processing of config.inc.php and footer.php (for old BW code). The results show that in several cases the elapsed time is too long.

I have put the measurements into an Excel file (with the IP addresses removed).

It shows that several pages sometimes take a very long time to produce a result.
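The measurement described above amounts to taking one timestamp when the configuration is loaded and another when the footer is rendered, then comparing the difference to a threshold. The actual BW code is PHP and is not shown in this ticket, so the following is only a minimal sketch in Python; `RequestTimer`, `is_slow` and the 5-second threshold are hypothetical names and values chosen for illustration.

```python
import time


class RequestTimer:
    """Measures elapsed time between two points in a request (hypothetical helper)."""

    def __init__(self, clock=time.perf_counter):
        # The clock is injectable so the timer can be exercised without real waiting.
        self._clock = clock
        self._start = None

    def start(self):
        # Would be called where the configuration is loaded.
        self._start = self._clock()

    def stop(self):
        # Would be called where the footer is rendered; returns elapsed seconds.
        if self._start is None:
            raise RuntimeError("start() was not called")
        return self._clock() - self._start


def is_slow(elapsed_seconds, threshold_seconds=5.0):
    """Return True when a request exceeded the (assumed) threshold."""
    return elapsed_seconds > threshold_seconds
```

Logging only the requests for which `is_slow` returns true keeps the measurement cheap while still recording the pages worth investigating.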

Attachments (1)

DebugDelay.xls (21.5 KB) - added by jeanyves 10 years ago.
Summary of some pages that are sometimes slow on the server

Change History (20)

Changed 10 years ago by jeanyves

Summary of some pages that are sometimes slow on the server

comment:1 follow-up: Changed 10 years ago by matrixpoint

See also #227.

comment:2 in reply to: ↑ 1 Changed 10 years ago by jeanyves

Replying to matrixpoint:

See also #227.

Yes John, you are right.

I am thinking about how to give the SD Team more visibility into performance and slow queries; this could help with analysis and also prevent problems.
However, let's keep observing: for now the problem seems to be temporarily fixed, and if that is confirmed it will also confirm my diagnosis.

comment:3 Changed 10 years ago by micha

  • Milestone changed from 0.1.1-outreach-bugfixing to BigPicture

comment:4 Changed 10 years ago by tobixen

I have now added duration logging to the Apache access log, and I will set up scripts to produce munin graphs later.

Unfortunately, the durations are logged as integers, so most values are simply 0. But it should be good enough to spot real problems.

comment:5 Changed 10 years ago by matrixpoint

With #338, the DB queries and their execution times, accurate to tenths of a millisecond, are shown at the bottom of each page.

comment:6 Changed 10 years ago by tobixen

I've set up http://bull.bewelcome.org/ext-server-status (basic auth, as on alpha) so we can monitor Apache performance.

Currently http://bull.bewelcome.org/ is lightning fast, while http://www.bewelcome.org/ feels very sluggish (to me at least).

Out of the last 60000 requests, 561 took more than 5 seconds; of those, 200 were requests for "/", and quite a few of the others seem to be trivial stuff. That is just under 1%, which is insignificant anyway.
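A count like the one above can be produced by scanning the access log for requests whose duration exceeds a threshold. The sketch below is only an illustration: the real log format is not shown in this ticket, so it assumes (hypothetically) that the duration in whole seconds is the last whitespace-separated field of each line, and `slow_request_stats` is a made-up name.

```python
def slow_request_stats(log_lines, threshold_seconds=5):
    """Count requests slower than the threshold.

    Assumes the last field of each line is the duration in whole seconds.
    Returns (total_requests, slow_requests, slow_ratio).
    """
    total = 0
    slow = 0
    for line in log_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        try:
            duration = int(fields[-1])
        except ValueError:
            continue  # skip lines without a numeric duration field
        total += 1
        if duration > threshold_seconds:
            slow += 1
    ratio = slow / total if total else 0.0
    return total, slow, ratio
```

With the figures quoted above (561 slow requests out of 60000), the ratio works out to roughly 0.94%.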

comment:7 Changed 10 years ago by matrixpoint

I don't doubt the new hardware is very fast. But something is causing major problems with the searchmembers page. Queries that always return in a fraction of a second on my laptop can often take 10 seconds or longer on any of the servers, even when I increase the number of members in my laptop DB to 2000. This is already unacceptable performance and will only worsen as the membership grows. Since I don't have access to the servers, I need some way of getting diagnostic information. And I know this tool will be valuable to other developers on an ongoing basis, as it was in CS.

comment:8 Changed 10 years ago by tobixen

One of the reasons I joined BW is that I have quite a bit of experience with database tuning ... but unfortunately I've been working with Postgres for the last few years, and we're using MySQL. I guess I have to read up a bit.

comment:9 Changed 10 years ago by tobixen

Moving the database from goat to bull will probably help.

Another thought: we had a launch at work with major performance problems, and after a while we discovered that each and every transaction caused a new database connection to be opened and closed, instead of existing connections being reused. This caused lots of CPU usage. I should check how much time it takes to execute a simple select.
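The difference between the two patterns can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module as a stand-in, whereas BW runs MySQL, where opening a connection also involves network and authentication overhead. The function names are hypothetical, and any timings it produces say nothing about the production servers.

```python
import sqlite3
import time


def query_with_new_connection(db_path, n):
    """Open and close a connection for every query (the costly pattern)."""
    results = []
    for _ in range(n):
        conn = sqlite3.connect(db_path)
        try:
            results.append(conn.execute("SELECT 1").fetchone()[0])
        finally:
            conn.close()
    return results


def query_with_reused_connection(db_path, n):
    """Open one connection and reuse it for all queries."""
    conn = sqlite3.connect(db_path)
    try:
        return [conn.execute("SELECT 1").fetchone()[0] for _ in range(n)]
    finally:
        conn.close()


def time_call(func, *args):
    """Return (result, elapsed_seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start
```

Timing both functions with `time_call` for the same `n` shows how much of the total is spent on connection setup rather than on the simple select itself.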

A third thought: to reduce the load we should probably make the front page static and/or cached.
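A cached front page could be as simple as keeping the rendered output for a fixed time and only re-rendering once it has expired. The sketch below illustrates that idea in Python; it is not BW code, and `TTLCache`, its parameters and the `render` callback are hypothetical.

```python
import time


class TTLCache:
    """Keeps one rendered value and refreshes it after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be exercised without waiting
        self._value = None
        self._expires_at = None

    def get(self, render):
        """Return the cached value, calling render() only when it has expired."""
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = render()
            self._expires_at = now + self._ttl
        return self._value
```

With a time-to-live of, say, 60 seconds, the expensive rendering runs at most once per minute regardless of how many visitors request the page.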

comment:10 Changed 10 years ago by tobixen

FWIW, according to the munin graphs, bull has no significant CPU usage; goat is mostly idle but has some spikes now and then.

comment:11 Changed 10 years ago by matrixpoint

I'm not a DB or server expert, but something has to be wrong, just comparing with the performance of a similar website I worked on, CS. The largest number of concurrent logins I've noticed at one time on BW is about 5, whereas on CS, 100-200 was not uncommon. And comparing 300,000 members to search among vs. 2,000, there's no good reason why a search should take 5 times longer on BW, as it often does.

So I suspect it's not some subtle detail, but rather something more obvious, like the multiple connection problem you experienced in the past.

There's no doubt we're going to need caching (in addition to the built-in MySQL caching currently used), but with such a light load as it is now, there must be something simpler to change that could improve performance.

comment:12 follow-up: Changed 10 years ago by tobixen

I absolutely agree that "something is wrong". One or more operations when loading e.g. the front page just seem to take lots of time, while the server is generally not loaded at all. Caching the front page will serve as a neat trick to bring down the CPU consumption when the server is loaded, but as of now it will only mask the problem.

I found one problem with name resolution being very slow, but I've fixed that and it doesn't seem to have helped.

Do we know if it's slow due to DB operations? Is #338 implemented in production yet? If not, can it be fast-tracked to production? Can I have access to this functionality?

comment:13 in reply to: ↑ 12 Changed 10 years ago by matrixpoint

Do we know if it's slow due to DB operations? Is #338 implemented in production yet? If not, can it be fast-tracked to production? Can I have access to this functionality?

No, it's only on the test server now. I didn't want to fast track it because of potential security issues and the fact that I'm unclear on rights. Otherwise, it's very simple and shouldn't interfere with anything. I would like to get it to production ASAP, because it would very quickly help us to pinpoint bottlenecks to the MySQL server or, if there aren't any serious ones, rule that out so we can look elsewhere.

There is another potential problem in that #338 involves a modification to Platform PT db access code, but it's such a small and simple mod that even if it gets overwritten by an upgrade, it should be very easy to re-introduce.

RE: rights. I think there should be a right for developers to get general diagnostic info and/or turn on the debug feature. There probably already is and J-Y knows exactly what to do, so I'm just hoping he will take a look and advise us, then #338 can go fast to production.

comment:14 follow-up: Changed 10 years ago by tobixen

I think we never really agreed 100% on what the routines for fast-tracking should be, but I think we have more or less landed on the consensus that the minimum requirement is that one talks with peers, agrees that things need to be done, and gets some comments and code review from peers.

Is the issue reproducible on alpha and test? (I will check a bit later today, if/when I get the time)

comment:15 Changed 10 years ago by tobixen

Well, testing it takes next to no time anyway ;-)

Issue is not reproducible on test, but highly reproducible on alpha.

I will deploy r3765 on alpha tomorrow unless anyone beats me at it or protests.

comment:16 in reply to: ↑ 14 Changed 10 years ago by matrixpoint

Is the issue reproducible on alpha and test? (I will check a bit later today, if/when I get the time)

Only once did I get a substantial delay on test, so it could have been a one-time incident. But I agree, alpha (and production) have highly reproducible, significant delays.

comment:17 Changed 10 years ago by philipp

  • Milestone changed from BigPicture to unassigned

Milestone BigPicture deleted

comment:18 Changed 6 years ago by planetcruiser

  • freq_reported set to 1
  • Resolution set to fixed
  • show_on_bw set to 0
  • Status changed from new to closed

Looks like this is fixed now; all pages load reasonably well. If you experience slow rendering, please re-open.

comment:19 Changed 5 years ago by TimLoal

  • Cc tobixen hkroger friedrich added; tobixen hkroger friedrich removed
  • Milestone changed from unassigned to Legacy Resolved