After our move last week, we found that one of the headers in our outgoing email was incorrect. As a result, some MX servers for various email domains are rejecting our emails. These are the confirmation/verification emails sent when a blog is added through SezWho My Account. More than 100 requests have been affected since last week.
We’re working on this and aim to clear the backlog by tomorrow.
Update (08/14/08): Barring a few emails to the aol.com domain and a few bounces due to incorrect email addresses, the email backlog is now cleared. Drop us a note at support@sezwho.com if you don’t receive your keys/confirmation emails within an hour.
We’re moving the database servers and the web servers to a new infrastructure this weekend at approximately 9:30p 7:30p PT on Saturday, Aug 02. We’re expecting a downtime of 45 minutes. During the outage, profiles, ratings, and web services functionality will be unavailable; images, SezWho CSS, and SezWho JS will continue to work without hampering page loads.
Update: Just to be clear, your site will not be affected; it will load as it does now. The things that will not work while we are down are ratings, the profile popup, and so on. So again, your site will not go down because we are down. Let us know if you have questions.
Update (10:13 pm): Woof, Woof
We have started migrating our application to a new datacenter, along with the network changes we have been planning for some time. As the first step, we increased the capacity of the image servers this weekend. Our external web monitors are showing improved response times. See the image below.
Let us know if you notice anything abnormal. The remaining part of the migration is happening this weekend (details to be posted).
After our outage last week, here’s a recap of the events.
We found that our server machines did not go down; the Apache servers did (albeit incompletely, without releasing the socket/port). This made browsers wait indefinitely to connect to the web server, eventually timing out. That delayed the loading of sites that are part of the SezWho community.
To add to the hubbub, our SMS quota on the monitoring service ran out the night before, so we did not get an immediate notification (one of our developers in Bangalore caught this at around 4am his time). We could have resolved the problem in less than 10 minutes had the alerts come as usual. The SMS quota has now been set to infinity.
This week, we got our new environment set up with bigger iron, and we have started moving a few things there. We’re setting it up so that browser requests time out immediately if the servers are not available. We are also exploring suggestions to externalize the CSS.
We take the quality of our service very seriously and will do everything in our power to make sure an outage like this never happens again. We are thankful to our community members for their support, cheers (and jeers!).
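For readers curious about the failure mode, here is a minimal sketch of a probe that tells a cleanly refused connection (fast failure) apart from a server that still holds its port but never answers (the hang described above). It is an illustration only, not our actual monitoring setup; the function name `probe`, the status labels, and the 2-second default timeout are all assumptions.

```python
import socket


def probe(host, port, timeout=2.0):
    """Classify a web server as 'ok', 'closed', 'refused', 'hung', or 'unreachable'.

    Hypothetical helper for illustration; names and labels are assumptions.
    """
    try:
        # Fail fast if the TCP connection cannot be established in time.
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            request = b"HEAD / HTTP/1.0\r\nHost: " + host.encode() + b"\r\n\r\n"
            conn.sendall(request)
            # A healthy server sends at least one byte of a status line.
            data = conn.recv(1)
            return "ok" if data else "closed"
    except ConnectionRefusedError:
        # Nothing is listening: the client learns this right away.
        return "refused"
    except (socket.timeout, TimeoutError):
        # The port is held but nobody answers: the case that stalls browsers.
        return "hung"
    except OSError:
        return "unreachable"
```

For example, `probe("127.0.0.1", 80)` returns `"refused"` when nothing is listening on that port, and `"hung"` when a process holds the port but never replies within the timeout.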