On Yesterday's Outage

Folks,

First, thank you for using Scholaric for your homeschool planning.  Again, my apologies that you did not have access to Scholaric for such a long time.

As you've no doubt heard, this outage had an impact well beyond Scholaric, or even Heroku, our hosting provider.  Some of the affected services continue to be down as I write this.  In fact, it was covered by CNN: http://money.cnn.com/2011/04/21/technology/amazon_server_outage/index.htm?iid=RNM

I don't mean to make any excuses for the outage; when you run a business, you don't get that luxury.  I can only explain what I know, fix what I can, and try to help you however possible.

What Happened (from what I can tell):

Amazon's Cloud Services (called EC2, for Elastic Compute Cloud) had serious network issues yesterday.  The Elastic part of EC2 makes servers replicate themselves when they get too busy, to handle demand, which can spike quickly on the internet.  Networking issues such as these can cause servers to appear extremely busy, because they are handling error messages sent between them.  This busyness caused EC2 to kick in, and caused a number of servers to replicate themselves.  Under normal circumstances, only a few servers will replicate themselves, but this volume of replication was too much load for the off-server disk systems (EBS, or Elastic Block Store), and they became backed up, making the problem a wider one.
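
As a rough illustration of the feedback loop described above, here is a toy Python model.  Every name and number in it (the function simulate_backlog, the per-tick capacity, the request counts) is invented for illustration.  It is not how EC2 or EBS work internally, only a sketch of why many simultaneous replications could back up a shared storage system.

```python
def simulate_backlog(replications_per_tick, storage_capacity_per_tick):
    """Return the storage backlog after each tick (toy model, made-up units).

    Each replication asks the shared storage system for one unit of work.
    Storage completes at most `storage_capacity_per_tick` units per tick;
    whatever it cannot complete carries over as backlog.
    """
    backlog = 0
    history = []
    for requested in replications_per_tick:
        backlog = max(0, backlog + requested - storage_capacity_per_tick)
        history.append(backlog)
    return history


# Normal day (hypothetical numbers): a few servers replicate, storage keeps up.
normal = simulate_backlog([2, 3, 1, 2], storage_capacity_per_tick=5)

# Bad day (hypothetical numbers): many servers look busy at the same time.
storm = simulate_backlog([20, 30, 40, 10], storage_capacity_per_tick=5)

print(normal)  # [0, 0, 0, 0]
print(storm)   # [15, 40, 75, 80]
```

In this sketch the backlog keeps growing even after the number of new requests drops, which is one plausible reason recovery can lag well behind the original trigger.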

Should Scholaric Use a Different (Non-Cloud) Hosting Service:

In my opinion, no.  There are a few models to choose from when shopping for hosting: a Shared Host, where you deploy to a single machine, along with other web programs; a Virtual Private Server, where you share a server with other programs, but some software makes it appear that you have your own (and you manage it yourself); a Dedicated Server, where you truly have your own server (and again manage it yourself); and finally a Cloud Service, where your software is deployed to an entire set of servers.  Note that experts often talk about moving services "to the cloud", which does not necessarily mean to a cloud-based hosting service.

Of these, clouds are the most complex, but they protect against (1) sudden spikes in traffic, (2) continued use of a service beyond its capacity, and (3) server failure.  The server load issues of (1) and (2) should not be overlooked - if a service cannot meet demand, it can be very difficult to add capacity, and development can halt for extended periods of time while the system is rearchitected to scale better.  In a cloud service, the service can scale up to meet your demand.  For Heroku, this is as simple as logging in to their control panel, and cranking a dial up.

For server failure (3), the worry about losing a web server or a database server is non-existent in a cloud service.  The data and code are running in more than one place, and should one go down, the other is available, and more instances can be deployed easily.  Of course, things can still go wrong and cause a service to be unavailable, as they did yesterday.  Our service was still running, but nobody could get to it, due to other issues.  This kind of problem can happen regardless of which of the above hosting models you use.  The difference is (1) how widespread the outage is and (2) that with a cloud infrastructure (or shared hosting) the hosting provider is made aware of the issue more quickly.  Yesterday, I didn't have to tell Heroku (or Amazon) I was having a problem.  They were working on it before I had a problem.

Heroku has been a large part of my ability to build Scholaric, and I am still extremely happy I chose Heroku.  In the year since our beta launch, I know of only one other outage, which lasted a few minutes, and for which I received no customer complaints.

More To The Point - What Are Our Risks:

When you rely on an online service like Scholaric, you want to know the following:

(A) Are backups done?

Yes, Heroku does backups, and I back up the database independently of Heroku.

(B) What happens if a server crashes?

Explained above.

(C) Can I get to my data if I stop using the service?

We have not had this happen yet, but our plan is to make your data available to you as a CSV file, which you can bring into a spreadsheet, should you quit Scholaric.
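
To give a sense of what such an export could look like, here is a minimal Python sketch.  It is not Scholaric's actual export: the function name export_lessons_csv, the column names, and the sample record are all assumptions made up for this example.

```python
import csv
import io


def export_lessons_csv(lessons):
    """Serialize a list of lesson dicts to CSV text with a header row.

    The column names below are hypothetical, chosen only for illustration.
    """
    fieldnames = ["date", "student", "subject", "lesson", "completed"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(lessons)
    return buffer.getvalue()


# Made-up sample record; a comma inside a field is quoted automatically.
sample = [
    {"date": "2011-04-20", "student": "Anna", "subject": "Math",
     "lesson": "Fractions, part 1", "completed": "yes"},
]
csv_text = export_lessons_csv(sample)
print(csv_text)
```

A file in this shape opens directly in most spreadsheet programs, with one row per lesson and one column per field.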

(D) How is it tested?

We test Scholaric extensively and automatically.  When I work on Scholaric, I turn on my automated tests.  They run while I am coding, every time I save a file.  As of this writing, I have 686 tests, which make 1673 checks.  I have written 2.2 lines of test code for every line of program code.  This does not mean that problems can't get into production, but that we make every effort to prevent them.
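
For anyone who likes to see the arithmetic, the figures above work out as follows.  This short Python sketch uses only the three numbers quoted in this post; the variable names are mine.

```python
tests = 686               # number of automated tests (from this post)
checks = 1673             # number of individual checks (from this post)
test_to_code_ratio = 2.2  # lines of test code per line of program code

# Average number of checks made by each test.
checks_per_test = checks / tests

# Share of all lines written (test + program) that are test code.
test_share = test_to_code_ratio / (test_to_code_ratio + 1)

print(f"{checks_per_test:.2f} checks per test")
print(f"{test_share:.0%} of all lines are test code")
```

That is roughly 2.4 checks per test, with a little over two thirds of all the code written being test code.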

Finally - What Are We Going To Do About It:

I'll continue to track the response of Amazon and Heroku and let you know of any changes they make in response to this problem.

Again, I don't mean to sound like I'm blaming them; I want to do what I can for you.  That is why we are giving a free month of Scholaric to all customers in response to this outage.  For those in their free 15-day trial, we will move back your first payment date by a month.  For those in the beta program, we are setting your first payment date to July 1st.  If you have yet to see our pricing, please go to http://scholaric.com/marketing/pricing

I apologize again for this outage.

jeff

Scholaric Service Up

On my last try, Scholaric was back up, and I was able to log in.

I will continue to monitor the situation.

At this time, I would like to again apologize for this outage, and want to assure you that your data is in no way at risk through this issue - as I regularly back it up, and so does Heroku.

For those of you new to Scholaric, we have really never had an outage of any significant magnitude, so we are doing our best to be open with our customers about the issue and status.

Again, I'll need to gather my thoughts about the issue and let you know.  I apologize once again.

jeff

Update: Some Progress Being Made

I have intermittently been able to get some of the web site back.  I hope this means we are getting closer.

We have an update from Heroku:

UPDATE: We have been able to successfully boot new servers and are in the process of restoring our core services. Once our core services come online we will be able to start to bring app operations back online. We will post further updates as soon as we have additional information.
APR 21, 2011 – 21:15 UTC – 34 MINUTES AGO

Again, I apologize.

jeff

Outage Continuing

I am sorry to report that there is nothing to report.

I have been frustrated by this outage.  Amazon is giving updates, which claim service is being restored.

Indeed, many of the affected services have been restored, but not Heroku, on which Scholaric.com is hosted.

I will publish my thoughts and opinions on this outage after service is restored.

jeff

Scholaric Outage

Users,

Scholaric.com is currently down.

I've been tracking an ongoing issue with my hosting provider, Heroku, whose status you can see at http://status.heroku.com

This morning, Scholaric was working properly, but I was unable to use my tools.  This was identified as a network connectivity issue affecting those tools.

The underlying cause has now been identified as an Amazon Web Services outage, which you can see on http://status.aws.amazon.com/?a

This Amazon outage extends beyond Heroku; other services, like Squarespace, are also having problems today.

I sincerely apologize for this outage, but I continue to be confident in Heroku, the hosting provider for Scholaric, and Amazon Web Services.

As is my policy, all users will get their next payment date moved back due to this outage, and all free trials will be extended.

I will post again when the service is back up.

jeff

Update: The list of affected services includes Netflix, Reddit, and Quora, as well as Squarespace.

Update: Here is what you usually see:

Application Error

An error occurred in the application and your page could not be served. Please try again in a few moments.

If you are the application owner, check your logs for details.

Review of Our First Conference

The response from the conference was overwhelmingly positive. We met with:
 
  • Homeschoolers who used competing products and wanted something simpler
  • Homeschoolers who used paper and wanted to save time
  • Homeschoolers who had lost track of their records and needed to start tracking again
  • Homeschoolers whose homeschool was not as organized as they wanted
  • Homeschoolers who ran a hybrid school/homeschool and needed a solution
  • and some (only a few) homeschoolers who were happy with what they had, and did not want a demonstration.
 
But I did not meet a single:
  • Homeschooler who looked at Scholaric and said they liked their current homeschool planning software better.
 
They spoke unanimously - they were excited about Scholaric’s combination of simplicity and powerful features, and the price was right.

We are more energized than ever.