• RavenDB: Lessons Learned: Query Includes and Projections

    By default, RavenDB allows only 30 requests per session. This is part of RavenDB's "safe by default" behavior, intended to keep you from making a giant number of HTTP requests to the server, which would be a performance quagmire.
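
    If you ever need to see or adjust that limit while diagnosing, it lives on the client conventions; a minimal sketch, assuming the standard client API (and note that raising the ceiling is rarely the right fix compared to the Includes discussed below):

    // Sketch: the 30-request ceiling lives on the store conventions.
    var store = new DocumentStore { Url = "http://localhost:8080" };
    store.Conventions.MaxNumberOfRequestsPerSession = 30; // the default
    store.Initialize();

    using (var session = store.OpenSession())
    {
        // Can also be tweaked per session while debugging, but prefer Includes.
        session.Advanced.MaxNumberOfRequestsPerSession = 50;
    }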

    Let's say you have an object graph that you are retrieving from RavenDB that contains referenced documents, and it looks like this:

    stories/123
    {
      "Headline": "New iPad is key to Apple's bottom line",
      "Author": "Jack Smith",
      "LastPublishedAtUtc": "2012-03-14T23:48:00.0000000+00:00",
      "PublishStatus": "Published",
      "StoryReferences":
      [
        {
          "Id": "storyreference/456213",
          "Headline": "New iPhone coming soon",
          "Author": "John Doe"
        },
        {
          "Id": "storyreference/789654",
          "Headline": "New iPad foils reviewers' attempts to find legitimate faults",
          "Author": "Jane Doe"
        },
        {
          "Id": "storyreference/555111",
          "Headline": "Now on Netflix: Search by TV network",
          "Author": "Jack Smith"
        },
        {
          "Id": "storyreference/942342",
          "Headline": "Apple stores to open at 8am for iPad launch",
          "Author": "John Doe"
        }
        ...
      ]
    }
    

    And let's say you are interested in getting a small subset of data about the referenced stories for display with the base story. What you DON'T want to do is something like this:

    
    var story = session.Load<Story>("stories/123");

    foreach (var storyReference in story.StoryReferences)
    {
        var otherStory = session.Load<Story>(storyReference.Id);
        // ... do something with otherStory ...
    }
    
    

    That will result in the following HTTP traffic back to Raven:

    1. Make a request for 'stories/123'
    2. Make a request for 'storyreference/456213'
    3. Make a request for 'storyreference/789654'
    4. Make a request for 'storyreference/555111'
    5. Make a request for 'storyreference/942342'
    6. ...etc...

    You'll consume unnecessary bandwidth and incur the cost of each individual HTTP request. What you really want is for the client to make a single HTTP request. Fortunately, RavenDB allows you to do that with Includes. A RavenDB include says "Hey server, go get this for me, but before you give it back to me, gather up these other things and return them with the response too so I can deal with them in a moment".
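
    As a concrete illustration of the mechanics (using a hypothetical Order/Customer pair rather than the story schema above), the session exposes Include for plain loads as well; a minimal sketch:

    using (var session = store.OpenSession())
    {
        // One round trip: the order and the customer it references come back together.
        var order = session
            .Include("CustomerId")
            .Load<Order>("orders/1");

        // Already in the session thanks to the include -- no second HTTP request.
        var customer = session.Load<Customer>(order.CustomerId);
    }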

    A few weeks back we had some code that was hitting the 30 requests per session limit. At first we couldn't understand why, since we do a pretty good job of making sure we only make 1 or 2 requests via Includes. Upon further inspection, it turned out we had misunderstood something about the RavenDB client API.

    What's the problem?

    Suppose we have an index used for projections, where the map produces a server-side anonymous entity containing a flattened list of "StoryReferenceIds", like this (a contrived example):

    
    public class Stories_ByReferencedStories : AbstractIndexCreationTask<Story>
    {
        public class Result
        {
            public string Headline { get; set; }
            public DateTimeOffset? LastPublishedAtUtc { get; set; }
            public IEnumerable<string> StoryReferenceIds { get; set; }
        }
    
        public Stories_ByReferencedStories()
        {
            this.Map = stories => from story in stories
                                  select new
                                  {
                                      Headline = story.Headline,
                                      LastPublishedAtUtc = story.LastPublishedAtUtc,
                                      StoryReferenceIds = story.StoryReferences.Select(x => x.Id),
                                  };
        }
    }
    
    

    ... Then we had previously done something like the following on our Lucene queries against it:

    
    session.Advanced.LuceneQuery<Story, Stories_ByReferencedStories>()
      .WhereStartsWith("Headline", text)
      .OrderBy("-LastPublishedAtUtc")
      .Include("StoryReferenceIds")
    

    However, it turns out that last Include line doesn't do anything at all. The Include() call actually operates on the documents identified by the index, NOT on the projection. In other words, the Story documents produced by the query are what the Include() call runs against, and they have no StoryReferenceIds property.

    So, with that in mind, what we actually want is something like this:

    
    .Include("StoryReferences,Id");
    

    The syntax with the comma may look a little funny, but what it means is "For each element of the StoryReferences collection, include the document identified by that element's Id property". So if you had a story with 45 referenced stories in it, instead of making 46 requests back to Raven, you would make only 1 request. That's much better.
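
    Putting it all together, here's a rough sketch of the corrected query (assuming the index and Result class above), with the referenced stories then served out of the session rather than over the wire:

    var stories = session.Advanced.LuceneQuery<Story, Stories_ByReferencedStories>()
        .WhereStartsWith("Headline", text)
        .OrderBy("-LastPublishedAtUtc")
        .Include("StoryReferences,Id") // path evaluated against the Story documents
        .ToList();

    foreach (var story in stories)
    {
        foreach (var storyReference in story.StoryReferences)
        {
            // Included with the original response -- no additional HTTP requests here.
            var otherStory = session.Load<Story>(storyReference.Id);
        }
    }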

    Happy coding.

  • msnbc.com wins Compuware's "Best of Web" for site performance in 2011

    For the truly curious, you can download the full report here. The basic idea is that Compuware took a look at availability, responsiveness, and consistency to generate a composite score ranking the top 25 online news sites. The Compuware folks were even more impressed when they learned that we maintained our number 1 position while doing multiple releases per day and while pushing experiences that are, on average, more complex than our competitors'.

    Although I believe pretty strongly in self-deprecation, I'll break with convention for a moment to give props to my team for their contributions to overall site availability and responsiveness. Congrats!

    For insight into what we've done lately in the war against page load times, check out the new experience on http://www.technolog.msnbc.msn.com/technology/technolog.


  • Customizing RavenDB: A simple RavenDB server bundle for replication conflict handling

    Last week in our webinar on RavenDB, we mentioned that we have at least two RavenDB instances in each of our data centers and that we have each of them configured to replicate to all the others.  One attendee asked how we handle replication conflicts.

    First, we try to limit replication conflicts, by having all writes to a given database directed at a single RavenDB instance.  We showed off our UI for managing that, and I have another post in the works describing the details.

    But there are inevitably cases where there is a conflict.  We have some automated processes that update content, so multiple saves in rapid succession are not unusual.  We also have integration challenges with other systems, resulting in our receiving the same document multiple times at more or less the exact same moment.  If our write master changes at an inopportune moment (either due to failover or an intentional change by an admin), we'll likely have a replication conflict.

    As I mentioned in my post on read-only vs. read-write databases, our front end web apps aren't allowed to write to most databases, so they can't resolve the conflict.  In any case, that would probably make things worse, given the number of front-end servers we have.

    Fortunately, there's an easy solution for us.  The nature of our data and our business means we always want the latest version of a document to win.  The Replication bundle provides a hook to allow custom handling of replication conflicts on the RavenDB server.  We simply compare the Last-Modified value of the existing document with that of the inbound replicating document.  Whichever has the later date wins.

    Here's the code for the LastInWinsReplicationConflictResolver: https://gist.github.com/2012016
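
    The gist has the actual resolver; the heart of it is just a timestamp comparison along these lines. This is a minimal sketch of the comparison only -- the class and method names are illustrative, not the Replication bundle's hook API:

        // Illustrative only -- see the gist above for the real resolver that plugs
        // into the Replication bundle's conflict-resolution hook.
        public static class LastWriteWins
        {
            public static bool IncomingWins(RavenJObject existingMetadata, RavenJObject incomingMetadata)
            {
                // RavenDB stamps every document with a Last-Modified metadata value.
                var existing = existingMetadata.Value<DateTime>("Last-Modified");
                var incoming = incomingMetadata.Value<DateTime>("Last-Modified");

                // Whichever document was written later wins the conflict.
                return incoming >= existing;
            }
        }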

    Score another one for the simple extensibility provided by RavenDB.

  • Customizing RavenDB: Read-only and read-write document stores

    One of the simple customizations we've made to RavenDB is the ability to specify read-only vs. read-write document stores.  We have editorial and ingest applications that need to write into our RavenDB databases.  However, we don't want anyone -- even ourselves -- to be able to write to most of those databases from the public apps that render our various sites.

    Raven has a lot of extensibility points, and we used one of them to make some document stores read-only from some of our apps.

    The Raven.Client.Listeners namespace contains several interfaces that allow you to hook into any query, save or delete that happens on a given DocumentStore instance.  For this feature, we used IDocumentStoreListener and IDocumentDeleteListener. We implemented both in a ReadOnlyListener class that throws on any attempt to call Store() or Delete().  That happens on the client side, in our app, before any communication with the Raven server.

    Here's the complete code of ReadOnlyListener:

        public class ReadOnlyListener : IDocumentStoreListener, IDocumentDeleteListener
        {
            private const string ErrorMessage = 
                "The store is read-only. To enable writes, use StoreAccessMode.ReadWrite.";
    
            public void AfterStore(string key, object entityInstance, RavenJObject metadata)
            {
                // Do nothing.
            }
    
            public bool BeforeStore(string key, object entityInstance, RavenJObject metadata)
            {
                throw new InvalidOperationException(ErrorMessage);
            }
    
            public void BeforeDelete(string key, object entityInstance, RavenJObject metadata)
            {
                throw new InvalidOperationException(ErrorMessage);
            }
        }

     

    The Before methods are executed by the Raven client, um, before the call to the Raven server.

    We have a class called PlatformDocumentStore that enforces some conventions when creating a DocumentStore. This is a typical call that happens at app startup:

        PlatformDocumentStore.Register(container, StoreName.Content);
    

     

    That line of code creates an instance of DocumentStore pointed at the nearest read-only version of a RavenDB database named Content.  It also registers it in the container (an instance of UnityContainer).  The second parameter is just a string.  We don't want arbitrary stores being created; the StoreName class contains the names of the "allowed" stores.  When we register the store, we register it as IDocumentStore using the store name.  That allows us to inject it into our controllers with Unity's [Dependency] attribute and specify the name of the store.
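
    PlatformDocumentStore is our own wrapper, but a rough sketch of what Register does looks something like this -- the nearest-instance lookup and our conventions are elided, and the helper names are purely illustrative:

        public static class PlatformDocumentStore
        {
            // Simplified sketch; overloads and the access-mode handling are discussed below.
            public static void Register(IUnityContainer container, string storeName)
            {
                // Point at the nearest RavenDB instance for this store
                // (our own lookup, elided here) and apply our standard conventions.
                var documentStore = new DocumentStore
                {
                    Url = GetNearestUrlFor(storeName),
                    DefaultDatabase = storeName
                };

                documentStore.Initialize();

                // Named registration, so controllers can ask for a specific store
                // via Unity's [Dependency] attribute.
                container.RegisterInstance<IDocumentStore>(storeName, documentStore);
            }

            private static string GetNearestUrlFor(string storeName)
            {
                // Placeholder -- the real implementation consults our environment config.
                return "http://localhost:8080";
            }
        }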

    But back to read-only vs. read-write...  What you don't see is the optional third parameter.  An editorial app would have this line at app startup:

        PlatformDocumentStore.Register(container, StoreName.Content, StoreAccessMode.ReadWrite);
    

     

    StoreAccessMode defaults to ReadOnly.  That third parameter says it's okay for the application to write to the store.  When we new up the DocumentStore instance inside of PlatformDocumentStore, we have this code:

        if (accessMode == StoreAccessMode.ReadOnly)
        {
            documentStore.RegisterListener((IDocumentStoreListener)new ReadOnlyListener());
            documentStore.RegisterListener((IDocumentDeleteListener)new ReadOnlyListener());
        }
    
    

    With the listener attached for both Store() and Delete() calls, we have effectively prevented any modifications to the data from within this app.

    It's important to note that the database on the Raven server is perfectly happy to accept writes -- we've just hooked into all attempts to use the client-side API from within this one application.  Also, a developer could simply new up a DocumentStore without the listeners and call Store() or Delete() to their heart's content.  This is not a security measure -- it's just a simple way to prevent inadvertent writes where we don't want them.  Like setting the Read-only flag on a file in the file system.

  • How we got started with RavenDB

    Developers always want to use the newest, coolest tools.  Admins want everything to be 100% reliable and stable.   RavenDB was not just a new tool, it was an entirely new kind of tool for us.  Successfully introducing it to our environment required our Ops folks to be comfortable with monitoring and maintaining it.

    After our initial testing in various environments, we talked about it with Ops and deployed RavenDB to production -- without anything actually talking to it.  That was step 1 -- get it deployed.  It was on its own servers and nothing depended on it, but it was out there.  RavenDB was now part of our build and deployment process, so we could update it whenever we needed to.

    Step 2 was to deploy some fairly trivial code, accessible only to internal users, that used RavenDB.  So far so good.

    Step 3 was to turn on replication across the multiple instances of RavenDB.  Here we did find some issues: a few in our own code as we learned about RavenDB; one in RavenDB itself -- a bug involving the combination of replication and DTC-managed transactions.  After a few emails with Oren and Hibernating Rhinos support, a build with a fix was available for download within a couple of days.  We also asked for a new feature -- the ability to disable transitive replication.  That was available in another build within a week.

    Step 4 was to start using RavenDB behind a real feature.  We chose a non-mission-critical feature of our editorial system -- something that, if broken, would only affect our editors, not the public.  We shipped that and learned a bit more.  Bryan posted about his experience creating that feature.

    Step 5 was to ship a feature used by the public.  We did an A/B test of a new user experience for some of our content in late 2011.  That ran on RavenDB.  More recently, in Feb 2012, we fully released that feature:  we transitioned some of our blogs to a new look and feel, from our new RavenDB-backed CMS.  You can see those blogs at:

     

  • Why we looked at RavenDB

    In my previous post, I talked about our initial work in our new "SkyPad" CMS, and some of the tradeoffs we made around data storage and replication.

    As we continued to evolve SkyPad, we realized that the majority of our data isn't inherently relational.  It's almost entirely documents -- serialized object blobs.  That caused us to look at the available options for key-value stores, document stores, distributed hashtables and the like. 

    RavenDB looked like a good fit for a bunch of reasons:

    It is a document database. The basic paradigm is fairly close to our problem domain: saving, updating and publishing documents.  Our editors, developers and admins think and talk in terms of documents.

    It is schema-less.  To keep up with our competition and with what our users expect, we have to be adding and changing features constantly.  We need to add properties to our entities, create new entities, split entities, rename entities.  RavenDB stores documents as JSON.  It de/serializes our types effortlessly, while giving us the tools to take over and do our own thing when necessary.  There is no schema to keep in sync with our code changes.

    It is transactional.  We can't afford to lose a document.  If our database says "I saved it", we need that to be true no matter what.  Our service bus is built with WCF and MSMQ.  RavenDB works with System.Transactions as you would expect, so we didn't have to do anything special to use it during transactional message handling.

    It is .NET-focused and supports querying with Linq.  We're a .NET shop.  Ramp-up speed for new developers and code readability are critical.  Anyone who has used C# and Linq can understand the queries we're making against RavenDB and start creating their own very quickly.  Not to mention we can all read the RavenDB source itself more easily.

    It is extensible.  The list of provided bundles reads almost like our requirements list:  replication, versioning, document expiration, authorization.  Plus we can change RavenDB's behavior where we need to.  (Bundles on the server side, and listeners on the client-side.)  That makes our developers a lot more comfortable.

    It is both open source and commercial.  We can see all the code, submit pull requests when we want something to be different, or just make our own private changes to the codebase.  But we can also pay for it, count on support, and -- ahem -- have a throat to squeeze if something goes seriously awry…

    It is based on very mature storage technology.  RavenDB writes everything to disk, and the underlying storage is ESENT, which is also used by Exchange and Active Directory.  Those underpinnings made our Ops team more comfortable trying it out.  We have a humongous read-write ratio, so the performance characteristics of writing everything to disk are just fine for us.

  • The backstory on data storage in our CMS

    In 2009 we started migrating features from our legacy CMS to a new one, which we refer to as SkyPad.  In previous posts, I've talked about our SOA approach and our service bus.  Here I'll be talking just about data storage.

    We use Sql Server a lot.  It's been supporting most of our systems and running well for 15 years.  It was natural for us to continue using Sql Server as we started to create SkyPad. 

    As we started developing features in SkyPad, we found that almost all of our tables were identical.  We deal with documents  -- stories, videos, slideshows, images, etc. -- and the table structure we were creating over and over again consisted of a document id, an xml blob with the document contents, and some timestamps for auditing and concurrency control. 

    That common table structure led us to develop a simple CRUD repository.  Create, Update and Delete were trivial, as was retrieving by ID.  Querying was harder, since it required either storing the queryable properties separately, or digging into the XML blob.  All doable, but not as simple as we wanted.

    As a major news organization, it is, alas, a newsworthy event if one of our sites goes down even for a few minutes.  So our sites run in multiple datacenters, and our ops team requires complete redundancy within each data center.  We have sets of servers called "pods".  Each is a small, but complete, production environment.  During normal maintenance and deployments (a couple times a day), we can take a pod offline and still be running live out of all the data centers.  That requires our data to be redundant also - each pod has a copy of all the live data.

    Sql Server provides excellent replication and high availability features.  However, we were trying hard to build a system that could be run on commodity hardware and scale out, not up.  We wanted to avoid clustering, SANs and other solutions that require extensive setup, care and feeding by our Ops team.  We wanted teams to be able to quickly spin up a test environment that looked a lot like production.

    For those reasons, we decided to use our service bus to synchronize data across the various databases in multiple data centers.  When a pod is taken offline, data synchronization messages start queuing up.  When the pod is brought back online, the messages are handled and the pod's data is up to date before it starts receiving live requests.

    That works fine, though it has some issues.  To reduce concurrency conflicts, at any given time, one pod is designated as the write-master for each application that is doing writes.  We deploy multiple times a day, and we need to change the write-masters each time.  We need a tool to re-sync pods when the inevitable problems occur or when we bring a new pod online.  These were the tradeoffs we made when deciding to roll our own replication.

  • Raven DB: Lessons Learned: Caching Contexts

    Caching

    When you talk about caching in terms of the full web application stack, you've typically got the following layers:

    • Browser cache
    • CDN cache
    • Application output cache
    • Data cache

    However, in an application leveraging Raven DB, that last layer actually gets split into two.

    Some Background

    The way Raven DB operates is by having the client generate HTTP requests, which are sent across the wire to the server. Therefore, the standard caching mechanisms that HTTP provides are present. This means that if a request is made, and Raven DB thinks the data hasn't changed since the last time you requested that same data, the server responds with HTTP 304 Not Modified, instructing the Raven client to continue to use what it got last time.

    using (var session = store.OpenSession())
    {
        // If the server doesn't have anything different from the last time this was
        // requested, it won't do any processing and just returns HTTP 304 Not Modified.
        // The client then uses what it got last time.
        var foo = session.Load<Foo>("foos/123");
    }
    

    So you get the first layer of the data cache for free, out of the box, with Raven. Fortunately, the second layer is available as well, if your application needs it.

    Aggressive Data Caching

    With Raven DB, it's possible to instruct the client to not even ask the server for data again, thereby skipping the HTTP request entirely, even one that would only have resulted in a 304. Here's what that looks like:

    using (var session = store.OpenSession())
    {
        // Set up an aggressive caching context, instructing the client not to make
        // an HTTP request if it already made this one within the last 5 minutes.
        using (session.Advanced.DocumentStore.AggressivelyCacheFor(TimeSpan.FromMinutes(5)))
        {
            var foo = session.Load<Foo>("foos/123"); // may or may not make a request
        }
    }
    

    Runtime Configuration?

    We made mention in a previous blog post of a runtime configuration setup that we've provided our ops team with. Controlling the TTL for Raven's aggressive caching from that runtime configuration seemed like a prime candidate. We wired it up much the same way as the output caching runtime configuration from the other blog post.
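
    In practice that just meant carrying a Raven duration alongside the output cache duration on the same CacheSettings class used for output caching; a rough sketch (the property names here are illustrative):

    public static class CacheSettings
    {
        private static CacheSettingsData Data = new CacheSettingsData();

        // Updated over the service bus whenever ops change the setting.
        public static int RavenAggressiveCachingDurationSeconds
        {
            get { return Data.CurrentAppCacheParameters.RavenAggressiveCachingDurationSeconds; }
        }

        internal static void UpdateSettings(CacheSettingsData data)
        {
            Data = data;
        }
    }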

    Clever

    Now, to use output caching, it was a simple matter of applying the [ConfiguredOutputCache] attribute to our controller actions. However, with the Raven data caching, it's a violation of DRY to have to open an aggressive caching context and pass in a runtime configuration value everywhere it's needed. So, with that in mind, we came up with a couple of extension methods to encapsulate that behavior. We thought this was very clever, but it actually turned out to be quite stupid. Can you spot the problem?

    public static class DataCachingExtensions
    {
        public static T LoadAndCache<T>(this IDocumentSession session, string id)
        {
            using (session.Advanced.DocumentStore.AggressivelyCacheFor(
                TimeSpan.FromSeconds(CacheSettings.RavenAggressiveCachingDurationSeconds)))
            {
                return session.Load<T>(id);
            }
        }
    
        public static IRavenQueryable<T> QueryAndCache<T>(this IDocumentSession session)
        {
            using (session.Advanced.DocumentStore.AggressivelyCacheFor(
                TimeSpan.FromSeconds(CacheSettings.RavenAggressiveCachingDurationSeconds)))
            {
                return session.Query<T>();
            }
        }
    }
    
    ... 
    
    session.LoadAndCache<Foo>("foos/123");
    
    ...
    
    session.QueryAndCache<Foo>().Where(f => f.Bar == "Baz").ToList();
    
    

    The first extension method is fine, but the 2nd one doesn't do anything at all. Why?

    It's because Raven doesn't actually execute the HTTP query until the queryable is evaluated. Since we return from inside the aggressive caching context before the query is ever evaluated, the context has already been disposed by the time the HTTP request is made, resulting in no caching.

    So after feeling pretty silly, we restructured the extension method to simply return the aggressive caching context, so that the caller can encapsulate the full query including its execution.

    public static class DataCachingExtensions
    {
        public class NonCachingContext : IDisposable
        {
            public void Dispose() { }
        }
    
        public static IDisposable GetCachingContext(this IDocumentSession session)
        {
            if (CacheSettings.RavenAggressiveCachingDurationSeconds == 0)
            {
                return new NonCachingContext();
            }
    
            return session.Advanced.DocumentStore.AggressivelyCacheFor(
                TimeSpan.FromSeconds(CacheSettings.RavenAggressiveCachingDurationSeconds));
        }
    }
    
    ...
    
    using (var session = store.OpenSession())
    {
        using (session.GetCachingContext())
        {
            session.Query<Foo>().Where(f => f.Bar == "Baz").ToList();
        }
    }
    

    It's worth pointing out that when the caching context is used with a Query, it doesn't cache the individual documents returned; it only caches the query request/response pair. A subsequent cache-enabled .Load of a document that came back from a cache-enabled query will therefore still make a request, unless that document was already cached by a .Load call of its own.
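
    To make that distinction concrete, a small sketch (assuming a Foo document with an Id property):

    using (var session = store.OpenSession())
    using (session.GetCachingContext())
    {
        // Cached as a single query/response pair.
        var foos = session.Query<Foo>().Where(f => f.Bar == "Baz").ToList();

        // This document came back in the query results above, but loading it by id
        // is a different request/response pair -- it still hits the server the first
        // time, and only then is it cached for subsequent loads.
        var foo = session.Load<Foo>(foos.First().Id);
    }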

    Happy coding!

  • Performance on-demand; Giving your ops team runtime flexibility

    Performance On Demand

    Pretend you are in an operations position, where your job is to maintain the infrastructure that routes traffic and the servers that serve requests. Wouldn't it be nice if, when you suddenly had a surge in traffic or a drop in available server hardware (expected or unexpected), you could alter the performance characteristics of your web applications?

    This is a problem we've been tackling with our new set of web apps, and we think we've got a pretty good solution in place.

    Operations Administration Panel for Runtime Configuration

    For starters, we've created an administration web application for our operations folks, whose primary purpose is that of runtime configuration. Operations can control various aspects of our systems from this application, including:

    • Logging levels
    • Caching TTLs
    • Database masters/slaves and replication strategies
    • Application settings
    • Network locations for editorial assets
    • Logical service bus participants
    • Etc.

    When any of these settings is updated, we send a message on the service bus informing subscribers of changes in the settings they care about. Let's dig into the caching TTLs mentioned above, which give ops a way to dial in performance on demand.

    Output Caching

    In the administration panel, we've provided a settings page where the output caching TTL and data caching TTL can be set for a given application. When this setting is updated, we publish a message on the service bus, which our front end rendering ASP.NET MVC application can subscribe to.

    Creating a handler in the rendering application then is pretty easy. We listen for the settings type that corresponds to caching:

        public class CacheSettingsUpdater : SettingsChangedHandler<CacheSettingsData>
        {
            protected override bool ShouldHandle(string id)
            {
                return string.Equals(
                    id,
                    CacheSettingsData.StorageId,
                    StringComparison.OrdinalIgnoreCase);
            }
    
            protected override void Update(CacheSettingsData settingsData)
            {
                CacheSettings.UpdateSettings(settingsData);
            }
        }
    

     

    As you can see, the handler then informs a settings class by calling its "UpdateSettings" method, and that class keeps a reference to the latest data.

        public static class CacheSettings
        {
            private static CacheSettingsData Data = new CacheSettingsData();
    
            public static int OutputCacheDurationSeconds
            {
                get
                {
                    return
                        Data.CurrentAppCacheParameters
                            .OutputCacheDurationSeconds;
                }
            }
    
            internal static void UpdateSettings(CacheSettingsData data)
            {
                Data = data;
            }
        }
    

     

    Leveraging it with OutputCacheAttribute

    Now, in ASP.NET MVC, there is an action filter for output caching: OutputCacheAttribute. This attribute can be applied at the controller level or at the individual action level. When an action runs the first time, the framework caches the result, so the next request doesn't require processing again and is served from cache. The cached item is served until the TTL/Duration expires. The effect is that your application isn't doing full processing for every request, and can therefore serve more requests.

    The issue with connecting our runtime configuration class from above (CacheSettings) to OutputCacheAttribute is that the settings for a filter can only be specified with constants, like so:

        [OutputCache(Duration = 10)]
        public ActionResult Index()
    

     

    So, we need to instead create our own action filter, which inherits from OutputCacheAttribute, so we can control where it gets its values from. I've simplified this for brevity to just illustrate the Duration extensibility point.

        public class ConfiguredOutputCacheAttribute : OutputCacheAttribute
        {
            public new int Duration
            {
                get { return base.Duration; }
                set
                {
                    throw new NotSupportedException(
                        "Duration cannot be set directly. " +
                        "Set from runtime config.");
                }
            }
    
            public ConfiguredOutputCacheAttribute()
            {
                base.Duration = CacheSettings.OutputCacheDurationSeconds;
            }
    
            public override void OnActionExecuting(
                ActionExecutingContext filterContext)
            {
                base.Duration = CacheSettings.OutputCacheDurationSeconds;
                base.OnActionExecuting(filterContext);
            }
        }
    

     

    As you can see, when we hit OnActionExecuting, we check the CacheSettings class for the current output cache duration and set it on the base OutputCacheAttribute class we inherited from. The effect is that, during day-to-day traffic, operations can control the cache TTL.

    Then we just apply it where we want to cache:

        [ConfiguredOutputCache]
        public ActionResult Index()
    

     

    Well, how did we do?

    Let's see what it looks like if I simulate light traffic load. The red line indicates request execution time.

    Turning on output caching results in a dramatic drop off in request execution time.

    The dramatic drop off occurred when I went into the operations administration panel and changed the TTL. The spikes every 10 seconds following the drop off are when the cache duration TTL expired, forcing the page to actually process again.

    There are a number of things we can do to enhance the flexibility of this system. For example, we could specify groupings in the operations administration panel that correspond to cache policies, and then simply specify on each instance of our attribute which policy we'd like to use:

        [ConfiguredOutputCache(CachePolicy = "FooCachePolicy")]
        public ActionResult Index(string streamSlug)
    

     

    We think this feature will be particularly valuable in situations where we need more performance on demand, and look forward to extending it to have more flexibility as needed.

    Happy coding!

  • Battling the Fallacies of Distributed Computing with RavenDB

    Recently, I deployed some code that had the following requirements:

    • When post {x} is first published in CMS A, import a summary of {x} into CMS B
    • When {x} is subsequently updated and re-published in CMS A, do nothing

    Seemed pretty simple.  Due to the limited API support in CMS B, I used RavenDB to maintain a record of posts I had already imported from CMS A to CMS B in order to honor the second requirement.

    Worked on my machine

    Immediately after deploying, I basked for about 30 seconds in the praise from our editors. Moments later, I started receiving reports of duplicate posts showing up in CMS B. I was flabbergasted. I had been careful to handle the dupe scenario in code.

    I checked the code again. By design, the code prevents dupes from being created in CMS B…. unless the duplicates arrived less than a few milliseconds apart. Fail.

    When I looked at the server logs, the duplicate notifications were indeed happening less than a millisecond apart. For a few minutes, I thought about how I might prevent the duplicate publish notifications. Ultimately, I embraced the first two fallacies of distributed computing instead:

     

    1. The network is reliable
    2. Latency is zero

     

    Duplicate notifications arriving a couple of milliseconds apart are a fact of life. Deal with it.

    Raven etags and concurrency control to the rescue

    One of my favorite aspects of Raven is that it’s ACID when you need it, BASE when you don’t. Here’s how we made it really ACID-y to solve the duplicate import problem:

    1. When a notification comes in, check Raven to see if we’ve already imported the post.
    2. If the post has never been seen, create a new document in Raven with a null etag and using optimistic concurrency. Used this way, the Raven client will throw an exception if anyone else tries to create the same document. Here’s the code:

        using (var session = store.OpenSession())
        {
            session.Advanced.UseOptimisticConcurrency = true;

            var post = new Post()
                           {
                               Id = id,
                               ImportStatus = ImportStatus.ImportStarted,
                               ImportStartedAtUtc = DateTimeOffset.Now
                           };

            session.Store(post, null);
            session.SaveChanges();
        }

       

    3. Send a message using our service bus to actually perform the import – Though not related to RavenDB directly, see Jimmy Bogard’s post on how to use messaging patterns to interop transactionally with non-transactional systems (in my case, the CMS B APIs do not participate in a distributed transaction, so we had another source of dupes when message failures were retried after a transaction rollback).

    Steps 1-3 are wrapped in a distributed transaction. When a simultaneous duplicate notification occurs, step 2 will fail for all but one of the notifications. All the failed transactions get rolled back and dupes no longer show up in CMS B.

     

  • Blitz.CSharp: A Nuget package for Blitz.io

    Since the Blitz folks lacked a C# library to interact with their API, I took the task upon myself. I've spun up a Github repository for it as well as a Nuget package.

    To take it for a test drive, create a new project, run "Install-Package Blitz.CSharp" from the Package Manager console, and paste the following code:

    using System;
    using Blitz.CSharp;

    namespace Blitz.Console
    {
        public class Program
        {
            private const string apiKey = "xxx-xxx-xxx-xxx";
            private const string userName = "jon.doe@foo.com";
            private const string url = "http://foo.com";

            private static void Main()
            {
                var sprintRequest = new SprintBuilder()
                    .FromRegion("virginia")
                    .WithStep(url)
                    .AsSprintRequest();

                var rushRequest = new RushBuilder()
                    .FromRegion("oregon")
                    .WithStep(url)
                    .WithInterval(1, 10, TimeSpan.FromSeconds(10))
                    .WithInterval(10, 250, TimeSpan.FromSeconds(50))
                    .AsRushRequest();

                var sprint = new Sprint(userName, apiKey);
                sprint.SprintStatusChanged += (s, sprintStatus) =>
                {
                    System.Console.Out.WriteLine(sprintStatus.status);
                };
                sprint.Execute(sprintRequest);

                var rush = new Rush(userName, apiKey);
                rush.RushStatusChanged += (r, rushStatus) =>
                {
                    System.Console.Out.WriteLine(rushStatus.status);
                };
                rush.Execute(rushRequest);
            }
        }
    }

    Feedback warmly welcomed.

     

  • Blitz.io – Load testing made easy

    If you haven’t taken AppHarbor for a test drive, you should. While you’re there, check out the various 3rd party add-ons. Without fail, they are useful, easy to use, and have free usage tiers that allow you to do interesting things without divulging your credit card number. This is how I stumbled upon Blitz.io.

    What is Blitz?

    Load testing is a strange addiction to develop, but many would-be load-testing junkies are turned off by the complexities of simply hammering a server with a bunch of requests. Blitz.io makes this as easy as clicking a button. Seriously. Within seconds, you can have an army of load agents at your disposal, ready to pound your servers from Amazon's public cloud.

    What will Blitz.io do for me?

    Glad you asked. Blitz isn't going to solve all of your performance problems. In fact, its beauty lies in its simplicity, so don't expect an "enterprise" product with thousands of features. However, in the last couple of weeks, I've used it to:

    • Verify my basic CDN configuration is working
    • Verify my load balancer is working as expected
    • Verify my application cache performance is decent
    • Verify application performance is tolerable in general
    • Find and fix a performance issue

    What Blitz hasn’t done for me:

    • Collected perf counters from my server – you should be doing this anyway, right?
    • Told me how to fix my problems  

    Blitz makes load testing fun

    As odd as it sounds, watching production servers sweat as they get pummeled by Blitz is pretty entertaining. One of my new office pastimes involves having Blitz on one monitor and Splunk on another and watching how well the latest build performs in production.  The Blitz UI is easy on the eyes. See the screenshot below. Better yet, take 5 minutes and try it out yourself.