How To Transition From Being an In-Office to Remote Programmer

Note: This article was originally posted to my personal Medium account.

“person facing laptop inside room” by Muhammad Raufan Yusup on Unsplash

My name is James, and I’m a Software Engineer at a company called Yesware, based in Boston. Yesware is the fourth job I’ve had in which I’m paid to write code, but it’s the third time now that I’ve transitioned from being an in-office employee to a remote one. Since I’ve handled this transition a few times now, in varying degrees of complexity/difficulty, I figured I probably have at least a few words for anyone else looking to begin working remotely.

But first, a warning: survivorship bias guarantees that your mileage may vary with regard to anything I say here. Switching to a remote position has worked for me in the past, but that in no way means it will work for anyone else in the same ways. You don’t often see articles entitled “How I Completely Failed to Work Remotely and Botched My Dream Gig”, but I’m sure there are many cases of folks doing exactly that. That would actually probably be a more helpful article, so consider this an open invitation for someone to write it as a response.

Also, I talk about my place of work in this article a fair bit, but I’d like to state clearly that I wasn’t asked to, or otherwise encouraged to at all. Any reference I make to my employer is included only because I see it as relevant or helpful to others, so as to get an idea of how a company might support its employees.

Why Remote?

This is a question I won’t dwell on too much because there are endless blogs, listicles, and books¹ written on the subject. Remote work is fantastic for some personality types as it affords much more freedom. Some folks just enjoy being able to move, to be able to go where their friends/family/bucket-lists take them. Being able to take your job where you want to go is, in my opinion, less stressful than having to factor in finding another one if you decide to leave. Simply put, it resolves any dichotomies between having a job you enjoy, and anything else you want to do. You can have your cake and eat it too².

Why Not?

Remote work can be much harder depending on a myriad of details related to one’s personality, the nature of their work, their company’s culture, etc. Are you the type of person who can self-regulate and ensure you’re allowing for extended periods of focus, wherever you are? Is the work you’re doing something that can be done without any coworkers immediately available? Is the company you work for willing to put in the extra effort to ensure you feel connected and valued? Do you have some misconception that remote work means doing any less work? Do you have a spouse or other close family member whom this will affect? Is the Wi-Fi going to be any good where you’ll be based? I’ll leave you to ruminate on those questions, but there are many reasons that being remote wouldn’t be a good fit. They each warrant honest discussion with yourself, those close to you, and your employer.

How To Tell Your Company

Eventually, you’ll need to broach the subject with your boss, and while there are various ways to do this, there is only one I recommend.

The worst way I ever went about it featured me telling my boss that I was moving in two weeks, no ifs-ands-or-buts about it. I was essentially abruptly quitting, without really offering to discuss the matter. I knew I had to go, and I was ready to leave my job to make that happen, but I felt as though I had cornered my boss. After my spiel (“I grew up here, I’m ready to leave, I need to do this”, etc.), he told me to just work from Boston, and that there was no reason I couldn’t stay on the team while working elsewhere. This was incredibly surprising to me, and something I hadn’t considered. I ended up taking the offer. A few weeks later I was in Boston working the same job as before, and paying significantly more in rent.

This method felt wrong for a few reasons. Namely, I didn’t bring the idea up as a conversation, and I went in defensively, expecting the worst. Also, I didn’t value my role in the job enough! I didn’t expect that I was valued enough by my company to warrant letting me move and stay on the team, and that was unfair to myself.

By the next time I had to have the same conversation, I was already 9 months into my tenure at Yesware. I had an opportunity to move to Spain with a friend, but I valued my job an enormous amount. I was simply learning and growing too much to justify leaving³. Since I felt more comfortable at Yesware than at any previous company I’d worked for, I figured I’d just talk to my boss about the opportunity openly. I wrote him an email one morning stating plainly that I had an opportunity to move to Spain in the upcoming fall, and that I wanted to talk about whether that would be possible while still working for Yesware. I dug up the original email, and the sentence that best captures its spirit was “I just want to open this dialogue to see what is and isn’t possible, without any expectations.”

My boss and I immediately had a short 1-on-1 meeting to talk about the idea further, and we agreed that, while it was unlikely to happen, we could at least see what it would take to work. If there were no legal reasons we couldn’t do it, and the company felt confident that my work wouldn’t suffer, we felt as though it should be fine. Over the next few months, we discussed what it would take to demonstrate that I could handle remote work, and that our team could. We organized a month-long experiment where our engineering team of ~6 people would be remote every day, and I used this as an opportunity to show that I could be perfectly productive (oftentimes much more so) while at home. Eventually, once we had clarified everything with the Yesware legal team, I was made aware that I could work from Europe if I so chose.

“brown concrete buildings” by Alasdair Elmes on Unsplash

The most important thing I learned from this is that discussing big changes like this requires one to be honest, upfront, and realistic. By making sure Yesware knew I wouldn’t leave the company if they said no, I helped make the process easier for them to consider. We also put in the extra work to make sure it would go over smoothly by making sure the team would be comfortable with me being remote, and by proving that our productivity wasn’t jeopardized.

What It Actually Takes To Work Remotely

I’d like to take a moment to reiterate my warning message at the beginning of this article, in which I claim that just because this works for me, it doesn’t mean it will work for anyone else. You’ve been warned.

The biggest struggles I expect one to encounter when starting to work remotely are, in no particular order, managing to keep a normal schedule, creating a space conducive to focusing, and feeling connected to your team. They are also 3 sides to the same geometrically-impossible coin. For me, managing to succeed in one of those areas makes succeeding in the others easier. Here are some ways I do that, and some ways that my team supports me:

Stick to a routine. Pick a chunk (or chunks) of time every day to focus on work. While a remote schedule often means some degree of freedom when choosing when to work, we can get a lot out of maintaining some sort of normal hours. I tend to stick roughly to working hours EST (which is 3pm to 11pm for me in Spain), but I also know I’m most focused first thing in the morning, so I always do an hour or two in the morning, then trim a little off my evening hours when my team doesn’t tend to schedule meetings.

Schedule time for casual conversation. By being remote, you’re going to miss out on casual chats at lunch with your team, and you’re going to be deprived of spontaneous conversations with other departments at the proverbial water cooler. These can be vital interactions, however, and they warrant being protected. Find time to engage coworkers in non-work discussions so you can continue to develop natural friendships with those you spend so much of your day with.

For example, my team often uses our bi-weekly retro meetings to talk about whatever we want for an hour. We do end up being able to talk about things that have gone well or poorly since the previous iteration of the meeting, but we don’t worry about being completely distracted for an hour with a coworker’s photos from a trip abroad, or the grisly details of an ongoing murder investigation⁴. Our daily standups can also get a little carried away sometimes, but as long as they stay within the allotted 15-minute window, everyone is happy to chat a little bit about whatever. It helps to bring the team a little closer in a small but ultimately meaningful way.

Set up your ideal workspace. This is a big one for me personally. There is a profound difference in my productivity when I am working with all the tools I need versus when I am not. For example, I’ve realized that the correct keyboard for me is a dealbreaker, and I tend to work faster with another monitor hooked up to my machine. Also, my back gets achy after sitting for too long, so I usually use a standing desk, or an ergonomic chair. Invest in your workspace, and the boost you’ll experience in being able to focus will ensure the money is made back. Yesware has a generous employee expensing policy, so I’m encouraged to buy what I feel I need to be healthy and productive on the company’s dollar. If your employer can’t make the same offer, it’s still important to spend what you can on having a proper battle station.

Get out of the house. Now that your consummate work station is complete with an ergonomic keyboard and mouse, powered standing desk, miniature koi pond, 4K monitor, and Newton’s cradle, leave it all and go somewhere else. Work a part of your day at home, and maybe finish the evening at a coffee shop, or library, or a friend’s house. Do everything you can to not get stuck inside, because when you work where you live it’s a serious risk. When I was in the US and working remotely, I would always either skateboard or go for a walk immediately after finishing work (now I just do it before). I also purposefully avoid buying any sort of coffeemaker so I need to physically leave the house to get coffee, and my regrettable caffeine addiction mandates that I leave at least once to satisfy the craving.

Some folks go so far as to pay for a membership at a coworking space just so they can actually go to work. While I haven’t made this jump quite yet, I can understand the appeal: we need to draw a clear line between working and not working. It isn’t as easy when the place where I work is also the place where I unwind.

Hop on a video call. This is one of the more team-oriented pieces of advice for making a remote job work. When it comes to getting answers or clarification on something, or needing to discuss an issue that’s even partially technical, it can be much easier to just hop on a video call instead of having to type everything out in Slack, or whichever medium you use for such communication. Our goal is to make our being in another location as simple and straightforward as possible for our team, and that often means acknowledging that talking can be better than typing. Since going remote, I’m now more likely to just hop on a call with a team member to discuss something, rather than spending 15 minutes discussing in Slack. Plus the face-to-face time is just nice anyway.

Iterate. If this is the first time you’ll be working remotely, or your company is making the transition to support remote workers for the first time, it’s going to take some trial and error to make everything work fluidly. Check in with your team or manager to see how things have been (this would be a perfect topic for retro), and experiment with different practices if there are issues. For example, because Yesware recently started to open up our engineering teams to remote applicants, we’ve had to experiment with different video conferencing solutions to see what works best for our space. It’s an ongoing endeavor, but we are closing in on what feels like the right product, which will ensure remote workers (hopefully) never have a problem calling into meetings and planning sessions. Besides, any large transition is going to demand time before feeling comfortable, so give it a couple of weeks being remote before panicking that you can’t handle it.

Meet up every once in a while. Being a remote employee doesn’t mean never seeing the rest of your team or company. Schedule a trip every year to go to headquarters for a week, or go once a quarter just for some of that coveted face-to-face time. Or work remotely from the same city, and head into the office when you feel like it. Being remote doesn’t necessitate being hundreds of miles away, and we can still gain a lot by regularly working from home even if the company’s office is nearby.

“person using laptop” by Kaitlyn Baker on Unsplash

While those are practical tips for being comfortable while working remotely, the actual switch can be fairly daunting. To make that transition smoother, try easing into it one day at a time. When I was ramping up to move abroad and work remotely, I started by working from home every Wednesday, since it was uncommon for my team to have anything special on that day. After a month of that, I worked from home 2 or 3 times a week, and purposefully chose days that had important planning meetings, so I could get a feel for how to navigate critical meetings without being in the room. By the time my entire team began the remote experiment, I was already fairly comfortable, and after 3 weeks of being completely remote, I felt even more productive than I was in the office. If you can afford to, begin the switch to remote work slowly.

I’m sure you totally expected something like this at the end of the article, but Yesware is hiring engineers of all levels. Again, I haven’t been coerced into saying any of this, I just genuinely love where I work and I’m excited for our future. I’d be a fool not to encourage kind, driven, and smart folks to join us. Check out our company page for a list of positions, and to get a feel for us. And yes, we’re accepting remote applications.


  1. Remote: Office Not Required. This is a decent intro to the benefits of remote work, but it won’t tell you how to do it. It’s also roughly half pictures, for some reason.
  2. This phrase should always be accompanied with a resounding “within reason”. I suspect your employer wouldn’t be too keen on the idea of using remote work as a means of somehow having another separate full-time job somewhere else.
  3. While this honestly wasn’t factored into the decision to stay at Yesware, I was also going to need an American job for my residence visa in Spain anyway. Without them, I almost certainly wouldn’t have been able to go.
  4. This is just run-of-the-mill morbid curiosity, nothing to be alarmed by. I promise!

How to use Kafka Streams

In this blog post I’m going to write about how to use Kafka Streams for stateful stream processing. I’m going to assume that the reader knows the basic ABC’s (producers, consumers, brokers, topics) of Apache Kafka.

Problem Statement: We needed a system that would consume messages, perform fast, real-time, stateful processing of those messages, and then forward the processed messages downstream, with scalability, fault tolerance, high throughput, and millisecond processing latency.


Historically at Yesware, we had used RabbitMQ as our messaging system, with good results. But at the time we had this requirement, the latest 0.10 release of Apache Kafka had just introduced a new feature called Kafka Streams. So our options were:

  1. Use RabbitMQ (which we were quite familiar with) for messaging and at the downstream consumer level use a fast in-memory data store (like Redis – which we also had used quite extensively) to do stateful processing, and forward the processed messages to the final destination
  2. Use Kafka for messaging (which we had used sparingly), and try our luck with this new thing called Kafka Streams which looked quite promising – mainly because it builds on the highly scalable, elastic, distributed, fault-tolerant capabilities integrated natively within Kafka.

We decided to choose Kafka for messaging, because we expected a large volume of message traffic per second and also because we wanted to expand our boundaries. Now, with Kafka as our messaging service, we still could have used a stream-processing framework existing outside of the Kafka system, but then we would have faced a lot of complexity, so choosing Kafka Streams was quite tempting indeed.


So let’s quickly go over the basic concepts, before we dive into the code snippets.
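Kafka Streams itself is a Java library, so real application code would be Java. But the core idea of stateful stream processing can be sketched as a toy in Ruby (the language used in this blog’s other examples): each incoming message updates a local state store, and the updated aggregate is emitted downstream. Everything below is illustrative only and is not the Kafka Streams API.

```ruby
# Toy model of stateful stream processing: a per-key running count.
# Kafka Streams keeps this state in fault-tolerant, partitioned state
# stores backed by changelog topics; the in-memory Hash here exists
# only to illustrate the concept.
class CountingProcessor
  def initialize
    @state = Hash.new(0) # the local "state store"
  end

  # Process one incoming message key, returning the updated aggregate
  # that would be forwarded downstream.
  def process(key)
    @state[key] += 1
    [key, @state[key]]
  end
end

processor = CountingProcessor.new
%w[click view click click].map { |event| processor.process(event) }
# => [["click", 1], ["view", 1], ["click", 2], ["click", 3]]
```

The important property mirrored here is that output depends on accumulated state, not just the current message; Kafka Streams adds the partitioning, fault tolerance, and rebalancing that make this safe at scale.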


Automated UI Testing for Native Windows Applications

Yesware has a robust culture of automated testing. We have no separate QA department to hand work off to for manual testing, a step that would bloat the “testing” phase of our build, test, release cycle. Instead, engineers at Yesware write unit tests and UI tests along with any patch they are trying to merge into our master branch. This allows us to confidently and quickly add features, refactor code, clear up tech debt, and update underlying dependencies, relying on our suite of tests to point out the exact issue(s) in any specific set of changes. We avoid lengthy development schedules that are slow to complete, cumbersome to change or adapt, and fragile in the face of the unexpected.

Automated UI tests are finicky and can be expensive to write and maintain. The choice in tooling makes a huge difference in the tests’ efficiency, reliability, and maintainability. The tooling should hide general but nitty gritty details of UI automation while allowing the developer to extend and customize to meet their specific requirements.

Yesware has much experience in writing and running automated UI tests for our web applications, and we have benefited greatly from the work of the open source community in creating, maintaining, and resolving issues for tools like Capybara, Selenium WebDriver, WebKit implementations, and so many more. Something we didn’t have much experience in was automated testing for native Windows applications, or, in the case of Yesware for Outlook, automated testing for a plugin to Outlook.

While trying to choose technologies to adopt, we recognized that writing automated UI tests when we didn’t control the host application would pose a challenge, so any tool that came with documentation and sample UI automation for Microsoft Office Add-ins would have a large leg up in our evaluation. So while we became aware of TestStack/White, which builds on top of the UI Automation framework, we favored Microsoft’s offering: Coded UI. It had ample documentation and videos, and we found two sample UI automation projects covering how to use Coded UI for Office Add-ins.

On the surface, Coded UI looked very promising.

  • It was an official Microsoft offering
  • Visual Studio Premium already came with it bundled
  • Lots of documentation, including the aforementioned samples, videos, and MSDN documentation and walkthroughs
  • Promised to hide and abstract away the accessibility and automation layers
  • Mitigated the risk of targeting Office Add-ins by demonstrating with working examples

However, Coded UI turned out to be a very difficult option.

The generated code was very verbose. For example, Coded UI generated 500 lines of code to

  • Launch Outlook
  • Open a message composer
  • Compose to “”, subject “hi”, body “hello”
  • Send the email


500 lines of code to compose and send an email

Such verbose code was not a problem on its own, but since the code was so unreliable for our use case, its verbosity made diagnosing and resolving issues a nightmare.

The accessibility properties in Outlook changed in very subtle ways when our code changed. The code verbosity made it a challenge both to pinpoint these subtle discrepancies and to code an elegant resolution. Even very small changes to our application, Yesware for Outlook, could require a subtle correction in our automated UI test that was difficult to identify and implement. Larger changes to our application could require very extensive changes to the UI test.

The Coded UI search engine was a flaky black box that would intermittently fail. These failures appeared maddeningly similar to subtly incorrect search criteria. However, we ruled out that the error was in our search criteria by successfully finding the target element in another instance of the Coded UI search engine with the exact same search criteria.

As a silver lining, since we needed to adjust Coded UI’s behavior at the Windows automation layer, we became more familiar with Windows automation. We were able to leverage this experience as we moved to White.

White is much more expressive and readable than Coded UI. The previous example where Coded UI generated over 500 lines of code could be written in 70 lines using White for the exact same functionality.


Running an automated UI test written in White. Try out the demo.

As you can imagine, maintaining 70 lines of code is much easier than maintaining 500 lines.

As daunting as the UI automation may appear (try checking out the MSDN UI Automation overview), we have found that searching by the “class name” and by the “text” accessibility properties has been consistently sufficient to retrieve the desired UI elements. White did a good job hiding the many details of Windows accessibility that were none of our concern.

White doesn’t generate the search properties for you, so use the Windows Inspect tool to help you identify the values of the search properties of the elements you are trying to interact with.

As we’ve mentioned, we run our tests continuously, so we also have our continuous integration server run our UI tests on every candidate patch. Jake Ginnivan wrote a great guide on how to set up a TeamCity build agent on Azure to run automated UI tests. While some details may be outdated, the crux consists of the following broad strokes:

  • set up the build agent on an Azure VM
  • set up a persistent graphical UI session (necessary for Windows UI automation)
  • run the TeamCity build agent within that UI session

TestStack.White is not without its flaws. While it is a vast improvement over Coded UI for us, we’ve stumbled over a couple of parts of White. Their issues page is a pretty comprehensive list. Development isn’t particularly active, but the project has great potential and plenty of areas to improve. For example, Coded UI had means to catch playback issues and retry while White does not.

The UI automation tools for native Windows testing have lagged behind web-based tools. While there is a large ecosystem of tools to choose from to run automated UI tests against your web application, we found only two major contenders for automated UI testing against native Windows applications. We hope this article helps you make a more informed decision on how to UI test your Windows application. Writing and maintaining these UI tests have been a long road for our team, but it has gone a long way to catching UI bugs in our code before they are deployed and caught by our users.

Introducing YetiLogger

In this blog post I’m going to introduce the yeti_logger gem. This is Yesware’s shared logging mechanism for our Ruby code. You might be wondering why such a need exists. Ruby has pretty decent logging via the Logger class and Rails builds upon that. So why another layer? For us it came down to a few reasons:

  • abstraction
  • efficiency
  • format

Keep reading for some background on each of these reasons and a high-level overview of how to use it, or feel free to jump straight to the GitHub repo and check out the README.

More on the why?

As with all things software, we tend to change our minds on things here at Yesware. Logging is not necessarily one of them, but it’s something we’ve toyed with in the past. We wanted to be sure that all of the great logging we were adding to the product wouldn’t result in a refactoring nightmare if we switched from using Rails.logger to something else. Also, to make things more complicated, we have some non-Rails applications that use Logger directly, and a very few places that even use puts (most of our production apps run on Heroku, where our logs are sent to stdout anyway, so this isn’t as bad as it sounds). Having a nice abstraction layer over whichever logger we wanted to use seemed like a good idea.

Another aspect of abstraction is the idea that it should be very easy to log (and to stop logging). By making YetiLogger a mixin module, you get methods you can call directly on your class or class instance, such as log_info("hello"). By having these methods live on your class/module/instance, it becomes much easier to replace YetiLogger, or just bypass it by defining a method on your class/module/instance with the same signature as those from YetiLogger. For example, to temporarily bypass all calls to YetiLogger#log_info, define a dummy method on your class that does what you like with the message:

def log_info(message)
  # Send the message wherever you like, such as a database, a queueing system, or nowhere!
end

We also wanted more control over the performance impact of our logging. Fortunately, we’ve never been able to point to our logging as a major source of performance drain on our application; that said, we didn’t want to get ourselves into such a predicament. The formatting of a string to emit to a log is generally fast, but sometimes we want to compute or fetch additional information to help with later debugging. Additionally, there were some efforts to remove log statements from the code that were set to log at levels below where we typically ran. While this was fine and all, we really wanted the freedom to leave some poor-performing code in and be assured that unless we needed to turn it on, it would remain off. As you can see below, we accomplish that with Ruby blocks passed to YetiLogger methods. Since the block’s evaluation is deferred, we can check whether we even need to evaluate it (and thus build the string to log) based on the logging level. Unfortunately, Ruby doesn’t have the notion of lazily evaluated method parameters, so we couldn’t rely on that; blocks gave us what we were looking for. We can now fill calls to YetiLogger with inefficient blocks that look up all the information we’d want to debug something, knowing that unless we crank up the log level they’ll safely not impact performance.
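The deferral mechanism is easy to demonstrate with the standard library’s Logger, which supports the same block style of logging:

```ruby
# Block-form logging is cheap when the level is disabled: the block is
# only called after the severity check passes.
require 'logger'

logger = Logger.new($stdout)
logger.level = Logger::INFO

expensive_calls = 0
build_message = -> { expensive_calls += 1; "costly debug detail" }

logger.debug { build_message.call } # level is INFO, so this block never runs
logger.info  { "cheap message" }    # this block is evaluated and logged

expensive_calls # => 0: the expensive work was skipped entirely
```

YetiLogger applies the same check before invoking the block you pass to its log_* methods, which is why expensive lookups can be left in place.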

The last, and probably biggest reason for YetiLogger is one of consistent formatting. We’ve tried out several logging tools and are currently big fans of SumoLogic. It’s great for consolidating all of our production system logs and giving us a nifty way to search through them. One thing that quickly came up after we began using our logs more was that the formatting of log messages was all over the place. Parsing structured data is much easier than extracting information from plain text. Take these two log messages for example:

user_id=37 msg=login


"user 37 ( logged in"

Now, extracting the user id and email from either of these is not terribly hard. In fact, it’s arguably not hard at all. The problem isn’t with any one such format; when every log statement has a slightly different way of formatting things, it becomes a problem of scale. At some point, the code to extract all the possible ways a user id is specified is huge. And that’s just one field. The key=value formatting that we prefer to use, however, is much easier for a machine to read. It makes writing parsers to extract attributes much more cookie cutter, rather than each one looking slightly different. While YetiLogger does support unstructured log messages, by far the preferred way to use it is to take advantage of the key=value style format.
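The hash-to-key=value conversion is a one-liner in spirit. This is a hypothetical sketch for illustration, not YetiLogger’s actual formatter, which lives in the gem and handles more edge cases:

```ruby
# Sketch of "key=value" formatting from a hash of log attributes.
def to_kv(hash)
  hash.map { |key, value| "#{key}=#{value}" }.join(' ')
end

to_kv(user_id: 37, msg: 'login') # => "user_id=37 msg=login"
```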

That’s a little about how we got to where we are with YetiLogger. Now let’s take a look at how to use it.

How it works

To start with, install it via RubyGems:

gem install yeti_logger

YetiLogger is a wrapper around a Logger-like class, so before you can use it, you must configure it. The only required configuration is to specify a logger class to use for the actual logging.

require 'yeti_logger'
YetiLogger.configure do |config|
  config.logger = Rails.logger
end
The logger can be any Logger-like class that YetiLogger will defer to for actual logging. It also relies on the underlying logger for configuration such as log levels.

Once you have a configured YetiLogger, you can begin mixing it into your classes and modules. Adding it to a class will give you both class-level methods as well as instance-level methods such as log_info, log_warn, etc.

class MyClass
  include YetiLogger

  def test_logging
    log_info("hello!")
  end
end
The above will output a log line that looks like this:

2016-04-05T10:23:29.135-04:00 pid=90811 [INFO] - MyClass: hello!

The bits at the beginning of the line are all specified via the underlying Logger format configuration. The last bits of the line (MyClass: hello!) all come from YetiLogger. In general, the format for a log message is <Class name>: <log message>. The signature for each of the log_* methods looks like this:

log_info(obj = nil, exception = nil, &block)

Yes, everything is optional, but that’s to give you the most flexibility. Let’s drill into each of the arguments separately.

The first arg is obj. This can be any object, but it really gets boiled down to one of three types: a hash, an exception, or everything else. See the next paragraph if obj is an exception. If it is neither an exception nor a hash, then we simply call #to_s on it and log that. If obj is a Hash, though, then we convert it into a key=value key=value string of the hash’s contents. Remember that one of the reasons for YetiLogger was formatting? We found that key=value formatting of log messages helps us write tools that look through logs for information. It’s much easier to find all activity associated with a user by searching for user_id=1234 than having to remember which format to match against, so you don’t wind up handling every version of logging a user id: user_id:1234, user_id(1234), or, worst of all, a bare 1234.

The second argument is exception. This argument, if present, should be a Ruby Exception. YetiLogger will print the message, class and backtrace of the exception. If the first argument (obj) is a Hash, the exception details will be added to it and equivalently logged as key=value pairs.
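The merge just described can be sketched as a pure function. The key names below (error_class and friends) are hypothetical, chosen for illustration; the gem’s actual key names and output format may differ:

```ruby
# Sketch: fold an exception's details into a key=value-ready hash of
# log attributes, alongside whatever attributes the caller supplied.
def with_exception_details(hash, exception)
  hash.merge(
    error_class: exception.class.name,
    error_message: exception.message,
    error_backtrace: (exception.backtrace || []).first(5).join(' | ')
  )
end

begin
  raise ArgumentError, "bad input"
rescue => e
  with_exception_details({ user_id: 37 }, e)[:error_class] # => "ArgumentError"
end
```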

The last argument, a block, is the preferred way to call YetiLogger for logging messages below the level you specify for logging. For instance, if you’ve configured the Logger to log at the info level, then all debug-level log statements should use the block form of calling. The block form of calling YetiLogger defers evaluation of the block until after the log level is checked. This allows you to leave in log statements that may be very expensive to compute (lookup additional data from a database for instance) and be assured that they won’t slow down your app, until you crank up the log level so they are evaluated. The value returned from the block follows similar rules above for the obj parameter. For instance:

log_info(user_id: user.id, msg: "logging in")

is equivalent to

log_info do
  { user_id: user.id, msg: "logging in" }
end

We routinely run with our logging levels set at info, but we still use the block form frequently as it’s also a convenient way to encapsulate all the logic associated with forming a log message. For instance:

log_info do
  msg = if user.first_login?
          "first login"
        else
          "repeat login"
        end
  { user_id: user_id, msg: msg }
end

Continue reading more about YetiLogger here, including test support and the internal message formatters that you may find useful outside of a logging context.


I hope you liked reading about YetiLogger and that you find some use for it in your projects. As always, let us know via issues or pull requests against the repository about any questions or improvements you may want to see. We’ve been using YetiLogger pretty much untouched for a few years now and have been quite happy with it. It’s been a huge time saver in terms of instilling conventions that we find useful for debugging issues via logs, and we hope you will too.

Using the MongoDB Oplog to trigger asynchronous work


Over the past few years at Yesware, we’ve settled into a frequent pattern for handling asynchronous chunks of work in various apps in our microservices architecture. Typically thus far in our Ruby applications, the pattern has gone something like this:

  1. An incoming piece of data arrives, through an HTTP endpoint, or a database write, or a read from a message queue. We need to do no more than 1 or 2 quick things with that data before responding or exiting so that we can keep up with the onslaught of incoming data. Usually this means writing the piece of data to a database, then enqueueing to a message queue for further asynchronous processing.

  2. Once the data or a pointer to it has been written to a message queue, a worker reads that job, processes the data, then publishes to some other queue, or takes some kind of direct action.

This pattern is not a bad one, and it has served us well in many cases. However, it adds overhead in the form of extra services to stand up (a message queue), boilerplate code to maintain for things like intermediate data structures, and undesirable complexity. For a recent project we decided to try a new approach. This project involves reading from a high-volume exchange on RabbitMQ (our latest message queueing system of choice), writing much (but not all) of that data to what we expect will soon be a very large MongoDB cluster, then updating a smaller Postgres database with aggregated stats about the data we’ve just written.

The filtering of raw data and storing into MongoDB is fast, but aggregating the data into Postgres will take far too long to allow the RabbitMQ queue to keep up. Now, you haven’t truly lived until you’ve let an overflowing queue send the RabbitMQ cluster into flow control, throttling your publishers such that some of your precious data is dropped instead of published, but we’ve noticed an interesting trend. Our customers tend to like it better when things work and data doesn’t get lost, so we do our best to ensure flow control remains a distant threat. Asynchronous aggregation it is. Since we’re writing the data to a MongoDB replica set, why not use Mongo’s inherent replication functionality to trigger the downstream processing?

MongoDB Replication

Replication in a MongoDB cluster is handled by secondary nodes requesting records from the primary node’s oplog, then applying those changes as they come in. Since the primary stores its oplog in a Mongo collection, any other process can read that collection and do whatever it likes with the changes as they occur. There is some prior art for this on the web, but not much in the Ruby world. However, Stripe developed a nifty gem a while back called mongoriver that does just that. It reads from the oplog, maintains its position in the oplog with a timestamp stored back into Mongo, and uses an evented model to issue callbacks when various types of operations occur. Sounds great, right? It kind of is, but we encountered a few bumps during implementation.


To use mongoriver, you need a MongoDB replica set. Without a replica set, there is no replication (duh!), which means no oplog (doh!). We usually develop against a single-node Mongo in development, but to get this working in a development environment, we had to set up a couple more nodes and convert them into a replica set. This is as simple as creating new data directories for your extra nodes, then using the mongo console to initiate the replica set. MongoDB has a great summary of that here.

Once that’s going, you’ll need a couple of classes in your Ruby app to handle the oplog: an outlet that is triggered when new operations occur, and a worker to set up the tailer and stream the oplog. The outlet is the easy part, so let’s start with that.

class FilteredThingOutlet < Mongoriver::AbstractOutlet
  # This method will be called on every insert in the oplog, and will be given 3
  # params: the DB name, the collection name, and a hash that is the document being
  # inserted.
  def insert(db_name, collection_name, document)
    # We only want to publish documents of the right type that have a user_id
    if collection_name == "filtered_things" && document.keys.include?('user_id')
      # Publish the full document (in our case this also wraps the document in a Thrift
      # struct) for downstream processing
      publish(document)
    end
  end
end

Easy. The worker is a bit more complex, as it needs to first set up a tailer that can read from the oplog, then stream output from that tailer to the outlet. Also, in order to maintain its position in the oplog, we use a PersistentTailer that knows how to save that position to a live mongo connection. In a simple development environment, this can usually be the same connection that the oplog is reading from.

class FilteredThingOplogWorker
  def self.run
    # Get the MongoDB connection from the MongoMapper model (more about MongoMapper in a bit...)
    mongo_connection = FilteredThing.collection.db.client
    # This will persist the oplog position to the DB every 60s into a collection called
    # 'oplog-tailers' by default
    tailer =
      [mongo_connection],
      :existing,         # Use an existing MongoDB connection (instead of creating a new one)
      'filtered_things'  # A name for the position persistence to use. Using something
                         # similar to the data in the collection being tailed makes
                         # sense here.
    )
    # Hook up the oplog stream to our handler class, FilteredThingOutlet
    stream =,
    # Stream 4ever!
    stream.run_forever
  end
end
We then wrap this in a simple rake task that runs the worker, and watch the streaming happen. In a development environment, this works swimmingly. But in a production environment, there are typically separate users for each database, even if those users have the same name and password. In our case, the databases are filtered_data, where the data lives, admin, where the oplog is, and _mongoriver, which is the default name of the DB to which Mongoriver will persist its position. Unfortunately, this means using separate authentications for each database, but we can at least authenticate multiple times on the same connection. In addition, the default 60-second persistence interval is perhaps a little conservative for our tastes, but that’s also easy to change by passing an option to the tailer. The worker then becomes a little more complicated.

class FilteredThingOplogWorker
  def self.run
    # This will persist the oplog position to the DB every 10s with the
    # :save_frequency option
    tailer =
      [mongo_connection],  # Now defined by the method below
      :existing,
      'filtered_things',
      save_frequency: 10,  # Persist position every 10s (overriding the 60s default)
      db: '_mongoriver'    # Store the position in this DB
    )
    stream =,
    stream.run_forever
  end

  # Get a Mongo connection that has permissions to tail the oplog, and to store
  # the state in the _mongoriver DB. This means authenticating with 2 additional
  # DBs. The user/pass combos are the same on all 3 DBs (admin, _mongoriver, and
  # filtered_data), so no extra config is necessary.
  def self.mongo_connection
    # Only need the extra authentication in production
    if Rails.env.production?
      # 'hosts', 'user', and 'password' should be pulled in from the environment, or
      # from a Mongo configuration (ie, mongo.yml)
      conn =
      conn.db('admin').authenticate(user, password)
      conn.db('_mongoriver').authenticate(user, password)
      conn
    else
      # Everywhere except prod, just reuse the FilteredThing Mongo connection
      FilteredThing.collection.db.client
    end
  end
end
There are a couple of things worth pointing out. First, we’ve specified the database _mongoriver for storing the position. Technically it can be called whatever you like, and it could even be the same database from which you’re reading the oplog. However, if it is, then you have to deal with the fact that the outlet callbacks will fire when the position is written, since it’s just another insert. I think it’s cleaner to have a separate database for the position, even if it only has 1 collection – oplog-tailers – with only 1 document. Incidentally, the oplog-tailers collection name can also be overridden via the :collection option.

In addition, because there is a set frequency at which the tailer will save its position, the outlet callbacks may fire on duplicate oplog entries in the case where the worker restarts or crashes in the middle of the window. We’ve designed our downstream aggregation processing to gracefully handle duplicate publishes, so that isn’t a problem. In other use cases, it might require extra work to ensure the downstream processing is idempotent, since there will certainly be duplicates at some point.


As you can see, this new pattern requires very little code, which of course means much less maintenance overhead and general complexity. In addition, since it relies on MongoDB’s existing replication framework, which has to be fast in order for replication to function properly, the total throughput from document write to RabbitMQ publish is nearly immediate and not subject to any additional dependencies.

However, there is a substantial, though not insurmountable downside to this new style. We’ve historically used MongoMapper instead of Mongoid as our MongoDB ORM of choice at Yesware. We’ve written many plugins for it and lean on it pretty heavily across many of our microservices. But version 0.13.1 came out in 2014, and while there does seem to be some recent activity on master, it doesn’t appear to have an active maintainer anymore. In addition, now that Mongoid no longer requires its own driver (Moped) but instead uses the default MongoDB Ruby driver version 2, the choice of MongoMapper for our ORM has been questioned. Some of our recent microservices have used Mongoid with success, although in a pretty basic capacity – they aren’t attempting to use more advanced features like covered queries, for instance, that are not supported by Mongoid. Among other things, the connection objects were completely rewritten in version 2 of the Ruby driver, and Mongoriver doesn’t work with them, which means Mongoriver and Mongoid are not compatible (possibly older, Moped-based versions of Mongoid work, but that does not interest us). In fact, like MongoMapper, Mongoriver looks like it may have been abandoned of late; its most recent commit is also from 2014.

This means that we’re using a seemingly unmaintained gem (mongoriver) which relies on another seemingly unmaintained gem (mongo_mapper), and neither can be updated to use the most recent Ruby driver. This is fine for now, but will eventually be a problem, since in the future we’ll probably need to upgrade to a Mongo too new to support the older driver. We’ve considered starting our own forks of MongoMapper and Mongoriver to get around this problem, and we may well do that, but the potential burden of that extra work is definitely a downside with this strategy. It’s close to a turnkey solution for now, but may not be for long. For anyone considering adopting Mongoriver, this is a meaningful consideration.

Despite the potential maintenance downside, the addition of Mongoriver to our workflow seems like a smashing success so far, and I expect we’ll be looking to it to make data passing easier in other places where we can piggyback on the existing replication infrastructure that MongoDB already provides.

Extra Credit

At Yesware, we love to be woken up at 3 am because something went terribly wrong. Wait, that’s not right. We love it when something goes wrong at 3 am and pages us. No, that doesn’t sound right either. We hate it when things go wrong, but on the rare occasion when it does, we want to know about it ASAP. (Preferably not at 3am. Yeah, that’s it.) This means we love monitoring, and nearly all of our features involve some degree of monitoring so that we know their health at all times. This feature was no different. Since we’re persisting the tailer’s oplog position every 10 seconds, why not record a metric from it so we can alert on any potential lag? Let’s take a look at the record that gets written to oplog-tailers with the tailer’s position.

{
    "_id": {
        "$oid": "55cd8d4ef4da36b0fe2aad19"
    },
    "service": "filtered_things",
    "state": {
        "time": {
            "$date": "2016-03-10T02:36:29.000Z"
        },
        "position": {
            "$ts": 1457577389,
            "$inc": 1
        }
    },
    "v": 1
}

It’s a pretty simple document, basically containing an _id like every MongoDB doc, the service name we told the tailer to use, a timestamp, and a position. We created a rake task that we can call every 10 minutes to fetch this document, compute how far behind real time the timestamp position is, and record that to our statsd server.

task :filtered_things_oplog_worker_delay => :environment do
  include YetiLogger
  mongo = FilteredThingOplogWorker.mongo_connection
  # Fetch the current record
  record = mongo.db("_mongoriver").collection("oplog-tailers").
             find(service: "filtered_things").first
  # Determine the last time this job ran, and where it is in the oplog
  last_run_at = record["state"]["time"]
  oplog_at =["state"]["position"].seconds)
  seconds_behind = last_run_at - oplog_at
  log_info(worker: FilteredThingOplogWorker, msg: "oplog tailing state", seconds: seconds_behind)
  Metrics.gauge("work.FilteredThingOplogWorker.oplog_behind", seconds_behind)
end
Then we set an alert for when the lag rises above a threshold that concerns us, and our confidence is increased. And hopefully we’re all sleeping soundly at 3am.

Some things you wanted to know about fonts (but were too afraid to ask)

As a web developer, your most common font problem is probably “Why doesn’t this character look like what I expect it to look like?”

Your problem is either:

  1. This character is legible, but isn’t in the nice font I/my company shelled out big bucks for, or

  2. This character isn’t even legible.

In the latter case, your character may come out garbled, like â€™, or it may just be a bunch of �����.

First I’m going to define some concepts with which we can troubleshoot most font problems. Then I’ll apply them to the three typical font problems I mentioned above.

Encoding: A set of mappings from sequences of bits to characters. For example, ASCII is a set of mappings from sequences of seven bits to characters, in which 01101000 maps to h.

Code point: The key in such a mapping, e.g. 01101000.
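The ASCII mapping above can be checked directly in Ruby (shown here only to make the definition concrete):

```ruby
# The code point 0b01101000 (0x68), interpreted as 7-bit ASCII,
# decodes to the character "h".
byte = 0b01101000
char = [byte].pack("C").force_encoding(Encoding::US_ASCII)
puts char # => h
```

Every encoding is, at bottom, a table of exactly these byte-sequence-to-character pairs.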

Glyph: An image of a character. For example, these are all glyphs for the character a:


Font: A set of glyphs representing a range of characters. These are the glyphs that make up the font Comic Sans Regular. The characters supported by a font typically belong to a group, such as ASCII characters or math symbols.

Now that we have a lexicon with which to talk about these things¹, we can make some headway with troubleshooting each of the three font issues I mentioned:

  1. Correct but ugly characters. This must mean whatever is rendering the characters (since you’re probably a web developer, it’s probably your browser) isn’t using the intended font. Either it doesn’t know that it’s supposed to use that font, or it knows but doesn’t have access to that font, so it can’t find the glyph images it needs to render.
  2. Wrong character (â€™). If, say, you’re expecting ’ but getting â€™, this must mean the bit stream underlying your string is being parsed incorrectly, such that the bits representing ’ are somehow being mapped to â€™ instead. In this case, since you’re expecting one character but getting three, your bits don’t appear to be broken up properly to begin with. If you get one character but it’s the wrong one, that also means the bits -> character mapping went awry. Either way, this sounds like an encoding problem (encoding == mappings, remember?). Maybe your string was encoded in an 8-bit encoding such as ISO-8859-9, but is being interpreted with a 7-bit encoding such as ASCII.
  3. No characters (���, ???, what have you). Either there’s no mapping from those bits to a character in the encoding your browser is using (for example, ISO-8859-1 doesn’t know what to do with 0x1F), or there’s a mapping but there’s no glyph for that character in the font your browser is using (say, because your stylesheet specifies Comic Sans and the character is Japanese).
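Issue 2 is easy to reproduce in Ruby (a sketch for illustration; any language with explicit string encodings works the same way): take the UTF-8 bytes of a single character and misinterpret them as Windows-1252:

```ruby
# U+2019 (right single quotation mark) is one character but three bytes
# in UTF-8. Reinterpreting those bytes as Windows-1252 yields three
# characters — the classic mojibake.
right_quote = "\u2019".dup                 # ’
mangled = right_quote.force_encoding("Windows-1252").encode("UTF-8")
puts mangled        # => â€™
puts mangled.length # => 3
```

Same bits, different mapping: the bytes were never corrupted, only decoded with the wrong table.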

In later posts, we’ll walk through further debugging steps for each type of issue.

¹ If you find my grossly simplified glossary unsatisfactory, here’s a more thorough and entertaining overview written by someone smarter than me.

Product Development Evolution at Yesware

When I started at Yesware just over two years ago I was its first ever Scrum Master. I had worked as a Scrum Master for years at a few different companies and I was really excited to come to Yesware. I loved the people, the location in downtown Boston, the office space (and that has since gotten even better!) and the culture.

One thing that wasn’t so great at the time was the product development process. I was hired to help with that.


The first thing I noticed was that standups didn’t seem to be delivering much value. The team went around in a circle and talked about what they were up to, but it wasn’t always clear which work items they were referring to, and how many items they were working on at once. To manage project work, the teams were using a bug tracking tool. That’s great for bug tracking, but not great for Scrum teams. So I switched tools to something designed for Scrum teams (JIRA) and I began using the work board feature as the visual in standups. When team members spoke about what they were doing it was very obvious if it was actual sprint work or if it was outside work. I think standups improved.

I take the rules of standups very seriously. We only get fifteen minutes. People should not go into deep dives; take it into a separate conversation. People should let their fellow team members know if they accomplished what they said they would in the prior standup.

Since I work with all the teams at Yesware, I am in a ton of meetings, but that doesn’t mean that everyone should be. Standups are so valuable because they reduce the need for other meetings. Everyone knows there’s an opportunity to sync up with the team at the same great time, same great location every day. When issues arise, the involved individuals can hold a follow up meeting, not dragging everyone into the discussion if that’s not prudent.

In case it’s not obvious, I love standups.


So, standups were improving. I wish I could say the same for planning meetings!

At the time, we were working in two week sprints. Our meetings were scheduled for four hour blocks and more often than not the teams used up the entire four hours. It is really hard to sit through a four hour meeting, even if you love meetings, which I do. Teams would discuss sprint work in great detail, and then several days into the sprint everyone would forget what we had talked about in planning. Definitely not a good use of time.

Also, since estimating in story points and tracking velocity had worked well at my past companies, I wanted to do that at Yesware. It just didn’t work here though and it was another time suck. So we stopped spending time on any detailed estimating and now do only high level, t-shirt size estimates.

Where we are now

Fast forward to our new and improved process! We don’t strictly follow Scrum anymore, but instead do kind of a hybrid of Scrum/Kanban/Yesware special sauce. We still do standups, because they are awesome! Every morning (but not too early) we meet up, talk about what needs to be done, and get on with our day.

We still do planning meetings but they are nothing like before. We take more of a just in time approach to planning. Teams have two planning meetings a week on their calendars but they only happen if there is work to discuss. We talk about what we are doing now, and once we have enough detail for people to get to work, we break and go back to our desks. Oh yeah, and each meeting is only scheduled for an hour. There’s no worrying about anyone falling asleep due to meeting exhaustion.

Other improvements

In addition to the day-to-day process changes, we’ve made improvements at a higher level as well. I’d like to highlight some of those.

Besides standups, my favorite meeting from the Agile world is the retrospective. I love to talk about things we should change to work more efficiently. We have retrospectives about every two weeks and it’s a closed door meeting, meaning only the team members may attend. People are free to say whatever they feel, and what happens in retro stays in retro. Many process changes have come as a result. For example, discussing issues with using our work board has led to changing work-in-progress constraints and simplifying the board’s workflow. Even little things, like putting a status dashboard on a big TV and tracking standup tardiness to encourage faster meetings, had their roots in retros.

The simplification of daily process has made it easier to see the big picture and what’s really going on. Back when we did Scrum I kept an eye on how many work items a team member was working on, but it wasn’t a focal point for the entire team. Now it is! We closely monitor our work in progress to make sure we are not spreading ourselves too thin and working on a bunch of stuff when we could be delivering fewer things sooner.

Finally, reducing the duration and frequency of meetings has allowed us to improve our scheduling. Back in the old days we would have essentially all day meetings at the start of each Sprint. Now the majority of meetings are in the morning, generally right after standup. People attend their meetings and then have a big chunk of uninterrupted time to actually get stuff done. The cost of context switching is well known and I try hard to minimize that at Yesware.

Wrapping up

Things work differently than they did when I started here, but I think nearly everyone feels it’s been a change for the better. We certainly haven’t figured everything out at Yesware. But we have taken plenty of steps to improve our process, and we continue to tweak it.

VSTO Lessons Learned

On the MSDN website, the Office and SharePoint Development in Visual Studio page discusses two primary options for developing an Office add-in: 1) use the latest and greatest Office add-in technologies targeting Office 2013 and SharePoint 2013, or 2) use Visual Studio Tools for Office (VSTO) to target Office 2010 and older versions.

If you intend to distribute your Office add-in to the general public, don’t use VSTO.

VSTO offers many features to develop an add-in, including graphical editors to define Office UI customizations. It has extremely simple tools to manage deployment, and it magically loads code whenever the user starts your targeted Office application.

The problem with deploying it to the general public is that the VSTO Runtime (VSTOR) is a prerequisite that the user must have installed, and dealing with the VSTOR in the wild sucks.  In our experience, a number of users simply could not install the VSTOR, and there were issues even with those who could.

VSTOR upgrades can leave behind problematic artifacts, and the resulting failures often produce opaque error messages. You can find a number of complaints on the MSDN Forum about one particular artifact that VSTOR upgrades leave behind. Below is one example, with no apparent official resolution, just an incomprehensible error message:

The value of the property ‘type’ cannot be parsed. The error is: Could not load file or assembly ‘Microsoft.Office.BusinessApplications.Fba, Version=, Culture=neutral, PublicKeyToken=71e9bce111e9429c’ or one of its dependencies. The system cannot find the file specified. (C:\Program Files (x86)\Common Files\Microsoft Shared\VSTO\10.0\VSTOInstaller.exe.Config line 10)

MSDN Forum

Additionally, if your VSTO add-in takes too long to load, Office applications can disable it. This is likely to happen because the timer starts before the CPU even reaches the add-in’s instructions. That’s right: your add-in can be penalized for “performance reasons” before it has even loaded, never mind started running. And if any of your potential users are on a slow computer, there is little that optimizing your code can do. Imagine our frustration trying to trace performance issues in our application when users reported it was disabled for “being too slow”, when many of the reasons it might be “too slow” didn’t involve our code at all.

The deployment mechanisms that VSTO borrows from ClickOnce can also fail, again with inscrutable error messages such as “Value does not fall within the expected range.” Here is another product’s support page that discusses a workaround that isn’t particularly user friendly.

Apparently, the VSTOR is prone to becoming corrupt. Our customer service representatives reported that, while troubleshooting with a user, our application started working correctly when the user did nothing more than run the “Repair” tool for the VSTOR.

Common themes for these issues involve the development team spending too many hours on the following actions:

  • Investigating incidents with paltry and inscrutable logs
  • Identifying potential causes and fixes
  • Experimenting with fixes under various foreseeable circumstances
  • Experimenting with fixes in the wild
  • Refining those fixes as users give us feedback

Some of these issues seemed intractable, even as users went to great lengths to make themselves available for troubleshooting. Others went silent after prolonged exchanges with our support team as they worked through our support script. I imagine a large portion of folks simply walked away after encountering the first issue or two while trying to set up our product; given the number of possible issues, many folks could fall into this category.

With all this doom and gloom, what alternative is there? There is some talk about an approach that involves implementing the IDTExtensibility2 interface. There is also the suggestion to use a tool like Add In Express for Microsoft Office and .net, which does the heavy lifting of implementing IDTExtensibility2 and provides features that have allowed us to replace VSTO and overcome the aforementioned problems by avoiding the VSTOR altogether.

Moving from VSTO to Add In Express has undeniably been the right move for Yesware for Outlook product development. With no difficult-to-install, buggy prerequisite, prospective users’ ability to install our product has greatly improved. The ADX loader has also resolved the frequent complaints about Yesware for Outlook being disabled due to slow loading.

There are still some things for us to improve. We kept the ClickOnce deployment mechanism since we focused on removing the VSTO dependency first, but ClickOnce can sometimes fail in the same user-unfriendly ways mentioned above. Add In Express offers a deployment mechanism that leverages the Windows Installer, which should make our installer more reliable. We are always working on improving our product, so expect an MSI option for Yesware for Outlook soon!