Contract Work

Tuesday, April 29, 2014

Building Analytics



A key part of any build is figuring out what colleagues want to track and how to effectively manage that data. In this case, there is a lot of stuff to track. I recently finished building a hefty analytics components to the app and wanted to share my approach and some things that drove me crazy.

First, I had to really think through which analytics were important. The initial ask was for about 30 different types of analytics, but then, when adding in timeframes, and additional category components, we were talking about close to 400 queries… not super useful for an early launch. Additionally, through my experience with Neighborsations, I could look at these requests and know which ones were most important for raising funds, for identifying user paths, and which ones wouldn’t really be useful until we had a larger critical mass using the application. I also made sure to ask my team what the most important success factors were to them (ie- if this number isn’t what we want it to be, then the business is not successful and we need to make some hard choices, quickly… that’s actually a vital part to determining core metrics and something I learned from Steve Wendel who’s awesome at pushing you on those hard questions.)

The easiest way to do the queries was through writing DB queries, putting them into a model with methods and then creating a view. The app is an ember-appkit-rails app which kinda mushes the rails api and ember together but I decided to keep this simple and just run everything outside of the ember piece and just keep it as typical rails.

There were also a couple of options for running the number… we could have done a rake task or set up a chronjob. Right now, the approach is just to have the business folks hit that page whenever they want new numbers (with the understanding that they shouldn’t hit it too often because it kicks off a bunch of queries that hit the database).

To seed the data, I was originally leaning towards factory girl but decided to start with just creating the data I needed in the tests. What I didn’t realize is that the app automatically pulls in fixture data, so everytime I had a to eq it would fail because the number would be incorrect. That taught me… after spending probably too much time trying to figure out how to not have the test pull in the fixtures, I realized I should just embrace them, learn how to set up the fixtures for my tests correctly and use them to generate the data I needed.

Then I created a model with the class AdminAnalytics. Each analytic was a separate method. I think if I wanted to go back and continue to refactor further, then I could easily break the model into separate classes. For example, instead of having one overarching AdminAnalytics class, it would probably make more sense to have a Users class, a Posts class, and Ingredients class, etc.

Then it was down to the queries. I didn’t have much SQL experience and some of these queries were pretty complicated so it helped me to first write out all of the steps that I was looking for before translating it into an actual query. Once I did that, I could take those queries and use ActiveRecord to give me the rails magic that made the queries a little easier. For example, when joining tables in queries, you don’t have to note the join table… you can just note the two tables and activerecord will figure out the relationship between the two tables on its own. Then, because a lot of the queries involved profile type names or time parameters, I refactored by making those things arguments on the method that could be put in place in the view.

Then I created a controller which was literally one line and was able to create a view that just called the instance variable @analytics and the method with whatever arguments I needed. Bam! Analytics.

Now, once this was all done, we actually decided to pull the analytics out of the app. Instead of bundling these things together, we made it a separate app entirely that talks to the primary app in order to get the numbers needed. When we pulled it out, we used sequel which made it easier to pull those queries in (although still more difficult than just doing it right in the app). The nice part about the app was being able to use ActiveRecord but it also made the analytic piece dependent on a bunch of different models. For example, to get the total number of users, you just do User.count or to get the total number of posts, Post.count. You’re already depending on two different models here, User and Post. So, part of pulling this piece into a separate app was so that we were no longer relying on multiple models in order to get these numbers.

The other thing I really wanted to do was set up a simple dashboard using Dashing. I’d seen people whip these dashboards up in no time flat, so I figured I’d take a stab at it… boy was I mistaken. I think my journey to a dark place started with the decision to use dashing-rails instead of dashing. See, I figured, if I used dashing then I would have to create the connections to the database in order to get the information for the queries I was running. If I used dashing-rails, then the dashboard would be a part of the app and this part would be easier. (If you’re thinking about the paragraph above and thinking, wait a second, you pulled the whole thing into it’s own app anyway in the end, yes. I realize this and boy is hindsight 20/20). Dashing-Rails involves a bit more setup. You’ve gotta set up concurrency, you have to use a database that has lots of threads like puma, etc. The issues started there and just didn’t stop. First, I had an issue with puma and so I upp’ed the number of threads possible and that seemed to fix it. Then, there was a database connection issue. The DB connection would time out almost immediately. We got this working as well, but it still times out. After a certain number of rounds, the thing just kicks the bucket. Then, I had an issue where all my dashboard widget boxes would show up but nothing would show up in them. Sometimes, if I commented things out and then reimplemented them one at a time, they would work again. And there was no rhyme or reason about when the dashboard would start up and kick off, versus when it wouldn’t.

So, here’s the really annoying part. I FINALLY got the issues fixed (with the help of lots of pairing with a few different, very patient people) and pushed the dashboard to production, where everything promptly broke and nothing rendered correctly. Then we got it rendering correctly, but the queries still aren’t running correctly. All the while, I’m kicking myself because I was the one who said, “oh, and I’ll build this awesome dashboard to go along with the simple analytics view. It’ll be fun and shouldn’t take too long.” I think the dashboard was mostly a lesson in when to give up. I should have scrapped it after the first week and a half, but I was a little too stubborn and a little too determined to get it done.

So, that’s how I built the analytics. For my final thoughts, I think the moral of this story is to always keep it simple. Start small, get that shipped and continue to add and improve.

No comments:

Post a Comment