Eric Goldsmith (AOL)
Joshua Bixby: I am Joshua Bixby, President of Strangeloop Networks. Welcome back to the Web Performance Today podcast. This week I am talking to Eric Goldsmith, Operations Architect at AOL. Eric is a performance evangelist in his public life, and in his day-to-day work he is in the extremely enviable position of being able to leverage large-scale analytics technologies to collect and analyze data from AOL websites, something I am quite envious of. He has been a pioneer in mining this data, and it was a pleasure to talk to him about why we have to think like data scientists, why teaching stats in a corporate culture is an uphill battle, and how the RUM world has changed in the past seven years. I hope you enjoy.
Joshua Bixby: Here with Eric Goldsmith, architect at AOL, long-time architect. Eric, how are you doing?
Eric Goldsmith: Good. How are you?
Joshua Bixby: I am wonderful. You are one of the unsung heroes of our community, someone who probably has more cumulative experience in big data and performance than most of the people listening combined. Tell me about your experience at AOL. Tell me about the last, I guess, nine years.
Eric Goldsmith: I’ve been here nine years, that’s right, and during that time I did a lot of work in the 2005 to 2010 time frame with web performance, building up tools for data collection and so forth to bring visibility of that throughout the company, and of late I have been working more on the big data side of things. You mentioned that I am an architect. I am in the data technologies organization at AOL, and we collect a lot of data from both synthetic measurement tools as well as end-user instrumented tools, like real user metrics. We collect all that data and provide the analysis and reporting services to provide insight to the business.
Joshua Bixby: And you are one of the original WebPagetest guys. Tell me about the birth of that.
Eric Goldsmith: Well, it was originally an internal tool that grew out of the modem days, when AOL was modem based, and it was actually originally called a war dialer because it was literally dialing modems. Then in about the 2005, 2006 timeframe the tool transitioned to be more web focused, web facing, and it was built out to support the extra information that could be collected from browsers, the waterfalls and so forth, as well as multiple geographic locations, to be able to get that additional insight.
Joshua Bixby: So who were the main guys? I mean I know you are obviously one of them, Patrick is. Who else would you sort of put the mantle around?
Eric Goldsmith: Well, there were three musketeers: Patrick Meenan, myself, and David Artz.
Joshua Bixby: David Artz, that’s right of course.
Eric Goldsmith: Yeah.
Joshua Bixby: And I do not know if I said this to you before, but you are the best-looking face WebPagetest has ever had. No offence to Patrick, but you know, when you got up there at Velocity and spoke, I think all the ladies, the two ladies, were swooning. Patrick has no ladies in his sessions as far as I can tell. The two ladies go somewhere else. There is an article that came through one of the Twitter feeds, which was something about controlled experiments and puzzling outcomes. Was that you?
Eric Goldsmith: Yes, I re-tweeted that.
Joshua Bixby: Re-tweeted it. It is a Microsoft study, right? It is interesting. It makes me think about this idea of RUM and actionable metrics. That one was fascinating because it is actually something we run into a lot, which is that these things are sometimes counterintuitive. Did that speak to some of the things you do in your job?
Eric Goldsmith: It does, and one of my mantras is this: a lot of people, when they think big data, to bring it back to that topic just momentarily, they think, boy, I'm just going to collect all this stuff, throw it in a Hadoop cluster, and then go analyze it and start looking for insight, and that's really not the right approach in my opinion. It is a starting point. That type of data mining, as I call it, is great for establishing corollary evidence, or correlations, but doesn't necessarily result in causation; it can't prove causation, so it's a starting point. My little mantra is: mine the data for correlation, and then experiment for causation. So once you start seeing some correlations that look interesting, then you've got to really start doing experimentation. You get into putting together hypotheses, looking at what data you need to prove or disprove the hypotheses, ensuring that you've got that data, or that you're going to be able to get that data, and then collecting and analyzing it with the hypothesis in mind. The right way to do this stuff is a little more sophisticated than a lot of people are led to believe, or want to believe.
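[Editor's note: the "experiment for causation" step Eric describes can be sketched as a simple significance test. The following is a minimal, illustrative example, not anything from AOL's tooling; the two samples of page-load times are invented, and a permutation test is just one of several reasonable choices.]

```python
import random
import statistics

def permutation_test(control, variant, n_iter=10_000, seed=42):
    """Two-sample permutation test: how often does a random relabeling
    of the pooled data produce a mean difference at least as extreme as
    the one actually observed? Returns an approximate two-sided p-value."""
    rng = random.Random(seed)
    observed = statistics.mean(variant) - statistics.mean(control)
    pooled = control + variant
    n = len(control)
    extreme = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        diff = statistics.mean(pooled[n:]) - statistics.mean(pooled[:n])
        if abs(diff) >= abs(observed):
            extreme += 1
    return extreme / n_iter

# Hypothetical page-load times (ms) before and after a change
control = [1200, 1350, 1100, 1280, 1420, 1250, 1330, 1190]
variant = [1050, 1120, 980, 1060, 1150, 1010, 1090, 1030]
p = permutation_test(control, variant)
print(f"p \u2248 {p:.4f}")  # a small p-value means the difference is unlikely to be chance
```

The point of the mantra: the correlation mined from the data suggests which variable to manipulate; only a controlled comparison like this, with a significance check, supports a causal claim.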
Joshua Bixby: I do not know if you follow the podcast This American Life, but they had an episode recently, I think it was called So Crazy It Just Might Work, about a cancer researcher, where the guy had, what was the machine, the one that hit it with sound waves, that starts with an R? I cannot remember. I am trying to think.
Eric Goldsmith: I cannot remember either.
Joshua Bixby: So the musician and the researcher got together, and it actually reminded me of what you are saying, because this cancer researcher kept saying, this is science: you have to control, you have to experiment. The controls have to be right. It has to be reproducible. And the artist was saying, well, I saw cancer cells die. I believe it, but it reminds me of how data science in our world is becoming so important, and the approach that the data scientist takes is very different than the approach that the data artist takes, right? That was a great episode, by the way. But it reminded me of that idea, which is that we have to be scientists: create hypotheses, look at the data, and do experiments. One of the things that has always struck me about Velocity culture in particular, and I think this extends out to the culture of the large tech giants, is that you with ease have a cluster of 100 Hadoop servers. You with ease have segmentation platforms, and you can easily take 1% of your traffic and get statistical significance in 12 seconds, or whatever the Facebook guys always talk about. But that's not actually accessible to most people.
I mean, most people have a hard time even getting the basics off the ground. So if you were to extract yourself out of the AOL world, and I know that you have worked in other areas, but if you were to talk to an architect at a small or medium-sized dot com, someone that doesn't have all that, what would you advise them about this idea? Because it comes off your lips as if, you know, let's experiment, let's collect all that data. Well, that is easier for you. I know it is not easy, but it is easier for you. If you got transferred into the body of a CTO at a top-200 retailer in the mid-tier, what would you do?
Eric Goldsmith: That is a great question, and I do not really have a great answer. One of the reasons that I like working at AOL is the ability to work at this scale; I revel in this scale. When you don't have that scale, things are more difficult, there is no doubt about it, but I still think the data is there. The ability to get statistical significance is there; all the fundamentals are still there. It will just take a little longer to achieve the goals because the volume is lower, but the fundamentals are still the same.
Joshua Bixby: How much do you struggle with teaching your non-techie executives to be stats experts? I mean, you know, some guy has a great idea, hey, let's do ads this way, let's do this thing that way, let's say culturally, not necessarily you, and they look at 20 minutes of data and have already made up their mind, where statistically you need a day, a week, or a month. Do you face that battle, or has your culture learned how to deal with stats?
Eric Goldsmith: No, they have not. I think it is a problem that is universal, and it is a challenge to bridge that gap: to understand the technical side at the appropriate depth to do things right, and also to communicate that and convince people on the business side that you do need to do it a certain way, that you do need to run a test longer than 20 minutes in order to get significant results. For me, I've been fairly successful just trying to articulate that in a non-technical way, with examples and so forth. Usually the business people will get it. There are going to be time pressures no matter what; sometimes you just cannot do what you would like to do from a purely scientific standpoint, but it is a case that has to constantly be made and re-communicated. It is not a problem that is going to go away any time soon.
Joshua Bixby: No, I am feeling the same pain. I want to circle back to one of the topics we talked about earlier, around analytics in the future, and, you know, something like Google Analytics, which now has some RUM in it. Do you see that as a path for how RUM will be utilized? Do you think that it will be part of an analytics tool? What is your sense on that?
Joshua Bixby: Yeah, I would love to know the number of feature requests they get and how much of it relates to this, because that has always been my feeling. I just did a blog post about this a few weeks ago where I basically disregarded Google as a RUM tool, in the sense that of the people I interviewed, no one is actually using it for RUM. People said, I have seen the screen once, or I have gone to it occasionally, but I have not seen a critical mass there, at least among the customers we work with, that are taking it seriously as a RUM tool. Which, you know, if no one is using it, I would guess the feature set is probably limited, but I would love to get some insight into that. Tell me, as you look across the last 10 years of data science, what's changed? And I ask this because I am so fascinated by your perspective, because I have always thought of AOL and Yahoo as sort of the two harbingers of the future, in terms of guys who started to take data science seriously early, organizations that, from a media perspective and a hype perspective, have been passed by but still remain the bastion of sort of the roots of the entire community. I mean, everyone that has done something important, or the vast majority, has come out of one of those two schools of thought. So as you've looked through this history, and seen it from the perspective of one of the early adopters, has anything changed? I mean, you guys collected RUM seven years ago. What has changed?
Eric Goldsmith: Well, in my mind what has changed is that the tooling has gotten sophisticated enough, and has grown in those 10 years, to allow the entire set of data to be stored and the entire set of data to be analyzed, not just samples. In the past, when you had to deal with samples, or small data sets and tools that could only work with small data sets, you had to do a lot of sampling and a lot more statistics. The way you do statistics on samples sometimes differs from when you have the entire data set; your approach changes. Being able to work on the entire data set, having it all at your disposal, allows you to dig into the outliers, dig into the tail, and get more insight out of the data, in my mind.
Joshua Bixby: So that is a huge difference, and I would agree. I mean, I just bought a one-terabyte drive, just a little drive for my computer. It cost me 50 bucks, and it makes me feel old, but I am still amazed by that. I know a bare one-terabyte drive actually costs a lot less, but this is a nice little portable one and has a fancy case. I am still amazed at the cost of data. I mean, I can't get over that as a business owner who has paid for data for almost 15 years now. We used to ask our guys to delete files off disks to save space. I haven't done that for 5 years.
Eric Goldsmith: I am like you. I was brought up in a time when storage was obscenely expensive, and that habit of deleting stuff and keeping things under control is just so ingrained in me that I still do it, regardless of how others feel.
Joshua Bixby: That’s funny, see, I can’t say that. I have a MacBook Air, and I got a warning, this is embarrassing, I got a warning last week that I am out of disk space, and it’s like 250 gigs, and I am thinking, what could I possibly have? And it is 40 gigs of photos, and 40 gigs of videos for watching on the airplane, and 10 gigs of email. I mean, I am that guy. I am the guy that says, I love my MacBook Air, but I wish I had a terabyte drive on it. So I can’t say that I have that same rigor.
Eric Goldsmith: That brings up another good point though about what’s changed and that ability to store all that data. It is not just having all the data to do the analysis but it is having the history to go back and look, so if we want to compare something that happened today or a change in a product or whatever it is, with data from a year ago, 2 years ago, 3 years ago, we have got all that data because it is so cheap to store it.
Joshua Bixby: So the data scientist in me, which I am not a data scientist, but I am going to adopt that tribe for a while, because I think it is a pretty cool tribe to be in, the data scientist in me thinks that is cool. And then another side of me thinks, how often are you going to do that? The pages are so different, the functionality is so different. The browsers are different. The networks are different, you know, one year to the next. Does it really help you? Do you find that these correlations help your business, other than, oh, isn’t it interesting? Because I get a lot of stats as a business owner. I get a lot of stats, isn’t this interesting, and I am like, it is interesting, but why did you spend 3 hours on that? So is it actually helpful?
Eric Goldsmith: You raise a valid point. There are so many things changing all the time, and depending on the metric, it may not be relevant to look back a year and try to compare the results; that’s a good point.
Joshua Bixby: Are there examples of where it is relevant? Let’s talk about a year, because in our world that’s a long time. Are there examples where comparing something today to a year ago let you make an actionable business decision, other than, oh look, people are using different browsers, and look, it was slower? Are there specific, tangible examples you can think of where that has been helpful?
Eric Goldsmith: Oh, a lot of times we look at usage of a particular site over time.
Joshua Bixby: Okay, that makes sense, yeah.
Eric Goldsmith: Being able to compare year over year, or even further back, as you said.
Joshua Bixby: Yeah, this Thanksgiving this happened, and so…
Eric Goldsmith: Exactly. It is helpful for capacity planning, it is helpful for just historical usage trends and so forth.
Joshua Bixby: Big things in the New Year. What are you seeing? Any bold predictions for me? Anything I wouldn’t expect?
Eric Goldsmith: No. I can’t think of anything.
Joshua Bixby: Ah, you have been in this world too long. You know all those bold predictions don’t come out. I write a blog post at the end of every year with bold predictions, and at least three quarters of them don’t come true, but it garners some readership. So I am digging here, come on, give me something.
Eric Goldsmith: Well, one of the areas that I am really spending time on and interested in is, again, collecting real user metrics with enough data to allow me to properly segment it, clean it up, and make it actionable, and being able to get access to that data, be it the Navigation Timing API, the Resource Timing API, the other things. Having that available is going to be a game changer, and as you know, nav timing is available in a lot of browsers now, notably missing from Safari.
Joshua Bixby: I know. Let us just dwell on that for a second. Safari. Steve’s dwelled on it for a while. Okay, now we have had our seconds of mourning, let’s keep going.
Eric Goldsmith: Right, so I'd love to see Safari join the game next year and give access to that data. Getting even more granular with resource timing, being able to look at it in more depth, not just at a whole-page load level but at individual resource load times, and getting it into other browsers. It's in IE 10 now on Windows 8, and I would love to see it in all the other browsers. Just being able to get access to all that additional data that is needed to make RUM really actionable, in my mind, that is really what I am looking forward to.
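[Editor's note: the segmentation of RUM data Eric mentions can be sketched with a few lines of code. This is a hypothetical example, not AOL's pipeline; the beacon values are invented, though the field names (`responseStart`, `loadEventEnd`) are modeled on the W3C Navigation Timing API.]

```python
import statistics
from collections import defaultdict

# Hypothetical RUM beacons: two Navigation Timing fields (ms since
# navigationStart) plus a browser tag for segmentation.
beacons = [
    {"browser": "chrome",  "responseStart": 180, "loadEventEnd": 1400},
    {"browser": "chrome",  "responseStart": 210, "loadEventEnd": 1650},
    {"browser": "chrome",  "responseStart": 150, "loadEventEnd": 1200},
    {"browser": "firefox", "responseStart": 250, "loadEventEnd": 2100},
    {"browser": "firefox", "responseStart": 230, "loadEventEnd": 1900},
]

def segment_percentile(beacons, field, pct):
    """Per-browser percentile of a timing field, computed over the full
    data set rather than a sample -- the shift Eric describes."""
    groups = defaultdict(list)
    for b in beacons:
        groups[b["browser"]].append(b[field])
    return {
        browser: statistics.quantiles(values, n=100)[pct - 1]
        for browser, values in groups.items()
    }

print(segment_percentile(beacons, "loadEventEnd", 50))
```

In practice the same aggregation would run over millions of beacons, with higher percentiles (p95, p99) used to dig into the tail rather than the median.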
Joshua Bixby: That’s a good one. Now, it is funny that you mention resource timing. If I were an executive at a CDN, I would probably be really nervous about that one. You know, you and I have both used CDNs extensively, and we know that if a tree falls in the forest and no one is around to hear it, sometimes it is not heard. But you guys probably have a lot of insight into how CDNs work and don’t work. Not naming specific companies, but you have probably seen that a lot of those resources aren’t at the edge, or they are not served quickly, or there is a problem on that server. I mean, resource timing, I think, is really going to shed a lot of light on CDN usage.
Eric Goldsmith: One of the dirty little secrets in the CDN world is that, generally, the way they work is that a CDN server sits on the edge, a request comes through it, and they pass that through to the origin server and then cache it for whatever the cache headers from the origin say to do. If you’re following Souders’ rules, that is at least 30 days, so it should still be on that server for 30 days and available for anybody else who needs it. Except not really, because there is only a limited amount of space on those servers, so stuff gets evicted. You just assume it is going to be there for 30 days, but not really. Having insight into how much is really available on the edge, how much is really there versus what you think is there, that is going to be interesting.
Joshua Bixby: And I think expectations are certainly skewed by the fact that all these guys have really figured out how to get great test results, right? I don’t like to use the word gaming, although I have used it in the past, but if your server is on the same network, in the same building, sitting next to the Gomez and Keynote servers, and you pin content to that server, your test will look good for a month. But when that pinned content has to make way for the next customer that is going to sign a 2-year deal, it doesn’t look as good over time. So yeah, I think resource timing is going to be great for end users to start figuring things out and really holding CDNs accountable for having their content at the edge like they promise, so I am excited about that one as well.
Eric Goldsmith: Right. But something else that I think would be interesting in that resource timing data is the cache freshness, or the cache hit rate, at the end user’s browser. This is where it kind of gets fuzzy, because there are some privacy concerns there. If you can tell via the timing results that something came from a user’s cache, then you know they have been there before, and that’s maybe crossing the privacy line.
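[Editor's note: a common heuristic for the cache-hit inference Eric alludes to is that a resource whose fetch duration is near zero was almost certainly served from the browser cache. A minimal sketch, with invented timing records and an assumed threshold; real analyses also use signals such as a zero `transferSize`.]

```python
# Assumption for this sketch: anything fetched this fast came from the
# local browser cache rather than the network.
CACHE_THRESHOLD_MS = 5

# Hypothetical resource-timing records (durations in ms)
resources = [
    {"name": "/js/app.js",    "duration": 2},
    {"name": "/css/site.css", "duration": 1},
    {"name": "/img/hero.png", "duration": 180},
]

def estimated_cache_hit_rate(resources, threshold=CACHE_THRESHOLD_MS):
    """Fraction of resources whose duration suggests a browser-cache hit."""
    hits = sum(1 for r in resources if r["duration"] <= threshold)
    return hits / len(resources)

print(f"{estimated_cache_hit_rate(resources):.0%}")  # 2 of 3 resources -> 67%
```

This is exactly the inference that raises the privacy question: a high per-user cache hit rate implies a repeat visit, whether or not the site set a cookie.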
Joshua Bixby: Yeah. I mean, certainly my European friends with their cookie insecurities would probably say that, but in our world, you guys and many others know almost everything about where that browser has been, almost everything about who that person is and what demographic they are in. I mean, it doesn’t cross the line any more than existing retargeting technology or ad technology does today, right?
Eric Goldsmith: Well, in my world, I am also involved in the Do Not Track initiative for AOL and, you know, the working group at the W3C and so forth. I am involved in those privacy aspects of this data as well.
Joshua Bixby: It is a pleasure to chat again, and as I say, you are one of these unsung heroes that needs to get more profile. So are you going to start presenting new topics at Velocity? Can we get you out to some of these conferences? Because you have gone quiet since, what, 2010. You stepped off the circuit, man.
Eric Goldsmith: Well, let me explain what happened there. I got kind of pulled sideways into big data, and so I have been doing different conferences, Strata and other things. But I got there because of RUM data, because when you are collecting that volume of data, you need some mechanism to analyze it, and that means Hadoop and Pig and all of the stuff that goes along with the big data world. So I got kind of pulled sideways into that.
Joshua Bixby: And that Strata stuff is amazing. I mean, I am good friends with Alistair Croll and some of the guys that spend their time being evangelical about that world, and man, that is as exciting, or even more exciting, than the performance world. Obviously these two worlds mesh, as you say, but that is a really exciting world. I think that stuff is amazing.
Eric Goldsmith: It is, and I am hoping this coming year, 2013, to kind of bridge the gap between the two and move back a little, be a little more visible in the web performance space with RUM work. Time will tell.
Joshua Bixby: You should, we need you back here. No, we need you back here, because there is not enough going on; there is not enough big data science being spoken about, given that those have started to become two distinct communities, like Ops and Dev traditionally have, and then these conferences come along to try to bring them together. We can’t let big data run off on its own and be divorced from the Dev community, the operational community. Not that it is my place to welcome anyone, but I would welcome your continued involvement.
Eric Goldsmith: Well thank you.
Joshua Bixby: It is wonderful to chat, have a great day.
Eric Goldsmith: Thank you. Great talking to you.
Joshua Bixby: Thanks, take care. Thanks for listening, and thanks again to Eric for making the time to chat. If you want to hear from some other big thinkers in the performance space, check out webperformancetoday.com/podcast, and if you have a suggestion for a future podcast topic or guest, drop me a line at email@example.com. Have a great day.