eWHEREness by Perry Tancredi

  • Visualizing botnet activity

    Every now and then someone does something novel with our data.  We got a request from someone at Green Phosphor to map some IPs for them so they could apply their 3D visualization tool, targeted primarily at the pharmaceutical industry but with wide applicability, to network log data from a US state government agency.  They used our data along with the log data to visualize the activity from different IP addresses on different ports, correlated to country across time.  The tool shows stripes of low-level activity from all countries that seem to trigger major activity from certain countries.  The implication is that the activity is botnet activity and shows relative infection by country.  You can see the video at http://www.youtube.com/user/arkowitz#p/u/2/sB2jEhDZyBk.


  • Fraud Analyst Q&A

    Ever wonder what happens behind the scenes at a credit card company's fraud prevention department?  This Q&A on reddit.com with a Visa employee provides some insight.  You can even ask a question if you want.  There's nothing groundbreaking, but it does make an interesting read.  It makes you wonder exactly what your credit card company thinks of you.


  • One Billion Spam Messages

    The holidays are a busy time for spammers, and one of Quova's partners in helping stem the tide just reached a big milestone.  Project Honey Pot just received its one billionth spam message.  The spam itself was a phishing scam pretending to be from the US Internal Revenue Service (IRS), and the email address it was sent to was harvested back in 2007.  To "celebrate" one billion spam messages, Project Honey Pot put together a report looking back at five years of spam data, and came up with some interesting statistics.  Also, it turns out that one billion messages is not a lot of spam: Project Honey Pot estimates that their one billion represents 125 trillion actual spam messages sent since 2004.

    • Most spam is sent on Monday.  By hour, 12:00 GMT is the busiest time and 23:00 GMT is the quietest.
    • Project Honey Pot ranks Finland the best in IT security, followed by Canada and Belgium (the US is sixth and, interestingly, Estonia, which has seen its share of attacks, is tenth).  China is ranked the worst.
    • The average time from when an email address is harvested to when spam is sent to it has gone down from almost 50 days in 2004 to less than 22 days in 2009.
    • While spammers may send more email in general during the holidays, spam sent on Christmas and New Year's Day is much lower than on other days (21% and 32% lower, respectively).
    • Chase tops the list of the most phished organizations, and Facebook is second and gaining.

    You can read the report here.


  • Does the Comcast/NBC deal mean the end of free content online?

    Comcast recently announced a deal that would give them majority control of NBC through a sale of all of Vivendi's stake and some of the stake held by GE, the soon-to-be former major owner.  There are several interesting questions about a future where a major content distributor has such a large stake in a large content creator (NBC Universal of course owns much more than the NBC-branded channels; it also owns networks including Bravo, the USA Network, the SciFi channel, Telemundo, and more).  Paul Gallant was quoted in an AP article asking whether Comcast could use its new control to slow the growth of online video as a competitor to the traditional cable TV business.  I think Comcast is well aware of the decline of cable TV as a long-term business model, which is one reason they want a stake in NBC.  I think that rather than trying to stem the rising tide of online video, Comcast will start working on how to monetize it better and encourage it.

    Hulu.com, which is partly owned by NBC, has already announced plans to start charging for content.  Millions of people already watch commercial-free TV online through their paid Netflix subscriptions, though the selection isn't on par with Hulu.  Comcast hasn't ignored online video, and along with Time Warner Inc. (not Time Warner Cable), announced in June an agreement to offer services that would bring more content online, but only to viewers who already subscribe to traditional cable services.  This isn't new; BellTV and Rogers in Canada already have similar services.  Time Warner calls theirs TV Everywhere; Comcast calls its service On Demand Online and plans on renaming it Xfinity (you can read their press release here).  They've been in trial mode, and plan on releasing to subscribers this month.  Comcast clearly is trying to figure out their online strategy, and controlling NBC gives them a lot more leverage to get a piece of any potential revenue.  In order for On Demand Online to be successful, Comcast would have to authenticate subscribers effectively, which might be a challenge.  If it were to succeed, it could also threaten the access to premium content on Hulu, as content providers could make their most valuable content available only to cable subscribers, thus maximizing revenue.  The current deal with NBC indicates that Comcast might be hedging its bets so that it will be well placed whether paid content on Hulu or online cable subscription services win out.

    So what of the fate of free content online?  A few months ago, Elizabeth Blair did a story on NPR about the economics of the movement of TV viewers to online services like Hulu.com.  In the report, Elizabeth quoted two interesting figures: the advertisements on traditional broadcast media have a CPM ("cost per mille," or cost per thousand views) of about $25, while an ad on Hulu.com has a CPM of $40.  That's significant, but of course not all websites can charge a $40 CPM, and even shows on Hulu.com don't make as much as shows on traditional networks because the typical show on Hulu is only interrupted by three commercials, while there are several more on television.  The report on NPR also made the point that three commercials is just about the limit of what online audiences have patience for, so one could assume that the only way for online revenue to catch up to broadcast revenue in the current model would be for the online CPM to increase dramatically, or for Hulu to charge directly for content.
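
    To make that arithmetic concrete, here is a rough sketch using the two CPM figures quoted above and Hulu's three commercials per show.  The NPR report doesn't say how many commercials a broadcast show carries, so the 16 below is purely an assumed placeholder, and the function and variable names are mine.

```python
def revenue_per_thousand_viewers(cpm_dollars, ads_per_show):
    """Ad revenue (in dollars) one show earns per 1,000 viewers.

    CPM is the price of showing one ad 1,000 times, so a show with
    several ads earns the CPM once per ad for every 1,000 viewers.
    """
    return cpm_dollars * ads_per_show


def break_even_cpm(target_revenue, ads_per_show):
    """CPM needed to reach target_revenue with the given number of ads."""
    return target_revenue / ads_per_show


BROADCAST_CPM = 25.0   # from the NPR report
HULU_CPM = 40.0        # from the NPR report
HULU_ADS = 3           # from the NPR report
BROADCAST_ADS = 16     # ASSUMPTION: not stated in the report

broadcast = revenue_per_thousand_viewers(BROADCAST_CPM, BROADCAST_ADS)
hulu = revenue_per_thousand_viewers(HULU_CPM, HULU_ADS)
needed = break_even_cpm(broadcast, HULU_ADS)

print(f"Broadcast: ${broadcast:.2f} per 1,000 viewers")
print(f"Hulu:      ${hulu:.2f} per 1,000 viewers")
print(f"Hulu CPM needed to match broadcast: ${needed:.2f}")
```

    Under that assumed ad count, Hulu would need a CPM of more than three times its current $40 to match broadcast revenue with only three commercials; a different ad count changes the number but not the shape of the problem.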

    Comcast must understand that hanging on to cable subscriber revenue is going to be an uphill battle, especially with younger generations not accustomed to paying $100/month or more for content they can get for free online (either legitimately by accepting ads or through less legitimate means).  Soon, we'll find out how receptive Hulu's audience will be to paying for content.  Either way, Comcast will be part of the mix if the NBC deal goes through.

    (If you want to know more about TV Everywhere, check out the FAQ on NewTeeVee)


  • Geolocation as a tool against spammers

    Sometimes I come across an "edge" use case for geolocation.  This example from a conversation at the QDB (IRC Quote Database) is a good one, and shows how geolocation plus a good deal of free time can be used to fight spam.  Note that the conversation, like many informal conversations, includes some language that you might not want to load on your screen at work: specifically, a couple of f-bombs and an s-bomb.  The words are used as expletives, however, not descriptively, so in context they should not offend (most).


  • Is MySpace the next GeoCities?

    Today is the day that Yahoo closes down GeoCities.  GeoCities, of course, was the once widely popular free homepage hosting service.  I remember seeing my first GeoCities page (as a result of an AltaVista search, my favorite search engine at the time because it could find obscure pages), and thinking two things.  First, that the page itself was poorly done (a charge easily leveled at many MySpace pages), and second, that I was looking at something I hadn't seen before.  There wasn't as much user-generated content on the web in 1998, and the fact that GeoCities allowed anyone to publish their own web page easily changed the way I looked at that content.  All of a sudden, I had greater access to information, but it was clearer than ever that I had to decide what level of trust to put in that information.

    Later, I noticed a third thing about GeoCities: the advertising.  It wasn't subtle, to be sure.  By the time advertising showed up on GeoCities pages, I was using Google as my search engine and I pretty much avoided any GeoCities results because I didn't have the patience to wait for the ads to load before I got access to the content.

    MySpace is different from GeoCities in a lot of ways.  The most important is the social aspect of MySpace that never existed to the same extent on GeoCities.  Like GeoCities, however, MySpace is losing its reach and readership.  A few things will keep MySpace around, though.

    One of the early accomplishments of MySpace was that it was able to attract musicians and music fans.  Facebook will probably never offer the same freedom that MySpace does, which is one thing that attracts artists to MySpace, and Facebook would need to make a concerted effort to attract the musicians and fans away from MySpace.

    MySpace's attraction to creative types is part of the other reason MySpace is going to stick around.  NPR had a story last week about MySpace and Facebook and made the point that while Facebook is attracting users away from MySpace, MySpace is holding on to some core demographics.  According to the story, Facebook and MySpace users are increasingly identified by race, class, and lifestyle.

    Demographics, of course, are useful to advertisers.  Targeted ads perform better, and any site that has a well defined audience will be attractive to advertisers.  GeoCities never had that, and MySpace does.  MySpace may have lost its place as the leading social networking site, but its core of users that remains is valuable, and that will let MySpace put up a bigger fight to survive than GeoCities could.


  • Geography Matters... Right?

    George Michie from the Rimm Kaufman Group (RKG) recently wrote a two-part post titled the Geographic Impact of PPC. In it, he analyzed the data from a report RKG conducted to determine, as George put it, "if some well-known truths from the catalog industry also apply to the world of Paid Search, namely that geography matters."  I was intrigued. Of course geography matters, the only question is how much. Maybe the study can help shed some light on that. Right?

    Not so, at least not according to this report. In his description of the findings, George states that "we have to conclude that knowing the zip code of the user doesn’t have much value for retailers in Paid Search bid management." Huh?

    Let me start out by saying that George is very honest that there are major limitations to the study (see Part 2), and basically admits that no conclusions should be drawn from it. He does go on to draw conclusions, however. One of which is that ads placed for national retailers do as well or better when served to online users in the same zip code as a store. So... does geography matter or doesn't it? If the second conclusion is true and I were a retailer, I'd place more ads where I have stores, right?

    I like the fact that George and RKG are asking these questions.  Some of their findings are shocking, and they will provoke some discussion, which is good, but the study is flawed.  I find it hard to believe that either RKG or George actually thinks that geography doesn't matter in online advertising.  For anyone who's in the online advertising business, experience and intuition tell us otherwise.  Almost every online marketer includes some geotargeting options.  The largest advertisers who make the most money for their customers and analyze performance more than anyone have the most granular geotargeting, and charge more for it (you can draw custom shapes with Google, and Yahoo! added zip targeting last year).  Quova has case studies from advertisers like 24/7 RealMedia as well as large and small organizations like Continental Airlines and WyzAnt that show that local ads and offers can increase conversions by 50% to 200% (WyzAnt and Continental, respectively).  And that's not just for local businesses: Melissa Mackey at Search Engine Watch wrote just two months ago about how national advertisers can reap the same benefits local advertisers have for years.

    So where did the study go wrong?  Again, George is honest and admits there is very little correlation between the factors in their data when looking at traffic value.  Even in their best model, 85% of the variance was caused by random noise or other factors not available in the data.  They did get somewhat better results (though still effectively useless) when looking at rural vs. densely populated areas, but even then 55% of the variance was unexplained.  He compares those results to what Johannes Kepler found when trying to explain planetary orbits.  Kepler completely rejected one of his early attempts over a variance of just 2%, and kept working until he was able to describe the true model of elliptical planetary orbits.
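
    For anyone who wants to see what a figure like "85% of the variance" means in practice, here is a minimal sketch.  I'm assuming the number refers to the usual R² (coefficient of determination) of a fitted model, which RKG's post may or may not have used, and the data points below are made up purely for illustration.

```python
def r_squared(observed, predicted):
    """Share of the variance in `observed` that `predicted` accounts for.

    R^2 = 1 - (residual sum of squares / total sum of squares).
    """
    mean = sum(observed) / len(observed)
    ss_total = sum((y - mean) ** 2 for y in observed)
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_residual / ss_total


# Made-up example values, NOT data from the RKG study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]

explained = r_squared(observed, predicted)
unexplained = 1.0 - explained

print(f"Explained variance:   {explained:.0%}")
print(f"Unexplained variance: {unexplained:.0%}")
```

    On that reading, RKG's best model corresponds to an R² of about 0.15, meaning the zip code data accounted for only about 15% of the variation in traffic value.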

    I think it's safe to say that the reason there is so little correlation in RKG's data is that one of the original assumptions was wrong.  RKG mapped the IP addresses of 3M PPC clicks to individual zip codes based on data from another geolocation provider.  To be fair, it's unclear whether that provider knew how the data was going to be used, but if they did, they should have made it clear that mapping IP addresses to user zip codes can be problematic.  Lots of traffic is aggregated through ISP proxies or other routing points.  Many IP addresses are dynamic, so they can be in one location one day and somewhere else the next.  Users often browse at work and shop close to home, and don't live and work in the same zip.  Even how users connect to the Internet can affect location (e.g. it's much more difficult to locate dial-up and satellite connections).  George says that they originally tried to do the study with a cheaper provider and found the data unreliable.  So how do you describe data that leaves 85% of the variance unexplained?  I'm not saying that IP addresses cannot be mapped to zip codes.  They can.  But if you don't know which IP addresses can be reliably mapped and which can't, then you can't draw any real conclusions from that data.  RKG's results show how starting with one bad assumption can lead to wildly unexplainable results.

    Also, one of the limitations of the study George lists is important: that the zip codes are "zip codes of the IP, not the user. We really don’t know the location of the user, we know the location of the IP address."  That can obviously make a huge difference.  Most IP geolocation technology locates IP addresses somewhere upstream of the ultimate end-point.  That location might be a DSLAM (for DSL), an ISP, or even a satellite.  It's rare that all the downstream IP addresses share the same zip code, and almost a certainty that they don't share the same demographics.  DSL connections can be located fairly accurately because of the physical distance limitation to the DSLAM of about 5 km.  As an experiment, draw a circle with a 5 km radius around where you live and figure out how many zip codes and demographics that covers.
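
    If you'd rather run the experiment than draw it, here is a rough sketch.  It computes the area of a 5 km circle and uses the standard haversine formula to check whether a point falls inside it.  The coordinates and function names are hypothetical, picked only for illustration, and the mean Earth radius of 6,371 km is an approximation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def circle_area_km2(radius_km):
    """Area covered by a circle of the given radius."""
    return math.pi * radius_km ** 2


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_reach(center, point, reach_km=5.0):
    """True if `point` lies within `reach_km` of `center`."""
    return haversine_km(*center, *point) <= reach_km


# Hypothetical coordinates for illustration only.
dslam = (37.77, -122.42)
nearby_home = (37.80, -122.42)    # about 3.3 km to the north
faraway_home = (37.90, -122.42)   # about 14.5 km to the north

print(f"Area of a 5 km circle: {circle_area_km2(5.0):.1f} square km")
print("Nearby home in reach: ", within_reach(dslam, nearby_home))
print("Faraway home in reach:", within_reach(dslam, faraway_home))
```

    A 5 km radius works out to roughly 78.5 square kilometers, which in a dense city can easily span many zip codes.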

    Quova actually provides our customers with tools to help them determine the likelihood that a user is actually where the IP address is.  That likelihood depends on several factors (the "factors not available in the data" George mentioned).  At Quova, we provide a value called the Confidence Factor that gives our customers an indication of how likely it is that the user is located where the IP is.  That Confidence Factor is an aggregate of the other information and evidence we collect and of our Network Geography Analysts' knowledge of how the internet works.  We also provide most of that evidence as raw data, so you can find correlations, if they exist, with any one piece of evidence.
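
    To illustrate how a confidence value could be used in a study like RKG's, here is a minimal sketch that separates records worth trusting from the rest before any analysis is run.  Everything in it is an assumption made for illustration: the record layout, the field names, the 0-100 scale, and the threshold of 80 are hypothetical and do not describe Quova's actual data format.

```python
from typing import List, NamedTuple, Tuple


class GeoRecord(NamedTuple):
    """Hypothetical geolocation record; not an actual Quova format."""
    ip: str
    zip_code: str
    confidence: int  # ASSUMED 0-100 scale, higher means more reliable


def split_by_confidence(
    records: List[GeoRecord], threshold: int = 80
) -> Tuple[List[GeoRecord], List[GeoRecord]]:
    """Separate records at or above the threshold from those below it."""
    trusted = [r for r in records if r.confidence >= threshold]
    untrusted = [r for r in records if r.confidence < threshold]
    return trusted, untrusted


# Made-up records using documentation-only IP ranges.
records = [
    GeoRecord("192.0.2.10", "94103", 95),
    GeoRecord("198.51.100.7", "10001", 40),
    GeoRecord("203.0.113.25", "60601", 80),
]

trusted, untrusted = split_by_confidence(records)
print("Use for zip-level analysis:", [r.ip for r in trusted])
print("Set aside:", [r.ip for r in untrusted])
```

    The point is only that the zip-level analysis would run on the first group, while the second group is set aside or analyzed at a coarser level.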

    The fact that RKG didn't have the information to tell them what data to trust and what not to unfortunately makes their findings only interesting talking points, not real conclusions. I might sound critical of the study and I don't mean to be. I don't think George would disagree with anything I've said and don't think even he would claim that the findings are anything other than anecdotal. If nothing else, he raises some very interesting questions about how national retailers should think about online advertising.

    If RKG has a real interest in this, I hope they follow Kepler's lead and try again. I look forward to more studies from RKG and others.


  • Welcome

    Welcome to my section of Quova's eWHEREness blog.  I manage Quova's products and services and will be posting my observations about the state of geolocation and related topics, and probably some non-related ones, too.  I'm also interested in your feedback and hope that there will be some good comments and discussions here in eWHEREness.  Also, if you have any feedback about our products and services or want to hear what others are saying, please visit our discussion forum.

     

    For my first post, however, I thought I'd write about something fun that might be interesting to folks interested in geography.  I was walking around the Mission neighborhood in San Francisco and wandered into the Artist Xchange, where I saw these prints of hand-painted maps of the city by Niana Liu:
     

    [Images: Niana Liu "i live here!" prints; Niana Liu, Potrero]

    These are part of her "i live here!" series.  My cell phone pictures don't really do them justice.  They are prints from watercolor originals, and the effect of hand-painting something as precise as a street map is really neat.  Also, I like that she chose some of the more obscure neighborhoods to represent.  I haven't lived in San Francisco very long, but I don't think I'll ever really get used to all the neighborhoods here and the inevitable indignant corrections that come with them ("no, I live in the Inner Lower Heights").

     

    I also recently went to the Maker Faire here in San Mateo, CA, and I came across a table by Kelso Doesnt Dance at the Bazaar Bizarre:

    [Image: Kelso Doesnt Dance]

     

    Kelso makes "upcycled" wallets and other accessories out of maps, among other things.  I didn't get a chance to meet Kelso in person, but I did chat with her mom, who was manning the booth and was nice enough to let me take this picture.  You can see more of Kelso's wares on her website at http://kelsodoesntdance.com and you can pick up her road map wallets and other things in her shop on Etsy.

     

    OK, that's it for now.  I promise to write about more serious things in my next post.

    Thanks for reading.

    Perry
