Is Apple in Danger of Anti-Trust Proceedings?

Update:
*I just learned that Apple dropped all DRM – and the two-tiered system… which completely invalidates my arguments below. So… nevermind!*

Apple just blocked the Palm Pre’s ability to interface with iTunes, as reported in [Information Week](http://www.informationweek.com/news/personal_tech/smartphones/showArticle.jhtml?articleID=218500862).

Granted – Palm’s interfacing with iTunes was questionable and a hack — it basically emulated being an iPod to trick the software into compatibility. Apple’s reasoning for the update is that it “addresses an issue with verification of Apple devices”, which would be fair enough coming from most companies and software makers…

However, one needs to think about Apple’s market share and new positioning as a vertical mobile provider. Not only is Apple the dominant player in the marketplace, but they’ve fixed themselves into the entire chain, locking out competition at every step — they’ve become the primary force controlling the Hardware and Retail of music devices, the Software to install on them, and even the music distribution.

Sound familiar? Just a few years ago, the US and the EU took Microsoft to task for something similar — bundling and tying Windows & the IE browser together, and then into PCs at reduced rates to block out competing operating systems and browsers.

Palm’s approach was ethically questionable and definitely a dirty hack. It’s also something that can — and probably will — be re-introduced as developers play cat & mouse with Apple to work around device ID checks, just as others have re-enabled jailbreaks on the iPhone with every OS update.

But Palm’s recent situation brings to light some larger questions — through iTunes, Apple bundles the following things together: iPhone / iPod support and management, music purchasing, and music management.

By integrating all those things together, Apple has created what is essentially not only a distinct market advantage, but an anti-competitive practice:

– users cannot put songs purchased through Apple on a non-Apple device unless they pay a premium for DRM-free content
– users need to run additional software in order to install non-Apple procured music from their library onto a non-Apple device
– owners of non-Apple devices are obviously penalized for owning their device — through needing to handle countless workarounds, paying more for content — and are given incentives to abandon their current setups for a complete Apple solution

While Palm’s approach to the situation is largely questionable, Apple’s handling of it shows many of the same parallels to what forced Microsoft to unbundle its software… and I’d wager that Apple may very well have ‘shot themselves in the foot’ with this, since few people knew/thought/cared about it before today.

Within the context of US and EU anti-trust laws — and not thinking about ‘fairness’ to Apple… it seems to me like Palm could push the Justice Department for Anti-Trust measures against Apple — and that could affect not just the iTunes support of devices, but the entirety of Apple’s music retail business.

I’m not a lawyer – nor do I pretend to be one. But this seems to have less to do with interpreting US statutes and laws than with referencing history and case law. A good lawyer may very well be able to give Apple a free pass on this — but you don’t need to be a lawyer to see that there are more than enough correlations between this situation and successfully prosecuted anti-competitive practices for this to make it to the courts.

For more info on US anti-trust practices, check out this interesting article on the tying/bundling distinctions and applicabilities — with some great case studies on what is considered illegally Anti-Competitive and what is not: [Antitrust In Distribution – Tying, Bundling and Loyalty Discounts, Resale Pricing Restraints, Price Discrimination – Part I](http://www.metrocorpcounsel.com/current.php?artType=view&artMonth=April&artYear=2006&EntryNo=4751)

On that 'Zombie Photos' report…

CNN and BBC have both covered something called ‘Attack of the Zombie Photos’ – an experiment out of the University of Cambridge that tested how long a photo ‘deleted’ by a website actually remained accessible.

I found the experiment to be incredibly flawed and misleading.

The researchers tested not the networks themselves, but internet cache copies. A network could very well have deleted the image from its servers, but that change (deletion) had not yet propagated to its Content Delivery Network (CDN) — ie: the photo was deleted from the primary servers, but a distribution copy on another (possibly 3rd-party) server had yet to be deleted or time out.

While the researchers did indicate that they were testing the CDN in a graph, their text barely made mention of it, their analysis not at all, and they routinely called the CDNs “photo servers” — as if they were the primary record. It seems as if the report was more about FUD (fear, uncertainty, doubt) than examining real issues.

You can view the report here: [Attack of the Zombie Photos](http://www.lightbluetouchpaper.org/2009/05/20/attack-of-the-zombie-photos)

My comment is below:

> I think this experiment is great in theory, but flawed in practice and conclusions.
> You are not testing to see if an image is deleted from the Social Network, but from their CDN. That is a HUGE difference. These social networks may very well delete them from their own servers immediately, but they are not exposed to the general internet because an (often third-party) cache is employed to proxy images from their servers to the greater internet. Some of these caches do not have delete functionality through an API; the content – whatever it is – just times out after x hours of not being accessed. It also is often ‘populated’ into the cache by just mapping the cache address onto the main site. Example: http://cdn.img.network.com/a may be showing content for several hours that was deleted from http://img.network.com/a

> Perhaps you know this already – but in that case you are presenting these findings in a way that serves your point more than the truth of the architecture.

> In terms of your inference of the EU and UK acts, I wouldn’t reach the same conclusions that you have. Firstly, one would have to decide that an unmarked photo, living at an odd cache address with no links in from a network identifying it or its content, would be deemed “personally-identifiable data” — I would tend to disagree. Secondly, while the purpose of it may be to “share it”, it would really be “share it online” – and dealing with cache servers and the inherent architecture of the internet, I think the amount of time for changes to propagate after a request for deletion would easily satisfy that requirement. I also wonder if the provision to access ‘user data’ means that it is done in real time or in general. I’m pretty sure all these sites store metrics about me that I can’t see.

> Again, I will also reiterate that we are talking about ‘cached’ data here — and that the primary records of the requested data have been deleted. At what point do you feel that privacy acts and litigation should force the ability to access / view *every* bit of data stored:
– primary record
– server caches
– data center/isp caches
– network ( university , business , building , etc ) caches
– computer / browser caches

> Your arguments open up a ‘can of worms’ with the concepts of network optimization. I wouldn’t be surprised if your university operates a server on its internet gateway that caches often-requested images — would they too be complicit in this scheme for failing to delete them immediately? How would they even know to do so? How could the network operator identify and notify every step in the chain that has ever cached an instance of the image?

Subversion Deployment Strategies

I’m wondering how people like to handle deployment of projects maintained in subversion.

After MUCH testing, I’ve developed the following method that seems to work well for me:

– Trunk is managed in /trunk
– Tags are in /tags; When I tag something, I ‘svn cp /trunk /tags/version-date’ like most people
– I maintain a special tag at ‘/tags/-current-deployed’ – I merge trunk into that whenever I deploy, then ‘svn up’ that tag on the server.

This method does a few things:

– it allows me to make changes on the ‘current deployed’ on the server if something needs to be fixed there ASAP, then merge into trunk — without breaking the ‘sanctity’ of a tag
– I really just ‘svn up’ and ‘svn merge’ — I never have to ‘svn switch’ then ‘svn up’ the project. I find that easier.

I’m sure others have good strategies. What’s yours?

Dear Hulu, I told you so.

Dear Hulu,

I told you so.

A few weeks ago you blocked Boxee.

I said in my article [Hulu’s dumbest move: blocking Boxee](http://www.destructuring.net/archives/2009/02/19/hulus-dumbest-move-blocking.html)

> See, internet people are smart and savvy. It won’t be long until someone makes a plugin / patch / version of Boxee that identifies itself as Safari or Internet Explorer or something else — re-enabling the ability for users everywhere to revel in streaming video.

Guess what. They did. And it wasn’t even hard.

Boxee now shows Hulu content through Hulu’s own RSS feeds.

So take this warning…

If you’re lucky, and they’re still identifying themselves as Boxee devices — don’t block them. Embrace them.

Don’t follow the advice of your content producers — educate them. Help them understand new media.

Tell them:

> Someone is sitting back on a couch. there are probably multiple people watching. this is probably on a big screen. we can charge 5x the CPM for this display.

And have a business that works not only for you, but your consumers and corporate partners alike.

You’ve been given a second chance — don’t miss it.

Examining foreign keys in Postgres

I often need to inspect fkey usage in Postgres to audit views, and it can be a PITA combing through data.

I found this little gem on the Postgres mailing list, posted by Thomas Kellerer:

> SELECT c.table_name || ' references ' || tu.table_name
> FROM information_schema.table_constraints c JOIN information_schema.constraint_table_usage tu ON (c.constraint_name = tu.constraint_name AND c.table_schema = tu.table_schema)
> WHERE c.constraint_type = 'FOREIGN KEY';

That lists all of your table fkey relations.

With a little bit of magic we can clean it up to alphabetize a bit…

> ORDER BY c.table_name, tu.table_name

and if you want to find what tables are referencing the table you’re on…

> WHERE c.table_name = '$$NAME$$' or tu.table_name = '$$NAME$$'

and to make things a bit more clear… we can add in the constraint name

> SELECT c.table_name || ' references ' || tu.table_name || ' via ' || tu.constraint_name
> FROM information_schema.table_constraints c JOIN information_schema.constraint_table_usage tu ON (c.constraint_name = tu.constraint_name AND c.table_schema = tu.table_schema)
> WHERE c.constraint_type = 'FOREIGN KEY' ORDER BY c.table_name, tu.table_name;

I’m not sure how the actual fkeys are stored in terms of columns, but if anyone has an idea, I’d love to update this call to list the columns on each table.
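
One possible way to pull the columns — a sketch I haven’t verified against every Postgres version — is to join information_schema.key_column_usage (the referencing columns) against constraint_column_usage (the referenced columns). It’s wrapped in Python / psycopg2 below just to keep the example runnable end-to-end; the connection string is a placeholder, and multi-column foreign keys may need pg_catalog.pg_constraint for exact column-to-column pairing.

```python
# A guess at listing the columns behind each foreign key:
# key_column_usage holds the referencing side, constraint_column_usage the
# referenced side. Wrapped in psycopg2 only to keep the sketch runnable;
# the DSN is a placeholder, and multi-column fkeys may need pg_constraint
# for exact pairing.
import psycopg2

FKEY_COLUMNS_SQL = """
SELECT kcu.table_name   AS referencing_table,
       kcu.column_name  AS referencing_column,
       ccu.table_name   AS referenced_table,
       ccu.column_name  AS referenced_column,
       tc.constraint_name
FROM information_schema.table_constraints tc
JOIN information_schema.key_column_usage kcu
  ON tc.constraint_name = kcu.constraint_name
 AND tc.table_schema   = kcu.table_schema
JOIN information_schema.constraint_column_usage ccu
  ON tc.constraint_name = ccu.constraint_name
 AND tc.table_schema   = ccu.table_schema
WHERE tc.constraint_type = 'FOREIGN KEY'
ORDER BY kcu.table_name, kcu.column_name;
"""

if __name__ == "__main__":
    conn = psycopg2.connect("dbname=mydb")  # placeholder connection string
    with conn, conn.cursor() as cur:
        cur.execute(FKEY_COLUMNS_SQL)
        for src_table, src_col, dst_table, dst_col, name in cur.fetchall():
            print(f"{src_table}.{src_col} references {dst_table}.{dst_col} via {name}")
```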

Hulu's dumbest move: blocking Boxee

Starting this week, Hulu will block the boxee client from its streams.

The heads of both Hulu and Boxee addressed it on their corporate blogs, calling it unfortunate:
[Boxee Posting](http://blog.boxee.tv/2009/02/18/the-hulu-situation/)
[Hulu Posting](http://blog.hulu.com/2009/2/18/doing-hard-things)

The summary is very simple. Hulu’s content providers got mad when they learned that people were watching Hulu on TV sets, not on computers. TV ads & Nielsen metrics provide more income than online viewing (online viewing was designed as a way to make extra cash, not to make big $$ and cover production costs) — and the content producers don’t want to steal people away from network viewing. They want to get people to watch ads on their couches in groups, on the big screen, where revenue is better and ROI/efficacy is higher.

The situation is obviously unfortunate for Boxee & other set-top-device/media-center users… but it’s also a huge setback for Hulu.

Let’s be clear about this – Hulu didn’t want to do this. They make money every time someone watches a video on their site – whether it’s via Boxee or a PC. They don’t care how people watch the content they distribute. It’s the content providers that forced this action.

But by caving in to their providers, Hulu became responsible for the largest mistake they’ll ever make.

See, internet people are smart and savvy. It won’t be long until someone makes a plugin / patch / version of Boxee that identifies itself as Safari or Internet Explorer or something else — re-enabling the ability for users everywhere to revel in streaming video.

And once people do that, those media devices are identifying themselves as not-boxee. Those devices are saying “Oh, hey there. look at us, see, we’re NOT connected to a TV, we’re just a computer”.

And those devices aren’t saying “Yes! we’re a boxee device! Someone is sitting back on a couch. there are probably multiple people watching. this is probably on a big screen. you can charge 5x the CPM for this display.”

And that is what Hulu – and the brain-dead shortsighted content producers – lost. Boxee and other set-top devices are how people will consume media in the future. It’s happening already. The rise of their popularity combined with the costs of cable in a flailing economy have many people dropping subscription services for broadcast networks and online viewing.

So in short, Hulu shot itself in the foot.

Boxee users will get Hulu content again, surreptitiously, but Hulu will only be making standard CPM rates for it — and losing out on a chance to charge premium CPMs.

By the time Hulu can make the content providers wake up, far too many projects will exist to liberate video onto set-top devices — and they’ll start spending all of their money on bribing projects to identify themselves as set-top devices, or on lawyers to threaten and sue developers. (And successful or not, they’re spending money without winning anything.)

So congratulations Big Media, you’re an idiot. Again. Next time technology & culture shifts, try embracing it and designing new revenue streams around it — rather than trying to quash it and live in the past.

a quick note on – mac notebooks

I’ve gone through a fair amount of Mac notebooks.

I think I had 5 of the aluminum PowerBooks over the years — if you really used them as a portable device, they would either fail from usage or get too banged up.

A few years ago I decided to just ‘screw it’ and get the cheapest iBook around. While a PowerBook never lasted more than 14 months without needing repairs, the iBook lasted well over 3 years.

A while back, a friend made fun of me for accepting that there was an “Apple Tax” — the cost of upgrading each year or so because iPods don’t last long (seriously – those hard drives just FAIL after a year or so).

I realized I was ok with this, and after modeling some expenses vs. usage, I’ve been a firm believer in buying the ‘cheapest’ mac portable around for a while… and this is my rationale:

1. The low-end Mac portable is $1300 (current model) or $1000 for the last generation.
2. The next ‘size’ up for screens are $2000.
3. AppleCare past the original 1st year is $350
4. Most people tend to buy a laptop for 3 years.
5. The differences between models are honestly negligible for 90% of users, and can mostly be offset by dropping in $100 more in RAM
6. If you buy the cheapest model *without* the warranty, you can buy the new ‘cheapest model’ in 18 months — that machine is often cheaper and faster than the last machine you purchased – and faster than the last ‘highest model’.

My point is this: if you spend $1300 on a macbook today, you can probably buy a $1200 macbook in 2010. Since you budget these machines to work for 18 months, you don’t need the warranty. Then in 2011 you will have a machine that is 1 year old and (likely) faster than the machine you would have purchased today for $2350+ (if you’re budgeting the machine for 3 years, you need the warranty). Stated differently: $2500 will buy you 2 low-end macbooks with no warranty over the course of 3 years; $2350 will buy you 1 high-end macbook that needs to last 3 years.
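
Spelled out as a minimal sketch, using the 2009 list prices quoted above and assuming no resale value on either machine:

```python
# The two 3-year budgets compared, using the prices quoted in this post.
# Assumes no resale value, and that the low-end machine is simply replaced
# at the 18-month mark instead of being kept under warranty.
low_end_today, low_end_in_18_months = 1300, 1200
high_end, applecare = 2000, 350

two_low_end_machines = low_end_today + low_end_in_18_months   # no AppleCare needed
one_high_end_machine = high_end + applecare                   # must last all 3 years

print(f"two low-end MacBooks over 3 years: ${two_low_end_machines}")   # $2500
print(f"one high-end MacBook over 3 years: ${one_high_end_machine}")   # $2350
```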

For me this has been a no-brainer.

A primer on Web-based Business P&L Operations & Optimization

# Overview

A lot of what I have been doing professionally over the past few years has involved the management of P&L sheets ( Profit & Loss ). For those unfamiliar with the term, P&L sheets are basically a management guidebook for charting how your organization, department, or product spends and generates revenue. Not only do they provide you with the figures that you need for annual reports, for budgeting, and for investment opportunities, but they also facilitate your understanding of your financial strengths and weaknesses, and suggest where you may be able to improve.

The first time that I had to deal with P&L sheets was during the infancy of FindMeOn. I was attempting to forecast corporate overhead and monthly / yearly expenses during our investment pitching, and this stuff was *completely* foreign to me. I resorted to recruiting a specialist onto the team to help me handle this P&L stuff. Unfortunately, this person was only a self-described ‘specialist’, and didn’t have the expertise that they had claimed — so the work fell on me to compute all of the Cost of Sales for overhead and fixed costs. The ‘specialist’ would then review and standardize the data, which was nothing more than a sign-off as ‘correct’. Potential investors were not impressed. Our spreadsheets were ultimately chicken scratches made by a madman (me!) with math that barely made sense to the untrained… and I didn’t know any better until a VC friend was kind enough to sit me down and give me a stern “talking to”.

For some insane reason, there isn’t much information out there for budding startup folks to reference on this topic.

Before continuing, I’d like to thank some friends for giving feedback on an early draft of this document:

– Avi Deitcher, founder of Atomic Inc, an operations / management consulting service. Aside from standard consulting, Avi routinely functions as an interim CIO, COO, or CTO for startups. He’s one of the few people that I know who recognizes that both Microsoft and Open Source products have their places, and who maintains a brilliant blog on operational strategy at http://blog.atomicinc.com/.
– Rick Webb, Partner and COO of The Barbarian Group, recently named Digital Agency of the Year … again. Brands, agencies, startups, and even Internet Week and the Webby’s turn to Rick for sage advice as the guru of all things online. Rick often blogs about smart stuff over at http://www.barbariangroup.com/employees/rick_webb.

A full P&L sheet for a web oriented company is a huge document, taking into account labor, taxes, overhead, infrastructure, vendors, multiple departments etc etc etc… to give a full summation of the company’s activities. To handle this data, I use an Excel document that has a half-dozen sheets with hundreds of columns and rows to account for everything a company spends / earns / generates. The document is never quite complete, nor is it easy to describe. Explaining a full P&L sheet in one article would be a daunting task, and likely overwhelming.

Instead, this article is designed as a primer — very limited and with a focus on the Cost of Goods Sold and Sales in relation to scalability issues on fixed-cost items like bandwidth.

If you really are interested in knowing about P&Ls for tech companies in-depth, in the near future you’ll be treated to some articles on this subject by Avi. You might get a series of articles co-written by us as well — we’re currently working out the details.

Before I move forward, I want to admit to an ‘error’, as my friend Avi has pointed out to me: I conflate several concepts in this overview in my attempts to make this more understandable to the non-Business School crowd. I also gloss over certain important items to make this article more digestible. Specifically, those items are:

– Cost of Sales
– I focus on the cost of serving up Web content of any type in regards to bandwidth.
– I do not cover hosting related costs: power, software / cpu licenses, hardware, colocation fees, management overhead, engineering staff, etc.
– I do not cover corporate infrastructure costs: technologists who build / maintain your product, content creators for your product, a sales force, other employees, office overhead, etc.
– Optimization of Cost of Sales
– I limit myself to the sections that I cover above
– Revenues from Advertising business
– I introduce this topic because of how it and COS can affect product strategy; however, it may not be applicable to your particular business

Avi also mentioned a very important differentiation that people should keep in mind, which I’ll quote below:

> You should be discussing COS (cost of sales), which is what you use in service, and not COGS (cost of goods sold), which is what you use in products. Yes, I know, many people mix them up, but not the real experts. Allow it to be part of your education of the readers.

So … Avi nailed it, and I admit it — I constantly conflate COS and COGS. In my defense, it’s because the web projects that I work on tend to straddle the two – and we often might build verticals off the same infrastructure as a Service to institutional clients and a Product to individuals (or vice versa).

# Core Concepts

The approach that I will take with this article is to provide a quick and simple primer for people with little or no business background so that they are better able to understand the costs of operating a web based business. Understanding this stuff is *extremely* important to both startups and to established businesses alike:

– Costs are a function of scalability in userbase and in usage. It obviously costs more money to deliver your product to more consumers more times.
– Costs are a function of product design. Your product, at its heart, is really a User Experience. Depending on the way you have designed the product, the architecture, caching, etc, this User Experience that you want to deliver may be either cost effective or cost prohibitive.

Here’s an important rule for you to remember:
> _Your online property – whether it is a service, publication, network, or marketing exercise – is a product._

If you are able to better understand the costs related to your product, you may be able to alter the User Experience or the Product Design in order to make it more profitable. If your web-property is a service, you need to calculate how much you must charge clients in order to break even. In one of my proposed B2B projects at FindMeOn, we realized that a campaign that we wanted to launch with a non-profit was prohibitively expensive — the client couldn’t even cover the projected bandwidth costs, much less any break even or profit for our labor. In another B2C project I worked on, we realized that we would have to charge users a high monthly fee to be able to provide the service level that we wanted, which led us back to the drawing board to fine tune the product — in terms of web engineering, application design, and user experience / capabilities.

I want everyone reading this article to walk away with at least one more concept ingrained in their minds:

> _Understanding scaling costs is important, because you may have to rethink your Product Design and User Experience to meet Business Goals._

It does not matter if you don’t understand the technical intricacies of scaling costs for your product — you can always hire a specialist like me to assist you with that. You simply need to understand that beyond labor and servers and overhead required to produce your product, it costs money to deliver the final digital product to consumers — and those costs vary.

# Understanding Costs Through Real World Examples

I’ve always found that the most efficient methods to both learn and to teach are by example, so to illustrate these concepts I’ll utilize a few real world examples that I have encountered over recent months.

In the following examples, I *mostly* focus on making products more cost-effective on a per-page basis based on ad support. Let me repeat this – two of the examples focus on ad supported business models, and in both I refrain from discussing types of ad optimization / targeting strategies. One could easily offset high costs with very efficient advertising or charge-models that allow for a product with a more resource-intensive user experience. Similarly, these examples are based on ad supported revenue as a concern — and note the word concern. This does not mean that the ads are the only revenue, nor that they are profitable or even cover the associated costs — this simply means that it costs money to deliver content, and this cost is at least partially offset through the display of ads.

## Example 1 – A Publishing Oriented Website

A publishing oriented website wanted to deliver a rich user experience through an innovative and rewarding site design that featured lots of high quality images. I was brought in after the prototype was built to their original specs, and I took the prototype for a test drive to derive potential usage patterns and formulate their COGS. What I discovered was not good for their bottom line.

The original version of the site followed a fairly standard paradigm in web publishing: there were 10 articles on the front page, with each one consisting of a block of text and one or more images. What wasn’t standard was the pageload — the articles’ aggregate content and images ended up at around 1.3MB of bandwidth, and the site’s HTML, CSS, JavaScript and associated images were an additional 600k.

The site’s current revenue stream was advertising based, and there were 2 IAB standard ad blocks — that meant that every unique visit to their website cost them an initial 1.9MB in bandwidth, but only generated revenue from displaying two ad units.

Similarly, the performance for repeat users was equally disappointing. Repeat visits usually receive a discount for cached files – html pages, support files (js / css), images or flash media – however, in this instance, the site integrated new content in such a way that caching was not optimized. I calculated that someone who visits their website after an update would incur 700kb of bandwidth and only be exposed to 2 ads on their front page.

By understanding the associated costs and business interests, I was able to work with their product design team to reshape the User Experience (UX) and the site-design into something that was more cost-effective – all while maintaining a positive user experience.

– The initial pageload was dropped down to a total of 400k
– Repeat pageloads were dropped down to approximately 75k (after content update)
– ‘Full Story’ content was replaced by a mix of ‘Full Story’ with Teaser / Excerpt content — designed to entice people to ‘click for more’ and create new ad impressions through focus pages.
– Images were tiered to include previews on the front page, medium resolution photos within the focus pages, and a new class of image-focus pages that were ad supported and included the hi-res photo.
– The site was slightly redesigned and optimized to lower the size of support files to 250k; a rough cost comparison follows below
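
To put a rough dollar figure on that change, here’s a minimal sketch of delivery cost per thousand front-page views, before and after. The page weights are the ones cited above; the 50¢/GB price is just the ballpark I use later in this article, not this client’s actual rate.

```python
# Delivery cost per 1,000 pageviews for the page weights described above,
# priced at the 50 cents/GB ballpark used later in this article.
# Illustrative only -- not the client's actual bandwidth rate.
COST_PER_GB = 0.50
KB_PER_GB = 1024 * 1024

page_weights_kb = {
    "original, first visit": 1900,
    "original, repeat visit": 700,
    "redesigned, first visit": 400,
    "redesigned, repeat visit": 75,
}

for label, kb in page_weights_kb.items():
    cost_per_thousand = kb * 1000 / KB_PER_GB * COST_PER_GB
    print(f"{label}: ~${cost_per_thousand:.2f} per 1,000 front-page views")
```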

## Example 2 – A Niche Social Network with Rich Media

A pre-beta niche Social Network included the usual mix of social networking fare (profiles, friends, messages) along with some premium content (high resolution photos & videos, etc.) for their target demographic.

Social Networking projects can be difficult to balance because of the nature of the experience and the advertising that they provide — different classifications of pages generate different usage patterns, necessary bandwidth and advertising rates. As a very basic overview: Home pages, Profile pages, and published content garner significantly higher rates as they demand more of a user’s attention; “messaging” functionality, modal pages, and gaming/application pages simply don’t monetize well, as they don’t foster user attention – people tend to tune out the ads.

To address this company’s needs, I worked with the product team to achieve the following goals:

– Allowed for higher quality images and a richer user experience on the ‘Tier 1’ advertising pages (profiles and published content)
– Dropped the modal pages to the simplest html possible, while maintaining good design / UX. Migrated as much of the css / js / etc. into external files; aggressively cached and recycled this content onto the function / modal pages.
– Limited the amount of AJAX used. While it affords an amazing UX, AJAX also (typically) means “interactions without ads”. By reviewing and selectively limiting AJAX usage, we were able to drive more Ad impressions without hurting the User Experience.
– Determined ideal video sizes (in terms of bandwidth) to balance revenue goals against. Used interstitial ads and segments to handle videos in excess of this number.
– Decided which products and functionalities could be used as loss-leaders to maintain brand loyalty, and which could not be; set parameters for performance review.

Through simple techniques like those mentioned above, we were able to make the network’s operations considerably more cost-effective — making their P&L sheets more profitable.

## Example 3 – A Social Media Campaign for a Brand

Not all web properties are business entities — many are produced as Marketing / ROI campaigns for brands. Because (micro)sites like these rarely have any sort of revenue stream and exist mostly as an advertising expense for brand loyalty, they’re a bit of an anomaly in terms of streamlining costs.

In instances such as this, I work with the Creative Team (Digital / Production Agency & Advertising Agency) and the Brand’s Internal Marketing Team to clearly define a User Story that best meets the brand’s goals (ie: what is the user’s takeaway / brand impact), to list out any ROI indicators, and to define a User Story that best meets the Creative team’s user experience.

Many people assume that the brand’s Creative Team and Internal Marketing Team want the same thing — that which is best for the brand. This is really a half truth — while they are both indeed focused on what is best for the brand, the Creative Team almost always thinks about this within the scope of the current Campaign / Project, while the brand’s internal marketing team is also thinking about long term growth and how the current campaign or medium ties into other campaigns or mediums that are simultaneously being marketed to users.

When streamlining a branded social media campaign, you need to first know what the brand’s own goals are (how many people they want to reach, what is the messaging) and then what the Creative Team’s user experience goals are. You are then able to create a technical user story, map it to a clickpath, calculate the cost of each impression, and suggest optimizations that won’t derail the Creative Team’s UX and Campaign while simultaneously meeting the brand’s goals.

# Calculating Costs

To calculate costs for a web-based product, I break things down into 3 phases:

– Define Typical User Paths
– Profile User Path Elements
– Input everything into generic spreadsheet software that can handle multiple worksheets which reference one another.

## Define Typical User Paths

As a start, let’s clearly define what I mean by the term user path:

> A user path (in this context) refers to how a typical user will interact with your web property during an engagement; it is the finite list of pages they encounter in a single visit.

The word / phrase userpath can refer to many things in different contexts – this is what it refers to within the scope of this article.

Note how I said typical user. If you’re a startup founder, I may be about to break your heart…

The internet isn’t a Field of Dreams. While it’s nice to be hopeful that “If you build it, they will come” – you need to realistically know that people won’t be flocking to you in droves like you’re Kevin Costner with a mystical ballpark and the children of the corn. Users are going to trickle in, and more importantly they’re not going to have the usage patterns that you either want or expect.

If you’re reading this article, here’s a surprise — you’re not a typical internet user. (What a great self-selecting audience I have here!). You may be a “Power User”, a “Proficient” user, or some other marketing-speak styled distinction about someone who works in digital media or uses the internet as their profession. Regardless, it doesn’t matter what you *are* … it matters what you are *not*, and you are not typical nor emblematic of internet users as a whole. You think of, use and interact with websites very differently than most – and may have problems with grasping typical user patterns. Let’s try to fix that.

A realistic goal for user experience is this:
– A user will visit your website 3 times a month
– Each one of these visits includes viewing 3 pages

(it is realistic, with a good product, to achieve 5 visits of 9 pages each, but that depends entirely upon the product – these numbers are a safer starting point)

Unless you’re Facebook, Twitter, or some other “Crack App” of the moment, these numbers will apply to more than 90% of your users. I’ve rarely found more than 6% of site visitors to be ‘active’ users for the measured period — and keep in mind that even the most successful web services tout their active memberships to be only about 25% of their registration base.

Fanatical users are a limited bunch — and while your application may be designed for their experience, most of your web traffic will come from casual users.

As a rule of thumb, I aim for numbers that represent the 80th percentile of users — ie, the common experience shared by 80% of your users.

It’s really hard believing numbers like this. I still don’t entirely… but I’m “in the industry” and think of technology differently than most others. I cannot imagine not checking into any of my major apps a few times a day — and I’m a “member” of dozens more websites than non-industry friends. But I also admit that I too have that old Friendster account that I use for 10 minutes a few times a year, a Perlmonks account that gets me for a few 3-page visits when troubleshooting code each month, and I even jump onto MySpace for a few minutes every now and then. At times I’ve been a fanatical user of each of those services — now I barely think of them in passing.

Truth be told, that’s how most people are… and for every person that reloads your website 100 times in a single day, there are dozens that don’t get past the first page.

If you want hard data on this, I suggest the following:

1. Look at your own server logs
2. Look at any Quantcast / Compete / Alexa score

You’ll see that the numbers come out the same. You’ll likely also see another characterization — I’ve found that there is a sharp transition between the casual user and the diehard user. It’s fairly amazing… people either love your product or they are ambivalent.

Getting back to the userpath…

With realistic numbers in mind, you need to create some sample clickpaths that entice those 80% of your users to interact with your site. If you’re feeling fancy, make another userpath for the powerusers that you hope to attract. Generating the userpaths is dead-simple — you just click around your site, assuming that you visited the index page, or got to it from a search engine / ad — and try to figure out what users are likely to click on.

Even better — if you’ve already launched (or just don’t want to estimate / forecast these numbers), you can install some tracking & analytics packages then generate a report based on real data. This is my preferred method, however, it cannot be done before there is a beta — and this sort of work is really beneficial when you’re developing an initial product. For pre-launch products, I use information from sites like Alexa, Compete and Quantcast to determine usage patterns of similar-in-functionality or competitive landscape websites.

Userpaths aren’t very complex — here’s one from my publishing example above:

Publishing site userpath:

1. User visits homepage
2. Clicks on Article Focus
3. Clicks on Video Focus
4. Clicks on Related Content button

## Profile User Path Elements

You can profile User Path elements relatively simply with a wordprocessing / spreadsheet program and a browser like Safari or Firefox. Note: Firebug in Firefox and “developer mode” in Safari make this much easier.

For every page in your userpath, you should create a new page / worksheet to include the page elements. I’ll break them down into 5 categories:

– The actual Page HTML
– Included Images
– External JS Files
– External CSS Files
– Rich Media (flash, video, etc.)

I’ll also break down each category into usage classifications:

– is only used on page
– is used on some pages (ie: section pages)
– is used throughout the site

The reason why we classify the elements is that it allows us to understand and to forecast caching for re-usability.

## The Spreadsheet

I use the same basic setup for every spreadsheet that I build: one sheet for the userpath, which references a sheet for each page in the userpath.

The userpath sheet really just contains a listing of all the different pages in a userpath, along with some functions / calculations. We include references to data contained in sheets that are dedicated to breakdowns of each actual page in the userpath. Each page in the userpath gets its own sheet to handle a breakdown of all the content contained within.

### Sample Sheet: Userpath

This is what a Userpath worksheet in my spreadsheet software usually looks like:

| Page Type | Delivery Bandwidth (per page) | Impressions (per Gigabyte) | CPM @ 50¢ | # of pages in visit | Delivery bandwidth per visit |
|---|---|---|---|---|---|
| Index Page | (ref) | (func) | (func) | ? | (func) |
| Article Page | (ref) | (func) | (func) | ? | (func) |
| Focus Image Page | (ref) | (func) | (func) | ? | (func) |
| Focus Video Page | (ref) | (func) | (func) | ? | (func) |

This userpath sheet lists the different ‘classes’ of pages that the users will encounter, along with their calculated bandwidth and the number of times each page is encountered — per-visit. This turns an almost intangible user experience into a graphed representation.

Here is what you need to know about the specific columns:

– we reference the ‘Delivery Bandwidth’ per page type from its own dedicated sheet (see below)
– we calculate the ‘Impressions per Gigabyte’ – or how many impressions of this page fit in 1 gigabyte of data
– we calculate the CPM for delivery with bandwidth priced at 50¢ / Gigabyte – how many impressions can I have if bandwidth costs 50¢ / gigabyte
– we have # of pages per visit as a number. We can quickly change this when trying to optimize new paths. This is traffic forecasting. Some pages might only get 1 view, while others receive 20 views.
– we calculate the amount of bandwidth used for this pagetype per delivery

I typically compute a few more values on this page too:

– Subtotal for all bandwidth per visit
– how many site visits you can get per gigabyte
– how many site visits you can get at different CPM levels

The userpath page acts as both an index and as a conclusion — mapping out many referenced sheets and summarizing them as well. The only data entered / manipulated on this page is the number of pages per visit — everything else is calculated based on data referenced from other sheets.

The use of many references makes this a living document. As the site progresses, or as I start optimizing (by altering planned site traffic and directing users onto different types of pages to consume the premium content, or by adjusting the bandwidth on each page), the underlying costs and associated income are quickly measured. This approach makes it easy not only to gauge performance, but also to set goals and to devise new ways to meet these goals.
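
For readers who think in code rather than spreadsheets, here’s a minimal sketch of the same roll-up. Every page weight and pages-per-visit number below is an invented placeholder; only the calculations (impressions per GB, delivery CPM at 50¢/GB, bandwidth per visit, visits per GB) mirror the worksheet described above.

```python
# A minimal sketch of the Userpath worksheet logic. Page weights and
# pages-per-visit figures are invented placeholders; only the structure
# mirrors the worksheet described above.
KB_PER_GB = 1024 * 1024
COST_PER_GB = 0.50

# page type: (delivery bandwidth per page in KB, pages per visit)
userpath = {
    "Index Page":       (400, 1),
    "Article Page":     (250, 2),
    "Focus Image Page": (600, 1),
    "Focus Video Page": (4000, 1),
}

total_kb_per_visit = 0
for page, (kb, views) in userpath.items():
    impressions_per_gb = KB_PER_GB / kb
    delivery_cpm = 1000 * kb / KB_PER_GB * COST_PER_GB  # cost to serve 1,000 views
    total_kb_per_visit += kb * views
    print(f"{page}: {impressions_per_gb:,.0f} impressions/GB, "
          f"delivery CPM ${delivery_cpm:.2f}, {views} page(s) per visit")

print(f"bandwidth per visit: {total_kb_per_visit} KB")
print(f"visits per GB: {KB_PER_GB / total_kb_per_visit:,.1f}")
print(f"delivery cost per 1,000 visits: "
      f"${1000 * total_kb_per_visit / KB_PER_GB * COST_PER_GB:.2f}")
```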

### Sample Sheet: Page Type

For every classification of page, I create a worksheet that reads a little something like this:

| Element | kilobytes | re-use discount | total cost (in K) |
|---|---|---|---|
| HTML Size | ? | 1 | (func) |
| Images Used Across Site | ? | .35 | (func) |
| Images Used On Parts Of Site | ? | .7 | (func) |
| Images Specific To Page | ? | .95 | (func) |
| JS Used Across Site | ? | .35 | (func) |
| JS Used On Parts Of Site | ? | .7 | (func) |
| JS Specific To Page | ? | .95 | (func) |
| CSS Used Across Site | ? | .35 | (func) |
| CSS Used On Parts Of Site | ? | .7 | (func) |
| CSS Specific To Page | ? | .95 | (func) |
| Flash Used Across Site | ? | .35 | (func) |
| Flash Used On Parts Of Site | ? | .7 | (func) |
| Flash Specific To Page | ? | .95 | (func) |
| Size of Video File / Stream | ? | 1 | (func) |

Huh?!? You ask. This looks confusing! Well, that’s because it is! Unless, of course, it isn’t! Or maybe it is? N’est-ce pas?

The above chart is really just a template that I use for every classification of a webpage in a userpath.

Classifications are pages that are ‘similar’ – either built off of the same general template, or that have similar size & function on a sitemap. In publishing sites, you would have classifications such as: Index, Archives, Entries, Entry-Photos, Entry-Videos, Ancillary; in social networking sites, classifications would be: Dashboards, Profiles, Modal / Messaging, Ancillary, etc.

For each page classification, I break down the client-downloaded-assets into specific components: the actual HTML, external images, js, css, and rich media. I then classify each of those components by their usage type: items that are loaded only for that page, ones that are used throughout the page-class, or ones used site-wide.

By tallying up each type of site content, and classifying its usage type, we are able to get an idea of what is actually sent to the browser each time. Not to mention we get a good reference on how to better optimize the site further down the road.

Let’s think of a real-world example: imagine a publishing site where you read an article about a celebrity and view a couple of photos of said celebrity at a press function. On this ‘Article’ classification, we can think of page assets as such:

– Used Across Site

The main css / js files; the site design template images, etc.

– Used on Parts of Site

‘Section’ images on the site design – like ‘active’ status on the nav, in-site promotions, category specific design files, any special js / css used for this template, etc.

– Page Specific

The HTML, the photos of the celebrity appearing only on this page

For those unfamiliar with caching, here’s a quick overview: every time a user visits this website, they download the css / js files unless there is an active cache. Once that file has been cached, it remains until some sort of expiry (closing the browser, waiting 10 months, waiting until the file has changed on the server, etc.) and will not download again. If a user clicks on the index page, then a category page, and then two discrete articles, they will only download the uniquely different files one time each.

In order to get these numbers you can do one of two things:

1. For an active site, you can use server logs and analysis software (or Regular Expressions and some clever scripting) to get accurate numbers for your 80th percentile
2. Safari and Firefox both have features that report the media on a page, along with their size and URI (Uniform Resource Identifier). These reports can be coerced into a spreadsheet page with some light scripting or a bit of “copy and find / replace fun” — a rough sketch of that scripting follows this list. Using these tools, it becomes fairly simple to figure out what the file loads are, and where the current and potential re-usabilities lie.
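
Here’s one possible take on that “light scripting” — a sketch that totals a page’s transferred bytes by content type from a HAR file exported via the browser’s developer tools. HAR export is a newer convenience than the Firebug / Safari reports described above, and the filename is a placeholder.

```python
# Totals a page load's transferred bytes by content type, from a HAR file
# exported via the browser's developer tools. The filename is a placeholder;
# the end result -- numbers you can paste into a worksheet -- is the point.
import json
from collections import defaultdict

with open("pageload.har") as f:        # placeholder filename
    entries = json.load(f)["log"]["entries"]

totals_by_type = defaultdict(int)
for entry in entries:
    mime = entry["response"]["content"].get("mimeType", "unknown").split(";")[0]
    size = entry["response"].get("bodySize", 0)
    if size > 0:                       # bodySize can be -1 when unknown
        totals_by_type[mime] += size

for mime, size in sorted(totals_by_type.items(), key=lambda kv: -kv[1]):
    print(f"{mime}: {size / 1024:.0f} KB")
print(f"total: {sum(totals_by_type.values()) / 1024:.0f} KB")
```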

To account for caching, I use a discount rate — which I have shown in the above example as 35, 70 and 95. This means that *on average* these items are only served from our server on 35%, 70% and 95% of the requests. Note how I use the concept of served – the numbers that we care about are how many times the content is fetched from our servers, not needed. Users have their own caches running locally on their browser or machines, networks / ISPs and offices often have shared caches too – so there are many times when the browser may request a file, but that request is served by a server / service that we don’t pay for.
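
To see how those serve rates translate into billable bandwidth, here’s a minimal sketch for one page classification. The element sizes are numbers I made up; only the .35 / .70 / .95 buckets come from the template above.

```python
# Applying the re-use discounts from the template above: the expected billable
# size of each element bucket is its raw size times how often it is actually
# served from our servers (rather than from a cache we don't pay for).
# Element sizes are invented placeholders; the serve rates are the article's.
article_page = [
    # (element bucket,               size in KB, serve rate)
    ("HTML",                             40, 1.00),
    ("Images used across site",         120, 0.35),
    ("Images used on parts of site",     80, 0.70),
    ("Images specific to page",         350, 0.95),
    ("JS used across site",             150, 0.35),
    ("CSS used across site",             60, 0.35),
]

raw_kb = sum(size for _, size, _ in article_page)
billable_kb = sum(size * rate for _, size, rate in article_page)
print(f"raw page weight: {raw_kb} KB")
print(f"expected billable delivery: {billable_kb:.0f} KB per view")
```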

# A Fun Note On: Bandwidth

A lot of people wonder “How much should bandwidth cost?”

While the answer is always a joking “as little as possible”, the actual question is a bit more difficult to answer.

Every service provider charges for bandwidth using a different price, and a different method.

It’s not uncommon to see one hosting company charge 10¢ for a gigabyte of bandwidth, while another charges $3 or more per gigabyte.

Bandwidth is also charged by different units.

Managed hosting companies (they provide hardware and you pay monthly for exclusive or shared usage) tend to charge in terms of (giga)bytes — or for each actual bit of data transferred. Co-location facilities (you provide hardware and they provide a place to store it) tend to charge in bits (mega) — in which you pay for the size of the ‘pipe’ needed to deliver your content 95% of the time. Content Delivery Networks (CDNs) use a mix of the two; it’s a crap shoot to guess. Some firms also charge connection or per-request fees for clients accessing the data.

I personally like to price things out at the exact vendor costs that I am using at the time, but when projecting ‘expected’ expenses, I like to use 50¢/GB as a good ballpark. It’s the midpoint of most companies that I work with – across data-centers, hosting firms, and CDNS – so I am able to easily obtain a figure in the range in which my expenses are likely to run.

I also like to convert bandwidth to Gigabytes for simplicity. That’s how I serve content, think about storage and handle CPMs. Call me lazy — but it’s easier to standardize all of my bandwidth oriented concerns using a single unit of measurement.

So how do you standardize units?

## Quick Reminder: Bits and Bytes

– there are 8 bits in a byte
– when abbreviated, a lowercase b means bit and an uppercase B means byte.

## Quick analogies for pricing models:

GigaByte based pricing – this is like the water company charging you each month based on the total number of gallons you consume.

Connection-Fee charges – In addition to the per-gallon charges, you’re also charged a fee every time you turn the faucet on.

Throughput charges – Instead of billing you for the water, you’re billed for the size of the pipe your house needs to receive all its water.

## About : Throughput based pricing

Many companies charge bandwidth based on throughput — the rate at which bits are delivered. Your throughput is measured at the datacenter every 5 minutes or so, and logged to a database. At the end of the billing cycle, they use something called 95th percentile billing — they drop off the top 5% of usage spikes and charge you for the size of the pipe needed to deliver the rest.

If you do a bit of math, you’ll learn that a 1 Megabit connection has the theoretical potential to deliver about 320 gigabytes a month. That’s at full-speed nonstop, which is unrealistic. In practice, you’re likely to see a 1 Megabit connection drive somewhere between 150 and 200 GigaBytes a month between standard usage and the periodic bursts. So when you need to convert the monthly cost of a Megabits/second pipe to GigaBytes, use 175.
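
If you want to check that arithmetic yourself, here’s the conversion spelled out (the 175 planning figure is my rule of thumb from above, not a law of physics):

```python
# Converting a 1 Megabit/second pipe into GigaBytes per month.
# 1 Mbps = 1,000,000 bits/second; 8 bits per byte; ~30-day month.
bits_per_second = 1_000_000
seconds_per_month = 60 * 60 * 24 * 30

theoretical_gb_per_month = bits_per_second / 8 * seconds_per_month / 1_000_000_000
print(f"theoretical ceiling: ~{theoretical_gb_per_month:.0f} GB/month")  # ~324 GB

# Pipes are never saturated around the clock; realistically 1 Mbps drives
# somewhere between 150 and 200 GB/month, so plan with 175.
planning_gb_per_month = 175
print(f"planning figure: ~{planning_gb_per_month} GB/month per Mbps")
```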

# A Note On: Optimizing

Once you are able to estimate your initial page bandwidth, and use a spreadsheet full of related cells, you’ll have a good tool to utilize when setting goals and working to achieve them:

– Your page class sheets will give you a good idea of what needs to be optimized, and what does not.
– Your userpath sheet can have page-visits for each item altered to show different ways of herding users
– You can make multiple workbooks, userpath sheets or page-class sheets to test out different concepts.

My personal tips:

– I often clone a page and experiment with how I can better leverage cached images on the site to try and lower the bandwidth cost
– On my userpath sheet, I’ll often have multiple user paths that each look like that first table. They’ll reference different pages, possibly / often alternatives to existing pages, and also have different numbers for pages-per-visit, as I alter different potential user experiences.
– On ad supported websites, on the main P&L sheet, I group these bandwidth costs with the costs for the server infrastructure that supports them AND the ad revenue generated since they all affect one another. I generally leave the labor out of this equation and keep it with the corporate labor costs.

# A Note On: Advertising

I use the phrase “Ad supported” quite a bit, so I feel the need to remind you about the types of advertising revenue that you can realistically expect to generate. Realistically is the key word.

If you’re a niche site and you have a good team driving your company, you could be aiming for a $25 effective page-CPM (the aggregate of all ads on the page) when you are functioning at your peak. While you may command $75-100 as an ideal page-CPM rate, after you deal with sales and inventory concerns (you’re not selling 100% of your adspace at the numbers you command) your average will be around $25 — if you’re lucky.

There are plenty of other costs associated with running your company too – you have to pay for content production, licensing fees, your office space, lawyers, accountants, marketing, etc etc etc. Making your ad supported operations as efficient as possible is really important, because you not only have to cover the costs of delivery — but of production, overhead, and then hopefully profit as well.

Here’s another startup reality check:

For all other sites — and especially for niche sites on non-publisher content pages — you should aim for a $2 CPM and feel happy if you get that. You should be ecstatic if you make anything more. Even with the most advanced targeting and tracking, you might bump up to $4-8 if you’re very lucky. The industry standard right now is well under $2 for ads that very few people notice on websites that very few people care about. Unless you have the largest PR initiative behind you, that is going to be your web project.

Many large social networks are selling overflow / non-targeted CPMs in the 10¢ to 75¢ range, and only commanding more money from their less-used advanced targeting buys or site sponsorships. The more targeting that you either have or push for, the more infrastructure that you need on your side — either specialists to handle sales, or technologists to implement 3rd party tools. So while it is possible to make more, you will absolutely need to spend more in order to do so.

In essence, you will not get a $25 CPM out of the gate for premium content pages, and you will likely be making under $2 CPMs for everything else.

# A Note On: Application Priorities

I noted before that 90% or more of your users will likely be casual — accounting for about 3 visits of 3 pages each per month. What I should also add is that it is not uncommon to see 70% or more of a site’s traffic driven by one or two URLs on your site — such as the homepage, a profile/dashboard page, or a latest-news page.

It’s important to use analytics tools to figure out exactly what these pages are, and be extremely aggressive in your optimization and monetization strategies — performance on these pages can make or break your business.

It’s also important to identify these pages for prioritization in your Product Management and Design. These pages can drive most of your revenue, and set the tone for your user experience – leading to stronger connections with your userbase. You will often want to prioritize the integration of new user and management features into these pages.

You’ll also want to keep constant metrics on your application’s usage — your userbase demographics will change as you scale, and users often shift their own patterns as your competitive landscape changes.

A company I recently worked with realized that 72% of their traffic was coming from 3 pages that shared the same admin tool — despite earlier patterns of heavy usage on another part of the site. This was due both to a 20x scaling in users from their startup to production phases, and to the introductions and failures of other players in their space. By keeping tabs on their metrics, they were able to fine-tune their product. Instead of spending their finite cash and resources to enrich the overall user experience throughout the site, they decided to focus on streamlining the management operations for the new high-traffic pages. The result? Those pages became even more popular with users and cheaper to maintain — providing increased revenue to overhaul the rest of the site, and a larger/more loyal userbase to advertise the new features to.

# A Note On: The Numbers

This is not an exact science.

Nothing in this article is exact. Many of the figures are based on forecasting usage patterns. When actual server logs are used to show actual user / download patterns, their numbers vary — and are accurate for their site and their site alone.

The point of this exercise and skill is to determine the likely *average* usage patterns, and to optimize a web service to meet them.

These figures also do not take into account a lot of VERY important ideas:

– You may need to scale servers — add additional servers to offset CPU and memory stress. Each additional server can cost between $20 and $1,000 a month to operate
– This does not include any sort of labor costs — no engineers, content producers, management, or anything of the like. You will often need to scale your labor to accommodate more consumers.
– There are tens, hundreds, thousands of obvious things that I’ve omitted from this overview

# A Note On: Startups – Being Scared (or Not) and Entrepreneurial Spirit; or How much should you budget?

I wrote this article to talk about concerns for companies that are focused on generating revenue through web applications. I did not write this article to talk about making cool and kick-ass webapps.

If you’re trying to build a business – one with employees and healthcare and offices and a water cooler – you need to be thinking operationally and about income streams. You can have the world’s greatest webapp, but if it can’t support itself or its staff — you just don’t have a business, you have a hobby.

There’s nothing wrong with that. Not everyone is trying to build a company at the outset. If you’re a startup person, there are some good – and proven – ideas in this article that can help you cut costs and be more efficient. Are you trying to build the next kick-ass webapp for fun or in your spare time? If so, you can forget about 90% of the things that I’ve written about in this article — your application can probably scale to ~100,000 users without you having to worry about bills.

My friend Rick Webb recently blogged about Dodgeball.com’s demise — their product was great, the users were loyal, and the costs were low — even he could cover them himself. But the company wasn’t able to be financially self-sufficient, so the company that acquired them shut it down. This is probably a bad example, and has more to do with the parent company being inept in its handling of their acquisition, its technologies / capabilities – and most importantly its team.

If you’re building something like Dodgeball just to build it, then do it. Don’t look back. Have fun, learn some things, and maybe turn it into a company down the road when you meet the right Sales and Operations team. But if you are a company, or you have a technology that you’re trying to turn into one — think for a moment on how you can be profitable, and if you can even achieve that.

I lost a lot of money with my startup FindMeOn, and I learned some tough lessons: We focused on product development when we should have focused on business development; and we focused on business development when we should have focused on product development. It sounds like a Catch-22, but it’s a simple idea — much like a seesaw on a playground, you need to find a balance between what you want your product to become and what you need for it to be a source of actual income. So instead of capitalizing on the technology we invented to integrate social networks and port data, I found myself pulled every which way — trying to make the product better and to create revenue streams to keep things moving forward, all at once. Small teams can’t handle that successfully; you need focus. The result was that we stopped work on the product, and just amassed an army of lawyers to handle a portfolio of patents.

If I knew in 2005 what I know now, things would have worked out very differently.

When Rick first gave me feedback on this, he recommended that I address the ballpark costs involved in covering a startup – so I wanted to conclude on that note.

I thought about the issue for some time, and came to my final thought: Getting a startup off the ground will cost you everything you’re willing to put into it; nothing less, nothing more. If you care about your business, you’ll find a way to make things work, or fail trying.

For example: If you are a small firm, or running a beta product, you’ll easily find a host that gives you ‘unmetered’ bandwidth until you serve 500GB for well under $50 a month, and not much more after that. If you’re a large firm with an inept IT department, you’ll get the same package for more than $500/month from a ‘premier’ host, one that also charges you $5/GB for bandwidth over your allocation.

Technology and service options have grown so much in the past 4 years that you can find ways to make anything work. In 2005, we spent nine weeks coding a scalable image and document serving / storage system from scratch; in 2006, it was rendered completely useless by Amazon’s S3, which handled not only serving and storage, but backup as well. Their solution wasn’t just better than ours; it was far cheaper even before adding in all the engineering, maintenance, and hardware that we no longer needed.

Of course, many people think that Amazon charged “too much” to serve images and was unreliable in general. Panther Express came out of nowhere (founded by the original DoubleClick team) and offered CDN services for the masses, targeting disgruntled Amazon customers as one of their demographics. Panther quickly rose to prominence, bringing an enterprise-level CDN, akin to Akamai, to the masses at a much better price point and integration cost. Today there are also companies like Slicehost that offer virtual servers for $20/month, versus the $200/month that similarly specced dedicated hardware cost not too long ago.

So if you’re a startup: where there’s a will, there’s a way. Covering your fixed costs will almost never be an issue. If you do ever have this problem, you’re probably wildly successful and in prime position to quickly receive funding or a selloff based on your userbase alone, and you can worry about becoming profitable then. (On the flip side, you might also have an app that wastes bandwidth or is not realistically profitable, and you should rethink your product and approach.)

# A Final Note

Just to illustrate how crazy the numbers can be: in a single month, friends who run a design firm spent $7,000 on their web hosting charges, while other friends who run a global advertising network spent under $2,000. I’ve seen amazing amounts of traffic served on $20 virtual servers and on $150 dedicated servers.

If you are a startup and you want to build an application, just find a way to do it. It exists. I assure you. The costs of scaling your product are negligible; the only substantial costs will be your own labor and time. If you get to the point that you need to worry about cost optimization, you already have a VC check in your hands.

If you are a startup trying to be profitable / self-sufficient from day one (and there are a lot of you out there now), your tech costs should be well under $1,000 a month if you have under 100,000 active users. If you have more than that, optimize your application and get an advertising specialist or a new VP of business development in your office, because you should be making money. You need to think about these numbers not for daily operations, but for making yourself look profitable for investment or acquisition.

And finally, if you are a brand thinking about running a digital project, think of your scaling costs in terms of CPMs: how much are you willing to pay per user impression for the visit? You’ll probably have to use some sort of ‘premier’ web host and CDN, which will cost between $1,000 and $3,000 in setup and monthly fees alone. Beyond that, you’ll be paying significantly less per CPM to maintain a good relationship with your existing customers than you would in advertising CPMs to attract new ones.
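To make that CPM framing concrete, here is a minimal back-of-envelope sketch in Python. The dollar figure and traffic number are purely illustrative assumptions (a hypothetical $2,000/month in host and CDN fees against 1.5 million impressions), not quotes from any vendor:

```python
# Back-of-envelope only: translate a monthly infrastructure bill into a CPM
# (cost per thousand impressions) so it can be compared against advertising CPMs.
# All dollar figures and traffic numbers below are hypothetical assumptions.

def infrastructure_cpm(monthly_cost_usd: float, monthly_impressions: int) -> float:
    """Return the serving cost per 1,000 impressions."""
    return monthly_cost_usd / (monthly_impressions / 1000.0)

if __name__ == "__main__":
    hosting_and_cdn = 2000.00   # assumed 'premier' host + CDN fees for the month
    impressions = 1_500_000     # assumed page views served that month

    cpm = infrastructure_cpm(hosting_and_cdn, impressions)
    print(f"Serving CPM: ${cpm:.2f} per 1,000 impressions")  # -> $1.33
```

In that hypothetical scenario you are paying roughly $1.33 to serve a thousand impressions to existing visitors, a figure you can weigh directly against the advertising CPMs you pay to attract new ones.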

Bruce Schneier should be Obama's CTO

Everyone likes dropping names on who Obama should hire as the ‘National CTO’. Here’s mine: Bruce Schneier should be Obama’s CTO. He’s one of the few people in the country who doesn’t just know technology, but fundamentally understands it, and, more importantly, its implications.

A lot of great names have been thrown around, but they make little sense to me.

The most popular name right now is Vint Cerf. He may be the father of the internet, but his skills and qualifications are in telecommunications. While telecom is integral to technology programs, it’s really just a component. I think he’d be a great national SysAdmin or tech guru, but CTO?

A lot of people say Bill Gates. While I do admire him, he’s a software person. More importantly, would he really be able to divorce himself from his allegiance to, and interests in, Microsoft?

The same thing goes for Eric Schmidt of Google; despite my opposition to them in light of our Patent/Trademark issues, would the head of an internet company be appropriate? And could he really be neutral, in light of the massive lobbying activities Google has undertaken on everything from advertising to M&A to the radio spectrum?

Jerry Yang could probably do a great job of making people use Change.Gov every day, but is that what a CTO is about?

Fundamentally, CTO is a management position, one that seeks to ensure that the technology planning and infrastructure best deliver on the business goals. A high-profile CEO from a tech firm, or a CTO from a web firm, may not have those qualifications.

The CTO would likely be overseeing:

– Long range technology planning
– The integration of national departments (i.e. information sharing between the FBI, NSA, CIA, DHS, etc.)
– The likely digitization of medical records
– The shift to, and implications of, new energy resources
– The role of technology in education, communications, and even national defense
– the list goes on

The only person I can think of who could handle that is Bruce Schneier. He’s not famous for running a huge company, inventing a specific technology, or having his name in the news; he’s famous for being a security expert, and one who is right. Instead of being a specialist in specific areas of technology, he’s someone who understands how different pieces of technology and systems are all interwoven, how they create a grid and affect one another; and he’s known for (often) brutal attacks on bad implementations.

That’s the kind of technology leader we *need* in government; I can only hope that, whomever Obama ultimately chooses, it’s the kind that we get.

How to find out about Apple product releases in advance.

Some friends have been clamoring lately about Apple ‘rumors’ regarding product lines and release dates for the Macworld expo. Everyone is trying to get the inside scoop, trying to get Apple employees and Market Research Firm analysts to spill the beans.

They’re wasting their time.

History has proven that Apple manages product knowledge amongst its employees incredibly well: most people there can’t leak info simply because it’s provided on a ‘need to know’ basis until they actually need to know. That means that most employees don’t know about a product until it’s in the stores. The ‘Market Research Firm Analysts’ just make crazy guesses based on PR releases from suppliers. They’re often wrong.

There is a class of people who do know what’s going on, though – and if you want to find out about Apple, Microsoft, Intel, or other tech-company products, you should be turning to them. They’re called Portfolio Managers for Hedge Funds and Private Equity firms.

Every quarter, like clockwork, the top brass of tech companies fly to the NYC area and make a round of meetings to talk about, and sometimes even demo, forthcoming super-secret products to the people who manage institutional investing.

This isn’t shown to the market/research analysts at the firms; it’s not shown to the ‘industry experts’ who rate the stocks; it’s shown to the people who manage portfolio sections and do trades in bulk.

I’ve found this practice odd ever since I heard about it a few years ago. Despite my existing opinions of the stock market, I fail to see how behavior like this differs from the universally panned ‘insider trading’. Nevertheless, this is the Status Quo on Wall Street, and everyone who is a couple of degrees away from people in institutional investing knows about it.

So if you want to predict or guess about tech company announcements, forget looking at PR releases from people in the supply chain, and start looking at the movement and grouping of stocks by institutional investors. Maybe even ask around, and see if someone there will spill the beans; whatever does get released next week, they’ve already seen it, three months ago.