Archive for the ‘taxonomy’ Category

“Categorizing is necessary for humans. But it becomes pathological when the category is seen as definitive, preventing people from considering the fuzziness of boundaries, let alone revising these categories.” — Nassim Nicholas Taleb

We’ve come to know where we live as much by local food movements as by our schools and property values. In the last decade “buy local” has come to mean that we’re creating environmental sustainability, healthier diets, and more close-knit communities by eating the food planted by the farmers who work the same lands that support us and our grassroots economies.

Can the same be said for information, really? If we limit our inputs to what’s nearby, aren’t we limiting our perspectives? If we shop locally for our news, how can we generalize the broader forest from the specific trees? Our collective self-interests stick close to home. Aren’t we further narrowing our own focus by forsaking the interdependencies and complexities that can only form by holding our own hides up to a more inclusive global perspective?

Localizing information sounds like an open invitation to what Taleb calls “the contagion,” the herd mentality that traps independent thinkers into parroting the same parochial mindsets:

“The process of having these people report in lockstep caused the dimensionality of the opinion – they converged on opinions and used the same items as causes.”

That uniformity of perspective-taking is not so much about one’s sourcing as it is about reporting; for our purposes this is the 24/7 news cycle. This formula is not set to increase to a 25/8 cycle no matter how dense the news flow, how rich the implications, or how clued in the recipients. Rather Taleb is suggesting a need to defy the pattern by tabling judgments; a need to ignore loops expecting to be closed in time to declare some daily distortion, fueled by the need for definitive outcomes. Filling air time is one thing. But we confuse it with filling our mental shopping carts with all the evidence we need to decide …

  • guilt or innocence
  • one party over another
  • winners and losers

None of this defines local information as a movement — a force for good — or even as food for thought. Localizing the information we’re fed means sourcing our news providers well enough to know their locales and to see through their own self-referential conceits, blinders, and potential conflicts of interest. Until we know where a fact was selected, when an interview was granted, or who took the time to file a FOIA request, we will be taking our information sources on the same blind faith that poisons us on factory beef and processed food.

Whether our informants are networks or neighbors, we need to know the company they keep before we can build the same independent perspective we insist on from our news providers. The leading bias is self-selection. Nature abhors a vacuum. Talk may be cheap and free speech may prove expensive. Vacuums are pure legend to the media, which will never acknowledge the existence of one. Still, that doesn’t obviate our need as researchers to cultivate a balanced media diet.

Localizing the intentions of our news providers is one place to start.

"When you use more than 5% of your brain, you don't want to be on Earth."

I’ve probably teared up more at an unfair hockey fight and I’ve had more emotionally engulfing moviegoing experiences. But as far as a life philosophy that plays out on screen, no self-contained cinematic mythology holds my candles quite like Albert Brooks’ Defending Your Life.

For most of the story Brooks’ day of judgment is about to play out in the purgatorial trappings of a Disneyesque lodging and office complex. Is the protagonist to advance on the eternal enlightenment path to some higher plane? Will he shuffle back on the next tram for a return date with “the little brains” on earth? That swipe at us live inhabitants is a line delivered by Rip Torn, Brooks’ defense attorney, who testifies to a 53% utilization of his own cranial capacity. Us little brains use 2-3% — the remainder of our mind-shafts are crowded out by lethargy and fear.

When One Framework is Worth a Thousand Taxonomies

I’ve wondered what fears could be confronted and ultimately shed so that I could soar, perhaps, from 2-3% up to 5-6%. In that spirit I’ve recently stumbled across a framework called Bloom’s Taxonomy. Like the defense lawyer’s slam at our low-performing mental capacities and fear-mongering, Bloom held that 80-90% of our brain function sits in the lowest realm of sense-making. He called this “knowledge.” Knowledge is accessed through the following retention portals:

remembering, memorizing, recognizing, recalling, identification, recalling information, who, what, where, when, how, describing

Kinda oafish, no? It’s deciphering 101. It’s on or it’s off. X=Y or fuggedaboutit.

The pattern-matching of keywords is not the face that launched a thousand ships but the probability gag that seated a thousand monkeys at their typewriters in order to write the great American novel or the great American Internet start-up — whichever cashes out higher. Just Ask Jeeves! These are the well-trodden grounds of that cloistered chamber you and I have come to know as web search. Its premise is still tuned to exact-match good enough-ness. That’s because we can be sold nouns even more easily than the notion that our mental blanks are being filled in by omnipotent language engineers. We frugal consumers cave to deals on things — not to actions about ideas. Nouns are the merchandise — not the verbs that help us to backorder our understanding of what we actually do with our bill of goods. Unless we’re potential suspects in a case, no one is interested in our trail ‘o stuff — unless they can sell it to us again.

The next order of mental processing is to isolate noun phrases from their predicates. That means getting the search engine to distinguish actors from their actions, reducing outcomes to a range of questions we’re ready to answer — or at least to lower our surprise should they arise. That kind of conditional logic exists in our mental reflexes whether or not we’ve had our morning shower or coffee.
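
Here’s a minimal sketch of that actor/action split using NLTK’s off-the-shelf tagger; the example query is invented and the noun/verb routing is my illustration, not any vendor’s feature:

```python
# Toy actor/action splitter: nouns approximate "actors," verbs "actions."
# Assumes NLTK is installed; model names vary slightly by NLTK version.
import nltk

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

def actors_and_actions(query):
    """Partition a query's tokens into noun-ish and verb-ish buckets."""
    tagged = nltk.pos_tag(nltk.word_tokenize(query))
    nouns = [word for word, tag in tagged if tag.startswith("NN")]
    verbs = [word for word, tag in tagged if tag.startswith("VB")]
    return nouns, verbs

print(actors_and_actions("negotiate vendor contracts before the renewal deadline"))
# something like: (['vendor', 'contracts', 'renewal', 'deadline'], ['negotiate'])
```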

It’s interesting that in the pecking order of brain function the inverted pyramid of journalism ranks somewhere in the custodial closet of the ivy-coated shrines of higher learning. Not incidentally these are the unremarkable terms on which IBM’s Watson, the question answering machine, beat its human Jeopardy contestants to the buzzer. It took a fact base so bottomless it would turn baseless in the gear shafts of the most fervently applied quiz show savant. Watson’s algorithmic swagger chewed through mounds of trivia like a smoldering ash heap of documentation fertilizer.

Elementary School My Dear Watson

The conquest prompted one of the IBM partisans to reflect in the New York Times on finding Watson more meaningful work:

“I have been in medical education for 40 years and we’re still a very memory-based curriculum,” said Dr. Herbert Chase, a professor of clinical medicine at Columbia University… “The power of Watson-like tools will cause us to reconsider what it is we want students to do.”

At the same time Watson’s next gig as a physician’s assistant raises a more immediate question: how do we humans raise our learning games to Bloom’s next levels of comprehension, application, analysis, and synthesis? How do we aid and abet the healthy transfer of knowledge between us inquiring pea brains?

Knowing a lot about an academic discipline is, at best, tangential to teaching it. Having a natural understanding of a subject can be an unnatural fit for passing that understanding along to others. Assuming that academics are better at publishing papers and attending conferences than at educating students, the question falls to the insatiable learners among us: how do we teach ourselves on a level beyond the aspirations of Watson’s parents? How do we convince supple, young minds that a healthy dose of skepticism about humans is only the first of a storehouse of rational and instinctive reasons to doubt the merits and intentions of question answering machines?

The current cover story of the Atlantic Monthly offers up Mind Versus Machine. Here science writer Brian Christian serves in the same oppositional role to the machine that Watson’s two Jeopardy adversaries played. The objective of the annual Turing Test is for AI (“artificial intelligence”) programmers to convince a sequestered panel via screen text that a machine can out-human its creator in a range of topics spanning from “celebrity gossip” to “heavy-duty philosophy.” The advice Christian was given when cramming for this contest?

“Be yourself.”

Gee, and I thought I knew how to body surf with the more cryptic sharks!

Five minutes of IM messages later Christian was crowned the winner of the Most Human Human Award — chiefly for two reasons:

  1. His dominating volleys (he’s not waiting on Alex Trebek to pounce, pry, or provoke)
  2. His insights into how the bottom-feeder knowledge spoon-fed to his AI adversary highlights natural human intelligence in the experiential realm:

One of my best friends was a barista in high school. Over the course of a day, she would make countless subtle adjustments to the espresso being made, to account for everything from the freshness of the beans to the temperature of the machine to the barometric pressure’s effect on the steam volume, meanwhile manipulating the machine with an octopus’s dexterity and bantering with all manner of customers on whatever topics came up. Then she went to college and landed her first “real” job: rigidly procedural data entry. She thought longingly back to her barista days—when her job actually made demands of her intelligence.

That’s a lesson well worth reteaching ourselves the next time we need to justify more question/answer sessions scheduled in the upper echelons of Bloom’s taxonomy.


I notice that whenever I give my S-Y-N-C talk, the note-takers reach for their pens when the discussion comes to verbs. The action-based taxonomy that I advocate is a simple and effective way to anticipate (and eliminate) some common barriers to enterprise architecture before we crash into them (a minimal sketch follows the list):

* Hair-splitting — The chances for semantic quibbling over what to call stuff are greatly reduced when things become actions. There are far fewer ways of describing a predicate than a subject. The likelihood of shared agreement increases.

* User-centric — Instead of fighting over what to call things, an action-based taxonomy helps us agree on how and why our customers draw on our content supply.

* Reporting — You can’t plot the outcomes you’re supporting (new IP, project requirements, business development) without building an architecture atop the actions needed to trigger those developments.

* 80/20 Rule — If every 80/20 rule lined up in single formation they would all be parading to the battle hymn of mother necessity: that the perfect is the enemy of the good. In our marching orders action is the most telling of all metadata elements because it reveals those deepest and most fleeting mysteries of all uncharted KM waters — who wrote this sucker and who was their intended audience? Figure out that side of the shipping manifest and: (1) you’re 80% of the way from content supply to knowledge demand; and (2) your cargo gets unpacked. Why? Because it has an identity that speaks to users.

* Disambiguation — Probably there is no greater praise for verbs than giving them the long-delayed respect they deserve for disambiguation. Next time you hear yourself mutter “use it in a sentence,” tell me the word that drives you to the home of understanding isn’t a verb. And while it may be their job, that’s no reason to overlook their vast powers of clarification.
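
For the note-takers: a minimal sketch of what an action-based record can look like. The verb set and field names below are mine, purely illustrative:

```python
# Illustrative only: a content record whose controlled vocabulary lives in
# the verbs (actions), while nouns are demoted to free-form topic tags.
from dataclasses import dataclass, field

ACTIONS = {"propose", "negotiate", "deliver", "train", "debrief"}  # hypothetical verb set

@dataclass
class ContentRecord:
    title: str
    author: str                     # who wrote this sucker
    audience: str                   # who it was written for
    action: str                     # what users do with it
    topics: list = field(default_factory=list)  # the nouns, no longer fought over

    def __post_init__(self):
        if self.action not in ACTIONS:
            raise ValueError(f"unknown action: {self.action!r}")

record = ContentRecord(
    title="Q3 pursuit debrief",
    author="jdoe",
    audience="pursuit team",
    action="debrief",
    topics=["pricing", "healthcare"],
)
```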

One of the prime points I’ll raise tomorrow at the Boston Gilbane Conference: Where Content Management Meets Social Media is the garbage-in, garbage-out notion that post-Google content production is really all about users picking through the information scraps for that one unsuspecting gift of credible, airtight, leverage-worthy understanding.
It’s about the yard sale of search results where my precious time is suddenly not billable because I’m no longer on the clock — I’m on Google and anything that turns up is free and clear or I turn it down.

The idea that a corporate intranet could become the pleasure center of a redemptive search ain’t gonna happen because of one vendor over the next or because our corporate intranets are rebranded as incubators for the soon-to-be monetized grand designs of our wannabe thought leadership elites. No one slaves over their best strategic thinking on a corporate intranet. No one honestly believes that the answer to better content is to pay a premium for it (or any fee at all).

The ultimate triumph over unlimited content in a time-sensitive world is to hook up the content pipes to the quantifiable demand for knowledge (information worth putting to use). Do that and the surplus of supply can be a blessing. Of course that means understanding what your customers need to inform their decision-making. Does that mean invasive surveys? Does that mean reading the long tail of your search logs in dubious hope that a pattern emerges (a minimal log-mining sketch follows the pointers below)? Does that even mean that your users know what they want (in advance of seeing it)?
Here are a few pragmatic pointers:

1. Do that hookup maneuver in your metadata structure. Connect your taxonomy to how people complete their work (actions) — not to some unwinnable debate about what to call things (nouns).

2. Make your search tool do the heavy lifting for users. They should not have to guess about where their next productive experience is coming from. Conversely you must be vigilant with your providers to make sure they are sensitive about where they locate their content — otherwise your users have to care too (they’re probably more interested in telling their life stories to Survey Monkey).

3. Create one dignified and significant workflow where an important milestone triggers the telling of those teachable moments that keep people like me employed as KM professionals. Maybe it’s dissecting a win-loss. Perhaps it’s an illustrious use case. Either way it’s an instructive lesson about how to model success and draw important distinctions that were not obvious before the story took place.

4. Include in the storytelling the other relevant links and deliverables that document the life of the project in question. That’s how to grow the content base in step with the knowledge deficits you’re trying to balance.
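
On that search-log question, here’s roughly what reading the long tail can reduce to; the one-query-per-line log format and file name are assumptions for illustration:

```python
# Illustrative demand-mining over a search log; the log format and file
# name are invented, not anyone's real data.
from collections import Counter

def demand_profile(queries, top_n=10):
    """Tally normalized queries and return the most frequent ones."""
    counts = Counter(q.strip().lower() for q in queries if q.strip())
    return counts.most_common(top_n)

with open("search_queries.log", encoding="utf-8") as log:  # hypothetical file
    for query, hits in demand_profile(log):
        print(f"{hits:5d}  {query}")
# The head of this list is known demand; the singletons at the tail are
# where unmet needs (and vocabulary mismatches) tend to hide.
```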

Yesterday I attended Laurie Damianos’s discussion at the Boston KM Forum (“Tag Me — Social Bookmarking in the Enterprise”). I had the good fortune of meeting Laurie first at last spring’s Enterprise Search Summit.

I found that my number of questions for Laurie has increased since our last interview for the Provider Base piece set to go in the Nov/Dec Searcher. That increase is not because Laurie dodges good questions. It is inspired by the topic — the richness of the subject itself.

To her credit as a speaker Laurie led an attentive and engaging group whose inputs were both numerous and broadly distributed. Here are some of the more engrossing threads of our dynamic session:

Life in Email —

The immediate remedy at Mitre began as the antidote to a ton of email sitting on some restricted fileserver archive. Increasing access points to content was the business case. A persuasive case was made that there was an over-reliance on 1:1 communication (email) whose knowledge might prove useful to others. Interestingly, the Mitre approach includes bookmarking email messages based on the links embedded in them.

Anatomy of a Tag —

Do the users make up their own terms? Apparently they have the choice between a pre-formed set of suggested tags and their own. The form includes the original bookmarker and others who’ve bookmarked the same entries. Laurie refers to the comment feature as “a reverse blog.”

Links to Nowhere —

Pointers need owners or the link goes stale. The broken-link icon shows the benefits of a link scan process that tests for 404 errors. Each owner is notified of the broken bookmarks they own, which they can choose to ignore, fix, or delete. Hovering over a padlock tells the user how to pick the lock (i.e. what password to use or group to contact). Users can mouse over faces and get lots of detail at a glance. When owners leave the company, their residual bookmarks are placed in a separate account — they can be copied for 90 days or left to expire.
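
A minimal sketch of that link scan idea, assuming a HEAD request per bookmark and an invented owner-to-links store (Mitre’s actual process is surely richer):

```python
# Sketch of a bookmark link scan: HEAD each URL, flag 404s, queue a note
# to the owner. Standard library only; a real scanner would batch and throttle.
import urllib.request
import urllib.error

def is_broken(url, timeout=5.0):
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status >= 400
    except urllib.error.HTTPError as err:
        return err.code == 404          # broken-link icon territory
    except OSError:
        return True                     # unreachable counts as stale too

bookmarks = {"jdoe": ["https://example.com/report"]}  # owner -> links (made up)
for owner, links in bookmarks.items():
    stale = [url for url in links if is_broken(url)]
    if stale:
        print(f"notify {owner}: ignore, fix, or delete {stale}")
```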

Social Tags (Supply) and Search Terms (Demand) —

The terms that bubble to the top of the results pages are not represented in Mitre’s subject taxonomy. According to Laurie the taxonomy is growing … slowly. The search team is creating equivalencies between search terms and tags. Governance rules are in place to maintain the folksonomy so it is not altered by an intermediary. Laurie’s team allows the differences to remain, not trying to normalize different forms of the same expression.

Expert Finders —

Administrators and gatekeepers had all the topic-related documents, so they’ve been falsely deemed experts. The same fiction occurs when a top tagger is confused with an expert in the subject they’re tagging. There’s no gaming of the system because there’s no built-in incentive to compete head-on or outrank the next prolific tagger.

Social Bookmark Reporting —

There’s a seven-day window of the most recent popular tags. This breaks the dominance of librarian taggers as the most prolific contributors. Tagging activity shows how users are related by interest area. Users can view bookmarks by department. Sorting options include tags, bookmarks, and bookmarks by department. Laurie noted some surprisingly bad taggers, even among the firm’s KM enablers or “knowledge stewards.” Lynda Moulton noted that it takes a certain mindset to do it consistently and effectively. People are getting it.

Tagging by the Numbers —

The system holds…

* 21,000 bookmarks
* 99,000 tags, 12.5K of them unique to the system (this doesn’t account for spelling and punctuation discrepancies)
* The average number of tags per bookmark has doubled, from 2.7 to 5.4
* Over the past three years, bookmarks have averaged 83% external

Performance Benchmarks —

According to Caterina Fake of Flickr, 9-15% of the population are contributors in social communities, which leaves roughly 85-91% of users as lurkers. At Mitre, 14% of the user population with access contributes. Half of employees use the system.

Next Steps for Tagging —

Laurie mentioned an organization called LCC (“Language Computer Corporation”) that examines the semantic construction of documents to relate tags to each other, generating “did-you-mean-this” prompts to the content provider. It also makes recommendations to other users (“you need to talk to this person”) based on the common interests they share, even before they recognize that those interests are shared.
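
LCC’s system works semantically; as the crudest statistical stand-in, here’s a sketch that relates tags by simple co-occurrence over invented bookmark data:

```python
# Crude stand-in for semantic tag relation: tags that co-occur on the same
# bookmarks get suggested together. All data here is invented.
from collections import defaultdict
from itertools import combinations

bookmarks = [
    {"ontology", "search"},
    {"ontology", "taxonomy", "search"},
    {"taxonomy", "metadata"},
]

cooccurrence = defaultdict(int)
for tags in bookmarks:
    for left, right in combinations(sorted(tags), 2):
        cooccurrence[(left, right)] += 1

# "did-you-mean-this" candidates: pairs that travel together more than once
for (left, right), count in sorted(cooccurrence.items(), key=lambda kv: -kv[1]):
    if count > 1:
        print(f"taggers who use {left!r} also use {right!r} ({count}x)")
```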

Other Tagging Resources —

FURL caches the bookmarked resources. Users have requested that feature, but Mitre can’t provide it internally because of copyright restrictions. Scuttle is easy to deploy and extend. Twine is another solution with an interesting social component. ConnectBeam and DogEar were also mentioned as self-contained tagging platforms.

As the first post on the SIKM talk suggests, knowledge planners are the de facto arbiters between what the user base is seeking and what the provider base contributes towards the fulfillment of those requirements. However it is also true (at least in an outfit of under 1,000 heads like mine) that we’re really talking about the same individual. There is a built-in reciprocity in any system when its inputs and outputs are shared by its members on both sides of the exchange.

In an organization where utilization is king, centralization is non-existent, and administrative overheads get the hairy eyeball, knowledge planners need to leverage this awareness for all it’s worth! Add to that the fact that there are no formal reward structures in terms of recognition or bonus compensation for providers, and the “feel good” pay-it-forward benefits of becoming a faithful KM contributor start to pale in comparison to just keeping one’s head above water.

So how do you plan for knowledge in a firm where any task not immediately billable is at best on when-I-get-to-it status, and in the scarcity mentality of recession cycles placed on life support — if supported at all? Operationalizing KM requires drawing a parallel between the firm’s business opportunities or pipeline and the IP it generates for pursuing those opportunities — READ: the document pipeline. Thinking about a new business presentation? Don’t even hold that thought without consulting KM first!

How do we hardwire this reflex into the firm’s go-to-market activities?

How does the urge to check KM become less of a “to-do” and more of a “got-it-done” determination topping your best-in-class impulses?

There are several knowledge flows that weave well into project workstreams and feed new business initiatives. These are the case summaries that tell the story of past client engagements. From a process flow perspective their filings are as closely tracked as any uploads or new site requests. That’s because each submission also contains the definitive deliverables — those final presentations that represent significant IP generated within past projects — hence the document pipeline needed to keep KM as current as the projects themselves.

What does this process look like to colleagues?

Well, from a top-down perspective the directors get regular compliance reports — sorted by director. From the bottom-up view, junior-level consultants get the training they need to host their own team-only workspaces for creating and storing in-process materials. They get comfortable with the tools and the knowledge planner prepares them for the eventuality that they will elevate the significant deliverables when their project closes — surprise, surprise — it’s from the ground up that the IP capture effort is conducted by the less seasoned consulting staff.

The result is that even overtaxed, distracted, nonbelieving thought leaders still see their best work staged in KM. Even without recognition programs or billable KM work, and despite a strong penchant for non-involvement, we see a system where over two-thirds of all employees log in at least once a month and half at least once a week. Not bad for a firm that collectively experiences twice-a-month paychecks, once-a-month staff meetings, and precious little else.


Although the condition lacks the resources of a well-established disorder, those who develop it can make a total A.S.S. out of themselves with little to no outside intervention. In fact the total A.S.S. is not overwhelmed by TMI or intimidated by non-sequiturs. Au contraire.

The A.S.S. populace finds comfort, even a calling, in being able to box, package, group, intercept, and ultimately classify the frequency and nature of persistent, flowing, and overabundant information. Any virtual folder, RSS feed, or content bucket will do. Name the content management system or the virtual community and the total A.S.S. will go the distance, naming the buckets before tracking the frequency and defining the nature of what fills them — evenly of course.

The H.I.T.s (“high information thresholds”) sustained by these deviant thinkers can result in some improbable but productive outcomes. Attention measurement systems are the design of A.S.S. thinking applied to the problem of closure around TMI — namely, how do I log off with confidence when I don’t have the option of being on a call, at a site, in a meeting, over a barrel, and under the wraps of any potential wrinkle that could interrupt or complicate my work day, career goals, or somewhere in between.

Want to split the difference?

Even when they are not this practical, a good attention measurement system helps business analysts and media watchers apply meaningful standards to market behaviors and the corporate spending done to promote or discourage them.

The bottom line is that the more you already know about a person you’re about to meet (from Googling them online), the more you need to announce your candidacy for attention surplus status. There’s no need for a treatment or community awareness — just the freedom to make a total A.S.S. out of yourself.


Last week I attended the ESS show at the New York Hilton. I think the most salient smoke screen to hit my radar was the game-changing notion that a successful search deployment (and there are more than a few to be found) shifts the stakes from a user-centric to a provider-centric view of enterprise content. But the first few steps are tentative. This is not so much a groundswell as a groundbreaker — and once it catches fire, a deal-breaker too: Want to improve user experience? Increase provider participation.

The distinctions that once colored content producers and consumers continue to bleed together. If I offer an opinion about how well an instruction helps me do my job, am I a passive consumer or an engaged community member? If I download a bunch of presentations and preview several others, am I identified as a content collector? Does my mounting collection signify a degree of influence that the author has over my efforts to absorb, master, and ultimately leverage this material? No matter what the motivation, no matter what the conclusion … what we do with content will one day eclipse the content itself.

My favorite new feature on display at the show was a passive approach to meta content, a.k.a. content about content. A vendor named BA-Insight has devised an ingenious way to capture the secret life of documents. These secrets reveal what sway the ideas conveyed by our peers have over us. In the publishing world this is a simple units-sold formula. Behind the corporate firewall this is called a Wiki that gets updated almost as often as a freshly proposed solution is minted as a new product innovation, business model, or marketing approach. No matter what the end game, it’s a provider’s market, and the easier it is to reach it, shape it, and build on it, arguably the better the outcome.

So how did this play out at the vendor booth? BA-Insight’s Longitude product collects all kinds of passive feedback — downloads, previews, tagging, and other recordable session events take the opt-in approach to a whole new level of discretion. Essentially the record button is pegged to the user ID, reducing the idea of an active observer to the actions taken by that same user. This approach to meta-content is completely passive, preserving an untampered search session. That means no gaming, back-scratching, user surveys, or votes to cast (the equivalent of internal pop-ups). This is user feedback of the purest, most organic degree. Perhaps the purity is why the team from Accenture that provided the case study has so far shied away from this feature?
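
A minimal sketch of what passive capture like this might reduce to: append-only events keyed to the user ID, no survey in sight. The event shape is my guess, not Longitude’s schema:

```python
# Illustrative passive meta-content capture: log what a user does with a
# document (download, preview, tag) keyed to their ID. Fields are invented.
import json
import time

def record_event(user_id, doc_id, action, log_path="events.jsonl"):
    """Append one session event; no vote, survey, or pop-up required."""
    event = {"user": user_id, "doc": doc_id, "action": action, "ts": time.time()}
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(event) + "\n")

record_event("u123", "deck-0042", "preview")
record_event("u123", "deck-0042", "download")
```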

The other nice-to-haves are alluring to any professional services shop that belches, respirates, and wheezes in PowerPoint. For instance, Longitude’s preview pane decouples bloated ZIP folders (the last refuge of a knowledge provider with no time or incentive to upload their stuff). Not only can you cut and paste right out of the preview, but you also see the pockets of relevance by page number, keyed to your keywords.

All the whizbangetry was blown away, however, by Kevin Dana’s elegant and sparing AJAX customization that registers a keyword lookup on Accenture’s back-end index. This search suggestion feature would come in handy for any enterprise where the user base is graded not on keyword creativity but on its ability to reshape existing outputs in the form they’ve been tasked to regenerate. Creative? Well, maybe on someone else’s clock and with someone else’s IP!
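
In spirit, the suggestion feature reduces to a prefix lookup against the back-end index as the user types. A minimal sketch, with an invented term list (the real customization surely does more):

```python
# Toy typeahead: prefix-match the user's keystrokes against a sorted index.
# The term list is invented for illustration.
import bisect

index_terms = sorted(["tagging", "target market", "taxonomy", "team charter"])

def suggest(prefix, limit=5):
    """Return up to `limit` index terms that start with `prefix`."""
    start = bisect.bisect_left(index_terms, prefix)
    matches = []
    for term in index_terms[start:start + limit]:
        if not term.startswith(prefix):
            break
        matches.append(term)
    return matches

print(suggest("ta"))  # ['tagging', 'target market', 'taxonomy', 'team charter']
```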

The search suggestion feature came to mind during Tuesday morning’s panel led by Jean Graef on social search, where a discussion about user inputs into enterprise search quickly led to a familiar tradition-bound versus web 2.0 fight over who was better equipped to carry the findability mantle into the next round of version-dot-placeholder. MITRE’s Laurie Damianos says that much of the collective intelligence (and foundational content) added in her enterprise comes from employees’ pre-existing Del.icio.us tags. MITRE has built and fielded a social bookmarking prototype, creating public profiles from RSS feeds for internal indexing. Damianos says the effort has led to a referral system, promoting common tags and using recommendations for similar labels. She also raised the often overlooked question of content lifecycle management and the link rot brought on by broken links and outdated page references. The team currently enforces freshness by purging all tags that go inactive after 90 days.

I thought the real panel-stumper was put to the next roundtable on BI Tools hosted by Steve Arnold. Graf Mouen of ABC News asked what Arnold, Northern Light’s David Seuss, ISYS’s Derek Murphy, and SAP’s Alexander Maedche saw in terms of their accounts investing the needed resources in something more critical than tagging feeds, search tools, preview panes, and text analytics — that resource being the firm’s own domain experts. All deployments, regardless of technology, vendor, cost, and implementation smarts, can only go so far without their participation. The sober answer of “not much” belied the unnatural state of seeing Seuss and Arnold in actual agreement.